Blocks¶
Blocks are a primitive within Prefect that enable the storage of configuration and provide an interface for interacting with external systems.
With blocks, you can securely store credentials for authenticating with services like AWS, GitHub, Slack, and any other system you'd like to orchestrate with Prefect.
Blocks expose methods that provide pre-built functionality for performing actions against an external system. They can be used to download data from or upload data to an S3 bucket, query data from or write data to a database, or send a message to a Slack channel.
You can configure blocks through code or in the Prefect Cloud or Prefect server UI.
You can access blocks for both configuring flow deployments and directly from within your flow code.
Prefect provides some built-in block types that you can use right out of the box. Additional blocks are available through Prefect Collections. To use these blocks, pip install the collection, then register the blocks you want to use with Prefect Cloud or a Prefect server.
Prefect Cloud and the Prefect server UI display a library of block types available for you to configure blocks that may be used by your flows.
Blocks and parameters
Blocks are useful for configuration that needs to be shared across flow runs and between flows.
For configuration that will change between flow runs, we recommend using parameters.
Prefect built-in blocks¶
Prefect provides a broad range of commonly used, built-in block types. These block types are available in Prefect Cloud and the Prefect server UI.
Block | Slug | Description |
---|---|---|
Azure | azure | Store data as a file on Azure Datalake and Azure Blob Storage. |
Date Time | date-time | A block that represents a datetime. |
Docker Container | docker-container | Runs a command in a container. |
Docker Registry | docker-registry | Connects to a Docker registry. Requires a Docker Engine to be connectable. |
GCS | gcs | Store data as a file on Google Cloud Storage. |
GitHub | github | Interact with files stored on public GitHub repositories. |
JSON | json | A block that represents JSON. |
Kubernetes Cluster Config | kubernetes-cluster-config | Stores configuration for interaction with Kubernetes clusters. |
Kubernetes Job | kubernetes-job | Runs a command as a Kubernetes Job. |
Local File System | local-file-system | Store data as a file on a local file system. |
Microsoft Teams Webhook | ms-teams-webhook | Enables sending notifications via a provided Microsoft Teams webhook. |
Opsgenie Webhook | opsgenie-webhook | Enables sending notifications via a provided Opsgenie webhook. |
Pager Duty Webhook | pager-duty-webhook | Enables sending notifications via a provided PagerDuty webhook. |
Process | process | Run a command in a new process. |
Remote File System | remote-file-system | Store data as a file on a remote file system. Supports any remote file system supported by fsspec. |
S3 | s3 | Store data as a file on AWS S3. |
Secret | secret | A block that represents a secret value. The value stored in this block will be obfuscated when this block is logged or shown in the UI. |
Slack Webhook | slack-webhook | Enables sending notifications via a provided Slack webhook. |
SMB | smb | Store data as a file on a SMB share. |
String | string | A block that represents a string. |
Twilio SMS | twilio-sms | Enables sending notifications via Twilio SMS. |
Webhook | webhook | Block that enables calling webhooks. |
Blocks in Prefect Collections¶
Blocks can also be created by anyone and shared with the community. You'll find blocks that are available for consumption in many of the published Prefect Collections. The following table provides an overview of the blocks available from our most popular Prefect Collections.
Using existing block types¶
Blocks are classes that subclass the Block base class. They can be instantiated and used like normal classes.
Instantiating blocks¶
For example, to instantiate a block that stores a JSON value, use the JSON block:
from prefect.blocks.system import JSON
json_block = JSON(value={"the_answer": 42})
Saving blocks¶
If this JSON value needs to be retrieved later to be used within a flow or task, we can use the .save() method on the block to store the value in a block document in the Prefect database for later retrieval:
json_block.save(name="life-the-universe-everything")
Utilizing the UI
Block documents can also be created and updated via the Prefect UI.
Loading blocks¶
The name given when saving the value stored in the JSON block can be used when retrieving the value during a flow or task run:
from prefect import flow
from prefect.blocks.system import JSON
@flow
def what_is_the_answer():
    json_block = JSON.load("life-the-universe-everything")
    print(json_block.value["the_answer"])

what_is_the_answer() # 42
Blocks can also be loaded with a unique slug that is a combination of a block type slug and a block document name.
To load our JSON block document from before, we can run the following:
from prefect.blocks.core import Block
json_block = Block.load("json/life-the-universe-everything")
print(json_block.value["the_answer"]) # 42
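The slug format described above is simply the block type slug and the block document name joined by a forward slash. As a minimal sketch of that structure, the helper below splits a slug into its two parts. The function name split_block_slug is hypothetical and is not part of Prefect; it only illustrates the format.

```python
from typing import Tuple


def split_block_slug(slug: str) -> Tuple[str, str]:
    """Split '<block-type-slug>/<block-document-name>' into its two parts.

    Hypothetical helper for illustration only; not a Prefect API.
    """
    block_type_slug, separator, document_name = slug.partition("/")
    if not separator or not block_type_slug or not document_name:
        raise ValueError(
            f"Expected '<block-type-slug>/<block-document-name>', got {slug!r}"
        )
    return block_type_slug, document_name


type_slug, document_name = split_block_slug("json/life-the-universe-everything")
print(type_slug)      # json
print(document_name)  # life-the-universe-everything
```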
Sharing Blocks
Blocks can also be loaded by fellow Workspace Collaborators, available on Prefect Cloud.
Creating new block types¶
To create a custom block type, define a class that subclasses Block. The Block base class builds off of Pydantic's BaseModel, so custom blocks can be declared in the same manner as a Pydantic model.
Here's a block that represents a cube and holds information about the length of each edge in inches:
from prefect.blocks.core import Block
class Cube(Block):
    edge_length_inches: float
You can also include methods on a block to provide useful functionality. Here's the same cube block with methods to calculate the volume and surface area of the cube:
from prefect.blocks.core import Block
class Cube(Block):
    edge_length_inches: float

    def get_volume(self):
        return self.edge_length_inches**3

    def get_surface_area(self):
        return 6 * self.edge_length_inches**2
Now the Cube block can be used to store different cube configurations that can later be used in a flow:
from prefect import flow
rubiks_cube = Cube(edge_length_inches=2.25)
rubiks_cube.save("rubiks-cube")

@flow
def calculate_cube_surface_area(cube_name):
    cube = Cube.load(cube_name)
    print(cube.get_surface_area())

calculate_cube_surface_area("rubiks-cube") # 30.375
Secret fields¶
All block values are encrypted before being stored, but if you have values that you would not like visible in the UI or in logs, then you can use the SecretStr
field type provided by Pydantic to automatically obfuscate those values. This can be useful for fields that are used to store credentials like passwords and API tokens.
Here's an example of an AWSCredentials block that uses SecretStr:
from typing import Optional
from prefect.blocks.core import Block
from pydantic import SecretStr
class AWSCredentials(Block):
    aws_access_key_id: Optional[str] = None
    aws_secret_access_key: Optional[SecretStr] = None
    aws_session_token: Optional[str] = None
    profile_name: Optional[str] = None
    region_name: Optional[str] = None
Because aws_secret_access_key has the SecretStr type hint assigned to it, the value of that field will not be exposed if the object is logged:
aws_credentials_block = AWSCredentials(
    aws_access_key_id="AKIAJKLJKLJKLJKLJKLJK",
    aws_secret_access_key="secret_access_key"
)
print(aws_credentials_block)
# aws_access_key_id='AKIAJKLJKLJKLJKLJKLJK' aws_secret_access_key=SecretStr('**********') aws_session_token=None profile_name=None region_name=None
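When code needs the underlying value, Pydantic's SecretStr exposes it through its get_secret_value() method. The sketch below imitates that behavior with a small stand-in class so the idea can be seen in isolation. MaskedStr is a hypothetical name used only for illustration; it is not Pydantic's or Prefect's implementation.

```python
class MaskedStr:
    """Illustrative stand-in for a secret string type.

    Hypothetical class, not Pydantic's SecretStr: it hides the value when
    printed or logged, and returns it only when asked for explicitly.
    """

    def __init__(self, value: str) -> None:
        self._value = value

    def get_secret_value(self) -> str:
        # Explicit opt-in to read the real value
        return self._value

    def __str__(self) -> str:
        return "**********"

    def __repr__(self) -> str:
        return "MaskedStr('**********')"


secret = MaskedStr("secret_access_key")
print(secret)                     # **********
print(repr(secret))               # MaskedStr('**********')
print(secret.get_secret_value())  # secret_access_key
```

Keeping the real value behind an explicit method call makes accidental exposure through print statements or log messages much less likely.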
You can also use the SecretDict field type provided by Prefect. This type allows you to add a dictionary field to your block whose values are automatically obfuscated at all levels in the UI and in logs. This is useful for blocks where the typing or structure of secret fields is not known until configuration time.
Here's an example of a block that uses SecretDict:
from typing import Dict
from prefect.blocks.core import Block
from prefect.blocks.fields import SecretDict
class SystemConfiguration(Block):
    system_secrets: SecretDict
    system_variables: Dict

system_configuration_block = SystemConfiguration(
    system_secrets={
        "password": "p@ssw0rd",
        "api_token": "token_123456789",
        "private_key": "<private key here>",
    },
    system_variables={
        "self_destruct_countdown_seconds": 60,
        "self_destruct_countdown_stop_time": 7,
    },
)
system_secrets will be obfuscated when system_configuration_block is displayed, but system_variables will be shown in plain text:
print(system_configuration_block)
# SystemConfiguration(
# system_secrets=SecretDict('{'password': '**********', 'api_token': '**********', 'private_key': '**********'}'),
# system_variables={'self_destruct_countdown_seconds': 60, 'self_destruct_countdown_stop_time': 7}
# )
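To make "values at all levels" concrete, here's a minimal sketch of recursive obfuscation over a nested dictionary. The obfuscate function and the sample data are assumptions made for illustration; Prefect's SecretDict may handle some cases (such as lists) differently.

```python
from typing import Any

MASK = "**********"


def obfuscate(value: Any) -> Any:
    """Return a copy of value with every leaf replaced by MASK.

    Hypothetical helper: dictionary keys are kept so the structure stays
    readable, while the values at every nesting level are hidden.
    """
    if isinstance(value, dict):
        return {key: obfuscate(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [obfuscate(item) for item in value]
    return MASK


nested_secrets = {
    "password": "p@ssw0rd",
    "database": {"user": "admin", "tokens": ["a", "b"]},
}
print(obfuscate(nested_secrets))
# {'password': '**********', 'database': {'user': '**********', 'tokens': ['**********', '**********']}}
```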
Blocks metadata¶
The way that a block is displayed can be controlled by metadata fields that can be set on a block subclass.
Available metadata fields include:
Property | Description |
---|---|
_block_type_name | Display name of the block in the UI. Defaults to the class name. |
_block_type_slug | Unique slug used to reference the block type in the API. Defaults to a lowercase, dash-delimited version of the block type name. |
_logo_url | URL pointing to an image that should be displayed for the block type in the UI. Defaults to None. |
_description | Short description of block type. Defaults to docstring, if provided. |
_code_example | Short code snippet shown in UI for how to load/use block type. Defaults to the first example provided in the docstring of the class, if provided. |
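As a rough illustration of the default described for _block_type_slug, the sketch below derives a lowercase, dash-delimited slug from a block type name. The default_slug function is hypothetical and Prefect's own slug logic may differ in edge cases. Note that some built-in types have slugs that do not follow this default (for example, Microsoft Teams Webhook uses ms-teams-webhook), which suggests the slug was set explicitly.

```python
import re


def default_slug(block_type_name: str) -> str:
    """Lowercase the name and join its alphanumeric runs with dashes.

    Hypothetical helper for illustration; not Prefect's implementation.
    """
    lowered = block_type_name.lower()
    return re.sub(r"[^a-z0-9]+", "-", lowered).strip("-")


print(default_slug("Local File System"))          # local-file-system
print(default_slug("Kubernetes Cluster Config"))  # kubernetes-cluster-config
```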
Nested blocks¶
Blocks are composable. This means that you can create a block that uses functionality from another block by declaring it as an attribute on the block that you're creating. It also means that configuration can be changed for each block independently, so configuration that changes on different time frames is easier to manage and can be shared across multiple use cases.
To illustrate, here's an expanded AWSCredentials block that includes the ability to get an authenticated session via the boto3 library:
from typing import Optional
import boto3
from prefect.blocks.core import Block
from pydantic import SecretStr
class AWSCredentials(Block):
    aws_access_key_id: Optional[str] = None
    aws_secret_access_key: Optional[SecretStr] = None
    aws_session_token: Optional[str] = None
    profile_name: Optional[str] = None
    region_name: Optional[str] = None

    def get_boto3_session(self):
        # boto3 expects a plain string, so unwrap the SecretStr if one is set
        secret_key = (
            self.aws_secret_access_key.get_secret_value()
            if self.aws_secret_access_key is not None
            else None
        )
        return boto3.Session(
            aws_access_key_id=self.aws_access_key_id,
            aws_secret_access_key=secret_key,
            aws_session_token=self.aws_session_token,
            profile_name=self.profile_name,
            region_name=self.region_name,
        )
The AWSCredentials block can be used within an S3Bucket block to provide authentication when interacting with an S3 bucket:
import io
class S3Bucket(Block):
    bucket_name: str
    credentials: AWSCredentials

    def read(self, key: str) -> bytes:
        s3_client = self.credentials.get_boto3_session().client("s3")
        stream = io.BytesIO()
        # boto3 keyword arguments are case-sensitive: Bucket, Key, Fileobj
        s3_client.download_fileobj(Bucket=self.bucket_name, Key=key, Fileobj=stream)
        stream.seek(0)
        output = stream.read()
        return output

    def write(self, key: str, data: bytes) -> None:
        s3_client = self.credentials.get_boto3_session().client("s3")
        stream = io.BytesIO(data)
        s3_client.upload_fileobj(stream, Bucket=self.bucket_name, Key=key)
You can use this S3Bucket block with previously saved AWSCredentials block values in order to interact with the configured S3 bucket:
my_s3_bucket = S3Bucket(
    bucket_name="my_s3_bucket",
    credentials=AWSCredentials.load("my_aws_credentials")
)
my_s3_bucket.save("my_s3_bucket")
Saving block values like this links the values of the two blocks, so any changes to the values stored for the AWSCredentials block named my_aws_credentials will be seen the next time that block values for the S3Bucket block named my_s3_bucket are loaded.
Values for nested blocks can also be hard coded by not first saving child blocks:
my_s3_bucket = S3Bucket(
    bucket_name="my_s3_bucket",
    credentials=AWSCredentials(
        aws_access_key_id="AKIAJKLJKLJKLJKLJKLJK",
        aws_secret_access_key="secret_access_key"
    )
)
my_s3_bucket.save("my_s3_bucket")
In the above example, the values for AWSCredentials are saved with my_s3_bucket and will not be usable with any other blocks.
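The difference between linked and hard-coded nested values can be pictured with a small in-memory model. This is only a conceptual sketch: the store dictionary, the save and load helpers, the "$ref" key, and the region values are all assumptions made for illustration and do not reflect how Prefect stores block documents.

```python
import copy
from typing import Any, Dict

# Hypothetical in-memory stand-in for saved block documents, keyed by name
store: Dict[str, Dict[str, Any]] = {}


def save(name: str, values: Dict[str, Any]) -> None:
    store[name] = copy.deepcopy(values)


def load(name: str) -> Dict[str, Any]:
    """Load a document, resolving {'$ref': name} entries at load time."""
    resolved = {}
    for key, value in store[name].items():
        if isinstance(value, dict) and "$ref" in value:
            resolved[key] = load(value["$ref"])
        else:
            resolved[key] = copy.deepcopy(value)
    return resolved


save("my_aws_credentials", {"region_name": "us-east-1"})
# Linked: the parent stores only a reference to the saved child document
save("linked_bucket", {"bucket_name": "a", "credentials": {"$ref": "my_aws_credentials"}})
# Hard coded: the parent stores its own copy of the child values
save("embedded_bucket", {"bucket_name": "b", "credentials": {"region_name": "us-east-1"}})

# Update the saved child document
save("my_aws_credentials", {"region_name": "eu-west-1"})

print(load("linked_bucket")["credentials"]["region_name"])    # eu-west-1
print(load("embedded_bucket")["credentials"]["region_name"])  # us-east-1
```

In this model, the linked parent picks up the updated credentials on its next load, while the hard-coded parent keeps the values it was saved with.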
Handling updates to custom Block types¶
Let's say that you now want to add a bucket_folder field to your custom S3Bucket block that represents the default path to read and write objects from (this field exists on our implementation).
We can add the new field to the class definition:
class S3Bucket(Block):
    bucket_name: str
    credentials: AWSCredentials
    bucket_folder: str = None
    ...
Then register the updated block type with either Prefect Cloud or your self-hosted Prefect server.
If you have any existing blocks of this type that were created before the update and you'd prefer to not re-create them, you can migrate them to the new version of your block type by adding the missing values:
# Bypass Pydantic validation to allow your local Block class to load the old block version
my_s3_bucket_block = S3Bucket.load("my-s3-bucket", validate=False)
# Set the new field to an appropriate value
my_s3_bucket_block.bucket_folder = "my-default-bucket-path"
# Overwrite the old block values and update the expected fields on the block
my_s3_bucket_block.save("my-s3-bucket", overwrite=True)
Registering blocks for use in the Prefect UI¶
Blocks can be registered from a Python module available in the current virtual environment with a CLI command like this:
$ prefect block register --module prefect_aws.credentials
This command is useful for registering all blocks found in the credentials module within Prefect Collections.
Or, if a block has been created in a .py file, the block can also be registered with the CLI command:
$ prefect block register --file my_block.py
The registered block will then be available in the Prefect UI for configuration.