
Dagster is a system for building modern data applications.
Elegant programming model: Dagster provides a set of abstractions for building self-describing, testable, and reliable data applications. It embraces the principles of functional data programming; gradual, optional typing; and testability as a first-class value.

Flexible & incremental: Dagster integrates with your existing tools and systems, and can invoke any computation, whether it be Spark, Python, a Jupyter notebook, or SQL. It is also designed to work with existing infrastructure such as Kubernetes.

Beautiful tools: Dagster's development environment, dagit, is designed to facilitate local development for data engineers, machine learning engineers, and data scientists. It can also be run as a production service to support operating, debugging, and maintaining large-scale production data pipelines.

pip install dagster dagit
This installs two modules: dagster, the core programming model, and dagit, the web-based development environment.
hello_dagster.py
from dagster import execute_pipeline, pipeline, solid


@solid
def get_name(_):
    return 'dagster'


@solid
def hello(context, name: str):
    context.log.info('Hello, {name}!'.format(name=name))


@pipeline
def hello_pipeline():
    hello(get_name())
Save the code above in a file named hello_dagster.py. You can execute the pipeline using any one
of the following methods:
(1) Dagster Python API
if __name__ == "__main__":
    execute_pipeline(hello_pipeline)  # Hello, dagster!
(2) Dagster CLI
$ dagster pipeline execute -f hello_dagster.py
(3) Dagit web UI
$ dagit -f hello_dagster.py
Next, jump right into our tutorial, or read our complete documentation. If you're actively using Dagster or have questions on getting started, we'd love to hear from you.
For details on contributing or running the project for development, check out our contributing guide.
Dagster works with the tools and systems that you're already using with your data, including:
| Integration | Dagster Library | Description |
| --- | --- | --- |
| Apache Airflow | dagster-airflow | Allows Dagster pipelines to be scheduled and executed, either containerized or uncontainerized, as Apache Airflow DAGs. |
| Apache Spark | dagster-spark · dagster-pyspark | Libraries for interacting with Apache Spark and PySpark. |
| Dask | dagster-dask | Provides a Dagster integration with Dask / Dask.Distributed. |
| Datadog | dagster-datadog | Provides a Dagster resource for publishing metrics to Datadog. |
| Jupyter / Papermill | dagstermill | Built on the papermill library, dagstermill is meant for integrating productionized Jupyter notebooks into Dagster pipelines. |
| PagerDuty | dagster-pagerduty | A library for creating PagerDuty alerts from Dagster workflows. |
| Snowflake | dagster-snowflake | A library for interacting with the Snowflake Data Warehouse. |

| Cloud Providers | Dagster Library | Description |
| --- | --- | --- |
| AWS | dagster-aws | A library for interacting with Amazon Web Services. Provides integrations with CloudWatch, S3, EMR, and Redshift. |
| Azure | dagster-azure | A library for interacting with Microsoft Azure. |
| GCP | dagster-gcp | A library for interacting with Google Cloud Platform. Provides integrations with GCS, BigQuery, and Cloud Dataproc. |
This list is growing as we are actively building more integrations, and we welcome contributions!