github.com/dagster-io/dagster @nightly-2020.07.17 sqlite

14,334 symbols · 56,957 edges · 1,830 files · 953 documented (7%) · 40 cross-repo links

README

Dagster

Dagster is a system for building modern data applications.

Elegant programming model: Dagster provides a set of abstractions for building self-describing, testable, and reliable data applications. It embraces the principles of functional data programming; gradual, optional typing; and testability as a first-class value.

Flexible & incremental: Dagster integrates with your existing tools and can invoke any computation, whether it be Spark, Python, a Jupyter notebook, or SQL. It is also designed to work with infrastructure you already run, such as Kubernetes.

Beautiful tools: Dagster's development environment, dagit, is designed to facilitate local development for data engineers, machine learning engineers, and data scientists. It can also be run as a production service to support operating, debugging, and maintaining large-scale production data pipelines.

Getting Started

Installation

pip install dagster dagit

This installs two modules:

Dagster: the core programming model and abstraction stack; stateless, single-node, single-process and multi-process execution engines; and a CLI tool for driving those engines.
Dagit: the UI for developing and operating Dagster pipelines, including a DAG browser, a type-aware config editor, and a live execution interface.

Hello dagster 👋

hello_dagster.py

from dagster import execute_pipeline, pipeline, solid


@solid
def get_name(_):
    return 'dagster'


@solid
def hello(context, name: str):
    context.log.info('Hello, {name}!'.format(name=name))


@pipeline
def hello_pipeline():
    hello(get_name())

Save the code above in a file named hello_dagster.py. You can execute the pipeline using any one of the following methods:

(1) Dagster Python API

if __name__ == "__main__":
    execute_pipeline(hello_pipeline)   # Hello, dagster!

(2) Dagster CLI

$ dagster pipeline execute -f hello_dagster.py

(3) Dagit web UI

$ dagit -f hello_dagster.py

Learn

Next, jump right into our tutorial, or read our complete documentation. If you're actively using Dagster or have questions on getting started, we'd love to hear from you.

Contributing

For details on contributing or running the project for development, check out our contributing guide.

Integrations

Dagster works with the tools and systems that you're already using with your data, including:

Integration	Dagster Library	Description
Apache Airflow	dagster-airflow	Allows Dagster pipelines to be scheduled and executed, either containerized or uncontainerized, as Apache Airflow DAGs.
Apache Spark	dagster-spark · dagster-pyspark	Libraries for interacting with Apache Spark and PySpark.
Dask	dagster-dask	Provides a Dagster integration with Dask / Dask.Distributed.
Datadog	dagster-datadog	Provides a Dagster resource for publishing metrics to Datadog.
Jupyter / Papermill	dagstermill	Built on the papermill library, dagstermill is meant for integrating productionized Jupyter notebooks into dagster pipelines.
PagerDuty	dagster-pagerduty	A library for creating PagerDuty alerts from Dagster workflows.
Snowflake	dagster-snowflake	A library for interacting with the Snowflake Data Warehouse.

Cloud Providers
AWS	dagster-aws	A library for interacting with Amazon Web Services. Provides integrations with Cloudwatch, S3, EMR, and Redshift.
Azure	dagster-azure	A library for interacting with Microsoft Azure.
GCP	dagster-gcp	A library for interacting with Google Cloud Platform. Provides integrations with GCS, BigQuery, and Cloud Dataproc.

This list is growing as we are actively building more integrations, and we welcome contributions!

Extension points: exported contracts (how you extend this code)

Versions (Interface)

(no doc)

docs/next/src/scripts/updateVersion.ts

Clipboard (Interface)

(no doc)

js_modules/dagit/types/navigator.clipboard.d.ts

Version (Interface)

(no doc)

docs/next/src/scripts/updateVersion.ts

NavigatorClipboard (Interface)

(no doc)

js_modules/dagit/types/navigator.clipboard.d.ts

Navigator (Interface)

(no doc)

js_modules/dagit/types/navigator.clipboard.d.ts

ISidebarSolidInvocationProps (Interface)

(no doc)

js_modules/dagit/src/SidebarSolidInvocation.tsx

GraphQueryInputProps (Interface)

(no doc)

js_modules/dagit/src/GraphQueryInput.tsx

Core symbols: most depended-on inside this repo

format

called by 1160

python_modules/dagster/dagster/loggers/__init__.py

execute_pipeline

called by 637

python_modules/dagster/dagster/core/execution/api.py

get

called by 581

python_modules/dagster/dagster/core/instance/__init__.py

join

called by 579

python_modules/dagster/dagster/core/launcher/base.py

output_value

called by 340

python_modules/dagster/dagster/core/execution/results.py

file_relative_path

called by 321

python_modules/dagster/dagster/utils/__init__.py

open

called by 294

js_modules/dagit/src/runs/ComputeLogModal.tsx

result_for_solid

called by 286

python_modules/dagster/dagster/core/execution/results.py

Shape

Function 6,805

Method 3,438

Interface 2,017

Class 1,448

Route 608

Enum 18
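
As a quick consistency check, the per-kind counts in the Shape list can be summed and compared with the totals reported at the top of the page. The sketch below is only arithmetic on numbers copied from this page; the variable names are illustrative and nothing here calls into Dagster.

```python
# Symbol counts by kind, copied from the Shape list above.
shape = {
    "Function": 6805,
    "Method": 3438,
    "Interface": 2017,
    "Class": 1448,
    "Route": 608,
    "Enum": 18,
}

# Sum of all kinds; this should equal the symbol total in the page header.
total_symbols = sum(shape.values())

# Documented-symbol count, also copied from the page header.
documented = 953

# Documented share as a whole-number percentage.
documented_share = round(100 * documented / total_symbols)

# Share of each kind, rounded to one decimal place.
shares = {kind: round(100 * n / total_symbols, 1) for kind, n in shape.items()}

print(total_symbols)       # 14334
print(documented_share)    # 7
for kind, pct in shares.items():
    print(f"{kind}: {pct}%")
```

The sum reproduces the 14,334 symbols in the header, and 953 documented symbols out of that total rounds to the stated 7%.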

Languages

Python 80%

TypeScript 20%

Modules by API surface

js_modules/dagit/src/types/PipelineExplorerRootQuery.ts · 159 symbols

python_modules/dagster-graphql/dagster_graphql_tests/graphql/setup.py · 151 symbols

python_modules/dagster-graphql/dagster_graphql/schema/errors.py · 139 symbols

python_modules/dagster-graphql/dagster_graphql/schema/runs.py · 137 symbols

python_modules/dagster-graphql/dagster_graphql/schema/roots.py · 129 symbols

python_modules/dagster/dagster_tests/core_tests/definitions_tests/test_composition.py · 115 symbols

python_modules/dagster/dagster_tests/core_tests/test_pipeline_execution.py · 91 symbols

python_modules/dagster/dagster/core/instance/__init__.py · 91 symbols

python_modules/dagster/dagster_tests/core_tests/config_types_tests/test_config_type_system.py · 88 symbols

python_modules/libraries/dagster-pandas/dagster_pandas/constraints.py · 87 symbols

python_modules/dagster/dagster/core/types/dagster_type.py · 83 symbols

js_modules/dagit/src/types/SidebarTabbedContainerSolidQuery.ts · 79 symbols

Dependencies: from manifests, versioned

@babel/core 7.7.2 · 1×

@babel/plugin-proposal-optional-chaining 7.6.0 · 1×

@babel/preset-env 7.7.1 · 1×

@babel/preset-typescript 7.7.2 · 1×

@babel/traverse 7.9.0 · 1×

@blueprintjs/core 3.23.0 · 1×

@blueprintjs/icons 3.13.0 · 1×

@blueprintjs/select 3.11.2 · 1×

@blueprintjs/table 3.8.3 · 1×

@fullhuman/postcss-purgecss 2.1.0 · 1×

@mdx-js/loader 1.5.7 · 1×

@mdx-js/mdx 1.6.1 · 1×

For agents

$ claude mcp add dagster \
  -- python -m otcore.mcp_server <graph>