The DAG is dead for data engineering

DataForge uses the software engineering concepts of inversion of control and event-driven architecture to automate data pipeline orchestration. Eliminate the need to define Directed Acyclic Graphs (DAGs) manually and let your functional code decide when and how to execute.

The next generation of data orchestration

Your transformation code is your orchestration code

No code required

Inversion of control combined with functional programming allows your transformation code to also define the order of operations required to process data correctly. No need to manually analyze your pipeline logic and data to determine optimal execution order.
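
As a rough illustration of the idea, the sketch below derives an execution order from the column references inside the transformation expressions themselves, so nobody has to write the order down by hand. The rule names, the bracket syntax for references, and the helper functions are all hypothetical and are not DataForge's actual API or engine.

```python
import re
from graphlib import TopologicalSorter

# Hypothetical rules: each output column is defined by an expression that
# references other columns in [brackets]. Names and syntax are illustrative.
rules = {
    "total": "[net_price] + [tax]",
    "tax": "[net_price] * 0.08",
    "net_price": "[list_price] - [discount]",
}

def dependencies(expression):
    """Return the column names referenced in an expression."""
    return set(re.findall(r"\[(\w+)\]", expression))

def execution_order(rules):
    """Derive a valid processing order from the expressions alone."""
    # Only references to other rules create ordering constraints;
    # raw source columns (list_price, discount) are already available.
    graph = {
        name: {dep for dep in dependencies(expr) if dep in rules}
        for name, expr in rules.items()
    }
    return list(TopologicalSorter(graph).static_order())

order = execution_order(rules)
print(order)  # ['net_price', 'tax', 'total']
```

The rules are declared in an arbitrary order, yet the computed order still places net_price before tax and total, because the dependencies are read from the code rather than declared separately.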

Standardized stages for common tasks

Hyperparameters for data engineering

DataForge Cloud provides predefined, tunable stages that generate code for the most common types of data processing. Just input basic configurations and DataForge will combine them with the live incoming data elements to generate efficient code and the associated orchestration steps.
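
To make the configuration-to-code idea concrete, here is a minimal sketch of a stage that keeps only the latest record per key. It assumes a made-up configuration shape (key_columns, order_column) and a made-up generator function; the actual DataForge stages, settings, and generated code may look quite different.

```python
def generate_latest_by_key_sql(table, incoming_columns, config):
    """Build a SQL query from a small config plus the incoming columns.

    `config` is a hypothetical stage configuration, not a real DataForge
    setting; `incoming_columns` stands in for the live data elements.
    """
    cols = ", ".join(incoming_columns)
    keys = ", ".join(config["key_columns"])
    order = config["order_column"]
    return (
        f"SELECT {cols} FROM ("
        f"SELECT {cols}, ROW_NUMBER() OVER "
        f"(PARTITION BY {keys} ORDER BY {order} DESC) AS rn "
        f"FROM {table}) ranked WHERE rn = 1"
    )

# Hypothetical configuration and incoming schema.
stage_config = {"key_columns": ["customer_id"], "order_column": "updated_at"}
columns = ["customer_id", "email", "updated_at"]

sql = generate_latest_by_key_sql("raw_customers", columns, stage_config)
print(sql)
```

The configuration stays small because the column list comes from the incoming data, so a new column in the source would flow into the generated query without a configuration change.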

Event-driven workflow engine

Just define the start

Use the DataForge scheduling service, file watcher, REST API, or SDK to start processing, then the built-in dependency engine handles the rest. It tracks all processes and determines next steps, waits, and retries. Run thousands of pipelines in parallel, along with manual one-offs, without worry.
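
The sketch below shows one simple way an event-driven engine of this kind could work: only the starting step is supplied, each completion triggers whichever steps have become ready, and failed steps are retried. The step names, the depends_on mapping, and the run function are assumptions made for illustration and do not describe DataForge's internal implementation.

```python
from collections import deque

def run(start, depends_on, tasks, max_retries=2):
    """Process steps from a single start event until nothing is left to trigger."""
    done = []
    attempts = {}
    queue = deque([start])
    while queue:
        step = queue.popleft()
        attempts[step] = attempts.get(step, 0) + 1
        try:
            tasks[step]()
        except Exception:
            # Retry a failed step until its retry budget is used up.
            if attempts[step] <= max_retries:
                queue.append(step)
            continue
        done.append(step)
        # Completion event: trigger every step whose dependencies are all done.
        for nxt, deps in depends_on.items():
            ready = all(dep in done for dep in deps)
            if ready and nxt not in done and nxt not in queue:
                queue.append(nxt)
    return done, attempts

# Hypothetical pipeline: parse fails once with a transient error.
calls = {"parse": 0}

def flaky_parse():
    calls["parse"] += 1
    if calls["parse"] == 1:
        raise RuntimeError("transient failure")

tasks = {
    "ingest": lambda: None,
    "parse": flaky_parse,
    "enrich": lambda: None,
    "publish": lambda: None,
}
depends_on = {
    "parse": ["ingest"],
    "enrich": ["parse"],
    "publish": ["enrich", "parse"],
}

done, attempts = run("ingest", depends_on, tasks)
print(done)               # ['ingest', 'parse', 'enrich', 'publish']
print(attempts["parse"])  # 2
```

If a step exhausts its retries, it is never marked done, so its dependents are simply never triggered. A production engine would also need persistence, timeouts, and concurrency, which this sketch leaves out.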

Optimize cloud spend with dynamic clusters

Infrastructure integrated

DataForge Cloud provides an automated infrastructure management service combined with orchestration for Databricks customers. By leveraging metadata as well as the most cost-effective available cloud products, DataForge helps minimize spend and maximize performance.

DataForge Cloud

DataForge Cloud is the fastest and most reliable way to deploy DataForge. Develop, orchestrate, operate, and audit functional code pipelines in an all-in-one web-based UI.

DataForge Core

DataForge Core is an open-source command-line tool that enables teams to write functional data transformation code following software engineering best practices and principles.