Airflow v1 vs v2

Apache Airflow has become the de facto standard for orchestrating data workflows, enabling teams to author, schedule, and monitor complex pipelines with ease.

Originally developed at Airbnb, it has grown into a mature open-source project used by organizations across the globe.

With the release of Airflow v2, the project underwent a significant transformation.

While Airflow v1 laid the foundation, v2 introduced critical enhancements aimed at addressing long-standing issues around scalability, stability, and developer experience.

The shift from v1 to v2 wasn’t just incremental—it redefined how teams build and manage workflows.

If you’re still on Airflow v1 or are evaluating a migration, understanding the key differences between the two versions is essential.

This guide breaks down the architectural, functional, and operational improvements introduced in v2, helping you make an informed decision.

Whether you’re managing ETL pipelines, ML workflows, or infrastructure automation—as discussed in our Airflow vs Terraform and Airflow vs Cron comparisons—knowing what version of Airflow you’re using can have a profound impact on performance and maintainability.

For broader context on Airflow’s ecosystem, check out our related comparison: Airflow vs Rundeck.


High-Level Summary of Changes

Apache Airflow v2 introduced a suite of enhancements that addressed critical pain points in the v1.x series.

While Airflow v1.x laid the groundwork for workflow orchestration, it struggled with scalability, operational complexity, and limited extensibility.

Airflow v2.x significantly improves the architecture and user experience with new core components, better task execution, and a more robust API.

Here’s a high-level comparison of the major differences between the two versions:

| Feature / Area | Airflow v1.x | Airflow v2.x |
|---|---|---|
| Scheduler | Single-threaded, limited scalability | Multi-scheduler support with better scaling |
| DAG Parsing | Serialized via pickling | DAG serialization in JSON |
| Task Execution | Limited parallelism | Enhanced parallelism via smart sensors and task groups |
| API | Experimental, limited endpoints | Full REST API with RBAC support |
| Task Dependency Syntax | set_upstream(), set_downstream() | Native >> and << operators |
| CLI & UI | Older UI, inconsistent CLI | Modern UI, unified CLI |
| Security & Access Control | Basic, plugin-dependent | Full RBAC with role support |
| Scheduler Resilience | Prone to missed runs or bottlenecks | HA-ready, decoupled scheduling |
| Plugins | Plugin loading inconsistencies | Stable, namespaced plugins |
| Community Adoption | Legacy, limited support | Active development and community best practices |

Airflow 2.x isn’t just an upgrade—it’s a re-architecture.

The improvements make it far more production-ready for data teams, SREs, and platform engineers running complex pipelines at scale.


TaskFlow API in v2

One of the most transformative features introduced in Airflow 2.x is the TaskFlow API, which allows developers to build DAGs using Python functions rather than relying solely on traditional Operators.

This new, functional approach improves readability, testing, and maintainability of Airflow pipelines.

What Is the TaskFlow API?

The TaskFlow API brings native Python function support to Airflow tasks.

It uses the @task decorator to convert regular Python functions into Airflow tasks automatically—handling serialization, logging, and XCom (cross-communication) under the hood. This leads to cleaner DAGs and less boilerplate code.

Task Definition: v1 vs v2

Let’s compare the task definition process in Airflow v1 and v2 using a simple ETL example.

Airflow v1.x (Traditional Operators)

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python_operator import PythonOperator

def extract():
    return {"data": "some_value"}

def transform(**context):
    # Pull the upstream task's return value from XCom explicitly
    data = context['ti'].xcom_pull(task_ids='extract')
    return data['data'].upper()

with DAG('v1_example', start_date=datetime(2023, 1, 1), schedule_interval='@daily') as dag:
    extract_task = PythonOperator(task_id='extract', python_callable=extract)
    transform_task = PythonOperator(task_id='transform', python_callable=transform, provide_context=True)

    extract_task >> transform_task
```

Airflow v2.x (TaskFlow API)

```python
from datetime import datetime

from airflow.decorators import dag, task

@dag(start_date=datetime(2023, 1, 1), schedule_interval='@daily', catchup=False)
def v2_example():
    @task
    def extract():
        return {"data": "some_value"}

    @task
    def transform(data):
        return data['data'].upper()

    # Passing the return value wires the dependency and the XCom for you
    extracted = extract()
    transform(extracted)

v2_example()
```

Benefits of TaskFlow API

  • Simplified Syntax: Reduces boilerplate and dependency on provide_context or XCom calls.

  • Improved Testability: Functions are native Python, easier to test outside Airflow context.

  • Better Modularity: Encourages modular and reusable task design.

The TaskFlow API is a game-changer for data engineers and Python developers who want to create clean, readable, and maintainable DAGs.
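The testability benefit is easy to see in practice: because the logic is plain Python, it can be unit-tested without an Airflow runtime at all. A minimal sketch (the function mirrors the `transform` step from the example above; in the DAG file it would simply be wrapped with `@task`):

```python
# Plain Python business logic; the DAG file wraps this with @task.
# No Airflow import or running scheduler is needed to test it.
def transform(data):
    return data['data'].upper()

# An ordinary unit test, runnable with any test runner:
assert transform({'data': 'some_value'}) == 'SOME_VALUE'
```

Keeping logic in importable functions like this also makes it reusable across DAGs, which is harder to achieve with Operator subclasses.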


Scheduler and Executor Enhancements

One of the key limitations in Airflow 1.x was its single-scheduler architecture, which created a potential single point of failure.

This posed scalability and high availability challenges in production environments.

Airflow 2.x addressed this limitation with a re-architected scheduling system designed for resilience and performance.

Limitations in Airflow v1

In Airflow v1.x:

  • Only one scheduler could run at a time.

  • If that scheduler failed, no new tasks would be scheduled.

  • There was no built-in mechanism for horizontal scaling of the scheduler itself.

  • Executors like CeleryExecutor and KubernetesExecutor had limited observability and performance tuning options.

This meant that large-scale deployments often required custom workarounds or external monitoring to ensure reliability.

Improvements in Airflow v2

Airflow 2.x introduced a multi-scheduler architecture with native support for High Availability (HA).

Now, you can run multiple schedulers in parallel, and they coordinate safely using database-level locks, ensuring that no DAG or task gets scheduled more than once.
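Operationally, enabling HA scheduling is a deployment sketch rather than a config flag: you start the same scheduler command on more than one host, all pointing at one shared metadata database. The exact setup depends on your environment, and the database must support row-level locking for the coordination to work:

```shell
# Run on each scheduler host. All hosts must share a single metadata
# database that supports SELECT ... FOR UPDATE SKIP LOCKED
# (e.g. PostgreSQL 10+ or MySQL 8+).
airflow scheduler
```

The schedulers need no awareness of each other; the database locks prevent any DAG run or task from being scheduled twice.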

Key Enhancements:

  • Multi-Scheduler Support

    • Multiple schedulers can be deployed concurrently.

    • Eliminates the single point of failure from v1.x.

    • Greatly improves scalability for environments with many DAGs.

  • Executor Upgrades

    • CeleryExecutor improvements:

      • More efficient task queuing and worker communication.

      • Better integration with observability tools.

    • KubernetesExecutor enhancements:

      • More stable pod launching behavior.

      • Improved resource handling for ephemeral tasks.

      • Reduced scheduler-pod communication overhead.

  • Faster Scheduling Loop

    • The new scheduling loop is faster and more efficient, enabling better throughput for large DAGs.

Real-World Impact

For teams running hundreds or thousands of DAGs, these enhancements are crucial.

The multi-scheduler feature ensures resilience, and improved executors make distributed execution more efficient, especially in cloud-native or Kubernetes-based environments.


New REST API

One of the most anticipated improvements in Airflow 2.x is the introduction of a stable, production-ready REST API.

In contrast to Airflow 1.x, which only offered an experimental API with limited functionality and no guarantees of stability, Airflow 2.x provides a robust interface for interacting with your workflows programmatically.

Limitations in v1

Airflow 1.x included an experimental API that:

  • Lacked comprehensive documentation.

  • Was prone to breaking changes between versions.

  • Supported only a limited set of operations (e.g., triggering DAGs).

  • Offered no authentication or authorization mechanisms out of the box.

This made it difficult for teams to integrate Airflow cleanly into CI/CD pipelines or automation systems.

Improvements in v2

Airflow 2.x introduces a stable, OpenAPI-compliant REST API that is:

  • Well-documented and versioned.

  • Secure, supporting authentication (via JWT, basic auth, etc.) and role-based access control.

  • Extensible, enabling teams to build custom tooling and integrations.

With the new API, you can:

  • Trigger DAG runs and tasks.

  • Monitor DAG execution status.

  • Manage variables, connections, pools, and other metadata programmatically.

  • Automate workflow deployment, testing, and monitoring in CI/CD pipelines.
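As a sketch of what programmatic access looks like, the snippet below builds (without sending) the request that triggers a DAG run via the stable `POST /api/v1/dags/{dag_id}/dagRuns` endpoint. The host URL is a local-default assumption, and a real call would also need authentication headers configured for your deployment:

```python
import json
import urllib.request

# Hypothetical local deployment; adjust for your environment.
AIRFLOW_URL = "http://localhost:8080/api/v1"

def build_trigger_request(dag_id, conf=None):
    """Build (but do not send) the POST that creates a new DAG run."""
    body = json.dumps({"conf": conf or {}}).encode()
    return urllib.request.Request(
        url=f"{AIRFLOW_URL}/dags/{dag_id}/dagRuns",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )

req = build_trigger_request("v2_example", conf={"run_date": "2023-01-01"})
# urllib.request.urlopen(req) would send it once auth is added.
```

The same pattern extends to the other endpoints (DAG run status, variables, connections, pools), which is what makes CI/CD integration practical in v2.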

Example Use Cases

  • DevOps teams can trigger DAGs from CI/CD pipelines (e.g., GitHub Actions, Jenkins).

  • Data engineers can integrate Airflow with data cataloging or quality tools.

  • Platform teams can automate environment bootstrapping and monitoring.

You can explore the full API spec via the /api/v1/ endpoint or by visiting the official Swagger UI interface.


Deferrable Sensors and Smart Triggering

One of the most impactful changes in Apache Airflow 2.x is the introduction of Deferrable Operators and the Triggerer—a major step toward improving resource efficiency for long-running tasks.

The Problem with Sensors in v1

In Airflow 1.x, Sensors (e.g., TimeSensor, ExternalTaskSensor, S3KeySensor) were blocking tasks.

This means they occupied a worker slot for the entire duration of their wait, often leading to:

  • Wasted resources, especially when many sensors were active.

  • Scheduler bottlenecks when the system had to manage thousands of active but idle tasks.

  • Increased cost and complexity in distributed environments like Kubernetes or Celery.

The Solution in v2: Deferrable Operators

Airflow 2.x solves this inefficiency with Deferrable Operators.

These operators “defer” their execution while waiting and hand off control to a lightweight process called the Triggerer.

This allows the task to:

  • Free up the worker slot during wait time.

  • Be reactivated only when the condition is met.

  • Scale to handle thousands of idle wait conditions without overwhelming the system.

This is especially useful in cloud-native environments, where blocking costs money.

Key Component: The Triggerer

The Triggerer is a new daemon process introduced in Airflow 2.2+ that handles deferred tasks asynchronously.

It efficiently manages large numbers of sleeping sensors without using workers, improving overall scalability and performance.
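The principle behind the Triggerer can be illustrated with plain `asyncio` (this is an analogy, not Airflow code): thousands of pending waits share one event loop on a single thread, instead of each occupying a worker slot the way a blocking sensor would.

```python
import asyncio

async def wait_for_condition(name, delay):
    # Yields control to the event loop while "waiting", so this wait
    # consumes no thread or worker slot of its own.
    await asyncio.sleep(delay)
    return f"{name} fired"

async def main():
    # 1,000 concurrent "sensors" handled by one thread.
    waits = [wait_for_condition(f"sensor_{i}", 0.01) for i in range(1000)]
    return await asyncio.gather(*waits)

results = asyncio.run(main())
print(len(results))  # 1000
```

This is why one Triggerer daemon can supervise far more idle waits than a pool of workers running blocking sensors ever could.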

Example: TimeSensor vs TimeSensorAsync

Airflow v1 – Blocking TimeSensor:

```python
from datetime import time

from airflow.sensors.time_sensor import TimeSensor

# target_time must be a datetime.time, not a string
wait_until = TimeSensor(task_id='wait_until_6am', target_time=time(6, 0))
```

Airflow v2 – Deferrable TimeSensorAsync:

```python
from datetime import time

from airflow.sensors.time_sensor import TimeSensorAsync

wait_until = TimeSensorAsync(task_id='wait_until_6am', target_time=time(6, 0))
```

The TimeSensorAsync releases the worker after deferring, and the Triggerer reactivates it when it’s time to resume.

Bottom Line

  • Use deferrable operators for sensors that may wait minutes or hours.

  • Greatly improves resource efficiency and task scalability.

  • A must-have for any production-grade Airflow deployment handling large DAG volumes or event-based scheduling.

