Mage supports multiple compute integrations to scale your data pipelines beyond local execution. Execute blocks and pipelines on distributed compute engines, container orchestration platforms, and cloud-native services.

Available Executors

Mage provides several executor types for running pipelines and blocks:

Spark (PySpark)

Run PySpark pipelines on AWS EMR clusters with automatic cluster management

Kubernetes

Execute pipelines and blocks as Kubernetes Jobs with full resource control

AWS ECS

Run blocks as ECS tasks on AWS Fargate or EC2

GCP Cloud Run

Execute blocks on GCP Cloud Run with serverless container execution

Executor Types

Mage supports the following executor types defined in the codebase:
Executor Type               Description                           Use Case
local_python                Local Python execution (default)      Development and testing
pyspark                     PySpark execution on AWS EMR          Large-scale data processing with Spark
k8s                         Kubernetes Job execution              Container orchestration and resource isolation
ecs                         AWS ECS task execution                AWS-native container execution
gcp_cloud_run               GCP Cloud Run execution               Serverless containers on GCP
azure_container_instance    Azure Container Instance execution    Serverless containers on Azure

Configuration Levels

Executor configuration can be set at multiple levels with the following precedence:
1. Project Level

Configure default executors in your project’s metadata.yaml file. This applies to all pipelines and blocks unless overridden.
# Project metadata.yaml
k8s_executor_config:
  resource_requests:
    cpu: "500m"
    memory: "512Mi"
2. Pipeline Level

Override executor configuration in a pipeline’s metadata.yaml file.
# Pipeline metadata.yaml
executor_type: k8s
executor_config:
  resource_requests:
    cpu: "1000m"
    memory: "2Gi"
3. Block Level

Configure executor settings for individual blocks in their YAML configuration.
# Block configuration
executor_type: k8s
executor_config:
  resource_requests:
    cpu: "2000m"
    memory: "4Gi"
Block-level configuration takes the highest precedence, followed by pipeline-level configuration, then project-level configuration.
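The precedence rules above can be sketched as a simple merge, most specific last. This is an illustrative sketch only; the function and dict names below are hypothetical, not Mage's internal API.

```python
# Illustrative precedence resolution: block > pipeline > project.
def resolve_executor_config(project_cfg, pipeline_cfg=None, block_cfg=None):
    """Return the effective executor config, letting more specific
    levels override less specific ones key by key."""
    merged = dict(project_cfg or {})
    for override in (pipeline_cfg, block_cfg):
        if override:
            merged.update(override)
    return merged

project = {"resource_requests": {"cpu": "500m", "memory": "512Mi"}}
pipeline = {"resource_requests": {"cpu": "1000m", "memory": "2Gi"}}
block = {"resource_requests": {"cpu": "2000m", "memory": "4Gi"}}

effective = resolve_executor_config(project, pipeline, block)
# block-level resource requests win
```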

Setting Default Executor Type

You can set a default executor type for all pipelines using the DEFAULT_EXECUTOR_TYPE environment variable:
export DEFAULT_EXECUTOR_TYPE=k8s
This will use the specified executor type for all pipelines unless overridden at the pipeline or block level.
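A sketch of how an environment-variable default like this can slot into the fallback chain, assuming hypothetical function and parameter names (not Mage's actual implementation):

```python
import os

def get_executor_type(explicit_type=None):
    """Use an explicitly configured executor type if one is set;
    otherwise fall back to the DEFAULT_EXECUTOR_TYPE environment
    variable, and finally to local_python."""
    if explicit_type and explicit_type != "local_python":
        return explicit_type
    return os.environ.get("DEFAULT_EXECUTOR_TYPE", "local_python")
```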

Executor Factory

Mage uses an executor factory pattern to dynamically create the appropriate executor based on:
  1. Pipeline type: PySpark pipelines automatically use the pyspark executor
  2. Explicit configuration: Executor type specified in metadata
  3. Default executor: Set via DEFAULT_EXECUTOR_TYPE environment variable
  4. Fallback: Falls back to local_python if no configuration is found
From the source code at mage_ai/data_preparation/executors/executor_factory.py:
if pipeline.type == PipelineType.PYSPARK:
    executor_type = ExecutorType.PYSPARK
else:
    executor_type = pipeline.get_executor_type()
    if executor_type == ExecutorType.LOCAL_PYTHON or executor_type is None:
        executor_type = DEFAULT_EXECUTOR_TYPE
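The general shape of such a factory can be sketched as a lookup table keyed by executor type. The classes below are hypothetical stand-ins, not Mage's actual executor classes:

```python
# Minimal executor-factory sketch: dispatch on an executor-type string.
class LocalPythonExecutor:
    def execute(self, block):
        return f"local:{block}"

class K8sExecutor:
    def execute(self, block):
        return f"k8s:{block}"

EXECUTORS = {
    "local_python": LocalPythonExecutor,
    "k8s": K8sExecutor,
}

def get_executor(executor_type="local_python"):
    try:
        return EXECUTORS[executor_type]()
    except KeyError:
        raise ValueError(f"Unknown executor type: {executor_type}")
```

Registering each executor class in a dictionary keeps the factory open to new executor types without a growing if/elif chain.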

Next Steps

Spark Integration

Configure PySpark execution on AWS EMR clusters

Kubernetes Integration

Set up Kubernetes Job execution for scalable pipeline runs
