How It Works
Mage’s data integration system follows a simple pattern:
- Source - Extract data from databases, APIs, file storage, or SaaS platforms
- Transform (optional) - Apply transformations using Python, SQL, or dbt
- Destination - Load data into your target data warehouse or database
Mage integrations use the Singer protocol internally, which provides a standard for moving data between systems using JSON-formatted messages.
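To make the message format concrete, here is a minimal sketch of the three Singer message types (SCHEMA, RECORD, STATE) serialized as JSON lines. The function name `singer_messages` and the sample `users` stream are illustrative assumptions, not part of Mage's API.

```python
import json


def singer_messages(stream, schema, key_properties, records, bookmark):
    """Yield Singer-style messages as JSON strings, one message per line.

    Hypothetical helper for illustration only; it is not a Mage function.
    """
    # SCHEMA describes the shape of the records that follow.
    yield json.dumps({
        "type": "SCHEMA",
        "stream": stream,
        "schema": schema,
        "key_properties": key_properties,
    })
    # RECORD carries one row of data for the stream.
    for record in records:
        yield json.dumps({"type": "RECORD", "stream": stream, "record": record})
    # STATE carries the bookmark so a later sync can resume from it.
    yield json.dumps({"type": "STATE", "value": bookmark})


lines = list(singer_messages(
    stream="users",
    schema={"type": "object", "properties": {"id": {"type": "integer"}}},
    key_properties=["id"],
    records=[{"id": 1}, {"id": 2}],
    bookmark={"users": {"id": 2}},
))
for line in lines:
    print(line)
```

A source writes these lines to its output and a destination reads them in order, which is what lets any source be paired with any destination.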
Replication Methods
Mage supports multiple replication strategies to optimize your data syncs:
Full Table
Replicate entire tables on each sync. Best for small tables or complete refreshes.
Incremental
Sync only new or updated records based on a replication key (e.g., updated_at timestamp).
Log-Based (CDC)
Capture changes from database logs for real-time data replication. Available for PostgreSQL and MySQL.
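The difference between full-table and incremental replication can be sketched in a few lines. This is a simplified illustration, assuming rows are dictionaries and the replication key holds ISO 8601 timestamp strings in a single consistent format (so they compare correctly as strings). The function names are hypothetical, not Mage's API.

```python
def full_table_sync(rows):
    """Full table: return every row on every sync."""
    return list(rows)


def incremental_sync(rows, replication_key, bookmark):
    """Incremental: return only rows whose replication key is past the bookmark.

    Returns the new rows and the updated bookmark to store for the next sync.
    """
    new_rows = [
        row for row in rows
        if bookmark is None or row[replication_key] > bookmark
    ]
    # If nothing new arrived, keep the previous bookmark unchanged.
    new_bookmark = max((row[replication_key] for row in new_rows), default=bookmark)
    return new_rows, new_bookmark


rows = [
    {"id": 1, "updated_at": "2024-01-01T00:00:00Z"},
    {"id": 2, "updated_at": "2024-01-02T00:00:00Z"},
    {"id": 3, "updated_at": "2024-01-03T00:00:00Z"},
]

first_rows, bookmark = incremental_sync(rows, "updated_at", None)
rows.append({"id": 4, "updated_at": "2024-01-04T00:00:00Z"})
second_rows, bookmark = incremental_sync(rows, "updated_at", bookmark)
print(len(first_rows), len(second_rows), bookmark)
```

Log-based replication is not shown here because it depends on reading the database's own change log rather than querying the table.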
Key Features
Wide Integration Support
Connect to 50+ data sources and 25+ destinations, including:
- Databases: PostgreSQL, MySQL, BigQuery, Snowflake, Redshift, MongoDB
- Cloud Storage: Amazon S3, Google Cloud Storage, Azure Blob Storage
- SaaS Applications: Stripe, Salesforce, HubSpot, Zendesk, Google Analytics
- APIs: Custom REST APIs with flexible configuration
Schema Detection
Mage automatically discovers tables and columns from your data sources, generating schemas that can be customized before syncing.
Flexible Configuration
Configure integrations using Python dictionaries or YAML files.
Architecture
Mage’s integration framework is built on these core components:
Source Connectors
Extract data from external systems. Each source implements:
- Connection management - Establish and maintain connections
- Discovery - Detect available tables/streams and their schemas
- Data extraction - Read data in batches or streams
- Bookmarking - Track sync progress and resume from failures
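The four responsibilities above can be sketched with a toy in-memory source. This is an illustration under stated assumptions: `InMemorySource` and its methods are hypothetical names, a dictionary of lists stands in for a real connection, and the bookmark is a simple row offset.

```python
from typing import Dict, Iterator, List, Tuple


class InMemorySource:
    """Toy source connector; a dict of tables stands in for a real connection."""

    def __init__(self, tables: Dict[str, List[dict]]):
        # Connection management: a real source would open a client here.
        self.tables = tables

    def discover(self) -> Dict[str, Dict[str, str]]:
        """Discovery: infer column names and types from each table's first row."""
        return {
            name: {col: type(val).__name__ for col, val in rows[0].items()}
            for name, rows in self.tables.items()
            if rows
        }

    def read_batches(
        self, stream: str, batch_size: int, bookmark: int = 0
    ) -> Iterator[Tuple[List[dict], int]]:
        """Data extraction: yield (batch, new_bookmark) starting at the bookmark."""
        rows = self.tables[stream]
        for start in range(bookmark, len(rows), batch_size):
            batch = rows[start:start + batch_size]
            # Bookmarking: the offset after this batch is where a retry resumes.
            yield batch, start + len(batch)


source = InMemorySource({"users": [{"id": i, "name": f"user{i}"} for i in range(5)]})
print(source.discover())
for batch, bookmark in source.read_batches("users", batch_size=2):
    print(len(batch), bookmark)
```

If a sync fails after the second batch, calling `read_batches("users", 2, bookmark=4)` resumes with the remaining row instead of starting over.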
Destination Connectors
Load data into target systems. Each destination handles:
- Schema creation - Automatically create tables if they don’t exist
- Type mapping - Convert data types between systems
- Conflict resolution - Handle duplicate records (upsert, update, or ignore)
- Batch loading - Optimize writes using bulk operations
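As a rough illustration of these steps, the sketch below loads batches into SQLite using only the standard library. It is not how Mage's destinations are implemented. The `TYPE_MAP` table and `load_batch` function are assumptions made for the example, and the upsert syntax requires SQLite 3.24 or newer.

```python
import sqlite3

# Type mapping: JSON-schema-style type names to SQLite column types (assumed mapping).
TYPE_MAP = {"integer": "INTEGER", "number": "REAL", "string": "TEXT", "boolean": "INTEGER"}


def load_batch(conn, table, schema, primary_key, rows):
    """Create the table if needed, then upsert a batch of rows in one call.

    Table and column names are interpolated directly, so this sketch assumes
    they come from a trusted catalog rather than user input.
    """
    columns = list(schema)
    # Schema creation: build column definitions from the mapped types.
    col_defs = ", ".join(
        f"{col} {TYPE_MAP[schema[col]]}" + (" PRIMARY KEY" if col == primary_key else "")
        for col in columns
    )
    conn.execute(f"CREATE TABLE IF NOT EXISTS {table} ({col_defs})")

    # Conflict resolution: on a duplicate primary key, update the other columns.
    placeholders = ", ".join("?" for _ in columns)
    updates = ", ".join(f"{col} = excluded.{col}" for col in columns if col != primary_key)
    sql = (
        f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders}) "
        f"ON CONFLICT({primary_key}) DO UPDATE SET {updates}"
    )
    # Batch loading: executemany sends all rows through one prepared statement.
    conn.executemany(sql, [tuple(row[col] for col in columns) for row in rows])
    conn.commit()


conn = sqlite3.connect(":memory:")
schema = {"id": "integer", "name": "string"}
load_batch(conn, "users", schema, "id", [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}])
load_batch(conn, "users", schema, "id", [{"id": 2, "name": "B"}, {"id": 3, "name": "c"}])
result = conn.execute("SELECT id, name FROM users ORDER BY id").fetchall()
print(result)
```

Swapping `DO UPDATE SET ...` for `DO NOTHING` would give the "ignore" behavior instead of an upsert.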
Catalog & Schema
The catalog defines which streams to sync and how:
- Stream selection - Choose which tables/endpoints to replicate
- Field selection - Pick specific columns to sync
- Replication method - Set full table, incremental, or log-based
- Primary keys - Define unique constraints
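A catalog entry that captures these four settings might look like the following. The field names here (`selected`, `selected_fields`, `replication_method`, `replication_key`, `primary_keys`) are hypothetical and chosen for readability; they are not guaranteed to match Mage's actual catalog format.

```python
# Hypothetical catalog layout for illustration; consult the Mage docs for the real keys.
catalog = {
    "streams": [
        {
            "stream": "users",
            "selected": True,                                      # stream selection
            "selected_fields": ["id", "email", "updated_at"],      # field selection
            "replication_method": "INCREMENTAL",                   # replication method
            "replication_key": "updated_at",
            "primary_keys": ["id"],                                # unique constraint
        },
        {
            "stream": "audit_logs",
            "selected": False,
            "selected_fields": [],
            "replication_method": "FULL_TABLE",
            "replication_key": None,
            "primary_keys": ["id"],
        },
    ]
}


def selected_streams(catalog):
    """Return only the catalog entries marked for syncing."""
    return [entry for entry in catalog["streams"] if entry["selected"]]


def project_record(entry, record):
    """Keep only the fields the catalog entry selects, dropping everything else."""
    return {field: record[field] for field in entry["selected_fields"] if field in record}


users = selected_streams(catalog)[0]
row = {"id": 7, "email": "a@example.com", "password_hash": "x", "updated_at": "2024-01-01"}
print(project_record(users, row))
```

Keeping selection in the catalog rather than in connector code means the same source can serve different pipelines with different stream and column choices.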
Installation
Install Mage with specific integrations.
Quick Start
Next Steps
- Available Sources - Browse all supported data sources
- Available Destinations - Explore destination options
- Custom Integrations - Build your own connectors
- Configuration Guide - Advanced configuration options