Supported Databases
- PostgreSQL: RDBMS with CDC support via logical replication
- MySQL: Popular open-source relational database
- Snowflake: Cloud data warehouse platform
- BigQuery: Google Cloud serverless data warehouse
- Redshift: AWS cloud data warehouse
- MongoDB: NoSQL document database
- MSSQL: Microsoft SQL Server
- Oracle: Enterprise relational database
- Teradata: Enterprise data warehouse
PostgreSQL
Extract data from PostgreSQL with support for full table, incremental, and log-based (CDC) replication.
Configuration
Log-Based Replication (CDC)
Configure PostgreSQL for real-time change data capture:
Setup PostgreSQL CDC
- Enable logical replication in postgresql.conf:
- Create a publication:
- Create a replication slot:
- Grant necessary permissions:
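The four setup steps above can be sketched as a small Python helper that produces the SQL an administrator would run. This is an illustration under stated assumptions, not this tool's documented procedure: the publication, slot, role, schema and table names are hypothetical placeholders, and the `pgoutput` output plugin is assumed because it ships with PostgreSQL. Check which plugin your connector expects before creating the slot.

```python
# Hypothetical names: the publication, slot, role, schema and table names
# below are placeholders, not values taken from this documentation.


def postgres_cdc_setup_statements(
    publication: str = "my_publication",
    slot: str = "my_slot",
    role: str = "replication_user",
    schema: str = "public",
    tables: tuple[str, ...] = ("public.orders",),
) -> list[str]:
    """Return SQL statements for the four setup steps, in order."""
    table_list = ", ".join(tables)
    return [
        # Step 1: same effect as wal_level = logical in postgresql.conf;
        # the server must be restarted before the change applies.
        "ALTER SYSTEM SET wal_level = 'logical';",
        # Step 2: publish changes for the selected tables.
        f"CREATE PUBLICATION {publication} FOR TABLE {table_list};",
        # Step 3: create a logical slot (pgoutput plugin is an assumption).
        f"SELECT pg_create_logical_replication_slot('{slot}', 'pgoutput');",
        # Step 4: let the role stream changes and read the tables.
        f"ALTER ROLE {role} WITH REPLICATION;",
        f"GRANT USAGE ON SCHEMA {schema} TO {role};",
        f"GRANT SELECT ON ALL TABLES IN SCHEMA {schema} TO {role};",
    ]


if __name__ == "__main__":
    for statement in postgres_cdc_setup_statements():
        print(statement)
```

Identifiers are interpolated directly for readability, so the helper should only be given names from trusted configuration. An unused replication slot keeps WAL segments from being recycled, so drop the slot if the pipeline is retired.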
Example Usage
MySQL
Connect to MySQL databases with support for SSH tunneling and multiple connection methods.
Configuration
SSH Tunnel Connection
Snowflake
Connect to a Snowflake data warehouse using password or private key authentication.
Configuration
Private Key Authentication
BigQuery
Extract data from Google BigQuery using service account credentials.
Configuration
Credentials Info (Alternative)
Redshift
Connect to Amazon Redshift using standard authentication or IAM credentials.
Configuration
IAM Authentication
MongoDB
Extract data from MongoDB with automatic schema detection.
Configuration
Connection String
MSSQL
Connect to Microsoft SQL Server with Windows or SQL Server authentication.
Configuration
Windows Authentication
Oracle Database
Extract data from Oracle databases.
Configuration
Other Databases
- Teradata
- Couchbase
- DynamoDB
- Doris
- Dremio
Replication Methods
Database sources support the following replication strategies; log-based replication is available for PostgreSQL only.
Full Table
Complete refresh of all data on each sync.
Incremental
Sync only new or updated records based on a timestamp or ID column.
Log-Based (PostgreSQL Only)
Real-time CDC using PostgreSQL logical replication.
Installation
Install database-specific dependencies.
Best Practices
- Use read-only credentials for source databases
- Enable incremental sync when possible to reduce load
- Configure connection pooling for high-volume extractions
- Test connections before scheduling pipelines
- Monitor replication lag for CDC sources
- Use environment variables for sensitive credentials
- Select specific streams to reduce data transfer
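To illustrate why incremental sync reduces load, here is a minimal sketch of the bookmark pattern behind it: each run fetches only rows whose replication key is greater than the value saved by the previous run. It is not this tool's implementation. The `incremental_sync` function, the `orders` table and its `updated_at` column are hypothetical, and an in-memory SQLite database stands in for the source.

```python
import sqlite3


def incremental_sync(conn, table, key_column, bookmark):
    """Fetch rows whose key_column is greater than the saved bookmark.

    Returns the new rows and the bookmark to persist for the next run.
    """
    # Table and column names are interpolated for brevity; they must come
    # from trusted configuration, never from user input.
    query = (
        f"SELECT * FROM {table} WHERE {key_column} > ? ORDER BY {key_column}"
    )
    cursor = conn.execute(query, (bookmark,))
    columns = [description[0] for description in cursor.description]
    rows = cursor.fetchall()
    if rows:
        # The last row has the highest key because of ORDER BY.
        bookmark = rows[-1][columns.index(key_column)]
    return rows, bookmark


def demo():
    # In-memory SQLite stands in for the source database.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, updated_at TEXT)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [(1, "2024-01-01"), (2, "2024-01-02")],
    )
    # First run: an empty bookmark matches every row.
    first, bookmark = incremental_sync(conn, "orders", "updated_at", "")
    conn.execute("INSERT INTO orders VALUES (3, '2024-01-03')")
    # Second run: only the row added after the first run is returned.
    second, bookmark = incremental_sync(conn, "orders", "updated_at", bookmark)
    conn.close()
    return first, second, bookmark
```

The strict `>` comparison can skip a row that arrives later with the same key value as the bookmark. Using `>=` and deduplicating downstream is a common alternative when timestamps are coarse.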
Next Steps
- API Sources: Connect to REST APIs and SaaS platforms
- Cloud Storage: Load data from S3, GCS, and Azure Blob Storage