
ChartsMaze EDL Pipeline

Single command. Complete market intelligence. The ChartsMaze EDL (Extract, Data, Load) Pipeline is a production-grade data aggregation platform that transforms scattered Indian stock market data into a unified, analysis-ready JSON artifact. Built on Dhan ScanX APIs, it processes 2,775+ stocks across 6 pipeline phases to deliver comprehensive fundamental, technical, and event-driven insights.

What It Does

Run one command:
python3 run_full_pipeline.py
Get one file:
all_stocks_fundamental_analysis.json.gz
Containing 86 fields per stock across:
  • Quarterly financial results (Net Profit, EPS, Sales, OPM)
  • Valuation ratios (P/E, PEG, ROE, ROCE, D/E)
  • Technical indicators (RSI, SMA/EMA status, Pivot Points)
  • Price performance (1D/1W/1M/3M/6M/1Y returns, % from 52W High/Low, ATH)
  • Volume analytics (RVOL, 20/50/100-day turnover, ADR)
  • Corporate events (Dividends, Bonus, Splits, Results)
  • Regulatory filings (Top 5 recent announcements with PDF links)
  • Real-time news feed (AI sentiment analysis)
  • Event markers (ASM/GSM surveillance, Insider Trading, Block Deals, Circuit revisions)
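As a rough sketch of consuming the output file: the snippet below reads a gzip-compressed JSON file and builds a lookup by symbol. It assumes the top-level JSON value is a list of per-stock objects with a "Symbol" key, which this page does not state explicitly; `load_pipeline_output` and `index_by_symbol` are hypothetical helper names, not part of the pipeline.

```python
import gzip
import json
from pathlib import Path


def load_pipeline_output(path):
    """Read a gzip-compressed JSON file and return the decoded object."""
    with gzip.open(Path(path), "rt", encoding="utf-8") as fh:
        return json.load(fh)


def index_by_symbol(records):
    """Build a Symbol -> record lookup.

    Assumption: `records` is a list of dicts, one per stock.
    Records without a "Symbol" key are skipped.
    """
    return {rec["Symbol"]: rec for rec in records if "Symbol" in rec}


if __name__ == "__main__":
    stocks = load_pipeline_output("all_stocks_fundamental_analysis.json.gz")
    by_symbol = index_by_symbol(stocks)
    print(len(by_symbol), "stocks loaded")
```

If the file turns out to be a dict keyed by symbol instead of a list, `load_pipeline_output` still works and `index_by_symbol` can be dropped.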

Key Features

Smart Incremental Updates

OHLCV fetches pull only new or changed records. First run: ~30-40 min. Subsequent runs: ~4-6 min.
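The idea behind an incremental fetch can be sketched as follows: find the latest stored candle date, request only what comes after it, and merge the result into the existing rows. This is an illustration, not the logic of fetch_all_ohlcv.py; the function names and the candle layout (a dict with a "date" key) are assumptions.

```python
from datetime import date, timedelta


def next_fetch_start(existing_dates, default_start):
    """Return the first date that still needs fetching.

    existing_dates: iterable of datetime.date already stored locally.
    default_start: date to use when nothing is stored yet (first run).
    """
    latest = max(existing_dates, default=None)
    if latest is None:
        return default_start
    return latest + timedelta(days=1)


def merge_candles(existing, fetched):
    """Merge candle dicts keyed by date; fetched rows replace stored ones."""
    merged = {row["date"]: row for row in existing}
    merged.update({row["date"]: row for row in fetched})
    return [merged[d] for d in sorted(merged)]


if __name__ == "__main__":
    stored = [{"date": date(2026, 3, 1), "close": 100.0}]
    start = next_fetch_start([r["date"] for r in stored], date(2000, 1, 1))
    print("fetch from", start)
```

Letting fetched rows overwrite stored ones is one way to handle the "changed records" case, for example when a previously stored candle is later corrected.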

Zero Configuration

No API keys. No database setup. Works out of the box with Dhan’s public endpoints.

Dependency-Aware

Strict 6-phase execution order ensures data integrity. master_isin_map.json, produced in Phase 1, is read by 16+ downstream scripts.

Auto-Cleanup

Keeps only the .json.gz output and the ohlcv_data/ directory after completion. Intermediate files are purged automatically.
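A cleanup step of this kind might look like the sketch below. It assumes the intermediate files are top-level *.json files in the working directory; the actual rules in run_full_pipeline.py are not shown on this page, and `purge_intermediate_json` is a hypothetical name. Directories such as ohlcv_data/ and any *.json.gz file are never touched, and the function defaults to a dry run.

```python
from pathlib import Path


def purge_intermediate_json(workdir, dry_run=True):
    """Delete top-level *.json files in workdir and return their names.

    "*.json" does not match "*.json.gz", so the compressed output survives.
    Subdirectories (for example ohlcv_data/) are not visited at all.
    With dry_run=True nothing is deleted; the names are only reported.
    """
    targets = sorted(p for p in Path(workdir).glob("*.json") if p.is_file())
    if not dry_run:
        for path in targets:
            path.unlink()
    return [p.name for p in targets]


if __name__ == "__main__":
    print("would remove:", purge_intermediate_json(".", dry_run=True))
```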

Pipeline Architecture

The pipeline executes in strict dependency order:

Phase Breakdown

PHASE 1: Core Data (Foundation)
  • fetch_dhan_data.py → Fetches 2,775 stocks, produces master_isin_map.json (critical dependency)
  • fetch_fundamental_data.py → Quarterly results & ratios for all stocks
PHASE 2: Data Enrichment (Parallel Fetching)
  • Company filings (Hybrid: LODR + Legacy endpoints)
  • Live announcements, Advanced indicators (Pivot, EMA, SMA)
  • Market news (AI sentiment), Corporate actions (History + Upcoming)
  • Surveillance lists (ASM/GSM), Circuit stocks, Bulk/Block deals
  • Price bands (Incremental + Complete)
PHASE 2.5: OHLCV Data (Smart Incremental)
  • fetch_all_ohlcv.py → Lifetime daily candles (auto-detects existing data, fetches delta only)
  • fetch_indices_ohlcv.py → Index data (specialized high-speed endpoint)
PHASE 3: Base Analysis
  • bulk_market_analyzer.py → Builds all_stocks_fundamental_analysis.json (BASE structure)
PHASE 4: Enrichment (Order Matters!)
  1. advanced_metrics_processor.py → Adds ADR, RVOL, ATH, Turnover
  2. process_earnings_performance.py → Adds post-earnings returns
  3. enrich_fno_data.py → Adds F&O flag, Lot Size, Next Expiry
  4. process_market_breadth.py → Market-wide analytics
  5. add_corporate_events.py → Adds Event Markers, Announcements, News Feed (MUST BE LAST!)
PHASE 5: Compression
  • Compress to .json.gz (typically ~10 MB, 85-90% compression ratio)
PHASE 6: Optional
  • Standalone data: Indices, ETFs (not included in master JSON)
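The ordering rules above can be sketched as a small sequential runner. This is not the code of run_full_pipeline.py: the step list is a shortened subset, and the "critical" flags are assumptions, except for fetch_dhan_data.py, which this page names as a critical dependency. `run_script` and `run_in_order` are hypothetical names.

```python
import subprocess
import sys

# (script, critical) pairs in the documented order; a subset for brevity.
# Marking bulk_market_analyzer.py as critical is an assumption.
STEPS = [
    ("fetch_dhan_data.py", True),
    ("fetch_fundamental_data.py", False),
    ("bulk_market_analyzer.py", True),
    ("advanced_metrics_processor.py", False),
    ("add_corporate_events.py", False),  # last enrichment step
]


def run_script(script):
    """Run one script with the current interpreter; True on exit code 0."""
    return subprocess.run([sys.executable, script]).returncode == 0


def run_in_order(steps, runner=run_script):
    """Run steps sequentially and stop as soon as a critical step fails."""
    results = {}
    for script, critical in steps:
        ok = runner(script)
        results[script] = ok
        if critical and not ok:
            break
    return results


if __name__ == "__main__":
    for name, ok in run_in_order(STEPS).items():
        print("OK  " if ok else "FAIL", name)
```

Passing the runner as a parameter keeps the ordering logic testable without launching real scripts.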

Output Schema

Each stock in all_stocks_fundamental_analysis.json.gz contains:

Identity & Classification

{
  "Symbol": "RELIANCE",
  "Name": "Reliance Industries Ltd.",
  "Listing Date": "29-Nov-1977",
  "Basic Industry": "Refineries",
  "Sector": "Energy",
  "Index": "NIFTY 50"
}

Fundamentals (Quarterly)

{
  "Latest Quarter": "Dec 2025",
  "Net Profit Latest Quarter": "18500.00",
  "EPS Latest Quarter": "27.30",
  "Sales Latest Quarter": "245000.00",
  "OPM Latest Quarter(%)": "12.5",
  "QoQ Sales Growth(%)": "8.2",
  "YoY Sales Growth(%)": "15.7"
}
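Note that the numeric values in the sample above are JSON strings ("18500.00"), so they need converting before any arithmetic. A minimal sketch, assuming missing values may appear as empty strings or placeholders (the page does not say how they are encoded); `to_float` is a hypothetical helper.

```python
def to_float(value):
    """Convert a numeric string such as "18500.00" to float.

    Returns None for anything that does not parse as a number
    (empty strings, placeholders, None).
    """
    try:
        return float(str(value).replace(",", "").strip())
    except ValueError:
        return None


record = {
    "Net Profit Latest Quarter": "18500.00",
    "Sales Latest Quarter": "245000.00",
}
profit = to_float(record["Net Profit Latest Quarter"])
sales = to_float(record["Sales Latest Quarter"])
net_margin_pct = 100 * profit / sales  # net profit as a share of sales
```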

Technical Indicators

{
  "RSI (14)": "62.5",
  "SMA Status": "SMA 20: Above (4.9%) | SMA 50: Above (24.1%)",
  "EMA Status": "EMA 20: Above (6.3%) | EMA 200: Above (72.6%)",
  "Technical Sentiment": "RSI: Neutral | MACD: Bearish",
  "Pivot Point": "245.50"
}
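The "SMA Status" and "EMA Status" fields pack several values into one string. The sketch below splits them into a dict, assuming every segment follows the "NAME PERIOD: Position (X%)" shape of the sample above; other shapes are silently skipped. `parse_ma_status` is a hypothetical helper.

```python
import re

# Matches segments like "SMA 20: Above (4.9%)"; an optional minus sign is allowed.
_STATUS_RE = re.compile(r"([A-Z]+ \d+): (\w+) \((-?\d+(?:\.\d+)?)%\)")


def parse_ma_status(text):
    """Map each moving average to a (position, percent) tuple."""
    return {
        name: (position, float(pct))
        for name, position, pct in _STATUS_RE.findall(text)
    }


sma = parse_ma_status("SMA 20: Above (4.9%) | SMA 50: Above (24.1%)")
```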

Volume & Liquidity

{
  "RVOL": "1.45",
  "Daily Rupee Turnover 20(Cr.)": "850.2",
  "30 Days Average Rupee Volume(Cr.)": "900.5",
  "5 Days MA ADR(%)": "2.3"
}

Event Markers

{
  "Event Markers": "📊: Results Recently Out | 💸: Dividend (15-Mar)"
}
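For alerting or filtering, the "Event Markers" string can be split into individual events. A small sketch, assuming markers are separated by "|" and each has the "icon: label" form shown in the sample; `parse_event_markers` is a hypothetical helper.

```python
def parse_event_markers(text):
    """Split an Event Markers string into (icon, label) tuples."""
    events = []
    for chunk in text.split("|"):
        chunk = chunk.strip()
        if not chunk:
            continue
        icon, sep, label = chunk.partition(":")
        if not sep:  # no colon found: treat the whole chunk as the label
            icon, label = "", chunk
        events.append((icon.strip(), label.strip()))
    return events


markers = parse_event_markers("📊: Results Recently Out | 💸: Dividend (15-Mar)")
labels = [label for _, label in markers]
```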

Recent Announcements

{
  "Recent Announcements": [
    {
      "Date": "02-Mar-2026",
      "Headline": "Outcome of Board Meeting",
      "URL": "https://nsearchives.nseindia.com/corporate/..."
    }
  ]
}

News Feed

{
  "News Feed": [
    {
      "Title": "Reliance Q3 results beat estimates",
      "Sentiment": "positive",
      "Date": "02-Mar-2026"
    }
  ]
}

Get Started

Quickstart

Run your first pipeline in 5 minutes

Pipeline Architecture

Understand the 6-phase pipeline design

API Reference

Complete endpoint documentation: endpoint details, payloads, and rate limits

System Requirements

  • Python: 3.8+
  • Dependencies: requests (only external dependency)
  • Storage: ~500 MB for full OHLCV history, ~10 MB for compressed output
  • Network: Stable internet (uses Dhan public APIs, no auth required)
  • Runtime:
    • First run (with OHLCV): ~30-40 minutes
    • Incremental runs: ~4-6 minutes
    • Without OHLCV: ~3-4 minutes

Real-World Performance

Typical output from a full pipeline run:
═══════════════════════════════════════════════════════════
  PIPELINE COMPLETE
═══════════════════════════════════════════════════════════
  Total Time:  245.3s (4.1 min)
  Successful:  18/18
  Failed:      0/18

  📄 Output: all_stocks_fundamental_analysis.json.gz (9.2 MB)
  📦 Compression: 68.5 MB → 9.2 MB (87% smaller)
  🧹 Only .json.gz + ohlcv_data/ remain. All intermediate data purged.
═══════════════════════════════════════════════════════════
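The "87% smaller" figure in the log follows from the two sizes: 1 - 9.2 / 68.5 ≈ 0.866. The sketch below shows that arithmetic together with a gzip round trip using the standard library; `compress_json` and `percent_smaller` are hypothetical names, and the pipeline's own compression code may differ.

```python
import gzip
import json


def compress_json(obj):
    """Serialize obj to JSON and gzip it; return (raw_bytes, gzip_bytes)."""
    raw = json.dumps(obj).encode("utf-8")
    return raw, gzip.compress(raw)


def percent_smaller(original_size, compressed_size):
    """Size reduction as a whole-number percentage of the original size."""
    return round(100 * (1 - compressed_size / original_size))


# Sizes taken from the sample log above (in MB).
reduction = percent_smaller(68.5, 9.2)  # 87
```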

Use Cases

  • Screeners: Build custom stock screeners with 86+ filter fields
  • Backtesting: Historical OHLCV + fundamentals for strategy validation
  • Alerts: Event-driven triggers (Results, Dividends, ASM additions)
  • Dashboards: Real-time market breadth, sector analytics
  • Research: Correlate corporate actions with price performance
  • Compliance: Track surveillance list changes, insider trading events
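As an illustration of the screener use case, the sketch below filters records on two of the documented fields, "RVOL" and "RSI (14)". It assumes each stock is a flat dict with string values, as in the schema samples; the thresholds, the sample symbols, and the function names are hypothetical.

```python
def _num(value):
    """Parse a numeric string; None when the value is missing or malformed."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def screen(records, min_rvol=1.5, max_rsi=70.0):
    """Return symbols with RVOL >= min_rvol and RSI (14) <= max_rsi."""
    hits = []
    for rec in records:
        rvol = _num(rec.get("RVOL"))
        rsi = _num(rec.get("RSI (14)"))
        if rvol is None or rsi is None:
            continue  # skip records with unusable data
        if rvol >= min_rvol and rsi <= max_rsi:
            hits.append(rec.get("Symbol"))
    return hits


# Made-up sample records, for illustration only.
sample = [
    {"Symbol": "AAA", "RVOL": "2.10", "RSI (14)": "62.5"},
    {"Symbol": "BBB", "RVOL": "0.80", "RSI (14)": "55.0"},
    {"Symbol": "CCC", "RVOL": "3.00", "RSI (14)": "81.2"},
    {"Symbol": "DDD", "RVOL": "", "RSI (14)": "40.0"},
]
matches = screen(sample)
```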

Next Steps

1. Run the Pipeline
   Follow the Quickstart to execute your first full data refresh.
2. Explore the Data
   Use single_stock_analyzer.py to inspect individual stock records.
3. Customize
   Adjust configuration flags (FETCH_OHLCV, CLEANUP_INTERMEDIATE) in run_full_pipeline.py.
4. Integrate
   Load the .json.gz into your analytics platform, database, or visualization tool.

No API Keys Required: All endpoints use Dhan’s public APIs. No authentication and no rate limits (respectful usage assumed).

Dependency Order is Critical: fetch_dhan_data.py MUST succeed before any Phase 2 script runs, and add_corporate_events.py MUST be the last enrichment script. run_full_pipeline.py enforces this order automatically.
