Introduction to Retto
Retto is a high-performance OCR (Optical Character Recognition) SDK built in Rust that provides PaddleOCR inference capabilities with WebAssembly support. It enables fast, accurate text detection and recognition across multiple platforms including native Rust applications, command-line tools, and web browsers.
What is Retto?
Retto is a complete OCR solution that implements the PaddleOCR v4 pipeline, consisting of three stages:
- Text Detection - Locates text regions in images
- Text Classification - Determines text orientation (0°, 90°, 180°, 270°)
- Text Recognition - Recognizes the actual text content
The project is organized into three components:
- retto-core - Core Rust library with processor implementations
- retto-cli - Command-line tool for batch OCR processing
- retto-wasm - WebAssembly package for browser-based OCR
Why Use Retto?
High Performance
Built with Rust for maximum performance and memory safety. Utilizes ONNX Runtime for optimized inference with CPU, CUDA, and DirectML support.
Cross-Platform
Run OCR natively on desktop, in command-line tools, or directly in web browsers via WebAssembly - all from the same codebase.
Multiple Backends
Supports multiple execution providers including CPU, CUDA (NVIDIA GPUs), and DirectML (Windows) for optimal performance on different hardware.
Flexible Model Loading
Load models from local files, memory buffers, or automatically download from Hugging Face Hub.
Key Features
- PaddleOCR v4 Models - Implements the PaddleOCR v4 pipeline with detection, classification, and recognition
- ONNX Runtime Integration - Leverages ONNX Runtime for efficient model inference
- Streaming Results - Process OCR stages asynchronously with callback support
- Batch Processing - Process multiple images efficiently with parallel processing
- Type Safety - Full Rust type safety with comprehensive error handling
- Serialization Support - Built-in JSON serialization for easy integration
- Hardware Acceleration - Optional CUDA and DirectML support for GPU acceleration
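To make the "Streaming Results" feature above more concrete, here is a minimal sketch of stage-by-stage reporting through a callback. Everything in it is hypothetical: `StageEvent` and `run_pipeline` are illustrative names, the stages are placeholders, and the code runs synchronously. It does not reflect Retto's actual API.

```rust
// Hypothetical event type for illustration only; NOT Retto's real API.
#[derive(Debug, Clone, PartialEq)]
enum StageEvent {
    Detected { regions: usize },
    Classified { rotated: usize },
    Recognized { text: String },
}

/// Runs three placeholder stages and reports each result through `on_event`
/// as soon as that stage finishes, instead of returning everything at the end.
/// `lines` stands in for the text a real recognizer would produce.
fn run_pipeline<F: FnMut(StageEvent)>(lines: &[&str], mut on_event: F) {
    // Placeholder detection: pretend one region was found per input line.
    on_event(StageEvent::Detected { regions: lines.len() });

    // Placeholder classification: pretend no region needed rotation.
    on_event(StageEvent::Classified { rotated: 0 });

    // Placeholder recognition: emit one event per region.
    for line in lines {
        on_event(StageEvent::Recognized { text: line.to_string() });
    }
}
```

A caller could pass a closure such as `|event| println!("{event:?}")` to react to each stage as it completes, for example to draw bounding boxes before the recognized text is available.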
Architecture Overview
Retto processes images through a pipeline architecture:
- Image Helper - Handles image loading, resizing, and preprocessing
- Detection Processor - Uses a CNN model to detect text bounding boxes
- Classification Processor - Determines text orientation for each detected region
- Recognition Processor - Converts text regions into actual text using CTC decoding
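The CTC decoding mentioned in the last step can be illustrated with a greedy decoder. This is a generic sketch of the technique, not Retto's implementation. It assumes the model output is one score vector per time step, that class index 0 is the CTC blank, and that class `i` maps to `alphabet[i - 1]`. The function name and signature are hypothetical.

```rust
/// Index of the CTC blank class (an assumption of this sketch).
const BLANK: usize = 0;

/// Greedy CTC decoding: take the best-scoring class at each time step,
/// collapse consecutive repeats, then drop blanks.
fn ctc_greedy_decode(logits: &[Vec<f32>], alphabet: &[char]) -> String {
    let mut out = String::new();
    let mut prev: Option<usize> = None;

    for step in logits {
        // Argmax over the class scores of this time step.
        let best = step
            .iter()
            .enumerate()
            .max_by(|a, b| a.1.total_cmp(b.1))
            .map(|(i, _)| i);

        // Skip time steps that carry no scores at all.
        let Some(best) = best else { continue };

        // Emit a character only when the class changes and is not the blank.
        if Some(best) != prev && best != BLANK {
            // Classes are shifted by one because index 0 is the blank.
            if let Some(&ch) = alphabet.get(best - 1) {
                out.push(ch);
            }
        }
        prev = Some(best);
    }
    out
}
```

With the alphabet `['a', 'b']` and time steps whose best classes are a, a, blank, a, b, b, the decoder returns "aab": the repeated "a" collapses, while the blank separates the two distinct "a" characters.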
Session-Based API
Retto uses a session-based design in which you create a RettoSession with your desired configuration.
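As a rough illustration of the session pattern in general, the sketch below builds a session once from a configuration and reuses it for several inputs. All names (`SketchSession`, `SketchConfig`, `Device`, `max_side_len`) are hypothetical placeholders and do not describe the real RettoSession API; consult the quickstart guide for actual usage.

```rust
// Hypothetical types for illustration only; NOT Retto's real API.
#[derive(Debug, Clone, PartialEq)]
enum Device {
    Cpu,
    Cuda(u32),
    DirectMl(u32),
}

#[derive(Debug, Clone)]
struct SketchConfig {
    device: Device,
    max_side_len: u32,
}

impl Default for SketchConfig {
    fn default() -> Self {
        Self { device: Device::Cpu, max_side_len: 960 }
    }
}

struct SketchSession {
    config: SketchConfig,
    runs: usize,
}

impl SketchSession {
    /// In a real session, expensive setup (loading models, choosing an
    /// execution provider) would happen once here.
    fn new(config: SketchConfig) -> Self {
        Self { config, runs: 0 }
    }

    /// Each call reuses the state built in `new`. The body is a placeholder
    /// that only validates the input and counts successful runs.
    fn run(&mut self, image: &[u8]) -> Result<String, String> {
        if image.is_empty() {
            return Err("empty image".to_string());
        }
        self.runs += 1;
        Ok(format!(
            "processed {} bytes on {:?} (max side {})",
            image.len(),
            self.config.device,
            self.config.max_side_len
        ))
    }
}
```

The point of the pattern is that configuration and setup cost are paid once in `new`, while `run` can be called repeatedly on the same session.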
Use Cases
Document Digitization
Extract text from scanned documents, receipts, and forms for archival and searchability.
Web Applications
Build browser-based OCR tools without server-side processing using the WebAssembly package.
Batch Processing
Process large volumes of images efficiently using the CLI tool with parallel processing.
Real-time OCR
Integrate OCR capabilities into desktop applications with low-latency inference.
Getting Started
Choose Your Platform
Decide whether you need the Rust library, CLI tool, or WebAssembly package based on your use case.
Install Retto
Follow the installation guide to add Retto to your project.
Run Your First OCR
Check out the quickstart guide to run your first OCR in minutes.
Retto is currently in active development (v0.1.5). The API may change in future versions.
License
Retto is licensed under the Apache License 2.0, making it suitable for both open source and commercial projects.
Next Steps
Installation
Learn how to install Retto in your project
Quickstart
Run your first OCR in minutes
