Overview
FFmpeg is a collection of libraries and tools designed to process multimedia content including audio, video, subtitles, and related metadata. The architecture is modular, with each library serving a specific purpose in the multimedia processing pipeline.

Core Libraries
FFmpeg consists of seven core libraries that work together to provide comprehensive multimedia processing capabilities:

libavcodec
Provides implementations of a wide range of codecs for encoding and decoding audio, video, and subtitle streams. Key responsibilities:
- Codec implementation (encoders and decoders)
- Bitstream parsing
- Frame encoding/decoding
- Packet handling

Common use cases:
- Decoding H.264/H.265 video streams
- Encoding audio to AAC or MP3
- Converting between codec formats
libavformat
Implements streaming protocols, container formats, and basic I/O access. Key responsibilities:
- Demuxing (reading) container formats
- Muxing (writing) container formats
- Protocol handling (file, HTTP, RTSP, etc.)
- Stream management

Common use cases:
- Reading MP4, MKV, AVI files
- Writing output to various containers
- Streaming over network protocols
libavutil
Includes hashers, decompressors, and miscellaneous utility functions. Key responsibilities:
- Common data structures (AVFrame, AVBuffer)
- Mathematical utilities
- Memory management helpers
- Logging and error handling
- Pixel format definitions
- Color space utilities
libavfilter
Provides a framework to alter decoded audio and video through a directed graph of connected filters. Key responsibilities:
- Filter graph management
- Audio and video filtering
- Filter chain processing
- Format conversion

Common use cases:
- Scaling and cropping video
- Audio resampling
- Overlay and watermarking
- Complex filter chains
libavdevice
Provides an abstraction to access capture and playback devices. Key responsibilities:
- Device input/output abstraction
- Screen capture
- Camera access
- Audio device handling
libswresample
Implements audio mixing and resampling routines. Key responsibilities:
- Sample format conversion
- Channel layout conversion
- Sample rate conversion
- Audio mixing
libswscale
Implements color conversion and scaling routines. Key responsibilities:
- Image scaling
- Pixel format conversion
- Color space conversion
- Chroma upsampling/downsampling
Component Relationships
All libraries depend on libavutil for common utilities. libavformat builds on libavcodec, libavdevice builds on libavformat, and libavfilter draws on libswscale and libswresample for format conversion.
Data Flow
Decoding Pipeline
The typical flow for decoding multimedia content:
- Input → libavformat opens input file/stream
- Demux → libavformat demuxes packets from containers
- Decode → libavcodec decodes packets into frames
- Filter → (Optional) libavfilter processes frames
- Output → Frames are ready for display or further processing
Encoding Pipeline
The typical flow for encoding multimedia content:
- Input → Raw frames from source
- Filter → (Optional) libavfilter processes frames
- Encode → libavcodec encodes frames into packets
- Mux → libavformat muxes packets into container
- Output → libavformat writes to file/stream
Transcoding Pipeline
Combining both decoding and encoding: demux → decode → (optional filtering) → encode → mux, with packets converted to frames on the way in and back to packets on the way out.

Key Data Structures
AVPacket
Represents compressed data (encoded audio/video/subtitle).
- Contains compressed bitstream data
- Used between demuxer and decoder (or encoder and muxer)
- Includes timing information (PTS, DTS)
- May contain side data (metadata, subtitles, etc.)
AVFrame
Represents uncompressed data (raw audio/video frames).
- Contains decoded pixel/sample data
- Used between decoder and filters (or filters and encoder)
- Includes metadata (format, dimensions, sample rate)
- Reference-counted for efficient memory management
AVCodecContext
Context for encoding or decoding operations.
- Holds codec-specific configuration
- Maintains encoder/decoder state
- Contains options for quality, bitrate, etc.
AVFormatContext
Context for format (muxer/demuxer) operations.
- Represents input or output file/stream
- Contains stream information
- Manages I/O operations
- Holds format-specific metadata
Threading Model
FFmpeg supports multiple threading strategies for improved performance:

Frame Threading
- Decodes multiple frames in parallel
- Each frame is processed by a separate thread
- Adds decoding delay (one frame per thread)
- Best for codecs whose frames can be decoded independently
Slice Threading
- Splits a single frame into slices
- Each slice is decoded by a separate thread
- No additional latency
- Requires codec support for slices
For detailed information about threading implementation, see the Multithreading guide.
Memory Management
FFmpeg uses reference counting for efficient memory management:
- AVBuffer - Reference-counted buffer system
- AVFrame - Automatically reference-counted
- AVPacket - Reference-counted packet data

Benefits:
- Zero-copy operations where possible
- Safe sharing of data between components
- Automatic memory cleanup
Best Practices
Use Reference Counting
Leverage AVBuffer reference counting to avoid unnecessary data copies
Check Return Values
Always check return values from FFmpeg functions for error handling
Free Resources
Properly free all allocated contexts and frames to prevent memory leaks
Thread Safety
Ensure callbacks are thread-safe when using frame threading
Common Patterns
Opening an Input File
Decoding a Frame
Encoding a Frame
Additional Resources
API Changes
Track API changes and version compatibility
Optimization
Performance optimization techniques
Multithreading
Threading models and best practices
Examples
Official FFmpeg code examples