Overview

FFmpeg is a collection of libraries and tools designed to process multimedia content including audio, video, subtitles, and related metadata. The architecture is modular, with each library serving a specific purpose in the multimedia processing pipeline.

Core Libraries

FFmpeg consists of seven core libraries that work together to provide comprehensive multimedia processing capabilities:

libavcodec

Provides implementations of a wide range of codecs for encoding and decoding audio, video, and subtitle streams. Key responsibilities:
  • Codec implementation (encoders and decoders)
  • Bitstream parsing
  • Frame encoding/decoding
  • Packet handling
Common use cases:
  • Decoding H.264/H.265 video streams
  • Encoding audio to AAC or MP3
  • Converting between codec formats

libavformat

Implements streaming protocols, container formats, and basic I/O access. Key responsibilities:
  • Demuxing (reading) container formats
  • Muxing (writing) container formats
  • Protocol handling (file, HTTP, RTSP, etc.)
  • Stream management
Common use cases:
  • Reading MP4, MKV, AVI files
  • Writing output to various containers
  • Streaming over network protocols

libavutil

Provides utility code shared by the other FFmpeg libraries, including hashers, decompressors, and miscellaneous helper functions. Key responsibilities:
  • Common data structures (AVFrame, AVBufferRef, AVDictionary)
  • Mathematical utilities
  • Memory management helpers
  • Logging and error handling
  • Pixel format definitions
  • Color space utilities

libavfilter

Provides a framework to alter decoded audio and video through a directed graph of connected filters. Key responsibilities:
  • Filter graph management
  • Audio and video filtering
  • Filter chain processing
  • Format conversion
Common use cases:
  • Scaling and cropping video
  • Audio resampling
  • Overlay and watermarking
  • Complex filter chains

libavdevice

Provides an abstraction to access capture and playback devices. Key responsibilities:
  • Device input/output abstraction
  • Screen capture
  • Camera access
  • Audio device handling

libswresample

Implements audio mixing and resampling routines. Key responsibilities:
  • Sample format conversion
  • Channel layout conversion
  • Sample rate conversion
  • Audio mixing

libswscale

Implements color conversion and scaling routines. Key responsibilities:
  • Image scaling
  • Pixel format conversion
  • Color space conversion
  • Chroma upsampling/downsampling

Component Relationships

┌─────────────────────────────────────────────────────────┐
│                     FFmpeg Tools                         │
│           (ffmpeg, ffplay, ffprobe)                     │
└─────────────────────────────────────────────────────────┘

                             ▼
┌─────────────────────────────────────────────────────────┐
│                    libavformat                          │
│         (Container format muxing/demuxing)              │
└─────────────────────────────────────────────────────────┘
          │                                    │
          ▼                                    ▼
┌──────────────────────┐            ┌──────────────────────┐
│    libavcodec        │            │   libavfilter        │
│ (Encoding/Decoding)  │◄──────────►│  (Filter graphs)     │
└──────────────────────┘            └──────────────────────┘
          │                                    │
          └────────────────┬───────────────────┘
                           ▼
              ┌────────────────────────┐
              │      libavutil         │
              │  (Common utilities)    │
              └────────────────────────┘

          ┌───────────────┴───────────────┐
          ▼                               ▼
┌──────────────────┐          ┌──────────────────┐
│  libswresample   │          │   libswscale     │
│ (Audio resample) │          │ (Video scaling)  │
└──────────────────┘          └──────────────────┘
          │                               │
          └───────────────┬───────────────┘
                          ▼
              ┌────────────────────────┐
              │     libavdevice        │
              │   (Device I/O)         │
              └────────────────────────┘

Data Flow

Decoding Pipeline

The typical flow for decoding multimedia content:
  1. Input → libavformat opens input file/stream
  2. Demux → libavformat demuxes packets from containers
  3. Decode → libavcodec decodes packets into frames
  4. Filter → (Optional) libavfilter processes frames
  5. Output → Frames are ready for display or further processing
Input File/Stream

       ▼
┌─────────────┐
│ libavformat │  ← Demuxer
│   (open)    │
└─────────────┘

       ▼ (AVPacket)
┌─────────────┐
│ libavcodec  │  ← Decoder
│  (decode)   │
└─────────────┘

       ▼ (AVFrame)
┌─────────────┐
│ libavfilter │  ← Optional filtering
│  (process)  │
└─────────────┘

       ▼ (Processed AVFrame)
    Output

Encoding Pipeline

The typical flow for encoding multimedia content:
  1. Input → Raw frames from source
  2. Filter → (Optional) libavfilter processes frames
  3. Encode → libavcodec encodes frames into packets
  4. Mux → libavformat muxes packets into container
  5. Output → libavformat writes to file/stream
Input Frames

       ▼ (AVFrame)
┌─────────────┐
│ libavfilter │  ← Optional filtering
│  (process)  │
└─────────────┘

       ▼ (AVFrame)
┌─────────────┐
│ libavcodec  │  ← Encoder
│  (encode)   │
└─────────────┘

       ▼ (AVPacket)
┌─────────────┐
│ libavformat │  ← Muxer
│   (write)   │
└─────────────┘

       ▼
Output File/Stream

Transcoding Pipeline

Combining both decoding and encoding:
Input → Demux → Decode → Filter → Encode → Mux → Output
        (fmt)   (codec)  (filter) (codec)  (fmt)

Key Data Structures

AVPacket

Represents compressed data (encoded audio/video/subtitle).
  • Contains compressed bitstream data
  • Used between demuxer and decoder (or encoder and muxer)
  • Includes timing information (PTS, DTS)
  • May contain side data (metadata, subtitles, etc.)

AVFrame

Represents uncompressed data (raw audio/video frames).
  • Contains decoded pixel/sample data
  • Used between decoder and filters (or filters and encoder)
  • Includes metadata (format, dimensions, sample rate)
  • Reference-counted for efficient memory management

AVCodecContext

Context for encoding or decoding operations.
  • Holds codec-specific configuration
  • Maintains encoder/decoder state
  • Contains options for quality, bitrate, etc.

AVFormatContext

Context for format (muxer/demuxer) operations.
  • Represents input or output file/stream
  • Contains stream information
  • Manages I/O operations
  • Holds format-specific metadata

Threading Model

FFmpeg supports multiple threading strategies for improved performance:

Frame Threading

  • Decodes multiple frames in parallel
  • Each frame is processed by a separate thread
  • Adds latency (decode output is delayed by one frame per thread)
  • Requires codec support for frame threading

Slice Threading

  • Splits a single frame into slices
  • Each slice is decoded by a separate thread
  • No additional latency
  • Requires codec support for slices
For detailed information about threading implementation, see the Multithreading guide.

Memory Management

FFmpeg uses reference counting for efficient memory management:
  • AVBuffer - Reference-counted buffer system
  • AVFrame - Automatically reference-counted
  • AVPacket - Reference-counted packet data
This approach allows:
  • Zero-copy operations where possible
  • Safe sharing of data between components
  • Automatic memory cleanup

Best Practices

Use Reference Counting

Leverage AVBuffer reference counting to avoid unnecessary data copies

Check Return Values

Always check return values from FFmpeg functions for error handling

Free Resources

Properly free all allocated contexts and frames to prevent memory leaks

Thread Safety

Ensure callbacks are thread-safe when using frame threading

Common Patterns

Opening an Input File

AVFormatContext *fmt_ctx = NULL;
if (avformat_open_input(&fmt_ctx, filename, NULL, NULL) < 0)
    return;  /* could not open input */
if (avformat_find_stream_info(fmt_ctx, NULL) < 0)
    return;  /* could not read stream information */

Decoding a Frame

AVPacket *pkt = av_packet_alloc();
AVFrame *frame = av_frame_alloc();

while (av_read_frame(fmt_ctx, pkt) >= 0) {
    avcodec_send_packet(codec_ctx, pkt);
    while (avcodec_receive_frame(codec_ctx, frame) == 0) {
        /* frame now holds decoded data */
    }
    av_packet_unref(pkt);
}

Encoding a Frame

avcodec_send_frame(codec_ctx, frame);
while (avcodec_receive_packet(codec_ctx, pkt) == 0) {
    av_interleaved_write_frame(fmt_ctx, pkt);
    av_packet_unref(pkt);
}

Additional Resources

API Changes

Track API changes and version compatibility

Optimization

Performance optimization techniques

Multithreading

Threading models and best practices

Examples

Official FFmpeg code examples
