Overview

FFmpeg is a collection of libraries and tools designed to process multimedia content including audio, video, subtitles, and related metadata. The architecture is modular, with each library serving a specific purpose in the multimedia processing pipeline.

Core Libraries

FFmpeg consists of seven core libraries that work together to provide comprehensive multimedia processing capabilities:

libavcodec

Provides implementations of a wide range of codecs for encoding and decoding audio, video, and subtitle streams. Key responsibilities:
  • Codec implementation (encoders and decoders)
  • Bitstream parsing
  • Frame encoding/decoding
  • Packet handling
Common use cases:
  • Decoding H.264/H.265 video streams
  • Encoding audio to AAC or MP3
  • Converting between codec formats

libavformat

Implements streaming protocols, container formats, and basic I/O access. Key responsibilities:
  • Demuxing (reading) container formats
  • Muxing (writing) container formats
  • Protocol handling (file, HTTP, RTSP, etc.)
  • Stream management
Common use cases:
  • Reading MP4, MKV, AVI files
  • Writing output to various containers
  • Streaming over network protocols

libavutil

Provides utility code shared by the other FFmpeg libraries, including hashers, decompressors, and miscellaneous helper functions. Key responsibilities:
  • Common data structures (AVFrame, AVBufferRef, AVDictionary)
  • Mathematical utilities
  • Memory management helpers
  • Logging and error handling
  • Pixel format definitions
  • Color space utilities

libavfilter

Provides a framework to alter decoded audio and video through a directed graph of connected filters. Key responsibilities:
  • Filter graph management
  • Audio and video filtering
  • Filter chain processing
  • Format conversion
Common use cases:
  • Scaling and cropping video
  • Audio resampling
  • Overlay and watermarking
  • Complex filter chains

libavdevice

Provides an abstraction to access capture and playback devices. Key responsibilities:
  • Device input/output abstraction
  • Screen capture
  • Camera access
  • Audio device handling

libswresample

Implements audio mixing and resampling routines. Key responsibilities:
  • Sample format conversion
  • Channel layout conversion
  • Sample rate conversion
  • Audio mixing

libswscale

Implements color conversion and scaling routines. Key responsibilities:
  • Image scaling
  • Pixel format conversion
  • Color space conversion
  • Chroma upsampling/downsampling

Component Relationships

┌─────────────────────────────────────────────────────────┐
│                     FFmpeg Tools                         │
│           (ffmpeg, ffplay, ffprobe)                     │
└─────────────────────────────────────────────────────────┘

                             ▼
┌─────────────────────────────────────────────────────────┐
│                    libavformat                          │
│         (Container format muxing/demuxing)              │
└─────────────────────────────────────────────────────────┘
          │                                    │
          ▼                                    ▼
┌──────────────────────┐            ┌──────────────────────┐
│    libavcodec        │            │   libavfilter        │
│ (Encoding/Decoding)  │◄──────────►│  (Filter graphs)     │
└──────────────────────┘            └──────────────────────┘
          │                                    │
          └────────────────┬───────────────────┘
                           ▼
              ┌────────────────────────┐
              │      libavutil         │
              │  (Common utilities)    │
              └────────────────────────┘

          ┌───────────────┴───────────────┐
          ▼                               ▼
┌──────────────────┐          ┌──────────────────┐
│  libswresample   │          │   libswscale     │
│ (Audio resample) │          │ (Video scaling)  │
└──────────────────┘          └──────────────────┘
          │                               │
          └───────────────┬───────────────┘
                          ▼
              ┌────────────────────────┐
              │     libavdevice        │
              │   (Device I/O)         │
              └────────────────────────┘

Data Flow

Decoding Pipeline

The typical flow for decoding multimedia content:
  1. Input → libavformat opens input file/stream
  2. Demux → libavformat demuxes packets from containers
  3. Decode → libavcodec decodes packets into frames
  4. Filter → (Optional) libavfilter processes frames
  5. Output → Frames are ready for display or further processing
Input File/Stream

       ▼
┌─────────────┐
│ libavformat │  ← Demuxer
│   (open)    │
└─────────────┘

       ▼ (AVPacket)
┌─────────────┐
│ libavcodec  │  ← Decoder
│  (decode)   │
└─────────────┘

       ▼ (AVFrame)
┌─────────────┐
│ libavfilter │  ← Optional filtering
│  (process)  │
└─────────────┘

       ▼ (Processed AVFrame)
    Output

Encoding Pipeline

The typical flow for encoding multimedia content:
  1. Input → Raw frames from source
  2. Filter → (Optional) libavfilter processes frames
  3. Encode → libavcodec encodes frames into packets
  4. Mux → libavformat muxes packets into container
  5. Output → libavformat writes to file/stream
Input Frames

       ▼ (AVFrame)
┌─────────────┐
│ libavfilter │  ← Optional filtering
│  (process)  │
└─────────────┘

       ▼ (AVFrame)
┌─────────────┐
│ libavcodec  │  ← Encoder
│  (encode)   │
└─────────────┘

       ▼ (AVPacket)
┌─────────────┐
│ libavformat │  ← Muxer
│   (write)   │
└─────────────┘

       ▼
Output File/Stream

Transcoding Pipeline

Combining both decoding and encoding:
Input → Demux → Decode → Filter → Encode → Mux → Output
        (fmt)   (codec)  (filter) (codec)  (fmt)

Key Data Structures

AVPacket

Represents compressed data (encoded audio/video/subtitle).
  • Contains compressed bitstream data
  • Used between demuxer and decoder (or encoder and muxer)
  • Includes timing information (PTS, DTS)
  • May contain side data (metadata, subtitles, etc.)

AVFrame

Represents uncompressed data (raw audio/video frames).
  • Contains decoded pixel/sample data
  • Used between decoder and filters (or filters and encoder)
  • Includes metadata (format, dimensions, sample rate)
  • Reference-counted for efficient memory management

AVCodecContext

Context for encoding or decoding operations.
  • Holds codec-specific configuration
  • Maintains encoder/decoder state
  • Contains options for quality, bitrate, etc.

AVFormatContext

Context for format (muxer/demuxer) operations.
  • Represents input or output file/stream
  • Contains stream information
  • Manages I/O operations
  • Holds format-specific metadata

Threading Model

FFmpeg supports multiple threading strategies for improved performance:

Frame Threading

  • Decodes multiple frames in parallel
  • Each frame is processed by a separate thread
  • Adds latency (decode output is delayed by one frame per thread)
  • Requires codec support for frame threading

Slice Threading

  • Splits a single frame into slices
  • Each slice is decoded by a separate thread
  • No additional latency
  • Requires codec support for slices
For detailed information about threading implementation, see the Multithreading guide.

Memory Management

FFmpeg uses reference counting for efficient memory management:
  • AVBuffer - Reference-counted buffer system
  • AVFrame - Automatically reference-counted
  • AVPacket - Reference-counted packet data
This approach allows:
  • Zero-copy operations where possible
  • Safe sharing of data between components
  • Automatic memory cleanup

Best Practices

Use Reference Counting

Leverage AVBuffer reference counting to avoid unnecessary data copies

Check Return Values

Always check return values from FFmpeg functions for error handling

Free Resources

Properly free all allocated contexts and frames to prevent memory leaks

Thread Safety

Ensure callbacks are thread-safe when using frame threading

Common Patterns

Opening an Input File

AVFormatContext *fmt_ctx = NULL;
if (avformat_open_input(&fmt_ctx, filename, NULL, NULL) < 0)
    return;  /* could not open input */
if (avformat_find_stream_info(fmt_ctx, NULL) < 0)
    return;  /* could not read stream information */

Decoding a Frame

AVPacket *pkt = av_packet_alloc();
AVFrame *frame = av_frame_alloc();

while (av_read_frame(fmt_ctx, pkt) >= 0) {
    avcodec_send_packet(codec_ctx, pkt);
    while (avcodec_receive_frame(codec_ctx, frame) == 0) {
        /* frame now holds decoded data */
    }
    av_packet_unref(pkt);
}

Encoding a Frame

avcodec_send_frame(codec_ctx, frame);
while (avcodec_receive_packet(codec_ctx, pkt) == 0) {
    av_interleaved_write_frame(fmt_ctx, pkt);
    av_packet_unref(pkt);
}

Additional Resources

API Changes

Track API changes and version compatibility

Optimization

Performance optimization techniques

Multithreading

Threading models and best practices

Examples

Official FFmpeg code examples
