Overview
SRVGGNetCompact is a compact VGG-style network architecture designed for efficient super-resolution. It upsamples only in the last layer and performs no convolution in the high-resolution (HR) feature space, so every convolution runs at the input resolution. This architecture is used in the lightweight Real-ESRGAN models such as realesr-animevideov3 and realesr-general-x4v3.
Class Definition
Parameters
num_in_ch: Number of input channels. Typically 3 for RGB images.
num_out_ch: Number of output channels. Typically 3 for RGB images.
num_feat: Number of feature channels in intermediate layers. Higher values increase model capacity but also computational cost.
num_conv: Number of convolutional layers in the body network. More layers allow the model to learn more complex patterns.
upscale: Upsampling factor for super-resolution. Common values are 2, 4, or 8.
act_type: Activation function type. Options:
- 'relu': ReLU activation
- 'prelu': Parametric ReLU (default, learns activation parameters)
- 'leakyrelu': Leaky ReLU with negative slope of 0.1
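To make the difference between the three options concrete, here is a minimal scalar sketch in plain Python. The helper names (`relu`, `leaky_relu`, `prelu`, `apply_activation`) are hypothetical and are not part of the library. The PReLU slope of 0.25 is only the initial value PyTorch uses by default; in a real network it is learned during training.

```python
def relu(x):
    # Negative inputs are clamped to zero.
    return max(0.0, x)


def leaky_relu(x, negative_slope=0.1):
    # Negative inputs are scaled by a fixed slope (0.1, as listed above).
    return x if x >= 0 else negative_slope * x


def prelu(x, weight=0.25):
    # Same shape as leaky ReLU, but `weight` is a learnable parameter.
    # 0.25 is an assumed starting value (PyTorch's default init).
    return x if x >= 0 else weight * x


# Hypothetical lookup table keyed by the option strings listed above.
ACTIVATIONS = {"relu": relu, "prelu": prelu, "leakyrelu": leaky_relu}


def apply_activation(act_type, x):
    return ACTIVATIONS[act_type](x)
```

All three behave identically for non-negative inputs; they differ only in how much of a negative input they let through.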
Architecture Details
The network consists of:
- Initial convolution: 3×3 conv layer that expands input channels to num_feat channels
- Body network: num_conv layers of 3×3 convolutions with activation functions
- Final convolution: Maps features to output space (channels = num_out_ch × upscale²)
- Pixel shuffle upsampler: Rearranges feature maps to produce high-resolution output
- Residual connection: Adds nearest-neighbor upsampled input to the network output
Because of the residual connection, the network learns only the residual detail that nearest-neighbor upsampling cannot provide, rather than the full high-resolution image, which helps with training stability and performance.
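The last two steps (pixel shuffle and the residual addition) can be illustrated without any framework. The following is a small sketch on nested Python lists, assuming the channel ordering used by PyTorch's `PixelShuffle`; the function names are hypothetical and chosen only for this example.

```python
def pixel_shuffle(x, r):
    """Rearrange C*r*r channels of size H x W into C channels of size (H*r) x (W*r)."""
    c_out = len(x) // (r * r)
    h, w = len(x[0]), len(x[0][0])
    out = [[[0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):          # row offset inside each r x r output block
            for j in range(r):      # column offset inside each r x r output block
                src = x[c * r * r + i * r + j]
                for row in range(h):
                    for col in range(w):
                        out[c][row * r + i][col * r + j] = src[row][col]
    return out


def nearest_upsample(x, r):
    """Repeat every pixel r times along both height and width."""
    return [
        [[row[col // r] for col in range(len(row) * r)] for row in ch for _ in range(r)]
        for ch in x
    ]


def add(a, b):
    """Element-wise sum of two equally shaped C x H x W nested lists."""
    return [
        [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(ca, cb)]
        for ca, cb in zip(a, b)
    ]


# One input pixel, one output channel, upscale = 2:
# the final convolution would produce 1 * 2**2 = 4 channels.
features = [[[1]], [[2]], [[3]], [[4]]]
low_res_input = [[[10]]]
high_res = add(pixel_shuffle(features, 2), nearest_upsample(low_res_input, 2))
```

Here each of the four feature channels supplies one pixel of the 2×2 output block, and the upsampled input contributes the same base value to all four positions.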
Model Configurations
realesr-animevideov3 (XS size)
realesr-general-x4v3 (S size)
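As a rough way to compare the two sizes, the sketch below counts learnable parameters from the layer list in Architecture Details. The configuration values are assumptions and should be checked against the Real-ESRGAN repository: `num_feat=64`, `upscale=4`, PReLU with one slope per feature channel, and `num_conv=16` for the XS model versus `num_conv=32` for the S model. `count_params` is a hypothetical helper, not a library function.

```python
def count_params(num_in_ch=3, num_out_ch=3, num_feat=64, num_conv=16,
                 upscale=4, prelu=True):
    def conv3x3(c_in, c_out):
        # 3x3 kernel weights plus one bias per output channel
        return c_in * c_out * 9 + c_out

    # Assumption: PReLU keeps one learnable slope per feature channel.
    act = num_feat if prelu else 0

    total = conv3x3(num_in_ch, num_feat) + act               # initial convolution
    total += num_conv * (conv3x3(num_feat, num_feat) + act)  # body network
    total += conv3x3(num_feat, num_out_ch * upscale ** 2)    # final convolution
    return total


xs_params = count_params(num_conv=16)  # assumed realesr-animevideov3 settings
s_params = count_params(num_conv=32)   # assumed realesr-general-x4v3 settings
```

Under these assumptions the body network dominates the count, so doubling `num_conv` roughly doubles the model size.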
Usage Example
SRVGGNetCompact is significantly more efficient than RRDBNet, making it ideal for real-time applications and video processing.
Forward Method
- Processes input through body network layers sequentially
- Applies pixel shuffle to upsample feature maps
- Adds nearest-neighbor upsampled input as residual
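The steps above can be sketched in PyTorch as follows. This is a hypothetical re-implementation for illustration (the class name `CompactSRSketch` is made up), not the code in the library. It assumes PReLU activations with one slope per channel and `num_in_ch == num_out_ch`, which the residual addition requires.

```python
import torch
from torch import nn
import torch.nn.functional as F


class CompactSRSketch(nn.Module):
    """Illustrative sketch only; see the source file for the real class."""

    def __init__(self, num_in_ch=3, num_out_ch=3, num_feat=64, num_conv=16, upscale=4):
        super().__init__()
        self.upscale = upscale
        # Initial convolution followed by its activation.
        layers = [nn.Conv2d(num_in_ch, num_feat, 3, 1, 1),
                  nn.PReLU(num_parameters=num_feat)]
        # Body network: num_conv blocks of 3x3 convolution plus activation.
        for _ in range(num_conv):
            layers += [nn.Conv2d(num_feat, num_feat, 3, 1, 1),
                       nn.PReLU(num_parameters=num_feat)]
        # Final convolution: num_out_ch * upscale^2 channels, still at low resolution.
        layers.append(nn.Conv2d(num_feat, num_out_ch * upscale * upscale, 3, 1, 1))
        self.body = nn.ModuleList(layers)
        self.upsampler = nn.PixelShuffle(upscale)

    def forward(self, x):
        out = x
        for layer in self.body:          # process layers sequentially
            out = layer(out)
        out = self.upsampler(out)        # pixel shuffle to high resolution
        base = F.interpolate(x, scale_factor=self.upscale, mode="nearest")
        return out + base                # add upsampled input as residual


model = CompactSRSketch(num_feat=8, num_conv=2, upscale=4)
with torch.no_grad():
    y = model(torch.zeros(1, 3, 16, 20))  # y has shape (1, 3, 64, 80)
```

Note that every convolution operates on the 16×20 input grid; only the parameter-free pixel shuffle and interpolation touch the 64×80 output grid.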
Source
Defined in realesrgan/archs/srvgg_arch.py