What is ClassSR and how does it help Super-Resolution Networks?

Super-resolution is the process of generating a high-resolution image from a low-resolution input image. Large low-resolution images are usually broken down into small patches, which are super-resolved individually and then recombined into the high-resolution output. The computation needed to restore a patch depends on its complexity. A smooth patch with minimal variation, such as sky, needs less computation to super-resolve, whereas a feature-rich patch with a high degree of variation, such as flowers or butterflies, needs more.
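
To make the idea of patches with different complexity concrete, here is a minimal sketch that splits a toy image into non-overlapping patches and scores each one by its pixel standard deviation. The function names, the patch size and the use of standard deviation as a complexity proxy are illustrative assumptions, not part of ClassSR itself.

```python
import numpy as np

def split_into_patches(image, patch_size):
    """Split an H x W (x C) array into non-overlapping square patches.

    Edge regions that do not fill a whole patch are dropped for simplicity.
    """
    height, width = image.shape[:2]
    patches = []
    for top in range(0, height - patch_size + 1, patch_size):
        for left in range(0, width - patch_size + 1, patch_size):
            patches.append(image[top:top + patch_size, left:left + patch_size])
    return patches

def patch_variation(patch):
    """Standard deviation of pixel values, used here as a rough complexity proxy."""
    return float(np.std(patch))

# Toy 64x64 grayscale image: flat left half ("sky"), noisy right half ("texture").
rng = np.random.default_rng(0)
image = np.full((64, 64), 128.0)
image[:, 32:] += rng.normal(0.0, 40.0, size=(64, 32))

patches = split_into_patches(image, patch_size=32)
variations = [patch_variation(p) for p in patches]
print([round(v, 1) for v in variations])
```

The flat patches score zero while the noisy ones score high, which is the kind of difference a difficulty-aware framework can exploit.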

CNN-based deep neural networks such as SRCNN, VDSR, SRResNet, EDSR, RDN, RRDB, RCAN, SAN and RFA have greatly advanced the super-resolution domain. However, their high computational cost limits their practical applications. Lightweight models such as FSRCNN, ESPCN, LapSRN, CARN, IMDN and PAN have been introduced to reduce computational complexity while preserving super-resolution performance. Even so, large real-world inputs and very-high-resolution output devices such as TVs and monitors make these networks hard to deploy in practice. The super-resolution domain therefore needs an approach that lowers computational cost without compromising performance.

Researchers Xiangtao Kong, Hengyuan Zhao, Yu Qiao and Chao Dong have introduced ClassSR, a general framework that can be applied to existing super-resolution models, such as those discussed above, to accelerate them. ClassSR decomposes the original image into sub-images, super-resolves each one with a network matched to its difficulty, and recombines the results. It saves up to about 50% of the computation, measured in FLOPs, for the super-resolution networks it was evaluated on, while improving their performance in most cases.

Comparison of the original super-resolution networks with their corresponding ClassSR implementations on the DIV8K dataset.

How does ClassSR work?

The ClassSR framework is a pipeline with two modules: a classification module known as the Class-Module and a super-resolution module known as the SR-Module. Both modules work together during deployment. The original large image is broken down into image patches (sub-images), which are fed into the Class-Module. The Class-Module classifies each sub-image into one of three categories (simple, medium or hard) according to how difficult it is to restore. To obtain a better classification, this module is trained with two newly introduced losses, Class-Loss and Average-Loss.

Classification of sub-images of the original input image into three classes: simple, medium and hard.
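
The article does not spell out the two losses, so the following is only a sketch of one plausible formulation, based on a reading of the ClassSR paper: Class-Loss as the negative sum of pairwise differences between class probabilities (rewarding confident decisions), and Average-Loss as the deviation of each class's summed probability over a batch from an even split (discouraging all sub-images from landing in one class). The function names and toy inputs are assumptions; check the paper and the official code for the exact definitions and weights.

```python
def class_loss(probs):
    """Negative sum of pairwise absolute differences between class probabilities.

    Assumed formulation: minimizing it pushes one probability far from the
    others, i.e. toward a confident decision.
    """
    total = 0.0
    for i in range(len(probs) - 1):
        for j in range(i + 1, len(probs)):
            total += abs(probs[i] - probs[j])
    return -total

def average_loss(batch_probs):
    """Penalize a batch whose per-class probability mass deviates from B / M.

    B is the batch size and M the number of classes (assumed formulation).
    """
    batch_size = len(batch_probs)
    num_classes = len(batch_probs[0])
    target = batch_size / num_classes
    loss = 0.0
    for c in range(num_classes):
        class_mass = sum(p[c] for p in batch_probs)
        loss += abs(class_mass - target)
    return loss

# Hypothetical Class-Module outputs (simple, medium, hard).
confident = [0.9, 0.05, 0.05]
uncertain = [0.34, 0.33, 0.33]
balanced_batch = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
collapsed_batch = [[1.0, 0.0, 0.0]] * 3

print(class_loss(confident) < class_loss(uncertain))
print(average_loss(balanced_batch), average_loss(collapsed_batch))
```

Under these assumed definitions, a confident prediction yields a lower Class-Loss than an uncertain one, and a batch that collapses into a single class is penalized by Average-Loss while a balanced batch is not.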

Once classified, the sub-images are restored to high resolution by separate networks, one for each sub-image class: simple sub-images by a simple network, medium sub-images by a medium network and hard sub-images by a hard network. ClassSR works because simple sub-images make up a large share of a typical image, and a simple network requires far fewer computational resources than the medium and hard networks.

The major difference between the original networks and their ClassSR implementations is that the original networks apply the same full-sized network to every sub-image, wasting computation on the easy ones. ClassSR, in contrast, uses a full-sized network only for the hard sub-images, a smaller network for the medium sub-images and an even smaller network for the simple sub-images. The whole framework is trained end-to-end with both modules.

Overview of the ClassSR framework.
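
The routing step described above can be sketched in a few lines. This example assumes the Class-Module outputs one probability per class and that each sub-image is sent to the branch with the highest probability; the probabilities and the `route_sub_images` helper are hypothetical and only illustrate the dispatch logic, not the official implementation.

```python
from collections import Counter

BRANCH_NAMES = ("simple", "medium", "hard")

def route_sub_images(class_probs):
    """Return the branch chosen for each sub-image (highest probability wins)."""
    routes = []
    for probs in class_probs:
        best = max(range(len(probs)), key=lambda k: probs[k])
        routes.append(BRANCH_NAMES[best])
    return routes

# Hypothetical Class-Module outputs for four sub-images.
class_probs = [
    [0.80, 0.15, 0.05],  # e.g. smooth sky
    [0.10, 0.70, 0.20],
    [0.05, 0.15, 0.80],  # e.g. dense texture
    [0.60, 0.30, 0.10],
]

routes = route_sub_images(class_probs)
counts = Counter(routes)
print(routes)
print(dict(counts))
```

In a real pipeline, each route would select the corresponding SR branch, and the restored sub-images would then be stitched back into the full image.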

Python Implementation of ClassSR

ClassSR requirements are:

Python 3.6+
PyTorch 1.5.0+
GPU and CUDA
NumPy
OpenCV-Python
lmdb
TensorboardX

Install dependencies using the following commands.

 %%bash
 pip install opencv-python 
 pip install lmdb
 pip install tensorboardX
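
After installation, a quick check from a Python cell can confirm that the required packages are importable. This is a small convenience sketch rather than part of the ClassSR repository; the module list uses the usual import names (OpenCV-Python is imported as cv2), and `find_missing` is a hypothetical helper.

```python
import importlib.util

# Import names for the listed requirements (OpenCV-Python is imported as cv2).
REQUIRED_MODULES = ["torch", "numpy", "cv2", "lmdb", "tensorboardX"]

def find_missing(modules):
    """Return the modules that cannot be located in the current environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]

missing = find_missing(REQUIRED_MODULES)
print("Missing modules:", missing if missing else "none")
```

Any name reported as missing can be installed with pip before moving on.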

Download the source code from the official repository using the following commands.

 %%bash
 git clone https://github.com/Xiangtaokong/ClassSR.git

Change the directory to /content/ClassSR to proceed further with the contents.

%cd /content/ClassSR/

Verify that the code files were downloaded correctly by listing the directory contents with the following command.

!ls -p

Download the validation images from the DIV2K dataset and save them to the datasets directory. Download the validation log file and move it to the /content/ClassSR/codes/data_scripts directory.

 %cd 
 !mv divide_val.log /content/ClassSR/codes/data_scripts/

Decompose the validation images into sub-images and sort them into the simple, medium and hard classes using the following data-preparation scripts.

 %%bash
 cd /content/ClassSR/codes/data_scripts
 python extract_subimages_test.py
 python divide_subimages_test.py

Download the pre-trained models from the official page and move them to the /content/ClassSR/experiments/pretrained_models directory. The following commands test the models with downloaded validation images.

 %%bash
 cd /content/ClassSR/codes
 python test.py -opt options/test/test_FSRCNN.yml
 python test.py -opt options/test/test_CARN.yml
 python test.py -opt options/test/test_SRResNet.yml
 python test.py -opt options/test/test_RCAN.yml

The above commands test the framework with the FSRCNN, CARN, SRResNet and RCAN models on the same images for comparison. The test results are logged in the folder /content/ClassSR/results. The logged results can be visualized using the tensorboardX library.

Performance of ClassSR

ClassSR is implemented on existing super-resolution methods such as FSRCNN, CARN, SRResNet and RCAN. The original versions and the ClassSR implementations are trained and evaluated on the DIV2K dataset and DIV8K dataset. Computational complexity is measured in terms of the FLOPs and the restoration quality is measured in terms of the Peak Signal-to-Noise Ratio (PSNR) metric.
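
For reference, PSNR is defined as 10 * log10(MAX^2 / MSE), where MAX is the largest possible pixel value and MSE is the mean squared error between the reference and the restored image. The sketch below computes it for two tiny pixel lists; the pixel values are made up for illustration, and the official evaluation code may differ in details such as colour space or border cropping.

```python
import math

def psnr(reference, restored, max_value=255.0):
    """Peak Signal-to-Noise Ratio in dB between two equally sized pixel sequences."""
    if len(reference) != len(restored):
        raise ValueError("inputs must have the same number of pixels")
    mse = sum((r - s) ** 2 for r, s in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

# Made-up 8-bit pixel values for illustration.
reference = [52, 55, 61, 59]
restored = [54, 55, 60, 57]

score = psnr(reference, restored)
print(f"PSNR: {score:.2f} dB")
```

A higher PSNR means the restored image is closer to the reference.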

As an example, the fully trained ClassSR-FSRCNN is tested on the Test8K dataset. Of its sub-images, 61% are classified as simple, 23% as medium and 16% as hard, and each is assigned to the corresponding network. This saves more than 50% of the FLOPs while the performance remains almost the same.
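
The arithmetic behind such savings can be sketched as a weighted average. The class shares below are the ones reported for ClassSR-FSRCNN on Test8K, but the relative branch costs are hypothetical placeholders chosen only to illustrate the calculation, and the small cost of the Class-Module itself is ignored.

```python
def expected_relative_cost(class_shares, branch_costs):
    """Average per-sub-image cost relative to always running the full network."""
    return sum(class_shares[name] * branch_costs[name] for name in class_shares)

# Shares reported above for ClassSR-FSRCNN on Test8K.
class_shares = {"simple": 0.61, "medium": 0.23, "hard": 0.16}

# Hypothetical branch costs relative to the full network (not from the source).
branch_costs = {"simple": 0.2, "medium": 0.5, "hard": 1.0}

relative_cost = expected_relative_cost(class_shares, branch_costs)
savings = 1.0 - relative_cost
print(f"Relative cost: {relative_cost:.3f}, savings: {savings:.1%}")
```

With these placeholder costs the average cost falls well below half of the full network's, which shows how a large share of simple sub-images translates into FLOPs savings.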