A comparison of upscaling software in 2021

2022-01-14 901 words 5 minutes

A few months ago, I started looking for upscaling software to integrate into Zogwine, my media center, with the goal of upscaling some of my old anime either on the fly or via some kind of preprocessing. I found quite a lot of projects on GitHub, in particular the Video2X project, which brings many different projects and versions together. So I decided to run a quick benchmark to decide which one to use. I would also recommend reading this excellent article from Crunchyroll’s tech blog, where they ran their own upscaling tests.

Presentation of the benchmarked solutions

I used version 4.6.0 of Video2X and version 3.3.1 of dandere2x.

dandere2x is a faster version of Waifu2X, an upscaler aimed at anime-style images that uses a CNN approach.

Video2X lets you use different upscaling software from the same interface. I decided to test the following:

Anime4KCPP: an algorithmic approach to upscaling anime-style images, based on the Anime4K algorithm
Anime4KCPP_GPU: Anime4KCPP in GPU mode
Anime4KCPP_GPU_CNN: Anime4KCPP with a CNN-based algorithm, running on the GPU
REALSR_NCNN_VULKAN: an implementation of RealSR using the NCNN project and the Vulkan API
SRMD_NCNN_VULKAN: an implementation of SRMD using the NCNN project and the Vulkan API
Waifu2X_NCNN_VULKAN: an implementation of Waifu2X using NCNN and the Vulkan API
Waifu2X_CAFFE_CUNET: a Waifu2X implementation using Caffe and the CUDA API
Waifu2X_CAFFE_UPRESNET10: the upresnet10 model for Waifu2X-Caffe, running on the GPU
Waifu2X_CAFFE_ANIME_STYLE_ART_RGB: the anime_style_art_rgb model for Waifu2X-Caffe, running on the GPU
Waifu2X_CPP_CONVERTER: a Waifu2X C++ implementation running on the GPU

You can find additional information on the different Waifu2X-Caffe models here.

All the tests were run with a 3x upscale ratio, going from a 480p video to a 1440p video, on a machine with a Ryzen 7 5800X, 16GB of RAM and an RTX 2070. Each upscale was run on this machine with no other significant software running at the same time.

The input video was a 480p version of the 4th opening of Naruto Shippuden. The images shown here are for educational purposes only and belong to their respective copyright owners.

Gathering Results

To process the results and compare the different solutions, I wrote a Python script that uses ffmpeg to extract a few frames (158, 500, 835, 1090, 1200 and 2200) from each video.

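import os
import subprocess

# Frame indices to extract from each video.
FRAMES = [158, 500, 835, 1090, 1200, 2200]
VIDEO_EXTENSIONS = (".mkv", ".mp4")

def extract_frames(out_dir, video, offset=0):
    """Save one PNG per index in FRAMES (shifted by offset) into out_dir."""
    print("extracting frames for: " + video)
    for i in FRAMES:
        n = i + offset
        # select=gte(n\,N) keeps frames from index N onwards; -vframes 1 stops after the first.
        # Passing a list (no shell) keeps paths containing spaces intact.
        subprocess.run(["ffmpeg", "-i", video, "-vf", f"select=gte(n\\,{n})",
                        "-vframes", "1", os.path.join(out_dir, f"frame_{n}.png")])

base = os.getcwd()
for name in os.listdir(base):
    path = os.path.join(base, name)
    if os.path.isdir(path):
        # Use the first video found in each sub-directory.
        for file_name in os.listdir(path):
            if file_name.endswith(VIDEO_EXTENSIONS):
                extract_frames(path, os.path.join(path, file_name))
                break
    elif path.endswith(VIDEO_EXTENSIONS):
        # Videos sitting directly in the working directory use a frame offset of -1.
        extract_frames(base, path, -1)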

In addition to this, I ran Netflix’s VMAF tool to get a quality indicator for each video (compared to the original), using the easyVMAF Docker image like this:

docker run --rm -v /srv/Documents/test_upscaling:/vid gfdavila/easyvmaf -r /vid/nt_op_480.mkv -d /vid/upscale_video2x_waifu2x_ncnn_vulkan/n_op_1440.mp4
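To score every method in one go, the same command can be generated for each output file. This is only a minimal sketch: the helper name easyvmaf_command is hypothetical, and it relies solely on the image name and the flags already used in the command above (--rm, -v, -r, -d), with the host folder mounted as /vid.

```python
import subprocess


def easyvmaf_command(host_dir, reference, distorted, image="gfdavila/easyvmaf"):
    """Build the docker invocation for one reference/distorted pair.

    `reference` and `distorted` are paths relative to `host_dir`,
    which is mounted as /vid inside the container.
    """
    return [
        "docker", "run", "--rm",
        "-v", host_dir + ":/vid",
        image,
        "-r", "/vid/" + reference,
        "-d", "/vid/" + distorted,
    ]


cmd = easyvmaf_command(
    "/srv/Documents/test_upscaling",
    "nt_op_480.mkv",
    "upscale_video2x_waifu2x_ncnn_vulkan/n_op_1440.mp4",
)
print(" ".join(cmd))
# To actually run the container (requires Docker):
# subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids quoting problems if a directory name contains spaces.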

I then plotted the pooled_metrics > vmaf > harmonic_mean value of each method against its execution time. Here is the result:

/2022/01/14/a-comparison-of-upscaling-software-in-2021/assets/result.png — VMAF score and processing time for each upscaling solution

DOWNLOAD xls file
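Pulling the plotted value out of each VMAF report can be sketched as follows. This assumes the tool leaves a JSON report whose structure matches the pooled_metrics > vmaf > harmonic_mean path quoted above. The function names, and the idea of one report file per method, are assumptions made for illustration.

```python
import json


def vmaf_harmonic_mean(report):
    """Return pooled_metrics > vmaf > harmonic_mean from a parsed report."""
    return report["pooled_metrics"]["vmaf"]["harmonic_mean"]


def load_scores(report_paths):
    """Map each method name to its score, given {method: path_to_json_report}."""
    scores = {}
    for method, path in report_paths.items():
        with open(path) as fh:
            scores[method] = vmaf_harmonic_mean(json.load(fh))
    return scores
```

The resulting dictionary can then be paired with the measured execution times to reproduce a score-versus-time chart.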

Conclusions

The input video is 90 seconds long, so only Anime4KCPP qualifies for a real-time upscaling system. We can also note that a GLSL implementation of Anime4K exists, which opens the door to client-side upscaling.

We can exclude the implementations that take more than 4000 s, as more than an hour to process a 1 min 30 s video is far too long.

For real-time upscaling, the most efficient version seems to be video2x_anime4KCPP_cnn_gpu, while dandere2x_waifu2x_caffe seems to offer the best quality for the lowest execution time among the candidates for pre-processing.
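The two thresholds used here come down to comparing processing time with clip duration. A small sketch, using only the 90-second clip length and the 4000 s cutoff from this section (the function names and group labels are illustrative):

```python
CLIP_SECONDS = 90        # length of the benchmarked opening
CUTOFF_SECONDS = 4000    # slower runs are discarded


def real_time_factor(processing_seconds, clip_seconds=CLIP_SECONDS):
    """Seconds of processing needed per second of video (<= 1 means real time)."""
    return processing_seconds / clip_seconds


def classify(processing_seconds):
    """Sort a run into one of the three groups used in this benchmark."""
    if real_time_factor(processing_seconds) <= 1:
        return "real-time"
    if processing_seconds <= CUTOFF_SECONDS:
        return "pre-processing"
    return "excluded"


print(round(real_time_factor(CUTOFF_SECONDS), 1))  # 44.4
```

In other words, a run sitting right at the cutoff is about 44 times slower than real time.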

/2022/01/14/a-comparison-of-upscaling-software-in-2021/assets/original_anime4kcpp_gpu_anime4kcpp_cnn.png — Original vs Anime4KCPP_gpu vs Anime4KCPP_cnn_gpu

We can see here that the CNN version of Anime4KCPP brings some interesting improvements, with smoother curves, while Anime4KCPP_gpu ends up producing a blurrier image than the original.

/2022/01/14/a-comparison-of-upscaling-software-in-2021/assets/original_vs_anime4k_cnn.png — Original vs Anime4KCPP_cnn_gpu

We can note that while the image improvements with Anime4KCPP_cnn_gpu are minor, this is not the case for the text, which is much more readable in the upscaled version.

/2022/01/14/a-comparison-of-upscaling-software-in-2021/assets/og_realsr_srmd_cunet_rgb_dandere.png — Original vs RealSR vs SRMD | Waifu2x_cunet vs Waifu2x_anime_style_art_rgb vs dandere2x

/2022/01/14/a-comparison-of-upscaling-software-in-2021/assets/original_dandere.png — Original vs dandere2x

In contrast to Anime4K's results on the image, each of the Waifu2X methods brings really impressive improvements, producing an image without any major pixelation. With the lowest execution time among the non-real-time candidates, dandere2x is the clear winner of this benchmark.

General Conclusion

With this benchmark, I was able to determine the best upscaling software to use for my media center. Anime4KCPP with the CNN version seems to be the best option for real-time upscaling. While this remains an option, I don’t think this feature will be implemented in the near future, as the benefits are quite thin.

However, I discovered a very interesting option for a pre-processing feature (where the file is upscaled in the background rather than in real time): dandere2x, which produced really impressive results. But while this solution is the fastest for non-real-time upscaling, it would still take more than 5 hours to upscale a single episode. This could be implemented as a feature that upscales only a few selected files. A distributed system may also be a good lead if you have multiple powerful computers at home.
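Per-episode estimates like this one are a straight extrapolation from the 90-second clip. A rough sketch, assuming processing time scales linearly with duration and, for the multi-machine case, that the work splits evenly with no overhead. The 20-minute episode length is a hypothetical input, and the 4000 s timing is simply the exclusion cutoff from this benchmark, not a measurement of any particular method.

```python
def estimated_hours(clip_processing_seconds, episode_seconds,
                    clip_seconds=90, machines=1):
    """Extrapolate a clip benchmark to a full episode, in hours."""
    # Processing seconds needed per second of video.
    factor = clip_processing_seconds / clip_seconds
    return factor * episode_seconds / machines / 3600


# A run right at the 4000 s cutoff, applied to a hypothetical 20-minute episode.
print(round(estimated_hours(4000, 20 * 60), 2))              # 14.81
print(round(estimated_hours(4000, 20 * 60, machines=4), 2))  # 3.7
```

Real distributed setups would lose some of that gain to splitting, transferring and re-assembling the video, so the multi-machine figure is an optimistic bound.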