Fast deep parallel residual network for accurate super resolution image processing
Introduction
The single image super resolution (SISR) problem (Glasner, Bagon, & Irani, 2009) has attracted many researchers in the computer vision community since the 1970s (Duchon, 1979). It has a wide range of applications, including medical imaging, astronomy, machine vision and autonomous driving. These application areas share a common objective: more image information is required for further processing. Hence the basic task in SISR is to recover a high-resolution (HR) image from a low-resolution (LR) input.
Early approaches to SISR offered a variety of ideas and algorithms, including local linear regression (Timofte, De Smet, & Van Gool, 2014), sparse coding (Jianchao, Wright, Huang, & Ma, 2008), dictionary learning (Yang, Wright, Huang, & Ma, 2010) and random forests (Schulter, Leistner, & Bischof, 2015). These shallow methods proved quite successful in both the theory and the application of image super resolution.
As convolutional neural network models blossomed in the computer vision community, the Super-Resolution Convolutional Neural Network (SRCNN) (Dong, Loy, He, & Tang, 2016) started a new era of faster and far more accurate results that inspired new algorithms and techniques. It applied a fully convolutional structure to perform the non-linear mapping from LR to HR, and without any hand-engineered features SRCNN provided significant improvements over non-deep-learning models. However, as a representative small deep learning network, SRCNN lacks the learning capacity to deliver satisfactory results on large and complicated data.
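SRCNN's fully convolutional pipeline (patch extraction, non-linear mapping, reconstruction) can be sketched as three stacked convolutions applied to the bicubic-upscaled LR image. The following minimal numpy sketch uses the 9-1-5 filter sizes and 64/32 feature counts reported for SRCNN, but the weights are random and the loop-based convolution is purely illustrative, not an efficient or trained implementation.

```python
import numpy as np

def conv2d(x, w):
    """'Same' 2-D convolution of a (H, W, C_in) map with (k, k, C_in, C_out) filters."""
    k = w.shape[0]
    p = k // 2
    xp = np.pad(x, ((p, p), (p, p), (0, 0)))
    H, W = x.shape[:2]
    out = np.zeros((H, W, w.shape[3]))
    for i in range(H):
        for j in range(W):
            patch = xp[i:i + k, j:j + k, :]  # (k, k, C_in) receptive field
            out[i, j] = np.tensordot(patch, w, axes=([0, 1, 2], [0, 1, 2]))
    return out

def relu(x):
    return np.maximum(x, 0.0)

rng = np.random.default_rng(0)
lr_up = rng.random((16, 16, 1))                 # bicubic-upscaled LR input (Y channel)

w1 = rng.standard_normal((9, 9, 1, 64)) * 0.01  # patch extraction and representation
w2 = rng.standard_normal((1, 1, 64, 32)) * 0.01 # non-linear mapping
w3 = rng.standard_normal((5, 5, 32, 1)) * 0.01  # HR reconstruction

sr = conv2d(relu(conv2d(relu(conv2d(lr_up, w1)), w2)), w3)
print(sr.shape)  # 'same' padding keeps the spatial size of the input
```

Because every layer is convolutional, the same network applies to inputs of any spatial size, which is the property the text highlights.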
After SRCNN inspired new research in SISR, many new state-of-the-art methods have appeared in recent years, following two main directions of enhancement. On one side, some researchers have designed network structures that increase network depth while preventing gradient explosion and vanishing. Kim, Kwon Lee, and Mu Lee (2016a) showed that, based on global residual learning, a deeper network called Very Deep Super-Resolution (VDSR) with 20 convolution layers gains better accuracy and convergence speed in image super resolution. At the same time, Kim, Kwon Lee, and Mu Lee (2016b) proposed another network structure, the Deeply-Recursive Convolutional Network (DRCN), with weight-balanced residual learning in a recursive layer design, which also demonstrated excellent results. More recently, the Deep Laplacian Pyramid Networks for Super-Resolution (LapSRN) (Lai, Huang, Ahuja, & Yang, 2017) introduced an innovative cascade (pyramid) structure that produces the output step by step; this network has shown strong results at 8× upscaling and proposed a new loss function. On the other side, methods such as Enhanced Deep Residual Networks for Super-Resolution (EDSR) (Lim, Son, Kim, Nah, & Lee, 2017) and the deep convolutional neural network with selection units for super-resolution (SelNet) (Choi & Kim, 2017) broke through some existing restrictions by training on much higher-quality data (Agustsson & Timofte, 2017), which produces a better resulting system. However, EDSR, SelNet and other methods from the NTIRE challenge (Timofte et al., 2017) achieve only average quality on recent benchmarks when using the same training datasets.
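The global residual learning idea behind VDSR is that the network predicts only the residual (high-frequency detail) between the interpolated LR image and the HR target, and the final output is the sum of the two. A minimal sketch, with nearest-neighbour upsampling via `np.kron` standing in for bicubic interpolation and a zero-output placeholder standing in for the 20-layer branch:

```python
import numpy as np

def upsample_nn(img, scale):
    """Nearest-neighbour upsampling; a stand-in for the bicubic interpolation VDSR uses."""
    return np.kron(img, np.ones((scale, scale)))

def residual_branch(x):
    """Placeholder for the deep convolutional branch; an untrained one would be near zero."""
    return np.zeros_like(x)

lr = np.arange(16.0).reshape(4, 4)       # toy low-resolution image
interp = upsample_nn(lr, 2)              # coarse HR estimate carries the low frequencies
sr = interp + residual_branch(interp)    # global residual learning: output = input + residual
print(sr.shape)
```

The skip connection means the branch only has to model the (sparse) residual, which is what makes very deep super-resolution networks converge quickly.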
In this paper, a novel deep network design using parallel residual learning (DPRN) is proposed to achieve high-quality image super resolution with a new 35-layer deep network layout. In this approach, convolutional layers are grouped into residual combinations and placed into branches. Each layer receives input from the previous two layers, and each branch performs local residual learning before passing its output to the next branch. After up-sampling, the original input conducts global residual learning with the branches' output to produce the final result. The DPRN structure avoids information deterioration during training while increasing the depth of the network, so the convolutional layers in the residual combinations can learn more information. The network also applies the Adam optimizer (Kingma & Ba, 2015) instead of common Stochastic Gradient Descent (SGD) to provide adaptive learning rates for different parameters and reduce resource consumption during training. Experimental results show that the proposed DPRN gains 1.08 dB, 0.21 dB and 0.22 dB over SRCNN, VDSR and LapSRN, respectively, at 2× upscaling on the Set5 (Bevilacqua, Roumy, Guillemot, & Alberi-Morel, 2012) test dataset. Furthermore, the model execution time of DPRN is faster than most existing methods: it averages 27.18 fps across all four test datasets, demonstrating its capability for real-time applications.
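The Adam optimizer mentioned above adapts the learning rate per parameter from running estimates of the gradient's first and second moments. A minimal sketch of one update step follows; the hyperparameter values are the commonly used defaults, an assumption here rather than the settings reported for DPRN, and the toy quadratic objective is for illustration only.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: adaptive per-parameter step from running moment estimates."""
    m = b1 * m + (1 - b1) * grad        # first moment (mean of gradients)
    v = b2 * v + (1 - b2) * grad ** 2   # second moment (uncentred variance)
    m_hat = m / (1 - b1 ** t)           # bias correction for zero initialisation
    v_hat = v / (1 - b2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# minimise f(theta) = theta^2, whose gradient is 2 * theta
theta = np.array(5.0)
m = v = np.array(0.0)
for t in range(1, 2001):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t)
print(float(theta))  # close to the minimum at 0
```

Because the effective step size is normalised by `sqrt(v_hat)`, parameters with noisy or large gradients are updated more cautiously than under plain SGD.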
This paper is organized into five sections. Section 1 gives a brief introduction to single image super resolution, the current state-of-the-art methods and the general content of this paper. Section 2 describes the related work: we review the deep residual network (ResNet) concept (He, Zhang, Ren, & Sun, 2016) and recent state-of-the-art methods such as VDSR (Kim et al., 2016a), DRCN (Kim et al., 2016b) and LapSRN (Lai et al., 2017). Section 3 explains the technical details of the proposed DPRN and how the new network achieves good super resolution quality compared to other existing methods. Finally, the experimental results on benchmark datasets and the conclusions for the proposed method are presented in Sections 4 and 5, respectively.
Related work
Inspired by biological processes (Matsugu, Mori, Mitari, & Kaneda, 2003), CNN-based models have shown great effectiveness and usability in computer vision applications. In this section, we review recent state-of-the-art methods in the SISR area and the key idea of ResNet (He et al., 2016), which provide the basis for DPRN. Fig. 1 shows the network structures of ResNet, VDSR and DRCN, where ReLU (Nair & Hinton, 2010) and batch normalization (Ioffe & Szegedy, 2015) layers have been …
Deep parallel residual network
In this section, the detailed design and technical explanation of DPRN are presented. We introduce a new approach for connecting multiple residual branches. Each branch performs an initial feature mapping at its first convolution, and the information then passes to residual combinations for parallel convolutional training. The first convolutional layer H0 conducts local residual learning with the output from the residual combinations. The result of each branch is delivered to a batch normalization layer (…
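The wiring described above can be sketched schematically: inside a residual combination each layer receives the outputs of the previous two layers, the branch adds its first feature map H0 back to the combination output (local residual learning), and the input is added to the final output (global residual learning). The sketch below is an assumption-laden schematic only: per-pixel linear maps with ReLU stand in for the actual 3×3 convolutions, batch normalization and up-sampling are omitted, and the layer counts are arbitrary.

```python
import numpy as np

def conv(x, w):
    """Per-pixel linear map + ReLU; a stand-in for a 3x3 convolutional layer."""
    return np.maximum(x @ w, 0.0)

def residual_combination(h0, weights):
    """Each layer takes the sum of the outputs of the previous two layers as input."""
    h_prev2, h_prev1 = h0, conv(h0, weights[0])
    for w in weights[1:]:
        h_prev2, h_prev1 = h_prev1, conv(h_prev1 + h_prev2, w)
    return h_prev1

def branch(x, weights):
    h0 = conv(x, weights[0])                       # initial feature mapping H0
    combo = residual_combination(h0, weights[1:])  # parallel convolutional training
    return h0 + combo                              # local residual learning

rng = np.random.default_rng(1)
feat = 8
x = rng.random((6, 6, feat))                       # toy feature map
ws = [rng.standard_normal((feat, feat)) * 0.1 for _ in range(5)]
out = branch(x, ws)
hr = x + out                                       # global residual learning with the input
print(hr.shape)
```

The two-layer fan-in plays the role the text assigns to the residual combinations: every layer sees information from more than one predecessor, so features degrade less as depth grows.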
Experiment
In this section, we compare the proposed DPRN with several current state-of-the-art methods on standard test datasets. The results include quantitative and qualitative evaluation of accuracy, a runtime comparison and the model convergence trend. The experiments show that our DPRN approach achieves the best accuracy for image reconstruction, while the average model execution time meets the requirement for real-time human eye perception (>24 fps). We also discuss the …
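The dB figures quoted throughout (e.g. the 1.08 dB gain over SRCNN) are peak signal-to-noise ratio (PSNR) values, the standard accuracy metric for super resolution, computed from the mean squared error against the ground-truth HR image. A minimal sketch:

```python
import numpy as np

def psnr(ref, est, peak=1.0):
    """Peak signal-to-noise ratio in dB between a reference and a reconstruction."""
    mse = np.mean((ref - est) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

ref = np.zeros((4, 4))
est = np.full((4, 4), 0.1)   # constant error of 0.1 everywhere -> MSE = 0.01
print(psnr(ref, est))        # 10 * log10(1 / 0.01) = 20.0 dB
```

Higher is better, and because the scale is logarithmic, a fraction-of-a-dB gain such as 0.21 dB over VDSR corresponds to a real reduction in reconstruction error.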
Conclusions
In this article, a novel deep convolutional neural network for single image super resolution, named the Deep Parallel Residual Network (DPRN), has been proposed for superior accuracy and balanced real-time model execution. Our model consists of residual combinations and residual branches that conduct an efficient local and global residual learning algorithm. Each convolutional layer in a residual combination performs parallel learning from the previous two layers. In …
CRediT authorship contribution statement
Feng Sha: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Resources, Software, Validation, Writing - original draft, Writing - review & editing. Seid Miad Zandavi: Software, Validation. Yuk Ying Chung: Supervision, Writing - review & editing.
References (35)
- Agustsson, E., & Timofte, R. (2017). NTIRE 2017 challenge on single image super-resolution: Dataset and study.
- Bevilacqua, M., Roumy, A., Guillemot, C., & Alberi-Morel, M. L. (2012). Low-complexity single-image super-resolution...
- Bottou, L. (2012). Stochastic gradient descent tricks. In Neural networks: Tricks of the trade.
- Choi, J.-S., & Kim, M. (2017). A deep convolutional neural network with selection units for super-resolution.
- Dong, C., Loy, C. C., He, K., & Tang, X. (2016). Image super-resolution using deep convolutional networks. IEEE Transactions on Pattern Analysis and Machine Intelligence.
- Dong, C., et al. (2016). Accelerating the super-resolution convolutional neural network.
- Duchi, J., et al. (2011). Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research.
- Duchon, C. E. (1979). Lanczos filtering in one and two dimensions. Journal of Applied Meteorology.
- Glasner, D., Bagon, S., & Irani, M. (2009). Super-resolution from a single image.
- Goodfellow, I., et al. (2014). Generative adversarial nets.
- He, K., et al. (2015). Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification.
- He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition.
- Huang, J.-B., et al. (2015). Single image super-resolution from transformed self-exemplars.
- Ioffe, S., & Szegedy, C. (2015). Batch normalization: Accelerating deep network training by reducing internal covariate shift.
- Jia, Y., et al. (2014). Caffe: Convolutional architecture for fast feature embedding.
- Matsugu, M., Mori, K., Mitari, Y., & Kaneda, Y. (2003). Subject independent facial expression recognition with robust face detection using a convolutional neural network. Neural Networks.
- Yang, J., Wright, J., Huang, T., & Ma, Y. (2008). Image super-resolution as sparse representation of raw image patches.