Wavelet Domain Generative Adversarial Network for Multi-scale Face Hallucination

International Journal of Computer Vision

Abstract

Most modern face hallucination methods rely on convolutional neural networks (CNNs) to infer high-resolution (HR) face images. However, when dealing with very low-resolution (LR) images, these CNN-based methods tend to produce over-smoothed outputs. To address this challenge, this paper proposes a wavelet-domain generative adversarial method that can ultra-resolve a very low-resolution face image (such as \(16\times 16\) or even \(8\times 8\)) to a larger version at multiple upscaling factors (\(2\times \) to \(16\times \)) in a unified framework. Unlike most existing studies, which hallucinate faces in the image pixel domain, our method first learns to predict the wavelet information of HR face images from their corresponding LR inputs before image-level super-resolution. To capture both the global topology information and the local texture details of human faces, a flexible and extensible generative adversarial network is designed with three types of losses: (1) a wavelet reconstruction loss, which pushes the predicted wavelets closer to the ground truth; (2) a wavelet adversarial loss, which encourages the generation of realistic wavelets; and (3) an identity-preserving loss, which helps recover identity information. Extensive experiments demonstrate that the presented approach not only achieves more appealing results, both quantitatively and qualitatively, than state-of-the-art face hallucination methods, but also significantly improves identification accuracy for low-resolution face images captured in the wild.
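To make the idea of predicting wavelet information before reconstructing the image more concrete, the following is a minimal sketch in Python with NumPy. It assumes a single-level 2-D Haar transform and a plain mean-squared error over the sub-bands; the abstract does not state which wavelet basis, decomposition depth, or loss weighting the method uses, so the function names (haar_dwt2, haar_idwt2, wavelet_reconstruction_loss) and these choices are illustrative assumptions rather than the authors' implementation. The adversarial and identity-preserving losses are not sketched here.

```python
import numpy as np


def haar_dwt2(img):
    """Single-level 2-D Haar transform (assumed basis, for illustration).

    Splits an array with even height and width into four half-size
    sub-bands: one low-frequency approximation and three detail bands.
    """
    a = img[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = img[0::2, 1::2]  # top-right
    c = img[1::2, 0::2]  # bottom-left
    d = img[1::2, 1::2]  # bottom-right
    approx = (a + b + c + d) / 2.0      # local average (global structure)
    col_diff = (a - b + c - d) / 2.0    # difference between adjacent columns
    row_diff = (a + b - c - d) / 2.0    # difference between adjacent rows
    diag_diff = (a - b - c + d) / 2.0   # diagonal difference
    return approx, col_diff, row_diff, diag_diff


def haar_idwt2(approx, col_diff, row_diff, diag_diff):
    """Inverse of haar_dwt2: rebuilds the full-size image from four sub-bands."""
    h, w = approx.shape
    out = np.empty((2 * h, 2 * w), dtype=float)
    out[0::2, 0::2] = (approx + col_diff + row_diff + diag_diff) / 2.0
    out[0::2, 1::2] = (approx - col_diff + row_diff - diag_diff) / 2.0
    out[1::2, 0::2] = (approx + col_diff - row_diff - diag_diff) / 2.0
    out[1::2, 1::2] = (approx - col_diff - row_diff + diag_diff) / 2.0
    return out


def wavelet_reconstruction_loss(pred_bands, true_bands):
    """Sum of per-band mean-squared errors (a hypothetical, unweighted form)."""
    return float(sum(np.mean((p - t) ** 2) for p, t in zip(pred_bands, true_bands)))


# Usage: decompose a stand-in "HR" image, perturb the coefficients to mimic
# an imperfect prediction, then score and reconstruct it.
rng = np.random.default_rng(0)
hr = rng.random((16, 16))
true_bands = haar_dwt2(hr)
pred_bands = tuple(band + 0.01 for band in true_bands)
loss = wavelet_reconstruction_loss(pred_bands, true_bands)
sr = haar_idwt2(*pred_bands)  # image-level output recovered from the wavelets
```

Because this transform is invertible, a network that outputs accurate sub-bands yields an accurate image, and the detail bands give the loss direct access to high-frequency texture that a pixel-domain error tends to average away.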

Notes

  1. https://github.com/hhb072/WaveletSRNet

Acknowledgements

This work is partially funded by the State Key Development Program (Grant No. 2016YFB1001001), the National Natural Science Foundation of China (Grant Nos. 61622310 and 61427811), and the Beijing Natural Science Foundation (Grant No. JQ18017).

Author information

Corresponding author

Correspondence to Ran He.

Additional information

Communicated by Xiaoou Tang.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Huang, H., He, R., Sun, Z. et al. Wavelet Domain Generative Adversarial Network for Multi-scale Face Hallucination. Int J Comput Vis 127, 763–784 (2019). https://doi.org/10.1007/s11263-019-01154-8

  • DOI: https://doi.org/10.1007/s11263-019-01154-8
