Deep Learning Based Handwritten Chinese Character and Text Recognition

Deep Learning: Fundamentals, Theory and Applications

Part of the book series: Cognitive Computation Trends (COCT, volume 2)

Abstract

This chapter introduces recent advances in using deep learning methods for handwritten Chinese character recognition (HCCR) and handwritten Chinese text recognition (HCTR). In HCCR, we integrate the traditional normalization-cooperated direction-decomposed feature map (directMap) with a deep convolutional neural network; under this framework, we can eliminate the need for data augmentation and model ensembles, which are widely used in other systems to achieve their best results. Although the baseline accuracy is very high, we show that writer adaptation with style transfer mapping (STM) is still effective in further boosting performance. In HCTR, we use an effective approach based on over-segmentation and path search integrating multiple contexts, in which the language model (LM) and character shape models play important roles. Instead of using traditional back-off n-gram LMs (BLMs) alone, we apply two types of character-level neural network LMs (NNLMs), namely feedforward neural network LMs (FNNLMs) and recurrent neural network LMs (RNNLMs). Both FNNLMs and RNNLMs are combined with BLMs to construct hybrid LMs. To further improve the performance of HCTR, we also replace the baseline character classifier, over-segmentation, and geometric context models with convolutional neural network based models. By integrating deep learning methods with traditional approaches, we achieve state-of-the-art performance for both HCCR and HCTR.
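
The abstract does not say how the NNLMs and BLMs are combined into hybrid LMs. As a rough illustration only, the sketch below assumes linear interpolation of the two models' next-character probabilities, which is one common way to build such a hybrid and may differ from the chapter's actual method. The names `interpolate`, `hybrid_log_prob`, `toy_nn_lm`, `toy_bo_lm`, the weight `lam`, and the context length `order` are all hypothetical and chosen for this example.

```python
import math


def interpolate(p_nn: float, p_bo: float, lam: float = 0.5) -> float:
    """Linearly interpolate two probabilities of the same next character.

    Assumption: lam weights the neural LM and (1 - lam) the back-off LM.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * p_nn + (1.0 - lam) * p_bo


def hybrid_log_prob(chars, nn_lm, bo_lm, order=3, lam=0.5):
    """Sum the log hybrid probabilities over a character sequence.

    nn_lm and bo_lm are callables (history, char) -> probability;
    history is the tuple of up to (order - 1) preceding characters.
    """
    total = 0.0
    for i, char in enumerate(chars):
        history = tuple(chars[max(0, i - order + 1):i])
        p = interpolate(nn_lm(history, char), bo_lm(history, char), lam)
        total += math.log(p)
    return total


# Toy stand-ins for trained models: constant probabilities, for illustration.
def toy_nn_lm(history, char):
    return 0.2


def toy_bo_lm(history, char):
    return 0.1


if __name__ == "__main__":
    score = hybrid_log_prob("abc", toy_nn_lm, toy_bo_lm, order=3, lam=0.5)
    print(f"hybrid log-probability: {score:.4f}")
```

In a path-search recognizer, a score of this kind would be one of several context scores attached to each candidate path; the interpolation weight would normally be tuned on held-out data.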

Part of this chapter is reprinted from: Pattern Recognition, 61, 348-360, Xu-Yao Zhang, Yoshua Bengio, Cheng-Lin Liu, “Online and offline handwritten Chinese character recognition: A comprehensive study and new benchmark”, 2017, with permission from Elsevier; and Pattern Recognition, 65, 251-264, Yi-Chao Wu, Fei Yin, Cheng-Lin Liu, “Improving handwritten Chinese text recognition using neural network language models and convolutional neural network shape models”, 2017, with permission from Elsevier.

Acknowledgements

This work has been supported by the National Natural Science Foundation of China under Grants 61411136002, 61403380, 61633021, and 61573355.

Author information

Corresponding author

Correspondence to Cheng-Lin Liu.

Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Zhang, XY., Wu, YC., Yin, F., Liu, CL. (2019). Deep Learning Based Handwritten Chinese Character and Text Recognition. In: Huang, K., Hussain, A., Wang, QF., Zhang, R. (eds) Deep Learning: Fundamentals, Theory and Applications. Cognitive Computation Trends, vol 2. Springer, Cham. https://doi.org/10.1007/978-3-030-06073-2_3
