Abstract
This chapter introduces recent advances in applying deep learning methods to handwritten Chinese character recognition (HCCR) and handwritten Chinese text recognition (HCTR). For HCCR, we integrate the traditional normalization-cooperated direction-decomposed feature map (directMap) with a deep convolutional neural network. Under this framework, we can eliminate the need for data augmentation and model ensembles, which other systems widely use to achieve their best results. Although the baseline accuracy is already very high, we show that writer adaptation with style transfer mapping (STM) is still effective for further boosting performance. For HCTR, we use an effective approach based on over-segmentation and path search that integrates multiple contexts, in which the language model (LM) and character shape models play important roles. Instead of using traditional back-off n-gram LMs (BLMs) alone, we apply two types of character-level neural network LMs (NNLMs): feedforward neural network LMs (FNNLMs) and recurrent neural network LMs (RNNLMs). Both FNNLMs and RNNLMs are combined with BLMs to construct hybrid LMs. To further improve HCTR performance, we also replace the baseline character classifier, over-segmentation, and geometric context models with convolutional neural network based models. By integrating deep learning methods with traditional approaches, we achieve state-of-the-art performance for both HCCR and HCTR.
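As a rough illustration of what a hybrid LM can look like, the sketch below combines an NNLM probability and a BLM probability for the same character by linear interpolation. The abstract does not state the combination rule, so linear interpolation, the weight `lam`, and the function names `interpolate` and `hybrid_log_prob` are all assumptions made for this example, not the chapter's implementation.

```python
import math


def interpolate(p_nnlm: float, p_blm: float, lam: float = 0.5) -> float:
    """Linearly interpolate two conditional probabilities of the same character.

    Assumption: the hybrid LM is a weighted sum; `lam` is a hypothetical
    weight on the NNLM that would normally be tuned on held-out data.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * p_nnlm + (1.0 - lam) * p_blm


def hybrid_log_prob(nnlm_probs, blm_probs, lam: float = 0.5) -> float:
    """Log-probability of a character sequence under the interpolated model.

    Each list holds one conditional probability per character, as produced
    by the respective model for the same sequence and history.
    """
    if len(nnlm_probs) != len(blm_probs):
        raise ValueError("both models must score the same number of characters")
    return sum(
        math.log(interpolate(pn, pb, lam))
        for pn, pb in zip(nnlm_probs, blm_probs)
    )
```

In a path-search recognizer, a score of this kind would be one of several context terms attached to each candidate character sequence. With `lam = 0` the sketch reduces to the BLM alone, and with `lam = 1` to the NNLM alone.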
Part of this chapter is reprinted from: Pattern Recognition, 61, 348-360, Xu-Yao Zhang, Yoshua Bengio, Cheng-Lin Liu, “Online and offline handwritten Chinese character recognition: A comprehensive study and new benchmark”, 2017, with permission from Elsevier; and Pattern Recognition, 65, 251-264, Yi-Chao Wu, Fei Yin, Cheng-Lin Liu, “Improving handwritten Chinese text recognition using neural network language models and convolutional neural network shape models”, 2017, with permission from Elsevier.
Acknowledgements
This work has been supported by the National Natural Science Foundation of China under Grants 61411136002, 61403380, 61633021, and 61573355.
Copyright information
© 2019 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Zhang, XY., Wu, YC., Yin, F., Liu, CL. (2019). Deep Learning Based Handwritten Chinese Character and Text Recognition. In: Huang, K., Hussain, A., Wang, QF., Zhang, R. (eds) Deep Learning: Fundamentals, Theory and Applications. Cognitive Computation Trends, vol 2. Springer, Cham. https://doi.org/10.1007/978-3-030-06073-2_3
DOI: https://doi.org/10.1007/978-3-030-06073-2_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-06072-5
Online ISBN: 978-3-030-06073-2
eBook Packages: Biomedical and Life Sciences (R0)