Sub-word Based Offline Handwritten Farsi Word Recognition Using Recurrent Neural Network

  • Ghadikolaie, Mohammad Fazel Younessy (Department of Electrical and Computer Engineering, Islamic Azad University, Science and Research Branch)
  • Kabir, Ehsanolah (Department of Electrical and Computer Engineering, Tarbiat Modarres University)
  • Razzazi, Farbod (Department of Electrical and Computer Engineering, Islamic Azad University, Science and Research Branch)
  • Received: 2015.06.15
  • Accepted: 2016.04.05
  • Published: 2016.08.01

Abstract

In this paper, we present a segmentation-based method for offline recognition of handwritten Farsi words. Although most segmentation-based systems suffer from segmentation errors in the first stages of recognition, we exploit the inherent features of the Farsi script to segment words into sub-words. Instead of using a single complex classifier with many (N) output classes, we build N simple recurrent neural network classifiers, each with only true/false outputs and able to recognize sub-words. By extracting the number of sub-words in each word and labeling the position of each sub-word (beginning/middle/end), many of the sub-word classifiers can be pruned, so only a few remaining classifiers need to be evaluated during the sub-word recognition stage. The candidate sub-words are then joined together, and the closest word in the lexicon is chosen. The proposed method was evaluated on the Iranshahr database, which consists of 17,000 samples of handwritten Iranian city names. The results show that the proposed method achieves high recognition accuracy.
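
The pipeline described in the abstract (prune classifiers by sub-word count and position, score the survivors, join the candidates, pick the closest lexicon word) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the function names, the toy lexicon format, the use of plain callables as stand-ins for the recurrent neural network classifiers, the assumption of one classifier per sub-word class, the "beginning" label for a lone sub-word, and the edit distance over sub-word sequences are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Sequence, Set, Tuple

Lexicon = Dict[str, Tuple[str, ...]]  # word -> its sub-word labels, in writing order
Scorer = Callable[[object], float]    # stand-in for one true/false sub-word classifier


def position_labels(n: int) -> List[str]:
    """Label each of n sub-words as beginning, middle or end."""
    if n == 1:
        return ["beginning"]  # assumption: a lone sub-word counts as "beginning"
    return ["beginning"] + ["middle"] * (n - 2) + ["end"]


def allowed_subwords(lexicon: Lexicon, n: int) -> Dict[str, Set[str]]:
    """Pruning step: keep only sub-words that occur in lexicon words
    with exactly n sub-words, grouped by position label."""
    allowed: Dict[str, Set[str]] = {"beginning": set(), "middle": set(), "end": set()}
    for parts in lexicon.values():
        if len(parts) != n:
            continue
        for label, pos in zip(parts, position_labels(n)):
            allowed[pos].add(label)
    return allowed


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two sequences of sub-word labels."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


def recognize(segments: Sequence[object],
              classifiers: Dict[str, Scorer],
              lexicon: Lexicon) -> str:
    """Score only the unpruned classifiers on each segment, join the best
    candidates, and return the closest lexicon word."""
    n = len(segments)
    allowed = allowed_subwords(lexicon, n)
    candidates: List[str] = []
    for segment, pos in zip(segments, position_labels(n)):
        labels = [lab for lab in sorted(allowed[pos]) if lab in classifiers]
        best = max(labels, key=lambda lab: classifiers[lab](segment), default="")
        candidates.append(best)
    return min(lexicon, key=lambda w: edit_distance(candidates, lexicon[w]))
```

With a toy lexicon such as `{"alpha": ("al", "pha"), "alto": ("al", "to"), "beta": ("be", "t", "a")}`, a two-segment input causes only the classifiers for "al", "pha" and "to" to be evaluated; the classifier set for "beta" is pruned because that word has three sub-words.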
