Title: BLSTM-based handwritten text recognition using Web resources
Authors: Oprean, Cristina
Likforman-Sulem, Laurence
Mokbel, Chafic 
Popescu, Adrian
Affiliations: Department of Electrical Engineering 
Keywords: Facsimiles
Subjects: Dictionaries
Issue Date: 2015
Part of: 2015 13th International Conference on Document Analysis and Recognition (ICDAR)
Start page: 466
End page: 470
Conference: International Conference on Document Analysis and Recognition (ICDAR) (13th : 23-26 Aug 2015 : Tunisia) 
Abstract: Handwriting recognition systems usually rely on static dictionaries and language models. Full coverage by these dictionaries is generally not achieved when dealing with unrestricted document corpora, due to the presence of Out-Of-Vocabulary (OOV) words. In previous work, dynamic dictionaries were built from Web resources and successfully applied to isolated word recognition. In the present work we extend this approach to text-line recognition. Line segmentation into words is needed to exploit dynamic dictionaries, and it is performed using BLSTM classifiers to align filler models and word sequence outputs. Words are then classified by confidence score into anchor words and non-anchor words (AWs and NAWs). AWs are equated to the BLSTM outputs and used as such. Dynamic dictionaries are built for NAWs by exploiting Web resources for their character sequences and for neighboring AWs. Text-lines are then decoded again using the dynamic dictionaries and a re-estimated language model. We conduct experiments on the publicly available RIMES database and show that introducing the dynamic dictionary is beneficial. Equally important, we show that the gain increases as the proportion of OOVs increases.
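The anchor / non-anchor split described in the abstract can be illustrated with a minimal sketch. It assumes each recognized word comes with a confidence score in [0, 1]. The threshold value, the window size, and the function names are hypothetical and are not taken from the paper. The Web lookup that builds the dynamic dictionaries is not shown.

```python
from typing import List, Tuple

# Hypothetical threshold: the abstract does not state the value used.
CONFIDENCE_THRESHOLD = 0.8


def split_anchor_words(
    words: List[Tuple[str, float]],
    threshold: float = CONFIDENCE_THRESHOLD,
) -> List[Tuple[str, bool]]:
    """Label each (word, confidence) pair: True for anchor, False for non-anchor."""
    return [(word, conf >= threshold) for word, conf in words]


def neighboring_anchors(
    labeled: List[Tuple[str, bool]],
    index: int,
    window: int = 1,
) -> List[str]:
    """Return the anchor words within `window` positions of the word at `index`."""
    lo = max(0, index - window)
    hi = min(len(labeled), index + window + 1)
    return [
        word
        for i, (word, is_anchor) in enumerate(labeled[lo:hi], start=lo)
        if is_anchor and i != index
    ]


# Illustrative input: a recognized line with one low-confidence word.
line = [("je", 0.95), ("vuos", 0.40), ("remercie", 0.90)]
labeled = split_anchor_words(line)
context = neighboring_anchors(labeled, 1)
```

In this sketch, the low-confidence word at index 1 is treated as a NAW, and its neighboring AWs are the context that would accompany its character sequence when querying Web resources.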
Type: Conference Paper
Appears in Collections: Department of Electrical Engineering
