Handwriting-OOV word-recognition using web resources

Abstract : Handwriting recognition systems rely on predefined dictionaries. Small, static dictionaries are often used to obtain high in-vocabulary (IV) accuracy at the expense of coverage. As a result, out-of-vocabulary (OOV) words are not handled efficiently. To improve OOV recognition while keeping IV dictionaries small, we introduce a multi-step approach that exploits web resources. After an IV-OOV classification, Wikipedia is used to create dynamic dictionaries adapted to each OOV sequence. A second decoding is then performed with the dynamic dictionary to determine the most probable word for the OOV sequence. We validate our approach with experiments conducted on the RIMES dataset using a BLSTM recognizer. The results show improvements over handwriting recognition with a static dictionary.

https://hal-cea.archives-ouvertes.fr/cea-01822860
Contributor : Bruno Savelli
Submitted on : Monday, June 25, 2018 - 3:08:15 PM
Last modification on : Wednesday, May 15, 2019 - 7:36:31 AM

Citation

Cristina Oprean, Chafic Mokbel, Laurence Likforman-Sulem, Adrian Popescu. Handwriting-OOV word-recognition using web resources. Revue des Sciences et Technologies de l'Information - Série Document Numérique, Lavoisier, 2014, 17 (3), pp.77 - 96. ⟨10.3166/DN.17.3.77-96⟩. ⟨cea-01822860⟩
