
Building specialized bilingual lexicons using large-scale background knowledge

Abstract: Bilingual lexicons are central components of machine translation and cross-lingual information retrieval systems. Their manual construction requires strong expertise in both languages involved and is a costly process. Several automatic methods have been proposed as an alternative, but they often rely on resources available in only a limited number of languages, and their performance still falls far behind the quality of manual translations. We introduce a novel approach to the creation of specific-domain bilingual lexicons that relies on Wikipedia. This massively multilingual encyclopedia makes it possible to create lexicons for a large number of language pairs. Wikipedia is used to extract domains in each language, to link domains between languages, and to create generic translation dictionaries. The approach is tested on four specialized domains and compared to three state-of-the-art approaches on two language pairs: French-English and Romanian-English. The newly introduced method compares favorably to existing methods in all configurations tested.
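The last step the abstract mentions, creating generic translation dictionaries from Wikipedia, can be illustrated with the encyclopedia's interlanguage links, which pair article titles across language editions. The following is only a minimal sketch under that assumption, not the authors' implementation; the French-English link pairs below are hypothetical stand-ins for data that would come from a real Wikipedia dump.

```python
# Hypothetical sample of interlanguage links (source title, target title);
# in practice these would be extracted from a Wikipedia database dump.
langlinks = [
    ("Chimiothérapie", "Chemotherapy"),
    ("Cancer du sein", "Breast cancer"),
    ("Informatique", "Computer science"),
]

def build_dictionary(pairs):
    """Map each source-language title to its linked target-language title."""
    dictionary = {}
    for src_title, tgt_title in pairs:
        # Lowercase both sides so lookups are case-insensitive.
        dictionary[src_title.lower()] = tgt_title.lower()
    return dictionary

fr_en = build_dictionary(langlinks)
print(fr_en["chimiothérapie"])  # -> chemotherapy
```

A real pipeline would additionally need to restrict such a dictionary to the articles belonging to the specialized domain under consideration, as the paper's domain-linking step implies.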
Document type: Conference papers
Contributor: Léna Le Roy
Submitted on: Thursday, July 19, 2018 - 3:57:28 PM
Last modification on: Tuesday, December 14, 2021 - 3:56:49 AM


  • HAL Id: cea-01844695, version 1



D. Bouamor, A. Popescu, N. Semmar, P. Zweigenbaum. Building specialized bilingual lexicons using large-scale background knowledge. 2013 Conference on Empirical Methods in Natural Language Processing, EMNLP 2013, Oct 2013, Seattle, United States. pp.479-489. ⟨cea-01844695⟩


