Compounds and distributional thesauri

Abstract : Building distributional thesauri from corpora has been the focus of a significant number of articles, starting with (Grefenstette, 1994) and followed by (Lin, 1998), (Curran and Moens, 2002) and (Heylen and Peirsman, 2007). In all these cases, however, only single terms were considered. More recently, compositionality in distributional semantic representations has emerged as a topic and has been investigated for building the semantic representation of phrases, or even sentences, from the representations of their words. Until now, however, this work has not been carried out with the objective of building distributional thesauri. In this article, we investigate the impact of introducing compounds into the building of such thesauri. More precisely, we consider compounds as indivisible lexical units and evaluate their influence in three different roles: as features in the distributional contexts of single terms, as possible neighbors of single-term entries and, finally, as entries of a thesaurus. This investigation was conducted through an intrinsic evaluation on a large set of English nominal single terms and compounds with various frequencies.
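
The three roles named in the abstract can be illustrated with a toy sketch. This is not the method of the paper: the window-based co-occurrence counts, the cosine similarity, the underscore convention for joining compounds, and every function name and sentence below are assumptions chosen only for illustration.

```python
from collections import Counter, defaultdict
from math import sqrt


def merge_compounds(tokens, compounds):
    """Join known two-word compounds into one token (hypothetical 'a_b' convention)."""
    out, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) in compounds:
            out.append(tokens[i] + "_" + tokens[i + 1])
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out


def context_vectors(sentences, window=2):
    """Count, for each token, the tokens found in a symmetric window around it."""
    vectors = defaultdict(Counter)
    for sent in sentences:
        for i, word in enumerate(sent):
            lo, hi = max(0, i - window), min(len(sent), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][sent[j]] += 1
    return dict(vectors)


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def neighbors(entry, vectors, k=3):
    """Return the k entries most similar to `entry`, best first."""
    target = vectors.get(entry, Counter())
    scores = [(other, cosine(target, vec)) for other, vec in vectors.items() if other != entry]
    return sorted(scores, key=lambda pair: (-pair[1], pair[0]))[:k]


# Toy corpus and compound list, both invented for this sketch.
compounds = {("ice", "cream")}
raw = ["she ate ice cream today", "she ate sorbet today", "he drove the car"]
sentences = [merge_compounds(s.split(), compounds) for s in raw]
vectors = context_vectors(sentences)

print(neighbors("ice_cream", vectors))
```

In this sketch the compound `ice_cream` plays all three roles: it is a feature in the context vector of `she`, a candidate neighbor of the single term `sorbet`, and an entry with its own neighbor list.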
Document type :
Conference papers

https://hal-cea.archives-ouvertes.fr/cea-01844444
Contributor : Léna Le Roy
Submitted on : Thursday, July 19, 2018 - 1:31:11 PM
Last modification on : Thursday, September 12, 2019 - 8:56:06 AM

Identifiers

  • HAL Id : cea-01844444, version 1

Collections

CEA | DRT | LIST

Citation

Olivier Ferret. Compounds and distributional thesauri. 9th International Conference on Language Resources and Evaluation, LREC 2014, May 2014, Reykjavik, Iceland. pp.2979-2984. ⟨cea-01844444⟩
