Conference Poster, Year: 2022

Building Static Embeddings from Contextual Ones: Is It Useful for Building Distributional Thesauri?

Abstract

While contextual language models now dominate Natural Language Processing, the token-level representations they build are not suitable for all uses. In this article, we propose a new method for building word-level, or type-level, embeddings from contextual models. This method combines the generalization and the aggregation of token representations. We evaluate it on a large set of English nouns with a view to building distributional thesauri for extracting semantic similarity relations. Moreover, we analyze the differences between static embeddings and type-level embeddings according to features such as word frequency or the type of semantic relations these embeddings account for, showing that the properties of these two types of embeddings can be complementary and can be exploited to further improve distributional thesauri.
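The abstract does not spell out how the aggregation is done, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes that aggregation means averaging the token-level vectors of each word, and that thesaurus entries are ranked by cosine similarity. The function names (`aggregate_by_type`, `build_thesaurus`) and the toy vectors are hypothetical, and the generalization step mentioned in the abstract is not modeled at all.

```python
import math
from collections import defaultdict


def aggregate_by_type(token_vectors):
    """Average token-level vectors into one vector per word type.

    token_vectors: iterable of (word, vector) pairs, one pair per occurrence.
    Assumption: plain averaging stands in for the paper's aggregation step.
    """
    sums = {}
    counts = defaultdict(int)
    for word, vec in token_vectors:
        if word not in sums:
            sums[word] = [0.0] * len(vec)
        sums[word] = [s + v for s, v in zip(sums[word], vec)]
        counts[word] += 1
    return {w: [s / counts[w] for s in total] for w, total in sums.items()}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_thesaurus(type_vectors, k=2):
    """For each word, keep its k most similar other words (by cosine)."""
    thesaurus = {}
    for word, vec in type_vectors.items():
        neighbours = sorted(
            ((other, cosine(vec, other_vec))
             for other, other_vec in type_vectors.items() if other != word),
            key=lambda pair: pair[1],
            reverse=True,
        )
        thesaurus[word] = neighbours[:k]
    return thesaurus


# Made-up token vectors for illustration; real ones would come from a
# contextual model, one vector per occurrence of each noun in a corpus.
toy_tokens = [
    ("car", [0.9, 0.1, 0.0]),
    ("car", [0.7, 0.3, 0.0]),
    ("truck", [0.8, 0.1, 0.1]),
    ("banana", [0.0, 0.2, 0.9]),
]
type_vectors = aggregate_by_type(toy_tokens)
thesaurus = build_thesaurus(type_vectors, k=1)
```

In this toy setting, each thesaurus entry is a ranked list of `(neighbour, similarity)` pairs, which is the usual shape of a distributional thesaurus: a word followed by its nearest distributional neighbours.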
Main file
2022.lrec-1.276.pdf (281.83 KB)
Origin: Files produced by the author(s)

Dates and versions

cea-03745322, version 1 (04-08-2022)

Identifiers

  • HAL Id: cea-03745322, version 1

Cite

Olivier Ferret. Building Static Embeddings from Contextual Ones: Is It Useful for Building Distributional Thesauri? 13th Conference on Language Resources and Evaluation (LREC 2022), Jun 2022, Marseille, France, pp. 2583-2590. ⟨cea-03745322⟩