Building Static Embeddings from Contextual Ones: Is It Useful for Building Distributional Thesauri? - CEA - Commissariat à l’énergie atomique et aux énergies alternatives
Conference Paper. Year: 2022


Abstract

While contextual language models are now dominant in the field of Natural Language Processing, the representations they build at the token level are not always suitable for all uses. In this article, we propose a new method for building word- or type-level embeddings from contextual models. This method combines the generalization and the aggregation of token representations. We evaluate it on a large set of English nouns from the perspective of building distributional thesauri for extracting semantic similarity relations. Moreover, we analyze the differences between static embeddings and type-level embeddings according to features such as word frequency or the type of semantic relations these embeddings account for, showing that the properties of these two kinds of embeddings can be complementary and exploited to further improve distributional thesauri.
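The record itself contains no code; as an illustration only, here is a minimal sketch of the aggregation step the abstract mentions, in which a static (type-level) embedding is obtained by averaging the contextual token vectors observed for each word type. The function name, the (word, vector) input layout, and the plain averaging strategy are assumptions for illustration, not the paper's actual pipeline.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

def aggregate_type_embeddings(
    occurrences: Sequence[Tuple[str, List[float]]],
) -> Dict[str, List[float]]:
    """Average all contextual token vectors observed for each word type.

    `occurrences` pairs each token occurrence with the contextual vector a
    model (e.g. a BERT-like encoder) produced for it; the result maps each
    word type to one static vector (hypothetical sketch, not the paper's method).
    """
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = defaultdict(int)
    for word, vec in occurrences:
        if word not in sums:
            sums[word] = [0.0] * len(vec)
        # Accumulate the contextual vector for this occurrence of the word.
        sums[word] = [s + v for s, v in zip(sums[word], vec)]
        counts[word] += 1
    # Divide each accumulated sum by the occurrence count to get the mean.
    return {w: [s / counts[w] for s in vec_sum] for w, vec_sum in sums.items()}

# Example: two occurrences of "bank" in different contexts average out.
occurrences = [("bank", [1.0, 0.0]), ("bank", [0.0, 1.0]), ("river", [2.0, 2.0])]
static_embeddings = aggregate_type_embeddings(occurrences)
```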
Main file
2022.lrec-1.276.pdf (281.83 KB)
Origin: Files produced by the author(s)

Dates and versions

cea-03745322, version 1 (04-08-2022)

Identifiers

  • HAL Id: cea-03745322, version 1

Cite

Olivier Ferret. Building Static Embeddings from Contextual Ones: Is It Useful for Building Distributional Thesauri?. 13th Conference on Language Resources and Evaluation (LREC 2022), Jun 2022, Marseille, France. pp.2583‑2590. ⟨cea-03745322⟩