Unsupervised Event Clustering and Aggregation from Newswire and Web Articles - CEA - Commissariat à l’énergie atomique et aux énergies alternatives
Conference paper, Year: 2017

Unsupervised Event Clustering and Aggregation from Newswire and Web Articles

Abstract

In this paper, we present an unsupervised pipeline approach for clustering news articles based on the event instances identified in their content. We leverage press agency newswire and monolingual word alignment techniques to build meaningful and linguistically varied clusters of Web articles, with a view to a broader event type detection task. We validate our approach on a manually annotated corpus of Web articles.
Main file
W17-4211 (234.98 KB)
Origin: Files produced by the author(s)

Dates and versions

cea-01857885, version 1 (17-08-2018)

Identifiers

  • HAL Id: cea-01857885, version 1

Cite

Swen Ribeiro, Olivier Ferret, Xavier Tannier. Unsupervised Event Clustering and Aggregation from Newswire and Web Articles. 2017 EMNLP Workshop: Natural Language Processing meets Journalism, 2017, Copenhagen, Denmark. pp. 62-67. ⟨cea-01857885⟩
144 views
110 downloads
