Conference Papers, Year: 2017

Unsupervised Event Clustering and Aggregation from Newswire and Web Articles

Abstract

We present an unsupervised pipeline for clustering news articles based on the event instances identified in their content. We leverage press agency newswire and monolingual word alignment techniques to build meaningful, linguistically varied clusters of Web articles, with a view to a broader event type detection task. We validate our approach on a manually annotated corpus of Web articles.
Main file: W17-4211 (234.98 KB)
Origin: Files produced by the author(s)

Dates and versions

cea-01857885, version 1 (17-08-2018)

Identifiers

  • HAL Id: cea-01857885, version 1

Cite

Swen Ribeiro, Olivier Ferret, Xavier Tannier. Unsupervised Event Clustering and Aggregation from Newswire and Web Articles. 2017 EMNLP Workshop: Natural Language Processing meets Journalism, 2017, Copenhagen, Denmark. pp. 62-67. ⟨cea-01857885⟩