Unsupervised Event Clustering and Aggregation from Newswire and Web Articles

Abstract : In this paper, we present an unsupervised pipeline approach for clustering news articles based on the event instances identified in their content. We leverage press agency newswire and monolingual word alignment techniques to build meaningful and linguistically varied clusters of Web articles, with a view to a broader event type detection task. We validate our approach on a manually annotated corpus of Web articles.
https://hal-cea.archives-ouvertes.fr/cea-01857885
Contributor : Olivier Ferret
Submitted on : Friday, August 17, 2018 - 3:31:08 PM
Last modification on : Saturday, May 4, 2019 - 1:20:46 AM

File

W17-4211
Files produced by the author(s)

Identifiers

  • HAL Id : cea-01857885, version 1

Citation

Swen Ribeiro, Olivier Ferret, Xavier Tannier. Unsupervised Event Clustering and Aggregation from Newswire and Web Articles. 2017 EMNLP Workshop: Natural Language Processing meets Journalism, 2017, Copenhagen, Denmark. pp.62-67. ⟨cea-01857885⟩
