Div150Multi: A social image retrieval result diversification dataset with multi-topic queries
Abstract
In this paper, we introduce a new dataset, Div150Multi, designed to support shared evaluation of diversification techniques in social media photo retrieval and related areas. The dataset comes with relevance and diversity assessments performed by trusted annotators. The data consists of around 300 complex queries represented by 86,769 Flickr photos, around 27M photo links for around 6,000 users, metadata, Wikipedia pages, and content descriptors for the text and visual modalities, including state-of-the-art deep features. To facilitate distribution, only Creative Commons content allowing redistribution was included in the dataset. The proposed dataset was validated during the 2015 Retrieving Diverse Social Images Task at the MediaEval Benchmarking.