Conference papers

AMECON: Abstract meta-concept features for text-illustration

Abstract : Cross-media retrieval is a problem of high interest at the frontier between computer vision and natural language processing. The state of the art in the domain consists of learning a common space, under constraints of correlation or similarity, from the textual and visual modalities, which are processed in parallel and possibly jointly. This paper proposes a different approach that treats the cross-modal problem as a supervised mapping of the visual modality to the textual one. Each modality is thus seen as a particular projection of an abstract meta-concept: each of its dimensions subsumes several semantic concepts (the "meta" aspect) but may not correspond to an actual one (the "abstract" aspect). In practice, the textual modality is used to generate a multi-label representation, which then serves as the target for mapping the visual modality through a simple shallow neural network. Although the approach is quite easy to implement, experiments show that it significantly outperforms the state of the art on the Flickr-8K and Flickr-30K datasets for the text-illustration task. The source code is available at
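The mapping described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy visual features, the binary multi-label targets, the hidden-layer size, the cosine ranking, and every function name below are assumptions made for illustration only. The sketch trains a one-hidden-layer network to map image features to text-derived multi-label vectors, then illustrates a text by retrieving the image whose predicted vector is closest to the text's labels.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, W1, W2):
    """One hidden tanh layer followed by independent sigmoid outputs."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    y = [sigmoid(sum(w * hi for w, hi in zip(row, h))) for row in W2]
    return h, y


def train(images, targets, hidden=4, epochs=1000, lr=0.3, seed=0):
    """Per-example gradient descent on binary cross-entropy (multi-label)."""
    rng = random.Random(seed)
    d, k = len(images[0]), len(targets[0])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(d)] for _ in range(hidden)]
    W2 = [[rng.uniform(-0.5, 0.5) for _ in range(hidden)] for _ in range(k)]
    for _ in range(epochs):
        for x, t in zip(images, targets):
            h, y = forward(x, W1, W2)
            # Gradient of the loss w.r.t. each pre-sigmoid output is (y - t).
            dy = [yj - tj for yj, tj in zip(y, t)]
            # Backpropagate through W2 and the tanh nonlinearity.
            dh = [(1 - h[i] ** 2) * sum(dy[j] * W2[j][i] for j in range(k))
                  for i in range(hidden)]
            for j in range(k):
                for i in range(hidden):
                    W2[j][i] -= lr * dy[j] * h[i]
            for i in range(hidden):
                for m in range(d):
                    W1[i][m] -= lr * dh[i] * x[m]
    return W1, W2


def cosine(a, b):
    na = math.sqrt(sum(v * v for v in a))
    nb = math.sqrt(sum(v * v for v in b))
    return sum(p * q for p, q in zip(a, b)) / (na * nb) if na and nb else 0.0


def illustrate(text_labels, images, W1, W2):
    """Return the index of the image whose predicted labels best match the text."""
    scores = [cosine(text_labels, forward(x, W1, W2)[1]) for x in images]
    return max(range(len(images)), key=scores.__getitem__)


# Hypothetical toy data: 3 images with 3-d visual features, and multi-label
# targets over 3 dimensions assumed to come from the paired texts.
images = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
targets = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]
W1, W2 = train(images, targets)
retrieved = [illustrate(t, images, W1, W2) for t in targets]
```

In this toy setting each text's label vector is used directly as the query; how the multi-label representation is actually built from text, and which visual features are used, is described in the paper itself rather than in this record.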
Document type :
Conference papers
Contributor : Léna Le Roy
Submitted on : Tuesday, June 12, 2018 - 3:30:08 PM
Last modification on : Thursday, February 17, 2022 - 10:08:05 AM



I. Chami, Y. Tamaazousti, Hervé Le Borgne. AMECON: Abstract meta-concept features for text-illustration. ICMR 2017 - Proceedings of the 2017 ACM International Conference on Multimedia Retrieval, Jun 2017, Bucharest, Romania. pp.347-355, ⟨10.1145/3078971.3078993⟩. ⟨cea-01813718⟩


