Cross-modal classification by completing unimodal representations

Thi Quynh Nhi Tran; Hervé Le Borgne; M. Crucianu

doi:10.1145/2983563.2983570

Conference paper, Year: 2016

Thi Quynh Nhi Tran

Role: Author

Département Intelligence Ambiante et Systèmes Interactifs

CEDRIC. Données complexes, apprentissage et représentations

Hervé Le Borgne

Role: Author
PersonId: 181478
IdHAL: herve-le-borgne
ORCID: 0000-0003-0520-8436
IdRef: 079208452

Département Intelligence Ambiante et Systèmes Interactifs

M. Crucianu

Role: Author
PersonId: 180351
IdHAL: michel-crucianu
ORCID: 0000-0001-8204-6843

CEDRIC. Données complexes, apprentissage et représentations

Abstract

We argue that cross-modal classification, where models are trained on data from one modality (e.g. text) and applied to data from another (e.g. image), is a relevant problem in multimedia retrieval. We propose a method that addresses this specific problem, related to but different from cross-modal retrieval and bimodal classification. This method relies on a common latent space where both modalities have comparable representations and on an auxiliary dataset from which we build a more complete bimodal representation of any unimodal data. Evaluations on Pascal VOC07 and NUS-WIDE show that the novel representation method significantly improves the results compared to the use of a latent space alone. The level of performance achieved makes cross-modal classification a convincing choice for real applications.
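The completion idea in the abstract can be illustrated with a small sketch. It assumes that all vectors are already projected into the common latent space and that the auxiliary dataset consists of paired text and image projections. The abstract does not say how the completion is computed, so the nearest-neighbour averaging used here is an assumption for illustration only, and `complete_representation`, `aux_text`, and `aux_image` are hypothetical names.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def complete_representation(query, aux_same, aux_other, k=2):
    """Complete a unimodal latent vector into a bimodal one.

    query     : latent vector of the available modality (e.g. text)
    aux_same  : auxiliary latent vectors of that same modality
    aux_other : paired auxiliary latent vectors of the missing modality
    k         : number of auxiliary neighbours to average (assumed strategy)
    """
    if len(aux_same) != len(aux_other):
        raise ValueError("auxiliary modalities must be paired one-to-one")
    k = min(k, len(aux_same))
    # Indices of the k auxiliary items closest to the query in latent space.
    nearest = sorted(range(len(aux_same)),
                     key=lambda i: euclidean(query, aux_same[i]))[:k]
    # Estimate the missing modality as the mean of the paired vectors.
    dim = len(aux_other[0])
    estimate = [sum(aux_other[i][d] for i in nearest) / k for d in range(dim)]
    # The bimodal representation is the concatenation of both parts.
    return list(query) + estimate


# Toy auxiliary dataset: three paired (text, image) latent vectors.
aux_text = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]]
aux_image = [[0.0, 2.0], [2.0, 0.0], [9.0, 9.0]]

text_query = [0.4, 0.0]
completed = complete_representation(text_query, aux_text, aux_image, k=2)
print(completed)
```

With this kind of completion, a classifier trained on completed text vectors and one applied to completed image vectors would both operate on inputs of the same bimodal form, which is the property the abstract relies on.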

Keywords

Multimedia Retrieval; Cross-modal; Text processing; Unimodal; nocv2; Real applications; Representation method; Specific problems

Domains

Engineering Sciences [physics]

Main file

ivlm08-tranA.pdf (1.14 MB)

Origin: Files produced by the author(s)

Contributor: Léna Le Roy

https://cea.hal.science/cea-01840417

Submitted on: Friday, January 10, 2020, 16:29:25

Last modified on: Wednesday, April 3, 2024, 11:14:12

Dates and versions

cea-01840417, version 1 (10-01-2020)

Identifiers

HAL Id: cea-01840417, version 1
DOI: 10.1145/2983563.2983570

Cite

Thi Quynh Nhi Tran, Hervé Le Borgne, M. Crucianu. Cross-modal classification by completing unimodal representations. iV&L-MM '16 Proceedings of the 2016 ACM workshop on Vision and Language Integration Meets Multimedia Fusion, Oct 2016, Amsterdam, Netherlands. pp.17-25, ⟨10.1145/2983563.2983570⟩. ⟨cea-01840417⟩

Collections

CEA CNAM DRT CEA-UPSAY UNIV-PARIS-SACLAY LIST CEDRIC-CNAM GS-ENGINEERING GS-COMPUTER-SCIENCE GS-SPORT-HUMAN-MOVEMENT HESAM
