Theses

Modèles neuronaux pour la représentation et l'appariement d'objets géotextuels (Neural models for the representation and matching of geotextual objects)

Paul Mousset 1
1 IRIT-IRIS - Recherche d’Information et Synthèse d’Information
IRIT - Institut de recherche en informatique de Toulouse
Abstract : Driven by the widespread use of smartphones, the joint use of textual and spatial data in spatio-textual objects (e.g., tweets, Flickr photos, POI reviews) has become the mainstay of many applications, such as crisis management, tourist assistance, and the search for points of interest (POIs). These tasks fundamentally rely on the representation of spatial objects and the definition of matching functions. Previous work has addressed the problem with language models that rely on costly probability estimation of the relevance of words in spatial regions. However, these traditional methods are not very effective on social network data: such texts are usually short, use unconventional or ambiguous words, and are difficult to match with other documents because of vocabulary mismatch. As a result, these approaches generally yield low recall and precision.

In this thesis, we tackle the semantic gap in the representation and matching of geotagged tweets and POIs. We propose to leverage geographic contexts and distributional semantics to address the semantic location prediction task. Our work makes two main contributions: (1) improving word embeddings, which can be combined to build object representations, by using spatial word distributions; (2) exploiting deep neural networks to perform semantic matching between tweets and POIs.

Regarding the improvement of text representations, we propose to regularize the word embeddings from which object representations are built. The purpose is to reveal possible local semantic relationships between words and the multiple meanings of a single word. To detect the local specificities of the different meanings, we consider two alternatives: one based on spatial partitioning with the k-means algorithm, and the other based on probabilistic partitioning with kernel density estimation (KDE).
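
The two partitioning alternatives can be illustrated with a minimal NumPy sketch. It is not the thesis's implementation: the coordinates are made-up occurrences of a single hypothetical word, they are treated as planar points, and the function names (`kmeans`, `gaussian_kde_density`), the number of clusters and the bandwidth are all assumptions chosen for illustration.

```python
import numpy as np

def _assign(points, centroids):
    # Index of the nearest centroid for every point.
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means on 2-D points; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        labels = _assign(points, centroids)
        # Recompute each centroid; keep the old one if its cluster is empty.
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, _assign(points, centroids)

def gaussian_kde_density(points, query, bandwidth):
    """Isotropic 2-D Gaussian kernel density estimate evaluated at `query`."""
    sq_dist = np.sum((points - query) ** 2, axis=1)
    norm = 2.0 * np.pi * bandwidth ** 2
    return np.mean(np.exp(-sq_dist / (2.0 * bandwidth ** 2))) / norm

# Hypothetical occurrences of one word, concentrated in two distant areas.
coords = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                   [10.0, 10.0], [10.1, 10.0], [10.0, 10.1]])

# Hard (k-means) partitioning: each occurrence belongs to one spatial cluster.
centroids, labels = kmeans(coords, k=2)

# Soft (KDE) view: how dense the word's usage is around a given location.
density_near = gaussian_kde_density(coords, np.array([0.0, 0.0]), bandwidth=1.0)
density_far = gaussian_kde_density(coords, np.array([5.0, 5.0]), bandwidth=1.0)
```

The contrast this is meant to show: k-means assigns each occurrence to exactly one region, whereas the density estimate gives a graded measure of how strongly a word is associated with any location.
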
Word embeddings are then retrofitted using a regularization function that integrates the spatial distributions to compute the local semantic relationships between words.

Regarding the use of deep neural networks for the semantic location prediction task, we propose an interaction-based neural model designed for matching tweet-POI pairs. Unlike existing architectures, our approach jointly learns local and global interactions between the tweet and the POI of each pair. In the proposed model, the exact matching signals of the local word-to-word interactions are corrected by a spatial damping factor, and the resulting smoothed signals are then processed using matching histograms. The local interactions reveal word-pair similarity patterns driven by spatial information. The global interactions capture the strength of the interaction between the tweet and the POI, both spatially, through the geographical distance between the geotextual objects, and semantically, through the semantic proximity of their latent representations.
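
As a rough illustration of the local interaction step, the sketch below builds matching histograms from word-to-word cosine similarities and attenuates exact matches by a distance-dependent factor. The abstract does not give the damping formula, so the exponential form `exp(-distance_km / scale_km)`, the function names, the number of bins and the toy 2-D vectors are all assumptions, not the thesis's model.

```python
import numpy as np

def cosine_matrix(a, b):
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return np.clip(a @ b.T, -1.0, 1.0)

def damped_matching_histograms(tweet_vecs, poi_vecs, distance_km,
                               scale_km=1.0, n_bins=4):
    """One histogram of similarities per tweet word, shape (n_tweet_words, n_bins)."""
    sims = cosine_matrix(tweet_vecs, poi_vecs)
    # Hypothetical damping: exact matches count less as the POI gets farther away.
    damping = np.exp(-distance_km / scale_km)
    exact = np.isclose(sims, 1.0)
    sims = np.where(exact, sims * damping, sims)
    # Equal-width bins over [-1, 1]; the last bin includes the value 1.
    edges = np.linspace(-1.0, 1.0, n_bins + 1)
    return np.stack([np.histogram(row, bins=edges)[0] for row in sims])

# Toy embeddings: 2 tweet words and 3 POI words (made-up values).
tweet_vecs = np.array([[1.0, 0.0], [0.0, 1.0]])
poi_vecs = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])

hist_near = damped_matching_histograms(tweet_vecs, poi_vecs, distance_km=0.0)
hist_far = damped_matching_histograms(tweet_vecs, poi_vecs, distance_km=10.0)
```

With a distance of zero the exact matches stay in the highest-similarity bin; at 10 km (with the assumed 1 km scale) they are pushed toward zero similarity, so the same word overlap contributes a weaker signal for a distant POI. In an interaction-based model such histograms would typically be fed to a feed-forward network, which is omitted here.
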
Document type :
Theses

Cited literature : 314 references

https://tel.archives-ouvertes.fr/tel-02979573
Contributor : Abes Star
Submitted on : Tuesday, October 27, 2020 - 11:09:06 AM
Last modification on : Wednesday, October 28, 2020 - 3:35:12 AM

File

2020TOU30042a.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-02979573, version 1

Citation

Paul Mousset. Modèles neuronaux pour la représentation et l'appariement d'objets géotextuels. Interface homme-machine [cs.HC]. Université Paul Sabatier - Toulouse III, 2020. Français. ⟨NNT : 2020TOU30042⟩. ⟨tel-02979573⟩

Metrics

Record views : 61
File downloads : 50