Automatic Detection of Bot-generated Tweets - CEA - Commissariat à l’énergie atomique et aux énergies alternatives
Conference paper. Year: 2022

Automatic Detection of Bot-generated Tweets

Abstract

Deep neural networks can generate textual content that is increasingly difficult to distinguish from text produced by humans. Such content can be used in disinformation campaigns, and its detrimental effects are amplified when it spreads on social networks. Here, we study the automatic detection of bot-generated Twitter messages. This task is difficult because of the combination of the strong performance of recent deep language models and the limited length of tweets. In this study, we propose a challenging definition of the problem by making no assumption regarding the bot account, its network, or the method used to generate the text. We devise two approaches for bot detection based on pretrained language models and create a new dataset of generated tweets to improve the performance of our classifier on recent text generation algorithms. The results show that the generalization capabilities of the proposed classifier depend heavily on the dataset used to train the model. Interestingly, the two automatic dataset augmentation methods proposed here show promising results. Their introduction leads to consistent performance gains compared to using the original dataset alone.
Main file
Bot_Tweet_Detection_HAL.pdf (2.53 MB)
Origin: Files produced by the author(s)

Dates and versions

cea-03788573, version 1 (26-09-2022)

Cite

Julien Tourille, Babacar Sow, Adrian Popescu. Automatic Detection of Bot-generated Tweets. 1st ACM International Workshop on Multimedia AI against Disinformation, Jun 2022, Newark, United States. pp.44-51, ⟨10.1145/3512732.3533584⟩. ⟨cea-03788573⟩
