
Automatic Detection of Bot-generated Tweets

Abstract : Deep neural networks can generate textual content that is increasingly difficult to distinguish from text produced by humans. Such content can be used in disinformation campaigns, and its detrimental effects are amplified when it spreads on social networks. Here, we study the automatic detection of bot-generated Twitter messages. This task is difficult because of the combination of the strong performance of recent deep language models and the limited length of tweets. In this study, we propose a challenging definition of the problem by making no assumption regarding the bot account, its network, or the method used to generate the text. We devise two approaches for bot detection based on pretrained language models and create a new dataset of generated tweets to improve the performance of our classifier on recent text generation algorithms. The results show that the generalization capabilities of the proposed classifier depend heavily on the dataset used to train the model. Interestingly, the two automatic dataset augmentation methods proposed here show promising results. Their introduction leads to consistent performance gains compared to the use of the original dataset alone.
Document type :
Conference papers

https://hal-cea.archives-ouvertes.fr/cea-03788573
Contributor : Contributeur MAP CEA
Submitted on : Monday, September 26, 2022 - 6:17:29 PM
Last modification on : Thursday, September 29, 2022 - 4:32:11 AM

File

Bot_Tweet_Detection_HAL.pdf
Files produced by the author(s)

Citation

Julien Tourille, Babacar Sow, Adrian Popescu. Automatic Detection of Bot-generated Tweets. 1st ACM International Workshop on Multimedia AI against Disinformation, Jun 2022, Newark, United States. pp.44-51, ⟨10.1145/3512732.3533584⟩. ⟨cea-03788573⟩
