Disinformation: analysis and identification
April 26, 2024

We present an extensive study on disinformation, which is defined as information that is false and misleading and intentionally shared to cause harm. Through this work, we aim to answer the following questions: (1) Can we automatically and accurately classify a news article as containing disinformation? (2) What characteristics of disinformation differentiate it from other types of benign information? We conduct this study in the context of two significant events: the US elections of 2016 and the 2020 COVID pandemic. We build a series of classifiers to (i) examine linguistic clues exhibited by different types of fake news articles, (ii) analyze the “clickbaityness” of disinformation headlines, and (iii) finally, perform fine-grained, veracity-based article classification through a natural language inference (NLI) module for automated disinformation verification; this utilizes a manually curated set of evidence sources. For the latter, we built a new dataset that is annotated with generic, veracity-based labels and ground truth evidence supporting each label. The veracity labels were formulated based on examining standards used by reputable fact-checking organizations. We show that disinformation derives features from both propaganda and mainstream news, making it more challenging to detect. However, there is significant potential for automating the fact-checking process to incorporate the degree of veracity. We provide error analysis that illustrates the challenges involved in the automated fact-checking task and identifies factors that may improve this process in future work. Finally, we also describe the implementation of a web app that extracts important entities and actions from a given article and searches the web to gather evidence from credible sources. The evidence articles are then used to generate a veracity label that can assist manual fact-checkers engaged in combating disinformation.

Title in Portuguese
Platform
Springer
Dataset types
Tabular
Dataset format
Raw
Repository
Authors
Archita Pathak, Rohini Shihari, Nihiti Natu
Collection type
Scripting
Collection tool and method
Collection start
Collection end
Rehydration procedure
Dataset classification
Exact and Earth Sciences
Keywords
COVID-19, Pandemic, Disinformation, Fake News
Creation date
April 26, 2024
Funding organizations
ABNT
Pathak, A., Natu, N., Shihari, R. Disinformation: analysis and identification. Available at: https://link.springer.com/content/pdf/10.1007/s10588-021-09336-x.pdf. Accessed on 03/05/2025
APA
Pathak, A., Natu, N., Shihari, R. (???). Disinformation: analysis and identification.