To Make Someone Do Something: Mining Alert-Style Directives in Bulgarian Social Media

Last modified by ruslana m on 2026/03/30 09:40

Published Friday 27 March 2026 at 10:21

A study addressing a critical gap in Natural Language Processing (NLP) for Bulgarian, a language often classified as "low-resource" due to the scarcity of annotated datasets, was presented at the LoResLM (Language Models for Low-Resource Languages) workshop at EACL 2026.

The study, presented by Stanislav Penkov and co-authored by Ruslana Margova, is titled "To Make Someone Do Something: Mining Alert-Style Directives in Bulgarian Social Media." It focuses on automatically identifying linguistic patterns designed to mobilize audiences.

The researchers hypothesize that specific linguistic constructions act as stimuli, or "directives," that frame public reception and provoke immediate action. On social media, this "alert-style" rhetoric is frequently used to spread misinformation or mobilize social groups.

The study analyzed a dataset of nearly 14,000 Bulgarian Facebook posts spanning 2014–2024. The team developed an unsupervised pipeline that mines these patterns without manual labeling. Posts were first POS-tagged with Stanza, and multiword constructions of 2–5 tokens were then extracted. An "Alert Enrichment Score" ranked these constructions by their association with high-engagement posts. Using K-means clustering and logistic regression, the team identified more than 18,000 alert constructions, and the interpretable model reached an accuracy of about 65%.
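The mining and ranking steps can be illustrated with a minimal sketch. The article does not give the formula for the Alert Enrichment Score, so the smoothed log-ratio below is only an assumed stand-in, not the authors' definition. The function names, the `min_count` and `alpha` parameters, the boolean high-engagement flag, and the toy posts are likewise hypothetical, and whitespace tokens replace the Stanza POS-tagging step to keep the example self-contained.

```python
import math
from collections import Counter


def ngrams(tokens, n_min=2, n_max=5):
    """Yield every contiguous n-gram of length n_min..n_max as a tuple."""
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            yield tuple(tokens[i:i + n])


def enrichment_scores(posts, min_count=2, alpha=1.0):
    """Score n-grams by how strongly they lean towards high-engagement posts.

    posts: list of (tokens, is_high_engagement) pairs.
    ASSUMPTION: the score is a smoothed log-ratio of document frequencies;
    the article does not specify the actual Alert Enrichment Score.
    """
    high_df, low_df = Counter(), Counter()
    n_high = n_low = 0
    for tokens, is_high in posts:
        grams = set(ngrams(tokens))  # count each n-gram once per post
        if is_high:
            n_high += 1
            high_df.update(grams)
        else:
            n_low += 1
            low_df.update(grams)

    scores = {}
    for gram in set(high_df) | set(low_df):
        h, l = high_df[gram], low_df[gram]
        if h + l < min_count:  # skip n-grams seen in too few posts
            continue
        p_high = (h + alpha) / (n_high + 2 * alpha)
        p_low = (l + alpha) / (n_low + 2 * alpha)
        scores[gram] = math.log(p_high / p_low)
    return scores


# Toy data (hypothetical): two high-engagement and two ordinary posts.
posts = [
    ("трябва да спрем това".split(), True),
    ("трябва да действаме сега".split(), True),
    ("днес е хубав ден".split(), False),
    ("времето днес е слънчево".split(), False),
]
scores = enrichment_scores(posts)
for gram, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(" ".join(gram), round(score, 3))
```

In this toy run only two constructions pass the frequency filter: "трябва да" receives a positive score because it occurs only in the high-engagement posts, while "днес е" receives a negative one. A real pipeline would operate on POS-tagged tokens and a far larger corpus.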

The research identified several key clusters of "alert" language that appear across various topics (politics, war, economics), suggesting a cross-topic rhetorical layer:

  • Modality (Obligation): Phrases like “трябва да” (must/should).
  • Negation (Resistance): Constructions such as “няма да” (will not) or “не искаме” (we do not want).
  • National Identity: Framing through collective terms like “ние, българите” (we, Bulgarians).
  • Accusatory Tone: Attributing intent to out-groups, e.g., “те искат” (they want).
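As a rough illustration of how such clusters could serve as interpretable features, the sketch below flags which of the four example categories a post touches. It uses only the example phrases quoted above and plain case-insensitive substring matching. The `CUES` dictionary and the `detect_cues` function are hypothetical names, and the study's actual feature set is far larger than this handful of phrases.

```python
# Example phrases taken from the cluster list above; the study's real
# inventory contains many more constructions per cluster.
CUES = {
    "Modality (Obligation)": ["трябва да"],
    "Negation (Resistance)": ["няма да", "не искаме"],
    "National Identity": ["ние, българите"],
    "Accusatory Tone": ["те искат"],
}


def detect_cues(text):
    """Return the cluster labels whose example phrases occur in the text.

    Matching is a naive case-insensitive substring test (an assumption made
    for brevity); it ignores token boundaries and morphology.
    """
    lowered = text.casefold()
    return [
        label
        for label, phrases in CUES.items()
        if any(phrase in lowered for phrase in phrases)
    ]


print(detect_cues("Ние, българите, няма да мълчим!"))
```

Because the matching is substring-based, it can fire inside longer words or miss inflected variants; token-level matching on tagged text would be more robust.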

Future Impact

This work provides a foundational corpus and feature set for the Bulgarian NLP community. Beyond its academic contribution, the findings pave the way for real-time monitoring systems capable of detecting manipulative rhetoric and signs of escalating social tension online.

The research was conducted at the GATE Institute (Sofia University) and supported by the GATE project and the BROD (Bulgarian-Romanian Observatory of Digital Media) project.

 
