After a two-year hiatus, X (formerly Twitter) finally released its latest transparency report for the first half of 2024 in September – a marked change from its previous practice of more frequent disclosures. Transparency reports, which platforms like Facebook and Google also publish, are designed to provide insight into how companies manage content, enforce policies and respond to government requests.
The most striking revelation in the latest data: a surge in reports of child exploitation content, but a significant drop in action against hateful content.
The report highlights a key trend in content moderation: X’s increasing reliance on artificial intelligence to identify and manage harmful behavior. According to the platform, moderation is enforced through “a combination of machine learning and human review,” with AI systems taking direct action or flagging content for further review. Can machines really take responsibility for moderating sensitive issues – or are they exacerbating the problems they are supposed to solve?
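X has not published details of this pipeline, so the following is only a hypothetical sketch of how a “machine learning plus human review” triage commonly works: a model assigns a harm score, high-confidence cases are actioned automatically, and borderline cases are routed to a human queue. The names and thresholds here are assumptions for illustration, not X’s actual system.

```python
# Hypothetical sketch of a "machine learning plus human review" triage flow.
# The model call, thresholds, and queue names are illustrative assumptions,
# not a description of X's actual moderation system.
from dataclasses import dataclass


@dataclass
class Post:
    post_id: str
    text: str


def harm_score(post: Post) -> float:
    """Placeholder for an ML classifier returning a 0-1 probability of harm."""
    return 0.0  # a real system would call a trained model here


def triage(post: Post,
           auto_action_threshold: float = 0.95,
           human_review_threshold: float = 0.60) -> str:
    """Route a post based on the classifier's confidence."""
    score = harm_score(post)
    if score >= auto_action_threshold:
        return "auto_action"         # high confidence: the system acts directly
    if score >= human_review_threshold:
        return "human_review_queue"  # uncertain: flag for a human moderator
    return "no_action"               # low score: leave the post up
```

Where the thresholds sit is a policy decision, not a technical one, which is why the same architecture can produce very different enforcement numbers under different leadership.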
More warning signs, less action
X’s transparency report reveals staggering numbers. During the first half of 2024, users reported more than 224 million accounts and tweets, a massive increase from the 11.6 million accounts reported during the second half of 2021. Despite this increase of almost 1,830% in reports, account suspensions rose far less sharply, from 1.3 million in the second half of 2021 to 5.3 million in 2024, an increase of around 300%. Even more striking: of the more than 8.9 million child safety-related posts reported this year, X removed only 14,571.
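For readers checking the arithmetic, both percentage figures follow directly from the raw counts in the report; a minimal sketch:

```python
# Quick check of the percentage figures quoted above, using the report's raw counts.
reports_h2_2021 = 11.6e6      # accounts reported, H2 2021
reports_h1_2024 = 224e6       # accounts and tweets reported, H1 2024
suspensions_h2_2021 = 1.3e6
suspensions_h1_2024 = 5.3e6


def pct_increase(old: float, new: float) -> float:
    return (new - old) / old * 100


print(f"Reports:     +{pct_increase(reports_h2_2021, reports_h1_2024):.0f}%")          # ~1831%
print(f"Suspensions: +{pct_increase(suspensions_h2_2021, suspensions_h1_2024):.0f}%")  # ~308%
```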
Part of the discrepancy between these figures can be attributed to changing definitions and policies regarding hate speech and misinformation. Under its previous leadership, X released comprehensive 50-page reports detailing takedowns and policy violations. By contrast, the latest report is 15 pages long and uses different measurement methods.
Furthermore, the company scrapped its rules on COVID-19 misinformation and no longer considers misgendering or deadnaming hate speech, complicating enforcement. For example, the report states that X suspended only 2,361 accounts for hateful conduct, compared to 104,565 in the second half of 2021.
Can AI make moral judgments?
As AI becomes the backbone of content moderation systems across platforms like Facebook, YouTube, and X itself, questions remain about its effectiveness in combating harmful behavior.
Automated reviewers have long been shown to be error-prone: they struggle to accurately interpret hate speech and often misclassify innocuous content as harmful. In 2020, for example, Facebook’s automated systems were seen blocking ads from struggling businesses, and in April of this year its algorithm mistakenly flagged Auschwitz Museum posts as violating community standards.
Additionally, many algorithms are developed using datasets drawn primarily from the Global North, which can make them insensitive to other linguistic and cultural contexts.
A September brief from the Center for Democracy and Technology highlighted the pitfalls of this approach, noting that a lack of diversity in natural language processing teams can negatively impact the accuracy of automated content moderation, particularly for dialects such as Maghrebi Arabic.
As the Mozilla Foundation notes: “As it is integrated into every aspect of our lives, AI has the potential to reinforce existing power hierarchies and societal inequalities. This raises questions about how to responsibly manage potential risks to individuals when designing AI.”
In practice, AI fails in more nuanced areas, like sarcasm or coded language. This inconsistency may also explain the decline in action against hate speech on X, where AI systems struggle to identify all harmful behavior.
Despite advances in language AI, detecting hate speech remains a complex challenge. A 2021 study by researchers at Oxford and the Alan Turing Institute tested several AI hate speech detection models and revealed significant performance gaps. Their test suite, HateCheck, includes targeted tests for different types of hate speech as well as non-hateful scenarios that often confuse AI systems. Applying it to tools like Google Jigsaw’s Perspective API and Two Hat’s SiftNinja, the study found that Perspective over-flagged non-hateful content while SiftNinja under-detected hate speech.
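Part of the difficulty is that tools like Perspective return a probability-style score rather than a binary hate/not-hate verdict, and someone downstream has to pick a cutoff. The sketch below is a minimal illustration based on Perspective’s publicly documented REST endpoint; the API key is a placeholder, and any threshold a platform applies is a policy choice rather than part of the API or the Oxford study.

```python
# Minimal sketch of requesting a toxicity score from Jigsaw's Perspective API.
# Request shape follows the publicly documented v1alpha1 REST endpoint; the API
# key is a placeholder, and any decision threshold is a policy choice.
import requests

API_KEY = "YOUR_API_KEY"  # placeholder
URL = f"https://commentanalyzer.googleapis.com/v1alpha1/comments:analyze?key={API_KEY}"


def toxicity_score(text: str) -> float:
    payload = {
        "comment": {"text": text},
        "languages": ["en"],
        "requestedAttributes": {"TOXICITY": {}},
    }
    response = requests.post(URL, json=payload, timeout=10)
    response.raise_for_status()
    return response.json()["attributeScores"]["TOXICITY"]["summaryScore"]["value"]


if __name__ == "__main__":
    score = toxicity_score("Thanks for sharing this, really interesting.")
    print(f"TOXICITY: {score:.2f}")  # a probability-style score, not a verdict
```

Because the output is a score, the trade-off between over-flagging and under-detecting is set by whoever chooses the threshold, which is exactly where the patterns the study describes tend to emerge.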
Excessive dependence on AI moderation also risks undermining freedom of expression, particularly for marginalized communities who often use coded language. Researchers also argue that even well-optimized algorithms can exacerbate existing problems within content policies.
A broader problem
X’s difficulties are not unique. Other platforms, like Meta (which owns Facebook, Instagram, and Threads), have followed similar trajectories with their AI moderation systems. Meta has acknowledged that its algorithms often fail to correctly identify misinformation or hate speech, resulting in false positives and missed instances of harmful behavior.
The concerning trends highlighted in X’s transparency report could pave the way for new regulatory policies, especially as several US states move forward with restrictions on social media for minors. Experts at the AI Now Institute advocate for greater platform accountability over AI moderation systems, calling for transparency and ethical standards. Lawmakers may need to consider regulations requiring a more effective combination of AI and human moderators to ensure fairer moderation.
As platforms increasingly shape political and social discourse, particularly in light of key upcoming elections, the stakes are high. Yet the current landscape makes scrutiny harder. In August, Meta shut down CrowdTangle, an analytics tool that had helped researchers monitor social media posts, including tracking misinformation. Additionally, Elon Musk’s decision to end free access to the X API in early 2023 further restricted access to valuable data, raising concerns about researchers’ ability to examine these trends.
In the long term, social media platforms may need to address the ethical challenges posed by AI-based moderation. Can we trust machines to make moral judgments about the content we consume? Or will platforms need a more fundamental overhaul to aim for greater fairness and accountability?