MFA websites provide no material public benefits but, without proper safeguards, could create significant negative externalities in an AI era. LLMs are designed to generate content at scale, a perfect fit for content farms whose sole purpose is search engine optimization (SEO) through nonsensical keywords, summarized or verbatim text lifted from news sources, and highly repetitive spam. These articles often list fake authors or anonymous bylines and appear to lack human oversight. The rising prevalence of AI-generated spam could erode public trust in and understanding of critical current events, especially if it distorts the market for real news and obscures legitimate newsrooms as centralized sources of information. It will become far harder for human journalists to disseminate trustworthy information when the internet ecosystem is flooded with bots.
Content farms divert more than user attention away from legitimate news websites; they also siphon off valuable digital advertising dollars. The AI-generated websites that NewsGuard detected were stuffed with programmatic advertisements, including from major brands like Subaru and Citigroup, almost all of which were automatically routed through Google’s Ad Exchange. Google Ads maintains policies against monetizing “spammy automatically-generated content” but does not publicly reveal the results of its placement algorithm or content review outcomes. In June 2023, an Adalytics study showed that Google frequently served video ads on lower-quality clickbait or junk websites without the awareness of its buy-side advertising clients. The same month, the Association of National Advertisers estimated that about $13 billion in digital advertising revenue is algorithmically funneled into clickbait MFA websites each year, roughly 15 percent of the $88 billion that marketers spend annually on automated ad exchanges. If not for the proliferation of AI-generated MFA content, those funds could otherwise provide a much-needed lifeline for legitimate news outlets.
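As a rough sanity check on the Association of National Advertisers figures cited above, the short sketch below recomputes the implied share; the $13 billion and $88 billion values come from that estimate, and the rounding is illustrative.

```python
# Rough check of the ANA estimate cited above: roughly $13 billion of an
# $88 billion annual programmatic ad spend is routed to MFA websites.
mfa_spend_billion = 13            # estimated spend funneled to MFA sites
total_programmatic_billion = 88   # total annual spend on automated ad exchanges

mfa_share = mfa_spend_billion / total_programmatic_billion
print(f"MFA share of programmatic spend: {mfa_share:.1%}")  # ~14.8%, i.e., about 15 percent
```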
Analysis of Policy Approaches
A massive legislative push to compel large technology platforms that host news content to pay publishers is playing out all over the world. In June 2023, the Canadian Parliament enacted the Online News Act, which requires designated search engines and social media platforms to pay news publishers for any external article links or quotes their users view or share. Australia passed the News Media Bargaining Code (NMBC) in 2021, the European Union adopted its Copyright Directive in 2019, and legislators in Brazil, India, the United Kingdom, the United States, and the state of California have either proposed or are actively considering similar measures.
Canada’s parliamentary budget officer predicts that news organizations could share an additional $329 million in annual revenue once the Online News Act takes effect. However, this figure is a small fraction of the estimated $4.9 billion that Canadian news outlets lost from 2010 to 2022, and it will never be realized if Google and Meta choose to boycott the law altogether. Just hours after the passage of the Online News Act, Meta announced plans to permanently shut down news access for Canadian users. Shortly after, Google stated it too would block all Canadian news links on its search engine. Their responses should not come as a surprise: directly before Australia passed the NMBC in 2021, Meta abruptly cut off users from viewing news pages, and Google announced it might have “no real choice” but to withdraw search services from the country. Faced with those ultimatums, Australian lawmakers soon amended the NMBC’s final text in a manner that effectively exempted Meta and Google from any binding obligations. And after France began enforcing the Copyright Directive in 2019, Google stopped displaying article previews for French news publishers, which drastically decreased click-throughs. Their actions underscore the problem with forced negotiation: it is very difficult to enforce payment schemes when digital gatekeepers can simply choke off access to the news content internet users see.
These legislative measures, sometimes referred to as “link taxes,” create the wrong incentives. In the past, they have discouraged Google and Meta from displaying news content on their platforms, which decreases critical streams of traffic to external news websites. In the future, such policies may even motivate search engines to accelerate the adoption of generative AI to answer user queries instead of displaying external links. Forced payment measures also risk further entrenching newspapers’ dependency on large technology companies, as they do not address the structural reasons for Google and Meta’s market dominance. For these reasons, U.S. technology companies need bright-line rules that meaningfully prevent harmful ad-tech, data collection, and AI practices. Such rules, in turn, can foster a healthier and more sustainable online environment in which newsrooms can evolve over the long term.
(1) Dominant technology platforms need clear ex ante rules to prevent anticompetitive practices that reinforce their gatekeeper power over news publishers.
Two-party negotiations cannot work if the playing field is not level. Because Google and Meta have taken steps to lock in gatekeeper power over digital advertising and content distribution in recent years, they essentially own the league in which newspapers play. For example, Google’s 2008 acquisition of DoubleClick enabled it to effectively monopolize all three stages of the ad-tech process: the buy-side advertiser network, sell-side publisher tools, and the ad exchange through which most news websites auction online advertising spots. That market dominance, in turn, enables the search giant to demand up to 35 percent of proceeds that would otherwise flow to publishers. It also gives Google ample means to compel news websites to adopt Accelerated Mobile Pages formatting and to control their ability to engage in header bidding, among other actions. Meta, for its part, increased its gatekeeper power by acquiring nascent competitors like Instagram (2012) and WhatsApp (2014), which allowed it to combine user data across multiple subsidiaries and curate personalized advertisements far more granularly than traditional newspapers can.
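To make those stacked fees concrete, the sketch below walks one hypothetical advertiser dollar through the three ad-tech stages named above. The per-stage take rates are assumptions chosen for illustration; only the overall figure of up to 35 percent comes from the text.

```python
# Illustrative only: hypothetical take rates at each ad-tech stage. Only the
# combined ~35 percent figure is cited in the text; the per-stage splits are
# assumptions made for this example.
ad_spend = 1.00  # one advertiser dollar

assumed_fees = [
    ("buy-side advertiser network", 0.15),
    ("ad exchange", 0.15),
    ("sell-side publisher tools", 0.10),
]

remaining = ad_spend
for stage, rate in assumed_fees:
    cut = remaining * rate
    remaining -= cut
    print(f"{stage}: -${cut:.3f}")

share = remaining / ad_spend
print(f"Publisher receives ${remaining:.2f} of ${ad_spend:.2f} ({share:.0%})")
# In this sketch the publisher keeps about 65 cents of every dollar,
# a combined intermediary take of roughly 35 percent.
```

Because each intermediary’s fee compounds on what remains, even modest-sounding per-stage rates can leave publishers with only about two-thirds of every advertiser dollar under these assumptions.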
These behaviors have raised alarm bells in numerous jurisdictions. In June 2023, the European Commission issued a formal statement of objections over Google’s ad-tech practices, arguing that the company’s control over all stages of the digital advertising process allows it to illegally disadvantage website publishers. In January 2023, the U.S. Department of Justice similarly sued Google over alleged anticompetitive actions that distort free competition in the ad-tech space, seeking to split up its Ad Manager suite. In November 2021, the Federal Trade Commission (FTC) challenged Meta’s acquisitions of Instagram and WhatsApp, seeking a possible divestiture of both platforms. Also in 2021, an Australian Competition and Consumer Commission (ACCC) investigation identified “systemic competition concerns,” including Google blocking ad-tech competitors from placing ads on YouTube and other subsidiaries. Further, ACCC chair Rod Sims noted at the time, “Investigation and enforcement proceedings under general competition laws are not well suited to deal with these sorts of broad concerns, and can take too long if anti-competitive harm is to be prevented.” The ACCC report summarizes a widespread issue: enforcement actions occur after the fact and are not guaranteed to undo the years of consolidation that have helped Google and Meta lock in market power and divert advertising revenue from news organizations.
Traditional antitrust law requires a modernized approach in the digital age: one that implements forward-looking guardrails to prevent dominant technology companies from harming nascent rivals, news publishers, and society at large. The European Union recently put new ex ante rules into place with its Digital Markets Act, which aims to prohibit gatekeeper technology platforms from abusing their control over multiple sides of a market. Members of the U.S. Congress have floated several bills containing similar proposals to limit practices like self-preferencing and anticompetitive acquisitions, but their momentum stalled following debates over their possible effects on malware prevention, content moderation, and other issues. In March 2023, Canada’s Competition Bureau put forward over 50 recommendations to modernize its antitrust legal framework, which has not undergone significant updates since the 1980s. Comprehensive antitrust reform is never quick or straightforward to implement, but it is essential to preventing anticompetitive acquisitions, growing news websites’ ad-tech options and revenue, and fostering a more diverse and sustainable news ecosystem overall.
(2) Both technology platforms and newsrooms need formal guardrails to promote ethics, fairness, and transparency in any development and deployment of AI.
Approximately 100 million users registered for ChatGPT within two months of its release, and numerous companies, including search engines and newsrooms, are already deploying LLMs before direct legal safeguards are in place. The United States has existing federal and state privacy, copyright, consumer protection, and civil rights laws that apply to some aspects of the digital space, but there are broad legal uncertainties about how to interpret them in the context of generative AI (see sections 3 and 4).
In July 2023, the White House announced voluntary commitments from OpenAI, Google, Meta, and four other AI developers to invest in algorithms to “address society’s greatest challenges” and create “robust technical mechanisms to ensure that users know when content is AI generated.” This announcement follows previous nonbinding strategies like the White House’s Blueprint for an AI Bill of Rights (2022) and the National Institute of Standards and Technology’s AI Risk Management Framework (2023), which both call upon companies to prioritize transparency, accountability, fairness, and privacy in AI development. Broad voluntary principles like these are a first step in the absence of a mandatory legal framework that directly regulates generative AI, but LLM developers will need to take significant strides to meet them. For example, OpenAI released a tool in January 2023 to help identify AI-generated text but withdrew it six months later due to high error rates. Furthermore, the generative AI industry largely continues to obscure how companies collect data, assess and mitigate risk, and promote internal accountability.
As policymakers debate mandatory safeguards to mitigate the risks of AI, it is important to consider how any forthcoming laws could better support journalism and trustworthy information sharing online. In 2022, Congress introduced the draft American Data Privacy and Protection Act (ADPPA), which contains provisions for large companies to publicly explain how high-risk AI systems make decisions, incorporate training data, and generate output. In April 2023, the National Telecommunications and Information Administration at the Department of Commerce issued a request for comment on AI accountability measures like audits and certifications. Transparency measures such as these could help news readers evaluate the credibility and fairness of the AI-generated text they view. They could also help marketers contest automated ad placements on MFA websites rather than on traditional news publishers. Both internet users and news publishers could benefit from increased public visibility into all AI development, regardless of any given algorithm’s perceived level of risk, which could include high-level statistics on methodology, specific sources of training data, generalized outcomes, and error rates.
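As a purely hypothetical illustration of the kind of disclosure described above, the sketch below models one possible structure for a public AI transparency record; the field names and example values are assumptions for illustration, not requirements drawn from the ADPPA, the NTIA inquiry, or any other framework.

```python
# Hypothetical sketch of a public AI transparency record. Field names and
# values are illustrative assumptions, not drawn from any existing law or standard.
from dataclasses import dataclass

@dataclass
class AITransparencyRecord:
    system_name: str
    methodology_summary: str              # high-level description of how the model was built
    training_data_sources: list[str]      # broad categories or named corpora, not raw data
    generalized_outcomes: list[str]       # the kinds of output the system is intended to produce
    known_error_rates: dict[str, float]   # measured rates of factual or classification errors

example_record = AITransparencyRecord(
    system_name="example-news-summarizer",  # hypothetical system
    methodology_summary="Transformer-based LLM fine-tuned on licensed news archives",
    training_data_sources=["licensed news archives", "public-domain text"],
    generalized_outcomes=["article summaries", "headline suggestions"],
    known_error_rates={"factual error per summary": 0.05},  # illustrative figure
)
print(example_record)
```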
In June 2023, the European Parliament adopted its negotiating position on the draft AI Act, which could require developers to proactively mitigate automated output that perpetuates existing societal inequities. Under the act, “general purpose” algorithms (which would likely include LLMs like ChatGPT) would be required to identify “reasonably foreseeable risks” in their design and test training datasets for bias. Furthermore, “high-risk” systems (which would include social media ranking algorithms with over 45 million users) would be subject to more intensive standards like human oversight, assessments of an algorithm’s potential impact in specific contexts, and documentation of training datasets. Going further, evaluations of high-risk AI use by large search engines and social media companies should also account for potential impacts on journalism and information sharing, including the spread of harmful content or the burying of legitimate news online.