How do we navigate an internet that starts with a prompt and ends with AI-generated synthetic media? What lies between those two points? The internet we’re accustomed to allows us to search for the information we’re interested in. Until now, that content has mostly been created by humans. Increasingly, though, we are encountering the products of generative artificial intelligence, or GenAI, across multiple industries and within our personal lives.

GenAI is a type of AI technology that can create text, multimedia and synthetic data. Optimists and proponents of generative AI are bullish about the prospect of every user being just one “prompt” away from the information they’re looking for. A prompt is the natural-language instruction a user gives a model to generate the content they want.
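To make that idea concrete, here is a minimal sketch of what prompting a text-generation model can look like in code. The library (Hugging Face transformers), the model and the parameters are illustrative choices for this sketch, not the specific commercial systems discussed in this article.

```python
# A minimal sketch of prompting a text-generation model.
# The library, model (gpt2) and parameters are illustrative only,
# not the commercial systems named in the article.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = "Explain, in plain language, how vaccines train the immune system."
result = generator(prompt, max_new_tokens=60, num_return_sequences=1)

# The model continues the prompt with machine-generated text: fluent,
# but not guaranteed to be accurate or verified.
print(result[0]["generated_text"])
```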

With this ascendant “internet of the prompt,” companies such as Meta and Stability AI are taking center stage alongside OpenAI and its popular products. This moment has even created new career opportunities. In an article about the rising profile of prompt engineers, The Wall Street Journal described the role as that of somebody who “talks to chatbots.”

But can prompts really get users everything they need? Like other innovative technologies, GenAI will bring about great opportunities, but it will also carry great risk. Even when AI models are trained on expert or specialized data, issues of transparency, accuracy and potential bias or harm remain embedded in that data, and those flaws can be reproduced in newly generated content. And sometimes what the user actually needs is the credible opinion of a real doctor.

GenAI and synthetic media pose new risks to communities. These risks arise from imperfections, ethical questions and the sheer speed with which unlimited amounts of unverified machine-generated synthetic media can be produced. From AI-generated health articles to chatbot conversations that can slide off the rails at any moment, the potential consequences deserve deep contemplation and well-considered safeguards. Imagine the negative impact of specious GenAI content flooding an underserved language or community with inaccurate health claims or outdated medical tips. 

How do we make sure we’re providing these communities with trusted information when and where they need it, so GenAI doesn’t fill the information gap with unverified content?

2023: Year zero

Only one year ago, we witnessed an explosion of interest in GenAI tools such as DALL-E and ChatGPT. Although these free services were still in their early phases, we were mesmerized by the images generated from our imaginative prompts.

That same year, following a White House request, OpenAI and six other companies voluntarily agreed to guardrails for their technology. In a statement from the White House about the AI companies’ commitments, officials said these promises “underscore three principles that must be fundamental to the future of AI — safety, security, and trust.” Toward the end of the year, at a meeting centered on the intersections of misinformation, disinformation, hate speech and AI, United Nations Under-Secretary-General for Global Communications Melissa Fleming urged caution, insisting that AI developers “put people before profit.”

Amid this atmosphere of rising caution, a 2023 McKinsey report on the state of AI found that only 32% of respondents to a global survey said the organizations they belong to were mitigating inaccuracy, despite “inaccuracy” being the most commonly cited GenAI risk. 

As GenAI technologies advance exponentially, will safety measures put forward by lawmakers and companies today have any real value one year from now? This is what could be described as GenAI’s safety dilemma: Technology moves faster than regulations, and way faster than bureaucracies. 

Legislators may be taking heed of this dilemma. In December, European Union lawmakers struck a deal on AI regulation after some 36 hours of negotiations. The landmark agreement includes transparency requirements for all general-purpose AI models and “tougher requirements for the more powerful models,” according to France 24. But will this draft law make sense one year from now?

When machines ‘hallucinate,’ they cause harm

As interest in GenAI tools rose last year, AI hallucinations became a popular topic. This term refers to “incorrect or misleading results that AI models generate.” Hallucinations can be caused by a variety of factors, including insufficient training data, incorrect assumptions made by the model and biases in the data used to train the model. 

These errors are not just amusing quirks, though. They can have significant consequences for real-world applications. As IBM notes in an article about AI hallucinations, “a health care AI model might incorrectly identify a benign skin lesion as malignant, leading to unnecessary medical interventions.” Other errors have even resulted in generative AI tools falsely accusing people of legal wrongdoing. Overall, GenAI can create misinformation and bolster its reach — through hallucinations or intentional user manipulation.

It’s also worth noting that biases in training data — which can be a significant source of hallucinations — reflect and reproduce existing societal prejudices, potentially inflicting additional harm on already disenfranchised communities. Some tools have been found to exacerbate existing information inequities, and others perpetuate stereotypes or generate harmful content, such as misinformation, disinformation, hate speech and abuse.

From the perspective of user and community safety, how much harm will these hallucinations cause before GenAI tools get better at doing their jobs?

Wide adoption, uncontrolled risks 

This AI-driven internet is even more opaque than previous iterations. It leaves less of a trace as it generates and disseminates information. In fact, a lot of new information propagates in closed messaging apps, making falsehoods difficult to detect and even more challenging to debunk. 

Reporters, fact-checking organizations and other groups may struggle to determine if the content they’re seeing is real or generated by AI — even with technological assistance. After all, generative AI companies do not disclose much about the inner workings of their systems or their plans for the future. 

Oversight bodies and government agencies are struggling to understand what emerging risks they face and to keep up with a rapidly changing AI landscape. In the meantime, the internet continues to host more and more GenAI content. In the process, this content is served up to communities where it can then spread unchecked.

Researchers, advocacy groups and regulators continue to call on platforms to detect and label AI-generated content. Likewise, these groups insist that GenAI developers should build safeguards against malicious deployments of their services. Such efforts could curb large-scale disinformation campaigns and the creation of deepfakes, for instance. 
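To make “labeling” concrete, the sketch below shows one hypothetical shape such a disclosure record could take if a platform attached it to a piece of content. The field names are invented for illustration and do not follow any particular standard or platform’s implementation.

```python
# A hypothetical, machine-readable label a platform might attach to
# content it has detected, or been told, is AI-generated.
# Field names are illustrative and follow no particular standard.
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
import json

@dataclass
class SyntheticMediaLabel:
    content_id: str        # platform's identifier for the post or file
    ai_generated: bool     # outcome of detection or creator disclosure
    detection_method: str  # e.g. "classifier", "provenance-metadata", "creator-disclosure"
    generator_hint: str    # tool or model name if known, else "unknown"
    labeled_at: str        # ISO 8601 timestamp

label = SyntheticMediaLabel(
    content_id="post-12345",
    ai_generated=True,
    detection_method="creator-disclosure",
    generator_hint="unknown",
    labeled_at=datetime.now(timezone.utc).isoformat(),
)

# Serialized so the label can travel with the content through the platform.
print(json.dumps(asdict(label), indent=2))
```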

Most importantly, more effort should be put into raising public awareness about the current state of GenAI, how it’s being used and how it may affect the lives and choices of individuals and groups. After all, the end user has the right to choose what world they want to live in. 

Gendered disinformation, amplified by AI

How does GenAI impact women — online and offline — in the Larger World*? Gendered disinformation, broadly speaking, involves spreading false narratives laden with gender-based attacks or exploiting issues related to gender in the pursuit of various agendas. The objective of gendered disinformation is to silence women and minority groups and to hinder their participation in the media ecosystem and democratic processes. 

Studies have shown that women political leaders and journalists are on the receiving end of a rising barrage of disinformation. According to an Organisation for Economic Co-operation and Development article about AI and the gender gap, research shared by the Carnegie Endowment for International Peace indicates that women from minority groups, in particular, are more likely to be targeted.

During times of election or crisis, coordinated harassment efforts — which may consist of doxing, legal actions, doctored and sexualized content — can prove especially perilous for women’s safety. In some cases, what starts out online can quickly transform into offline threats and violent attacks. 

For example, the International Center for Journalists and the University of Sheffield analyzed over 13 million tweets targeting Indian journalist Rana Ayyub and Al Jazeera Arabic anchor Ghada Oueiss. The researchers’ efforts revealed extensive, misogynistic digital disinformation campaigns aimed at discrediting these women. The attacks, often perpetrated by “patriotic trolls and foreign states,” aimed to undermine the journalists' credibility and expose them to physical violence. 

In Pakistan, journalists such as Meher Bokhari recently faced online attacks that involved the nonconsensual use of images and doctored visuals. This is a disturbing example of ideologically motivated opponents using AI tools to spread sexist and misogynistic content, including threats of physical violence. This pattern of harassment is common in the Larger World, especially during election periods. With the proliferation of GenAI, it’s more important than ever that we protect the ability of women to engage online. 

Next steps for Meedan and our partners

Recognizing the importance of this issue, Meedan is co-designing data-driven solutions with several partners to address the ongoing concerns raised by the proliferation of GenAI.

For instance, in 2023 Meedan launched a project titled “Understanding Trends in Gendered Disinformation During Elections.” This initiative aims to tackle the harassment women face during elections by analyzing patterns of gendered disinformation across regions, communities and languages. 

Working with the Africa Women Journalism Project and TogoCheck in Togo, and with organizations such as Ecofeminita in Argentina and the Digital Rights Foundation in Pakistan, Meedan collects data through messaging tiplines on its Check platform to document and counter gendered disinformation. As a result, innovative tools such as the Naka Facebook bot and ECOFEMIBOT act as AI-powered messaging tiplines that allow communities to report disinformation. In Latin America, we partnered with the feminist organization Coding Rights, which launched the Not My A.I. project to map public-sector AI initiatives that negatively impact gender equality and intersectionality. The project emphasizes the distinct forms of oppression faced in Latin America compared with the Global North and encourages collaborative contributions.

In addition to our work on gendered disinformation, Meedan has for years advocated for safer practices related to the use of AI, large language model technologies and synthetic media. These efforts include supporting partners that use emerging AI technologies in a human rights context.

In 2023, Meedan joined a diverse cohort of partners committed to the Partnership on AI’s Responsible Practices for Synthetic Media, a first-of-its-kind framework for the ethical and responsible development, creation and sharing of synthetic media.

Our latest project, supported by the Patrick J. McGovern Foundation, will allow us to address and combat the spread of dangerous misinformation and enable the verification of synthetic media and GenAI content.

* Our team has gone through many discussions about the different ways we can talk about the people we work with and the regions where our programs are running. What we call things matters, so we have started using “Larger World” — a term coined by our friends at Numun Fund. We use it here, on our website and in our communications material, with their consent.

Tags
Artificial Intelligence
Misinformation
Generative AI
Published on February 27, 2024