

Oceania Has Always Been at War with Eastasia: Dangers of Generative AI and Knowledge Pollution

In George Orwell’s ominous novel 1984, the world is controlled by three superpowers fighting a never-ending war. When the protagonist’s country abruptly switches sides in the conflict, former allies become enemies overnight, but the government alters the historical records to pretend it has always been on this side of the war. With such freely malleable records and no way to directly verify the facts, people begin to doubt their own memories and the very idea of objective truth.

How do we know what’s true? Some things can be directly verified by our own senses and experience, but most of the time we must rely on outside sources that we trust. There’s potential danger when pranksters alter Wikipedia entries, or fraudsters publish scientific papers with bogus data, but the truth eventually comes out. We trust sources because they’ve been right in the past, because they’re trusted by other sources, because their reasoning appears sound, because they pass the test of Occam’s razor, and because their information appears consistent with other accepted facts.

The scientific-historical record of accumulating human knowledge has grown steadily for ten thousand years. Yes, some information gets lost, some gets proven wrong, some is disputed, and some gets hidden when winners spin the facts to flatter themselves. But despite the system’s flaws, until now it has worked fairly well to maintain our shared understanding of what’s real and what’s true.

 
Growth of Knowledge Pollution

How confident can we be that outside sources are correct? In the past it took considerable time and skill for someone to create a convincing piece of wrong information, accidentally or intentionally. The dissemination of information through printed books was also slow, limiting its rate of spread, and older books served as useful counters against attempts at historical revisionism. These factors limited the potential damage from “knowledge pollution”.

Not anymore. Now the world has abruptly arrived at a place where generative AI can easily produce well-documented falsehoods that appear trustworthy and convincing, and that can flood the online world by the thousands. Whether due to innocent errors or to an intentional misinformation campaign, these polished-sounding bits of knowledge pollution can confuse and mislead anyone who’s searching for information, and also serve as raw material for the next generation of AIs, which will incorporate the falsehoods into their core models. Well-meaning authors and researchers may cite these falsehoods in their own works, and then others may cite those authors, until eventually the whole corpus rests on rotten foundations.

What’s fascinating to me is that as recently as a year ago, virtually nobody was talking about this risk. It wasn’t on anybody’s radar. We worried about AI putting people out of work, or maybe about Skynet. But we didn’t consider an Orwellian 1984 scenario in which huge volumes of AI-generated text might pollute our collected knowledge, leaving us with easy access to a sea of information but shaky confidence in its provenance or correctness. When all textbooks have gone digital, and the digital sphere is full of convincing-sounding contradictory information, how can we know what’s true?

 
Examples

Perhaps I’m overly paranoid, but there’s plenty of evidence this is already beginning to happen. Last November Meta introduced an AI system named Galactica that was designed specifically to assist scientists. Galactica was withdrawn after a mere three days when it was shown to generate confident and polished scientific fabrications, including an article about the history of bears in space.

Most people would recognize “bears in space” as a bogus story, but what about something less outlandish? My wife’s career is in materials science, and she often writes technical papers related to her work. Recently she used ChatGPT to collect background information for a paper she was writing. After some back-and-forth, she had this exchange:

Can you provide a reference for your assertion that “the embodied carbon emissions in the manufacturing of a single semiconductor chip can range from several hundred to over a thousand grams of CO2 equivalent”

A study from 2015 published in the Journal of Cleaner Production estimate that the embodied carbon emissions for a single 300 mm wafer (commonly used in semiconductor manufacturing) is around 523 grams of CO2 equivalent. The study also states that the emissions from the use phase and disposal of the semiconductor devices are much higher than the emissions from the manufacturing process.

Another study from 2020, published in the Journal “Nature Sustainability” estimates the embodied carbon emissions of a single semiconductor chip to be around 800-1,000 grams of CO2-equivalent. They also report that the use phase and disposal of the semiconductor devices are also responsible for a significant amount of carbon emissions, in the order of 10-100 times more than the manufacturing process.

can you provide the DOI for the two references above?

Certainly, the first reference is:

J. Clean. Prod. 87 (2015) 523–532. DOI: 10.1016/j.jclepro.2014.10.072

The second reference is: Nature Sustainability volume 3, pages 941–946(2020) DOI: 10.1038/s41893-020-0578-8

Please note that these references are behind paywall, and you might need a subscription to access them.

She incorporated this information into the paper she was writing, including the references and DOI (digital object identifier) numbers among her citations. When the paper was nearly finished, she went to read the two referenced articles and discovered that they don’t exist. The journals are real, but the specific articles are fabrications. The DOI numbers are correctly formatted and appear authentic, but don’t match any actual documents. The whole thing is an extremely convincing piece of unintentional fakery.
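Fabricated citations are at least one pollutant that’s cheap to screen for, because the DOI system has a public resolver. Here’s a minimal sketch in Python (using the public doi.org handle API and the requests library) of how an author or editor might check that a DOI is actually registered, and not merely well-formed:

```python
# Minimal sketch: ask doi.org whether a DOI is actually registered.
# A fabricated DOI can be perfectly well-formed yet resolve to nothing.
import requests

def doi_is_registered(doi: str) -> bool:
    resp = requests.get(f"https://doi.org/api/handles/{doi}", timeout=10)
    # responseCode 1 means the DOI exists in the handle system;
    # unknown DOIs come back as HTTP 404 with a nonzero responseCode.
    return resp.status_code == 200 and resp.json().get("responseCode") == 1

# The two DOIs ChatGPT supplied in the exchange above:
for doi in ["10.1016/j.jclepro.2014.10.072", "10.1038/s41893-020-0578-8"]:
    print(doi, "->", "registered" if doi_is_registered(doi) else "not found")
```

Of course, registration alone proves little: a DOI can exist yet point at an entirely different paper than the one being cited, so a real screening pipeline would also fetch the metadata and compare titles and authors.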

We have a mutual friend who’s a senior person on the editorial staff overseeing a number of famous academic journals. She reached out to this friend and shared her story. Our friend was aghast, and said that if a paper with such convincing fake citations were submitted to one of their journals, it likely would have been published as-is. Then other future papers might cite that one, and the fabricated information would transform into established fact with a solid documentation record. They’re now in the process of changing the journals’ editorial controls to address this risk, but it’s unclear how far down the rabbit hole one must go checking citations’ citations ad infinitum.

Many readers might correctly say this was her own fault, for misunderstanding what ChatGPT can do, for using it for any kind of research, or for not thoroughly checking all citations immediately. But clearly there’s a high risk of this kind of misinformation accidentally creeping into the literature, where it could become difficult to eradicate and do much harm. In a recent Financial Times article, a professor from the Max Planck Institute discussed being deeply troubled by AI systems’ proclivity for believable falsehoods, like attributing bogus papers to real researchers. Speaking of Galactica, he said it consistently “was wrong or biased but sounded right and authoritative. I think it’s dangerous.” The FT article concluded, “The peril comes from plausible text slipping into real scientific submissions, peppering the literature with fake citations and forever distorting the canon.”

It’s not just scientific literature at risk either. In less rigorous fields where fewer eyeballs are watching, it wouldn’t take much for well-crafted and believable falsehoods to take root, sowing doubt or even supplanting the true facts. In the past couple of months I became interested in Atari 2600 homebrew game development, and encountered a compelling ChatGPT story there. Andrew Davie is a well-known and prolific author of modern 2600 homebrew games, and he asked ChatGPT about himself:

Tell me about Andrew Davie the video game programmer.

Andrew Davie is a video game programmer and designer who has made significant contributions to the field of classic video gaming. He is best known for his work on the Atari 2600, one of the first home video game consoles.

Davie began his career as a video game programmer in the 1970s and 1980, creating games for the Atari 2600 and other early home video game consoles. He is credited with developing several classic video games for the Atari 2600, including “Kaboom!”, “Basketbrawl”, and “Miner 2049er”.

Davie’s work on the Atari 2600 platform helped to establish the platform as one of the most popular and influential home video game consoles of its time. His games were known for their innovative gameplay mechanics, simple but challenging designs, and colorful graphics.

In addition to his work on the Atari 2600, Davie has also contributed to the development of other classic video games, including games for the Commodore 64 and Apple II. He is widely regarded as one of the pioneers of the classic video gaming industry, and his contributions have helped to shape the direction of the industry as we know it today.

Andrew Davie continues to be active in the classic video gaming community, and his work continues to be appreciated and respected by gamers and video game enthusiasts around the world.

This is a great biography: well-written, and full of good contextual information. It’s also almost entirely wrong. But few people outside this small Atari hobby would recognize the information as wrong, and even many people in the hobby wouldn’t realize it was a fabrication. Just by including the text here on my blog, which enjoys a decent reputation and page rank, I’m giving this fabrication legitimacy and increasing the chances it will be used as “truth” during the training of some future AI or digitally generated encyclopedia. It gives me a sinking feeling to imagine this kind of distortion multiplied a thousand-fold and extrapolated into the future.

 
Prevention

Is there anything we can do to prevent this kind of knowledge pollution? I’m not sure. It’s too late to put this particular genie back in the bottle, so we’ll need to find methods of coping with it.

There’s been plenty of discussion about automated techniques for identifying AI-generated text. OpenAI is reportedly working on a watermark of sorts, in which a particular pattern of sentence structure and punctuation would identify text from its AI model. But this seems like a weak tool, one that could be defeated by a few human edits to the AI-generated text, or by simply using an AI from a different vendor. Other researchers are developing AIs that try to identify text generated by other AIs.
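As an illustration of how a statistical watermark can work in principle, here’s a toy sketch in Python. To be clear, this isn’t OpenAI’s scheme (which hasn’t been published); it’s the general approach described in the academic literature, where the text generator secretly favors a keyed “green list” of words, and a detector checks whether a suspect text contains implausibly many of them:

```python
# Toy watermark detector -- an illustration only, not any vendor's real scheme.
# Assumption: the generator, when sampling each word, was biased toward words
# whose keyed hash lands in a secret "green list" covering ~half the vocabulary.
import hashlib

KEY = "secret-watermark-key"  # hypothetical shared secret

def is_green(word: str) -> bool:
    # Deterministically assign about half of all words to the green list.
    digest = hashlib.sha256((KEY + word.lower()).encode()).digest()
    return digest[0] % 2 == 0

def z_score(text: str) -> float:
    # In unwatermarked text, the green-word count is ~Binomial(n, 0.5).
    # A watermarking generator inflates that count, pushing z upward.
    words = text.split()
    n = len(words)
    greens = sum(is_green(w) for w in words)
    return (greens - 0.5 * n) / (0.25 * n) ** 0.5

# Scores far above ~4 would be strong evidence of the watermark;
# ordinary human prose should land near zero.
print(z_score("We have always been at war with Eastasia."))
```

The sketch also makes the weakness concrete: paraphrase the words, or generate with a model that uses a different key or none at all, and the statistical signal disappears.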

I’m unsure what technical measures could realistically prevent future knowledge pollution of the type described here, but there may be more hope for protecting existing knowledge against future revisionism, such as attempts to sow doubt that the moon landings ever occurred. I would imagine that digital signatures or blockchain techniques could be used to safeguard existing collections of knowledge. For example, we might compute a hash of the entire Encyclopedia Britannica and publish it widely, making that particular encyclopedia resistant to any future pollution along the lines of “we’ve always been at war with Eastasia”.
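Here’s a minimal sketch of that fingerprinting idea, assuming a local file holding a snapshot of whatever reference text we want to protect (the filename is hypothetical):

```python
# Compute a SHA-256 fingerprint of a reference text, reading in chunks so
# even a very large snapshot fits in memory. Publish the hex digest widely;
# anyone holding a copy can recompute it later, and a single altered
# character anywhere in the snapshot yields a completely different digest.
import hashlib

def fingerprint(path: str) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):  # 1 MB at a time
            h.update(chunk)
    return h.hexdigest()

print(fingerprint("britannica_snapshot.txt"))  # hypothetical filename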

If technical measures fail, maybe social ones can succeed. Advice like “don’t believe everything you read” seems relevant here. People must be trained to think critically and to develop a healthy sense of skepticism. But I fear this approach might lead to just as much confusion as blindly accepting everything. After all, even if we don’t believe everything we read, we still need to believe most of what we read, since it’s impractical or impossible to verify everything ourselves. If we treat every single piece of information in our lives as suspect and potentially bogus, we may fall into a world where once-authoritative sources lose all credibility and nobody can agree on anything. In recent years the world has already traveled some distance down this path, as simple records and data have become politicized. A broad AI-driven disbelief of all science and history would accelerate this damaging trend.

It’s fashionable to conclude essays like this with “Surprise! This entire article was actually written by ChatGPT!” But not this time. Readers will need to suffer through these paragraphs as they emerged from my squishy human brain. I’m curious to know what you think of all this, and where things are likely to head next. Please leave your feedback in the comments section.
