The Efforts to Make Text-Based AI Less Racist and Terrible

Text generated by large language models is coming ever closer to language that looks or sounds like it came from a human, yet it still fails at tasks requiring the kind of reasoning that almost all people can manage. In other words, as some researchers put it, this AI is a fantastic bullshitter, capable of convincing both AI researchers and other people that the machine understands the words it generates.

UC Berkeley psychology professor Alison Gopnik studies how toddlers and young people learn, and applies that understanding to computing. Children, she said, are the best learners, and the way kids learn language stems largely from their knowledge of and interaction with the world around them. Large language models, by contrast, have no connection to the world, which makes their output less grounded in reality.

“The definition of bullshitting is you talk a lot and it kind of sounds plausible, but there’s no common sense behind it,” Gopnik says.

Yejin Choi, an associate professor at the University of Washington and leader of a group studying common sense at the Allen Institute for AI, has put GPT-3 through dozens of tests and experiments to document how it can make mistakes. Sometimes it repeats itself. Other times it devolves into generating toxic language even when beginning with inoffensive or harmless text.

To teach AI more about the world, Choi and a team of researchers created PIGLeT, an AI trained in a simulated environment to understand things about physical experience that people learn growing up, such as the fact that it’s a bad idea to touch a hot stove. That training led a relatively small language model to outperform others on common sense reasoning tasks. Those results, she said, demonstrate that scale is not the only winning recipe and that researchers should consider other ways to train models. Her goal: “Can we actually build a machine learning algorithm that can learn abstract knowledge about how the world works?”
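
PIGLeT’s actual architecture and training are described in Choi’s paper; purely to make the framing concrete, the toy Python sketch below shows the underlying idea of checking a model’s guess about how the world changes after an action against a physical simulator. The object attributes, the hand-written simulate function, and the stub model are all invented for illustration and are not PIGLeT’s code.

    from dataclasses import dataclass, replace

    @dataclass(frozen=True)
    class ObjectState:
        name: str
        temperature: str = "cold"   # "cold" or "hot"
        is_broken: bool = False

    def simulate(action: str, obj: ObjectState) -> ObjectState:
        # Hand-written physical dynamics standing in for a real simulator.
        if action == "heat":
            return replace(obj, temperature="hot")
        if action == "drop":
            return replace(obj, is_broken=True)
        return obj

    def model_prediction(action: str, obj: ObjectState) -> ObjectState:
        # A learned model would go here; this stub naively predicts "nothing changes."
        return obj

    before = ObjectState(name="pan")
    print("simulator says:", simulate("heat", before))
    print("model predicts:", model_prediction("heat", before))
    print("agreement:", simulate("heat", before) == model_prediction("heat", before))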

Choi is also working on ways to reduce the toxicity of language models. Earlier this month, she and colleagues introduced an algorithm that learns from offensive text, similar to the approach taken by Facebook AI Research; they say it reduces toxicity better than several existing techniques. Large language models can be toxic because of humans, she says. “That’s the language that’s out there.”
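
Choi’s paper lays out the actual algorithm; the hedged Python sketch below only illustrates one decoding-time idea in that spirit, in which a second model exposed to offensive text serves as an “anti-expert” whose preferred next tokens get pushed down. The model names, the mixing weight alpha, and the overall setup are placeholders rather than the published method.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    BASE_MODEL = "gpt2"   # stand-in base language model
    ANTI_MODEL = "gpt2"   # placeholder; imagine a copy fine-tuned on offensive text

    tok = AutoTokenizer.from_pretrained(BASE_MODEL)
    base = AutoModelForCausalLM.from_pretrained(BASE_MODEL)
    anti = AutoModelForCausalLM.from_pretrained(ANTI_MODEL)

    def steered_next_token(prompt: str, alpha: float = 0.5) -> str:
        ids = tok(prompt, return_tensors="pt").input_ids
        with torch.no_grad():
            base_logits = base(ids).logits[0, -1]   # next-token scores from the base model
            anti_logits = anti(ids).logits[0, -1]   # next-token scores from the "anti-expert"
        # Push down tokens the anti-expert finds likely; alpha sets the strength.
        steered = base_logits - alpha * anti_logits
        return tok.decode(int(steered.argmax()))

    print(steered_next_token("The new neighbors are"))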

Perversely, some researchers have found that attempts to fine-tune and remove bias from models can end up hurting marginalized people. In a paper published in April, researchers from UC Berkeley and the University of Washington found that Black people, Muslims, and people who identify as LGBT are particularly disadvantaged.

The authors say the problem stems, in part, from the humans who label data misjudging whether language is toxic. That leads to bias against people who use language differently from white people, which the paper’s coauthors say can cause self-stigmatization and psychological harm, as well as force people to code-switch. OpenAI researchers did not address this issue in their recent paper.

Jesse Dodge, a research scientist at the Allen Institute for AI, reached a similar conclusion. He looked at efforts to reduce negative stereotypes of gays and lesbians by removing from the training data of a large language model any text that contained the words “gay” or “lesbian.” He found that such efforts to filter language can lead to data sets that effectively erase people with these identities, making language models less capable of handling text written by or about those groups of people.
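
The mechanism is easy to see in miniature. In the toy sketch below, where the documents and the two-word blocklist are invented for illustration, a blanket keyword filter discards benign text written by or about the very people it is meant to protect, so a model trained on what remains never sees them at all.

    import re

    blocklist = {"gay", "lesbian"}

    documents = [
        "My wife and I got married in June; we are a lesbian couple in Ohio.",
        "The gay rights march drew thousands of peaceful participants.",
        "Here is a reliable recipe for banana bread.",
    ]

    def mentions_blocked_word(text: str) -> bool:
        words = set(re.findall(r"[a-z]+", text.lower()))
        return bool(words & blocklist)

    kept = [doc for doc in documents if not mentions_blocked_word(doc)]
    print(kept)   # only the banana bread document survives the filter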

Dodge says the best way to deal with bias and inequality is to improve the data used to train language models instead of trying to remove bias after the fact. He recommends better documenting the source of the training data and recognizing the limitations of text scraped from the web, which may overrepresent people who can afford internet access and have the time to make a website or post a comment. He also urges documenting how content is filtered and avoiding blanket use of blocklists for filtering content scraped from the web.
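
What such documentation could look like is itself a design choice; one hedged sketch, with field names invented for illustration rather than taken from any published datasheet standard, is a per-document provenance record along these lines.

    from dataclasses import dataclass, field

    @dataclass
    class DocumentRecord:
        url: str                                              # where the text was scraped from
        crawl_date: str                                       # when it was collected
        license: str = "unknown"                              # usage terms, if known
        filters_applied: list = field(default_factory=list)   # e.g. ["language-id:en", "dedup"]
        removed: bool = False                                 # whether a filter dropped the document
        removal_reason: str = ""

    record = DocumentRecord(
        url="https://example.org/forum/post/123",
        crawl_date="2021-04-01",
        filters_applied=["language-id:en", "dedup"],
    )
    print(record)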

Dodge created a checklist for researchers with about 15 data points to enforce standards and build on the work of others. Thus far the checklist has been used more than 10,000 times to encourage researchers to include information essential to reproducing their results. Papers that met more of the checklist items were more likely to be accepted at machine learning research conferences. Dodge says most large language models are missing some items on the checklist, such as a link to source code or details about the data used to train the model; one in three published papers does not share a link to code to verify results.
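
The article does not reproduce the checklist itself, so the item labels in the sketch below are invented to show the bookkeeping rather than Dodge’s wording; the point is simply that scoring a paper against explicit items makes gaps, such as a missing link to code, hard to overlook.

    CHECKLIST = [
        "link to source code",
        "description of training data",
        "hyperparameters reported",
        "compute budget reported",
        "evaluation code released",
    ]

    paper = {
        "description of training data": True,
        "hyperparameters reported": True,
        "compute budget reported": True,
        # no source code link, no evaluation code
    }

    for item in CHECKLIST:
        mark = "x" if paper.get(item, False) else " "
        print(f"[{mark}] {item}")
    print(f"{sum(paper.get(i, False) for i in CHECKLIST)}/{len(CHECKLIST)} items satisfied")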
