When academics and other researchers need to recruit people for large-scale surveys, they often rely on crowdsourcing sites like Prolific or Amazon Mechanical Turk. Participants sign up to provide demographic information and opinions in exchange for money or gift cards. Prolific currently has around 200,000 active users, who, it promises, have been verified “to prove they are who they say they are.”
However, even if the users are real people, there are signs that many of them are using AI to answer survey questions.
Janet Xu, assistant professor of organizational behavior at Stanford Graduate School of Business, says she first heard about it from a colleague who noticed that some answers to open-ended survey questions seemed… non-human. Responses contained fewer typos. They were longer. (Even the most opinionated people will write four or five sentences at most.) And they were strangely kind. “When you do a survey and people respond to you, there’s usually a certain amount of ridicule,” Xu says.
In a new paper, Xu, Simone Zhang from New York University, and AJ Alvero from Cornell University examine how, when, and why academic research participants turn to AI. Nearly a third of the Prolific users who participated in the study reported using large language models (LLMs) like ChatGPT in some of their survey work.
In search of the right words
The authors surveyed around 800 Prolific participants to find out how they engage with LLMs. All had completed at least one survey on Prolific before; 40% had completed seven or more surveys in the past 24 hours. Participants were assured that admitting to LLM use would not affect their eligibility for future studies.
About two-thirds said they had never used LLMs to help them answer open-ended survey questions. About a quarter said they sometimes use AI assistants or chatbots to help them write, and fewer than 10% said they use LLMs very frequently, suggesting that AI tools have not (so far) been widely adopted. The most common reason given for using AI was to get help expressing thoughts.
Respondents who reported never using LLMs in surveys tended to express concerns about authenticity and validity. “A lot of their responses had this moral inflection, where it seems like [using AI] would be a disservice to research, that it would be cheating,” Xu says.
Certain groups of participants, such as those who were newer to Prolific or identified as male, Black, Republican, or college-educated, were more likely to report using the AI writing aid. Xu emphasizes that this is just a snapshot; these patterns may change as technology diffuses or as users opt out of the platform. But she says they’re worth noting because differences in AI use could lead to biases in public opinion data.
To see how human-written responses differ from AI-generated ones, the authors examined data from three studies conducted before ChatGPT’s public release in November 2022, which serve as a pre-AI baseline. Human responses in these studies tended to contain more concrete and emotionally charged language. The authors also noted that these responses included more “dehumanizing” language to describe Black Americans, Democrats, and Republicans. In contrast, LLMs consistently use more neutral and abstract language, suggesting that they approach race, politics, and other sensitive topics with more detachment.
Diluting diversity
Xu says that while it’s likely that some studies containing AI-generated answers have already been published, she doesn’t think LLM use is widespread enough to require researchers to issue corrections or retractions. Instead, she says, “I would say it has probably caused academics, researchers, and publishers to pay increased attention to the quality of their data.”
“We don’t want to argue that the use of AI is uniformly bad or wrong,” she says, adding that it depends on how it is used. Someone may use an LLM to help them express their own opinion on a social issue, or they may borrow an LLM’s summary of other people’s ideas on a topic. In the first scenario, AI helps someone refine an existing idea, Xu says. The second scenario is more concerning “because it is essentially generating a generic response rather than reflecting the specific point of view of someone who already knows what they think.”
If too many people use AI in this way, it could lead to a flattening or dilution of human responses. “What that means for diversity, what that means in terms of expressions of beliefs, ideas, identities – it’s a warning sign about the potential for homogenization,” Xu says.
This has implications beyond academia. If people use AI to answer surveys about diversity in their workplace, for example, it could create a false sense of acceptance. “People might draw conclusions like, ‘Oh, discrimination isn’t a problem at all, because people only have positive things to say about groups that we historically thought were at risk of discrimination,’ or ‘Everyone gets along and loves each other.’”
The authors note that directly asking survey participants to refrain from using AI can reduce its use. There are also technical ways to discourage LLM use, such as code that blocks copy-pasting of text. “One popular survey software platform has a feature that lets you ask participants to upload a voice recording instead of written text,” says Xu.
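For readers curious what the copy-paste blocking might look like, here is a minimal sketch of the general idea for a browser-based survey form. The element ID and warning message are hypothetical, and this illustrates the technique in the abstract rather than any specific platform’s implementation:

```ts
// Minimal sketch: discourage pasting LLM output into an open-ended
// survey field. Runs in the browser; the element ID "open-response"
// and the warning message are hypothetical examples.
const field = document.getElementById("open-response") as HTMLTextAreaElement | null;

if (field) {
  field.addEventListener("paste", (event: ClipboardEvent) => {
    event.preventDefault(); // keep clipboard contents out of the field
    alert("Pasting is disabled for this question. Please type your answer.");
  });
}
```

A determined respondent can still retype a model’s output by hand, so measures like this discourage LLM use rather than prevent it.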
The paper’s results also hold a lesson for survey creators: write clear, concise questions. “Many subjects in our study who reported using AI say they do so when they don’t think the instructions are clear,” says Xu. “When the participant is confused or frustrated, or simply has a lot of information to take in, they stop paying full attention.” Designing studies with humans in mind may be the best way to avoid the boredom or burnout that might prompt someone to open ChatGPT. “Many of the general principles of good survey design still apply,” says Xu, “and are more important than ever.”
This story was originally published by the Stanford Graduate School of Business.