Is ChatGPT making scientists hyper-productive? The highs and lows of using AI

ChatGPT continues to steal the spotlight, more than a year after its public debut.

The artificial intelligence (AI) chatbot was released as a free-to-use tool in November 2022 by tech company OpenAI in San Francisco, California. Two months later, ChatGPT had already been listed as an author on a handful of research papers.

Academic publishers scrambled to announce policies on the use of ChatGPT and other large language models (LLMs) in the writing process. By last October, 87 of 100 top scientific journals had provided guidance to authors on generative AI, which can create text, images and other content, researchers reported on 31 January in The BMJ1.

But that’s not the only way in which ChatGPT and other LLMs have begun to change scientific writing. In academia’s competitive environment, any tool that allows researchers to “produce more publications is going to be a very attractive proposition”, says digital-innovation researcher Savvas Papagiannidis at Newcastle University in Newcastle upon Tyne, UK.

Generative AI is continuing to improve — so publishers, grant-funding agencies and scientists must consider what constitutes ethical use of LLMs, and what over-reliance on these tools says about a research landscape that encourages hyper-productivity.

Are scientists routinely using LLMs to write papers?

Before its public release, ChatGPT was not nearly as user-friendly as it is today, says computer scientist Debora Weber-Wulff at the HTW Berlin University of Applied Sciences. “The interfaces for the older GPT models were something that only a computer scientist could love.”

In the past, researchers typically needed specialized expertise to use advanced LLMs. Now, “GPT has democratized that to some degree”, says Papagiannidis.

This democratization has catalysed the use of LLMs in research writing. In a 2023 Nature survey of more than 1,600 scientists, almost 30% said that they had used generative AI tools to help write manuscripts, and about 15% said they had used them to help write grant applications.

And LLMs have many other uses. They can help scientists to write code, brainstorm research ideas and conduct literature reviews. LLMs from other developers, such as Google’s Gemini and Claude 2 from Anthropic, an AI company in San Francisco, are improving as well. Researchers with the right skills can even develop their own personalized LLMs that are fine-tuned to their writing style and scientific field, says Thomas Lancaster, a computer scientist at Imperial College London.

What are the benefits for researchers?

About 55% of the respondents to the Nature survey felt that a major benefit of generative AI is its ability to edit and translate writing for researchers whose first language is not English. Similarly, in a poll by the European Research Council (ERC), which funds research in Europe, 75% of more than 1,000 ERC-grant recipients felt that generative AI will reduce language barriers in research by 2030, according to a report released in December2.

Of the ERC survey respondents, 85% thought that generative AI could take on repetitive or labour-intensive tasks, such as literature reviews. And 38% felt that generative AI will promote productivity in science, such as by helping researchers to write papers at a faster pace.

What are the downsides?

Although ChatGPT’s output can be convincingly human-like, Weber-Wulff warns that LLMs can still make language mistakes that readers might notice. That’s one of the reasons she advocates for researchers to acknowledge LLM use in their papers. Chatbots are also notorious for generating fabricated information, called ‘hallucinations’.

And there is a drawback to the productivity boost that LLMs might bring. Speeding up the paper-writing process could increase throughput at journals, potentially stretching editors and peer reviewers even thinner than they already are. “With this ever-increasing number of papers — because the numbers are going up every year — there just aren’t enough people available to continue to do free peer review for publishers,” Lancaster says. He points out that alongside researchers who openly use LLMs and acknowledge it, some quietly use the tools to churn out low-value research.

It’s already difficult to sift through the sea of published papers to find meaningful research, Papagiannidis says. If ChatGPT and other LLMs drive up the number of publications, that task will become even more challenging.

“We have to go back and look at what the reward system is in academia,” Weber-Wulff says. The current ‘publish or perish’ model rewards researchers for constantly pushing out papers. But many people argue that this needs to shift towards a system that prioritizes quality over quantity. For example, Weber-Wulff says, the German Research Foundation allows grant applicants to include only ten publications in a proposal. “You want to focus your work on getting really good, high-level papers,” she says.

Where do scientific publishers stand on LLM use?

According to the study in The BMJ1, 24 of the 100 largest publishers — collectively responsible for more than 28,000 journals — had provided guidance on generative AI by last October. Journals with generative-AI policies tend to allow some use of ChatGPT and other LLMs, as long as the tools are properly acknowledged.

Springer Nature, for example, introduced a guideline in January 2023 stating that LLM use should be documented in the methods or another section of the manuscript. Generative AI tools do not, however, satisfy the criteria for authorship, because authorship “carries with it accountability for the work, and AI tools cannot take such responsibility”. (Nature’s news team is editorially independent of its publisher, Springer Nature.)

Enforcing these rules is easier said than done, because undisclosed AI-generated text can be difficult for publishers and peer reviewers to spot. Some sleuths have caught it through subtle phrases and mistranslations. Unlike cases of plagiarism, in which there is clear source material, “you can’t prove that anything was written by AI”, Weber-Wulff says. Despite researchers racing to create LLM-detection tools, “we haven’t seen one that we thought produced a compelling enough result” to screen journal submissions, says Holden Thorp, editor-in-chief of the Science family of journals.

What about other uses?

Although, as of November, the American Association for the Advancement of Science — which publishes Science — allows some disclosed use of generative AI in the preparation of manuscripts, it still bans the use of LLMs during peer review, Thorp says. This is because he and others at Science want reviewers to devote their full attention to the manuscript being assessed, he adds. Similarly, Springer Nature’s policy prohibits peer reviewers from uploading manuscripts into generative AI tools.

Some grant-funding agencies, including the US National Institutes of Health and the Australian Research Council, forbid reviewers from using generative AI to help examine grant applications because of concerns about confidentiality (grant proposals are treated as confidential documents, and the data entered into public LLMs could be accessed by other people). But the ERC Scientific Council, which governs the ERC, released a statement in December recognizing that researchers use AI technologies, along with other forms of external help, to prepare grant proposals. It said that, in these cases, authors must still take full responsibility for their work.

“Many organizations come out now with very defensive statements” requiring authors to acknowledge all use of generative AI, says ERC Scientific Council member Tom Henzinger, a computer scientist at the Institute of Science and Technology Austria in Klosterneuburg.

To him, ChatGPT seems no different from running text by a colleague for feedback. “Use every resource at your disposal,” Henzinger says.

Regardless of the ever-changing rules around generative AI, researchers will continue to use it, Lancaster says. “There is no way of policing the use of technology like ChatGPT.”

doi: https://doi.org/10.1038/d41586-024-00592-w

References

1. Ganjavi, C. et al. BMJ 384, e077192 (2024).

2. ERC. Foresight: Use and Impact of Artificial Intelligence in the Scientific Process (European Research Council, 2023).
