CLINICAL QUESTION
What lessons for journal editors, reviewers, and teachers can be drawn from the first known Medical Teacher submission using ChatGPT, and what are the wider implications?
BOTTOM LINE
As shown by examples of academic submissions with “fake” references, journal staff and reviewers must be able to detect inappropriate use of artificial intelligence (AI) tools such as ChatGPT.
COMMENT: Ken Masters, widely considered a leader in AI and ethics, also discusses the concept of AI hallucinations. As large language models (LLMs) advance, hallucinations should become less frequent, but, for now, academicians at every level need to be aware of these potential drawbacks and pitfalls of LLM use. This does not mean we should shy away from these tools, but rather embrace them and leverage their utility.—Eric Gantwerker, MD, MSc, MS
BACKGROUND: The inappropriate use of ChatGPT by both students and academics is a growing concern. Using a “next-word prediction paradigm,” ChatGPT produces convincing “erroneous generations” (or “hallucinations”). This generated material can appear in manuscripts in the form of fake citations and references that editors, reviewers, and teachers must then try to detect.
STUDY DESIGN: Commentary.
SETTING: Sultan Qaboos University, Sultanate of Oman.
SYNOPSIS: To illustrate how ChatGPT generates false citations in academic papers, the author describes a submission to the journal Medical Teacher in which the actual subject of the work was ChatGPT and its use in healthcare. He notes that the citations included in the paper were dated prior to the launch of ChatGPT and that they did not include a DOI, URL, or any other unique identifier. Acknowledging that citing articles without reading them is already a problem in academia, he states that the availability of ChatGPT may have exacerbated this problem. Moreover, given that citations and references are often used as evidence in narrative papers, and that decisions are made based on that evidence, the implications for the journal, students, and patients could be devastating, he adds. He offers several recommendations to editors, reviewers, and teachers for detecting ChatGPT "hallucinations" in references: not glossing over citations; using subject expertise and critical thinking to spot oddities; being alert to the absence of unique identifiers; searching online for a small sample of the cited papers; and reporting any concerning findings to the journal. The author stresses that in academic manuscripts, responsibility and authorship must reside with the human author.
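As a purely illustrative aside (not part of the Masters article), the "search for a small sample of the cited papers" recommendation can be partly automated: a reviewer could query the public CrossRef API with a few cited titles and flag references that return no matching, DOI-bearing record. The function and variable names below are hypothetical, and a missing CrossRef record is only a prompt for manual checking, not proof of fabrication.

    # Illustrative sketch: spot-check a few cited titles against CrossRef.
    import requests

    def crossref_matches(title, rows=3):
        """Return candidate (title, DOI) pairs from CrossRef for a cited title."""
        resp = requests.get(
            "https://api.crossref.org/works",
            params={"query.bibliographic": title, "rows": rows},
            timeout=10,
        )
        resp.raise_for_status()
        items = resp.json()["message"]["items"]
        return [(item.get("title", [""])[0], item.get("DOI")) for item in items]

    # A small sample of reference titles copied from the manuscript under review
    # (hypothetical example title).
    suspect_titles = ["ChatGPT in healthcare education: a systematic overview"]

    for title in suspect_titles:
        candidates = crossref_matches(title)
        print(title)
        for found_title, doi in candidates:
            print(f"  candidate: {found_title!r} (DOI: {doi})")
        if not candidates:
            print("  no CrossRef record found -- flag for manual checking")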
CITATION: Masters K. Medical Teacher’s first ChatGPT’s referencing hallucinations: Lessons for editors, reviewers, and teachers. Med Teach. 2023;45:673–675.