AI Is a Great Helper—But Don’t Mistake Its Words for Evidence

There is an irony here worth stating plainly. This column, which argues against treating chatbot output as evidence, has itself been shaped through a proprietary AI pipeline at the shadow author’s direction. That is offered as both disclaimer and device. The contradiction is real, though perhaps instructive. AI can be useful in drafting, organising and sharpening prose. It becomes far less useful when people promote its answers as if they were proof.

One of the more dispiriting habits of the past two years has been the rise of the screenshot as argument. A person puts a query to a chatbot, receives a smooth paragraph in reply, then waves that paragraph around like a signed affidavit. “I asked AI, and it says…” As a form of public reasoning, this is shoddy. A chatbot is not a witness, not an archive, not a scholar, not an editor and certainly not a source with legal or moral responsibility for what it emits. It is a machine for generating plausible language.

That distinction matters. The problem with AI output is not merely that it can be wrong, though it often is wrong in confident and inventive ways. The deeper problem is that it dissolves accountability. A newspaper article can be challenged. An academic paper can be checked. A government report can be scrutinised. Even a mediocre blog post has an identifiable author, a publication date and some traceable chain of responsibility. A screenshot from a chatbot has almost none of that. It is detached from stable provenance, vulnerable to prompt manipulation and impossible to verify unless the user shows the exact query, context, model version and any system instructions that shaped the answer. Even then, reproducibility is shaky.

In that sense, the better comparison is not with a book or encyclopedia, but with early Wikipedia. In its rougher years, Wikipedia was often treated with suspicion because the line between knowledge and hearsay looked alarmingly thin. Yet Wikipedia matured by doing something chatbots still struggle to do in public argument: it built norms of verifiability. Its articles became studded with citations; its edits became inspectable; its disputes became visible. One could check the footnotes, inspect the revision history and follow the argument back to accountable sources. The site earned a degree of trust not by demanding faith, but by making doubt easier to practise.

Chatbots pull in the opposite direction. They flatten many sources, of varying quality, into a single polished answer with the mess removed. That polish is precisely what makes them dangerous in argument. They speak in a voice cleansed of uncertainty. Researchers have repeatedly shown that large language models can hallucinate facts and citations, and even the companies building them acknowledge that such systems are optimised to produce fluent responses rather than to preserve a transparent chain of evidence. The technology can be marvellous at synthesis, but synthesis without attribution leaves the user holding something rhetorically potent and evidentially weak.

None of this requires puritanism. AI is already woven into ordinary intellectual life. People use it to brainstorm, summarise, translate, draft emails and sketch ideas they later refine. Journalists, students, academics and office workers all do versions of this, whether publicly admitted or quietly practised. The sensible norm is not abstinence. It is discipline. If AI points you toward a claim, go and find the claim in the world. If it names a study, read the study. If it describes a law, check the law. If it offers a statistic, locate the dataset or the report. Use the machine as a guide dog, not as a magistrate.

That is why citing a random AI answer feels so outrageous. It asks the audience to accept language in place of evidence, confidence in place of accountability. It turns a tool for assistance into an oracle. A culture that tolerates this will make itself easier to manipulate, not because people are foolish, but because the form itself encourages passivity. The great task of the information age was learning to ask, “Where did this come from?” The age of generative AI makes that question more urgent, not less.

So by all means use the tools. I do. This very piece is touched by them. Yet if one wishes to persuade, one still owes the reader the old courtesies: sources, traceability, and a path back to something sturdier than a machine’s immaculate guess.

Sources:
OpenAI, “Why language models hallucinate” (September 5, 2025); OpenAI, “Understanding the source of what we see and hear online” (May 7, 2024); Nature Machine Intelligence, “Improving Wikipedia verifiability with AI” (2023); Nature, “AI tidies up Wikipedia’s references — and boosts reliability” (2023); arXiv, “Citation-Enhanced Generation for LLM-based Chatbots” (2024); arXiv, “Hallucination-Free? Assessing the Reliability of Leading AI Legal Research Tools” (2024); Wikipedia, “Reliability of Wikipedia”; Wikipedia, “Wikipedia: Citing sources”; Wikipedia, “Wikipedia Seigenthaler biography incident.”