On a recent trip to India, representing my university at a series of events, I found myself in conversation after conversation with students who want to study computer science. The question that came up most often, in one form or another, was: what is the future of AI? I should say upfront that I do not work with what most people currently mean by that term. I am not an LLM researcher, and I have no particular stake in any of the tools that have dominated the headlines for the past few years. What I do have is thirty-eight years in computer science, which is long enough to have watched several waves of transformative technology arrive, be overhyped, partially deflate, and then quietly reshape everything anyway. That tends to produce opinions.
One thing I noticed on that trip, and have been noticing for a while, is that I cannot remember the last time I received an email containing a typo or a grammatical mistake. Not from students, not from colleagues, not from strangers. When one does arrive with an error in it, my first reaction is something close to relief — it feels more genuine, more human. That small observation contains something worth thinking about.
The reason LLMs produce such clean, competent output is not mysterious. These systems have ingested an extraordinary proportion of everything humanity has written, and what they return when you ask them something is, in a meaningful sense, an average answer: the response that best represents the distribution of all the responses that could be given. There are parameters, the sampling temperature among them, that allow for more or less variation, and I am aware that the technical picture is more complex than this, but if you use these tools at their default settings, what you are getting is a well-formed, thoroughly adequate, resolutely average answer. That is not a criticism. It is a description. And it turns out to be enormously consequential.
But distributions have tails. Push a model away from the mean (through higher temperature settings, unusual prompting, or simply the natural variance in how these systems behave) and the answers become less average, less safe, and occasionally less accurate. This is where hallucinations live. A model that invents a citation, asserts something confidently untrue, or produces an answer that no reasonable person would recognise as correct is not necessarily malfunctioning. It may simply be operating at the edge of its distribution, where the training data is thinner and the outputs are stranger. The standard response to hallucinations is to treat them as a flaw to be engineered away. I am not convinced that is right. Humans operate exactly the same way: our most creative ideas and our most embarrassing errors come from the same place, the moment we venture beyond what we know for certain. Isaac Newton's contributions to mechanics are beyond dispute. He was also deeply committed to alchemy. The entire scientific process exists precisely to check which tail-of-the-distribution ideas are worth keeping. Hallucinations are not the problem. Unverified hallucinations are.
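To make the distribution talk concrete, here is a minimal sketch of temperature sampling over a toy next-token distribution. It is not any particular model's sampler, and the logits are invented for illustration; it only shows how the temperature parameter moves probability mass between the safe centre and the tail.

```python
import numpy as np

def sample_token(logits, temperature=1.0, rng=None):
    """Sample one token index from logits at the given temperature."""
    rng = rng or np.random.default_rng()
    # Dividing by the temperature sharpens (T < 1) or flattens (T > 1)
    # the distribution before the softmax.
    scaled = np.asarray(logits, dtype=float) / temperature
    scaled -= scaled.max()  # subtract the max for numerical stability
    probs = np.exp(scaled) / np.exp(scaled).sum()
    return rng.choice(len(probs), p=probs), probs

# A toy "next token" distribution: one safe answer, then a thinning tail.
logits = [4.0, 2.0, 1.0, 0.5, -1.0]

for t in (0.2, 1.0, 2.0):
    _, probs = sample_token(logits, temperature=t)
    print(f"T={t}: P(safest)={probs[0]:.3f}, P(rarest)={probs[-1]:.3f}")
```

At a low temperature nearly all of the mass sits on the single most likely answer; raising it pushes mass into the tail, which is exactly where both the surprises and the hallucinations live.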
But the main point is this: if an LLM reliably produces an average answer, then the floor has risen. For any task you can name (writing a cover letter, cooking a meal, decorating a room, writing a piece of code, offering emotional support) the baseline of what is acceptable has shifted upward. Why would you accept a bad recipe when an adequate one is available instantly and at no cost? Why would you tolerate an incompetent professional when the average level of competence is now freely accessible to anyone with a phone?
Think about the normal distribution for a moment. If we assume that human ability on any given task follows something like a bell curve (and I am aware this is a simplification, but bear with me), then we have always had three rough groups, as shown in the diagram below: the mediocre (more than one standard deviation below the mean), the average (the broad central zone within one standard deviation of it), and the expert (more than one standard deviation above it). Mediocrity has always existed, has always found work, has always been tolerated, partly because the alternative was nothing. That tolerance is eroding. An LLM, producing answers that sit squarely in that central zone, is reliably not mediocre. And that changes the calculation entirely.
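If the bell-curve simplification holds, the sizes of those three groups follow directly from the one-standard-deviation cutoffs. A few lines of standard-library Python make the proportions concrete (the cutoffs are the ones above; the percentages are just properties of the standard normal):

```python
from math import erf, sqrt

def phi(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

mediocre = phi(-1.0)               # more than one sigma below the mean
expert = 1.0 - phi(1.0)            # more than one sigma above the mean
average = 1.0 - mediocre - expert  # the broad central zone

print(f"mediocre: {mediocre:.1%}")  # ~15.9%
print(f"average:  {average:.1%}")   # ~68.3%
print(f"expert:   {expert:.1%}")    # ~15.9%
```

Roughly a sixth of the distribution sits in each tail; an LLM that reliably lands in the central two-thirds clears the mediocre sixth by construction.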
I do not think this spells the end of expertise. If anything, I think it clarifies what expertise is for. As the diagram below shows, the distribution does not disappear; it shifts. And as it does, the zone of genuine quality moves with it: the work that exceeds what any model could produce becomes more visible, more legible, and arguably more valuable. If you want to be taken seriously as a professional in almost any field, the implicit standard is now higher: your work needs to be meaningfully better than what I can get from a language model in thirty seconds. That is a higher bar than many people have previously been asked to clear. For some, that will be a crisis. For others, it will be the pressure that produces something genuinely good.
There is a harder question lurking here about diversity of ideas. Mediocre work is not only bad work; it is also unexpected work, work that does not know the rules well enough to follow them, work that occasionally stumbles into something no expert would have thought to try. Consider Newton again. We noted his hallucinations earlier as a curiosity, but the question runs deeper: would we have had the genius without the alchemy? Was it precisely because Newton was willing to pursue ideas that no rigorous mind would have entertained that he developed the habit of reaching beyond what was already known? The willingness to be wrong in large, extravagant ways may be inseparable from the capacity to be right in ways that change everything. If we raise the floor and squeeze out the mediocre, do we also lose something from the pool of ideas that has always fed innovation in unpredictable ways? I do not have a confident answer. My instinct is that the gains outweigh the losses: pushing people toward higher levels of craft and understanding is, on balance, a good thing for any field. But the question deserves to be asked.
What I find most interesting, and most open, is the longer arc. As the floor rises and people are pushed toward greater creativity and expertise, that creativity will itself become training data. The models will learn from it. The average will shift. The floor will rise again. Whether there is a ceiling to that process, or whether it describes an open-ended spiral of mutual escalation between human creativity and machine learning, I genuinely do not know. But I suspect that the professionals who will thrive in this environment are not those who can perform at the average (that position is now occupied) but those who understand that the distribution itself is moving, and that standing still on it is not a neutral act.
What I cannot tell you is whether the spiral has a natural speed limit. LLMs learn from the work we produce trying to stay ahead of them. The floor rises. We rise. The floor rises again. At some point, the interesting question is not whether mediocrity ends, but whether we can keep up with the pace at which it does.