The Model Isn't Thinking. Neither Were You, Most of the Time.

Two clichés, both wrong

The first cliché says a large language model is just autocomplete with delusions of grandeur, a stochastic parrot stitching together fragments of other people's sentences with no idea what any of it means. The second cliché says the model is on the verge of consciousness, that something is waking up inside the weights, that the difference between it and a thinking mind is a rounding error and a few more GPUs. Pick whichever flatters the position you already hold.

Both are wrong, in opposite directions, for the same reason. The truth is more interesting than either, and it does not fit on a t-shirt, which is presumably why neither cliché has retired.

The mechanism, without the jargon you would skip

A language model is trained to do one thing: given a sequence of text, predict the next chunk of text. The chunk is called a token, which is roughly a word or a piece of a word. To predict means to assign a probability to every possible next token in the vocabulary. To generate text, the model samples a token from that distribution, appends it to the sequence, and predicts again. That is the whole job.
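For readers who prefer to see it, here is a minimal sketch of that last step: turning raw scores into a probability distribution and drawing one token from it. The four-word vocabulary and the scores are invented for illustration, and the helper names are hypothetical. A real model produces one score per token across a vocabulary of tens of thousands of entries.

```python
import math
import random

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample_next_token(vocab, logits, rng):
    """Draw one token from the distribution implied by the scores."""
    probs = softmax(logits)
    return rng.choices(vocab, weights=probs, k=1)[0]

# Hypothetical scores a model might assign after "the cat sat on the".
vocab = ["mat", "sofa", "moon", "idea"]
logits = [3.0, 2.0, 0.5, -1.0]

probs = softmax(logits)
token = sample_next_token(vocab, logits, random.Random(0))
print(dict(zip(vocab, (round(p, 3) for p in probs))), "->", token)
```

The highest-scoring token is the most probable one, but sampling means it is not guaranteed to be chosen, which is one reason the same prompt can yield different continuations.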

The training works like this. Take a vast pile of text, hundreds of billions of words pulled from books, code, forums, papers, transcripts, and the open web. Show the model a sequence with the last token hidden. Ask it to guess. Compare the guess to the real token. Nudge the model's internal parameters in the direction that would have made the right answer slightly more likely. Repeat this trillions of times, across every kind of text humans have written, until the nudges have shaped a network of billions of parameters into something that produces eerily plausible continuations of almost anything you feed it.
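The guess, compare, nudge loop can be sketched at toy scale. What follows is not how a real language model is built: it is an assumed stand-in that predicts the next word from the previous word alone, using a small table of scores instead of a neural network, on a two-sentence corpus made up for the purpose. The update rule, though, is the same idea: raise the probability of the token that actually came next.

```python
import math

# Toy corpus, invented for illustration.
corpus = "the cat sat on the mat . the cat sat on the sofa .".split()
vocab = sorted(set(corpus))
index = {tok: i for i, tok in enumerate(vocab)}
V = len(vocab)

# The "parameters": one score for every (previous token, next token) pair.
logits = [[0.0] * V for _ in range(V)]

def softmax(row):
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def average_loss():
    """Mean negative log-probability assigned to the true next token."""
    pairs = list(zip(corpus, corpus[1:]))
    total = 0.0
    for prev, nxt in pairs:
        probs = softmax(logits[index[prev]])
        total -= math.log(probs[index[nxt]])
    return total / len(pairs)

loss_before = average_loss()

learning_rate = 0.5
for _ in range(200):                      # many passes over the data
    for prev, nxt in zip(corpus, corpus[1:]):
        row = logits[index[prev]]
        probs = softmax(row)              # the model's guess
        for j in range(V):
            target = 1.0 if j == index[nxt] else 0.0
            # Gradient of the cross-entropy loss with respect to each score
            # is (predicted probability - target); step against it.
            row[j] -= learning_rate * (probs[j] - target)

loss_after = average_loss()
print(f"loss before: {loss_before:.3f}, after: {loss_after:.3f}")
```

After training, the table assigns high probability to "sat" after "cat" and splits its bets after "the", because that is what the data does. Scale the table up to a network with billions of parameters and the context up from one word to thousands, and the loop is recognizably the same.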

Two pieces of vocabulary are worth knowing because they will not go away. Embeddings are how the model represents meaning: every token gets mapped to a long list of numbers, a point in a high-dimensional space, where tokens with similar uses end up near each other. Attention is the mechanism that lets the model decide, when predicting the next token, which earlier tokens in the sequence matter most. Attention is what makes the model good at long-range structure, at remembering that the sentence opened with a question, at noticing that paragraph three referred back to paragraph one.
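Both ideas fit in a few lines, with heavy caveats. The three-dimensional vectors below are hand-written for illustration, whereas real embeddings are learned and run to hundreds or thousands of dimensions. The attention function is a simplified sketch of scaled dot-product attention that uses the embeddings directly as queries, keys, and values, whereas a real model first passes them through learned projections.

```python
import math

# Hypothetical embeddings: similar uses, nearby points.
embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "tax": [0.1, 0.2, 0.9],
}

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    """Similarity of direction: 1.0 means the vectors point the same way."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def attention_weights(query, keys):
    """Score each earlier token against the query, then normalize with softmax."""
    scale = math.sqrt(len(query))
    scores = [dot(query, key) / scale for key in keys]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Blend the values, weighting each by how much it matters to the query."""
    weights = attention_weights(query, keys)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]

context = ["dog", "tax"]
keys = [embeddings[t] for t in context]
weights = attention_weights(embeddings["cat"], keys)
blended = attend(embeddings["cat"], keys, keys)
print(dict(zip(context, (round(w, 3) for w in weights))))
```

With "cat" as the query, "dog" receives more weight than "tax", because their vectors point in similar directions. That is the whole trick: which earlier tokens matter is decided by geometry, and the geometry was shaped by training.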

That is the technology. A predictor of next tokens, trained at absurd scale, with a mechanism for weighing what came before. No magic. No soul. Nothing that, described at this level, sounds like it should produce a working translator, a passable lawyer, or a competent first draft of almost any document a knowledge worker might need.

Why something so mundane produces something so useful

Here is where the cliché breaks down. The stochastic parrot story is correct about the mechanism and wrong about the consequence. Predicting the next token, at sufficient scale, on sufficient data, is not what it sounds like.

The training data does not just contain words. It contains the shape of every kind of structured human output. The shape of an explanation. The shape of a proof. The shape of a working code snippet. The shape of a courtroom argument, a research abstract, a customer support reply, a recipe, an apology, a joke. To predict the next token in a debugging session, the model has to internalize what debugging sessions look like. To predict the next token in a chain of mathematical reasoning, it has to internalize what the steps of such a chain tend to do.

The model is not memorizing those examples. The space is too large, the parameters too few relative to the data. What gets stored is closer to a compressed atlas of patterns, an enormous library of templates for how text of a particular kind tends to move. When prompted, the model finds itself, statistically speaking, somewhere on that atlas, and produces the continuation that fits the territory it appears to be in.

The model learns the shape, not the substance. At sufficient scale, the shape carries the substance most of the time.

That last clause is the part neither cliché can tolerate. Most of the time, in a great many domains, the shape of correct reasoning and the substance of correct reasoning are the same object. The form of a competent code review just is a competent code review. The form of a clear summary just is a clear summary. For tasks where mimicking the output of a competent person is indistinguishable from being competent, the model is competent. Not in the way a person is. In the way the work itself can be checked and used.

Where the shape and the substance come apart

The same explanation predicts the failure modes, which is how you know it is doing real work.

Anywhere the shape of correct reasoning and the substance of correct reasoning are different objects, the model gets into trouble. Novel mathematics is the obvious case. The model can produce text that looks exactly like a proof, with the right rhythm of lemmas and the right transitional phrases, while the proof itself is nonsense. The pattern of a proof and the validity of a proof are distinct, and the model was trained on the pattern.

Multi-step planning under genuine uncertainty is another. The model can describe a plan in the right register, with appropriate hedging and convincing intermediate steps, while the plan would collapse on contact with a real environment. A plan that sounds right and a plan that works are different things, and the training signal does not reliably distinguish them.

Anything requiring a stable worldview is a third. Ask the model to hold a controversial position consistently across a long conversation, against pushback, and it will drift. Not because it is being agreeable. Because there is no "it" doing the holding. There is a distribution over plausible continuations, and the distribution will move with the prompt.

These failures are not bugs. They are direct consequences of the mechanism. A system that learns shapes will be excellent wherever shape and substance coincide, and unreliable wherever they diverge. Notice that this prediction is testable, has been tested, and keeps coming out the same way.

The uncomfortable part nobody wants to say out loud

If pattern completion at scale can write a passable contract, draft a serviceable email, debug a medium-sized function, summarize a paper, and translate a press release, that raises a question the discourse has been carefully arranging itself to avoid.

How much of human reasoning, in those same domains, was ever more than pattern completion?

Not the deepest cases. Not the working mathematician at the frontier, not the surgeon improvising in a chest cavity, not the founder making a bet with incomplete information at three in the morning. Those involve something the model does not have. But the median knowledge work? The third draft of the deck, the standard contract, the routine diagnosis, the boilerplate code, the executive summary, the polite refusal, the performance review, the legal brief that cites the same six cases everyone cites? An enormous amount of that output was already shape over substance, produced by humans whose training was, in important respects, the same kind of training: see many examples, internalize the patterns, generate plausible continuations.

That is the implication both clichés exist to dodge. The dismissive cliché protects the human ego: if the model is just autocomplete, then anyone whose work the model can do was never doing real thinking, and the rest of us were. The exalting cliché protects the AI ego, and the financial position of those selling it: if the model is on the verge of mind, then the eye-watering valuations are downstream of an emerging consciousness rather than a very good pattern matcher. Both stories let the reader avoid the more disorienting middle, which is that the model is doing exactly what it appears to be doing, and so, for a large fraction of professional output, are the humans.

The working mental model

Keep this picture instead. A language model is a high-dimensional map of how human-produced text tends to move. When prompted, it locates itself on the map and rolls forward along a likely path. Where that path coincides with correct reasoning, the output is useful, often startlingly so. Where the path and the reasoning come apart, the output is a confident hallucination, indistinguishable on the surface from the real thing, which is why the surface is no longer a safe place to evaluate it.

Treat the model as a competent intern with unlimited stamina, a perfect memory for form, no judgment about substance, and no stake in the outcome. Use it for the work where form is most of the substance. Check it ruthlessly where the two diverge. Notice, while doing so, which categories of your own work are which, and what that implies.

The boring mechanism produces the surprising result. The surprise is not that a machine learned to think. The surprise is how far you can get without thinking, and how much of what looked like thinking was on the same map all along.