How I'm Going to Explain LLMs to My Parents

by Michael Doornbos · 925 words

I recently wrote a 2,500-word breakdown of how large language models work. Tokenization, vectors, attention mechanisms, training phases. I was pretty happy with it.

Then I realized my Mom is going to ask me what ChatGPT actually does. And none of that is going to matter. My Dad, who ran mainframes from 1974 to 2007, is going to need a completely different explanation.

My parents are smart people. This isn’t about dumbing things down. It’s about figuring out the right depth for the person in front of you. Meet them where they are.

These are loose characterizations of my parents, maybe unfairly so. They’d probably argue with me about it, unless they understand I’m using their differences to make a point, not to describe them precisely. But broadly: Mom has no technical background and Dad has deep technical background from a different era. Most people you’ll explain this to fall somewhere between them. The point isn’t the specific analogies. It’s adjusting your depth to meet the person in front of you.

Mom: it guesses the next word

“You know how your phone suggests the next word when you’re texting? ChatGPT does the same thing, except it read basically everything on the internet first. So it’s really good at guessing.”

I think this will land. Not “artificial intelligence.” Not “large language model.” Just: it guesses the next word, and it’s read a lot. If she asks why people call it AI: “Because ‘large language model’ is a mouthful and ‘next-word guesser’ doesn’t sound like it’s worth billions of dollars.”

Knowing my Mom, she’ll want to know how it guesses. Under the hood, the text gets chopped into pieces called tokens, each token becomes a number, and those numbers have complex relationships with each other: how close or far apart they are in meaning. Figuring out those relationships across trillions of examples is what makes training so expensive. Once that’s done, the model uses all of it to figure out which word is most likely next.
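If she really wants to see it, I can show her the flavor of it in a few lines of Python. This is a toy sketch with a made-up two-sentence “corpus,” nothing like the real architecture (no vectors, no attention), but it’s the same basic move: read a lot, count the patterns, guess the most likely next word.

```python
from collections import Counter, defaultdict

# Toy "read a lot, then guess the next word" model.
# Real LLMs learn far richer relationships than raw word counts,
# but the guessing step rests on the same idea.
reading = "how are you ? i am fine . how are you ? i am great .".split()

# Count which word tends to follow which word in the "reading."
follows = defaultdict(Counter)
for word, next_word in zip(reading, reading[1:]):
    follows[word][next_word] += 1

def guess_next(word):
    # Return the word that most often came next in everything it read.
    return follows[word].most_common(1)[0][0]

print(guess_next("how"))  # -> "are"
print(guess_next("are"))  # -> "you"
```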

It’s not thinking

This is the important part. I’m pretty sure she assumes it’s some kind of brain in a box.

“It’s not thinking. It’s never had an original thought. It picks the most likely next word based on patterns it ‘learned’ from reading. Like how you know that when someone says ‘How are you?’ they don’t actually want a medical report. You learned that from experience. ChatGPT learned it from reading billions of sentences.”

I’m guessing she’ll come up with a better analogy than anything I’ve got. Moms are good at that.

For the curious: these patterns act like rules of thumb, or heuristics, stored as billions of numbers the model tuned during training. The model was never told that questions should be followed by answers, or that formal emails sound different from texts to friends. It just figured that out from reading trillions of examples. Like how you learned social norms by living in the world, not by reading a manual.

It can be wrong

“It doesn’t know what’s true. It knows what sounds right. Those are different things. It can write you a completely wrong answer and sound totally confident about it, because confidence is just another pattern it learned.”

This one might surprise her. I think a lot of people trust it like a search engine.

The technical term is “hallucination.” The model generates whatever sounds most probable, and a confident wrong answer can be more probable than “I don’t know.” It’s not lying. It genuinely doesn’t know the difference.
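Here’s the idea with numbers I invented for the sake of the example (no real model produced these, and a current one would probably get this particular question right): the continuation that sounds most likely wins, whether or not it’s true.

```python
# Pretend prompt: "The capital of Australia is ..."
# Invented probabilities, for illustration only -- not from any real model.
# "I don't know" is rarely how written text continues, so a confident
# wrong answer can beat it on pure likelihood.
continuations = {
    "Sydney": 0.55,        # sounds right, is wrong (it's Canberra)
    "Canberra": 0.35,      # actually right
    "I don't know": 0.10,  # honest, but improbable-sounding text
}

print(max(continuations, key=continuations.get))  # -> "Sydney", said with confidence
```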

It’s a tool

“Think of it like a calculator for words. A calculator doesn’t understand math, it just follows rules really fast. ChatGPT doesn’t understand language, it just predicts really well. You wouldn’t trust a calculator to tell you which math problem to solve. Same idea.”

She understands tools.

Now for Dad

Dad operated mainframes for over thirty years. He knows what a job scheduler is. He knows batch processing. But everything he knows about computers is deterministic: you write a program, it executes the instructions, same input, same output, every time. LLMs are the opposite. There’s no program in the sense he means: nobody wrote instructions telling it what to say. Ask it the same question twice, even back to back, and you’ll get different answers. Not because it’s confused. Because it’s picking from probabilities each time, and the dice roll differently. There is no “correct” output, just a range of likely ones. For someone who spent decades making sure the same job produced the same output every night, that’s going to feel fundamentally wrong.
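The dice roll is the easy part to show him. A minimal sketch, with probabilities I made up: the exact same input, run twice, can come back with different words, because the next word is sampled, not looked up.

```python
import random

# Pretend the model scored these as the likely next words for some prompt
# (numbers invented for illustration).
next_word_probs = {"fine": 0.5, "great": 0.3, "tired": 0.2}

def sample_next():
    # Same input every time, but the pick is a weighted dice roll.
    words = list(next_word_probs)
    weights = list(next_word_probs.values())
    return random.choices(words, weights=weights, k=1)[0]

print(sample_next())  # maybe "fine"
print(sample_next())  # maybe "tired" -- same "prompt", different answer
```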

So I’m going to meet him where he is: “Remember how you could look at a job and know it was going to fail before it ran? Nobody taught you that. You just saw enough jobs over thirty years that you developed instincts. That’s all an LLM is. Instincts built from reading, not experience. And just like yours, they’re usually right. But sometimes they’re not.”

He’ll get that. He’ll also immediately distrust it, which is probably the right response.

The real test

I wrote a 2,500-word technical explainer and I’m glad I did. If you’re building with these tools or making decisions about them, you should understand how they work. The paragraphs after each Mom analogy above are a middle ground if you want more depth without the full deep dive.

But the people in your life who just want to know what’s going on? Four sentences:

It guesses the next word. It’s read a lot. It’s not thinking. And it can be wrong.

I’ll report back.


How would you explain LLMs to someone non-technical? I’m still working on this.
