Hey there! I’m just a normal person trying to make sense of all the hype around large language models (LLMs) like ChatGPT. As someone with a bit of background in web scraping and data analysis, I’ve been curious about how these systems work under the hood. Are they truly intelligent, or is it all smoke and mirrors? Let me walk you through what I’ve learned.
The AI misnomer
First off, LLMs aren’t actually AI in the sense most people imagine, despite often being branded that way. In that sense, AI refers to machines that can replicate human cognitive abilities like reasoning, planning, creativity, and common sense. LLMs have none of those capabilities.
So why are they called AI? Mostly marketing hype by companies like OpenAI. But also because people conflate machine learning (ML) with AI. ML is formally a subfield of AI, but it works very differently from the human-like intelligence the label suggests.
ML systems like LLMs don’t think for themselves; they find statistical patterns in massive training data. With enough data and compute power, this pattern matching can generate remarkably human-like text. But the system has no real intelligence or understanding behind it.
How LLMs actually work
LLMs are a type of neural network trained on vast text data sets using deep learning algorithms. Their core training objective is to predict the next word (more precisely, the next token) in a sequence of text. With billions of parameters and massive training corpora, they get incredibly good at this single task.
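To make "predicting the next word from statistical patterns" concrete, here is a minimal sketch. It is not how an LLM is built: it swaps the neural network for a plain table of word-pair counts (a bigram model), and the tiny corpus and the function names `train_bigram_counts` and `predict_next` are made up for illustration. The idea it shows is the same one, though: the prediction comes from what tended to follow a word in the training text.

```python
from collections import Counter, defaultdict

# Toy corpus, invented for illustration; real models train on billions of words.
corpus = "the cat sat on the mat . the cat ate the fish . the dog sat on the rug ."


def train_bigram_counts(text):
    """Count how often each word is followed by each other word."""
    words = text.split()
    counts = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        counts[current_word][next_word] += 1
    return counts


def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = counts.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


counts = train_bigram_counts(corpus)
print(predict_next(counts, "sat"))  # prints: on
print(predict_next(counts, "the"))  # prints: cat
```

A word the table has never seen gets no prediction at all, which is one reason real systems learn smoother statistical representations instead of raw counts.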
But that’s all they can do: generate more text, one word at a time. They have no concept of semantics, reasoning, or common sense. Their knowledge comes entirely from recognizing patterns, not true comprehension.
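The "one word at a time" loop can be sketched in a few lines. Everything here is an assumption for illustration: the probability table `NEXT_WORD_PROBS` is written by hand rather than learned, the `<start>` and `<end>` markers are hypothetical, and this toy looks only at the previous word, whereas a real LLM conditions on the whole preceding text.

```python
import random

# Hypothetical next-word probabilities, written by hand (not learned from data).
NEXT_WORD_PROBS = {
    "<start>": {"the": 1.0},
    "the": {"cat": 0.5, "dog": 0.5},
    "cat": {"sat": 0.7, "slept": 0.3},
    "dog": {"sat": 0.4, "barked": 0.6},
    "sat": {"<end>": 1.0},
    "slept": {"<end>": 1.0},
    "barked": {"<end>": 1.0},
}


def generate(probs, max_words=10, seed=None):
    """Sample one word at a time until the end marker or the word limit."""
    rng = random.Random(seed)  # a seed makes the run repeatable
    word = "<start>"
    output = []
    for _ in range(max_words):
        choices = probs[word]
        # Pick the next word in proportion to its probability.
        word = rng.choices(list(choices), weights=list(choices.values()), k=1)[0]
        if word == "<end>":
            break
        output.append(word)
    return output


print(" ".join(generate(NEXT_WORD_PROBS, seed=0)))
```

Each pass through the loop picks a word and feeds it back in as the starting point for the next pick, so the sentence comes out of repeated sampling rather than a plan.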
Some key limitations of LLMs:
No general real world knowledge
No memory or ability to learn interactively
Zero common sense or reasoning skills
No capacity for abstraction or symbolic manipulation
No ability to innovate or have original ideas
No transfer learning: gains in language don’t improve other skills
Table 1 compares LLMs to other AI approaches on key cognitive capabilities:
As we can see, LLMs lag far behind in key markers of intelligence like reasoning and planning. Their proficiency is confined to statistical language tasks.
Why ChatGPT seems so smart
ChatGPT’s conversational ability and vast knowledge can certainly seem convincing. But this is more a parlor trick than true intelligence.
Its knowledge comes from ingesting a huge portion of all public internet text, some 300 billion words. That’s orders of magnitude more text than a human could consume.
Analyzing patterns in this firehose of data is impressive. But it does not imply real understanding or critical thinking. ChatGPT has no vetted knowledge of facts, ideas, or events. No ability to discern accuracy or truth.
With enough data volume and compute power (ChatGPT’s training is estimated to have cost over $4.6 million), an LLM can generate remarkably cogent text. But it has no comprehension of the words and concepts it deftly stitches together.
The end result is a statistical text generator that seems smart and knowledgeable, but has no actual intelligence driving it. Smoke and mirrors.
This table shows the vast gap between human knowledge and the data used to train ChatGPT:
|Human lifetime learning |1 million words |6 billion words
|ChatGPT training corpus |300 billion+ words
No wonder it seems to know so much: even against the larger human figure above (6 billion words), 300 billion words is roughly 50 lifetimes’ worth of text!
Beyond the hype
LLMs are impressive technological achievements, but still far from truly intelligent. Conflating their narrow statistical fluency with AI risks overestimating their capabilities and underinvesting in the hard work of developing real AI.
As AI pioneer Andrew Ng warns:
"AI is the new electricity. Electricity transformed industries, but we didn’t say every electronic gadget was powered by a captive lightning bolt."
Productivity tools like ChatGPT have their place, but we shouldn’t get carried away expecting creative, flexible human-level intelligence. That achievement is still waiting in the future.
So let’s appreciate large language models for what they are: extremely skilled text generators, not sentient AI systems. They have their wizardry, but the Wizard of Oz is still just a man behind the curtain. We should neither fear nor worship these models, but rather use them judiciously as tools while advancing the research needed to achieve real AI.