The chess grandmaster's mistake: what Deep Blue can teach us about AI and writing
Welcome to the era of the centaur writer
In my Winning Writing x ChatGPT lectures at Stanford, I continually hear the same comments about large language models (LLMs):
The writing is not as good as if I were to write something myself!
These models don't understand grammar or logic, so how is it able to write prose if it can't think?
It makes these funny errors — Google doesn't make these, and Google has been around for decades!
A lot of people really underestimate these LLMs, fundamentally because they were benchmarking them against a singular, top-tier human writer. They think that just because they — excellent writers in their own right — could produce better writing than today's top LLMs, they could safely ignore ChatGPT (and the other LLMs out there).
One clear mistake is discounting the speed of improvement of these models.
I was certainly guilty of this too: my first foray into AI was in my freshman computer science course where our assignment was to implement a Markov model to (naively) generate text. It was a very interesting pedagogical exercise where I delved deeper into the idea of statistical models applied to language, but the text my model created was not very interesting. A few years later, I experimented with LSTMs for text generation, and while the text produced was somewhat coherent, it was still too costly to generate. Incidentally, that particular year was 2017, which was when the seminal Attention is All You Need paper was published, eventually ushering in the (current) age of the Transformer and large language model.
As they say, we tend to overestimate the effect of a technology in the short run and underestimate the effect in the long run.1
Part of this shortsightedness is just natural human psychology. If we follow the trajectory of technological disruption, we see an age-old pattern:
Ridicule — Why are you wasting my time with this?
Flaw-picking — Hmm, it's a cute toy, but it doesn't do x, y, z.
Disbelief — What?! How does it do that!
Acceptance — Oh please, that's not even interesting any more.
However, the bigger mistake is to view artificial intelligence as a pure substitute for human creativity and ingenuity.
For now, AI systems are highly specific and specialized machines: incredible at their functions in constrained domains, and not anywhere else. Similar to my previous piece on AI as the lever rather than the product, current AI technology is much more augmentation than it is automation. Our own human weaknesses will be improved by the machines, and their defects will be masked by our human judgment.
If we follow history, the same will happen in writing: the best writers will be those who know how to wield these AI tools to augment their own human abilities.
We don't need to look very far in the past to see this exact story play out. The exact same thing happened in chess.
Deep Blue
I started playing chess (semi)seriously in kindergarten. For a few years, I was the dominant player at our school's chess club. After an undefeated tournament run in elementary school — I won my first match because my opponent didn't show up, so I got bored and asked my dad to take me home — I ended up "retiring."
However, the more interesting anecdote is how I got started playing chess. Some teacher recommended that I should join the chess club because I was "smart" and that's what the smart kids did. For the time that I did, I stuck with chess mainly because it was associated with intelligence.
For some reason, chess has always captured people's minds as a fundamental measurement of human intelligence. Chess requires strategic planning and pattern recognition, synthesized with incredible mental concentration. It's fascinating that so many cultures around the world developed variants of the game that we call chess today — Chaturanga in India, Shatranj in Persia, Xiangqi in China, Shogi in Japan, and many more. It has an incredible depth of complexity emerging from a small collection of pieces and simple rules.
As the field of artificial intelligence blossomed in the 20th century, chess was chosen as the de-facto measurement of human intelligence. And while researchers tried to build chess playing systems, for the vast majority of the century, human grandmasters reigned supreme.
Then came Deep Blue.
In the 1990's, IBM dedicated a non-trivial amount of its resources into building Deep Blue. Deep Blue was a supercomputer built for one purpose only: playing chess at a world-class level. Deep Blue's hardware consisted of hundreds of custom-designed chess chips, which allowed it to evaluate up to 200 million chess positions per second. This raw computing power combined with its advanced chess-playing software allowed Deep Blue to match up against the best human players.
Of course, most chess grandmasters were dismissive. Deep Blue was not the first chess playing machine, and all of the previous generations couldn't compare to the top human players. Experts felt that chess couldn't be simply reduced to the pure number crunching of a computer. Reigning world champion Garry Kasparov added more detail: "It's just a machine. Machines are stupid."2
In 1997, Deep Blue played a series of famous games against Kasparov. The machine ended up winning the match 4-2.
The inflection point
Most people think of this as a moment in which artificial intelligence reached an inflection point.
Computers had defeated humans in chess, humanity's chosen gauge of intelligence — so chess should logically be over! But that's not what happened.
Chess actually experienced a renaissance, and the media coverage of the match caused millions around the world to start learning chess. In the wake of Deep Blue's victory, a new era of chess emerged: the age of centaur chess players.
Centaur chess refers to the collaboration between human players and chess engines, where the human's intuition and strategic understanding are combined with the computer's raw calculating power.
Most interestingly, the best centaur chess "players" weren't the most powerful computers paired with the best grandmasters. Instead, the top centaurs were often skilled amateurs working with relatively modest chess engines. These human players excelled at harnessing the strengths of the computer while mitigating its weaknesses. People used their understanding of chess strategy to guide the engine's analysis, focusing on the most promising lines of play and discarding irrelevant variations. Moreover, the human players were able to identify positional factors and long-term plans that the engines might overlook in their pursuit of short-term tactical gains.
This combination of human creativity and machine precision proved to be formidable, with centaur teams consistently outperforming both human grandmasters and top chess engines in tournament play.
For twenty years even after Deep Blue dethroned humanity's top chess champion, the top chess players remained a combination of human and machine — the centaur.3
The Age of the Centaur
The Deep Blue story of artificial intelligence development is the pattern I expect to see as AI continues to permeate human society.
I think a lot about writing because it's what I love to do. Writing requires creativity and intellect, and it is something we hold to be so intensely human and fundamental to our identities as humans.
When ChatGPT showed up on the scene, people were initially bamboozled by its abilities. But seemingly just as quickly, critiques emerged: it made silly mistakes that simple humans wouldn't make, it wasn't really thinking like a human when writing, and it couldn't possibly capture the je ne sais quoi of excellent writing.
It's just a machine. Machines are stupid.
Sound familiar?
At this moment, AI allows for machines to create writing indistinguishable from most human writers. It's not quite world-class, but the best LLMs write better than most humans. But at some point, the machine will be able to create writing indistinguishable from the best human writers, and perhaps surpass them.
But the arc of chess illuminates a path forward.
Even when LLMs (or their future ancestors) reach the summit of human-level expert performance, the human and machine centaur combo can be superior. Just like in chess, the best centaur isn't simply a mash-up of the most powerful computers with the most talented humans. Rather, it's the humans who are able to figure out how to use their AI tools in the best way possible.
We aren't yet at writing's Deep Blue moment, but we're getting closer and closer.
It's certainly true that writing isn't a well-constrained problem like chess, where it's clear when someone wins or loses. Writing is more squishy and subjective — can anyone really say if Dostoyevsky is better than Tolstoy? I think this means the era of centaur and human-augmented performance will be even more interesting.
And this is why we should prioritize experimenting with and mastering these AI tools.
The story of chess teaches us that AI is not a replacement for human ingenuity, but rather a powerful tool to be wielded by those who can harness its strengths. As we stand on the cusp of a new era in writing, it is those who learn to work in harmony with AI who will emerge as the masters of the craft.
This is also known as Amara’s Law.
For an interesting digression on this topic, read Garry Kasparov's Deep Thinking. It's an interesting first-person reflection on facing Deep Blue from the chess grandmaster himself, and includes a wide amount of pontification of what an AI-centaur future might look like.