Generative artificial intelligence (genAI) has generated a lot of things since ChatGPT was launched in November 2022, perhaps the most ubiquitous of which has been excitement. Bold claims have come from all quarters regarding the enormous impact it will have, ranging from the philosophical, that its rapid advancement has “brought humanity to a critical juncture, similar to the Renaissance,” to the profane, that “its sweet spot for rapid ROI includes marketing, finance, supply chain, and tax compliance.”
Some of these claims are wildly exaggerated. Like any tool, genAI is good at a limited number of things. And like any other tool, outside that range it’s almost useless.
What GenAI is good for
I recently posted about how GPT-4 can help content creators like me do our jobs better. It can summarize a transcript, analyze technical documents, draft routine copy, write click-attracting headlines, and create social media posts. I’ve tested it on all of these tasks and use it frequently for them. GenAI is good at jobs like these, which are constrained to documents you feed it, because they play to its strength: manipulating text.
But as I noted in that post, I wouldn’t use genAI to create finished copy. The most important reason is that it can’t say anything original. Textual genAI tools like ChatGPT manipulate words. (Others, like DALL-E, work with images.) They train on a huge database of existing text to learn which sequences of words are likely to follow other sequences or to be responsive to a question or command. Because it regurgitates preexisting information, genAI will always produce some version of what has already been published. Busy executives aren’t interested in worked-over pablum, and search engines pay scant attention to derivative content.
Another major reason I won’t use it for finished copy is that genAI does not write well. Many commentators claim it’s making everyone a good writer, but that’s not true. It might write better than some, but any professional writer or editor will tell you it’s nowhere near good enough for submission to a quality business journal, for instance. Editors look for stories with a strong voice—stories that sing, not ones that mumble so blandly you forget them even as you’re reading. GenAI itself admits its writing isn’t quite up to par; asked if it’s good at writing fiction, Google’s Bard colorlessly conceded it could “generate text that is grammatically correct and coherent, but it often lacks the creativity and originality of human-written text.”
If genAI can’t write original and publishable copy, how can it possibly do some of the other things pundits predict, like streamline a supply chain, discover new drugs, or eliminate a quarter of all jobs?
It can’t, and it won’t. Here are some reasons.
GenAI’s profound limitations
Because genAI tools are, at heart, word manipulators, there are many things they simply can’t do, things other tools handle with ease.
For example, I asked GPT-4 whether 17,077 is a prime number. It claimed to be “doing the calculation,” but 20 minutes later, I still didn’t have an answer. A search for “prime number checker” turns up a long list of calculators that will find the answer in less than a second; Excel, too, can do it almost instantaneously if you paste 15 lines of VBA code into it. These tools simply divide 17,077 by every number from 2 up to the square root of 17,077, looking for a divisor that leaves no remainder. When they don’t find one, they tell you 17,077 is prime.
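The whole check fits in a few lines. Here’s a minimal sketch of that trial-division test in Python (the online calculators and the VBA snippet do essentially the same thing):

```python
import math

def is_prime(n: int) -> bool:
    """Trial division: test every divisor from 2 up to sqrt(n)."""
    if n < 2:
        return False
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:  # a divisor with no remainder means n is composite
            return False
    return True

print(is_prime(17077))  # True, in well under a second
```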
GenAI is bad at math.
And chess. One commentator notes that chess apps in the 1990s played better than ChatGPT does today. In 1997, IBM’s Deep Blue beat Garry Kasparov, and chess programs have improved continually since. GenAI can only look at what other players did in similar circumstances; it can’t run through millions of scenarios to find the best move. And circumstances in chess are rarely replicated—one piece out of place can make a world of difference.
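To see what “running through scenarios” means, here’s a toy search engine in Python. It assumes the third-party python-chess library for move generation (any move generator would do) and scores positions by a bare material count; real engines add far stronger evaluation and pruning, but the principle is the same:

```python
import chess  # pip install python-chess (assumed dependency)

VALUES = {chess.PAWN: 1, chess.KNIGHT: 3, chess.BISHOP: 3,
          chess.ROOK: 5, chess.QUEEN: 9, chess.KING: 0}

def material(board: chess.Board) -> int:
    """Material balance from the side-to-move's point of view."""
    score = 0
    for piece in board.piece_map().values():
        value = VALUES[piece.piece_type]
        score += value if piece.color == board.turn else -value
    return score

def negamax(board: chess.Board, depth: int) -> int:
    """Play out every legal line `depth` half-moves deep; return the best score."""
    if depth == 0 or board.is_game_over():
        return material(board)
    best = -10_000
    for move in board.legal_moves:
        board.push(move)
        best = max(best, -negamax(board, depth - 1))
        board.pop()
    return best

def best_move(board: chess.Board, depth: int = 3) -> chess.Move:
    """Pick the move whose searched score is highest."""
    best_score, choice = -10_000, None
    for move in board.legal_moves:
        board.push(move)
        score = -negamax(board, depth - 1)
        board.pop()
        if score > best_score:
            best_score, choice = score, move
    return choice

print(best_move(chess.Board()))  # a sensible opening move
```

Even this naive version examines thousands of positions before choosing; genAI examines none.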
GenAI also can’t tell you how to make money. One man asked GPT-4 what to do with $100 to maximize his earnings in the shortest time possible. The program had him buy a domain name, build a niche affiliate website, feature some sustainable products, and optimize for social media and search engines. Two months later, our entrepreneur had a moribund website with one comment and no sales. So genAI is bad at business.
It’s also bad at operations. An expert at a supply-chain software company points out that genAI can’t manage a supply chain: it can’t predict what will happen (remember, it’s hopeless with numbers), and its recommendations are unreliable. They differ every time you ask, and you can’t see how they were derived. You’d be foolish to let it schedule your next delivery to Walmart.
GenAI is not even good at its supposed core strength: generating text. Among its shortcomings:
- Hallucinations. GenAI’s answers are often invented but presented with a high level of confidence. One lawyer is in hot water for submitting a legal brief containing six cases “identified” by ChatGPT, complete with quotes, citations, and sources. Sounds good, right? But in fact, ChatGPT made them up and insisted they were real, and the lawyer never bothered to double-check. The judge was not amused.
- Lost context. I recently asked GPT-4 to summarize an interview transcript I had fed it about business resilience. Though I reloaded the document several times, it produced a summary about sustainable power generation instead. I have no idea where it got that content.
- Wrong answers. Some researchers in California recently discovered that Leonardo da Vinci studied the connection between gravity and acceleration. I was curious whether da Vinci’s experiments had made it into GPT-4’s training data, which was cut off in 2021, so I had the following conversation with it:
Me: Who was the first to publish about the connection between gravity and acceleration?
GPT-4: The connection between gravity and acceleration was first formulated by Albert Einstein in his theory of general relativity…
Me: What about Newton?
GPT-4: You are correct; my previous response didn’t mention Isaac Newton’s contribution to the connection between gravity and acceleration. Newton was indeed the first…
Me: Did da Vinci find a connection?
GPT-4: Leonardo da Vinci didn’t formulate a comprehensive theory connecting gravity and acceleration, but some of his observations and ideas hinted at an understanding of certain aspects related to gravity and motion…
So GPT-4 did “know,” but only if you asked specific questions that reminded it of the details. And its first answer was just wrong.
Why GenAI has fired people’s imaginations
We’ve had artificial intelligence without the generative part in various forms since the 1950s, and there have been notable successes along the way, including Deep Blue’s chess victory and the visual recognition technology that enables autonomous driving and the analysis of medical images. Progress has been slow, and we are nowhere near having artificial general intelligence—a machine that can learn to accomplish any intellectual task that people can perform. But despite its limited capabilities, genAI has grabbed imaginations and fueled excitement like no other advance in AI.
Why?
Because it communicates with words in a way that makes it seem intelligent and even human. This has fooled millions into assuming it’s far more capable than it is.
GenAI may not be as dumb as a rock, but it’s close.
Think about some aspects of intelligence it does not have, such as human values, common sense, consciousness, self-awareness, and a true understanding of anything. It can’t anticipate the interactions between moving objects, intuit how a person feels, build causal models, ground learning in intuitive theories, or acquire knowledge and generalize it to new tasks and situations.
Seventy years after the advent of AI research, almost all our applied use of AI involves some aspect of machine learning, such as image recognition, natural language processing, anomaly detection, and object recognition. GenAI adds the capability to generate text or images, but it’s still a million miles from true intelligence or the kind of robot consciousness depicted in dystopian science fiction.
In fact, we don’t even really know what those qualities mean in humans. A quarter-century ago, the neuroscientist Christof Koch bet the philosopher David Chalmers a case of fine wine that by 2023, we’d have discovered a clear neural pattern underlying consciousness (a prerequisite for replicating it in a machine). We haven’t, so in June of this year, Koch handed Chalmers his case of wine. And, ever the optimist, he renewed the bet for 2048.
According to Amara’s Law, we tend to overestimate the impact of technology in the short term and underestimate it in the long term. Think railroads and the dot-com boom. GenAI will eventually find its equilibrium as a widely used and useful tool, but that won’t happen in the short term, and it won’t be transformative.