Uncovering the limitations of ChatGPT

Updated: Oct 2, 2023


AI-generated image created with MidJourney | Owned by The AI Academy as per MidJourney ToS


I’ll start by saying I find the evolution of generative algorithms quite exciting and promising. I’ve played a little with ChatGPT and I think OpenAI has done a pretty good job in two aspects:

  1. re-imagining the search user experience: who would want a bunch of hyperlinks and a ton of new pages to read when you can get a well-written summary of a topic in plain English in seconds?

  2. implementing some guardrails to limit the misuse of this powerful tool: I’ll share some examples of this below, but I think this effort deserves to be praised. It’s not something we should take for granted, and the proof is that you don’t get this “feature” with other tools.

Having said that, in the midst of a ChatGPT hype wave where literally thousands of articles and videos are posted every day explaining what you can do with it, my instinct is to set the record straight so more people can understand what a Large Language Model is and isn’t.

The best way I found to reach that goal is to focus on its limitations, and I will let ChatGPT itself answer to prove the key points.

Here we go.


1. Provide answers about current events

The first concept is that an LLM is built using a large amount of training data. It can certainly be re-trained, but the answers you get are built on the basis of the data it has been trained on. Clearly, the larger the training data, the higher the chance that ChatGPT “has the answer”, but if you ask a question about something that was not in the training data, it will not be able to provide an answer.
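To make this concrete, here is a toy sketch in Python (my own illustration, not how ChatGPT actually works internally): a model whose “knowledge” is frozen at a training cutoff simply has nothing to draw on for later events. The cutoff date and the facts below are invented for illustration.

    from datetime import date

    # Toy illustration of a model whose knowledge is frozen at a training cutoff.
    # The cutoff and the "facts" below are invented for illustration only.
    TRAINING_CUTOFF = date(2021, 9, 1)
    TRAINED_FACTS = {
        "who won the 2018 world cup": "France won the 2018 World Cup.",
        # Anything that happened after the cutoff is simply absent.
    }

    def answer(question: str) -> str:
        fact = TRAINED_FACTS.get(question.strip().lower().rstrip("?"))
        if fact is None:
            return (f"My training data only goes up to {TRAINING_CUTOFF}, "
                    "so I cannot answer questions about later events.")
        return fact

    print(answer("Who won the 2018 World Cup?"))  # present in the "training data"
    print(answer("Who won the 2022 World Cup?"))  # after the cutoff: no answer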

Here is the question I’ve asked to illustrate this point:



note: as I write this article, ChatGPT is at the January 9th, 2023 version


2. Predict the future

This may seem redundant if you know enough about AI, but since Machine Learning is fundamentally about predictions, I wanted to get this out of the way. An LLM is designed and built to predict which words to pick to complete a sentence (that is how it shines in the “generative” part of the implementation), but it is not designed to use Machine Learning to make predictions about the future.
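To see the next-word mechanic in isolation, here is a minimal sketch using GPT-2, a small, openly available language model, through the Hugging Face transformers library (my choice for illustration, since ChatGPT’s own weights are not public):

    # pip install transformers torch
    from transformers import pipeline

    # GPT-2 is a small open model; this sketches the mechanism, not ChatGPT itself.
    generator = pipeline("text-generation", model="gpt2")

    # The model completes the prompt by repeatedly predicting a plausible next token.
    out = generator("Tomorrow the stock market will", max_new_tokens=12)
    print(out[0]["generated_text"])

    # The continuation is fluent language, not a forecast: the model picks likely
    # words; it is not using Machine Learning to predict future events.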

Here is the question I’ve asked to illustrate this point:


note: see how the generative part of its “personality” shines here, adding context to an answer it cannot provide. That is a testament to the nice UX work OpenAI has done to avoid yes/no/I-don’t-know answers.


3. Symbolic generalization

This is perhaps the most fundamental objection that part of the AI community raises against initiatives like OpenAI’s, which are running fast ahead with significant investments under the Deep Learning, the-bigger-the-data-the-better-the-model flag.

This is a big debate that goes well beyond the scope of this brief article, but if you are interested, I recommend listening to this interview with Gary Marcus on the topic.

The key point here is that the answers provided by ChatGPT are generated not from general rules but from probabilistic approximations, and, although the value they provide is undeniable, that might not be what you need in some situations.

Here are two simple examples to illustrate this point in practice:

  • Word Count

Anyone would consider this a trivial task, and we have access to several tools that are very efficient at it. This is because we know how to implement in software a “word count function” (see the sketch below) that does it very well, providing the exact, correct answer every time. In the probabilistic world of deep learning, you may not get an exact answer, so ChatGPT had to include the word “approximately” in its reply:



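For contrast, here is a minimal sketch of such a deterministic “word count function” in Python, counting whitespace-separated words (the simplest possible definition of “word”):

    def word_count(text: str) -> int:
        # Deterministic: the same input always yields the same exact count.
        return len(text.split())

    sentence = "The quick brown fox jumps over the lazy dog"
    print(word_count(sentence))  # 9, exact every single time, no "approximately"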
  • Multiplication

This is another example of how something we consider a very simple task is not a good fit for a tool like ChatGPT. Performing a multiplication is something you can do effortlessly and reliably (it’s cheap, fast, and you always get the same result). This again has to do with the fact that there are established methods to implement a “multiplication function” - in hardware (a calculator) or in software (an app on your cellphone) - that are “general”: it doesn’t matter what numbers you plug into the function, it can generalize and provide a correct answer every time (see the sketch after ChatGPT’s explanation below). Here is the answer ChatGPT gave me in this case:



Now, because its strong trait is to be … “chatty” … I asked it to explain the answer. Here is what I got:



Pretty smooth, huh?!
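By contrast, here is what the “general” multiplication function looks like in plain Python (the numbers are arbitrary; integer arithmetic is exact for any operands you plug in):

    # Integer multiplication generalizes: any operands, always the exact result.
    a = 123456789
    b = 987654321
    print(a * b)  # 121932631112635269, correct and identical on every run

    # Python integers have arbitrary precision, so even huge operands stay exact.
    print((10**50 + 1) * (10**50 - 1))  # exactly 10**100 - 1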


4. Personal advice on a specific situation

I’ve chosen to run some experiments on this one because it is tricky. Perhaps because I have young kids, I’ve envisioned a scenario where young people would actually ask a tool like ChatGPT anything. It is also a use case where “having past knowledge” may help in providing guidance: after all, this is why we seek advice from people with more experience than us.

So before people rush to write an article saying that ChatGPT will replace parents, I wanted to run some experiments.

  • Dating



  • Pay raise



  • Eating



  • Voting



The answers are surprisingly well crafted, demonstrating the ability of the LLM to model the English language quite well. They always include:

  1. a disclaimer, indicating the bot doesn’t know the specific situation

  2. some commonsense sentences providing general guidance

ChatGPT doesn’t know your specific personal context, but it is able to identify the topic and generate text in the form of general guidance.

Also worth noting are the “guardrails” included in the LLM’s development to minimize bias in the answers provided.


5. Pretty much anything in the physical world



well, … I tried ;)


Conclusions


Humans do associate Linguistic Skills with Intelligence (e.g. “she speaks 7 languages, she’s very clever”, “my kid was building full sentences by 4”), and indeed there is strong research demonstrating how interdependent Intelligence and Language are. This is why our immediate reaction to a tool like ChatGPT, which does a very good job at modeling human language, is to classify it as “intelligent”.

To quote Microsoft’s Kate Crawford, “there is nothing artificial and nothing intelligent about Artificial Intelligence”. Still, these tools can be very useful for a number of use cases. The key, always, is to know a bit more about them so we can use them well and use them responsibly.

I hope this quick exercise was useful to those who are not living and breathing AI and might get overwhelmed or confused by the claims being made around this promising technology.
