Learn the Intuition behind LLM Hallucination

ML-Guy
7 min read · Dec 10, 2024

One of the main criticisms of large language model (LLM) technology is that the models (ChatGPT, Claude, Gemini, etc.) often “hallucinate” and provide incorrect information in their replies. In this post, we will develop an intuition for why these “hallucinations” happen and how we can prevent them.

What day is it anyway?

Let’s start with a simple example:

[Screenshot: “What day is it?” reply from GPT-4o]

If you check the calendar, you can see that Dec 10 (the day I asked the above question) was a Tuesday, not a Sunday. We can use this simple example to understand the root cause of such “hallucinations.”

LLMs are statistical models

What does it mean for a model to be a statistical model? When OpenAI or any other LLM provider trains an LLM, it feeds the model many examples of text from various sources such as Wikipedia, news sites, blog posts, books, and movie scripts. Training mostly relies on “unsupervised learning” techniques, which don’t require human labeling of the data. The simplest method is to hide one of the words in a sentence and train the model to predict the missing word. At the beginning of the training process, the model mostly guesses wrong, but after many examples and training cycles it learns which words are most likely to appear before or after other words in a sentence.
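To make this concrete, here is a minimal sketch of the idea, using a toy bigram model built from word co-occurrence counts. This is an illustration of “predicting the next word from statistics,” not how GPT-class models are actually implemented, and the tiny corpus below is made up for the example:

```python
# Minimal sketch: a toy bigram "language model" that predicts the next word
# purely from co-occurrence counts in its training text.
from collections import Counter, defaultdict

# Hypothetical toy corpus; real LLMs train on trillions of tokens.
corpus = [
    "today is sunday and the shops are closed",
    "today is sunday so we rest",
    "today is monday and work begins",
]

# Count which word follows which word in the training text.
next_word_counts = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for prev, nxt in zip(words, words[1:]):
        next_word_counts[prev][nxt] += 1

def predict_next(word):
    """Return the statistically most likely next word and its probability."""
    counts = next_word_counts[word]
    total = sum(counts.values())
    best, freq = counts.most_common(1)[0]
    return best, freq / total

print(predict_next("is"))  # ('sunday', 0.67) -- the most frequent answer, not the factual one
```

Notice that the model answers “sunday” simply because that word followed “is” most often in its training data. It has no calendar and no notion of today’s date; it only has statistics. That is, in miniature, the same reason the real model confidently gave the wrong day of the week above.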


Written by ML-Guy

Guy Ernest is the co-founder and CTO of @aiOla, a promising AI startup that closes the loop between knowledge, people & systems. He is also an AWS ML Hero.
