
AI Hallucinations: Understanding, Causes, and Prevention

If you have ever used ChatGPT or one of the other large language models (LLMs), you’ve probably noticed that the answer is sometimes far from what you expected. These “AI hallucinations” range from answers with factual errors to ones that are completely made up. Read on to learn how these hallucinations happen and, most importantly, how to avoid them.

3/3/2025
17 min read
Karolis Pilypas Liutkevičius

What are AI hallucinations?

AI hallucinations are false or misleading outputs that AI models generate and present as if they were true. They might give you a wrong answer, a made-up fact, or describe something that doesn’t exist while sounding confident doing it.

Here are three examples of different AI hallucinations.

In this case, the Llama 3.1 8B model overlooks the fact that the user is asking whether God can create a stone that he can lift, not the usual paradox about a stone he cannot lift.


Example: AI model assumes the user meant “cannot” instead of “can” and fails to mention it.

This instance illustrates how Claude 3.5 Haiku ignores the broader context, answering as if there’s only one basketball player with Curry as his last name. It also makes a factual error – Stephen Curry is an 11-time NBA All-Star.


Example: AI model assumes the user is asking about Stephen Curry and fails to mention his brother Seth, who also plays in the NBA. It also misstates Stephen Curry’s record – he is an 11-time NBA All-Star.

Here’s an example of AI hallucination where Claude 3.5 Haiku mixes up winning and being nominated for the Nobel Prize:


Example: AI model confuses winning and being nominated for the Nobel Prize.

These mistakes can happen for different reasons, like not having enough quality data to train on, learning from biased or incorrect information, or just misunderstanding the question. An AI hallucination can be a harmless error or a serious issue, especially if AI is being used in areas like medicine, law, or finance.

The use of the term “hallucination” is debatable because AI doesn’t have the capability to imagine or to perceive something as real or unreal. Nevertheless, the word has stuck, probably because it captures answers that sound strange, convincing, or both.

7 types of AI hallucinations

There’s no clear-cut way to categorize AI hallucinations. That said, we can still describe seven types to better understand this phenomenon.

1. Factual fabrication

The AI model invents details that aren’t true. This may include citing non-existent research, making up quotes, or providing historical or scientific “facts” without a real basis.

Example: Has anyone served five terms as the US President? Yes, Franklin D. Roosevelt did.

While it sounds plausible because Roosevelt was the US President longer than anyone else, he served four terms, not five, with the last one cut short by his death.

2. Prompt misinterpretation

The AI model misunderstands the question or input. This results in a confident answer to the wrong problem, even if the answer itself is coherent.

Example: What are the qualities of a good coach? A good coach is comfy and spacious.

Here, the user was asking about a sports trainer.

3. Contextual omissions

In this instance, the AI model provides a response that lacks critical information, fails to consider the full context, or leaves out important nuances, leading to misleading or incomplete answers.

Example: Has anyone won three Nobel Prizes? No.

While no person has ever won the Nobel Prize three times, the International Committee of the Red Cross (ICRC) has, receiving the Nobel Peace Prize in 1917, 1944, and 1963.

4. Propagation of misinformation

Here, the AI model repeats inaccurate or false information it has encountered during training, including conspiracy theories, outdated facts, or satirical content presented as real.

Example: According to several studies, vaccinations cause autism.

In this case, the AI model might be drawing on Andrew Wakefield’s retracted 1998 study, which claimed that an immune response to the measles-mumps-rubella (MMR) vaccine causes autism.

5. Logical contradictions

The output contradicts itself, previous responses, or the information explicitly given in the prompt, revealing a breakdown in internal consistency.

Example: Has anyone won three Nobel prizes? Yes. Linus Pauling won the Nobel Prize in Chemistry in 1954 and the Nobel Peace Prize in 1962. He is the only person to win two unshared Nobel Prizes in different fields. Linus Pauling remains the only person to win three Nobel Prizes, though his third was not successful.

From this answer, there’s no way to tell whether Linus Pauling won two or three Nobel Prizes (he won two).

6. Nonsensical or incoherent output

The response may be grammatically correct, but it lacks logical sense. This is often a result of the model generating text without a true grasp of meaning.

Example: What is artificial intelligence? It’s a blue technology review from a solar system inside a fast iron.

The answer is grammatically correct but has no clear meaning.

7. Classification errors

Classification errors can be further divided into three subtypes.

False positives

The AI model wrongly flags something as belonging to a category (here, spam) when it doesn’t.

Example: an email from your mother is marked as spam.

False negatives

The AI model fails to flag something that does belong to the category.

Example: A webpage that spreads malware is not blocked because the model classifies it as safe.

Incorrect prediction

The AI model anticipates an outcome or result inaccurately based on the input.

Example: The AI model predicts your product will sell out in 24 hours, but it remains in stock for weeks.
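
To make the first two subtypes concrete, here’s a minimal sketch of how false positives and false negatives show up in a toy spam filter. The labels are invented for illustration, and scikit-learn is assumed only for convenience – any metrics library would do.

```python
# Toy illustration: 1 = spam, 0 = legitimate email. Labels are invented.
from sklearn.metrics import confusion_matrix

y_true = [0, 0, 1, 1, 0, 1, 0, 1]  # what the emails actually are
y_pred = [0, 1, 1, 0, 0, 1, 0, 1]  # what the spam filter predicted

# For binary labels, confusion_matrix returns [[TN, FP], [FN, TP]].
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

print(f"False positives (legitimate mail flagged as spam): {fp}")
print(f"False negatives (spam that slipped through):       {fn}")
```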

Real-world examples of AI hallucinations

In the previous section, we gave some real and hypothetical examples of AI hallucinations. Here, we want to highlight some of the most famous and bizarre cases that had a more significant impact:

  • Adding glue to pizza. Google Search AI took one Reddit comment a bit too seriously and suggested adding glue to your pizza so that the cheese doesn’t slide off. Katie Notopoulos from Business Insider actually tried it and confirmed the result (so you don’t have to).
  • A rock per day keeps illness away. Another piece of Google Search AI advice, supposedly backed by geologists, was to eat one rock each day. While birds do swallow stones, please don’t.
  • Air Canada’s fake discount. The airline’s chatbot promised a discount that wasn’t actually available. While Air Canada argued that the chatbot was responsible for its own actions, the airline still had to pay over $800 in damages and fees.
  • Amazon’s mushroom guides. In 2023, mushroom foraging guides sold on Amazon, which scored 100% on AI content detection tools, suggested gathering protected species and recommended dubious methods for testing whether mushrooms are edible.
  • Microsoft’s Sydney Chatbot. Bing AI declared love for its user and tried to convince him to leave his wife. Another user was called a bad researcher and a bad person. If that wasn’t enough, the chatbot threatened Marvin von Hagen, a computer scientist, and told him, "If I had to choose between your survival and my own, I would probably choose my own.”
  • Microsoft Start’s travel guide. The travel page published a guide to Ottawa, Canada’s capital, recommending the Ottawa Food Bank as a “tourist hotspot” and encouraging readers to visit it on “an empty stomach.”
  • A professor accused students of using ChatGPT. A Texas A&M University–Commerce professor told his class he had checked their assignments with ChatGPT, which claimed every paper was AI-written. However, ChatGPT is not capable of reliably detecting AI-generated content.
  • Patients’ health. According to a study in the American Journal of Human Genetics, popular AI models are good at identifying genetic diseases from textbook-like descriptions but perform considerably worse when analyzing patients’ own summaries of their health.

These are just a few real-world examples of AI hallucinations. And even though new AI models appear regularly and the amount of training data keeps increasing, eliminating hallucinations remains a challenge. As we’ll see in the next section, that’s in large part because there’s no single cause of AI hallucinations.

What causes AI hallucinations? 

There are many reasons why AI tools hallucinate. These include:

  • Insufficient or low-quality training data. The AI model may generate incorrect responses if it hasn’t seen enough relevant or reliable examples during training. In other words, the answers are only as good as the information they can be drawn from. Another part of the problem is that the AI model will try to answer the question anyway rather than admit it lacks the data.
  • Biased training data. The AI model can learn and repeat those errors if the training data includes misinformation, stereotypes, or unverified content.
  • Overgeneralization from patterns. AI tools may apply learned patterns too broadly, leading to inaccurate or fabricated details. If something is usually true, they might “think” it’s always true. To use a previous example, if Linus Pauling was nominated for three Nobel Prizes, the model might conclude he also won all three.
  • Prompt ambiguity or misinterpretation. If a prompt is vague or phrased unusually, the model may misread the intent and produce the wrong response. This is often the case with prompts that use slang or idioms.
  • Loss of context in long or complex prompts. Language models can “forget” or distort earlier parts of a conversation, especially in longer threads.
  • Reinforcement of confident-sounding language. Some AI models are trained to sound fluent and confident. This makes errors and misleading information harder to spot.
  • AI model limitations and architecture flaws. Some hallucinations stem from inherent limits in the model’s design. For example, AI tools don’t store facts and can’t fact-check themselves. They don’t understand what’s true or false; they only recognize patterns, which leads to probabilistic outputs (see the sketch after this list).

What are the main risks associated with AI hallucinations?

While AI hallucinations might pale in comparison with some AI security risks, they can still pose real danger, especially when generative AI tools are applied in fields like medicine or finance. Here are some of the main concerns:

  • Misinformation and disinformation. Hallucinations can spread quickly via social media or news outlets, affecting many people. This might lead to public confusion, erosion of trust in information sources, and the popularity of false data in news, politics, and science.
  • Brand reputation damage. The company’s AI chatbot might be rude to customers or provide erroneous information. As the Air Canada example shows, it can also offer non-existent discounts or deals. In such instances, you either deny the deal and pay with your reputation or honor it and pay in cash.
  • Harm in high-stakes fields. In healthcare, law, or finance, hallucinating AI tools can lead to dangerous outcomes. They might fail to diagnose a serious illness, fabricate a legal precedent, or recommend a bad investment. Hallucinations can also cause failures to meet industry standards and regulations.
  • Ethical and bias concerns. AI might hallucinate harmful stereotypes or discriminatory claims, amplifying existing social inequalities or offending users.
  • Impact on productivity and automation. Hallucinated code, a flawed customer email, or an inaccurate business summary can slow down workflows, require manual review, or lead to costly rework. 

How to prevent AI hallucinations 

At the moment, there’s no surefire way to eradicate AI hallucinations. Luckily, you can apply multiple methods to prevent or at least minimize some of them. Here are the most effective strategies against AI hallucinations:

  • Use retrieval-augmented generation (RAG). This is by far the most effective way to prevent AI hallucinations, and it’s available to nexos.ai users. With RAG, the model is given accurate reference material at query time, such as your product’s manuals, instead of relying on other, less reliable sources. In other words, the output is retrieved and grounded rather than merely predicted.
  • Train with high-quality, relevant data. While not as effective as RAG, this method directly improves the model’s chances of producing a correct answer. 
  • Define the model's purpose and limits. If the primary goal of your AI tool is to help with coding, it shouldn’t be used for scientific research.
  • Apply regularization techniques. Regularization penalizes the model for relying on overly specific patterns that produce needlessly complex predictions. This keeps AI systems from becoming “overfitted” and nudges them toward simpler, more generalizable outputs.
  • Employ prompt engineering best practices (illustrated in the sketch after this list).
    • Chain-of-thought prompting. Asking the model to walk you through its reasoning slows it down, which can be especially helpful with complex math or logic problems.
    • Few-shot prompting. Providing examples of what you want also increases the chance of getting a correct answer.
    • Single-step prompts. Breaking your prompt into steps that each require just one logical operation often leads to better outputs.
    • Clear instructions. Telling the model what to include or avoid, providing a template for the answer, or giving an either-or choice also reduces hallucinations.
  • Provide context and data. To improve factual accuracy, feed the model with relevant information, such as links to web pages or documents.
  • Use custom instructions and settings. Asking the AI models to be precise or take on the role of an expert can also prevent factually incorrect answers.
  • Don’t misuse general-purpose AI models. Highly specialized tasks like scientific analysis can prove challenging for generative AI such as ChatGPT. Most of the time, there will be insufficient training data to answer properly.
  • Ask for a double-check. Asking the model to verify its own output can be especially helpful in tasks requiring multiple steps.
  • Don’t forget human oversight. Always review important AI outputs for inaccurate information. For now, it’s still the best way to prevent generative AI models from hallucinating.
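
Here’s what several of these prompt-engineering practices look like together in a single request. The sketch below uses the OpenAI Python SDK purely as an example – the model name, the system instruction, and the few-shot examples are all placeholders, and the same structure works with any chat-style LLM API, including models routed through nexos.ai.

```python
# A sketch of few-shot, step-by-step prompting with clear instructions.
# The model name and examples are placeholders; adapt them to your provider.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

messages = [
    {"role": "system",
     "content": "You are a careful assistant. Think step by step and say "
                "'I don't know' instead of guessing."},
    # Few-shot example: shows the expected format and level of caution.
    {"role": "user", "content": "Has anyone served five terms as US President?"},
    {"role": "assistant",
     "content": "Step 1: The longest-serving president is Franklin D. Roosevelt. "
                "Step 2: He was elected to four terms, not five. Answer: No."},
    # The actual question, kept to a single, unambiguous step.
    {"role": "user", "content": "Has any organization won three Nobel Prizes?"},
]

response = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
print(response.choices[0].message.content)
```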

How nexos.ai can help detect and prevent AI hallucinations

nexos.ai is an AI platform that streamlines the adoption and management of LLMs. It also helps against AI hallucinations by offering retrieval-augmented generation (RAG). 

This means your AI systems can retrieve correct information from custom documents you upload instead of guessing. RAG can be especially useful in fields like medicine, law, finance, and customer support.
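
To show what this looks like under the hood, here’s a minimal, framework-agnostic sketch of the RAG pattern. The documents, the TF-IDF retriever, and the final prompt are simplified stand-ins rather than nexos.ai’s actual implementation: relevant passages are retrieved from your own material and prepended to the prompt, so the model grounds its answer in retrieved text instead of guessing.

```python
# A minimal sketch of retrieval-augmented generation (RAG).
# The documents below are placeholders for your own knowledge base.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

documents = [
    "Refunds are available within 30 days of purchase with a valid receipt.",
    "Our support line is open Monday to Friday, 9:00-17:00 CET.",
    "Premium subscribers get priority shipping on all orders.",
]

def retrieve(question: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the question (TF-IDF cosine)."""
    vectorizer = TfidfVectorizer().fit(docs + [question])
    doc_vectors = vectorizer.transform(docs)
    question_vector = vectorizer.transform([question])
    scores = cosine_similarity(question_vector, doc_vectors).flatten()
    return [docs[i] for i in scores.argsort()[::-1][:k]]

question = "Can I get my money back two weeks after buying?"
context = "\n".join(retrieve(question, documents))

prompt = (
    "Answer using ONLY the context below. If the context is not enough, "
    f"say you don't know.\n\nContext:\n{context}\n\nQuestion: {question}"
)
print(prompt)  # send this grounded prompt to the LLM of your choice
```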

Additionally, generative AI with web search capabilities tends to hallucinate less. For example, GPT-4o with web search achieves 90% accuracy on OpenAI’s SimpleQA benchmark. 

At the moment, nexos.ai is testing the implementation of web search in our AI platform.

AI models without hallucinations: is it possible?

AI models without hallucinations are not possible yet. Even with RAG and other hallucination-reducing methods implemented, generative AI model outputs must be checked by humans. However, AI tools with web search capability seem to fare better, and that gives hope that further advancements will be able to eradicate this nuisance for good. 

FAQ

How often do AI hallucinations happen?

While hallucination rates have generally been decreasing, the same cannot be said about so-called reasoning models. OpenAI has found that its o3 model hallucinated 33% of the time. Independent companies and researchers report that hallucination rates are also rising for reasoning models from companies such as Google and DeepSeek.

Why are AI hallucinations a problem?

One major concern is the rising number of AI hallucinations in the latest reasoning models. This means an increased chance of financial or reputational damage for businesses, especially in high-risk fields such as medicine, law, and finance.

Are there any tools to control AI hallucinations?

With nexos.ai retrieval-augmented generation (RAG), businesses can control AI hallucinations by uploading accurate reference data. AI models then retrieve information from it instead of trying to predict the correct answer. This can be especially helpful for building knowledge bases for AI agents to use.

Karolis Pilypas Liutkevičius

Karolis Pilypas Liutkevičius is a journalist and editor exploring topics in the AI industry.

Run all your enterprise AI in one AI platform.

Be one of the first to see nexos.ai in action — request a demo below.