Information Technology

Hallucination rates for AI models

15 February 2025
Source: Vectara

Sometimes artificial intelligence (AI) models produce outputs that are not grounded in their training data, are decoded incorrectly by the transformer or do not follow any identifiable pattern. When a large language model (LLM), such as one underpinning a generative AI platform, delivers outputs that are nonsensical or inaccurate, the result is considered an AI hallucination.

These unrealistic outputs can be attributed to errors in encoding and decoding, high model complexity, and other factors. To help users guard against erroneous model output, generative AI technology developer Vectara identified the top 15 AI LLMs with the lowest hallucination rates. In the evaluation, each LLM summarized 1,000 short documents, and a hallucination-detection model then reported the percentage of summaries that were factually inconsistent with their source documents.
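To illustrate how such an evaluation can be scored, here is a minimal sketch, not Vectara's actual pipeline: `summarize` stands in for the LLM under test and `is_factually_consistent` for a hypothetical hallucination-detection model; both are assumptions for demonstration only.

```python
# Minimal sketch of a hallucination-rate evaluation (hypothetical, not Vectara's method).
# `summarize` stands in for the LLM under test; `is_factually_consistent`
# stands in for a hallucination-detection model. Both are placeholders.

from typing import Callable, Sequence


def hallucination_rate(
    documents: Sequence[str],
    summarize: Callable[[str], str],
    is_factually_consistent: Callable[[str, str], bool],
) -> float:
    """Return the percentage of summaries judged inconsistent with their source."""
    inconsistent = 0
    for doc in documents:
        summary = summarize(doc)                       # output of the LLM under test
        if not is_factually_consistent(doc, summary):  # detector model's verdict
            inconsistent += 1
    return 100.0 * inconsistent / len(documents)


if __name__ == "__main__":
    # Toy usage with trivial placeholder functions; replace with real model calls.
    docs = ["The bridge opened in 1932.", "The plant produces 500 units per day."]
    rate = hallucination_rate(
        docs,
        summarize=lambda d: d,                        # identity "summary" for the demo
        is_factually_consistent=lambda d, s: s in d,  # trivial substring check
    )
    print(f"Hallucination rate: {rate:.1f}%")
```

A lower percentage indicates that the model's summaries more faithfully reflect the source documents.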

Smaller or more specialized models, such as Zhipu AI GLM-4-9B-Chat, OpenAI-o1-mini and OpenAI-4o-mini, have some of the lowest hallucination rates, as does Intel's Neural-Chat 7B. Among foundational models, Google's Gemini 2.0 slightly outperforms OpenAI GPT-4, with a hallucination rate just 0.2% lower.

To contact the author of this article, email GlobalSpecEditors@globalspec.com

