News Analysis

Why the ChatGPT-4o Mini Model Matters More Than Ever

By Pierre DeBois
And why bigger isn’t always better when it comes to AI.

The Gist

  • Cost-efficient upgrade. ChatGPT-4o mini offers significant savings, with per-token costs more than 60% lower than GPT-3.5 Turbo's.
  • Benchmark excellence. Outperforms rivals in multimodal and mathematical reasoning evaluations.
  • Tech ecosystem booster. Mini model enhances OpenAI's competitive edge and enterprise applications.

Like other generative AI providers, OpenAI continues to find ways to advance its ChatGPT platform. But this time its biggest advancement is its smallest model release.

Last week, OpenAI unveiled the ChatGPT-4o mini, a compact model praised for its cost-efficient AI performance. Set to replace the GPT-3.5 Turbo, it becomes the smallest model available from OpenAI. Consumers can access ChatGPT-4o mini through ChatGPT's web and mobile apps, while developers can incorporate it into their AI projects. The model has officially launched, with enterprise users gaining access this week.

The introduction of ChatGPT-4o mini furthers the trend toward smaller AI model applications and accelerates AI development for mobile devices.

Image: A set of traditional Russian Matryoshka dolls arranged from largest to smallest, echoing how ChatGPT-4o mini scales down in size but not in capability. Delphotostock on Adobe Stock Photos

Related Article: OpenAI’s GPT4o: Smarter, Faster — and It Speaks 

The Key Specs of ChatGPT-4o Mini

OpenAI is touting ChatGPT-4o mini as proof of its commitment to making artificial intelligence available “as broadly as possible” by expanding the range of applications that incorporate AI.

Mini Improvements

ChatGPT-4o mini features the same improved tokenizer as GPT-4o. It adds a context window that supports up to 128K tokens and up to 16K output tokens per request. Its knowledge is also more current: prompt responses reflect event knowledge up to October 2023. ChatGPT-4o mini can also handle non-English text.
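As a rough illustration of those limits, a request budget can be sketched as follows. The constants come from the figures above; the assumption that output tokens count against the context window is typical for models of this kind but is not stated in the article.

```python
# Limits cited for ChatGPT-4o mini.
CONTEXT_WINDOW = 128_000  # total tokens in the context window
MAX_OUTPUT = 16_000       # maximum output tokens per request

def request_fits(prompt_tokens: int, requested_output: int) -> bool:
    """Illustrative check that a request stays within the stated limits.

    Assumes output tokens count against the context window (an
    assumption here, not a claim from the article).
    """
    if requested_output > MAX_OUTPUT:
        return False
    return prompt_tokens + requested_output <= CONTEXT_WINDOW

print(request_fits(100_000, 16_000))  # True: 116K total fits in 128K
print(request_fits(120_000, 16_000))  # False: 136K exceeds the window
```

A real integration would count tokens with the model's tokenizer rather than assume a count, but the budgeting logic is the same.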

Related Article: What Is ChatGPT? Everything You Need to Know

Higher Scores

The result is greater capacity: improved textual intelligence and multimodal reasoning that exceed current benchmark performance, according to OpenAI. ChatGPT-4o mini scored 59.4% on MMMU, a multimodal reasoning evaluation. This was higher than its main rivals, Gemini Flash (56.1%) and Claude Haiku (50.2%).

Chart: Accuracy scores of GPT-4o mini, Gemini Flash, Claude Haiku, GPT-3.5 Turbo and GPT-4o across multiple evaluation benchmarks, with GPT-4o mini performing competitively across most of them. Source: OpenAI

ChatGPT-4o mini also scored higher than its competitors on MGSM, a math reasoning benchmark, at 87.0% compared to 75.5% for Gemini Flash and 71.7% for Claude Haiku. It scored slightly lower than the larger GPT-4o model on the accuracy measures, but it significantly outperformed GPT-3.5 Turbo in each category.

Related Article: ChatGPT Turns 1: A Year of Innovation, Controversy and AI Breakthroughs

Pushing the Small Language Model Boundary

In its marketing of ChatGPT-4o mini, OpenAI has highlighted that the model pushes the small language model boundary on affordability as well. OpenAI claims the overall cost per token is more than 60% cheaper than that of GPT-3.5 Turbo. Developers typically pay 15 cents per 1M input tokens and 60 cents per 1M output tokens; OpenAI estimates that 1 million tokens is roughly equivalent to 2,500 pages in a standard book. The blend of affordability and increased multimodal capacity is a significant attraction for developers seeking to adopt small language models to reduce training and development costs.
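The arithmetic behind those rates is straightforward. The helper below is a hypothetical sketch using only the per-token prices cited above; it is not an official OpenAI billing calculation.

```python
# Rates cited in the article: $0.15 per 1M input tokens,
# $0.60 per 1M output tokens.
INPUT_RATE_USD_PER_M = 0.15
OUTPUT_RATE_USD_PER_M = 0.60

def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate a request's cost in USD from token counts at the cited rates."""
    return (input_tokens / 1_000_000) * INPUT_RATE_USD_PER_M + \
           (output_tokens / 1_000_000) * OUTPUT_RATE_USD_PER_M

# Example: a full 128K-token prompt with the maximum 16K-token output
# comes to under three cents at these rates.
print(f"${estimate_cost_usd(128_000, 16_000):.4f}")  # $0.0288
```

At that price point, even a request that fills the entire context window costs only a few cents, which is the affordability argument in concrete terms.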

Related Article: ChatGPT Is All the Rage but Don't Stop Learning Just Yet

Staying Competitive: Keeping Up With The AI Joneses

All of this plays into the trend toward providing a multimodal large language model (MLLM) to users, a trend OpenAI must address to stay competitive. The interest in small language models has been bubbling among AI developers since AI platforms arrived in the consumer marketplace.

Current AI Solutions

The current AI solutions, like Claude, Gemini and ChatGPT, are based on foundation models, a type of large-scale machine learning model created from a broad training data set. Foundation models introduced a new querying paradigm, shifting AI away from models trained on task-specific data for a narrow range of functions and toward general-purpose models. The result was more adaptability and easier fine-tuning for a variety of applications and downstream media tasks.

Developers’ Aims

But training foundation models requires a large amount of memory, creating a huge expense and a daunting computational capacity to execute model training. 

Thus, as performance gains continue, developers aim to deploy small language models that maintain performance and adaptability with less training data and lower computational requirements.

Any tech company thinking about AI has a significant interest in multimodal language models operating from within smart devices. When I reported on Apple's Ferret LLM, the personal computer maker's first open-source AI foray for developers, I noted its small LLM version, which was built with iOS device applications in mind. Having an in-house AI framework available for its smartphones and tablets would strengthen Apple's tech ecosystem: it would give developers a way to build AI-based applications more quickly for its device lineup and provide a means to integrate application features across devices.


For OpenAI, the launch of a mini version of ChatGPT will provide the company with a similar tech ecosystem advantage — one that marketers working on AI initiatives should monitor as the AI tech space evolves.

About the Author
Pierre DeBois

Pierre DeBois is the founder and CEO of Zimana, an analytics services firm that helps organizations achieve improvements in marketing, website development and business operations. Zimana has provided analysis services using Google Analytics, R Programming, Python, JavaScript and other technologies where data and metrics abide.

Main image: Straxer generated with AI