Researchers Determine Google Gemini Falls Short of GPT-3.5 Turbo in Performance

By Yasmeeta Oon

Dec 22, 2023

In a recent study, researchers from Carnegie Mellon University and BerriAI have scrutinized Google’s latest AI offering, Gemini Pro, comparing its performance with OpenAI’s GPT-3.5 Turbo. Despite high expectations following a polished demonstration video, Gemini Pro has been found to lag slightly behind GPT-3.5 Turbo in various tasks. This comes as a surprise considering that OpenAI’s model is older and available for free, while Google’s Gemini has been in development for months.

OpenAI users, including those with ChatGPT Plus and Enterprise subscriptions, have been using GPT-4 and GPT-4V (the multimodal version) for a significant part of the year, which may have set a high benchmark for Gemini Pro. The research paper, titled “An In-depth Look at Gemini’s Language Abilities” and published on arXiv.org, found that, as of December 19, 2023, Gemini Pro’s accuracy was slightly below that of GPT-3.5 Turbo.

Google, however, stands by its product. In response to these findings, a Google spokesperson claimed that their internal research indicates Gemini Pro outperforms GPT-3.5. They also mentioned an upcoming version, Gemini Ultra, which they say surpasses GPT-4 in Google’s tests.

The researchers also examined Gemini’s performance in specific areas. They noted that Gemini showed a tendency to choose the ‘D’ option in multiple-choice questions, unlike the more balanced approach of GPT models. This could suggest a lack of instruction tuning for such questions in Gemini. Furthermore, Gemini struggled with questions on human sexuality, formal logic, elementary math, and professional medicine, partly due to its restrictive content guidelines.
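A label bias of the kind described above can be checked by tallying how often a model picks each answer letter. The snippet below is a minimal sketch, not the researchers' code: the `label_distribution` helper and the sample predictions are hypothetical and exist only to illustrate the idea.

```python
from collections import Counter


def label_distribution(predictions, labels=("A", "B", "C", "D")):
    """Return the fraction of predictions that fall on each answer label."""
    counts = Counter(predictions)
    total = len(predictions)
    if total == 0:
        # No predictions: report zero for every label instead of dividing by zero.
        return {label: 0.0 for label in labels}
    return {label: counts.get(label, 0) / total for label in labels}


# Made-up predictions for illustration only; this is not data from the study.
example_predictions = ["D", "A", "D", "D", "B", "D", "C", "D"]

distribution = label_distribution(example_predictions)
most_common = max(distribution, key=distribution.get)

print(distribution)
print("Most frequently chosen label:", most_common)
```

If the correct answers are spread roughly evenly across the four letters, a model without label bias should produce a similarly even distribution, so a share far above 25 percent for one letter would be a sign of the skew the researchers describe.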

The study also included comparisons with other models such as GPT-4 Turbo and Mixtral 8x7B, an open-source model from Mistral. The models were queried through LiteLLM, a tool from BerriAI that provides a single interface to multiple AI model providers, using various prompts across different subjects. The findings offer a detailed look at the current landscape of large language models and their capabilities.
