Did Google Outperform GPT-4? Introducing Gemini Ultra

A few days back, Google released its highly anticipated LLM, Gemini Ultra, along with its ChatGPT competitor, Gemini Advanced. Does it herald a new era in the ongoing race for Generative AI supremacy, or is it poised to be another misfire?


Going back to the start: Google claimed it had created an AI superior to GPT-4, and people naturally wanted proof. Google eventually released a mind-blowing video in which people held a human-like conversation with Gemini Ultra. It later turned out that the video was mostly staged, well, sort of. In contrast to the seamless voice interaction depicted in the video, the demonstration did not involve Gemini responding in real time to its environment. Rather, the video was crafted by assembling still image frames from the footage and instructing Gemini through text prompts. The AI was interpreting human-written prompts while being shown static images. On top of that, the video's description carried a disclaimer: 'For the purposes of this demo, latency has been reduced and Gemini outputs have been shortened for brevity.'

As of today, Google has deprecated Bard; it's now called Gemini. To access the cutting-edge Gemini Ultra model, you'll need to pay. This raises a moral dilemma: should I support Sam Altman by buying a ChatGPT subscription, or pay Google to become part of their surveillance-capitalism product by purchasing Bard Advanced, or perhaps Bard Ultra? Oh, wait, my mistake: it's Gemini Ultra. To be precise, it's Gemini Advanced featuring the Gemini Ultra model. Apparently naming products is the hardest part of product management. Gemini Advanced, with the Gemini Ultra model, is available for $20 a month through the Google One plan, which also includes 2TB of cloud storage and Google Workspace features.

Now let's find out if it's really better than GPT-4.

Firstly, the most noticeable aspect is the speed of Gemini's answers. Reports suggest it's at least two or three times faster than GPT-4. That's a nice feature to have, but what matters most is response quality. Google released a statement explaining that it initially held Gemini Ultra back from the public due to safety concerns. Rest assured, Gemini is the safest and the wokest AI we've ever witnessed. The guardrails are strong with this one.

As developers, one of the reasons we use these large language models is reading and writing code. Both models excel at reading and analyzing code, making it perhaps a tie in this domain. But can they write code? It's noteworthy that Gemini has a context length of 32k tokens, whereas GPT-4 Turbo boasts 128k. In theory, this means GPT-4 could write better code within the context of a massive codebase, but in practice it's not that straightforward. Gemini provides links to relevant code when generating a result, creating a sense of transparency about the data used to train the model that ChatGPT doesn't offer.
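To make the context-length difference concrete, here's a minimal sketch of checking whether a codebase would even fit in each model's window. It uses the rough four-characters-per-token heuristic; real tokenizers give exact counts, so treat the numbers as illustrative assumptions.

```python
# Rough check: would a codebase fit in a model's context window?
# Assumes the common ~4 characters-per-token heuristic (an approximation,
# not an exact tokenizer).

def estimated_tokens(text: str) -> int:
    """Estimate the token count from character length."""
    return len(text) // 4

def fits_in_context(text: str, context_tokens: int) -> bool:
    """True if the text likely fits in the given context window."""
    return estimated_tokens(text) <= context_tokens

# A toy "codebase": 30,000 short lines, roughly 45k tokens.
codebase = "x = 1\n" * 30_000

print(fits_in_context(codebase, 32_000))   # Gemini-sized window  -> False
print(fits_in_context(codebase, 128_000))  # GPT-4 Turbo window   -> True
```

In other words, a moderately large project already overflows a 32k window while still fitting comfortably in 128k, which is the theoretical edge discussed above.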

The next question is: can it run code? Like ChatGPT, it can run basic Python scripts, but GPT-4 is far better at it. You can't attach data such as a CSV file and have Gemini analyze it; instead, the code can be exported to Google Colab or Replit to run in the cloud. Another big advantage GPT-4 has is its new marketplace, where developers can extend it with their own plugins. Gemini also has extensions, but they're not open to developers yet; they're all Google-based. Still, they can do some cool stuff, like finding you cheap flights via Google Flights, or, if you're sick of watching a YouTube video, letting you paste in any YouTube URL and get a summary of what was said.

In addition to the points mentioned earlier, Gemini falls short where logical reasoning is required. For instance, when confronted with a well-known weight riddle in a modified form, asking whether one pound of steel is heavier than two pounds of wool, Gemini gives an incorrect answer. Despite the slight alteration in the problem statement, Gemini errs by suggesting that both weigh the same, falling back on the traditional trivia it was likely trained on.
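The riddle is plain arithmetic once the units are read carefully: weight is weight, and density is a red herring. A two-line check makes the correct answer, which Gemini misses, obvious:

```python
# The modified riddle compares ONE pound of steel with TWO pounds of wool.
# Density doesn't matter when both quantities are stated in pounds.

steel_lb = 1
wool_lb = 2

if wool_lb > steel_lb:
    answer = "wool is heavier"
elif steel_lb > wool_lb:
    answer = "steel is heavier"
else:
    answer = "they weigh the same"

print(answer)  # -> wool is heavier
```

The classic version of the riddle uses equal weights, which is presumably why the model defaults to "they weigh the same" even after the numbers change.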

Well, I would say Gemini is still naive in some respects and needs polishing and further training. It's almost astonishing that GPT-4, built on the Transformer architecture produced by researchers at Google, still has the upper hand over Google's latest tech. To be fair, Gemini is quite young; to come on par with GPT-4, it still needs its share of iteration and rework.


Quite fascinating, but the pivotal question remains: Will Gemini truly outshine ChatGPT?

While not groundbreaking, Gemini stands at least equal to, if not slightly ahead of, ChatGPT. However, it's certain that Sam and Satya won't take this lightly, and now we anticipate the arrival of GPT-5.