Could Google overthrow OpenAI’s ChatGPT dominance with Gemini AI?

Google’s Meena model, once the best large language model in the world, sparked an internal memo predicting the increasing integration of language models into our lives. However, Google’s model was soon surpassed by OpenAI’s GPT-3. Google is now expected to surpass GPT-4’s total pre-training FLOPS by 5x before the end of the year.

A recent article published by SemiAnalysis also discusses the disparity between GPU-Rich and GPU-Poor entities, with the former having access to thousands of GPUs and the latter struggling with far fewer. Nvidia is highlighted as a dominant player with its DGX Cloud service. The article suggests that Google could challenge Nvidia’s dominance with its efficient infrastructure and advanced chips. Could Gemini AI upset the pecking order?

Breaking the Dominance of OpenAI and ChatGPT?

The post, by Dylan Patel and Daniel Nishball of research firm SemiAnalysis, argues that Google’s anticipated Gemini AI model looks ready to blow OpenAI’s AI model out of the water. They claim it does this by packing in a lot more computing power.

Based on data from a Google supplier, the crux of the analysis is that Google has access to vastly more top-flight chips, and that its model will outdo GPT-4 on FLOPS (floating-point operations), a measure of the raw compute used to train a model.

What is Gemini AI?

Google’s research lab, DeepMind, is developing a large language model, Gemini AI, expected to outperform OpenAI’s ChatGPT. CEO of DeepMind, Demis Hassabis, revealed that the development cost of Gemini AI is in the hundreds of millions. The model is being developed using techniques from AlphaGo, a previous AI program by DeepMind that defeated a champion Go player. Gemini AI is expected to replace Google’s current AI model, PaLM 2, and will be used in Google’s AI services. The model is designed to be multimodal, efficient at tool and API integrations, and built for future innovation.

Google’s objective with Gemini AI is not to merely replicate existing models like GPT-4. Instead, the focus is on delivering superior capabilities. Gemini AI is expected to leverage advancements in reinforcement learning to address the challenges that current language models face. Reinforcement learning involves providing rewards for desired behaviours and applying punishments for undesired ones, enabling the system to learn and exhibit appropriate behaviours in specific situations.
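The reward-and-punishment loop described above can be illustrated with a toy example. The sketch below is a minimal multi-armed bandit in plain Python; the action set, reward values, and learning rate are invented for illustration and say nothing about how Gemini AI is actually trained:

```python
import random

# Reward-driven learning in miniature: the agent keeps a value estimate
# per action, mostly picks the best-known action (with occasional
# exploration), and nudges estimates toward the observed reward
# (+1 for the desired behaviour, -1 for undesired ones).

def train_bandit(desired_action=2, n_actions=4, steps=2000,
                 lr=0.1, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    values = [0.0] * n_actions
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)  # explore
        else:
            action = max(range(n_actions), key=lambda a: values[a])
        reward = 1.0 if action == desired_action else -1.0
        values[action] += lr * (reward - values[action])
    return values

values = train_bandit()
best = max(range(len(values)), key=lambda a: values[a])
```

After training, `best` settles on the rewarded action: the system has learned the appropriate behaviour purely from the reward signal.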

Gemini AI could incorporate transfer learning techniques, allowing it to leverage knowledge gained from one conversational domain to improve its performance in another. This could lead to the creation of more adaptable and versatile conversational agents.
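The transfer idea can be sketched with a deliberately tiny example: parameters learned on a "source" task are reused as the starting point for a related "target" task, so the target task converges with far fewer updates. The tasks, model, and numbers below are invented for illustration; real transfer learning reuses neural-network weights, not a one-parameter regression:

```python
# Toy transfer learning: fit y ≈ w * x by gradient descent, then reuse
# the learned w as initialisation for a closely related task.

def train(w, data, lr=0.1, steps=100):
    for _ in range(steps):
        for x, y in data:
            w -= lr * (w * x - y) * x  # gradient step on squared error
    return w

source = [(1.0, 2.0), (2.0, 4.0)]   # source task: y = 2.0 * x
target = [(1.0, 2.2), (2.0, 4.4)]   # related target task: y = 2.2 * x

w_source = train(0.0, source, steps=50)
# Transfer: start the target model from the source weights.
w_transfer = train(w_source, target, steps=5)
# Baseline: train the target model from scratch with the same budget.
w_scratch = train(0.0, target, steps=5)
```

With the same five-step budget, the transferred model lands much closer to the target solution than the from-scratch one, which is the essence of leveraging knowledge from one domain in another.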

Additionally, Gemini AI could be enhanced to engage in dialogue-based gaming. It could play interactive text-based games with users, dynamically adapting its responses based on the game state and user inputs, providing a challenging and immersive gaming experience.
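A minimal sketch of what "adapting responses to the game state" means in practice: each turn, the reply depends on both the user's input and the current state, which the input also updates. The guessing game below is purely illustrative and is not how Gemini AI is implemented:

```python
# One turn of a dialogue-based game: the reply is a function of the
# current game state and the user's input, and the state advances.

def step(state, user_input):
    """Return (reply, new_state) for one turn of a number-guessing game."""
    guess = int(user_input)
    turns = state["turns"] + 1
    if guess == state["secret"]:
        return f"Correct in {turns} turns!", {**state, "turns": turns, "done": True}
    hint = "higher" if guess < state["secret"] else "lower"
    return f"Try {hint}.", {**state, "turns": turns}

state = {"secret": 7, "turns": 0, "done": False}
reply, state = step(state, "3")   # → "Try higher."
reply, state = step(state, "7")   # → "Correct in 2 turns!"
```

A language model playing such a game would generate the replies itself, but the loop is the same: state in, adapted response out.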

Google’s Future Plans

Despite having the potential to lead in the field, Google did not fully utilise its resources. However, it is predicted that Google is now making strides and is expected to surpass GPT-4’s total pre-training FLOPS by five times before the end of the year. The company’s current infrastructure buildout suggests a clear path to achieving 20 times the current capacity by the end of next year.

The discussion now revolves around Google’s training systems for Gemini, the iteration velocity for Gemini models, Google’s Viperfish (TPUv5) ramp, and Google’s competitiveness against other frontier labs. However, whether Google will release these models publicly without compromising their creativity or existing business model remains to be seen.

The GPU-Rich and GPU-Poor

Access to computing power, or “compute”, follows a bimodal distribution. A few firms have 20k+ A/H100 GPUs, and their individual researchers can access hundreds or thousands of GPUs for pet projects. However, startups and open-source researchers struggle with far fewer GPUs, often spending significant time and effort on tasks that may not be beneficial or relevant, such as fine-tuning models with GPUs that don’t have enough VRAM.

These GPU-poor researchers often use larger LLMs to fine-tune smaller models for leaderboard-style benchmarks, which may not accurately reflect a model’s usefulness or accuracy. They often overlook that pretraining datasets and instruction fine-tuning (IFT) data need to be significantly larger and higher quality for smaller open models to improve on real workloads.

Google, despite using GPUs internally and selling a significant number through GCP, has several advantages that could make it the most compute-rich firm in the world. These include Gemini and its next iteration, which has already begun training. Its most important advantage is its unbeatably efficient infrastructure.

Google’s quarter-over-quarter growth in advanced chips is significant. Even giving OpenAI every benefit of the doubt, Google’s growth is impressive, and that figure only includes TPUv5 (Viperfish); it excludes Google’s entire existing fleet of TPUv4 (Pufferfish), TPUv4 lite, and internally used GPUs.

Key Message

Google’s Meena model, a large language model, was briefly the best in the world, outperforming OpenAI’s GPT-2, but was soon surpassed by OpenAI’s GPT-3. Despite this, Google is predicted to significantly increase its model training capacity, potentially outpacing competitors by the end of the year, according to the authors at SemiAnalysis.