Google’s next-generation AI model “Gemini” to launch this fall [that can generate images and text]

byJoaquim Cardoso

8 de setembro de 2023

3 minute read

the health strategist

institute for
strategic health transformation
and digital health
[digital connected, data driven & AI powered]

Joaquim Cardoso MSc.

Chief Research Officer (CRO), Chief Editor,
Chief Strategy Officer (CSO), and
Independent Senior Advisor

What is the message?

Google is set to launch a next-generation AI model called “Gemini” in the fall.

Gemini is described as a multimodal AI model that can generate images and text and has been trained on YouTube video transcripts.

Additionally, Gemini boasts remarkable coding capabilities, raising the bar for AI performance across various domains.

It is expected to compete with OpenAI’s GPT-4 and will be made available to AI app developers.

Key Google executives and teams are involved, including Deepmind and Google Brain, in the development of Gemini.

The model’s has a very large scale, with at least a trillion parameters and uses tens of thousands of Google’s TPU AI chips in its training.

One page summary

In a groundbreaking development set to reshape the AI landscape, Google is gearing up to launch its next-generation AI model, “Gemini,” in the upcoming fall.

This revelation, reported by The Information and attributed to an anonymous source closely involved in Gemini’s development, marks a significant milestone in artificial intelligence and its applications.

Gemini is described as a cutting-edge multimodal AI model, poised to take on OpenAI’s GPT-4.

What sets Gemini apart is its versatility; it is capable of generating both images and text, a feat that opens new horizons for AI-powered applications.

Notably, Gemini has been trained on YouTube video transcripts, hinting at its potential to produce simple videos, similar to existing tools like RunwayML Gen-2 or Pika Labs.

Additionally, Gemini boasts remarkable coding capabilities, raising the bar for AI performance across various domains.

One key aspect of this development is the integration of Gemini into Google’s ecosystem.

Google plans to gradually incorporate Gemini into its products, including the Bard chatbot and Google Docs or Slides.

Furthermore, the company intends to make Gemini accessible to external developers through the Google Cloud, expanding its reach and potential applications.

The magnitude of this project is evident in the significant workforce dedicated to its development.

According to The Information, a team of at least two dozen executives is actively involved in crafting Gemini.

This team, comprised of experts from Google Brain and DeepMind, spans several hundred employees.

Notably, DeepMind, a vital contributor to the project, underwent a merger and continues to refine its operational policies and technological approaches.

Leading this formidable team is DeepMind founder Demis Hassabis, supported by two DeepMind executives, Oriol Vinyals and Koray Kavukcuoglu, as well as former Google Brain chief Jeff Dean.

Surprisingly, even Google co-founder Sergey Brin is reportedly contributing to the development of Gemini, participating in the training and evaluation of the model.

However, the journey to Gemini’s launch has not been without its challenges.

The model’s training materials have undergone meticulous scrutiny by Google’s legal department, leading to the removal of copyrighted content from training data. Additionally, Gemini was inadvertently exposed to “offensive” content during training, necessitating a partial re-training of the model.

Gemini was officially unveiled in May, and early speculations hinted at its colossal scale, with a staggering trillion parameters.

Training this behemoth reportedly involved tens of thousands of Google’s TPU AI chips, underlining the resource-intensive nature of the project. In a statement, Gemini CEO Demis Hassabis hinted at the model’s potential to merge the strengths of AlphaGo-like systems with the remarkable language capabilities of large models while introducing innovative features.

To summarize

Google’s imminent launch of the “Gemini” AI model signifies a major stride in the field of artificial intelligence.

Its versatility, combined with Google’s commitment to integration and the involvement of top-tier talent, positions Gemini as a formidable competitor in the AI landscape.

As it becomes accessible to external developers, its impact on various industries and applications is eagerly anticipated, promising a transformative future for AI technology.

Based on the article “Google’s next-generation AI model “Gemini” to launch this fall”, published on The Decoder, and authored by Matthias Bastian, on
September 2, 2023

Originally published at https://the-decoder.com on September 2, 2023.

Names mentioned

Leading this formidable team is DeepMind founder Demis Hassabis, supported by two DeepMind executives, Oriol Vinyals and Koray Kavukcuoglu, as well as former Google Brain chief Jeff Dean.

Surprisingly, even Google co-founder Sergey Brin is reportedly contributing

xxx

Author

Joaquim Cardoso

Deixe um comentário Cancelar resposta

The Growing Costs and Opportunities of Generative AI

the health strategist institute forhealth transformationand digital health strategy Joaquim Cardoso MSc. Chief Research and Strategy Officer (CRSO),Chief Editor…

byJoaquim Cardoso

Addressing the 9 Challenges of Generative AI

the health strategistinstitute for strategic health transformation & digital technology Joaquim Cardoso MSc. Chief Research and Strategy Officer (CRSO),Chief Editor and…

byJoaquim Cardoso