Google unveiled Gemini 1.5 Pro in mid-February, surprising AI fans with a massive upgrade to its large language model (LLM). Gemini Pro powers the free Gemini product that anyone can access. Gemini Ultra is the version you have to pay for, via a Google One subscription.

Gemini 1.5 Pro is already as powerful as Ultra and recently got a significant upgrade: a context window of up to 1 million tokens. That means you can feed it prompts of around 700,000 words, more than 30,000 lines of code, 11 hours of audio, or 1 hour of video.

Fast-forward to mid-April, and Google announced that Gemini 1.5 Pro is available for testing to enterprise users via the Vertex AI development platform. The testing will include support for audio files in prompts, which is an amazing feature for a genAI product to have. Unfortunately, not everyone has access to Gemini 1.5 Pro yet.

Those lucky enough to test Gemini 1.5 Pro will be able to upload audio files of any kind and ask the AI for information based on those files. As someone who has been using an app powered by OpenAI's Whisper speech-recognition model to transcribe audio files, I’ll say this Gemini 1.5 Pro feature is something I want to see from other genAI products.


Support for audio files opens up so many doors. I already rely on audio transcription for interviews and video calls, as it significantly improves my ability to recall details. Being able to feed audio straight into a prompt obviously makes transcription easier, too.

