HOVSOL Technologies

AI Innovation

Meet VEO 2: Google’s Answer to OpenAI Sora

Published by HOVSOL Technologies | December 20, 2024

If 2023 was the year of text-to-image AI, 2024 surely belongs to text-to-video AI.

Just days after OpenAI stunned the world by officially launching Sora, Google DeepMind announced Veo 2, its latest text-to-video AI update, on Monday, Dec 16.

Veo 2 is already being hailed as a groundbreaking competitor to existing text-to-video generation models like Meta’s Movie Gen, Kling AI, and OpenAI’s Sora. In fact, many are considering it a cut above the rest.

Though still in limited release, it’s sparking excitement in the tech world with its ability to generate ultra-realistic 4K videos from simple text or image prompts.

But what makes VEO 2 stand out? And why is it generating so much buzz?

Let’s take a deep dive into its features, capabilities, and what it signals for the future of AI-powered creativity.

What is Google VEO 2?

Veo 2 is the latest iteration of Veo, Google’s text-to-video generation AI model.

Veo was announced on May 14, 2024, during Google’s I/O event. 

VEO 2 is a breakthrough in AI-powered text-to-video technology developed by Google DeepMind. This advanced tool turns simple text prompts or visual references into high-quality, ultra-realistic videos. 

Built on previous models like Imagen Video and Lumiere, VEO 2 offers an unmatched level of creative control. It can produce videos with cinematic effects like time-lapse or slow motion. 

VEO 2 also lets users fine-tune or extend a generated video through follow-up prompts.

What Features Set VEO 2 Apart?

  • VEO 2 generates 4K resolution videos, a significant leap from the earlier 1080p limit
  • Interprets prompts in impressive detail and generates lifelike movement
  • Unlike OpenAI’s Sora, which caps videos at 20 seconds, VEO 2 can create clips longer than one minute 
  • Unparalleled creative control, where users can adjust camera angles, lighting, transitions, and even character expressions
  • Fully compatible with Google Workspace
  • Avoids issues like flickering or distorted objects
  • Uses latent diffusion models, which work on a compressed representation of the video frames to speed up generation
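
To put the resolution and clip-length figures above in perspective, here is a small Python sketch. It assumes "4K" means UHD (3840x2160), which the announcement figures quoted here do not spell out, and it reuses the 20-second and one-minute numbers from the list. The function names are illustrative only.

```python
# Frame sizes in pixels. "4K" is assumed to mean UHD (3840x2160).
RESOLUTIONS = {"1080p": (1920, 1080), "4K UHD": (3840, 2160)}


def pixel_count(name: str) -> int:
    """Return the number of pixels in one frame at the named resolution."""
    width, height = RESOLUTIONS[name]
    return width * height


def clips_needed(target_seconds: int, max_clip_seconds: int) -> int:
    """Return how many clips of at most max_clip_seconds cover target_seconds."""
    return -(-target_seconds // max_clip_seconds)  # ceiling division


if __name__ == "__main__":
    ratio = pixel_count("4K UHD") / pixel_count("1080p")
    print(f"4K UHD carries {ratio:.0f}x the pixels of 1080p per frame")
    print(f"Clips needed for 60 s at a 20 s cap: {clips_needed(60, 20)}")
```

In other words, a 4K frame holds four times as many pixels as a 1080p frame, and a one-minute sequence takes three clips under a 20-second cap.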

Aiding Creative Professionals 

To showcase VEO’s capabilities and address concerns about AI replacing human creativity, Google teamed up with actor and filmmaker Donald Glover in mid-May 2024.

In a video shared by Google, Glover can be seen working with his production team and Google’s team on a short film made with Veo.

They feed the model multiple prompts to produce various movie scenes with different angles and aesthetics. The short film is said to be released soon.

How Does VEO 2 Work?

Google VEO 2 combines a number of advanced diffusion models and GANs for lifelike motion and textures. It uses Google’s TPU cloud infrastructure to deliver videos faster. 

While it’s not yet known what data Google used to train the model, the datasets were evidently vast, given how faithfully it replicates realistic motion, lighting, and physics.

Here’s how you can get your hands on Veo 2:

  1. Sign up on the VideoFX page (if it is supported in your country) to get access notifications
  2. Complete registration with your email and password
  3. Access the dashboard for all tools and features
  4. Describe your video idea for precise results
  5. Adjust lighting, effects, and cinematic details
  6. Review the video, refine as needed, and export the final version
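
For steps 4 and 5, it helps to draft the prompt in parts before pasting it into the tool. The sketch below is a hypothetical helper for organizing your own prompt text, not part of VideoFX or any Google API. The `VideoPrompt` class and its field names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class VideoPrompt:
    """Hypothetical container for the parts of a text-to-video prompt."""

    shot: str           # camera framing and lens, e.g. "Low-angle tracking shot, 18mm lens"
    subject: str        # who or what is in the frame
    action: str         # what happens during the clip
    lighting: str = ""  # optional lighting and atmosphere notes
    style: str = ""     # optional overall look

    def render(self) -> str:
        """Join the non-empty parts into one prompt, one sentence per part."""
        parts = [self.shot, self.subject, self.action, self.lighting, self.style]
        return " ".join(p.strip().rstrip(".") + "." for p in parts if p.strip())


if __name__ == "__main__":
    prompt = VideoPrompt(
        shot="Low-angle tracking shot, 18mm lens",
        subject="A sleek, olive green muscle car approaches a corner",
        action="The car drifts, leaving trails of light and tire smoke",
        lighting="Volumetric lighting from the surrounding city lights",
    )
    print(prompt.render())
```

Keeping the shot, subject, action, and lighting separate makes it easier to change one element at a time when you refine the result in step 6.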

Safety Measures in Google VEO 2

Google claims to have designed robust safety protocols to ensure responsible and ethical use of VEO 2. After all, AI image and video generation tools are always at risk of being misused, for example to produce deepfakes.

  • Digital Watermarking with SynthID: Google DeepMind employs SynthID to embed imperceptible digital watermarks into every frame of videos generated by VEO 2, which helps identify AI-generated content.
  • Built-in Safety Filters: VEO 2 includes filters to prevent the creation of harmful or inappropriate content, strictly adhering to Google’s Responsible AI Principles.
  • Data Governance and Privacy: Google Cloud ensures that customer data is never used to train its AI models. All data is processed solely according to user instructions, maintaining strict compliance with privacy standards.
  • Copyright Indemnity: Google offers an industry-first copyright indemnity for its generative AI services to avoid potential copyright issues.

The VEO 2 vs. Sora Showdown

The battle for AI supremacy between Google and OpenAI has surely intensified with their AI-driven video generation tools, VEO 2 and Sora. We ran some tests ourselves, and here’s what we found:

| Feature | VEO 2 | Sora |
| Video Lengths | Generates videos up to 1 minute in length, with options to extend to 2 minutes or more in private previews | Limited to a maximum of 20 seconds per clip, even with the advanced Sora Turbo version |
| Output Quality | Offers up to 4K resolution, delivering cinematic visuals with rich textures and fluid motion | Capped at 1080p, sufficient for mobile or web content but falls short for professional-grade video projects |
| User Experience | Finer post-generation editing controls, allowing camera angles, lighting, and transitions refinement | Stands out with simplicity |
| Ecosystem Advantage | Deeply integrated with Google’s ecosystem | Benefits from OpenAI’s integration with ChatGPT |
| Physics and Motion | Demonstrates a better grasp of realistic physics and human motion | |
| Advancements in Technology | Utilizes Google’s Genie 2 world model, offering a better understanding of spatial intelligence and 3D environments | Somewhat lags in handling complex scenarios such as intricate object interactions |
| Use Cases and Target Audience | Tailored for professional creators and businesses | Best suited for quick, mobile-friendly video generation |

We ran the same prompts through both tools. Check out the results:

Prompt 1

An extreme close-up shot focuses on the face of a female DJ, her beautiful, voluminous black curly hair framing her features as she becomes completely absorbed in the music. Her eyes are closed, lost in the rhythm, and a slight smile plays on her lips. The camera captures the subtle movements of her head as she nods and sways to the beat, her body instinctively responding to the music pulsating through her headphones and out into the crowd. The shallow depth of field blurs the background. She’s surrounded by vibrant neon colors. The close-up emphasizes her captivating presence and the power of music to transport and transcend.

Compare 1 

OpenAI Sora sample output 1:

Google Veo 2 sample output 1:

Prompt 2

Low-angle tracking shot, 18mm lens. The car drifts, leaving trails of light and tire smoke, creating a visually striking and abstract composition. The camera tracks low, capturing the sleek, olive green muscle car as it approaches a corner. As the car executes a dramatic drift, the shot becomes more stylized. The spinning wheels and billowing tire smoke, illuminated by the surrounding city lights and lens flare, create streaks of light and color against the dark asphalt. The cityscape – yellow cabs, neon signs, and pedestrians – becomes a blurred, abstract backdrop. Volumetric lighting adds depth and atmosphere, transforming the scene into a visually striking composition of motion, light, and urban energy.

Compare 2

OpenAI Sora sample output 2

Google Veo 2 sample output 2

Lastly, Google released results from a benchmark study in which human raters stated their preferences between Google’s Veo 2 and other top AI text-to-video generators, such as OpenAI’s Sora Turbo, Meta AI’s Movie Gen, Kuaishou’s Kling 1.5, and MiniMax’s HailuoAI.

Potential Applications of VEO 2

Google’s VEO 2 is poised to revolutionize various industries with its abilities: 

  1. Marketing and Advertising
    • Create high-quality cinematic ads without costly equipment or large teams
    • Tailor videos to resonate emotionally with the audience
    • Produce custom content that perfectly matches brand identity
  2. Education
    • Turn complex topics into visually immersive, easy-to-understand animations
    • Use visuals to make learning more accessible and appealing for students of all ages
  3. Entertainment and Media
    • Simplify trailers, short films, and interactive media creation 
    • Render intricate visuals for fantasy or action sequences 
  4. Social Media
    • Quickly produce high-quality, share-worthy content for social platforms 
    • Experiment with cinematic effects and styles 
  5. Personal Projects
    • Empower small creators to produce polished, professional-quality videos
  6. Other Key Features in Action
    • Transform still images into dynamic clips with full customization
    • Generate extended views for video 

How VEO 2 is Already Empowering Companies

Google’s VEO 2 on Vertex AI is already streamlining video creation for businesses with high-quality, enterprise-grade videos. Here’s how industry leaders are planning to use Veo:

1. Agoda

Agoda, a global travel platform, is aiming to harness the power of Google’s AI models like VEO, Imagen, and Gemini to enhance its advertising efforts.

“We are exploring the media generation capabilities of Google Cloud AI, using Imagen to create unique visuals of dream destinations in various styles. These images are then brought to life as videos through experiments with VEO’s image-to-video technology. These tools have the potential to reduce our content creation process from days to hours, inspiring future adventures.”
Matteo Frigerio, Chief Marketing Officer, Agoda

2. Mondelez International

The global snack giant is adopting AI to fuel innovation and efficiency.

“Partnering strategically with Google Cloud as our core AI platform allows us to advance our business capabilities. This is not simply technology adoption; it’s a deep collaboration leveraging cutting-edge AI for growth and innovation.”
Tiffani Sossei, SVP Chief Digital Experience Officer, Mondelez International

3. WPP 

Global advertising leader WPP is leveraging VEO to bring creative visions to life, producing photorealistic visuals and campaign-ready content faster.

“With VEO and Imagen, we are narrowing the gap between imagination and execution, enabling our teams to produce high-quality visuals in minutes.”
Stephan Pretorius, Chief Technology Officer, WPP

4. Poe by Quora

Quora’s Poe platform is using VEO to democratize creative possibilities for millions of users.

“Through partnerships with leaders like Google, we’re enabling users to bring their ideas to life with stunning, high-quality generative videos.”
Spencer Chan, Product Lead, Poe by Quora

5. Honor

The smartphone manufacturer Honor integrates Google’s Imagen and VEO models to elevate customer experiences.

“With Imagen, customers can effortlessly create and reimagine visuals directly on their smartphones. VEO continues to push the boundaries of creative expression.”
George Zhao, CEO, Honor

What This Means for the AI Video Generation Landscape

Google VEO 2 marks a transformative moment in AI video generation. Undoubtedly, it has set a new standard for what’s possible and raised the bar for competitors like OpenAI’s Sora and Runway ML. 

We can fairly anticipate that rival platforms like Kling and Sora will step up their innovation, giving us even more powerful AI tools in the coming years.

However, the possibility that Veo was trained on YouTube videos has ignited debates around data ethics and copyright, reflecting broader concerns about generative AI.

For businesses, marketers, and creative professionals, VEO 2 is a sure game-changer. 

As Google strategically rolls out tools like VideoFX in the U.S. and ImageFX in over 100 countries, the global implications of VEO 2’s capabilities become clear. Yet, challenges like copyright and the unauthorized use of creative works highlight the urgent need for clear and transparent data usage policies.

When Will VEO 2 Launch for the Public?

As of now, Google VEO 2 remains in the testing phase, accessible only to select developers and AI researchers. 

Google DeepMind is prioritizing rigorous evaluations to fine-tune the tool’s performance and address potential concerns.

Although Google hasn’t confirmed an official release date, there is speculation about a phased launch beginning in 2025. 

In our experience, Google will likely prioritize the following before a launch:

  1. Ensuring seamless output quality and user experience 
  2. Strengthening safeguard measures like watermarks, metadata, and content moderation tools to prevent misuse
  3. Leveraging insights from early testers to refine features and functionality

While still in its early stages, VEO 2’s potential is evident in applications like Agoda’s visually stunning campaigns. 

As Google refines its capabilities, anticipation grows for how it will reshape storytelling, marketing, and creative expression.

Stay tuned for updates as VEO 2 redefines the future of content creation.