Google is quietly testing Gemini 3, an advanced multimodal AI with 'deep think' and 'agent mode' capabilities; an open-source model called OV can generate realistic talking head videos locally; and an OpenAI study suggests AI could match or outperform humans on nearly half of tested job tasks.
Takeaways
• Google's Gemini 3, with 'deep think' and 'agent mode,' is poised to significantly expand AI capabilities and user ecosystems.
• The OV open-source model enables local, realistic talking head video generation from text or images.
• OpenAI's research indicates AI could automate a significant portion of jobs, sparking debate on human vs. AI roles.
The AI world is rapidly advancing with new developments from Google, open-source communities, and OpenAI. Google is internally benchmarking Gemini 3, which features advanced coding, multimodal understanding, and autonomous 'agent mode' capabilities, with a staggered public rollout planned through 2026. Simultaneously, a new open-source model, OV, allows users to generate short, realistic talking head videos with synced audio locally. Amidst these innovations, an OpenAI study highlights AI's increasing capability to outperform humans in many job categories, fueling debate about the future of employment.
Google Gemini 3's New Features
• 00:00:31 Google is internally testing Gemini 3, a significant upgrade featuring 'Pro' and 'Flash' variants for advanced reasoning and speed, respectively. Early testers report its prowess in complex coding tasks, like generating perfect SVG graphics, and enhanced multimodal understanding. The model also introduces 'deep think' for multi-step reasoning and an 'agent mode' enabling browser control for autonomous actions like research and data entry, transforming AI Studio into a more integrated ecosystem.
Gemini 3 Rollout and Competition
• 00:02:55 Google's rollout strategy for Gemini 3 involves early access for enterprise users through Vertex AI, followed by developer access via cloud tiers, and a broad consumer launch by early 2026, integrating with Android 17, Google Search, Chrome, and Workspace. This staggered approach aims to stress-test the model before widespread adoption, positioning Google competitively against OpenAI's GPT-5 and xAI's Grok 4. The current AI race focuses on building autonomous ecosystems, not just benchmarks: Google is integrating Gemini across its product suite, while OpenAI pursues a platform route with its apps and agent kit.
OV Open-Source Video Generation
• 00:03:59 An open-source model named OV, built on the WAN 2.2 5B backbone, has emerged, capable of generating 5-second, 720p videos at 24 frames per second with synced audio from text or still images. Users can input dialogue to animate a character speaking with matched mouth movements, running the model through ComfyUI either locally or on a server. While current limitations include random voice selection, a fixed video length, and no support for reference audio, artists are already experimenting with it to create short films, signaling a major step in accessible video and audio AI generation.
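For readers curious how a ComfyUI workflow like the one described is driven without the GUI: ComfyUI exposes a local HTTP API where a workflow graph is POSTed as JSON to its `/prompt` endpoint. The sketch below is a minimal illustration of that pattern only; the node IDs, the `OVTalkingHead` class name, and its input fields are hypothetical placeholders, not the actual OV node definitions.

```python
import json
import urllib.request

# Hypothetical workflow graph. Node IDs, class_type names, and input
# fields are placeholders -- the real OV node definitions will differ.
workflow = {
    "1": {
        "class_type": "LoadImage",          # still image to animate
        "inputs": {"image": "portrait.png"},
    },
    "2": {
        "class_type": "OVTalkingHead",      # placeholder node name
        "inputs": {
            "image": ["1", 0],              # link: node "1", output slot 0
            "dialogue": "Hello, world!",    # text the character will speak
            "fps": 24,
            "resolution": "720p",
        },
    },
}

def submit(workflow: dict, host: str = "127.0.0.1:8188") -> bytes:
    """POST the workflow graph to a locally running ComfyUI server."""
    payload = json.dumps({"prompt": workflow}).encode("utf-8")
    req = urllib.request.Request(
        f"http://{host}/prompt",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()

# Inspect the payload shape without needing a server running:
payload = json.dumps({"prompt": workflow})
```

Queuing the job is then a single `submit(workflow)` call against a running ComfyUI instance; the server resolves the `["1", 0]` links to wire node outputs into node inputs before executing the graph.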
AI's Impact on Jobs
• 00:07:33 OpenAI's study, 'Measuring the Performance of Our Models on Real-World Tasks,' found AI models performed as well as or better than humans in nearly half of tested job tasks across nine industries, with the clearest advantages in roles such as retail clerks, sales managers, and shipping clerks. While creative and leadership positions currently show more resistance, OpenAI CEO Sam Altman predicts AI could automate 40% of all jobs, including customer support and potentially even his own CEO role. IBM CEO Arvind Krishna offers a more conservative view, suggesting AI might automate 20-30% of coding tasks and emphasizing that many functions will remain human-centric for a long time.