Google DeepMind has unveiled CodeMender, an AI that autonomously finds and fixes security flaws in open-source projects, and Gemini 2.5 Computer Use, a model that operates computer interfaces like a human.
Takeaways
• CodeMender autonomously identifies and fixes security flaws across large open-source projects.
• Gemini 2.5 Computer Use operates digital interfaces like a human, performing complex tasks autonomously.
• Both AIs feature advanced reasoning capabilities and integrated safety protocols for responsible deployment.
Google DeepMind's CodeMender AI autonomously identifies, patches, and validates security vulnerabilities across large open-source projects, demonstrating deep code understanding and proactive safety enhancements. Concurrently, their new Gemini 2.5 Computer Use model can interact with user interfaces like a person, performing complex digital tasks by interpreting screenshots and reasoning about on-screen elements, marking a significant step toward fully autonomous AI.
CodeMender: AI for Security
• 00:00:27 CodeMender is Google DeepMind's AI code developer built specifically for open-source security; within six months it has already contributed 72 verified fixes to massive projects. It goes beyond traditional vulnerability scanners by understanding code logic, identifying root causes, generating patches, and validating them automatically before human review. Powered by Google's Gemini Deep Think models, CodeMender can debug, patch, and rewrite extensive code sections while maintaining consistent style and functionality.
CodeMender's Advanced Capabilities
• 00:01:33 CodeMender has successfully addressed complex issues, such as a heap buffer overflow in XML stack management and critical vulnerabilities in the libwebp image compression library. The AI employs an arsenal of program analysis tools, including static and dynamic analysis, fuzzing, differential testing, and SMT solvers, along with a multi-agent system in which specialized sub-agents perform tasks like code critique. It can also proactively rewrite code to harden it, for instance by adding -fbounds-safety annotations to prevent future exploits, with an LLM judge verifying that each patch preserves intended behavior.
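The validation step above can be illustrated with a toy differential test: a candidate patch is accepted only if it no longer crashes on the triggering input while behaving identically to the original on benign inputs. Everything here (the `original_parse`/`patched_parse` functions and the inputs) is an illustrative assumption, not CodeMender's actual pipeline.

```python
def original_parse(data: bytes) -> int:
    # Toy stand-in for vulnerable code: "crashes" (raises) on oversized input.
    if len(data) > 8:
        raise MemoryError("heap buffer overflow")
    return sum(data)

def patched_parse(data: bytes) -> int:
    # Candidate patch: bounds-check the read instead of overflowing.
    return sum(data[:8])

def validate_patch(orig, patched, benign_inputs, crash_input) -> bool:
    # 1. The triggering input must no longer crash under the patch.
    try:
        patched(crash_input)
    except Exception:
        return False
    # 2. Behavior on benign inputs must be unchanged (no regression).
    return all(orig(x) == patched(x) for x in benign_inputs)

benign = [b"", b"abc", b"12345678"]
print(validate_patch(original_parse, patched_parse, benign, b"x" * 64))  # True
```

A real system would run fuzzers and full test suites rather than a fixed input list, but the accept/reject logic has the same shape.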
Gemini 2.5 Computer Use
• 00:06:29 Gemini 2.5 Computer Use is a specialized version of Gemini 2.5 Pro that enables AI agents to operate software through browser and mobile user interfaces, seeing and interacting with them as a person would. The model analyzes a screenshot, the user's request, and its recent actions, then outputs a function call such as 'click,' 'type,' or 'scroll,' repeating this loop until the task is completed. While currently optimized for web browsers and showing promising results on mobile UI, it is not yet ready for desktop operating system tasks.
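The loop just described can be sketched in a few lines. This is a hypothetical skeleton, not the real Gemini API: `FakeEnv` and `scripted_planner` are stubs standing in for the browser environment and the model call.

```python
from dataclasses import dataclass, field

@dataclass
class Action:
    name: str                      # e.g. "click", "type", "scroll", "done"
    args: dict = field(default_factory=dict)

class FakeEnv:
    """Stub environment: records executed actions, returns dummy screenshots."""
    def __init__(self):
        self.log = []
    def capture_screenshot(self) -> bytes:
        return b"<png bytes>"      # the real loop sends an actual screenshot
    def execute(self, action: Action) -> None:
        self.log.append(action.name)

def scripted_planner(goal, screenshot, history) -> Action:
    """Stand-in for the model call: in reality the model reasons over the
    goal, the screenshot, and recent actions to choose the next UI action."""
    script = [Action("click", {"x": 120, "y": 240}),
              Action("type", {"text": goal}),
              Action("done")]
    return script[len(history)]

def run_agent(goal, env, planner, max_steps=50):
    history = []
    for _ in range(max_steps):
        action = planner(goal, env.capture_screenshot(), history)
        if action.name == "done":
            return history
        env.execute(action)
        history.append(action)
    raise RuntimeError("step budget exhausted before the task completed")

env = FakeEnv()
steps = run_agent("search for CodeMender", env, scripted_planner)
print([a.name for a in steps])     # ['click', 'type']
```

The `max_steps` cap matters in practice: without it, a model that never emits a terminal action would loop forever.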
Impact and Safety of Gemini 2.5
• 00:09:28 This model has already been deployed internally at Google for UI testing, significantly accelerating software development and automatically recovering broken workflows. External testers report substantial improvements in speed and reliability compared to other systems, with Google's Payments platform team noting a 60% rehabilitation rate for broken UI tests. DeepMind has integrated safety guardrails and a per-step safety service into the model, checking every proposed action to prevent risky behaviors and allowing developers to enforce custom safety rules for high-stakes actions.
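A per-step safety service of the kind described can be pictured as a gate that inspects every proposed action before execution, combining built-in guardrails with developer-supplied rules for high-stakes actions. The rule names and the allow/confirm/block verdicts below are illustrative assumptions, not DeepMind's actual implementation.

```python
HIGH_STAKES = {"purchase", "delete_account", "send_payment"}

def default_rules(action: dict) -> str:
    # Built-in guardrail: block actions the service deems risky outright.
    if action["name"] in {"bypass_captcha", "disable_security"}:
        return "block"
    return "allow"

def safety_gate(action: dict, custom_rules=()) -> str:
    """Return 'allow', 'block', or 'confirm' for a proposed action."""
    if default_rules(action) == "block":
        return "block"
    for rule in custom_rules:       # developer-enforced rules, checked per step
        verdict = rule(action)
        if verdict in ("block", "confirm"):
            return verdict
    return "allow"

# Example custom rule: require human confirmation before any payment action.
def confirm_payments(action: dict) -> str:
    return "confirm" if action["name"] in HIGH_STAKES else "allow"

print(safety_gate({"name": "click"}, [confirm_payments]))         # allow
print(safety_gate({"name": "send_payment"}, [confirm_payments]))  # confirm
print(safety_gate({"name": "disable_security"}))                  # block
```

Because the gate runs on every step rather than once per task, a workflow that drifts toward a risky action mid-run still gets stopped or escalated to a human.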