AI Revolution
13:23 | 2/13/26

New GLM 5, OpenAI SKILLS, Slime Intelligence, AI Wikipedia and More AI News

TLDR

Zhipu AI's GLM5 model leads on reliability and ships enterprise-ready agent capabilities, OpenAI upgrades deep research and is rumored to be building a new skills layer, and open-source agents like Deep Agent score close to the human baseline on agent benchmarks.

Takeaways

Zhipu AI's GLM5 sets a new standard for AI reliability and efficiency, featuring enterprise-ready agent capabilities and aggressive pricing.

OpenAI is enhancing guided deep research in ChatGPT and developing a native skills layer for standardized automation.

Open-source agents like Deep Agent are approaching human-level scores on benchmarks of complex, real-world tasks, largely through their system design.

Zhipu AI launched GLM5, an open-source model with 744 billion parameters that is pitched as a new standard for reliability because it tends to admit uncertainty rather than hallucinate, and that is trained efficiently with the 'Slime' engine. The model offers agentic engineering for enterprise tasks, produces usable files directly, and performs strongly on benchmarks like SWE-bench and Vending-Bench 2. Meanwhile, OpenAI is upgrading ChatGPT's deep research into a guided, iterative workflow and is rumored to be developing a native 'skills' layer for standardized automation, and open-source agents like Deep Agent are approaching human-level scores on complex problem-solving benchmarks.

GLM5 Model Innovations

00:00:26 Zhipu AI introduced GLM5, an open-source model notable for its 'minus-1' hallucination reliability, meaning it excels at admitting uncertainty rather than bluffing. This 744-billion-parameter model uses a Mixture of Experts architecture and a 'Slime' reinforcement learning engine, which speeds up training by running tasks in parallel and by using an 'April' trick to optimize time-consuming processes. GLM5 integrates DeepSeek sparse attention for a 200,000-token context window, significantly reducing costs while enabling the processing of massive documents and codebases.
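To illustrate what a Mixture of Experts architecture does in general, here is a minimal sketch of top-k expert routing in plain Python. It is not GLM5's implementation: the toy experts, the router scores, and the function names `top_k_route` and `moe_forward` are all assumptions made up for this example. The point is only that each input activates a few experts rather than the whole model, which is how a very large parameter count can stay affordable per token.

```python
import math

def top_k_route(router_logits, k=2):
    """Pick the k highest-scoring experts and softmax-normalize their weights."""
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:k]
    peak = max(router_logits[i] for i in chosen)  # subtract the max for numerical stability
    exps = [math.exp(router_logits[i] - peak) for i in chosen]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(chosen, exps)]

def moe_forward(x, experts, router_logits, k=2):
    """Weighted sum of the outputs of only the selected experts."""
    return sum(w * experts[i](x) for i, w in top_k_route(router_logits, k))

# Hypothetical toy experts; in a real model each one is a feed-forward network.
experts = [lambda x: x + 1.0, lambda x: 2.0 * x, lambda x: -x, lambda x: x * x]
logits = [0.1, 2.0, -1.0, 2.0]  # hypothetical router scores for one input

routes = top_k_route(logits, k=2)
output = moe_forward(3.0, experts, logits, k=2)
```

With these scores only experts 1 and 3 run, each with weight 0.5, and the other two are skipped entirely.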

Enterprise AI and Agentic Engineering

00:03:10 GLM5 offers 'end-to-end knowledge work' with a native agent mode that converts prompts and source material directly into usable files such as .docx, .pdf, and .xlsx, and it is positioned as an office tool for the AGI era. This agentic engineering approach lets humans set quality gates while the AI executes subtasks, as demonstrated by its ability to generate detailed financial reports and complex spreadsheets. Benchmarks show GLM5 as the strongest open-source model, surpassing competitors and achieving near-human performance in task-based environments such as SWE-bench and Vending-Bench 2.

OpenAI's Research and Skills

00:09:11 OpenAI revamped ChatGPT's deep research capabilities, turning them into a guided, iterative research session powered by GPT-5.2 in which users can constrain sources, integrate app context, and interrupt or redirect a run in progress. The update includes a dedicated full-screen review view for verifying citations, aimed at professionals who need high accuracy in repeated research. Additionally, there are rumors of a first-party 'skills' layer for ChatGPT, which would let users install and edit reusable workflow modules for consistent, standardized task execution.

Open-Source Agent Advancement

00:11:03 Open-source agents built on the Open Juan project, specifically Deep Agent and Deep Search, are setting new performance benchmarks, with Deep Agent achieving 91.69% on the GAIA benchmark, close to the human baseline of 92%. Deep Agent's effectiveness stems from its system design, which features internal loops for planning, execution, monitoring, and error correction, along with layered memory and a unified tool gateway. Deep Search leads the Browse Comp++ benchmark with 80% accuracy in deep research, handling multi-step searches, cross-referencing, and exploring multiple reasoning paths simultaneously to find the most promising solutions.
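The system design described above (execution loops with monitoring and error correction, layered memory, and a unified tool gateway) can be sketched in a few lines of Python. This is a generic illustration under stated assumptions, not Deep Agent's actual code: `Memory`, `ToolGateway`, `run_agent`, and the two toy tools are hypothetical names, and the plan is supplied by hand rather than produced by a model.

```python
from dataclasses import dataclass, field

@dataclass
class Memory:
    working: list = field(default_factory=list)    # short-lived notes, e.g. error logs
    long_term: dict = field(default_factory=dict)  # results kept across steps

class ToolGateway:
    """Single entry point through which every tool call is dispatched."""
    def __init__(self):
        self._tools = {}

    def register(self, name, fn):
        self._tools[name] = fn

    def call(self, name, arg):
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        return self._tools[name](arg)

def run_agent(plan, gateway, memory, max_retries=2):
    """Execute each (tool, arg) step, monitor for failure, and retry on error."""
    for tool, arg in plan:
        for attempt in range(max_retries + 1):
            try:
                result = gateway.call(tool, arg)
            except Exception as err:
                # Monitoring: record the failure, then try the step again.
                memory.working.append(f"{tool} failed on attempt {attempt + 1}: {err}")
                continue
            memory.long_term[tool] = result
            break
        else:
            return False  # every retry failed, so the run is abandoned
    return True

# Hypothetical tools: a search that times out once, and a word counter.
calls = {"n": 0}
def flaky_search(query):
    calls["n"] += 1
    if calls["n"] == 1:
        raise RuntimeError("timeout")
    return f"results for {query}"

gateway = ToolGateway()
gateway.register("search", flaky_search)
gateway.register("count_words", lambda text: len(text.split()))

memory = Memory()
ok = run_agent([("search", "GLM5"), ("count_words", "a b c")], gateway, memory)
```

Routing every call through one gateway keeps retries and logging in a single place, and the split between working and long-term memory separates transient error notes from results that later steps may need.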