AI News Daily 07-22

AI Daily | Updated 8 AM | All-network Data Aggregation | Frontier Science Exploration | Industry Voice | Open-Source Innovation Power | AI and Human Future | Visit Web Version

GeminiCli2API is a powerful local proxy project that wraps Google Gemini CLI’s capabilities into a local API service. With it, you can easily bypass the tight quota limits of the official free API and seamlessly integrate Gemini’s top models into any client or application you love.

Key Highlights:

✨ Seamless OpenAI Compatibility: This project offers fully compatible interfaces with the OpenAI API, letting your existing tools (like LobeChat, NextChat) tap into Gemini’s powerful features with zero modifications and zero cost.
🚀 Break Through Quota Limits: By leveraging Gemini CLI’s account authorization mechanism, you can enjoy daily request limits far exceeding the official free API, freeing your applications and ideas from previous constraints.
🔐 Enhanced Controllability: With its powerful built-in logging system, you can capture all request prompts, making auditing, debugging, and even building your private datasets a breeze, essentially solidifying your data assets.
🛠️ Easy Deployment & Expansion: Built on Node.js, the installation and startup process is super straightforward. Its clear code structure also makes it an ideal foundation for secondary development, letting you easily add custom features like unified prompts, caching, or content filtering.

Whether you’re looking to integrate Gemini into your existing workflows or deeply customize AI services, GeminiCli2API stands out as an ideal choice, blending performance, compatibility, and flexibility.

AI Content Summary

OpenAI plans to expand to a million-level GPUs with Project Stargate, while ByteDance is testing the Chimera digital human platform.
JD open-sourced its multi-agent system, which performed excellently in the GAIA benchmark test; multi-agent collaboration is becoming a new trend.
Frontier research utilizes new methods like reinforcement learning to enhance AI's capabilities in multimodal reasoning and visual grounding.
Mixture-of-Experts (MoE) architecture is becoming the mainstream for open-source large models, while giants like Apple face severe AI transformation challenges.
AI Agents are evolving from auxiliary tools to autonomous task execution, aiming to reshape future workflows through automation.

AI Product & Feature Updates

OpenAI is gearing up to unleash a compute tidal wave! 🌊 CEO Sam Altman recently dropped a bombshell on social media, officially announcing the company’s plan to expand its GPU count to an astonishing over 1 million by the end of 2025! 🤯 This ambitious initiative, codenamed “Project Stargate,” is set to pour a staggering $500 billion over the next four years into building the world’s largest AI training cluster in Texas, spanning thousands of acres. This “Game of Thrones” starring tech giants like SoftBank, Oracle, Arm, Microsoft, and Nvidia not only signals that Artificial General Intelligence (AGI) development is hitting hyper-speed but could also completely reshape the global GPU market’s supply-demand dynamics, making already scarce compute resources even hotter commodities. We’re standing on the eve of a technological singularity; are you ready for it?
ByteDance is quietly rolling out another ace in the digital human race! Its subsidiary, Volcengine, is secretly testing a new generation of digital human platform called “Chimera” via an invite-only model. 🤫 This mythical-sounding platform is no ordinary feat; it heavily relies on Volcengine’s in-house AI large model tech, offering an “all-in-one” service from digital human image generation and one-click photo outfit changes to cross-language video translation—a true boon for content creators. While currently in a free closed beta, it’s expected to launch a paid model by the end of this month, signaling its commercial ambitions. From gaining industry certification in 2022 to now unleashing the powerful “Chimera,” Volcengine is fast-tracking its AI digital human solutions, aiming to carve a path into various business sectors like finance, live streaming, and marketing. 🤖
While “996” (9 AM to 9 PM, 6 days a week) might be a thing of the past, Greptile, a rising star in AI code review, is boldly proclaiming a “007” motto, demanding employees have “no work-life balance.” What’s jaw-dropping is that this extreme “wolf culture” hasn’t scared off investors; instead, it’s successfully wooed top-tier VC firm Benchmark, reportedly closing a massive $30 million Series A funding round, skyrocketing the company’s valuation to $180 million. 💰 Founded by a mere 22-year-old graduate and incubated at YC, this startup claims its AI robot can accurately review code like the most seasoned colleague. However, with fierce rivals like Graphite and Coderabbit circling, is this “no effort, no gain” extreme overtime culture truly the catalyst for its success, or a ticking time bomb for future collapse? 🤔 The market is watching with keen interest.
E-commerce giant JD has finally revealed its ace to the open-source community, officially launching its product-level, end-to-end universal multi-agent system, JoyAgent-JDGenie - AI News , declaring “all gods return to their thrones!” ⚔️ This system is no mere lab toy; it dominated the GAIA benchmark test, dubbed the “AI college entrance exam,” with an astonishing 75.15% accuracy rate, showcasing its extraordinary capability to handle complex real-world tasks. It’s not just a powerful, out-of-the-box framework; it integrates multiple specialized sub-agents for report generation, code writing, PPT creation, and more. Plus, through innovative multi-level collaboration design and cross-task memory mechanisms, it covers everything from simple information queries to complex project execution. JD’s move undoubtedly drops a bombshell for rapid enterprise AI application deployment, potentially unifying the multi-agent arena. 🏆
The era of a single AI model flying solo might truly be over, because AI Agents have learned to “call for backup!” Stanford University recently open-sourced an “Octopus Bro” AI Agent called OctoTools - AI News . It acts like a clever project manager, intelligently orchestrating over 11 different specialized tools to work together. 🐙 When faced with complex reasoning tasks in fields like math, science, or medicine, it consistently finds the most suitable “expert” to solve the problem. Its core innovation lies in the “tool card” design, which standardizes and encapsulates the capabilities of various tools. Then, a “planner” brain devises a meticulous battle plan, finally handed over to an “executor” for faithful implementation. This clearly defined, highly collaborative team model marks a new leap in AI’s ability to solve complex problems, promising more powerful and flexible AI applications in the future. 🛠️

AI Frontier Research

Traditional AI training methods often swing between two extremes: either shackling models with rules from the get-go, limiting creativity, or letting them “explore freely,” which can lead to deviations or even “bad habits.” Researchers at Meituan bravely said “no” to this and introduced a brand-new framework called Metis-RISE , cleverly employing a new “free-range, then controlled-rearing” training strategy. 🐑 First, they use Reinforcement Learning (RL) as an incentive, like free-range farming, to encourage the model to boldly explore various possibilities and fully unleash its potential. Then, they conduct targeted “catch-up lessons” via Supervised Fine-Tuning (SFT) to consolidate strengths and correct errors, refining it as if in a controlled environment. 🎓 This unconventional training combo delivers astonishing results: their 72B parameter model shot up to fourth place on the authoritative OpenCompass multimodal reasoning leaderboard, even surpassing some well-known commercial closed-source models. You can dive into the detailed technical specifics in the paper - AI News .
When faced with an information-packed, high-resolution large image, AI often acts like a headless chicken, drowning in a sea of irrelevant details and missing the point. 🕵️‍♀️ To tackle this tricky pain point, researchers from Fudan University and Nanyang Technological University teamed up to propose the MGPO framework. It successfully taught Large Multimodal Models (LMMs) a killer trick: Visual Grounding. This is like giving AI a pair of “fiery golden eyes”—before answering a question, the model can first predict the key area in the image based on the query, then “zoom in” to examine those details just like a human, finally delivering a precise answer. 🎯 What’s most mind-blowing is that this powerful capability “emerged” through reinforcement learning self-play, requiring no expensive human-annotated data whatsoever; it can self-evolve and iterate purely based on the correctness of the final answer. This groundbreaking research has been published in the paper - AI News , and they’ve generously open-sourced the code - AI News .
Spatial transcriptomics data is like a microscopic map brimming with life’s genetic codes, but it often frustrates scientists with its low resolution and high noise, making it tough to decipher. Now, research teams from the University of Tokyo and McGill University have developed the SUICA model, which acts like a masterful “data alchemist.” 🧙‍♂️ This model innovatively combines graph autoencoders and Implicit Neural Representation (INR) technology to denoise, enhance, and super-resolve this high-dimensional, sparse biological data, truly turning “waste into treasure.” Data processed by SUICA not only boasts higher visual quality but also contains stronger biological signals, revealing intricate tissue structures and cell states previously unobservable. 🧬 This research, accepted into top conference ICML 2025, provides a more robust data foundation for AI-assisted pathological diagnosis and drug development. Both its paper - AI News and open-source project - AI News are now available for global researchers.

AI Industry Outlook & Societal Impact

In 2025, the open-source large model arena is witnessing an absolutely dazzling “clash of the titans,” and the Mixture-of-Experts (MoE) architecture is undoubtedly the brightest star of the show. 👑 From DeepSeek-V3’s ultimate 9-expert design to Qwen3’s decisive innovation of ditching shared experts, and the rumored trillion-parameter “juggernaut” size of Kimi-K2, all top vendors are furiously “racing” on this golden MoE track. Meanwhile, small and medium-sized models, exemplified by SmolLM3-3B, are challenging the dominance of the “big guns” with astonishing efficiency and performance through clever architectural optimization and massive data pre-training. This tech wave not only signals the graceful exit of traditional dense models from the historical stage but also presents developers with the “happy dilemma” of balancing extreme performance with controllable costs. This is, without a doubt, one of the most exciting chapters in the current AI News landscape.
Apple, true to form, is still the master of making money, but under the AI wave, its “AI flavor” seems to be losing its punch. 🍎 The company’s “slow pace” in the AI domain is gradually eroding Wall Street’s patience, with some prominent analysts even openly discussing CEO Tim Cook’s future. While Cook, with his unparalleled operational excellence, has steadily propelled Apple’s market cap to an epic $3.1 trillion peak, the lackluster AI performance at last month’s WWDC Global Developers Conference—especially the delayed highly anticipated Siri overhaul—has intensified external disappointment. ⏳ Critics argue that the AI era calls for bold product visionaries like Steve Jobs, not just operational maestros who excel at calculations. The legendary helmsman who led Apple into its “golden decade” now faces a grim test: whether he can usher in the next AI chapter.

Top Open-Source Projects

NextChat: Your Cross-Platform AI Buddy, Light and Swift. Are you still vexed by fragmented AI chat experiences across different devices? With a whopping 84,000 GitHub Stars, NextChat - AI News eloquently proves itself as the ultimate answer to this pain point. 🤝 It’s an ultra-lightweight, lightning-fast, cross-platform AI assistant designed to seamlessly support all major operating systems, including Web, iOS, MacOS, Android, Linux, and Windows. This means no matter where you are or what device you’re using, you’ll have a unified, private, and incredibly smooth AI companion, letting your inspiration and creativity stretch anytime, anywhere. 📱💻
crawl4ai: The “Web Intelligence Agent” Built for Large Models. Want your LLM to break free from its “knowledge cutoff date” shackles and truly grasp the ever-changing internet? Then crawl4ai - AI News , boasting 48,000 Stars, is your indispensable open-source web crawler and scraping tool. 🕸️ Designed specifically for AI application scenarios, it efficiently and intelligently collects, cleans, and structures data from vast amounts of web information, feeding your large models the freshest, richest “mental nourishment.” With it, your AI application’s answers will no longer be confined to outdated training data; instead, they’ll be able to cite sources, speak with substance, and truly possess the ability to gain insights into the present. 🧠
dashy: The “Central Command Console” for Your Digital Life, Blending Style and Substance. In an era flooded with services and applications, your digital life desperately needs a capable manager, and dashy - AI News , with 21,000 Stars, is precisely that ideal open-source, all-in-one, completely free candidate. 📊 This highly customizable personal dashboard lets you deploy it on your own server, consolidating all your personal services, applications, and website links into one place. Not only does it integrate service status checks and handy widgets, but it also offers a massive library of themes and icons, allowing you to control all your digital assets from a single interface, showcasing your inner geek and mastery. 🎨
better-auth: The “Authentication Annihilator” for TypeScript Developers. User authentication systems are the indispensable cornerstone of every application, yet they’re also one of the most headache-inducing development stages for countless developers, filled with repetition and trivialities. better-auth - AI News , with 17,000 Stars, aims to be the most comprehensive and user-friendly TypeScript authentication framework, rescuing developers from this quagmire. ✅ It provides a battle-tested, secure, and reliable complete solution, letting you totally ditch the hassle of reinventing the wheel and instead focus 100% of your valuable energy on innovating and implementing core business logic. 🔐
ConvertX: Your Personal Online File “Format Conversion Factory.” Have you ever found yourself bouncing between different file formats, just trying to find a tool that could open or edit them? Well, why not give ConvertX - AI News , a self-hosted online file converter with 4,000 Stars, a try? 🔄 It’s like an omnipotent “format conversion Swiss Army knife,” supporting mutual conversions for over 1000 file formats, from common documents and images to specialized audio and video formats—it practically does it all. Best of all, you can easily deploy it on your own server, giving you a completely secure, private, and powerful personal file processing center. 📁

Social Media Buzz

When AI Agents Encounter Production Environment “Ghost Stories.” Every software engineer has faced that maddening moment of despair: “But it worked perfectly on my machine!” This is also the nightmare of AI coding assistants. 👻 Without the real runtime context of the production environment, even the smartest AI coding assistant is like a “blind person,” unable to understand why code behaves abnormally. A tool named Hud is trying to crack this nut; it acts like a detective, capturing the true behavioral traces of code in production and feeding these crucial clues directly to the AI, helping the AI truly see the problem. This might just be the beacon of hope to end the age-old dilemma of “Why does it crash the moment it hits production?” 🩺
AI Agent’s “Parenting Guide”: Seven Golden Rules from Manus. Building a smart, reliable AI Agent is akin to raising a child, where methodology is paramount. 👶 After four grueling major refactorings and millions of real user sessions, the Manus team has generously shared their “parenting guide.” 📜 They discovered that effectively utilizing Prompt Caching to speed up responses, keeping the tool list concise and stable, and cleverly using the file system as the Agent’s “long-term memory” carrier are key to boosting its performance and efficiency. These invaluable lessons, forged through countless failures, are undoubtedly a priceless practical guide - AI News for all Agent developers.
The Revelation of Claude Code: Taming All Complex Software with “Human Talk.” The command line, once a “black hole interface” that made countless non-technical folks tremble and dread, is now being tamed by Claude Code using the most natural human language. 🗣️ Users simply say, “Help me deploy this application to the server,” in plain English, and all the complex operations are handled by AI. This revolutionary breakthrough uncovers a massive, multi-billion-dollar market opportunity: every industry has its “terminal,” whether it’s Photoshop’s intricate toolbars or Excel’s dizzying pivot tables. In the future, software’s value won’t hinge on how complex its features are but on how simple it is to use, and mastering “prompt engineering” will become a new superpower. 🪄 Click to read in-depth analysis - AI News .
AI Agent User Manual: More Tools Aren’t Better; Less is More (and Smarter). Think stuffing an AI Agent with a ton of tools will turn it into a “hexagonal warrior,” mastering all 18 martial arts? Big mistake, buddy; it’ll probably just make it dumber. 🤔 One insightful view points out that providing an Agent with too many or unclearly described tools, especially when functionally similar ones exist, can easily lead to “decision paralysis,” causing it to pick the wrong or inefficient solutions. The real best practice is: at the start of a task, clearly provide a small, highly relevant set of tools, explaining their purpose and boundaries in clear, unambiguous language. Instead of chasing a quantitative “big and comprehensive” approach, it’s better to meticulously refine the quality of a few core tools—that’s the only way - AI News to boost an Agent’s intelligence. 🎯
The True AI Revolution: Not About You Using Tools Better, But AI Using Them For You. From AI-assisted coding to AI-assisted photo editing and video cutting, many current AI applications merely “make tools easier to use.” But essentially, you’re still the operator glued to the screen. The real paradigm shift lies with AI Agents. In that world, you just set the goals and acceptance criteria like a boss, and the Agent autonomously plans tasks, selects, and operates a series of tools until the final outcome is delivered. 🤖 This is the ultimate leap from “freeing your hands” to “freeing your brain,” a genuine productivity revolution capable of disrupting existing workflows. A brand new era is dawning upon us. 🧠 Click to view opinion - AI News .
When Robots Learn to Hug: The Ultimate Goal of Design is to Create Happiness. A new book on robot design reveals several heartwarming moments that could melt anyone’s heart: engineers cheering on Pepper, a robot struggling to restart; strangers in France spontaneously hugging a street Pepper that only “asks for hugs”; and nursing home residents not caring if Pepper’s answers are correct, only wishing its hands were warm. ❤️ These stories profoundly inspired the author to leave his team, which pursued extreme efficiency, and instead create Lovot, a robot designed to bring happiness. This gently reminds us that technology’s ultimate value might not always lie in boosting efficiency or solving problems, but in warming hearts - AI News . 🤗
Veo 3’s “Magic Moment”: When a Logo Seamlessly Transforms into a Product. Google’s ace text-to-video model, Veo 3, continues to demonstrate its astonishing creativity and vitality. ✨ In a recent test video, it showcased the “magic” of seamlessly and smoothly transforming a static brand logo into a dynamic product. This silky-smooth transition and incredibly creative visual storytelling are practically tailor-made for the final shot of a brand commercial, making it utterly unforgettable. This isn’t just cool; it’s a whole new way of brand storytelling, letting us see the huge potential - AI News for AI to create endless possibilities in commercial advertising. 🎬
Is AI “Killing” the Internet, or Reshaping It? The authoritative magazine The Economist recently issued a thought-provoking warning: AI is killing the web. 💀 The article points out that generative AI, exemplified by ChatGPT, is fundamentally eroding the traditional economic foundation upon which the internet thrives—the model where users support content creators by visiting websites and viewing ads. When users can directly get integrated, click-free answers from AI, who’s going to bother visiting those original links? This paradigm shift triggered by AI is forcing us to rethink the future of the internet, and whether, and how, we can save that once open, diverse, and vibrant online world - AI News . 🌐
A Must-Read for Developers: When Large Models Meet AIOps. AIOps (Intelligent Operations), an increasingly vital field in the developer community, is now experiencing a game-changing empowerment from Large Language Models (LLMs). 📈 A survey article, which deeply analyzes over 180 relevant top conference papers, clearly points out that applying LLM’s powerful reasoning and generative capabilities to AIOps in production environments is one of the most noteworthy and investable tech trends right now. This not only significantly boosts the efficiency and intelligence of tasks like troubleshooting, performance monitoring, and root cause analysis, but also carves out brand-new application scenarios and career paths for a broad spectrum of developers, making it a key technology stack for the future. 🛠️ Click to view details - AI News .

Listen to the Audio Version of AI Daily

🎙️ Xiaoyuzhou FM	📹 Douyin
Afterlife Pub	Self-media Account

07-23 AI News 07-21 AI News