AI Dev and Latest Releases

[Open Source AI] Alibaba Releases Tongyi DeepResearch: A 30B-Parameter Open-Source Agentic LLM Optimized for Long-Horizon Research. Tongyi DeepResearch-30B-A3B is an open-source agentic MoE model (~30.5B total, ~3–3.3B active) built for long-horizon web research. It combines a 128K context window with two rollout modes (ReAct for intrinsic tool use and IterResearch “Heavy” for test-time scaling), backed by an automated agentic data engine (CPT→SFT) and on-policy RL using GRPO with token-level gradients. Reported results show strong performance on deep-research suites (HLE 32.9; BrowseComp 43.4 EN/46.7 ZH; xbench-DeepSearch 75). Weights and inference/eval scripts are released under Apache-2.0.
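
For readers unfamiliar with GRPO, here is a minimal sketch of the arithmetic behind "group-relative" advantages and token-level averaging. It is generic GRPO-style math, not Tongyi's training code: the rewards, the probability ratios, and both function names are made up for illustration, and clipping and KL terms are left out.

```python
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-6):
    """Score each rollout relative to the other rollouts of the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

def token_level_objective(advantages, ratios_per_rollout):
    """Unclipped surrogate averaged over ALL tokens in the group.

    Every token carries its rollout's advantage, so long rollouts
    contribute in proportion to their length.
    """
    total_tokens = sum(len(ratios) for ratios in ratios_per_rollout)
    weighted = sum(
        adv * rho
        for adv, ratios in zip(advantages, ratios_per_rollout)
        for rho in ratios
    )
    return weighted / total_tokens

# Four hypothetical rollouts for one prompt: two succeed, two fail.
rewards = [1.0, 0.0, 0.0, 1.0]
# Hypothetical new/old policy probability ratios, one per generated token.
ratios = [[1.0] * 2, [1.0] * 6, [1.0], [1.0]]

advantages = group_relative_advantages(rewards)
objective = token_level_objective(advantages, ratios)
print(advantages, objective)
```

With these toy numbers the objective lands near -0.4, because the six-token failed rollout outweighs the shorter ones. A per-sequence average would weight all four rollouts equally.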

[Agent] Google AI Introduces Agent Payments Protocol (AP2): An Open Protocol for Interoperable AI Agent Checkout Across Merchants and Wallets. Built on Verifiable Credentials (VCs), AP2 defines mandate types—Intent, Cart, and Payment Mandates—to eliminate ambiguity around authorization, authenticity, and accountability in autonomous and semi-autonomous checkout flows. The protocol extends Agent2Agent (A2A) and Model Context Protocol (MCP) to standardize how agents, merchants, and payment processors exchange verifiable evidence across the full intent → cart → payment pipeline. Already supported by 60+ organizations including American Express, Mastercard, PayPal, Coinbase, Worldpay, and Adyen, AP2 is payment-method agnostic, initially covering cards while adding support for real-time bank transfers and crypto via an A2A x402 extension.
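
To make the intent → cart → payment chain concrete, here is a toy model of three linked, signed mandates. Everything in it is an assumption for illustration: the field names (`max_total`, `total`, `amount`), the `issue` and `verify_chain` helpers, and the use of an HMAC with a shared demo key as a stand-in for real Verifiable Credential signatures. It is not the AP2 schema or SDK.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass

SECRET = b"demo-key"  # assumption: stand-in for real credential key material

def sign(payload: dict) -> str:
    """Deterministically sign a JSON payload (HMAC here, not a real VC proof)."""
    body = json.dumps(payload, sort_keys=True).encode()
    return hmac.new(SECRET, body, hashlib.sha256).hexdigest()

@dataclass(frozen=True)
class Mandate:
    kind: str        # "intent", "cart" or "payment"
    body: dict       # hypothetical fields, see lead-in
    parent_sig: str  # signature of the previous mandate ("" for intent)
    sig: str

def issue(kind: str, body: dict, parent: "Mandate | None" = None) -> Mandate:
    parent_sig = parent.sig if parent else ""
    payload = {"kind": kind, "body": body, "parent_sig": parent_sig}
    return Mandate(kind, body, parent_sig, sign(payload))

def verify_chain(intent: Mandate, cart: Mandate, payment: Mandate) -> bool:
    chain = (intent, cart, payment)
    if [m.kind for m in chain] != ["intent", "cart", "payment"]:
        return False
    for m in chain:  # each mandate must carry a valid signature
        expected = sign({"kind": m.kind, "body": m.body, "parent_sig": m.parent_sig})
        if not hmac.compare_digest(expected, m.sig):
            return False
    if cart.parent_sig != intent.sig or payment.parent_sig != cart.sig:
        return False  # each step must point back at the step before it
    # Hypothetical business rules: stay within the user's limit, pay the cart total.
    return (cart.body["total"] <= intent.body["max_total"]
            and payment.body["amount"] == cart.body["total"])

intent = issue("intent", {"item": "running shoes", "max_total": 120})
cart = issue("cart", {"sku": "SHOE-42", "total": 99}, parent=intent)
payment = issue("payment", {"amount": 99, "method": "card"}, parent=cart)
tampered_cart = Mandate("cart", {"sku": "SHOE-42", "total": 999}, cart.parent_sig, cart.sig)

print(verify_chain(intent, cart, payment))
print(verify_chain(intent, tampered_cart, payment))
```

The design point is that each later mandate commits to the one before it, so a merchant or processor can check that a payment traces back to what the user actually authorized.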

[Agent Protocol] Bringing AI Agents Into Any UI: The AG-UI Protocol for Real-Time, Structured Agent–Frontend Streams. AG-UI is emerging as a standard protocol for connecting AI agents to user interfaces, replacing ad-hoc APIs and sockets with a unified event-driven model. It streams structured events like TEXT_MESSAGE_CONTENT, TOOL_CALL_*, and STATE_DELTA over SSE or WebSockets, ensuring real-time synchronization between backend logic and frontend UIs. Major frameworks including Mastra, LangGraph, CrewAI, Agno, LlamaIndex, and Pydantic AI already support it, with more integrations underway. The AG-UI Dojo provides runnable demos and validation tools for developers, while the CLI (npx create-ag-ui-app@latest) enables quick prototyping.
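
As a rough picture of what consuming such a stream involves, the sketch below parses SSE `data:` lines and folds the events into a simple UI state. The event type names come from the blurb above. The payload field names (`messageId`, `delta`, `op`, `path`, `value`), the `TOOL_CALL_START` example, and the helpers `parse_sse` and `apply_event` are assumptions for illustration, and the state patching is deliberately simplified to top-level add/replace operations. Check the AG-UI docs for the real schema.

```python
import json

def parse_sse(lines):
    """Yield one decoded JSON event per SSE message (data lines ended by a blank line)."""
    buf = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("data:"):
            buf.append(line[5:].lstrip())
        elif line == "" and buf:
            yield json.loads("\n".join(buf))
            buf = []
    if buf:  # flush a final message that lacks a trailing blank line
        yield json.loads("\n".join(buf))

def apply_event(ui, event):
    """Fold a single event into the frontend's view of the conversation."""
    kind = event["type"]
    if kind == "TEXT_MESSAGE_CONTENT":
        msg_id = event["messageId"]
        ui["messages"][msg_id] = ui["messages"].get(msg_id, "") + event["delta"]
    elif kind == "STATE_DELTA":
        for op in event["delta"]:  # simplified: top-level add/replace only
            if op["op"] in ("add", "replace"):
                ui["state"][op["path"].lstrip("/")] = op["value"]
    elif kind.startswith("TOOL_CALL_"):
        ui["tool_events"].append(kind)
    return ui

# A hypothetical stream as it might arrive over SSE.
stream = [
    'data: {"type": "TEXT_MESSAGE_CONTENT", "messageId": "m1", "delta": "Hel"}', "",
    'data: {"type": "TOOL_CALL_START", "toolCallId": "t1"}', "",
    'data: {"type": "STATE_DELTA", "delta": [{"op": "replace", "path": "/step", "value": 2}]}', "",
    'data: {"type": "TEXT_MESSAGE_CONTENT", "messageId": "m1", "delta": "lo"}', "",
]

ui = {"messages": {}, "state": {}, "tool_events": []}
for event in parse_sse(stream):
    apply_event(ui, event)
print(ui)
```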

[Open Source AI] MoonshotAI Released Checkpoint-Engine: A Simple Middleware to Update Model Weights in LLM Inference Engines, Effective for Reinforcement Learning. Checkpoint-Engine enables in-place weight updates for trillion-parameter LLMs in ~20 seconds across large GPU clusters. It supports both broadcast and peer-to-peer modes, uses an optimized pipeline that overlaps communication with memory copy, and integrates with vLLM for large-scale inference. Benchmarks show strong scalability, which makes it particularly effective for reinforcement learning pipelines that require frequent model updates. Current limitations include memory overhead, experimental FP8 support, and reliance on vLLM.
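
The idea of overlapping communication with memory copy can be pictured as a two-stage pipeline: while one weight shard is being copied into place, the next is already being received. The sketch below shows only that pattern, using a thread and a bounded queue as stand-ins for GPU transfers. The function `pipelined_update` and its `receive` and `copy` callbacks are hypothetical and are not Checkpoint-Engine's API.

```python
import queue
import threading

def pipelined_update(shards, receive, copy, depth=2):
    """Run receive() for upcoming shards while copy() handles the current one.

    `depth` bounds how many received shards may wait in the staging buffer,
    which caps the extra memory the pipeline needs.
    """
    staged = queue.Queue(maxsize=depth)
    done = object()  # sentinel marking the end of the stream

    def producer():
        try:
            for shard in shards:
                staged.put(receive(shard))  # stage 1: communication
        finally:
            staged.put(done)  # always unblock the consumer

    worker = threading.Thread(target=producer)
    worker.start()

    applied = []
    while True:
        item = staged.get()
        if item is done:
            break
        applied.append(copy(item))  # stage 2: copy into the live model
    worker.join()
    return applied

result = pipelined_update(
    ["w0", "w1", "w2"],
    receive=lambda name: name + ":received",
    copy=lambda blob: blob + ":copied",
)
print(result)
```

The bounded buffer reflects the trade-off the blurb mentions: a deeper pipeline hides more transfer latency but costs more staging memory.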

[Computer-Use] H Company Releases Holo1.5: Open-Weight Computer-Use VLMs Focused on GUI Localization and UI-VQA. H Company’s Holo1.5 is a GUI-centric VLM series (3B/7B/72B) for computer-use agents, optimized for UI element localization and UI-VQA at high resolution (up to 3840×2160). On six localization tracks, Holo1.5-7B averages 77.32 vs 60.73 for Qwen2.5-VL-7B and hits 57.94 vs 29.00 on ScreenSpot-Pro; UI-VQA averages are ~88.17 (7B) and ~90.00 (72B) across VisualWebBench, WebSRC, and ScreenQA. The 7B model ships under Apache-2.0 for commercial use, while 3B and 72B remain research-only. In deployment, Holo1.5 functions as a perception layer (screenshots in, coordinates and short answers out), so it still has to be integrated with planning and safety policies, and the reported results should be reproduced under your own capture pipeline.

Editor’s Pick

[OCR Model] IBM AI Releases Granite-Docling-258M: An Open-Source, Enterprise-Ready Document AI Model. IBM’s Granite-Docling-258M is an open-source (Apache-2.0) compact vision-language model for document conversion. It succeeds SmolDocling and pairs a Granite 165M backbone with a SigLIP2 vision encoder. It outputs structured DocTags to preserve layout, tables, code, and equations, with measurable accuracy gains across OCR, equations, and tables, plus improved stability. The model includes experimental multilingual support (Japanese, Arabic, Chinese), integrates with the Docling pipeline, and is available on Hugging Face in Transformers, ONNX, vLLM, and MLX formats for enterprise-ready, structure-preserving document AI…

From our Sponsor

Date and Time: 30th September, 5:00 PM CET [45 minutes with Q&A]

Adversaries are increasingly targeting Managed Service Providers (MSPs) with sophisticated tactics and techniques. According to the Acronis Cyberthreats Report, H2 2024, APT-linked ransomware groups are going after MSPs by exploiting PowerShell, weak RDP passwords, unpatched devices, and compromised VPN credentials. The adversaries are relentless. So how can MSPs move from a reactive to a proactive approach and reduce the blast radius?

Join us for an exclusive session with James Abercrombie, Technology Evangelist, Acronis, and Naren Vaideeswaran, Head of Product Marketing, NetBird, as they discuss how the integration works, the benefits, and how MSPs can effectively shrink the attack surface.

In this webinar, you will learn:

  • The impact of lateral movement and how ransomware is affecting businesses and reputation

  • How a multi-layered defense paves the way for effective prevention, detection, and disaster recovery readiness

  • How NetBird and Acronis integrate to contain evolving threats and protect your business

(Sponsored)
