👋 Hello. You’re reading the AI Dev Brief by MarkTechPost — the daily signal for AI engineers and researchers who build with AI, not just talk about it. No hype. No filler. Just the research, releases, and infrastructure moves that actually matter.
Want to promote your AI Product, GitHub repo, HuggingFace model, product release, or webinar in front of 1,000,000+ AI practitioners? Connect with us
🔥 TODAY’S BRIEFING — STORIES WORTH 5 MINUTES
1. Microsoft Releases Fara1.5: Browser Computer-Use Agents (4B/9B/27B) That Beat OpenAI Operator and Gemini 2.5 — Fara1.5, an open family of browser computer-use agent models fine-tuned from Alibaba's Qwen3. Fara1.5-27B scores 88.6% on Online-Mind2Web — edging out OpenAI Operator at 87.0% and beating Gemini 2.5 Computer Use outright. Three model sizes (4B, 9B, 27B) accommodate different cost and performance constraints.
2. NVIDIA AI Releases Gated DeltaNet-2: Linear Attention That Decouples Erase and Write, Beating Mamba-3 and KDA at 1.3B. Gated DeltaNet-2 (GDN-2) is a new linear attention architecture that fixes a fundamental flaw in all prior delta-rule models: a single scalar gate controls both erasing old content and writing new content. GDN-2 decouples the two with separate channel-wise gates. At 1.3B parameters, trained on 100B FineWeb-Edu tokens, it achieves the strongest overall results when compared against Mamba-2, Gated DeltaNet, and KDA.
3. Mistral Vibe now moves coding agents to the cloud so you can run several in parallel and stop being the bottleneck on every step the agent takes. Each session runs in an isolated sandbox. Start from the Vibe CLI or Le Chat, inspect file diffs, tool calls, and progress states as they run, and come back to a finished branch or draft PR. Already working locally? Teleport your session to the cloud and keep going without losing context. Available on Le Chat Pro and Team. Get Started with Vibe (promoted)
4. Microsoft Webwright: Terminal-Native Web Agent Scores 60.1% on Odysseys, Nearly Double GPT-5.4's Base 33.5%. Webwright gives the model a terminal, a local workspace, and the freedom to write Playwright code that launches and inspects browser sessions, with no click loops and no screenshots. Powered by GPT-5.4, it scores 60.1% on Odysseys (200 long-horizon tasks), 15.6 points above the prior SOTA, and 86.7% on Online-Mind2Web, the highest among open-source harnesses.
5. Nous Research Releases CNA: Steer Any LLM Behavior by Ablating 0.1% of MLP Neurons — No Training, No Weight Changes — Contrastive Neuron Attribution (CNA), a method that identifies the 0.1% of MLP neurons whose activations most distinguish harmful from benign behavior — then ablates them to steer the model. No SAE training. No fine-tuning. No weight modification whatsoever. Steer refusal, bias, or any target behavior at high strength while preserving output quality.
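For readers who want a feel for the mechanics behind item 5, here is a minimal sketch of contrastive neuron selection followed by activation-level ablation. It is not Nous Research's implementation: the function names (`contrastive_scores`, `top_fraction`, `ablate`), the mean-difference scoring rule, and the toy activations are all assumptions made for illustration, and the real method's attribution score may well differ.

```python
# Hypothetical sketch: pick the neurons whose mean activation differs most
# between two prompt sets, then zero them at inference time.
# No weights are touched; only activations are masked.

def contrastive_scores(acts_a, acts_b):
    """Per-neuron |mean(acts_a) - mean(acts_b)|.

    Each argument is a list of activation vectors (one per prompt),
    all of the same length. The scoring rule is an assumption.
    """
    n = len(acts_a[0])

    def col_mean(rows, j):
        return sum(r[j] for r in rows) / len(rows)

    return [abs(col_mean(acts_a, j) - col_mean(acts_b, j)) for j in range(n)]


def top_fraction(scores, fraction):
    """Return the indices of the highest-scoring `fraction` of neurons (at least one)."""
    k = max(1, round(len(scores) * fraction))
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return set(ranked[:k])


def ablate(activations, neuron_ids):
    """Zero the selected neurons in one activation vector; leave the rest unchanged."""
    return [0.0 if j in neuron_ids else a for j, a in enumerate(activations)]


# Toy data: 10 "neurons", two prompts per class. Neuron 3 fires only on the
# first class, so it should be the one selected.
harmful = [
    [0.1, 0.2, 0.1, 2.0, 0.3, 0.1, 0.2, 0.1, 0.2, 0.1],
    [0.2, 0.1, 0.2, 1.8, 0.2, 0.2, 0.1, 0.2, 0.1, 0.2],
]
benign = [
    [0.1, 0.2, 0.2, 0.1, 0.3, 0.1, 0.1, 0.1, 0.2, 0.2],
    [0.2, 0.1, 0.1, 0.1, 0.2, 0.2, 0.2, 0.2, 0.1, 0.1],
]

scores = contrastive_scores(harmful, benign)
# 10% here only because the toy layer is tiny; the brief cites 0.1% at real scale.
selected = top_fraction(scores, 0.1)
steered = ablate(harmful[0], selected)
```

In a real model the same masking step would typically run inside a forward hook on the MLP layer, so the stored weights stay untouched, which matches the "no weight changes" claim in the item above.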
When Postgres Optimization Stops Working and What's Next
Meet the Optimization Treadmill - where every “correct” Postgres fix (indexes, partitions, replicas) buys less time while the ceiling stays the same. Analytical workloads expose mechanical limits in MVCC, row storage, planning costs, and WAL that compound as data grows. Learn how to recognize when you’re optimizing… and when the architecture itself is the problem.
📰 Secondary News
Perplexity Open-Sources Bumblebee: Read-Only Supply Chain Scanner for Developer Endpoints — Bumblebee is a read-only Go-based scanner that checks developer endpoints for supply-chain package exposure — no write access, no side effects, just visibility. As AI agents increasingly interact with production endpoints, supply chain risk at the API layer is the new attack surface. Bumblebee is the answer.
How CopilotKit Is Redefining the Agentic AI Stack in 2026 — CopilotKit is redefining the agentic AI stack by launching three vendor-neutral infrastructure tools designed to move AI agents into production-grade applications. Their horizontal platform features AG-UI for real-time user-agent interaction, AIMock for deterministic full-stack testing, and Pathfinder for self-hosted knowledge retrieval, effectively bridging the critical deployment gaps that typically stall enterprise agent software development. [promoted]
Tencent Open-Sources TencentDB Agent Memory: 4-Tier Local Pipeline, 61% Token Reduction, +51% Task Success. TencentDB Agent Memory is an MIT-licensed, fully local memory system for AI agents that combines symbolic short-term memory with a 4-tier long-term memory pipeline and has zero external API dependencies. Benchmarks show a 61% token reduction and a +51% task success rate over standard context-window approaches. It works out of the box with OpenClaw and Hermes.
Accio Work: Your Business, On Autopilot
Run your business effortlessly with Accio Work. Our specialized AI agents handle sourcing, supplier deals, store management, and marketing—all automatically. Backed by Alibaba.com’s vast product network, execution is fast and reliable. Skip the complexity—get results instantly while staying in full control of your growth.
🛠️ More Releases/Updates for AI Devs
A. Google: Launched Gemini 3.5 Flash, the first model in its new 3.5 series combining frontier intelligence with agentic action. It surpasses Gemini 3.1 Pro on coding, agentic, and multimodal benchmarks at 4× faster output speeds than competing frontier models. Available now via the Gemini API in Google AI Studio, Antigravity 2.0, Android Studio, and the Gemini app globally. Gemini 3.5 Pro is in internal testing and expected next month.
B. Google: Announced Antigravity 2.0 at Google I/O, its upgraded agent-first development platform. The new version ships with a Managed Agents API — a single API call that spins up a fully orchestrated, stateful agent in an isolated sandbox — plus an Antigravity CLI and native voice support available globally starting today. Also previewed WebMCP, a proposed open web standard that lets you expose JavaScript functions and HTML forms directly to browser-based agents.
C. Cursor: Released Composer 2.5, its latest in-house AI coding model, alongside the Cursor SDK for building custom agents in Python and TypeScript. Composer 2.5 is built on Kimi K2.5, trained on 25× more synthetic coding tasks than its predecessor, and scores 79.8% on SWE-Bench Multilingual — matching Claude Opus 4.7 and GPT-5.5 at roughly 1/10th the token cost. Also crossed $3B annual revenue run rate this week.
D. Anthropic: Launched self-hosted sandboxes (public beta) and MCP tunnels (research preview) inside Claude Managed Agents. Self-hosted sandboxes move tool execution into infrastructure you control (via Cloudflare, Daytona, Modal, or Vercel) while Anthropic continues managing orchestration. MCP tunnels let agents reach private internal databases, APIs, and knowledge bases without exposing them to the public internet: a lightweight gateway makes one outbound connection, encrypted end-to-end, with no inbound firewall rules required.
E. Anthropic: Shipped a batch of Claude Code developer platform updates this week, including: ANTHROPIC_WORKSPACE_ID for workload identity federation, claude agents --cwd <path> to scope session lists to a directory, a "Summarize up to here" rewind option to compress earlier context, and background agents that now preserve the current permission mode on launch. Cache diagnostics launched in public beta to explain prompt cache misses.
F. OpenAI: Rolled out a dual-layer AI content provenance system combining C2PA conformance and Google DeepMind's SynthID invisible watermarking across images generated via ChatGPT, Codex, and the OpenAI API. Also launched a public verification tool in preview at openai.com/verify — anyone can upload an image to check whether C2PA credentials, a SynthID watermark, or both are present. Google Search and Chrome will gain native SynthID detection in the coming months.


