Here is today's AI Dev Brief from Marktechpost, covering core research, models, infrastructure tools, and applied updates for AI developers and researchers.
UCSD and Together AI Research Introduces Parcae: A Stable Architecture for Looped Language Models That Achieves the Quality of a Transformer Twice the Size
Parcae is a stable looped transformer that addresses residual state explosion in looped architectures by recasting the forward pass as a dynamical system and constraining the spectral norm of the state transition matrix via negative diagonal parameterization and zero-order hold (ZOH) discretization. Combined with a prelude normalization fix and a corrected per-sequence depth sampling algorithm, Parcae reduces validation perplexity by up to 6.3% over prior looped models and outperforms parameter-matched standard Transformers at every scale tested; a 770M Parcae model matches the downstream quality of a 1.3B Transformer. The research also establishes the first scaling laws for layer looping. It shows that compute-optimal training scales mean recurrence and data in tandem following power laws, that test-time looping saturates at a ceiling set by training depth, and that both behaviors unify into a single predictive law with under 1.31% average error on held-out models.
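The stability idea summarized above can be illustrated with a small sketch. It assumes a diagonal continuous-time state matrix A = -exp(a), which is one possible negative parameterization, and a positive step size delta. Under zero-order hold the discrete transition is exp(delta * A), whose entries lie strictly between 0 and 1, so the state cannot grow without bound across loop iterations. The function names, the exp-based parameterization, the unit input matrix, and the scalar recurrence are all illustrative assumptions, not Parcae's actual code.

```python
import math

def zoh_discretize(raw_a, delta):
    """Discretize a diagonal system h' = A h + B x with zero-order hold.

    Assumed parameterization: A = -exp(raw_a), negative for any real raw_a.
    B is assumed to be 1 for every channel. Returns (a_bar, b_bar) lists.
    """
    a_bar, b_bar = [], []
    for r in raw_a:
        a = -math.exp(r)              # strictly negative diagonal entry
        ab = math.exp(delta * a)      # ZOH: A_bar = exp(delta * A), in (0, 1)
        a_bar.append(ab)
        b_bar.append((ab - 1.0) / a)  # ZOH: B_bar = (A_bar - 1) / A, with B = 1
    return a_bar, b_bar

def run_loop(raw_a, delta, x, num_loops):
    """Apply h <- A_bar * h + B_bar * x for num_loops iterations."""
    a_bar, b_bar = zoh_discretize(raw_a, delta)
    h = [0.0] * len(x)
    for _ in range(num_loops):
        h = [ab * hi + bb * xi for ab, bb, hi, xi in zip(a_bar, b_bar, h, x)]
    return h

# Hypothetical values chosen only to exercise the recurrence.
raw_a = [-2.0, 0.0, 3.0]
a_bar, _ = zoh_discretize(raw_a, delta=0.5)
state = run_loop(raw_a, delta=0.5, x=[1.0, -1.0, 2.0], num_loops=1000)
```

Because every entry of a_bar is below 1 in magnitude, the looped state settles toward a bounded fixed point instead of exploding, however many times the loop runs.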
TinyFish just shipped four products under one API key: Web Search, Web Fetch, Web Browser, and Web Agent
TinyFish launched a four-product web infrastructure platform for AI agents (Web Search, Web Fetch, Web Browser, and Web Agent), all under one API key and credit system. Web Search returns structured JSON at ~488ms P50 (competitors average 2,800ms+). Web Fetch renders full pages in a real browser and strips irrelevant markup before returning clean Markdown or JSON. Web Browser provides managed stealth Chrome sessions via CDP with sub-250ms cold start and 28 C++-level anti-bot mechanisms. Web Agent ranks #1 on Mind2Web with 89.9% accuracy across 300 tasks. All four endpoints are accessible via CLI (npm install -g @tiny-fish/cli) with an Agent Skill that teaches coding agents like Claude Code, Cursor, and Codex to use every endpoint automatically, with no manual integration. CLI operations consume ~100 tokens per task versus ~1,500 over MCP, write output to the filesystem instead of the context window, and deliver 2× higher task completion on complex multi-step workflows. 500 free steps are available at tinyfish.ai, no credit card required.
Promoted
Qwen Team Open-Sources Qwen3.6-35B-A3B: A Sparse MoE Vision-Language Model with 3B Active Parameters and Agentic Coding Capabilities
The Qwen team has open-sourced Qwen3.6-35B-A3B under Apache 2.0. It is a sparse Mixture of Experts model with 35B total parameters but only 3B activated at inference, making it significantly cheaper to run than its size suggests. The architecture pairs Gated DeltaNet linear attention with Grouped Query Attention (16Q / 2KV) across 40 layers, with 256 MoE experts per layer (8 routed + 1 shared per token) and a native context of 262,144 tokens extensible to ~1M via YaRN. Its strongest area is agentic coding: 73.4 on SWE-bench Verified, 51.5 on Terminal-Bench 2.0 (highest among all compared models), and 1,397 on QwenWebBench. It also posts 92.7 on AIME 2026, 86.0 on GPQA Diamond, and 81.7 on MMMU on the multimodal side. A new Thinking Preservation feature lets agents retain and reuse reasoning traces across turns via preserve_thinking: true, reducing redundant computation and improving KV cache efficiency.
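The "8 routed + 1 shared per token" pattern described above can be sketched as a toy routing step. This is a minimal illustration under stated assumptions, not Qwen's implementation: the softmax over the selected logits, the scalar stand-in experts, and every function name here are hypothetical, and the sketch treats the 256 experts as the routed pool with the shared expert held separately.

```python
import math
import random

def route_token(router_logits, top_k=8):
    """Pick the top_k experts for one token and softmax-normalize their weights."""
    order = sorted(range(len(router_logits)),
                   key=lambda i: router_logits[i], reverse=True)
    top = order[:top_k]
    peak = max(router_logits[i] for i in top)   # subtract max for numerical stability
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

def moe_layer(x, router_logits, routed_experts, shared_expert, top_k=8):
    """Add the always-on shared expert to the weighted sum of routed experts."""
    out = shared_expert(x)
    for idx, weight in route_token(router_logits, top_k):
        out += weight * routed_experts[idx](x)
    return out

# Toy setup: scalar "experts" stand in for feed-forward blocks (assumption).
random.seed(0)
NUM_EXPERTS = 256
experts = [lambda x, s=i: x * (1.0 + s / NUM_EXPERTS) for i in range(NUM_EXPERTS)]
shared = lambda x: 0.5 * x
logits = [random.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]

chosen = route_token(logits, top_k=8)
y = moe_layer(2.0, logits, experts, shared)
```

Only the eight selected experts and the shared expert run for a given token, which is why the active parameter count stays far below the total.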
Project Notebooks/Tutorials
▶ A Coding Implementation to Build Multi-Agent AI Systems with SmolAgents Using Code Execution, Tool Calling, and Dynamic Orchestration
▶ How to Build a Universal Long-Term Memory Layer for AI Agents Using Mem0 and OpenAI
▶ A Coding Implementation of Crawl4AI for Web Crawling, Markdown Generation, JavaScript Execution, and LLM-Based Structured Extraction
▶ Google ADK Multi-Agent Pipeline Tutorial: Data Loading, Statistical Testing, Visualization, and Report Generation in Python
▶ A Hands-On Coding Tutorial for Microsoft VibeVoice Covering Speaker-Aware ASR, Real-Time TTS, and Speech-to-Speech Pipelines
▶ An Implementation Guide to Building a DuckDB-Python Analytics Pipeline with SQL, DataFrames, Parquet, UDFs, and Performance Profiling
▶ How to Build a Secure Local-First Agent Runtime with OpenClaw Gateway, Skills, and Controlled Tool Execution
▶ How to Deploy Open WebUI with Secure OpenAI API Integration, Public Tunneling, and Browser-Based Chat Access