Here is today’s AI Dev Brief from Marktechpost, covering core research, models, infrastructure tools, and applied updates for AI developers and researchers.
Andrej Karpathy Open-Sources 'Autoresearch': A 630-Line Python Tool Letting AI Agents Run Autonomous ML Experiments on Single GPUs
Andrej Karpathy has open-sourced autoresearch, a minimalist ~630-line Python framework that turns AI agents into autonomous ML researchers. Built on a stripped-down nanochat core for single-GPU use, the tool lets agents iterate on training code in five-minute sprints, committing only changes that lower validation bits-per-byte (BPB). Early results are already being reported: Shopify CEO Tobi Lutke said in a tweet that he used the loop to boost model performance by 19%, suggesting that smaller, agent-optimized models can outpace larger ones when left to relentlessly refine hyperparameters and architecture. It is essentially ‘grad student descent’ as a service, shifting the engineer's role from manual tuning to designing the ideal research prompt.
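
For readers who want a feel for the keep-only-improvements loop described above, here is a minimal sketch. It is not autoresearch's actual code: `propose`, `evaluate`, `research_loop`, and the toy objective are hypothetical stand-ins for the agent's code edit and the five-minute training run. The only standard piece is `bits_per_byte`, which converts a summed negative log-likelihood in nats into bits per byte.

```python
import math
import random


def bits_per_byte(total_nll_nats: float, total_bytes: int) -> float:
    """Convert a summed negative log-likelihood (in nats) to bits per byte."""
    return total_nll_nats / (math.log(2) * total_bytes)


def propose(config: dict, rng: random.Random) -> dict:
    """Hypothetical stand-in for the agent's edit: rescale one hyperparameter."""
    candidate = dict(config)
    key = rng.choice(sorted(candidate))
    candidate[key] = candidate[key] * rng.choice([0.5, 0.8, 1.25, 2.0])
    return candidate


def evaluate(config: dict) -> float:
    """Toy stand-in for a fixed-budget training run; returns a fake val BPB.

    The bowl-shaped objective (best near lr=3e-4, width=512) is an assumption
    made only so the sketch runs without a GPU.
    """
    lr_term = math.log10(config["lr"] / 3e-4) ** 2
    width_term = 0.1 * math.log2(config["width"] / 512) ** 2
    return 1.0 + lr_term + width_term


def research_loop(config: dict, steps: int, seed: int = 0):
    """Greedy loop: keep a candidate only if it strictly lowers val BPB."""
    rng = random.Random(seed)
    best_bpb = evaluate(config)
    history = [best_bpb]
    for _ in range(steps):
        candidate = propose(config, rng)
        bpb = evaluate(candidate)
        if bpb < best_bpb:  # "commit" the change; otherwise discard it
            config, best_bpb = candidate, bpb
        history.append(best_bpb)
    return config, best_bpb, history


if __name__ == "__main__":
    start = {"lr": 1e-2, "width": 128.0}
    final_config, final_bpb, history = research_loop(start, steps=50)
    print(f"start BPB {history[0]:.3f} -> best BPB {final_bpb:.3f}")
    print("best config:", final_config)
```

Because rejected candidates are thrown away, the best score can never get worse from one step to the next. In a real setup the `evaluate` call would be the expensive part, which is why a short, fixed time budget per run matters.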

Microsoft Releases Phi-4-Reasoning-Vision-15B: A Compact Multimodal Model for Math, Science, and GUI Understanding
Microsoft’s Phi-4-reasoning-vision-15B is a 15B open-weight multimodal reasoning model that combines Phi-4-Reasoning with SigLIP-2 in a mid-fusion architecture to handle image-and-text tasks with lower compute requirements than much larger vision-language models. The Microsoft team trained it on 200B multimodal tokens and designed it around two practical ideas: preserve high-resolution visual detail for dense documents and interfaces, and use a mixed reasoning setup so the model can switch between direct responses and explicit reasoning when needed. The result is a compact model aimed at math, science, document understanding, OCR, and GUI grounding, with strong reported results on benchmarks such as AI2D (test), ChartQA (test), MathVista (mini), OCRBench, and ScreenSpot-v2.

Latest Releases in the Last 72 Hours
Fractals (TinyAGI)
Exa Deep (Exa)
Paperclip (Paperclip AI)
Google Workspace CLI (Google)
Symphony (OpenAI)
Vulnerability Fixer (OpenHands AI)
Project Notebooks/Tutorials
▶ Building Next-Gen Agentic AI: A Complete Framework for Cognitive Blueprint Driven Runtime Agents with Memory Tools and Validation
▶ How to Design an Advanced Tree-of-Thoughts Multi-Branch Reasoning Agent with Beam Search, Heuristic Scoring, and Depth-Limited Pruning
▶ How to Build an EverMem-Style Persistent AI Agent OS with Hierarchical Memory, FAISS Vector Retrieval, SQLite Storage, and Automated Memory Consolidation
▶ How to Design a Production-Grade Multi-Agent Communication System Using LangGraph Structured Message Bus, ACP Logging, and Persistent Shared State Architecture
▶ A Coding Implementation to Build a Hierarchical Planner AI Agent Using Open-Source LLMs with Tool Execution and Structured Multi-Agent Reasoning