Andrej Karpathy Open-Sources 'Autoresearch' | Microsoft Releases Phi-4-Reasoning-Vision-15B

Here is today’s AI Dev Brief from Marktechpost, covering core research, models, infrastructure tools, and applied updates for AI developers and researchers.

Andrej Karpathy Open-Sources 'Autoresearch': A 630-Line Python Tool Letting AI Agents Run Autonomous ML Experiments on Single GPUs

Andrej Karpathy has open-sourced autoresearch, a minimalist, roughly 630-line Python framework that turns AI agents into autonomous ML researchers. It strips the nanochat core down for single-GPU use and lets agents iterate on the training code in five-minute sprints, committing only the changes that lower the validation bits-per-byte (BPB) score. Early results are already being reported: Shopify CEO Tobi Lutke said in a tweet that he used the loop to boost model performance by 19%, which suggests that smaller, agent-optimized models can outpace larger ones when an agent is left to relentlessly refine hyperparameters and architecture. It is essentially ‘grad student descent’ as a service, shifting the engineer's role from manual tuning to designing the ideal research prompt… Read the full analysis/article here.

Check out the Repo here
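
The commit-only-if-BPB-improves loop described above can be sketched in a few lines. This is a minimal illustration, not code from the autoresearch repo: `run_sprint` and `propose` are hypothetical stand-ins for a time-boxed training run and for the agent's edit, and the toy objective exists only so the example runs instantly. The `bits_per_byte` helper uses the standard conversion from summed cross-entropy in nats to bits per byte.

```python
import math
import random


def bits_per_byte(total_nats: float, total_bytes: int) -> float:
    """Convert cross-entropy summed over a text (in nats) to bits per byte."""
    return total_nats / (math.log(2) * total_bytes)


def run_sprint(config: dict) -> float:
    """Hypothetical stand-in for one time-boxed training run.

    Returns a validation BPB. Here it is a toy quadratic in the learning
    rate, NOT a real training job.
    """
    return 1.0 + 100.0 * (config["lr"] - 0.03) ** 2


def propose(config: dict, rng: random.Random) -> dict:
    """Hypothetical stand-in for the agent's edit: perturb the learning rate."""
    return {**config, "lr": config["lr"] * rng.uniform(0.5, 1.5)}


def greedy_loop(config: dict, steps: int, seed: int = 0):
    """Keep a candidate only when it lowers validation BPB (lower is better)."""
    rng = random.Random(seed)
    best_bpb = run_sprint(config)
    history = [best_bpb]
    for _ in range(steps):
        candidate = propose(config, rng)
        bpb = run_sprint(candidate)
        if bpb < best_bpb:
            # "Commit" the improvement; anything else is discarded.
            config, best_bpb = candidate, bpb
        history.append(best_bpb)
    return config, best_bpb, history


if __name__ == "__main__":
    final_config, final_bpb, _ = greedy_loop({"lr": 0.1}, steps=50)
    print(final_config, final_bpb)
```

Because rejected candidates never replace the current configuration, the best BPB can only stay flat or fall from one sprint to the next, which is the property the five-minute loop relies on.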

Microsoft Releases Phi-4-Reasoning-Vision-15B: A Compact Multimodal Model for Math, Science, and GUI Understanding

Microsoft’s Phi-4-reasoning-vision-15B is a 15B open-weight multimodal reasoning model that combines Phi-4-Reasoning with SigLIP-2 in a mid-fusion architecture to handle image-and-text tasks with lower compute requirements than much larger vision-language models. The Microsoft team trained it on 200B multimodal tokens and designed it around two practical ideas: preserve high-resolution visual detail for dense documents and interfaces, and use a mixed reasoning setup so the model can switch between direct responses and explicit reasoning when needed. The result is a compact model aimed at math, science, document understanding, OCR, and GUI grounding, with strong reported results on benchmarks such as AI2D_TEST, ChartQA_TEST, MathVista_MINI, OCRBench, and ScreenSpot-v2… Read the full analysis/article here.

Check out the Paper here

Latest Releases in Last 72 Hours

Fractals (TinyAGI)
Exa Deep (Exa)
Paperclip (Paperclip AI)
Google Workspace CLI (Google)
Symphony (OpenAI)
Warper
Vulnerability Fixer (OpenHands AI)
and many more…

Project Notebooks/Tutorials

▶ Building Next-Gen Agentic AI: A Complete Framework for Cognitive Blueprint Driven Runtime Agents with Memory Tools and Validation (Codes | Tutorial)

▶ How to Design an Advanced Tree-of-Thoughts Multi-Branch Reasoning Agent with Beam Search, Heuristic Scoring, and Depth-Limited Pruning (Codes | Tutorial)

▶ How to Build an EverMem-Style Persistent AI Agent OS with Hierarchical Memory, FAISS Vector Retrieval, SQLite Storage, and Automated Memory Consolidation (Codes | Tutorial)

▶ How to Design a Production-Grade Multi-Agent Communication System Using LangGraph Structured Message Bus, ACP Logging, and Persistent Shared State Architecture (Codes | Tutorial)

▶ A Coding Implementation to Build a Hierarchical Planner AI Agent Using Open-Source LLMs with Tool Execution and Structured Multi-Agent Reasoning (Codes | Tutorial)

150+ more open codes/notebooks here ➡️
