Here is today’s AI Dev Brief from Marktechpost, covering core research, models, infrastructure tools, and applied updates for AI developers and researchers.

Liquid AI Releases LFM2.5-1.2B-Thinking: A 1.2B Parameter Reasoning Model That Fits Under 1 GB On-Device

LFM2.5-1.2B-Thinking is a 1.2 billion parameter reasoning model that runs fully on device in under 1 GB of memory. The model offers a 32,768-token context window and produces explicit thinking traces before final answers, which is useful for agents, tool use, math, and retrieval-augmented generation workflows. It delivers strong results for its size, including 87.96 on MATH 500, 85.60 on GSM8K, and competitive performance with Qwen3 1.7B in thinking mode. A multi-stage pipeline with supervised reasoning traces, preference alignment, and RLVR reduces doom looping from 15.74 percent to 0.36 percent. LFM2.5-1.2B-Thinking runs efficiently on AMD and Qualcomm NPUs and CPUs with runtimes like llama.cpp, FastFlowLM, and NexaML, and is available in GGUF, ONNX, and MLX formats through Hugging Face and partner ecosystems. Read the full analysis/article here.
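
Because the model ships in GGUF and targets llama.cpp-class runtimes, a quick way to try the thinking traces locally is through the llama-cpp-python bindings. The sketch below is a minimal example only; the repository id and quantized filename are assumptions for illustration, so check the actual Hugging Face listing before running it.

```python
# Minimal on-device inference sketch with llama-cpp-python (pip install llama-cpp-python).
# NOTE: the repo id and GGUF filename below are assumed, not confirmed by the article.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="LiquidAI/LFM2.5-1.2B-Thinking-GGUF",  # assumed repository name
    filename="*Q4_K_M.gguf",                       # assumed quant, chosen to stay under ~1 GB
    n_ctx=32768,                                   # matches the advertised context window
)

out = llm.create_chat_completion(
    messages=[{"role": "user",
               "content": "A train travels 120 km in 1.5 hours. What is its average speed?"}],
    max_tokens=512,
)

# The response should contain an explicit thinking trace followed by the final answer.
print(out["choices"][0]["message"]["content"])
```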

Zhipu AI Releases GLM-4.7-Flash: A 30B-A3B MoE Model for Efficient Local Coding and Agents

Z.ai releases GLM-4.7-Flash, a 30B-A3B Mixture of Experts model that targets efficient local deployment for coding and agentic workloads. The model has 31B parameters and a 128K-token context length, and it is trained for English and Chinese chat use cases. Benchmarks in the official card show leading or competitive performance versus Qwen3-30B-A3B-Thinking-2507 and GPT-OSS-20B on AIME 25, GPQA, LiveCodeBench v6, SWE-bench Verified, τ² Bench, and BrowseComp. GLM-4.7-Flash exposes documented evaluation settings and a Preserved Thinking mode for multi-turn agent tasks. Z.ai provides ready configurations for vLLM and SGLang, along with a Transformers reference script. Read the full analysis/article here.
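
For readers who want to start from the Transformers path rather than a vLLM or SGLang deployment, the sketch below shows a generic chat-template generation loop. The model id used here is an assumption for illustration; use the exact repository name from the official card, and note that even with roughly 3B active parameters the full MoE checkpoint still needs substantial GPU memory to load.

```python
# Generic Transformers chat sketch; the repo id below is assumed, not verified.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "zai-org/GLM-4.7-Flash"  # assumed repository name
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

messages = [{"role": "user",
             "content": "Write a Python function that deduplicates a list while preserving order."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```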

Microsoft Research Releases OptiMind: A 20B Parameter Model That Turns Natural Language into Solver-Ready Optimization Models

OptiMind is a 20B parameter Mixture of Experts model that converts natural language optimization problems into mixed integer linear programming (MILP) formulations and runnable GurobiPy code. Built on openai/gpt-oss-20b, OptiMind SFT uses about 3.6B active parameters per token and supports a 128K-token context length, so it can handle long specifications and reasoning traces. It is trained on cleaned OR-Instruct and OptMATH data and evaluated on IndustryOR and Mamo Complex, with a class-based error analysis and hint pipeline for 53 optimization problem types. The framework improves formulation accuracy by 20.7 percent across multiple benchmarks and reaches performance that is competitive with larger proprietary models. Read the full analysis/article here.
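
To make the "natural language to GurobiPy" step concrete, here is the style of MILP code such a pipeline targets, shown for a toy production-planning prompt. This is a hand-written illustrative example, not actual OptiMind output; the variable names and data are invented.

```python
# Illustrative GurobiPy MILP of the kind OptiMind is described as generating.
# Prompt (paraphrased): "Maximize profit from products A and B. A earns 30 per unit and
# needs 2 machine hours; B earns 50 and needs 4 hours. Only 100 machine hours are
# available, and at most 20 units of B can be sold."
import gurobipy as gp
from gurobipy import GRB

m = gp.Model("production_plan")

# Decision variables: integer production quantities for each product.
a = m.addVar(vtype=GRB.INTEGER, lb=0, name="units_A")
b = m.addVar(vtype=GRB.INTEGER, lb=0, name="units_B")

# Objective: maximize total profit.
m.setObjective(30 * a + 50 * b, GRB.MAXIMIZE)

# Constraints translated from the problem statement.
m.addConstr(2 * a + 4 * b <= 100, name="machine_hours")
m.addConstr(b <= 20, name="demand_B")

m.optimize()
if m.status == GRB.OPTIMAL:
    print(f"units_A={a.X:.0f}, units_B={b.X:.0f}, profit={m.objVal:.0f}")
```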

Project Notebooks/Tutorials

▶ [Open Source] Rogue: An Open-Source AI Agent Evaluator worth trying Codes & Examples

▶ A Coding Guide to Anemoi-Style Semi-Centralized Agentic Systems Using Peer-to-Peer Critic Loops in LangGraph Codes Tutorial

▶ A Coding Implementation of a Comprehensive Enterprise AI Benchmarking Framework to Evaluate Rule-Based, LLM, and Hybrid Agentic AI Systems Across Real-World Tasks Codes Tutorial

▶ How to Build Ethically Aligned Autonomous Agents through Value-Guided Reasoning and Self-Correcting Decision-Making Using Open-Source Models Codes Tutorial

▶ How to Build, Train, and Compare Multiple Reinforcement Learning Agents in a Custom Trading Environment Using Stable-Baselines3 Codes Tutorial

▶ How I Built an Intelligent Multi-Agent System with AutoGen, LangChain, and Hugging Face to Demonstrate Practical Agentic AI Workflows Codes Tutorial
