Here is today's AI Dev Brief from Marktechpost, covering core research, models, infrastructure tools, and applied updates for AI developers and researchers.
Robbyant Open Sources LingBot World: a Real Time World Model for Interactive Simulation and Embodied AI
LingBot World, released by Robbyant from Ant Group, is an action-conditioned world model that turns text and control inputs into long-horizon, interactive video simulations for embodied agents, driving, and games. Built on a 28B-parameter mixture-of-experts diffusion transformer initialized from Wan2.2, it learns dynamics from a unified data engine that combines web videos, game logs with actions, and Unreal Engine trajectories, with hierarchical captions that separate static layout from motion. Actions enter the model through camera embeddings and adaptive keyboard adapters, which are fine-tuned while the visual backbone stays frozen. A distilled variant, LingBot World Fast, uses block causal attention and diffusion forcing to reach about 16 frames per second at 480p on a single GPU node with under 1 second of latency, and achieves leading VBench scores with strong emergent memory and structural consistency… Read the full analysis/article here.
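To make "block causal attention" concrete, here is a minimal sketch of the general masking idea: frames are grouped into fixed-size blocks, each frame may attend to every frame in its own block and in earlier blocks, and never to later blocks. The function name `block_causal_mask` and the `block_size` parameter are hypothetical; the summary above does not give LingBot World Fast's actual block size or attention layout.

```python
def block_causal_mask(num_frames, block_size):
    """Build a boolean mask where mask[i][j] is True if frame i may attend to frame j.

    Illustrative only: block_size is an assumed parameter, not a value
    reported for LingBot World Fast.
    """
    if num_frames < 0 or block_size <= 0:
        raise ValueError("num_frames must be >= 0 and block_size must be > 0")
    mask = []
    for i in range(num_frames):
        query_block = i // block_size
        # A key frame is visible when its block is the same as, or earlier
        # than, the query frame's block.
        row = [(j // block_size) <= query_block for j in range(num_frames)]
        mask.append(row)
    return mask


if __name__ == "__main__":
    # Print a small mask: 1 means "may attend", 0 means "masked out".
    for row in block_causal_mask(num_frames=6, block_size=2):
        print("".join("1" if allowed else "0" for allowed in row))
```

Attention stays bidirectional inside a block and causal across blocks, which is what lets a model generate one block of frames at a time while reusing earlier blocks as context.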
DeepSeek AI Releases DeepSeek-OCR 2 with Causal Visual Flow Encoder for Layout Aware Document Understanding
DeepSeek-OCR 2 is an open source document OCR and understanding system that replaces a CLIP ViT-style encoder with DeepEncoder V2, a Qwen2 0.5B-based transformer that converts 2D pages into causal visual sequences aligned with a learned reading order. An 80M-parameter SAM backbone with multi-crop global and local views keeps the visual token budget between 256 and 1120 tokens per page while preserving layout information. The model is trained in 3 stages: encoder pretraining, joint query enhancement with DeepSeek 3B A500M, and decoder-only finetuning on an OCR-heavy mixture that emphasizes text, formulas, and tables. On OmniDocBench v1.5, DeepSeek-OCR 2 reaches 91.09 overall, improves reading order and element-level edit distances over both DeepSeek-OCR and Gemini 3 Pro, reduces repetition in production logs… Read the full analysis/article here.
AI2 Releases SERA, Soft Verified Coding Agents Built with Supervised Training Only for Practical Repository Level Automation Workflows
AI2’s SERA, Soft Verified Efficient Repository Agents, shows that coding agents don't need reinforcement learning pipelines or test suites to work well. The flagship SERA 32B model fine-tunes Qwen 3 32B on 25,000 synthetic trajectories from a GLM 4.6 teacher using a method called Soft Verified Generation, which compares two rollouts of the same change and scores the line overlap between their patches as a soft correctness signal. Trained on data from 121 Python repositories, SERA 32B reaches 49.5 percent on SWE-bench Verified at 32K context and 54.2 percent at 64K, while the full pipeline costs about 40 GPU days. With about 8,000 trajectories per repo, SERA can also specialize to projects like Django and SymPy and match or beat larger teacher models… Read the full analysis/article here.
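As a rough illustration of scoring line overlap between two patches, the sketch below collects the added and removed lines from two unified diffs and reports the fraction of the first patch's changed lines that also appear in the second. The helper names `changed_lines` and `line_overlap` and the exact scoring formula are assumptions made for illustration; the summary above does not specify how SERA computes its score.

```python
def changed_lines(patch):
    """Collect the added/removed lines of a unified diff as a set of strings.

    Simplification: any line starting with '+++' or '---' is treated as a
    file header, so a changed line whose content begins with '--' or '++'
    would be skipped.
    """
    lines = set()
    for line in patch.splitlines():
        if line.startswith(("+++", "---")):
            continue  # file header, not a changed line
        if line.startswith(("+", "-")):
            # Keep the +/- sign, ignore surrounding whitespace in the content.
            normalized = line[0] + line[1:].strip()
            if len(normalized) > 1:  # skip blank added/removed lines
                lines.add(normalized)
    return lines


def line_overlap(patch_a, patch_b):
    """Fraction of patch_a's changed lines that also occur in patch_b (0.0 to 1.0)."""
    a = changed_lines(patch_a)
    b = changed_lines(patch_b)
    if not a:
        return 0.0
    return len(a & b) / len(a)


if __name__ == "__main__":
    first = "--- a/f.py\n+++ b/f.py\n@@ -1,2 +1,2 @@\n-x = 1\n+x = 2\n y = 3\n"
    second = "--- a/f.py\n+++ b/f.py\n@@ -1,2 +1,3 @@\n-x = 1\n+x = 2\n+z = 4\n y = 3\n"
    print(line_overlap(first, second))
    print(line_overlap(second, first))
```

A score near 1.0 means the two rollouts made essentially the same edits, which can stand in for a passing test when no test suite is available; a pipeline could keep only trajectories above a chosen threshold.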
Latest Releases in Last 72 Hours
Project Notebooks/Tutorials
▶ [Open Source] Rogue: An Open-Source AI Agent Evaluator Worth Trying (Codes & Examples)
▶ A Coding Implementation to Training, Optimizing, Evaluating, and Interpreting Knowledge Graph Embeddings with PyKEEN (Codes, Tutorial)
▶ How to Build a Robust Multi-Agent Pipeline Using CAMEL with Planning, Web-Augmented Reasoning, Critique, and Persistent Memory (Codes, Tutorial)
▶ How to Build Contract-First Agentic Decision Systems with PydanticAI for Risk-Aware, Policy-Compliant Enterprise AI (Codes, Tutorial)
▶ How to Build Production-Grade Agentic Workflows with GraphBit Using Deterministic Tools, Validated Execution Graphs, and Optional LLM Orchestration (Codes, Tutorial)