Here is today’s AI Dev Brief from Marktechpost, covering core research, models, infrastructure tools, and applied updates for AI developers and researchers.

StepFun Releases Step DeepResearch, A 32B Atomic Capability Agent For Long Horizon Research

StepFun has introduced Step DeepResearch, a 32B parameter deep research agent built on Qwen2.5 32B Base that targets long-horizon research tasks rather than short fact lookup. The system internalizes four atomic capabilities (planning, deep information seeking, reflection and verification, and professional report generation), each trained with its own dedicated data pipeline. A three-stage training pipeline of mid-training, supervised fine-tuning, and reinforcement learning scales context to 128k tokens and optimizes behavior with a rubric-based judge. At inference time, a single ReAct-style agent drives batch web search, todo, shell, and file tools, backed by a Search API grounded in more than 20M papers and 600 premium indices plus curated trusted domains. Step DeepResearch reaches 61.42 percent on Scale Research Rubrics and a 67.1 percent win-or-tie rate on ADR Bench.… Read the full analysis/article here.
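For readers unfamiliar with the pattern, here is a minimal sketch of a single ReAct-style loop (thought, action, observation, repeat) of the kind the summary describes. It is not StepFun's implementation. The tool functions are stubs, only two of the four tool families are shown, and `scripted_policy` is a hypothetical stand-in for the model. All names are illustrative assumptions.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stubs for two of the tool families named in the brief.
def batch_web_search(queries: str) -> str:
    return f"[stub results for: {queries}]"

def todo(item: str) -> str:
    return f"[todo recorded: {item}]"

TOOLS: Dict[str, Callable[[str], str]] = {
    "batch_web_search": batch_web_search,
    "todo": todo,
}

def scripted_policy(history: List[str]) -> Tuple[str, str, str]:
    """Stand-in for the model: returns (thought, action, argument)."""
    n_obs = sum(1 for h in history if h.startswith("Observation:"))
    if n_obs == 0:
        return ("Plan the work first.", "todo", "outline report sections")
    if n_obs == 1:
        return ("Gather sources.", "batch_web_search", "topic overview; recent surveys")
    return ("Enough evidence collected.", "finish", "draft report based on 2 observations")

def react_loop(
    task: str,
    policy: Callable[[List[str]], Tuple[str, str, str]],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 8,
) -> Tuple[str, List[str]]:
    """Alternate thought -> action -> observation until the policy finishes."""
    history = [f"Task: {task}"]
    for _ in range(max_steps):
        thought, action, arg = policy(history)
        history.append(f"Thought: {thought}")
        if action == "finish":
            history.append(f"Final: {arg}")
            return arg, history
        # Unknown tool names become an observation instead of raising an error.
        if action in tools:
            observation = tools[action](arg)
        else:
            observation = f"unknown tool: {action}"
        history.append(f"Action: {action}({arg})")
        history.append(f"Observation: {observation}")
    return "", history  # step budget exhausted without a final answer

if __name__ == "__main__":
    answer, trace = react_loop("Survey a research topic", scripted_policy, TOOLS)
    print("\n".join(trace))
```

The step cap is a design choice for the sketch: a long-horizon agent needs some budget so a policy that never emits `finish` cannot loop forever.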

Qwen Researchers Release Qwen3-TTS: an Open Multilingual TTS Suite with Real-Time Latency and Fine-Grained Voice Control

Qwen researchers from Alibaba Cloud have released Qwen3-TTS, an Apache 2.0 multilingual text-to-speech suite for production use. The stack includes 0.6B and 1.7B models that cover 3-second voice cloning, preset CustomVoice speakers, and VoiceDesign for creating new voices from natural language descriptions. All models use a 12Hz discrete speech tokenizer with 16 codebooks, which enables low-bitrate streaming and real-time synthesis. Reported first-packet latency is about 100 ms on a single GPU, with around 320 ms of audio per packet. Qwen3-TTS is trained on more than 5 million hours of speech across 10 languages and uses a multi-stage alignment pipeline with DPO, GSPO, and speaker tuning. Benchmarks show low word error rate, strong speaker similarity, and state-of-the-art English zero-shot cloning on Seed TTS among evaluated systems.… Read the full analysis/article here.
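To put the tokenizer figures in perspective, the short sketch below works out what a 12Hz, 16-codebook tokenizer implies in tokens per second and frames per 320 ms packet. The frame rate, codebook count, and packet length come from the summary above. The codebook size is not stated there, so `ASSUMED_CODEBOOK_SIZE` is a placeholder assumption, and the bitrate figure is only as good as that guess.

```python
import math

# Figures quoted in the brief.
FRAME_RATE_HZ = 12      # tokenizer frames per second
NUM_CODEBOOKS = 16      # codebooks (tokens) per frame
PACKET_AUDIO_MS = 320   # audio carried by each streamed packet

# Assumption, NOT from the brief: entries per codebook (bitrate estimate only).
ASSUMED_CODEBOOK_SIZE = 2048

def tokens_per_second(frame_rate_hz: float, num_codebooks: int) -> float:
    """Discrete tokens emitted for each second of audio."""
    return frame_rate_hz * num_codebooks

def bitrate_bps(frame_rate_hz: float, num_codebooks: int, codebook_size: int) -> float:
    """Bits per second if every token costs log2(codebook_size) bits."""
    return frame_rate_hz * num_codebooks * math.log2(codebook_size)

def frames_per_packet(frame_rate_hz: float, packet_audio_ms: float) -> float:
    """Tokenizer frames needed to cover one packet of audio."""
    return frame_rate_hz * packet_audio_ms / 1000

if __name__ == "__main__":
    print("tokens/s:", tokens_per_second(FRAME_RATE_HZ, NUM_CODEBOOKS))
    print("frames/packet:", frames_per_packet(FRAME_RATE_HZ, PACKET_AUDIO_MS))
    print("bitrate (bps, assumed codebook size):",
          bitrate_bps(FRAME_RATE_HZ, NUM_CODEBOOKS, ASSUMED_CODEBOOK_SIZE))
```

With the quoted numbers this gives 192 tokens per second of audio and roughly four tokenizer frames per packet, which is why a low frame rate helps streaming: each packet needs only a handful of decoding steps.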

Project Notebooks/Tutorials

▶ [Open Source] Rogue: An Open-Source AI Agent Evaluator worth trying Codes & Examples

▶ A Coding Implementation to Automating LLM Quality Assurance with DeepEval, Custom Retrievers, and LLM-as-a-Judge Metrics Codes Tutorial

▶ How an AI Agent Chooses What to Do Under Tokens, Latency, and Tool-Call Budget Constraints? Codes Tutorial

▶ How Machine Learning and Semantic Embeddings Reorder CVE Vulnerabilities Beyond Raw CVSS Scores Codes Tutorial
