NVIDIA Releases Nemotron-Terminal | Google AI Introduces Gemini Embedding 2

Here is today's AI Dev Brief from Marktechpost, covering core research, models, infrastructure tools, and applied updates for AI developers and researchers.

NVIDIA AI Releases Nemotron-Terminal: A Systematic Data Engineering Pipeline for Scaling LLM Terminal Agents

NVIDIA has introduced Terminal-Task-Gen and the Terminal-Corpus dataset to address the data scarcity that holds back the development of autonomous terminal agents. Its "coarse-to-fine" strategy adapts existing math, code, and software engineering benchmarks, then synthesizes novel tasks from a structured taxonomy of primitive skills. The resulting data was used to train the Nemotron-Terminal model family. The 32B variant achieved a 27.4% success rate on the Terminal-Bench 2.0 evaluation, significantly outperforming much larger models like the 480B Qwen3-Coder. The research suggests that high-quality data engineering (specifically, pre-built domain Docker images and the inclusion of unsuccessful trajectories to teach error recovery) matters more for terminal proficiency than sheer parameter scale. Read the full analysis/article here.

Check out the Paper here

Google AI Introduces Gemini Embedding 2: A Multimodal Embedding Model that Lets You Bring Text, Images, Video, Audio, and Docs into the Embedding Space

Google AI has released Gemini Embedding 2, a natively multimodal model that maps text, images, video, audio, and PDFs into a single latent space for more accurate and efficient Retrieval-Augmented Generation (RAG). The model's standout feature is Matryoshka Representation Learning (MRL), which allows developers to truncate the default 3,072-dimension vectors down to 1,536 or 768 dimensions with minimal accuracy loss, significantly reducing vector database storage costs and search latency. With an expanded 8,192-token context window and high scores on the MTEB benchmark, it offers a unified, production-ready option for developers building scalable, cross-modal semantic search systems without managing separate embedding pipelines for different media types. Read the full analysis/article here.

Check out the details here
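
For readers who want to see what the Matryoshka truncation described above looks like in practice, here is a minimal Python sketch. It does not call the Gemini API: the input vector is synthetic, and the helper names (truncate_embedding, cosine_similarity, storage_bytes) are hypothetical, introduced only for illustration. Rescaling the truncated vector to unit length is a common practice with MRL-style embeddings and is assumed here, not confirmed for this model. The storage estimate assumes 4-byte float32 values and ignores index overhead.

```python
import math
import random
from typing import List, Sequence

FULL_DIM = 3072  # default vector size quoted in the brief


def truncate_embedding(vec: Sequence[float], dim: int) -> List[float]:
    """Keep the first `dim` components, then rescale to unit length.

    Assumption: renormalizing after truncation is appropriate for this model.
    """
    if not 0 < dim <= len(vec):
        raise ValueError(f"dim must be between 1 and {len(vec)}, got {dim}")
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # an all-zero prefix cannot be rescaled
    return [x / norm for x in head]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def storage_bytes(num_vectors: int, dim: int, bytes_per_value: int = 4) -> int:
    """Raw vector storage, assuming float32 values and no index overhead."""
    return num_vectors * dim * bytes_per_value


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic stand-ins for real embeddings (NOT model output).
    doc = [rng.gauss(0.0, 1.0) for _ in range(FULL_DIM)]
    query = [d + rng.gauss(0.0, 0.5) for d in doc]

    for dim in (3072, 1536, 768):
        sim = cosine_similarity(
            truncate_embedding(doc, dim), truncate_embedding(query, dim)
        )
        size_gb = storage_bytes(1_000_000, dim) / 1e9
        print(f"dim={dim}: similarity={sim:.3f}, 1M vectors = {size_gb:.3f} GB")
```

Under the float32 assumption, one million vectors take 12.288 GB at 3,072 dimensions and 3.072 GB at 768, a 4x reduction in raw storage. How much retrieval accuracy survives the truncation depends on the model and the data, so it should be measured on a real evaluation set instead of inferred from synthetic vectors like these.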

Latest Releases in Last 72 Hours

WorkBuddy (Tencent)
llmock (Copilotkit)
Mobile-Agent (Tongyi Lab)
Copilot Cowork (Microsoft)
Code Review (Anthropic)
Chartli
CLI-Anything (HKUDS)
Fractals (TinyAGI)
Exa Deep (Exa)
and many more…

Project Notebooks/Tutorials

▶ How to Build a Self-Designing Meta-Agent That Automatically Constructs, Instantiates, and Refines Task-Specific AI Agents (Codes | Tutorial)

▶ How to Build a Risk-Aware AI Agent with Internal Critic, Self-Consistency Reasoning, and Uncertainty Estimation for Reliable Decision-Making (Codes | Tutorial)

▶ Building Next-Gen Agentic AI: A Complete Framework for Cognitive Blueprint Driven Runtime Agents with Memory Tools and Validation (Codes | Tutorial)

▶ How to Design an Advanced Tree-of-Thoughts Multi-Branch Reasoning Agent with Beam Search, Heuristic Scoring, and Depth-Limited Pruning (Codes | Tutorial)

150+ more open codes/notebooks here ➡️
