Editor’s Pick

You should not miss this one

[Open Source Voice AI] StepFun AI Releases Step-Audio-EditX: A New Open-Source 3B LLM-Grade Audio Editing Model Excelling at Expressive and Iterative Audio Editing. Step-Audio-EditX is an open-source, 3B-parameter, LLM-based audio model that treats speech as discrete tokens, using the Step-Audio dual-codebook tokenizer together with a diffusion transformer and BigVGANv2 decoder. It performs expressive, iterative editing of emotion, speaking style, and paralinguistics, and also provides robust zero-shot TTS for Chinese, English, Sichuanese, and Cantonese. In experiments on the Step-Audio-Edit-Test benchmark, with Gemini 2.5 Pro as judge, it outperforms MiniMax 2.6 hd and Doubao Seed TTS 2.0 on fine-grained control tasks.

[Open Source] Memori: An Open-Source Memory Engine for LLMs, AI Agents & Multi-Agent Systems Codes & Examples. Memori enables any LLM to remember conversations, learn from interactions, and maintain context across sessions with a single line: memori.enable(). Memory is stored in standard SQL databases (SQLite, PostgreSQL, MySQL) that you fully own and control.
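To illustrate the underlying idea, conversation memory persisted to a plain SQL database the user owns, here is a minimal standard-library sketch. The `ConversationMemory` class and its schema are hypothetical and for illustration only; they are not Memori's actual code or API (Memori itself needs just `memori.enable()`).

```python
import sqlite3

class ConversationMemory:
    """Illustrative sketch: conversation turns stored in SQLite."""

    def __init__(self, db_path=":memory:"):
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS memory ("
            "id INTEGER PRIMARY KEY, session TEXT, role TEXT, content TEXT)"
        )

    def record(self, session, role, content):
        # Persist one conversation turn.
        self.conn.execute(
            "INSERT INTO memory (session, role, content) VALUES (?, ?, ?)",
            (session, role, content),
        )
        self.conn.commit()

    def recall(self, session):
        # Replay prior turns so a later LLM call can be given past context.
        rows = self.conn.execute(
            "SELECT role, content FROM memory WHERE session = ? ORDER BY id",
            (session,),
        )
        return [{"role": r, "content": c} for r, c in rows]

mem = ConversationMemory()
mem.record("s1", "user", "My name is Ada.")
mem.record("s1", "assistant", "Nice to meet you, Ada.")
print(mem.recall("s1"))
```

Because the store is ordinary SQL, the same approach scales from a local SQLite file to PostgreSQL or MySQL without changing the calling code.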

AI Dev and Latest Releases

[Agentic GUI] Gelato-30B-A3B: A State-of-the-Art Grounding Model for GUI Computer-Use Tasks, Surpassing Computer Grounding Models like GTA1-32B. Gelato-30B-A3B is a Qwen3-VL-based mixture-of-experts grounding model, trained on the Click 100k dataset, that maps natural-language instructions and screenshots to precise click coordinates. It achieves 63.88% on ScreenSpot-Pro, 69.15% on OS-World-G, and 74.65% on OS-World-G Refined, outperforming GTA1-32B and larger VLMs such as Qwen3-VL-235B-A22B-Instruct, and is available as an image-to-text model on Hugging Face for easy integration into agent stacks.

[Machine Learning] Google AI Introduces Nested Learning: A New Machine Learning Approach for Continual Learning that Views Models as Nested Optimization Problems to Enhance Long-Context Processing. Nested Learning reframes a neural network as a Neural Learning Module composed of nested optimization problems, in which architecture and optimizer are the same system operating at different update frequencies, each compressing its own context flow into associative memory. Google's HOPE architecture, a self-modifying Titans variant augmented with a Continuum Memory System and deep optimizers, shows that this design yields lower perplexity, higher accuracy, and stronger long-context continual learning than standard Transformers and modern recurrent baselines on public language-modeling and reasoning benchmarks.

[Open and LLM Add-on] Moonshot AI Releases Kosong: The LLM Abstraction Layer that Powers Kimi CLI. Kosong is Moonshot AI's LLM abstraction layer that unifies message structures, asynchronous tool orchestration, and pluggable chat providers for modern agent applications. It exposes a small Python API around generate, step, ChatProvider, Message, and Toolset, and currently ships a production-ready Kimi provider that also powers Kimi CLI under the hood. For AI engineers, it offers a minimal, Apache-2.0-licensed foundation for building tool-using agents without committing to a heavy framework.
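The pattern behind such an abstraction layer can be sketched in a few lines of standard-library Python: application code targets a provider protocol, and concrete backends plug in behind it. The names below echo Kosong's documented concepts (ChatProvider, Message, generate), but the signatures and the `EchoProvider` backend are assumptions for illustration, not Kosong's real API.

```python
from typing import Protocol

# Illustrative sketch of the pluggable-provider pattern; not Kosong's code.
Message = dict  # e.g. {"role": "user", "content": "..."}

class ChatProvider(Protocol):
    def complete(self, messages: list[Message]) -> Message: ...

class EchoProvider:
    # Stand-in backend; a real provider would call a model API here.
    def complete(self, messages: list[Message]) -> Message:
        return {"role": "assistant",
                "content": f"echo: {messages[-1]['content']}"}

def generate(provider: ChatProvider, messages: list[Message]) -> Message:
    # Call sites depend only on the protocol, so swapping the backend
    # (e.g. for a Kimi provider) requires no changes here.
    return provider.complete(messages)

reply = generate(EchoProvider(), [{"role": "user", "content": "hello"}])
print(reply["content"])
```

The value of the layer is exactly this indirection: agents written against the protocol stay provider-agnostic.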

Project Notebooks/Tutorials

▶ [Open Source] Memori: An Open-Source Memory Engine for LLMs, AI Agents & Multi-Agent Systems Codes & Examples

▶ A Coding Implementation to Build Neural Memory Agents with Differentiable Memory, Meta-Learning, and Experience Replay for Continual Adaptation in Dynamic Environments Codes Tutorial

▶ How to Build an Agentic Voice AI Assistant that Understands, Reasons, Plans, and Responds through Autonomous Multi-Step Intelligence Codes Tutorial

▶ Build a Multi-Agent System for Integrated Transcriptomic, Proteomic, and Metabolomic Data Interpretation with Pathway Reasoning Codes Tutorial

▶ A Coding Implementation to Build a Transformer-Based Regression Language Model to Predict Continuous Values from Text Codes Tutorial

How was today’s email?

Awesome | Decent | Not Great
