MIT Researchers Enhanced Artificial Intelligence (AI) 64x Better at Planning, Achieving 94% Accuracy and more...

AI Dev and Latest Releases

[Voice AI + Open Source] Xiaomi Released MiMo-Audio, a 7B Speech Language Model Trained on 100M+ Hours with High-Fidelity Discrete Tokens. Xiaomi’s MiMo-Audio is a 7B audio-language model trained on over 100M hours of speech using a high-fidelity RVQ tokenizer and a patchified encoder–decoder architecture that reduces 25 Hz streams to 6.25 Hz for efficient modeling. Unlike traditional pipelines, it relies on a unified next-token objective across interleaved text and audio, enabling emergent few-shot skills such as speech continuation, voice conversion, emotion transfer, and speech translation once scale thresholds are crossed. Benchmarks show state-of-the-art performance on SpeechMMLU and MMAU with minimal modality gap, and Xiaomi has released the tokenizer, checkpoints, evaluation suite, and public demos for open research use.
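The rate reduction quoted above works out to four tokenizer frames per patch (25 / 6.25 = 4). The snippet below is only a sketch of that arithmetic: `patchify` is a hypothetical helper that groups frames by plain slicing, whereas the real MiMo-Audio patch encoder is a learned module that this does not reproduce.

```python
# Illustrative only: shows the 25 Hz -> 6.25 Hz rate arithmetic, not Xiaomi's encoder.
TOKEN_RATE_HZ = 25.0    # tokenizer frame rate quoted in the item above
PATCH_RATE_HZ = 6.25    # patch rate quoted in the item above

# Frames per patch implied by the two rates.
patch_size = int(TOKEN_RATE_HZ / PATCH_RATE_HZ)


def patchify(frames, size):
    """Group consecutive frames into fixed-size patches, dropping a ragged tail."""
    usable = len(frames) - len(frames) % size
    return [frames[i:i + size] for i in range(0, usable, size)]


# One second of audio at 25 Hz is 25 frames (frame indices stand in for tokens).
one_second = list(range(25))
patches = patchify(one_second, patch_size)

print(patch_size)     # frames per patch
print(len(patches))   # full patches in one second of frames
```

Dropping the ragged tail is a choice made for this sketch; a real pipeline would more likely pad the final patch.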

[FREE WEBINAR] Contain Lateral Movement and Protect Your Clients using NetBird Integration with Acronis (Sponsored)

[xAI] xAI launches Grok-4-Fast: Unified Reasoning and Non-Reasoning Model with 2M-Token Context and Trained End-to-End with Tool-Use Reinforcement Learning (RL). Grok-4-Fast is xAI’s new cost-optimized, prompt-steerable model that unifies “reasoning” and “non-reasoning” behaviors in a single Transformer with a 2M-token context, trained with tool-use RL to decide when to browse or call functions. xAI reports frontier-level benchmark scores with ~40% fewer “thinking” tokens than Grok-4, and exposes two API SKUs (reasoning / non-reasoning) at aggressive pricing starting at $0.20/M input and $0.50/M output, with cached input at $0.05/M. It’s live across Grok apps and the xAI API, and it ranks #1 on LMArena’s Search Arena (codename “menlo”).
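To make the per-million-token prices concrete, here is a rough cost estimator. It assumes the "starting at" rates quoted above apply to the whole request, which may not hold at longer contexts or on other tiers; `estimate_cost_usd` is a hypothetical helper, not part of any xAI SDK.

```python
# Prices quoted in the item above, in USD per million tokens ("starting at" rates).
INPUT_USD_PER_M = 0.20
OUTPUT_USD_PER_M = 0.50
CACHED_INPUT_USD_PER_M = 0.05


def estimate_cost_usd(input_tokens, output_tokens, cached_input_tokens=0):
    """Estimate request cost; cached tokens are billed at the cached-input rate."""
    if cached_input_tokens > input_tokens:
        raise ValueError("cached_input_tokens cannot exceed input_tokens")
    fresh_input = input_tokens - cached_input_tokens
    total = (
        fresh_input * INPUT_USD_PER_M
        + cached_input_tokens * CACHED_INPUT_USD_PER_M
        + output_tokens * OUTPUT_USD_PER_M
    )
    return total / 1_000_000


# Example: a 200K-token prompt, half of it served from cache, with a 2K-token reply.
print(f"${estimate_cost_usd(200_000, 2_000, cached_input_tokens=100_000):.4f}")
```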

[Reasoning] Alibaba Qwen Team Just Released FP8 Builds of Qwen3-Next-80B-A3B (Instruct & Thinking), Bringing 80B/3B-Active Hybrid-MoE to Commodity GPUs. Alibaba’s Qwen team released FP8 checkpoints for Qwen3-Next-80B-A3B in Instruct and Thinking variants, using fine-grained FP8 (block-128) to cut memory/bandwidth while retaining the 80B hybrid-MoE design (~3B active, 512 experts: 10 routed + 1 shared). Native context is 262K (validated up to ~1M via YaRN). The Thinking build defaults to <think> traces and recommends a reasoning parser; both models expose multi-token prediction, and the model cards provide serving commands for current sglang/vLLM nightlies. Benchmark tables on the model cards are from the BF16 counterparts; users should re-validate FP8 accuracy/latency on their stacks. Licensing is Apache-2.0.
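As a back-of-the-envelope check on the memory claim, the sketch below compares weight-only footprints at 2 bytes per parameter (BF16) and 1 byte per parameter (FP8), using the rounded 80B/3B figures from the item above. It deliberately ignores the per-block scale tensors, activations, and KV cache, so real serving memory will be higher; `weight_gib` is a hypothetical helper.

```python
# Rounded parameter counts from the item above.
TOTAL_PARAMS = 80e9
ACTIVE_PARAMS = 3e9

BYTES_BF16 = 2  # bfloat16 stores each weight in 16 bits
BYTES_FP8 = 1   # FP8 stores each weight in 8 bits


def weight_gib(params, bytes_per_param):
    """Weight-only footprint in GiB; excludes scales, activations, and KV cache."""
    return params * bytes_per_param / 2**30


bf16_gib = weight_gib(TOTAL_PARAMS, BYTES_BF16)
fp8_gib = weight_gib(TOTAL_PARAMS, BYTES_FP8)
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

print(f"BF16 weights: {bf16_gib:.1f} GiB")
print(f"FP8 weights:  {fp8_gib:.1f} GiB")
print(f"Active per token: {active_fraction:.2%} of parameters")
```

Only about 3B parameters are active per token, but all experts still have to be resident in memory, which is why halving the bytes per weight matters for fitting the model on smaller GPUs.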

[Open Source CV] NVIDIA AI Open-Sources ViPE (Video Pose Engine): A Powerful and Versatile 3D Video Annotation Tool for Spatial AI (Sponsored)

Editor’s Pick

[Reasoning] MIT Researchers Enhanced Artificial Intelligence (AI) 64x Better at Planning, Achieving 94% Accuracy. The research team introduces PDDL-INSTRUCT, an instruction-tuning recipe that grounds chain-of-thought in PDDL semantics and uses the VAL verifier for stepwise truth-checking. On PlanBench, a Llama-3-8B model reaches 94% valid plans, an absolute gain of 66 percentage points over baseline, and Mystery Blocksworld jumps from 1% to 64% (≈64×); training used 2× RTX 3080 GPUs. The method trains models to explain planning failures, reason over preconditions/effects, and iteratively refine with detailed validator feedback before a final evaluation without feedback, yielding verifiable, machine-checkable plans rather than plausible text.
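To illustrate what stepwise checking of preconditions and effects means, here is a toy STRIPS-style validator. It is not VAL and not the PDDL-INSTRUCT pipeline: `Action`, `validate_plan`, and the two-block example are hypothetical, and real PDDL domains add typing, parameters, and richer semantics that this sketch omits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """A ground action: facts it requires, facts it adds, facts it deletes."""
    name: str
    preconditions: frozenset
    add_effects: frozenset
    del_effects: frozenset


def validate_plan(initial_state, goal, plan):
    """Apply actions in order; fail at the first unmet precondition or unmet goal."""
    state = set(initial_state)
    for step, action in enumerate(plan, start=1):
        missing = action.preconditions - state
        if missing:
            return False, f"step {step} ({action.name}): unmet {sorted(missing)}"
        state = (state - action.del_effects) | action.add_effects
    unmet = set(goal) - state
    if unmet:
        return False, f"goal not reached: missing {sorted(unmet)}"
    return True, "plan is valid"


# Toy two-block world: both blocks start on the table and the hand is empty.
initial = {"ontable(a)", "ontable(b)", "clear(a)", "clear(b)", "handempty"}
goal = {"on(a,b)"}

pickup_a = Action(
    "pickup(a)",
    frozenset({"ontable(a)", "clear(a)", "handempty"}),
    frozenset({"holding(a)"}),
    frozenset({"ontable(a)", "clear(a)", "handempty"}),
)
stack_a_b = Action(
    "stack(a,b)",
    frozenset({"holding(a)", "clear(b)"}),
    frozenset({"on(a,b)", "clear(a)", "handempty"}),
    frozenset({"holding(a)", "clear(b)"}),
)

print(validate_plan(initial, goal, [pickup_a, stack_a_b]))
print(validate_plan(initial, goal, [stack_a_b]))
```

The failure message names the step and the missing facts, which is the kind of detailed validator feedback the item describes the model refining against during training.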

Image source: marktechpost.com

For content partnership with marktechpost.com, please TALK to us

From our Sponsor

[FREE WEBINAR] Contain Lateral Movement and Protect Your Clients using NetBird Integration with Acronis.

Date and Time: 30th September, 5:00 PM CET [45 minutes with Q&A]

Adversaries are increasingly targeting Managed Service Providers (MSPs) with sophisticated tactics and techniques. According to the Acronis Cyberthreats Report, H2 2024, APT-linked ransomware groups are targeting MSPs by exploiting PowerShell, weak RDP passwords, unpatched devices, and compromised VPN credentials. The adversaries are relentless. So how can MSPs move from a reactive to a proactive approach and reduce the blast radius?

Join us for an exclusive session with James Abercrombie, Technology Evangelist, Acronis, and Naren Vaideeswaran, Head of Product Marketing, NetBird, as they discuss how the integration works, the benefits, and how MSPs can effectively shrink the attack surface.

In this webinar, you will learn:

The impact of lateral movement and how ransomware is affecting businesses and reputation
How a multi-layered defense paves the way for effective prevention, detection, and disaster recovery readiness
How NetBird and Acronis integrate to contain evolving threats and protect your business

(Sponsored)