site:www.marktechpost.com

Microsoft AI Debuts MAI-Image-1: An In-House Text-to-Image Model that Enters LMArena’s Top-10

Michal Sutter is a data science professional with a Master of Science in Data Science from the University of Padova. With a solid foundation in statistical analysis, machine learning, and data ...

marktechpost

Google Introduces Speech-to-Retrieval (S2R) Approach that Maps a Spoken Query Directly to an Embedding and Retrieves Information without First Converting Speech to Text

In the traditional cascade modeling approach, automatic speech recognition (ASR) first produces a single text string, which is then passed to retrieval. Small transcription errors can change query ...

marktechpost

5 Most Popular Agentic AI Design Patterns Every AI Engineer Should Know

As AI agents evolve beyond simple chatbots, new design patterns have emerged to make them more capable, adaptable, and intelligent. These agentic design patterns define how agents think, act, and ...

marktechpost

Sentient AI Releases ROMA: An Open-Source and AGI Focused Meta-Agent Framework for Building AI Agents with Hierarchical Task Execution

ROMA provides a setup.sh quick start with Docker Setup (Recommended) or Native Setup, plus flags for E2B sandbox integration (--e2b, --test-e2b). The stack lists Backend: Python 3.12+ with ...

marktechpost

Agentic Context Engineering (ACE): Self-Improving LLMs via Evolving Contexts, Not Fine-Tuning

ACE positions “context engineering” as a first-class alternative to parameter updates. Instead of compressing instructions into short prompts, ACE accumulates and organizes domain-specific tactics ...

marktechpost

Stanford Researchers Released AgentFlow: In-the-Flow Reinforcement Learning RL for Modular, Tool-Using AI Agents

Flow-GRPO (Flow-based Group Refined Policy Optimization) converts long-horizon, sparse-reward optimization into tractable single-turn updates: Benchmarks. The research team evaluates four task types: ...

marktechpost

Google DeepMind Introduces CodeMender: A New AI Agent that Uses Gemini Deep Think to Automatically Patch Critical Software Vulnerabilities

What if an AI agent could localize a root cause, prove a candidate fix via automated analysis and testing, and proactively rewrite related code to eliminate the entire vulnerability class—then open an ...

marktechpost

How to Evaluate Voice Agents in 2025: Beyond Automatic Speech Recognition (ASR) and Word Error Rate (WER) to Task Success, Barge-In, and Hallucination-Under-Noise

Optimizing only for Automatic Speech Recognition (ASR) and Word Error Rate (WER) is insufficient for modern, interactive voice agents. Robust evaluation must measure ...

marktechpost

This AI Paper Proposes a Novel Dual-Branch Encoder-Decoder Architecture for Unsupervised Speech Enhancement (SE)

Most learning-based speech enhancement pipelines depend on paired clean–noisy recordings, which are expensive or impossible to collect at scale in real-world conditions. Unsupervised routes like ...

marktechpost

Google Proposes TUMIX: Multi-Agent Test-Time Scaling With Tool-Use Mixture

TUMIX runs a group of heterogeneous agents—text-only Chain-of-Thought, code-executing, web-searching, and guided variants—in parallel, then iterates a small number of refinement rounds where each ...

Some results have been hidden because they may be inaccessible to you

Show inaccessible results