
Introducing MLRun 1.10: Build better agents, monitor everything


We’re proud to announce a series of advancements in MLRun v1.10 designed to power your end-to-end orchestration layer for agentic AI applications. From a powerful and versatile prompt engineering upgrade to support for remote models and a brand-new interface for monitoring agent performance, MLRun is continuously evolving to meet the demands of cutting-edge AI applications.

Introducing the Prompt Artifact: Build better agents

Prompt engineering is at the heart of agentic AI, but it’s often messy and hard to scale. That’s why we’re introducing the LLM Prompt Artifact: a new way to turn each LLM + prompt + configuration into a reusable, version-controlled and production-ready building block.

For teams building complex gen AI pipelines, where each task might use a different prompt or model, this feature gives you the flexibility to experiment and optimize at every step, while keeping your workflows clean and production-ready.

With LLM Prompt Artifacts, you can:

  • Bundle the prompt, model, and configurations into a single, production-ready artifact.
  • Experiment faster by testing different prompts, models, and generation settings without breaking your workflow.
  • Swap artifacts seamlessly to iterate on multi-step pipelines.
  • Build multi-step agent pipelines where each task uses different prompts or models.

The LLM Prompt Artifact turns prompt engineering into a structured, repeatable process, making it easier to build, test, and deploy agents that work.
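To make the idea concrete, here is a plain-Python sketch of what such an artifact bundles together. Note that this is not the MLRun API (see the MLRun docs for the real `LLM Prompt Artifact` interface); the class and field names below are hypothetical, chosen only to illustrate how a prompt, a model reference, and generation settings become one versioned, swappable unit:

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for MLRun's LLM Prompt Artifact. The real API lives in
# MLRun itself; this sketch only shows what gets bundled and versioned.
@dataclass(frozen=True)
class PromptArtifact:
    name: str
    model: str                    # e.g. an OpenAI or Hugging Face model id
    prompt_template: str          # template with named placeholders
    generation_config: dict = field(default_factory=dict)  # temperature, etc.
    version: str = "v1"

    def render(self, **kwargs) -> str:
        """Fill the template for a single LLM call."""
        return self.prompt_template.format(**kwargs)

# Two versions of the same pipeline step can be swapped without touching
# the pipeline code that consumes them:
summarize_v1 = PromptArtifact(
    name="summarize",
    model="gpt-4o-mini",
    prompt_template="Summarize the following text in one sentence:\n{text}",
    generation_config={"temperature": 0.2},
)
summarize_v2 = PromptArtifact(
    name="summarize",
    model="gpt-4o-mini",
    prompt_template="You are a concise editor. Summarize:\n{text}",
    generation_config={"temperature": 0.0},
    version="v2",
)

print(summarize_v1.render(text="MLRun 1.10 adds prompt artifacts."))
```

Because the artifact is immutable and carries its own version, A/B-testing a prompt change reduces to passing `summarize_v2` instead of `summarize_v1` to the same pipeline step.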

New Dashboard: A Unified View for All Your Monitoring Applications

While monitoring deterministic AI systems is fairly straightforward, monitoring gen AI systems is new and complex territory. The AI monitoring ecosystem doesn’t offer a one-size-fits-all solution, and many use cases call for a mix of different tools that can account for guardrails, hallucinations, compliance, security risks, and performance degradation over time. One of MLRun’s main strengths is its open architecture, which lets you integrate with any third-party service. With MLRun you can build a custom monitoring setup that goes well beyond standard built-in dashboards.

As part of our ongoing work on the future of monitoring for gen AI, MLRun 1.10 introduces the Monitoring Applications view: a single, centralized dashboard that consolidates all your monitoring apps into one place. Instead of jumping between tools or manually checking individual apps, you now have a unified view of their status, activity, and results.

With this new UI, you can:

  • See all your monitoring apps in one place: Get a complete list of your monitoring applications, their statuses (running or failed), and key metrics like message lag and processing throughput.
  • Track endpoint performance to detect issues and bottlenecks over time.
  • Monitor the LLM prompt endpoint to track each LLM call separately.
  • Drill down into metrics and artifacts for deeper debugging and optimization.
  • Visualize detections and shard-level performance to understand throughput, message lag, and system health.

This dashboard gives you the tools to monitor and refine every part of the process, from prompt engineering to model evaluation. Now you can confidently deploy and scale agentic AI systems with the data to continuously improve them.

Simplify Hybrid Workflows with Remote Model Support

Agentic AI often requires combining the best tools and models from multiple sources, whether they’re stored locally or hosted on platforms like Hugging Face. But managing these external models can quickly become a headache, with duplicated files, scattered tracking, and unnecessary storage costs. Now, you can register and manage these remote models directly in MLRun without duplicating files or uploading them to your datastore.

With this feature, you can:

  • Combine local and remote models in hybrid workflows.
  • Centralize governance and tracking for all your models in one place.
  • Reduce storage costs and complexity by avoiding unnecessary duplication of large model files.

Run Agent Pipelines On Demand with Serving Graph Jobs

Need to run batch inference, scheduled evaluations, or one-time scoring tasks? With MLRun 1.10, you can now deploy serving graphs as Kubernetes jobs. This makes it easy to evaluate multiple prompts, compare agents in parallel, or run bulk tasks without spinning up unnecessary infrastructure.
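The idea behind running a serving graph as a job can be sketched in a few lines. This is a simplified, hypothetical model (the step functions and runner below are not MLRun APIs): the same chain of steps that would serve requests online is applied to a finite batch and then exits, which maps naturally onto a run-to-completion Kubernetes Job:

```python
# Hedged sketch: a "serving graph" reduced to an ordered chain of steps.
def preprocess(event: dict) -> dict:
    event["text"] = event["text"].strip().lower()
    return event

def score(event: dict) -> dict:
    # Stand-in for a real model call; scores by text length for illustration.
    event["score"] = len(event["text"])
    return event

GRAPH = [preprocess, score]

def run_graph_as_job(records: list[dict]) -> list[dict]:
    """Push every record through the graph once, then return; no server stays up."""
    results = []
    for record in records:
        for step in GRAPH:
            record = step(record)
        results.append(record)
    return results

batch = run_graph_as_job([{"text": "  Hello MLRun  "}, {"text": "Agents"}])
```

Because the process terminates after the batch is consumed, the cluster reclaims the resources immediately, which is exactly what makes this mode a fit for scheduled evaluations and one-time scoring.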

A note from the team

MLRun 1.10 is more than just a version update: it’s a toolkit for building smarter, faster, and more reliable agentic AI applications. Beyond these features, this release includes numerous bug fixes, documentation improvements, and community-requested enhancements. We want to extend a huge thank you to the MLRun community for your contributions and feedback.

Stay tuned for the next round of improvements. Ready to get started with MLRun 1.10? Check out the release notes for more details, or dive into the docs to start exploring the new features.

We can’t wait to see what you build with MLRun 1.10. As always, we’re here to support you every step of the way.

Happy building!
The MLRun Team
