MLRun + NVIDIA NeMo: Building Observable AI Data Flywheels in Production

NVIDIA NeMo and Iguazio streamline training, evaluation, fine-tuning and monitoring of AI models at scale, ensuring high performance and low latency while lowering costs

We’ve integrated MLRun with NVIDIA NeMo microservices to extend NVIDIA’s Data Flywheel Blueprint. This integration lets you automatically train, evaluate, fine-tune and monitor AI models at scale, while ensuring low latency and reduced resource use. Read on for the details:

What are NVIDIA NeMo Microservices?

NVIDIA NeMo is a modular microservices platform for building and continuously improving agentic AI systems.

It provides:

  • RAG implementations
  • Model customization
  • Model evaluation 
  • Guardrails for optimized agent behavior

What is an AI Data Flywheel?

A data flywheel is a process that continuously improves models and AI agents through production feedback loops: inference results, business data and user preferences are fed back to the models, creating a continuous cycle in which the AI models improve over time. According to NVIDIA, the high-level flow of a data flywheel follows the steps described in the next section.
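
To make the loop concrete, here is a toy sketch of one flywheel turn in Python. Every name and number below is a hypothetical placeholder rather than an MLRun or NeMo API; a real pipeline would delegate each step to the services described in the next section.

import random

# Toy sketch of a data flywheel loop; all functions are hypothetical stand-ins.

def collect_production_logs(model):
    # Stand-in for ingesting inference results and user feedback from production.
    return [{"prompt": f"q{i}", "thumbs_up": random.random() > 0.3} for i in range(100)]

def curate(logs):
    # Keep interactions with positive user feedback as the training signal.
    return [rec for rec in logs if rec["thumbs_up"]]

def fine_tune(model, dataset):
    # Stand-in for a customization job (e.g. LoRA) on the curated data.
    return {"name": model["name"], "version": model["version"] + 1}

def evaluate(model):
    # Stand-in for a benchmark score; here newer versions simply score higher.
    return 0.70 + 0.01 * model["version"]

model = {"name": "small-llm", "version": 0}
for _ in range(3):  # each pass is one turn of the flywheel
    candidate = fine_tune(model, curate(collect_production_logs(model)))
    if evaluate(candidate) >= evaluate(model):  # promote only on improvement
        model = candidate
print(model)  # {'name': 'small-llm', 'version': 3}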

How MLRun + NeMo Work Together

Iguazio has collaborated with NVIDIA to power enterprise data flywheels with MLRun. MLRun acts as the flywheel orchestrator, powering training, fine-tuning to a specific use case, evaluation and monitoring, while NeMo serves as the customizer and evaluator.

How the integration works (a code sketch follows the list):

1. Monitor – MLRun ingests interaction logs and evaluates performance, stability and resource usage. This helps organizations detect and mitigate the risks associated with generative AI.
2. Train & Evaluate – NVIDIA NeMo Customizer trains and fine-tunes models with LoRA, p-tuning and supervised fine-tuning. NVIDIA NeMo Evaluator benchmarks candidate models with zero-shot, RAG and LLM-as-a-Judge evaluations. Both are orchestrated by MLRun.
3. Feedback – MLRun orchestrates feedback from human-in-the-loop decisions.
4. Deploy – MLRun automates model updates and redeployments.
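
As a rough sketch, the steps above could be wired together as MLRun functions. The MLRun calls below (get_or_create_project, set_function, run_function) are real API; flywheel_steps.py and its handlers (monitor, customize, evaluate, deploy), along with their inputs and outputs, are hypothetical placeholders for code that would call the NeMo Customizer and Evaluator services.

import mlrun

# Create (or load) an MLRun project to act as the flywheel orchestrator.
project = mlrun.get_or_create_project("nemo-flywheel", context="./")

# Register the flywheel steps as an MLRun job; the module and its handlers
# are hypothetical placeholders that would call the NeMo microservices.
project.set_function("flywheel_steps.py", name="flywheel", kind="job", image="mlrun/mlrun")

# 1. Monitor: ingest interaction logs and compute performance metrics.
logs = project.run_function("flywheel", handler="monitor")

# 2. Train & evaluate: submit a NeMo Customizer job (e.g. LoRA) on the curated
#    data, then benchmark the candidate with NeMo Evaluator.
candidate = project.run_function("flywheel", handler="customize",
                                 inputs={"dataset": logs.outputs["dataset"]})
scores = project.run_function("flywheel", handler="evaluate",
                              inputs={"model": candidate.outputs["model"]})

# 3-4. Feedback & deploy: redeploy only if the candidate meets its targets.
project.run_function("flywheel", handler="deploy", params={"scores": scores.outputs})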

Use case example:

Let’s say we want to improve a small model’s performance until it matches a larger model. The data flywheel runs experiments on production logs against candidate models and surfaces efficient models that meet the accuracy target.
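
For illustration, the selection logic boils down to something like the following toy snippet; the model names, accuracies and costs are made up.

# Pick the cheapest candidate whose accuracy on production logs matches the
# large reference model; all figures below are illustrative.
TARGET_ACCURACY = 0.86  # accuracy of the large reference model

candidates = [
    {"model": "llm-70b", "accuracy": 0.87, "cost_per_1k_tokens": 0.60},
    {"model": "llm-8b",  "accuracy": 0.86, "cost_per_1k_tokens": 0.08},
    {"model": "llm-1b",  "accuracy": 0.78, "cost_per_1k_tokens": 0.02},
]

viable = [c for c in candidates if c["accuracy"] >= TARGET_ACCURACY]
best = min(viable, key=lambda c: c["cost_per_1k_tokens"])
print(best["model"])  # -> llm-8b: meets the target at a fraction of the cost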

The Benefits of Using the Data Flywheel

  • 60% code reduction
  • End-to-end automation of monitoring, training, evaluation and fine-tuning
  • Continuous improvement
  • Faster and simpler LLM tuning
  • Scalability across multiple models, workflows and environments
  • Lower inference costs and reduced latency
  • Future-proof – models stay current via ongoing optimization

Explore the joint Iguazio MLRun and NVIDIA blueprint to try it for yourself.
