Bringing (Gen) AI from Laptop to Production with MLRun

Accelerating and Scaling AI Deployments Across Hybrid Environments

AI drives digital transformation for telecom leaders, but scaling it across complex hybrid environments remains one of the toughest challenges. Safaricom, one of Africa’s most advanced and AI-mature mobile operators, faced this head-on. With over 50 million users and an AI ecosystem spanning cloud and on-prem infrastructure, Safaricom needed to move faster, from model development to deployment, without compromising reliability, governance, or business impact.

In this blog, we explore how Safaricom reimagined its AI operations using MLRun and Iguazio to overcome legacy bottlenecks, standardize processes, and achieve 5X faster time-to-production.

This blog post is based on a webinar with Hillary Murefu Wangila, Head of AI, and Anthony Nyaga Irugu, Lead ML Engineer, from Safaricom, and Salesh Bhat, Principal Architect from Iguazio (a McKinsey company). You can dive deeper into the use cases, architectures and demo by watching the full webinar here.

How to Reliably Scale AI for 50M Users

Safaricom is one of the most successful mobile operators in East and Central Africa, serving ~50M users. However, their legacy AI stack lacked standardization and was complicated and siloed. Achieving secure, production-grade scalability and reliability required immense effort, and production timelines were five times longer than they needed to be.

In addition, their data science and ML engineering teams worked separately, resulting in long handoff processes that delayed models from reaching production. It took weeks of refactoring just to move code from notebooks to production.

The team needed an AI infrastructure that would allow them to focus on their expertise rather than tech complexity, and enable them to get to production with ease. They were looking for a solution that would let each expert focus on their own craft: data scientists on modeling, MLOps engineers on deployment and scalability, and data engineers on data pipelines.

At the same time, the solution had to deliver the same experience across cloud and on-premises environments; streamline, automate, and accelerate the AI process; and provide the foundation for new Gen AI use cases.

What is the Iguazio AI Factory?

The Iguazio AI Factory is the enterprise-grade version of MLRun. It allows for continuous delivery, automatic deployment, and monitoring of AI applications. The factory is based on four pipelines:

  1. Data management – Ensures data availability, quality and control to feed the ML system.
  2. Development – Standardizes processes and tooling to improve team efficiency and solution performance.
  3. Deployment – Standardizes processes and provisions tooling to reliably deploy solutions with “One Click”.
  4. LiveOps – Monitors models to maintain reliable performance and drive continuous improvement.

The platform integrates seamlessly with CI/CD tools, infrastructure as code, and both on-prem and cloud environments, enabling scalable, production-ready AI pipelines. It also allows continuous model monitoring for drift, bias, and hallucinations.

Users can also draw on the open-source MLRun marketplace of pre-built functions, code, and notebooks. This simplifies pipeline development, training, inference, and serving across the lifecycle.
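As a rough illustration of the marketplace in action, the sketch below imports a pre-built hub function and runs it from a notebook. The project name and the sample dataset URL are our own illustrative choices, not from the webinar:

    import mlrun

    # Work inside an MLRun project; the project name is illustrative.
    project = mlrun.get_or_create_project("demo-project", context="./")

    # Import a pre-built function from the open-source MLRun hub.
    describe_fn = mlrun.import_function("hub://describe")

    # Profile a sample dataset ("table" is the function's dataset input).
    run = describe_fn.run(
        inputs={"table": "https://s3.wasabisys.com/iguazio/data/iris/iris.data.raw.csv"},
        local=True,  # run in-process for a quick test; drop for cluster execution
    )

The same import-and-run pattern applies to the hub's training, serving, and data-prep functions.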

Safaricom Deploys MLRun

Safaricom chose MLRun and the Iguazio Gen AI Factory to automate and accelerate the operationalization of their AI applications in live environments. The value was demonstrated through the migration of three leading use cases, resulting in a 5x acceleration of AI delivery:

  1. Optimizing MPESA Apps – Mini apps that provide personalized services like upselling, cross-selling, and customer feedback in real time. 59% of Kenya’s GDP is processed through MPESA.
  2. Customer segmentation based on 30 metrics to garner feedback and enrich NPS.
  3. Predictive modeling of customer actions and customer segmentation to decrease churn and increase upsells.

(Below, we’ll dive deeper into use cases #1 and #2.)

The impact:

  • 5x faster time to production
  • Standardized & automated AI operationalization
  • Gen AI-ready infrastructure
  • Support for hybrid environments: AWS and on-prem

AI Use Case Migration from On-prem to AWS with Iguazio and MLRun

As part of a strategic exercise, 16 AI use cases were slated for migration from on-premises infrastructure to AWS. The goal was to unify on-prem and cloud environments into a hybrid setup, enabling seamless failover and deployment between them.

  • Process steps were mapped, and the code and pipelines were moved to Iguazio, which was pre-deployed on AWS
  • Iguazio and the underlying MLRun abstraction layer ensured a consistent migration process with no breakages
  • Pipelines and code were lifted and shifted onto Iguazio, with build, test, and deploy steps
  • All use cases were checked for errors, and the few errors found were rectified

Impact:

  • Simplified AI governance
  • Cost savings through access to scalable AWS infrastructure
  • The actual migration took just 2-3 days instead of weeks, massively reducing complexity
  • Increased developer productivity thanks to an enhanced experience on AWS, empowering data scientists to focus purely on modeling while MLOps engineers handled scalability and orchestration, each excelling in their craft without friction

Use Case #1: Giga and Mini Apps Use Cases

This use case involved serving real-time propensity models to a mobile app built on MPESA. Previously, the workflow relied on manual handoffs: passing Java configs, simulation files, and notebooks between teams.

Old State Architecture

Previous workflow steps:

  1. Data collection – Massive customer behavior data was stored in a big data platform.
  2. Data preparation – Data was pre-stitched and transferred into a Postgres database (or another open-source system).
  3. Scheduling – Airflow was used to run workflows daily or monthly for different outcomes (see the sketch after this list).
  4. System coordination – Multiple tools and servers had to be maintained independently and kept in sync.
  5. Metrics writing – The system wrote back run metrics (e.g., job results, performance) into Postgres.
  6. Modeling – Data scientists worked in notebooks to design and test models; about 80% of the manual effort was spent here.
  7. Handover – Once modeling was done, the data scientist handed off the project to the MLOps engineer.
  8. Optimization & scaling – The MLOps engineer optimized and scaled the solution for production.
  9. Containerization – The model was containerized using Docker.
  10. Deployment orchestration – Airflow was used again to orchestrate and deploy the model into production.
  11. System maintenance – All these tools and servers had to work together seamlessly, requiring significant effort to maintain and synchronize.
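For context, the scheduling step alone meant hand-maintaining DAG scripts along these lines. This is a generic Airflow sketch with assumed task names and script paths, not Safaricom's actual code:

    from datetime import datetime
    from airflow import DAG
    from airflow.operators.bash import BashOperator

    # One of many hand-maintained DAG files that had to be kept in sync
    # with the rest of the stack.
    with DAG("propensity_scoring", start_date=datetime(2023, 1, 1),
             schedule_interval="@daily", catchup=False) as dag:
        prepare = BashOperator(task_id="prepare_data", bash_command="python prepare.py")
        score = BashOperator(task_id="score_model", bash_command="python score.py")
        prepare >> score  # run data prep before scoring

Every such file lived outside the modeling environment, which is exactly the coordination overhead the new architecture removes.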

New State Architecture: Giga and Mini Apps Use Cases on Iguazio Simplify the Development Lifecycle

The new Iguazio-based workflow replaced a fragmented, multi-tool pipeline with a unified, automated MLOps system powered by MLRun as the open-source orchestration layer.

Instead of using multiple separate tools for data prep, feature engineering, scheduling, and deployment, the team now works within a single environment where preprocessing, feature management, model training, deployment, and monitoring all happen seamlessly in one place.

This streamlined the workflow, cutting out manual DevOps work like writing YAML files, Jenkins pipelines, or Dockerfiles. Data scientists can now build, deploy, and monitor models directly from their Jupyter notebooks, achieving true “DevOps as code” and dramatically speeding up experimentation and time to production.
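To make “DevOps as code” concrete, here is a minimal sketch of deploying a model from a notebook with MLRun's project API. The project name, model_server.py, and the model URI are hypothetical placeholders, not Safaricom's actual artifacts:

    import mlrun

    # Illustrative project name.
    project = mlrun.get_or_create_project("mpesa-demo", context="./")

    # Register a local file as a real-time serving function -- no Dockerfile,
    # YAML, or Jenkins pipeline needed. "model_server.py" is a placeholder
    # assumed to hold the model-serving logic.
    serving_fn = project.set_function(
        "model_server.py", name="propensity-server", kind="serving", image="mlrun/mlrun"
    )
    serving_fn.add_model("propensity", model_path="store://models/mpesa-demo/propensity:latest")
    serving_fn.set_tracking()  # turn on MLRun's built-in model monitoring

    # Deploy as an auto-scaling HTTP endpoint (Nuclio under the hood).
    project.deploy_function(serving_fn)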

New workflow steps:

  1. Data is ingested into the Iguazio platform (HDFS/new data lake) via remote Spark
  2. Feature sets are created as a post-step of the feature-engineering aggregations
  3. Feature vectors are created from all the feature sets (see the sketch after this list)
  4. Models are developed and trained using the feature vectors as model input
  5. Model inference runs as a next step, typically in batch
  6. Monitoring is done via the platform UI and Grafana, with retraining if needed
  7. Model outputs are served to the downstream applications
  8. The system integrates with existing Safaricom tech
  9. CI/CD uses existing Jenkins or other off-the-shelf (OTS) CI tools
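As a rough sketch of steps 2-4, MLRun's feature store lets you define aggregations, ingest data, and join features into a training vector in a few lines. The entity, column, and file names below (customer_id, amount, usage_events.parquet) are illustrative assumptions, not Safaricom's schema:

    import pandas as pd
    import mlrun.feature_store as fstore

    # Step 2: a feature set with rolling aggregations from feature engineering.
    usage_set = fstore.FeatureSet("usage", entities=[fstore.Entity("customer_id")])
    usage_set.add_aggregation("amount", ["sum", "avg"], ["7d", "30d"])
    fstore.ingest(usage_set, pd.read_parquet("usage_events.parquet"))  # placeholder source

    # Step 3: a feature vector joining features across feature sets.
    vector = fstore.FeatureVector("propensity-vec", ["usage.*"], description="model input")

    # Step 4: materialize the vector as a dataframe for model training.
    train_df = fstore.get_offline_features(vector).to_dataframe()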

Use Case #2: NPS Corrector

Old State Architecture

This use case, focused on NPS and customer behavior, initially involved large, diverse datasets and a tangled network of 10+ data sources and pipelines. The original process was so complex that it took nearly a year to complete manually.

New State Architecture: NPS Corrector Use Cases on Iguazio to Accelerate the AI Development Lifecycle

By identifying overlaps and duplications in preprocessing and aligning it with the same standardized structure used in earlier use cases, the team was able to streamline and automate the entire pipeline using MLRun as the central orchestrator. MLRun automates data ingestion, feature engineering, training, and deployment. MLRun packages their scripts, manages scaling through Kubernetes, and integrates automatically with monitoring tools like Grafana to ensure reliability and performance.

This setup eliminates the need for external schedulers like Airflow or manual DAG scripting. Data scientists can now build, schedule, deploy, and scale their models directly from Jupyter notebooks in a secure, automated, and production-grade environment.
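For a sense of how that scheduling works without Airflow, here is a minimal sketch using MLRun's built-in scheduler from a notebook. The project name, trainer.py, and the cron expression are illustrative assumptions:

    import mlrun

    # Illustrative project name.
    project = mlrun.get_or_create_project("nps-demo", context="./")

    # Register a training script as a project job ("trainer.py" is a placeholder
    # assumed to expose a train() handler).
    project.set_function(
        "trainer.py", name="train", kind="job", image="mlrun/mlrun", handler="train"
    )

    # Schedule it with a cron expression -- MLRun's scheduler replaces the
    # external Airflow DAG; here, retrain daily at 06:00.
    project.run_function("train", schedule="0 6 * * *")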

Go deeper into these workflows in the webinar.

Demo Time

To see a demo of how to accelerate and automate complex AI workflows, watch the full webinar. Safaricom presents how the platform automatically handles containerization, networking, scaling, and security. Then, they show how GenAI can accelerate development workflows through a custom code-generation tool. This tool allows users to generate working AI applications and deploy them instantly, using natural-language prompts and a few lines of code.

Watch the full webinar here.
