Accelerating and Scaling AI Deployments Across Hybrid Environments
AI drives digital transformation for telecom leaders, but scaling it across complex hybrid environments remains one of the toughest challenges. Safaricom, one of Africa’s most advanced and AI-mature mobile operators, faced this head-on. With over 50 million users and an AI ecosystem spanning cloud and on-prem infrastructure, Safaricom needed to move faster, from model development to deployment, without compromising reliability, governance, or business impact.
In this blog, we explore how Safaricom reimagined its AI operations using MLRun and Iguazio to overcome legacy bottlenecks, standardize processes, and achieve 5X faster time-to-production.
This blog post is based on a webinar with Hillary Murefu Wangila, Head of AI, and Anthony Nyaga Irugu, Lead ML Engineer, from Safaricom, and Salesh Bhat, Principal Architect from Iguazio (a McKinsey company). You can dive deeper into the use cases, architectures and demo by watching the full webinar here.
Safaricom is one of the most successful mobile operators in East and Central Africa, serving roughly 50 million users. However, its legacy AI stack was complicated, siloed, and lacked standardization. Achieving secure, production-grade scalability and reliability required immense effort, and production timelines were five times longer than they needed to be.
In addition, their data science and ML engineering teams worked separately, resulting in lengthy handoffs that delayed models from reaching production. It took weeks of refactoring just to move code from notebooks to production.
The team needed AI infrastructure that would let them focus on their expertise rather than on tech complexity, and get to production with ease. They were looking for a solution that would let each expert focus on their own craft: data scientists on modeling, MLOps engineers on deployment and scalability, and data engineers on data pipelines.
At the same time, the solution had to provide the same experience across cloud and on-premises environments; streamline, automate, and accelerate the AI process; and lay the foundation for new gen AI use cases.
The Iguazio AI Factory is the enterprise-grade version of MLRun, enabling continuous delivery, automatic deployment, and monitoring of AI applications. The factory is built on four pipelines:
The platform integrates seamlessly with CI/CD tools, infrastructure as code, and both on-prem and cloud environments, enabling scalable, production-ready AI pipelines. It also allows continuous model monitoring for drift, bias, and hallucinations.
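To make the monitoring idea concrete, here is a toy drift check. It is illustrative only: MLRun's model monitoring computes far richer statistics, but the core idea is comparing live feature distributions against a training-time baseline.

```python
# Toy drift check (illustrative only; real model monitoring uses
# richer distribution statistics than a simple mean comparison).

def drift_score(baseline: list, live: list) -> float:
    """Relative shift of the live feature mean away from the baseline mean."""
    base_mean = sum(baseline) / len(baseline)
    live_mean = sum(live) / len(live)
    return abs(live_mean - base_mean) / (abs(base_mean) or 1.0)

# A doubling of average usage relative to training data scores 1.0:
print(drift_score([1.0, 2.0, 3.0], [3.0, 4.0, 5.0]))  # → 1.0
```

In production, a score crossing a threshold would trigger an alert or a retraining run rather than a print statement.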
Users can also draw on the open-source MLRun marketplace of pre-built functions, code, and notebooks. This simplifies pipeline development, training, inference, and serving across the lifecycle.
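The snippet below is a toy stand-in for the idea behind the marketplace: reusable steps addressed by a `hub://<name>` URI. With MLRun installed, the real call is `mlrun.import_function("hub://<name>")`; the registry and `describe` step here are purely illustrative.

```python
# Toy registry standing in for the MLRun function marketplace ("hub").
HUB = {
    # a minimal "describe" step: profile a small tabular dataset
    "describe": lambda rows: {"n_rows": len(rows),
                              "n_cols": len(rows[0]) if rows else 0},
}

def import_function(uri: str):
    """Resolve a toy hub:// URI to a callable step."""
    prefix = "hub://"
    if not uri.startswith(prefix):
        raise ValueError(f"expected a hub:// URI, got {uri!r}")
    return HUB[uri[len(prefix):]]

describe = import_function("hub://describe")
print(describe([[1, 2, 3], [4, 5, 6]]))  # → {'n_rows': 2, 'n_cols': 3}
```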
Safaricom chose MLRun and the Iguazio Gen AI Factory to automate and accelerate the operationalization of their AI applications in live environments. The value was demonstrated through the migration of three leading use cases, resulting in a 5x acceleration of AI delivery:
(Below, we’ll dive deeper into use cases #1 and #2.)
The impact:
As part of a strategic exercise, 16 AI use cases were slated for migration from on-premises infrastructure to AWS. The goal was to unify on-prem and cloud environments into a hybrid setup, enabling seamless failover and deployment between them.
Impact:
This use case involved serving real-time propensity models to a mobile app built on MPESA. Previously, the workflow relied on manual handoffs: passing Java configs, simulation files, and notebooks between teams.
Previous workflow steps:
The new Iguazio-based workflow replaced a fragmented, multi-tool pipeline with a unified, automated MLOps system powered by MLRun as the open-source orchestration layer.
Instead of using multiple separate tools for data prep, feature engineering, scheduling, and deployment, the team now works within a single environment where preprocessing, feature management, model training, deployment, and monitoring all happen seamlessly in one place.
This streamlined the workflow, cutting out manual DevOps work like writing YAML, Jenkins pipelines, or Dockerfiles. Data scientists can now build, deploy, and monitor models directly from their Jupyter notebooks, achieving true “DevOps as code” and dramatically speeding up experimentation and time to production.
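A minimal sketch of that notebook-to-production pattern: the serving logic is a plain Python handler, and with MLRun installed the same handler could be wrapped into a serving function (e.g. via `mlrun.code_to_function(..., kind="serving")`) rather than hand-writing Dockerfiles or YAML. The feature names and weights below are hypothetical, not Safaricom's actual model.

```python
# Hypothetical propensity handler: the model logic is plain Python,
# so the same function can run in a notebook or be deployed as a
# serving endpoint without any hand-written DevOps artifacts.

def propensity_handler(event: dict) -> dict:
    """Toy propensity score: a weighted sum of two usage features."""
    score = (0.6 * event.get("recharge_freq", 0.0)
             + 0.4 * event.get("data_usage", 0.0))
    return {"user_id": event.get("user_id"), "propensity": round(score, 3)}

print(propensity_handler({"user_id": "u1",
                          "recharge_freq": 0.5,
                          "data_usage": 0.25}))  # → {'user_id': 'u1', 'propensity': 0.4}
```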
New workflow steps:
This use case, focused on NPS (Net Promoter Score) and customer behavior, initially involved large, diverse datasets and a tangled network of 10+ data sources and pipelines. The original process was so complex that it took nearly a year to complete manually.
By identifying overlaps and duplications in preprocessing and aligning it with the same standardized structure used in earlier use cases, the team was able to streamline and automate the entire pipeline using MLRun as the central orchestrator. MLRun automates data ingestion, feature engineering, training, and deployment. MLRun packages their scripts, manages scaling through Kubernetes, and integrates automatically with monitoring tools like Grafana to ensure reliability and performance.
This setup eliminates the need for external schedulers like Airflow or manual DAG scripting. Data scientists can now build, schedule, deploy, and scale their models directly from Jupyter notebooks in a secure, automated, and production-grade environment.
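The chain described above can be sketched as a toy pipeline. In MLRun, each step would run as a managed function and recurring runs would be configured with a cron string (e.g. `schedule="0 2 * * *"`) rather than invoked by hand; the step names and data below are made up for illustration.

```python
# Illustrative ingest → features → train chain; a stand-in for the
# orchestrated pipeline, not the production implementation.

def ingest():
    # stand-in for pulling survey rows from the many upstream sources
    return [{"user": "u1", "nps": 9}, {"user": "u2", "nps": 4}]

def engineer_features(rows):
    # a score of 9 or above marks a promoter
    return [{**r, "promoter": r["nps"] >= 9} for r in rows]

def train(rows):
    # trivial "model": the observed promoter rate
    promoters = sum(r["promoter"] for r in rows)
    return {"promoter_rate": promoters / len(rows)}

model = train(engineer_features(ingest()))
print(model)  # → {'promoter_rate': 0.5}
```

The point of the orchestrator is that this chain, its schedule, and its scaling all live in one place instead of being split across a scheduler, a DAG repository, and deployment scripts.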
To see a demo of how to accelerate and automate complex AI workflows, watch the full webinar. Safaricom presents how the platform automatically handles containerization, networking, scaling, and security. Then, they show how GenAI can accelerate development workflows through a custom code-generation tool. This tool allows users to generate working AI applications and deploy them instantly, using natural-language prompts and a few lines of code.