AI/ML Ops — Trivia & Interesting Facts

Surprising, historical, and little-known facts about MLOps and machine learning operations.


MLOps was inspired by DevOps, but the term appeared in 2018

The term "MLOps" first gained traction around 2018, roughly a decade after DevOps emerged. It was popularized by teams at Google and Uber who realized that deploying ML models had unique operational challenges — like data drift and model decay — that traditional CI/CD pipelines couldn't handle.


Only 22% of ML projects make it to production

A 2020 Gartner study found that only 22% of machine learning projects ever reach production deployment. The gap between "works in a Jupyter notebook" and "runs reliably in production" became known as the "last mile problem" of ML, and it's the core reason MLOps exists as a discipline.


Uber's Michelangelo was one of the first MLOps platforms

Uber built Michelangelo starting in 2015 and publicly described it in 2017. It handled the full ML lifecycle — feature engineering, training, evaluation, deployment, and monitoring. At its peak, it managed thousands of models serving millions of predictions per second across Uber's services.


Google's TFX predates the MLOps term by three years

Google's TensorFlow Extended (TFX) was developed internally starting around 2015 and described in a 2017 KDD paper. It introduced concepts like data validation, model analysis, and serving infrastructure that later became standard MLOps components. The paper has been cited over 1,000 times.


Feature stores were invented at Uber in 2017

The concept of a centralized "feature store" — a shared repository of computed ML features — originated at Uber as part of Michelangelo. Uber engineers found that different teams were recomputing the same features independently, wasting compute and creating inconsistencies. Hopsworks and Feast later open-sourced the concept.
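The core idea is small enough to sketch: features are computed once, keyed by an entity (a driver, a user, a transaction), and every team reads the same values at serving time. The toy class below is an illustration of that contract, not Michelangelo's actual API or any real feature-store library; all names here are made up.

```python
from datetime import datetime, timezone

class FeatureStore:
    """Toy in-memory feature store: features are written once, keyed by
    (entity_id, feature_name), so teams stop recomputing them independently."""

    def __init__(self):
        # (entity_id, feature_name) -> (value, write timestamp)
        self._store = {}

    def put(self, entity_id, feature_name, value):
        self._store[(entity_id, feature_name)] = (value, datetime.now(timezone.utc))

    def get(self, entity_id, feature_names):
        # Assemble a feature vector for serving; missing features come back as None.
        return {name: self._store.get((entity_id, name), (None, None))[0]
                for name in feature_names}

store = FeatureStore()
store.put("driver_42", "trips_last_7d", 31)
store.put("driver_42", "avg_rating", 4.8)
print(store.get("driver_42", ["trips_last_7d", "avg_rating"]))
# {'trips_last_7d': 31, 'avg_rating': 4.8}
```

Real systems like Feast add what this sketch omits: an offline store for training data, point-in-time-correct joins, and a low-latency online store, but the lookup contract is the same.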


Model drift can make predictions useless in days

During the COVID-19 pandemic, many ML models experienced catastrophic drift within days as human behavior patterns changed overnight. Fraud detection models at multiple banks had to be rolled back to rule-based systems because the ML models had learned "normal" patterns that no longer existed.
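Drift like this is usually caught by comparing the live feature distribution against the training distribution. One common statistic is the Population Stability Index (PSI); the sketch below is a minimal from-scratch version for a single numeric feature, using the common rule of thumb that PSI above 0.2 signals significant drift.

```python
import math

def psi(expected, actual, bins=10):
    """Population Stability Index between a training sample (expected) and a
    live sample (actual) of one numeric feature. Rule of thumb: > 0.2 = drift."""
    lo, hi = min(expected), max(expected)
    edges = [lo + (hi - lo) * i / bins for i in range(bins + 1)]

    def frac(sample, i):
        left, right = edges[i], edges[i + 1]
        # Last bin is closed on the right so the max training value is counted.
        in_bin = sum(1 for x in sample
                     if (left <= x < right) or (i == bins - 1 and x == right))
        return max(in_bin / len(sample), 1e-6)  # floor avoids log(0)

    return sum((frac(actual, i) - frac(expected, i))
               * math.log(frac(actual, i) / frac(expected, i))
               for i in range(bins))

train = [float(x) for x in range(100)]       # distribution seen at training time
live = [float(x) for x in range(50, 150)]    # live traffic, shifted upward
print(round(psi(train, train), 4))           # identical distributions -> 0.0
print(psi(train, live) > 0.2)                # shifted distribution -> True
```

In the pandemic scenario, a check like this on inputs such as transaction volume would have fired within days, long before labeled outcomes revealed the accuracy drop.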


MLflow has been downloaded over 30 million times

Databricks' MLflow, released as open source in June 2018, became the most widely adopted MLOps framework. By 2024, it had surpassed 30 million PyPI downloads. Its creator, Matei Zaharia, also created Apache Spark — making him responsible for two of the most important tools in the data/ML ecosystem.


Training costs have dropped 99.5% in six years

The cost to train a model to ImageNet-level accuracy dropped from approximately $1,100 in 2017 to about $5 in 2023, according to Stanford's AI Index. This 99.5% cost reduction reshaped MLOps from "how do we afford training" to "how do we manage thousands of cheap training runs."


Kubeflow started as a way to run TensorFlow on Kubernetes

Kubeflow began in late 2017 as a simple project to deploy TensorFlow jobs on Kubernetes. It was started by Google engineers David Aronchick and Jeremy Lewi. The scope expanded dramatically, and by 2020 it had become a full MLOps platform — though many teams found it overly complex for their needs.


The ML reproducibility crisis mirrors science's replication problem

A 2021 survey found that only 14% of ML papers published their code, and only 6% provided sufficient detail to reproduce results. This "ML reproducibility crisis" drove the creation of tools like DVC, Weights & Biases, and MLflow that track experiments, data versions, and hyperparameters.


Shadow deployments are the most common ML rollout strategy

Unlike traditional software where canary or blue-green deployments dominate, ML systems most commonly use "shadow mode" — running the new model alongside the old one, comparing predictions, but only serving the old model's results. This lets teams catch accuracy regressions before users are affected.
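The pattern described above fits in a few lines. This is a generic sketch of shadow mode, not any particular serving framework's API; the models and field names are invented for illustration.

```python
def shadow_predict(primary_model, shadow_model, features, log=print):
    """Serve the primary model's prediction while running the candidate in
    shadow mode, logging disagreements for offline analysis."""
    primary_out = primary_model(features)
    try:
        # The shadow model sees real traffic but can never affect serving.
        shadow_out = shadow_model(features)
        if shadow_out != primary_out:
            log(f"shadow mismatch: primary={primary_out} "
                f"shadow={shadow_out} features={features}")
    except Exception as exc:
        log(f"shadow model failed: {exc}")  # shadow errors must not break serving
    return primary_out  # users always get the production model's result

# Toy fraud models with different thresholds:
old = lambda x: "fraud" if x["amount"] > 1000 else "ok"
new = lambda x: "fraud" if x["amount"] > 800 else "ok"
print(shadow_predict(old, new, {"amount": 900}))   # logs a mismatch, returns "ok"
print(shadow_predict(old, new, {"amount": 1200}))  # both agree, returns "fraud"
```

Once the logged mismatches have been reviewed and the candidate's accuracy confirmed on live traffic, promotion is a one-line swap of which model is primary.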