What is MLOps?
In a hypothetical world, you have built an AI-powered self-driving car. It has been through the full development cycle: design, implementation, training, and testing. Would you keep it locked inside a garage? We hear your resounding answer: No! The car will make its way onto the roads, encountering unknown topography and unforeseen issues. Congratulations! Your proof of concept has graduated to a mature system that now needs a team to maintain it. Its infrastructure and deployment needs cannot be addressed by DevOps alone, because an AI-powered car involves data, the complexities that come with it, and the need for long-term model coherence.
Apart from mechanical wear and tear, your AI-powered car will need monitoring and handling of issues that arise during operation. For instance, the slew of real-world images of pedestrians, blockages, and other vehicles makes for a dynamic dataset, and the models that power the car need to incorporate this volatility. Moreover, new requirements around user privacy or legal provisions might call for enhancements that should not compromise your business objectives. Suppose the car was well received, and you have now been asked to use your AI model to build a smart mini airplane. Does it make sense to build the AI-powered plane from scratch? Would it not be better to have reusable parts of your model available in a repository that controls versions and permissions? MLOps encompasses the provisions that keep an AI-powered vehicle running smoothly in the long term.
MLOps is the process that ensures the continuous delivery of ML software. It encompasses the ability to manage ML models as reusable software artifacts and to maintain the statistical properties of the data, through collaboration across different functions. While it might sound reminiscent of DevOps, it has its own distinctive principles.
The Lifecycle of MLOps
Incorporating MLOps can be a game-changer for your machine learning model lifecycle management, more so because these requirements are governed by volatile data and demographic changes that may affect your ability to realize your business objectives. Numerous repositories and MLOps resources can introduce you to MLOps, but we suggest you read on to get all the information we have put together.
In analogy with the software development lifecycle, MLOps has a lifecycle of its own. The ML development lifecycle is divided into four recurring steps, build, manage, deploy and integrate, and monitor, where each step is an umbrella for multiple tasks.
Now let us see what each step of MLOps does under the hood and which MLOps tools will best help you implement them.
- The build step focuses on the dynamics of the training data and the data that the model(s) will encounter post-deployment
- Such data considerations try to minimize the gap between the two and ensure that the real-time data does not compromise model performance. Apache Superset allows for rich visualizations and the building of dashboards for data exploration. If working on big datasets, Dask proves to be a savior by scaling the Python ecosystem to distributed computing
- Another concern addressed in this step is whether the variables that improve the model’s predictive performance can be reproduced in production. For instance, the data input by a user would be transformed before being used by the model. Such transformations should continue to give sensible results for a deployed model that takes dynamic consumer data
- This step also accounts for the complexity introduced by feature engineering and explainability
- A data pipeline setup can streamline, automate, and monitor the workflows behind the above requirements (see the sketch after these build-step points). Apache Zeppelin is a sophisticated solution that interprets several language backends to enable visualization-driven analytics. Apache Airflow orchestrates the scheduling and execution of pipelines through an easy-to-use, powerful interface. Luigi and Prefect are alternatives that provide similar functions
- Argo is specific to Kubernetes-based deployment, and Cronitor is a lightweight solution to monitor cron jobs. Horovod supports distributed deep learning frameworks
- Lastly, the build step of the MLOps pipeline explores how the models will be tested to reflect the rigors of a production environment. Kubeflow offers the infrastructure for distributed training and management of resources such as configuration and model files, metadata, and model tracking information
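To make the pipeline orchestration mentioned above concrete, here is a minimal sketch of an Airflow DAG that chains feature extraction, training, and evaluation on a schedule. The DAG id, schedule, and task functions are hypothetical placeholders, not taken from any specific product named above.

```python
# A minimal sketch of an Airflow DAG for a recurring training pipeline.
# The task bodies are hypothetical placeholders for your own pipeline steps.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_features(**context):
    # Pull the latest training data and compute features.
    ...

def train_model(**context):
    # Fit the model on the freshly extracted features.
    ...

def evaluate_model(**context):
    # Compare performance against the currently deployed model.
    ...

with DAG(
    dag_id="weekly_retraining",          # hypothetical name
    start_date=datetime(2022, 1, 1),
    schedule_interval="@weekly",
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract_features", python_callable=extract_features)
    train = PythonOperator(task_id="train_model", python_callable=train_model)
    evaluate = PythonOperator(task_id="evaluate_model", python_callable=evaluate_model)

    extract >> train >> evaluate
```

Luigi, Prefect, or Argo could express the same dependency chain; the point is that the workflow is declared, scheduled, and monitored rather than run by hand.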
- The manage step of the MLOps lifecycle aims to track models from conception through version control, approval for deployment, testing, and eventual release to production. A central repository that may also store metadata, performance metrics, and links to datasets helps achieve this extensive management
- Analogous to source code and document files, ML models also need to be tracked within organizations. More so, since the models may be re-used in different systems to foster collaboration between teams
- Tracking the data used for training and testing, along with the environment, helps streamline change management and is particularly useful for enforcing compliance. DVC and Pachyderm are powerful tools for this purpose (see the data-versioning sketch after these points)
- A trained and tested model in a central location saves effort and time through re-use
- A tool like Feast can provide a centralized registry that publishes and maintains features and promotes their sharing among teams
- While the role may not be clearly defined at an organizational level, this is where the functions of an MLOps engineer come to the fore. This particular role manages the model governance as well as versioning, permissions, tracking, and re-use
- MLlib, Apache Spark’s scalable ML library, offers a rich collection of ML algorithms along with support for pipelining, persistence, and utilities like feature engineering and hyperparameter tuning
- Multi-owner workflows can be created with Flyte
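As a small illustration of data versioning in the manage step, the sketch below uses DVC's Python API to read the exact dataset revision that a given model version was trained on. The repository URL, file path, and tag name are hypothetical.

```python
# A minimal sketch of reading a specific, versioned dataset with DVC's Python API.
import io

import pandas as pd
import dvc.api

# Fetch the exact dataset revision that a given model was trained on.
raw = dvc.api.read(
    path="data/training_set.csv",                       # path tracked by DVC in the repo
    repo="https://github.com/your-org/credit-model",    # hypothetical repository
    rev="model-v1.2",                                    # git tag or commit pinning the data version
)

df = pd.read_csv(io.StringIO(raw))
print(df.shape)
```

Pinning data to a git tag in this way is what makes a training run reproducible and auditable long after the model has shipped.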
- The deploy and integrate step is where a simple ML pipeline graduates from the development environment to an independent form that may translate to a business application. For instance, a model written as a Python script may now be published as SQL code to be executed on a hosted database, or exposed through a web-based interface
- The term deployment refers to the conversion of a model from a development environment to an executable suitable for a production environment. Integration, meanwhile, incorporates the deployed model into external interfaces. We can say that deployment and integration go hand in hand
- Consider the example of a loan application system that takes user information, predicts their creditworthiness, and outputs whether the application should be accepted or rejected for further consideration. While a data scientist develops such a predictive model, MLOps tools translate the tested model into accessible REST APIs, which a user-facing interface then consumes. The user can simply access the system through an ATM or website and check their loan eligibility (a minimal serving sketch follows these points)
- Some common methods of model deployment include code generation (CodeGen) for mainstream languages like C++, Python, and Java; serverless methods that publish functions as a service (FaaS); containers such as the highly popular ones provided by Docker; traditional server-based methods; and model interchange standards such as ONNX, PMML, and PFA
- KFServing provides serverless publishing on Kubernetes, while Streamlit translates data scripts into shareable web applications. Seldon Core is a popular means to bring maturity to model deployment. It provides advantages like seamless integration with any cloud platform since it functions on Kubernetes. Moreover, it facilitates language wrapping and deployment add-ons like canary rollouts and A/B tests
- The integration endpoints can be a batch job, an interactive application, real-time streaming to enable fast predictions or a connected device driven by the model output
- BentoML makes it very easy for teams to develop production-ready endpoints. Another platform, Cortex, provides deployment, management, and scaling of deployed models. It facilitates quick real-time responses through its auto-scaling APIs and additionally allows batch processing of jobs
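As a generic illustration of the deploy-and-integrate step, not tied to any specific tool listed above, the sketch below wraps a trained model behind a REST endpoint in the spirit of the loan-application example. The model file, feature names, and acceptance threshold are hypothetical, and a scikit-learn-style model object is assumed.

```python
# A minimal sketch of serving a trained model as a REST API with FastAPI.
# Model file name, features, and threshold are hypothetical.
import pickle

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

# Load the serialized model produced earlier in the pipeline.
with open("credit_model.pkl", "rb") as f:
    model = pickle.load(f)


class LoanApplication(BaseModel):
    income: float
    existing_debt: float
    credit_history_years: float


@app.post("/predict")
def predict(application: LoanApplication):
    features = [[application.income,
                 application.existing_debt,
                 application.credit_history_years]]
    # Assumes a scikit-learn-like model exposing predict_proba.
    score = model.predict_proba(features)[0][1]
    # Accept the application for further review if the score clears a threshold.
    return {"creditworthiness": score, "accepted": bool(score >= 0.5)}
```

Tools such as BentoML, Seldon Core, or KFServing package and scale this same pattern for you, adding extras like canary rollouts and A/B tests.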
- The monitor step ensures the smooth functioning of the deployed model and protects it from going obsolete. It uses three classes of metrics, namely statistical, performance, and business (or ROI). In combination with re-training and re-modeling, these metrics are used to keep the model relevant to operational needs
- The work is not done after deploying the machine learning model to production; continuous review of the model is also critical. First, a statistical metric might set a threshold on the model’s accuracy and initiate re-training or re-modeling if performance falls below it (a minimal check of this kind is sketched after these monitoring points). Second, a champion-challenger practice may pit the latest version of the deployed model against a different candidate model. A third statistical metric checks whether the model’s predictions have withstood changes over time and in the concerned demography
- Organizations also need to ensure that they have provided the resources for a deployed model to run seamlessly. Monitoring performance metrics like execution times, the number of records saved per second, and so on tells them whether the deployed model has fallen short of memory or processing power
- The monitor step also tries to measure whether the deployed model meets the business objectives it was envisaged for. For instance, a model deployed by a social media platform to gauge user engagement will use metrics such as click-through rates
- The operationalization of an ML model will show return on investment (ROI) when the statistical metrics agree with the business metrics. For instance, a recommender system may be contributing X% of sales for a firm. A champion-challenger approach can be used to determine that another candidate model with better accuracy leads to a 2X% conversion to sales. The firm thereby achieves a better ROI by using the monitor function of the MLOps platform
- Some common monitoring platforms include Amazon Sagemaker, Hydrosphere, and Censius
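To illustrate the statistical class of metrics, here is a minimal, tool-agnostic sketch that compares a deployed model's recent accuracy against a threshold and flags it for re-training when performance degrades. The threshold value and the toy labels standing in for labelled production data are hypothetical.

```python
# A minimal, tool-agnostic sketch of an accuracy-threshold monitor.
from sklearn.metrics import accuracy_score

ACCURACY_THRESHOLD = 0.85  # hypothetical value agreed with business stakeholders


def check_model_health(y_true, y_pred) -> bool:
    """Return True if the deployed model still meets the accuracy threshold."""
    accuracy = accuracy_score(y_true, y_pred)
    print(f"Rolling accuracy on recent production data: {accuracy:.3f}")
    if accuracy < ACCURACY_THRESHOLD:
        # In a real pipeline this would raise an alert or trigger the
        # re-training workflow instead of just printing a message.
        print("Accuracy below threshold: flagging model for re-training")
        return False
    return True


# Example usage with toy labels (stand-ins for recent, labelled production data).
recent_labels = [1, 0, 1, 1, 0, 1, 0, 0]
recent_predictions = [1, 0, 0, 1, 0, 1, 1, 0]
check_model_health(recent_labels, recent_predictions)
```

Monitoring platforms automate this kind of check across many models and metrics, and attach alerting and dashboards to it.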
When Monitoring Saves the Day
“The deeper the waters are, the more still they run.”- A Korean proverb
Traditional software is stable because it runs on consistent APIs. In contrast, machine learning in production is driven by the dynamic variables of real-world data. Changes in their statistical properties, their inter-relationships, and the occasional creation of new variables lead to inevitable model drift.
The most common way to counter model drift is to either re-train the model on fresh data or simply remodel it. Re-training uses the same set of data variables and model structure but with new values, while re-modeling involves modifying the data variables. This process is more complex and includes the following steps (a simple drift check is sketched after this list):
- Assessment of model accuracy and re-evaluation
- Assessment of model attributes such as business goals, overfitting tendencies, bias, or explanation-mandated factors like feature attribution. The champion-challenger method might be used here
- Diagnostic assessment of the model through statistical hypothesis tests, ROC, and so on
- Versioning of the model and audit for compliance
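As a simple illustration of the diagnostic assessment mentioned above, the sketch below uses a two-sample Kolmogorov-Smirnov test from SciPy to check whether a feature's live distribution has drifted from its training distribution. The significance level and the synthetic data are hypothetical.

```python
# A minimal sketch of detecting drift in a single feature with a two-sample
# Kolmogorov-Smirnov test. The significance level and data are hypothetical.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(42)

# Stand-ins for the feature's values at training time and in production.
training_values = rng.normal(loc=0.0, scale=1.0, size=5_000)
production_values = rng.normal(loc=0.4, scale=1.2, size=5_000)  # shifted distribution

statistic, p_value = ks_2samp(training_values, production_values)

ALPHA = 0.01  # hypothetical significance level
if p_value < ALPHA:
    print(f"Drift detected (KS statistic={statistic:.3f}, p={p_value:.4f}): "
          "consider re-training or re-modeling")
else:
    print("No significant drift detected")
```

Running such tests by hand for every feature of every model quickly becomes unmanageable, which is exactly the gap that automated monitoring fills.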
The extent of observation and assessment is huge, and doing it manually can be tedious and prone to human error, more so because CI/CD for machine learning results in a dynamic system that badly needs monitoring across various aspects. The Censius AI Observability Platform is a one-stop solution for all your MLOps monitoring needs.
The Censius observability tool can easily align accuracy and business metrics to contain model drift and unintended bias. Moreover, its automation guards against upstream data changes and their degrading effects on ROI. The three easy steps to use the platform do not require coding expertise:
- Register your model, log the relevant features and sit back to capture predictions
- Achieve tailor-made observability by choosing from a plethora of monitor configurations
- Use the rich interface to monitor violations and analyze issues
The platform provides automated monitoring for the whole of your ML lifecycle.
What’s more? You will get to see visuals and make custom dashboards for diagnostic assessment of the models.
AI explainability has become quite popular due to its ability to make black-box models more transparent and encourage trust among users. The Censius platform also brings explainability to the table to facilitate more trustworthy solutions.
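As a generic illustration of the feature-attribution side of explainability (not the Censius platform itself), the sketch below uses the open-source SHAP library with a tree-based model. The dataset and model are arbitrary stand-ins.

```python
# A generic sketch of feature attribution with SHAP on a tree-based model.
import shap
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

data = load_breast_cancer(as_frame=True)
X, y = data.data, data.target

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# TreeExplainer computes SHAP values efficiently for tree ensembles.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

# Visualize which features drive the predictions across the dataset.
shap.summary_plot(shap_values, X)
```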
Wrapping up
Machine learning operations management guarantees the continuous smooth functioning of your deployed model. The four steps of MLOps form a well-rounded approach to translating model metrics into better business outputs, and they offer several insights into how to scale ML models. We have condensed our knowledge about challenges in ML that can help you sail through choppy waters using MLOps. If you are still not convinced that it is the way of the future, then do check out what MLOps has in store for the year 2022.
If your organization has decided to operationalize ML models, we suggest you also be aware of the five pillars that uphold an MLOps platform. While the functions may seem daunting, there are platforms such as Censius that can automate your MLOps monitoring needs and let your team focus on building more powerful AI solutions.
Explore Censius AI Observability Tool