CI/CD in a conventional setup
Good technology is not just path-breaking; it is also a concept that has withstood the wear of time and evolved to serve new needs. In 1913, when Ford Motors started assembly-line production of cars, little did they know that a technique borrowed from slaughterhouses would make its way into unimaginable domains. Making parts, moving them along a conveyor belt, and handing the output of one step to the next set of workers or tools was the birth of assembly-line manufacturing. Not only did it cut the time workers spent moving around, but it also let engineers find weak points. Fixing these pain points expedited the manufacturing process.
This concept found its way into the software industry as well. DevOps emerged as a set of practices that managed the handover of outputs between stages of the software development lifecycle (SDLC) to ensure quick and smooth software delivery. Analogous to an efficient assembly line, DevOps was not restricted to shipping the product to the client; it also consumed feedback and responded to changes rapidly.
The DevOps cycle is a continuous process flow that ensures feedback and improvements can be incorporated into planning, integration, delivery, deployment, and testing. Among these, let us focus on Continuous Integration (CI) and Continuous Delivery (CD). CI enables developers to frequently enhance the common codebase with their additions. While the thought of integration can give you the heebie-jeebies, many tools are now available to automate the process. Automated CI lowers the team's effort spent on building and testing, and automated tests quickly surface any conflicts or errors, allowing merges as frequent as every hour.
Similarly, automating CD, the next stage of the DevOps assembly line, offers its own benefits. CD aims at the automated release of integrated and tested code, the expected output being a stable codebase ready for the production environment. It also ensures that client feedback is reflected in the live software. Together, CI/CD processes ensure stable and up-to-date software through automation tools such as GitLab, CML, GoCD, TeamCity, Terraform, and others. Lastly, the DevOps approach also proposed the concept of Infrastructure as Code (IaC), where the source code and its entire infrastructure are maintained in the repository.
CI/CD in ML lifecycle
CI/CD has become a backbone of traditional DevOps and has found its place as an essential component of machine learning model lifecycle management. MLOps recognizes the needs that arise when a trained and tested model is deployed and must respond to the rigors of the production environment. While several techniques benefit machine learning operations management, an MLOps pipeline needs CI/CD to quickly test and deploy new implementations in response to changes in live data and a dynamic business environment. Since an organization may have different needs and varying levels of access to resources, a simple CI/CD workflow can prove to be a good starting point for an MLOps engineer. The pipeline can then be iterated on, with more processes added as new requirements arise.
An example of CI/CD for ML could include processes like:
- Building the model. The team would build ML artifacts, save them at a central repository, maintain and run sanity checks, and also generate explainability reports.
- Deployment to test environment. The team would validate performance through test suites and manual checks in some cases.
- Deployment to production. The developed model can be deployed as a canary or an API.
While the requirements of each team may vary, a CI/CD pipeline can be helpful to ensure a robust MLOps strategy.
CI/CD Challenges in ML Lifecycle
The processes and their inherent issues
A CI/CD pipeline consists of different processes, and each comes with associated challenges.
These processes, not all of which will necessarily be present in your workflow, include:
Develop and experiment
“Small actions every day lead to big results.”
A deployed model may run into bugs or demand more functionality. Moreover, since the world of computing continuously comes up with new ML algorithms, there is always scope for improvement.
- Your team will implement fixes and additional features, test them at their end and merge them with the central repository.
- Since there could be multiple contributors, this stage carries the risk of merge conflicts and of shipping a live version with errors.
- The expected outputs of this CI/CD process are called ML artifacts. An ML artifact is a testable bundle of model code, configuration and hyperparameters, training and validation datasets, development environment specifics, documentation, and test scenarios used at this level. Therefore, maintenance of versions and access control is much needed.
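As a rough illustration, the pieces of an ML artifact can be bundled and checksummed together so they version as one unit. The file names and fields below are hypothetical, not a standard layout:

```python
import hashlib
import json
from pathlib import Path

# Hypothetical sketch: bundle the pieces of an ML artifact (model, config,
# dataset references, docs) so they can be versioned and access-controlled
# together in a central repository.
def build_artifact(out_dir: str, model_bytes: bytes, config: dict,
                   dataset_refs: dict, docs: str) -> dict:
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    (out / "model.bin").write_bytes(model_bytes)          # serialized model weights
    (out / "config.json").write_text(json.dumps(config))  # hyperparameters, env specifics
    (out / "README.md").write_text(docs)                  # documentation / test scenarios
    manifest = {
        "version": "1.0.0",
        "datasets": dataset_refs,  # pointers to training/validation data, not copies
        "checksum": hashlib.sha256(model_bytes).hexdigest(),
    }
    (out / "manifest.json").write_text(json.dumps(manifest, indent=2))
    return manifest

manifest = build_artifact(
    "artifacts/churn-model",
    model_bytes=b"\x00fake-weights",
    config={"max_depth": 5, "python": "3.10"},
    dataset_refs={"train": "data/train.csv", "val": "data/val.csv"},
    docs="Churn classifier, cycle 42.",
)
print("artifact checksum:", manifest["checksum"][:12])
```

Keeping dataset references (rather than copies) in the manifest is what makes pairing with a data-versioning tool such as DVC natural later on.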
Testing and integration
“Forget the mistake; remember the lesson.”
CI processes depend on tests to ensure seamless merges of new additions. These include regression checks – whether the updated code breaks the employed feature engineering logic or the existing implementation.
- Unit tests are required to check the model's training losses and to guard against garbage or NaN prediction values.
- The newly added code should not break the existing functions and deliver the expected ML artifacts.
- The expected outputs include artifacts that are ready for the deployment environment.
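A minimal sketch of such unit tests, with a stub `predict` function standing in for the real model (in a CI job these would run against the freshly trained artifact):

```python
import math

# Hypothetical predictor stub standing in for the real model under test.
def predict(features):
    return [0.2 * sum(f) for f in features]

def test_no_nan_or_garbage_predictions():
    preds = predict([[1.0, 2.0], [3.0, 4.0]])
    for p in preds:
        assert not math.isnan(p), "model emitted NaN"
        assert -1e6 < p < 1e6, "model emitted a garbage-scale value"

def test_training_loss_improves():
    # CI-friendly sanity check: loss logged during training should trend down.
    losses = [0.9, 0.6, 0.4, 0.35]  # would be read from the training run's logs
    assert losses[-1] < losses[0], "training loss did not improve"

test_no_nan_or_garbage_predictions()
test_training_loss_improves()
print("all checks passed")
```

In practice these functions would live in a pytest suite that the CI runner executes on every merge request.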
Delivery to a pre-production environment
“You have to let go of what you were to become who you will be.”
This phase of a CI/CD pipeline tackles all the hurdles before the model can be assumed to be compatible with the production environment.
- The verifications at this stage of the machine learning production pipeline include checking the availability and versions of packages and computational and accelerator resources.
- The checks also verify whether the tested model meets the expected performance targets. The target environment may have to cater to multiple models, and compatibility issues must be addressed during this phase.
- The expected outputs include trained models with functionalities added for that particular cycle, which are now part of the model registry.
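The package and accelerator checks above can be sketched as follows. The required versions and the `nvidia-smi` probe are illustrative assumptions; a real pipeline would check its own dependency pins and query its actual accelerator stack:

```python
import importlib.metadata
import shutil

# Illustrative pre-production verification: package versions and accelerator
# availability are checked before the model is promoted.
REQUIRED = {"pip": "20.0"}  # package -> minimum version, illustrative only

def version_tuple(v: str):
    # Compare only the leading numeric components, e.g. "2.10.3" -> (2, 10).
    return tuple(int(p) for p in v.split(".")[:2] if p.isdigit())

def check_packages(required):
    problems = []
    for pkg, min_ver in required.items():
        try:
            installed = importlib.metadata.version(pkg)
        except importlib.metadata.PackageNotFoundError:
            problems.append(f"{pkg}: not installed")
            continue
        if version_tuple(installed) < version_tuple(min_ver):
            problems.append(f"{pkg}: {installed} < {min_ver}")
    return problems

def accelerator_visible():
    # Stand-in check: a real pipeline might parse nvidia-smi output or
    # query the ML framework directly.
    return shutil.which("nvidia-smi") is not None

issues = check_packages(REQUIRED)
print("package issues:", issues or "none")
print("GPU visible:", accelerator_visible())
```

A failed check here should fail the pipeline stage rather than silently promote an incompatible model.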
Deployment to production
“All the world’s a stage.”
The updated model needs to undergo more tests before making its grand appearance on the target infrastructure.
- To serve the model as a prediction API, the service should be tested with inputs that it will encounter in the wild.
- Its performance also needs to be gauged through specific metrics like model latency. The team can proceed with model deployment after the pipeline has been run in the pre-production environment a certain number of times.
- The expected output of the CI/CD pipeline phase is a trained and tested model available as a prediction service.
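The latency gauging described above can be sketched like this. Here `predict` is a local stub; in practice you would issue requests against the deployed endpoint with inputs the model will encounter in the wild, and the 100 ms budget is an arbitrary example:

```python
import time

# Call the prediction service repeatedly and report percentile latency.
def predict(features):
    return sum(features) / len(features)  # stub for the real prediction service

def measure_latency(fn, sample_inputs, runs=50):
    timings = []
    for _ in range(runs):
        for x in sample_inputs:
            start = time.perf_counter()
            fn(x)
            timings.append(time.perf_counter() - start)
    timings.sort()
    return {
        "p50_ms": 1000 * timings[len(timings) // 2],
        "p95_ms": 1000 * timings[int(len(timings) * 0.95)],
    }

stats = measure_latency(predict, [[0.1, 0.5, 0.9], [1.0, 2.0, 3.0]])
print(f"p50={stats['p50_ms']:.4f} ms, p95={stats['p95_ms']:.4f} ms")
# Gate promotion on a latency budget appropriate for your service.
assert stats["p95_ms"] < 100, "latency budget exceeded"
```

Running this gate on every pre-production cycle is what lets the team confidently proceed after "a certain number of times".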
Monitoring
“A watched pot never boils.”
Now that your model is in the field, you may not have users or clients providing feedback about its performance. Monitoring ML models in production thereby gains importance, letting the team gauge model performance as it works on live data. The monitoring involves three primary functions:
- Monitoring of resources. The system metrics such as CPU and memory consumption and network usage can be indicators of underlying issues. Their collection and monitoring facilitate detection and troubleshooting of potential performance issues.
- Scheduled diagnostics. A simple mechanism of regularly querying the model running as a prediction service can help uncover any latency issues.
- Monitoring of ML metrics. The accuracy metrics of the live model can be compared with those of previous versions or publicly available benchmark figures. Such comparisons help detect whether the model needs re-training or implementation-level enhancements.
- The expected output of this phase is an observation or an alert that should trigger the execution of the CI/CD pipeline or a new build and experiment phase.
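The three monitoring functions and the resulting trigger can be sketched as below. All thresholds and metric sources are illustrative assumptions; real values would come from your monitoring stack:

```python
# Hedged sketch of monitoring feeding an alert that retriggers the CI/CD
# pipeline. Every threshold here is an example, not a recommendation.
def collect_resource_metrics():
    # In production these come from system monitoring (CPU, memory, network).
    return {"cpu_pct": 45.0, "mem_pct": 62.0}

def scheduled_diagnostic_ok(latency_ms):
    # Regularly querying the prediction service surfaces latency issues.
    return latency_ms < 200

def ml_metric_drifted(live_accuracy, baseline_accuracy, tolerance=0.05):
    # Compare the live model against a previous version or benchmark figure.
    return baseline_accuracy - live_accuracy > tolerance

def should_trigger_pipeline(resources, diag_ok, drifted):
    overloaded = resources["cpu_pct"] > 90 or resources["mem_pct"] > 90
    return overloaded or (not diag_ok) or drifted

alert = should_trigger_pipeline(
    collect_resource_metrics(),
    scheduled_diagnostic_ok(latency_ms=120.0),
    ml_metric_drifted(live_accuracy=0.84, baseline_accuracy=0.91),
)
print("trigger CI/CD pipeline:", alert)
```

With the example numbers above, the accuracy drop of 0.07 exceeds the 0.05 tolerance, so the alert fires and a new build-and-experiment phase would begin.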
How does MLOps tackle the CI/CD challenges?
MLOps practices and tools can address the issues listed above:
- For source code versioning, Git has been a popular choice, but it is less suitable for machine learning versioning, which involves large binary files and datasets. DVC is a recommended open-source tool to manage data versioning needs.
- Neptune is suitable for experiment tracking, logging and tracking ML metadata, and serving as a model registry. The visual interface provided by Neptune can prove handy for your team.
- Editors like PyCharm and Spyder allow for writing unit tests, while WEKA is useful for manual testing of ML models. Many tools are available to automate unit tests and code-coverage checks, including Functionize, Test.ai, and Appvance.
- Automation of pre-production checks has many benefits; for example, approved code merges can trigger model delivery to the pre-production environment. The trigger may also result in a rebuild of the target environment.
- Containerization can tackle the challenges associated with the model-serving process. Docker is one of the most widely adopted containerization solutions. Several Docker hosts can be created for a target environment that caters to multiple models, and their orchestration is well managed within a Kubernetes cluster. The combination of Docker and Kubernetes can provide a suitable infrastructure to handle the issues that come with the delivery of ML models.
- The deployment strategies defined by MLOps aim to minimize downtime risks. In a strategy called blue-green (or red-black) deployment, the updated model is deployed alongside the stable one and requests are redirected to it. If the new model displays the expected functionality, the older version is phased out. Kubernetes supports this type of deployment natively.
- Another approach to machine learning serving borrows its name from the canaries once used to detect dangerous conditions in deep mines and warn the miners. Here, the updated model is treated as the canary while the stable version continues to work in production. The model under canary deployment receives a certain percentage of the workload, and its performance is monitored.
- Canary releases are often more efficient but require more effort towards implementation, monitoring, and error handling. Tools like Seldon Core and Cortex can automate and reduce risks for the deployment process.
- Tools like Cronitor and Healthchecks.io allow teams to schedule pings in the form of HTTP requests to the live model, log results and use dashboards to analyze the deployed model’s response trends.
- Automation capabilities provided by tools such as the Censius observability platform can help detect issues in the production environment. Additionally, explainability platforms like ELI5 can help the team better understand the predictions and interpret model parameters.
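The canary traffic split described above can be sketched as a toy router. This is an in-process illustration only; real traffic splitting happens at the load balancer or service mesh, for example via Seldon Core's traffic weights:

```python
import random

# Toy canary routing: a fixed fraction of requests goes to the candidate
# model while the stable version keeps serving the rest.
def make_router(canary_fraction, seed=None):
    rng = random.Random(seed)  # seeded for reproducibility in this demo
    def route(request):
        return "canary" if rng.random() < canary_fraction else "stable"
    return route

route = make_router(canary_fraction=0.1, seed=42)
counts = {"canary": 0, "stable": 0}
for i in range(10_000):
    counts[route({"id": i})] += 1
print(counts)  # roughly 10% of traffic lands on the canary
```

Monitoring the canary's share of traffic, and rolling back if its metrics degrade, is exactly the error handling that makes canary releases more effort than blue-green switches.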
How do I get my team to CI/CD?
Now that you understand CI/CD and the tools that can be used, more information on open-source machine learning projects and MLOps platforms can be found in the awesome-production-machine-learning repo. If your team is already using Git, we would like to introduce Continuous Machine Learning (CML), a potent add-on to answer your CI/CD needs.
So how does CML elevate machine learning operations management?
CML is an open-source tool that helps automate CI/CD pipeline functions. You can automate model training, evaluation, and monitoring, and view evaluation reports at every pull request. It is a natural choice if your team uses GitLab or GitHub to manage ML models. The auto-generated reports can guide your decisions on CI/CD pipeline triggers. Moreover, it is also compatible with cloud platforms such as AWS EC2 and Azure. Here is how you can set it up and get working with it:
- A GitLab, GitHub, or Bitbucket account is needed to use CML.
- Your team will also need an access token, which can be generated for each repository. The token-generation file and the appropriate steps for GitLab can be found on the CML website.
- Similarly, the key file and instructions for Bitbucket can be found on the same website.
- And likewise the token access file and instructions for GitHub.
- A good way to try CML is by forking the sample CML project for a classifier model and cloning your fork:
git clone https://github.com/<your-username>/example_cml
- Create a workflow file and place it at the path .github/workflows/cml.yaml
A typical workflow file looks like this.
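The following is a sketch based on the sample project; the container image, action versions, and CML command names (cml-publish, cml-send-comment) reflect early CML releases and may have changed since:

```yaml
name: model-training
on: [push]
jobs:
  run:
    runs-on: ubuntu-latest
    container: docker://dvcorg/cml-py3:latest
    steps:
      - uses: actions/checkout@v2
      - name: cml_run
        env:
          repo_token: ${{ secrets.GITHUB_TOKEN }}
        run: |
          pip install -r requirements.txt
          python train.py

          cat metrics.txt >> report.md
          cml-publish confusion_matrix.png --md >> report.md
          cml-send-comment report.md
```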
- To better understand how CML handles model changes, open the train.py of the sample CML project and modify it:
  - Change depth = 5
  - git checkout -b test_cml
  - git add . && git commit -m "modified forest depth"
  - git push origin test_cml
- Now open a pull request to compare the master branch with the test_cml branch.
- After a refresh, a github-actions comment containing your CML report will be displayed on the pull request. This is due to the cml-send-comment function of your workflow.
To summarize a typical CML workflow: changes pushed to your repository trigger the execution of the workflow file, and once it runs, the results are shared back to the repository management system.
Lastly, CML can also be used for data versioning with the help of DVC. One of the advantages of this approach is that you can view metric changes after the data changes have been committed.
This is the end of your initiation into CI/CD, which grew its presence from DevOps to MLOps, but just the beginning of elevating your MLOps game. Thank you for reading.