Lifecycle Management of Data Science Models at redBus
This post is Part 1 of a two-part series explaining how Machine Learning models are trained, served for inference, and monitored at redBus.
The Data Engineering team at redBus decided to build a platform that manages the end-to-end lifecycle of a machine learning model, from pulling data out of a data source through to model inferencing. Here’s how we built a platform that makes life easier for Data Scientists by abstracting training, management, reporting, etc. into a single central place.
Lifecycle of a model:
Before we dig into the platform, here’s a quick refresher on the Machine Learning model lifecycle. The lifecycle of a model generally has 4 parts:
- Training data
The first and foremost part in the lifecycle of a model is to fetch the training data from a data source, run Machine Learning algorithms on it, and generate a trained model, usually a .pkl file.
- Metrics Analysis
Reporting and analysis of the metrics and parameters generated by a training run are crucial in determining the accuracy of the model. Metrics like MAPE, R2, etc. should be available and comparable with previous runs.
- Inferencing/Deployment
The trained Data Science model, when it undergoes all the metric quality checks, is…
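The first two stages above (training into a pickled artifact, then comparing metrics against an earlier run) can be sketched in Python as follows. This is only a minimal sketch, not the redBus platform's code: the toy data, the `train`, `predict`, `mape`, `r2` and `is_better` helpers, the previous-run numbers and the gating rule are all assumptions made for illustration.

```python
import pickle


def train(xs, ys):
    """Illustrative 'training run': fit y = slope * x + intercept by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return {"slope": slope, "intercept": mean_y - slope * mean_x}


def predict(model, xs):
    """Apply the fitted line to new inputs."""
    return [model["slope"] * x + model["intercept"] for x in xs]


def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Assumes no actual value is 0."""
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def r2(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def is_better(current, previous):
    """Hypothetical quality gate: lower MAPE, and R2 no worse than the previous run."""
    return current["mape"] < previous["mape"] and current["r2"] >= previous["r2"]


# Toy data standing in for rows pulled from a data source.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

model = train(xs, ys)
preds = predict(model, xs)
metrics = {"mape": mape(ys, preds), "r2": r2(ys, preds)}

# Serialize the trained model; on disk this would typically be a .pkl file.
blob = pickle.dumps(model)
restored = pickle.loads(blob)

# Made-up metrics from an earlier run, used only to show the comparison.
previous_run = {"mape": 5.0, "r2": 0.95}
print(metrics, is_better(metrics, previous_run))
```

Keeping the metrics in a plain dictionary per run is one simple way to make runs comparable; a real platform would more likely persist them in a tracking store alongside the model artifact.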