Lifecycle Management of Data Science Models at redBus

Yash Bansal
6 min read · Feb 8, 2024

This blog is Part 1 of a two-part series explaining how Machine Learning models are trained, served for inference, and monitored at redBus.

The Data Engineering team at redBus decided to build a platform that manages the end-to-end lifecycle of a machine learning model, from pulling data out of a data source all the way to model inferencing. Here’s how we built a platform that makes life easier for Data Scientists by bringing training, model management, reporting, etc. into a single central place.

Lifecycle of a model:

Before we dig into the platform, here’s a quick refresher on the Machine Learning model lifecycle. The lifecycle of a model generally has 4 parts:

  1. Training data
    The first step in the lifecycle of a model is to fetch the training data from a data source, run one or more Machine Learning algorithms on it, and generate a trained model, usually a pickle (.pkl) file.
  2. Metrics Analysis
    Reporting and analysis of the metrics and parameters generated by a training run are crucial in determining the accuracy of the model. Metrics like MAPE, R², etc. should be available and comparable with those of previous runs.
  3. Inferencing/Deployment
    The trained Data Science model, once it passes all the metric quality checks, is…
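
To make the first two steps concrete, here is a minimal sketch in plain Python. It is not the redBus platform's code: the hard-coded data, the one-feature least-squares fit, the `passes_quality_check` gate, and the "previous run" numbers are all assumptions made up for illustration. It only shows the general shape of "train, serialize with pickle, compute MAPE and R², compare with the last run".

```python
import pickle


def train(xs, ys):
    """Fit a one-feature least-squares line (a stand-in for the real algorithm)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    # The "model" here is just its parameters, kept in a plain dict.
    return {"slope": slope, "intercept": mean_y - slope * mean_x}


def predict(model, xs):
    return [model["slope"] * x + model["intercept"] for x in xs]


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def r2(actual, predicted):
    """Coefficient of determination."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def passes_quality_check(current, previous):
    """Hypothetical gate: MAPE must not rise and R2 must not drop versus the previous run."""
    return current["mape"] <= previous["mape"] and current["r2"] >= previous["r2"]


# Step 1: training data (hard-coded here; a real run would pull it from a data source).
train_x = [1.0, 2.0, 3.0, 4.0, 5.0]
train_y = [2.1, 3.9, 6.2, 7.8, 10.1]
model = train(train_x, train_y)
model_bytes = pickle.dumps(model)  # the bytes that would be written to a .pkl file

# Step 2: metrics analysis on the restored model, compared with a previous run.
restored = pickle.loads(model_bytes)
preds = predict(restored, train_x)
current_metrics = {"mape": mape(train_y, preds), "r2": r2(train_y, preds)}
previous_metrics = {"mape": 5.0, "r2": 0.95}  # made-up numbers for illustration
accepted = passes_quality_check(current_metrics, previous_metrics)
print(current_metrics, accepted)
```

In practice the metrics would be computed on held-out data rather than on the training set, and each run's metrics would be stored somewhere queryable so that runs can be compared side by side.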


Yash Bansal

100K+ views. Principal Engineer. Loves to read and write about the latest tech, and sometimes about life topics. Find me on Topmate - https://topmate.io/yashbansal042