This quick post highlights the pieces you need to start thinking about how to deploy your machine learning models into production. I will be using a Python stack, specifically:
- FastAPI
- Docker
- Scikit-Learn
This very basic proof of concept has scripts that:
- Train (train.py) a really horrible model, but that’s not the point. It fits a text classifier and saves the model to disk for use downstream.
- Serve the model with FastAPI via app.py; run uvicorn app:app --reload if you want to test it locally. To view the docs, go to localhost:8000/docs.
- Dockerize the FastAPI app and serve it.
Why
- Remind my future self of the basic stack needed to deploy ML APIs.
- Act as a Quick Start guide that data product owners can use to start discussions with downstream engineers.
- Highlight how easy it is to get started serving models with scikit-learn, a robust ML framework that allows for reproducible pipelines. Current SOTA models tend to be built in deep learning frameworks like TensorFlow, but for a large set of business needs, the marginal gain is not worth the marginal effort.
The last point is especially true for teams that are just starting to figure out how to deploy models, even when those models run as batch jobs rather than behind an API.
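For the batch-oriented case, the same saved pipeline can be reused without any web framework. The sketch below is one hedged way to do that: `fit_toy_pipeline` and `score_batch` are hypothetical helper names, the training data is made up, and the naive Bayes model is just a stand-in for whatever classifier the training script produces.

```python
import tempfile
from pathlib import Path

import joblib
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline


def fit_toy_pipeline(model_path: Path) -> None:
    """Stand-in for the training script: fit a tiny pipeline and save it."""
    texts = ["great product", "love it", "terrible service", "hate it"]
    labels = ["positive", "positive", "negative", "negative"]
    pipeline = make_pipeline(CountVectorizer(), MultinomialNB())
    pipeline.fit(texts, labels)
    joblib.dump(pipeline, model_path)


def score_batch(model_path: Path, texts: list[str]) -> list[tuple[str, str]]:
    """Load the saved pipeline and label every text in a single predict call."""
    pipeline = joblib.load(model_path)
    labels = pipeline.predict(texts)
    return [(text, str(label)) for text, label in zip(texts, labels)]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "model.joblib"
        fit_toy_pipeline(path)
        for text, label in score_batch(path, ["love this", "terrible product"]):
            print(f"{label}\t{text}")
```

Because the vectorizer and classifier are saved as one object, the batch job and the API apply the same preprocessing, which is much of what makes the pipeline reproducible.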