This repository is part of the MLOps course at USFQ. To work with ML pipelines and their deployment to production, we will use BentoML, an open-source ML model serving framework that helps developers package, deploy, and serve ML models in production easily and efficiently.
First of all, the student must create a BentoML account at the following URL: https://www.bentoml.com/
Create a Python environment using virtualenv:

```shell
virtualenv -p python3.10 myvenv310
```

Create a Python environment using conda:
```shell
conda create --name myvenv310 python=3.10
```

To install the Python libraries:
```shell
pip install bentoml mlflow scikit-learn
```

Log in to BentoML from your local computer:
```shell
bentoml cloud login
```

To verify the model is saved to the Model Store:
```shell
bentoml models list
```

To start the MLflow tracking server:
```shell
mlflow server --host 127.0.0.1 --port 8080
```

Build the Docker image:
```shell
docker build -t mlflow-bento:latest .
```

Run the container interactively with a local volume mounted (to persist data):
```shell
docker run -it --name mlflow_bento \
  -p 5000:5000 -p 3000:3000 -p 8888:8888 \
  -v $(pwd):/app \
  mlflow-bento:latest
```

This drops you into the container's shell (/bin/bash), where you can run:
```shell
python my_script.py
mlflow ui --host 0.0.0.0
bentoml serve service:svc --host 0.0.0.0
```

To serve the model using the BentoML CLI:
```shell
bentoml serve 04_bentoml_service.py:IrisClassifier --port=3001
```

Make requests with curl:
```shell
curl -X 'POST' 'http://localhost:3000/predict' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{"input_data": [[5.9, 3.0, 5.1, 1.8]]}'
```

For the advanced service on port 3002, query the versioned and combined endpoints:

```shell
curl -X 'POST' 'http://localhost:3002/v1/predict' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{"input_data": [[5.9, 3.0, 5.1, 1.8]]}'
```

```shell
curl -X 'POST' 'http://localhost:3002/v2/predict' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{"input_data": [[5.9, 3.0, 5.1, 1.8]]}'
```

```shell
curl -X 'POST' 'http://localhost:3002/predict_combined/predict' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{"input_data": [[5.9, 3.0, 5.1, 1.8]]}'
```

To inspect the OpenAPI documentation and see the required schema for your service:
```shell
curl localhost:3000/docs.json
```

To run a server that batches requests:
```shell
bentoml serve 07_bentoml_service_advanced.py:IrisClassifier --port=3002
```

To run two endpoints and an ensemble prediction:
```shell
bentoml serve 09_bentoml_service_multiple.py:IrisClassifier --port=3003
```

Containerization: build an OCI-compliant image of your ML service for deployment on any container platform:
```shell
bentoml build
```

BentoML provides multiple options for production deployment. Next steps:
- Deploy to BentoCloud:

  ```shell
  bentoml deploy iris_classifier:nd46dyf6kkbzr5oe -n ${DEPLOYMENT_NAME}
  ```

- Update an existing deployment on BentoCloud:

  ```shell
  bentoml deployment update --bento iris_classifier:mmd2rarxb6fexe65 ${DEPLOYMENT_NAME}
  ```

- Containerize your Bento with `bentoml containerize`:

  ```shell
  bentoml containerize iris_classifier:mmd2rarxb6fexe65
  ```

- Push to BentoCloud with `bentoml push`:

  ```shell
  bentoml push iris_classifier:mmd2rarxb6fexe65
  ```
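
The curl requests shown earlier can also be made from Python. Below is a minimal sketch using only the standard library; the endpoint path, port, and `input_data` payload shape mirror the `/predict` examples in this README, but adjust them to whichever service you are running:

```python
import json
from urllib import request


def build_payload(rows):
    """Encode the rows as the JSON body the /predict endpoint expects."""
    return json.dumps({"input_data": rows}).encode("utf-8")


def predict(rows, url="http://localhost:3000/predict"):
    """POST a prediction request to a running BentoML service and return the parsed JSON response."""
    req = request.Request(
        url,
        data=build_payload(rows),
        headers={"accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Requires a service already running on port 3000, e.g.:
# predict([[5.9, 3.0, 5.1, 1.8]])
```

This is equivalent to the curl commands above and can be handy inside notebooks or test scripts in the container.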