How to deploy machine learning models as a microservice using FastAPI

FastAPI is one of the most popular web frameworks for building microservices with Python 3.6+. Deploying machine learning models as microservices makes code components reusable, easier to maintain and test, and quick to respond. FastAPI is built on ASGI (Asynchronous Server Gateway Interface) rather than WSGI (Web Server Gateway Interface), which Flask uses. ASGI lets the server handle requests asynchronously, which is the main reason FastAPI is typically faster than Flask-based APIs.
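
To make the ASGI idea concrete, here is a minimal sketch of a bare ASGI application driven by hand, using only the standard library. The names app and call_once, and the fake receive and send callables, are illustrative assumptions; in a real deployment a server such as uvicorn plays the role of call_once, and FastAPI builds the application callable for you.

```python
import asyncio
import json


async def app(scope, receive, send):
    """Bare ASGI application: answer every HTTP request with a small JSON body."""
    assert scope["type"] == "http"
    body = json.dumps({"status": "ok"}).encode()
    # An ASGI response is sent as two messages: the start line and headers, then the body.
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"application/json")],
    })
    await send({"type": "http.response.body", "body": body})


async def call_once():
    """Drive the app by hand, standing in for an ASGI server such as uvicorn."""
    sent = []

    async def receive():
        # A single, empty request body.
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    await app({"type": "http", "method": "GET", "path": "/"}, receive, send)
    return sent


messages = asyncio.run(call_once())
print(messages[1]["body"])  # b'{"status": "ok"}'
```

Because the application is a coroutine, the server can switch to other requests while one of them is waiting on I/O, which is what WSGI's synchronous call interface does not allow.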

FastAPI has a data validation system that detects invalid data types at runtime and returns the reason for the bad input to the user as JSON, which frees developers from handling these exceptions explicitly.
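
The validation is done by Pydantic, and it can be tried outside of any web server. The sketch below assumes pydantic is installed and reuses the four-field IrisSpecies model that appears later in this post; the bad value "wide" and the variable bad_fields are made up for illustration.

```python
from pydantic import BaseModel, ValidationError


class IrisSpecies(BaseModel):
    sepal_length: float
    sepal_width: float
    petal_length: float
    petal_width: float


bad_fields = []
try:
    # "wide" cannot be converted to a float, so validation fails.
    IrisSpecies(sepal_length="wide", sepal_width=3.5, petal_length=1.4, petal_width=0.2)
except ValidationError as exc:
    # Each error entry records where the problem is ("loc") and why ("msg").
    bad_fields = [error["loc"][0] for error in exc.errors()]

print(bad_fields)  # ['sepal_length']
```

When the same model is used as an endpoint parameter, FastAPI catches this error for you and sends the list of errors back to the client as a JSON response.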

The objective of this post is to explain how to deploy a machine learning model as a microservice with FastAPI, so we will focus on deployment rather than on model training.

Step 1. Prepare the model for which you want to create the API

To create an API for prediction we need a trained model, so I have written a few lines of code that train the model and save it to local disk as LRClassifier.pkl. I have not covered exploratory data analysis, pre-processing, or feature engineering, as those are out of scope for this article.

 import pandas as pd
 from sklearn.model_selection import train_test_split
 from sklearn.linear_model import LogisticRegression
 import pickle
 # Load dataset
 url = ""
 names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'class']
 dataset = pd.read_csv(filepath_or_buffer=url,header=None,sep=',',names=names)
 # Split-out validation dataset
 array = dataset.values
 X = array[:,0:4]
 y = array[:,4]
 X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=1, shuffle=True)
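 # train the logistic regression classifier
 classifier = LogisticRegression()
 classifier.fit(X_train, y_train)
 # save the model to disk
 with open('LRClassifier.pkl', 'wb') as f:
     pickle.dump(classifier, f)
 # load the model from disk
 with open('LRClassifier.pkl', 'rb') as f:
     loaded_model = pickle.load(f)
 # accuracy of the reloaded model on the held-out test set
 result = loaded_model.score(X_test, y_test)
 print(result)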

Jupyter snippet of the above code:

Python Jupyter notebook code for Logistic Regression

Step 2. Create the API using the FastAPI framework

Start from scratch so that you don’t run into errors:

  • Open VS Code or any other editor of your choice. I use VS Code.
  • Using the File menu, open the directory where you want to work.
  • Open a terminal and create the virtual environment as below:
    • python -m venv venv-name
    • Activate the venv using venv-name\Scripts\activate (on Windows) or source venv-name/bin/activate (on Linux/macOS)
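  • Install the libraries (pickle is part of the Python standard library, so it needs no installation; uvicorn is the server used to run the app):
    • pip install pandas
    • pip install numpy
    • pip install scikit-learn
    • pip install fastapi
    • pip install uvicorn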
  • Import the libraries as shown in the code below.
  • Create a FastAPI “instance” and assign it to app.
    • Here the app variable will be an “instance” of the class FastAPI.
    • This will be the main point of interaction to create all your API.
    • This app is the same one referred to by uvicorn in the command below:
    • uvicorn main:app
  • Here main is the name of the file (main.py) in which you are writing the code. You can give it any name, but you must use that same name in place of main when running the command.
  • When you need to send data from a client (let’s say, a browser) to your API, you send it as a request body.
  • A request body is data sent by the client to your API. A response body is the data your API sends to the client.
  • Your API almost always has to send a response body. But clients don’t necessarily need to send request bodies all the time.
  • To declare a request body, you use Pydantic models with all their power and benefits.
  • Then you declare your data model as a class that inherits from BaseModel.
  • Use standard Python types for all the attributes.
  • In our case we want to predict the Iris species, so we will create a data model as a class with four attributes, which are the measurements of the flower.
  • Now create an endpoint, also known as a route, named “predict”.
  • Add a parameter of the data model type we created, which is “IrisSpecies”.
  • Now we can post data as JSON and it will be received in the iris variable.
  • Next we load the already saved model into a variable named loaded_model.
  • Now perform the prediction as we usually do in machine learning and return the results.
  • Now you can run the app and see the User Interface (UI) that FastAPI generates with Swagger UI from the app's OpenAPI schema, which serves as both documentation and an interactive client.
  • The full code is given below. You can copy and paste it, and it will work if you have followed the steps above.
from fastapi import FastAPI
from pydantic import BaseModel
import pickle
import numpy as np
import pandas as pd

app = FastAPI()

class IrisSpecies(BaseModel):
    sepal_length: float 
    sepal_width: float 
    petal_length: float 
    petal_width: float

@app.post('/predict')
async def predict_species(iris: IrisSpecies):
    data = iris.dict()
    loaded_model = pickle.load(open('LRClassifier.pkl', 'rb'))
    data_in = [[data['sepal_length'], data['sepal_width'], data['petal_length'], data['petal_width']]]
    prediction = loaded_model.predict(data_in)
    probability = loaded_model.predict_proba(data_in).max()
    return {
        'prediction': prediction[0],
        'probability': probability
    }

VS-Code snippet of the API creation:

IrisSpecies Classifier API creation using FastAPI
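
The endpoint above unpickles LRClassifier.pkl on every request. One possible refinement is to load the model once and reuse it. The sketch below shows the idea with functools.lru_cache; get_model and demo_model.pkl are hypothetical names, and a plain dict stands in for the trained classifier so that the snippet runs without scikit-learn.

```python
import pickle
import tempfile
from functools import lru_cache
from pathlib import Path


@lru_cache(maxsize=None)
def get_model(path: str):
    """Unpickle the model at `path` once; later calls reuse the cached object."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Stand-in for LRClassifier.pkl (an assumption made only for this demonstration).
demo_path = str(Path(tempfile.mkdtemp()) / "demo_model.pkl")
with open(demo_path, "wb") as f:
    pickle.dump({"name": "stand-in model"}, f)

first = get_model(demo_path)   # reads the file from disk
second = get_model(demo_path)  # served from the cache, no disk access
print(first is second)  # True
```

Inside the endpoint, loaded_model = get_model('LRClassifier.pkl') would then replace the pickle.load call. The trade-off is that a retrained model file is only picked up after the app restarts or the cache is cleared with get_model.cache_clear().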

Executing the APP:

FastAPI execution

Now you can see the nice UI that FastAPI creates by opening the app's /docs URL in the browser.

Below you can see that the API endpoint is created as a POST request.


Click on the endpoint and it will expand as below.

FastAPI Iris Species classifier API

Now click on Try it out and paste the measurements to get the prediction.

FastAPI endpoint parameter body

I pasted some dummy measurements and clicked Execute.

FastAPI parameter body input test data for Iris species prediction

Now you can see that it has predicted Iris-setosa with a probability of 0.99. Note that this is the model's confidence in this single prediction, not its accuracy.

You can call this API directly from anywhere, as below:

import requests

new_measurement = {
    "sepal_length": 1.2,
    "sepal_width": 2.3,
    "petal_length": 1.4,
    "petal_width": 2.8
}
# put the URL of your /predict endpoint between the quotes
response = requests.post('', json=new_measurement)
print(response.content)
# b'{"prediction":"Iris-setosa","probability":0.99}'
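
The response body arrives as JSON-encoded bytes, so the caller usually decodes it before use. Below is a small sketch using only the standard library; parse_prediction is a hypothetical helper, and the sample bytes are the response shown above. With requests, calling response.json() performs the same decoding step.

```python
import json


def parse_prediction(body: bytes):
    """Decode the API's JSON response into a (label, probability) pair."""
    payload = json.loads(body)
    return payload["prediction"], float(payload["probability"])


sample = b'{"prediction":"Iris-setosa","probability":0.99}'
label, probability = parse_prediction(sample)
print(f"{label} ({probability:.0%})")  # Iris-setosa (99%)
```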

That covers creating the API with FastAPI.

FastAPI also provides nice documentation that is generated automatically. Just open the app's /redoc URL in the browser.

FastAPI ReDoc using OpenAPI for Iris Species Prediction using Logistic Regression

GitHub repository link:

That’s it for this article. I hope you enjoyed reading it. Share your thoughts on your experience with FastAPI, and feel free to ask in the comments if you have any doubts during implementation.
