Streamlit Guide: How to Build Machine Learning Applications

Building machine learning applications keeps getting easier. With Streamlit, you can develop machine learning apps quickly and easily. You can also use the Streamlit sharing platform to deploy your applications in just a couple of clicks. 

It doesn’t take long to start developing with Streamlit, since you don’t even need any front-end web development experience. With Streamlit, you script everything with Python. Streamlit is also compatible with data science libraries that you probably know. 

In this article, we’ll be taking a look at how you can leverage Streamlit to develop your machine learning applications. 

Streamlit installation

With your Python environment ready, Streamlit installation is simple:

$ pip install streamlit

If you’re just trying it, you can run the hello world example:

$ streamlit hello

There are also bigger examples available in the official repo. When working with Streamlit, you will usually import it as `st`. 

How to run Streamlit applications 

Assuming that you have written Streamlit code in a file called `app.py`, you can run the application with:

$ streamlit run app.py

This command starts the application on your local machine and prints a local URL and a network URL that you can use to open the app.

Streamlit widgets

Streamlit makes it easy to develop web apps by providing widgets. Let’s look at some of them.

Displaying text in Streamlit

Streamlit has several widgets for displaying text, for example:

  • `st.text` displays fixed-width and preformatted text
  • `st.markdown` shows Markdown text
  • `st.latex` displays mathematical expressions formatted as LaTeX
  • `st.write` behaves differently depending on the inputs given to it. For example:
    • it prints a data frame as a table when you pass it a data frame
    • it displays information about a function when you pass it a function
    • it displays a Keras model when you pass it one
  • `st.title` displays text in title formatting
  • `st.header` displays text in header formatting
  • `st.subheader` displays text in sub-header formatting
  • `st.code` displays code

Here is an example of all of them in action:

import streamlit as st
st.code("st.text()", language='python')
st.text('Neptune AI Blog')
st.code("st.markdown()", language='python')
st.markdown('# This is Heading 1 in Markdown')
st.code("st.title()", language='python')
st.title('This is a title')
st.code("st.header()", language='python')
st.header('Header')
st.code("st.subheader()", language='python')
st.subheader('Sub Header')
st.code("st.latex()", language='python')
st.latex(r'''
    a + ar + a r^2 + a r^3 + \cdots + a r^{n-1} =
    \sum_{k=0}^{n-1} ar^k =
    a \left(\frac{1-r^{n}}{1-r}\right)
    ''')
st.code("st.write()", language='python')
st.write('Can display many things')

Displaying data in Streamlit

Streamlit can also display data. You can show it as JSON, as a static table, or as an interactive data frame.

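import pandas as pd

df = pd.read_csv("data.csv")
st.dataframe(df)  # interactive data frame
st.table(df)  # static table
json_data = {"model": "classifier", "classes": 16}  # illustrative placeholder; any JSON-serializable object works
st.json(json_data)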

Displaying media in Streamlit

Streamlit also lets you add media to your applications. You can add audio, video or images. To do this, you use the `audio`, `video` and `image` widgets. 

from PIL import Image

image = Image.open("logo.png")
st.image(image)

# Read the media files as bytes and close them when done
with open("video.mp4", "rb") as video_file:
    st.video(video_file.read())

with open("audio.wav", "rb") as audio_file:
    st.audio(audio_file.read())

Displaying code in Streamlit

Wrap code in a `with st.echo():` block to display that code in the app and then run it. For instance, the following shows the code and then displays the data frame:

with st.echo():
    df = pd.read_csv("data.csv")
    st.dataframe(df)

Displaying progress and status in Streamlit

When building your application, it’s good practice to show the user progress or status information. For example, when loading a large dataset, you can show a progress bar. Some other status and progress widgets that you can use in Streamlit include:

  • `st.spinner` displays a temporary message while a block of code executes
  • `st.balloons` shows celebratory balloons
  • `st.error` displays an error message
  • `st.warning` shows a warning message
  • `st.info` shows informational messages
  • `st.success` shows success messages
  • `st.exception` communicates an exception in your application

import time

# Progress bar that fills up over roughly ten seconds
my_bar = st.progress(0)
for percent_complete in range(100):
    time.sleep(0.1)
    my_bar.progress(percent_complete + 1)

# Spinner shown while the block runs
with st.spinner(text='In progress'):
    time.sleep(5)
    st.success('Done')

st.balloons()
st.error('Error message')
st.warning('Warning message')
st.info('Info message')
st.success('Success message')
e = RuntimeError('This is an exception of type RuntimeError')
st.exception(e)

Displaying charts in Streamlit

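Streamlit supports visualizations from several libraries, including Matplotlib, Seaborn, Plotly, Vega-Lite, and Altair, all of which are covered later in this article.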

Streamlit also provides a couple of functions to perform basic visualizations:

  • `st.line_chart(data)` for line charts
  • `st.area_chart(data)` for an area chart
  • `st.bar_chart(data)` to display a bar chart
  • `st.map(data)` for plotting data on a map

Interactive widgets in Streamlit

Streamlit also has widgets that let users interact with your application, for example:

  • you can use the select box to let the user choose between several options (say, enable the user to filter data depending on a certain category)
  • the multi-select widget is similar to the select box, but allows multiple selections
  • the text area and text input widgets can be used to collect user input
  • the date and time input can be used to collect time and date input
  • you can also let users upload files using the file uploader widget (this can come in handy when you’ve built an image classifier or object detection model, and want users to upload images and see the result)

st.button('Click here')
st.checkbox('Check')
st.radio('Radio', [1,2,3])
st.selectbox('Select', [1,2,3])
st.multiselect('Multiple selection', [21,85,53])
st.slider('Slide', min_value=10, max_value=20)
st.select_slider('Slide to select', options=[1,2,3,4])
st.text_input('Enter some text')
st.number_input('Enter a number')
st.text_area('Text area')
st.date_input('Date input')
st.time_input('Time input')
st.file_uploader('File uploader')
st.color_picker('Color Picker')
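
Each of these widgets returns the value the user currently has selected or entered, so you can feed it straight into ordinary Python. As a minimal sketch of the category filter mentioned above, here is the filtering logic on its own. The helper name `filter_by_category`, the `species` field, and the sample rows are all hypothetical, and the select box is replaced by a stand-in value so the snippet runs without a browser:

```python
def filter_by_category(rows, field, selected):
    """Return only the rows whose `field` equals the selected option."""
    return [row for row in rows if row.get(field) == selected]


# Hypothetical sample data; in an app this would come from your dataset.
rows = [
    {"species": "tomato", "height_cm": 42},
    {"species": "maize", "height_cm": 180},
    {"species": "tomato", "height_cm": 55},
]

# The options a select box would offer: unique categories in sorted order.
options = sorted({row["species"] for row in rows})

# Stand-in for the value a select box would return.
selected = options[1]
filtered = filter_by_category(rows, "species", selected)
```

In a Streamlit script, `selected` would instead come from `st.selectbox('Species', options)`, and `filtered` could be passed to `st.write`.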

Caching in Streamlit

In any application, caching improves the user experience by making data and frequently used results available without recomputing them. For example, your application can cache data to reduce the time spent fetching it. You can also cache the results of functions that return data.

@st.cache
def fetch_data():
    df = pd.read_csv("data.csv")
    return df

data = fetch_data()

The first time you run a function with `@st.cache`, the result will be stored in a local cache. If the code, input parameters, and name of the function don’t change the next time the function is called, Streamlit will skip execution and read the cache’s result. 
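
To make that behavior concrete, here is a rough pure-Python sketch of the idea. It is not Streamlit's implementation (which, as described above, also takes the function's code into account); the decorator name `simple_cache` and the `fetch_rows` function are made up for illustration, and the sketch assumes all arguments are hashable:

```python
import functools


def simple_cache(func):
    """Memoize `func`, keyed on its name and arguments."""
    store = {}

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        key = (func.__qualname__, args, tuple(sorted(kwargs.items())))
        if key not in store:
            # Cache miss: run the function and remember the result.
            store[key] = func(*args, **kwargs)
        return store[key]

    return wrapper


calls = []


@simple_cache
def fetch_rows(path):
    calls.append(path)  # records each real execution
    return [path, "row-1", "row-2"]


first = fetch_rows("data.csv")
second = fetch_rows("data.csv")  # served from the cache, body not re-run
```

Calling `fetch_rows` with a different path produces a new key, so the function body runs again for that input.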

Personalizing Streamlit apps

In Streamlit, you can personalize:

  • the title of the page
  • the page’s icon
  • the layout of the page (centered or wide)
  • whether the sidebar is initially expanded or collapsed

from PIL import Image

icon = Image.open("icon.png")
st.set_page_config(
    page_title="Data Application",
    page_icon=icon,
    layout="centered",
    initial_sidebar_state="auto",
)

Streamlit configurations

Streamlit runs well with the default configuration. You can check your current configuration using the command below.

$ streamlit config show

However, sometimes you need to add to or change the default settings. There are four different ways of doing that.

In a global file

In this case, you edit `~/.streamlit/config.toml` on macOS/Linux or `%userprofile%/.streamlit/config.toml` on Windows. For example, you can change the default port that Streamlit runs on.

[server]
# The port where the server will listen for browser connections.
# Default: 8501
port = 8502

In a per-project config file

In this case, the configuration goes in the `$CWD/.streamlit/config.toml` file, where CWD is the folder you run Streamlit from.

Through environment variables

Here `STREAMLIT_*`  environment variables are passed via the terminal:

$ export STREAMLIT_SERVER_PORT=8502
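
The variable name mirrors the dotted option name used in the config file and on the command line. A small hypothetical helper (`env_var_for` is not part of Streamlit) shows the mapping for simple, all-lowercase options such as `server.port`. Options written in camelCase are split into separate words in the variable name, which this sketch does not handle:

```python
def env_var_for(option):
    """Map a dotted option such as "server.port" to its environment variable name."""
    # Assumes an all-lowercase option name; camelCase options need extra handling.
    return "STREAMLIT_" + option.replace(".", "_").upper()


port_var = env_var_for("server.port")
```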

As flags on the command line

You can also set the configurations using a flag when executing the Streamlit `run` command. 

$ streamlit run app.py --server.port 8502
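
When the same option is set in more than one place, the Streamlit documentation, as far as I understand it, treats the four methods as listed in increasing order of precedence: a command-line flag beats an environment variable, which beats the per-project file, which beats the global file. The sketch below only models that lookup order with `collections.ChainMap`; the dictionaries and port values are made up:

```python
from collections import ChainMap

# Hypothetical values for the same option coming from each source.
global_file = {"server.port": 8501}
project_file = {"server.port": 8502}
environment = {"server.port": 8503}
command_line = {"server.port": 8504}

# ChainMap searches its maps left to right, so the strongest source goes first.
effective = ChainMap(command_line, environment, project_file, global_file)
port = effective["server.port"]

# Without a command-line flag, the environment variable wins.
port_without_flag = ChainMap(environment, project_file, global_file)["server.port"]
```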

Integrating visualization libraries with Streamlit

Let’s look at how you can use Streamlit with common visualization libraries. 

Using Matplotlib and Seaborn in Streamlit

When using Matplotlib and Seaborn in Streamlit, the only thing you have to do is define a figure and pass it to `st.pyplot`.

import matplotlib.pyplot as plt
fig = plt.figure(figsize=(12, 5))
# draw on the figure with Matplotlib or Seaborn here
st.pyplot(fig)

Integrating Plotly in Streamlit

When working with Plotly, you will define a figure and pass it to the `plotly_chart` Streamlit function. 

fig = px.scatter(
    ….
)
st.plotly_chart(fig)

Using Vega-Lite in Streamlit

If you are using Vega-Lite, you will use the `vega_lite_chart` function as shown below:

st.vega_lite_chart(
    df,
    {
        "mark": {...},
        "width": width,
        "height": height,
        "encoding": {
            "x": {...},
            "y": {...},
            "size": {...},
            "color": {...},
        },
    },
)

Using Altair in Streamlit

When using Altair, you will define a chart using `altair.Chart()` and then display it using `st.altair_chart()`:

chart = (
    alt.Chart(data)
    .mark_bar()
    .encode(x=alt.X(...))
    .properties(...)
    .interactive()
)
st.altair_chart(chart)

Visualize maps with Streamlit

You can use `st.map()` to plot data on maps. It’s a wrapper around `st.pydeck_chart` that creates scatter plot charts on top of a map:

map_data = df[["lat", "lon"]]
st.map(map_data)

When using this function, you have to use a personal Mapbox token. You can set it in the `~/.streamlit/config.toml`:

[mapbox]
token = "YOUR_MAPBOX_TOKEN"

Streamlit components

You might find that Streamlit doesn’t officially support a certain functionality that you need. This is where Streamlit components come in handy. These are a set of packages built on top of Streamlit by the community. For instance, you can use the Streamlit embed code component to embed code snippets from GitHub Gist, CodePen, GitLab, etc.

from streamlit_embedcode import github_gist
github_gist(gist_url)

Streamlit also has an API that you can use to build your own components.

Laying out your Streamlit application

Streamlit lets you lay out your application using containers and columns. However, this functionality is still in beta. As you will see, the methods have the `beta_` prefix. Once these features become stable, all you have to do is remove that prefix. `beta_columns` lays out containers side by side.

`beta_container` inserts an invisible container that can be used to hold multiple elements. Another function that you can use is the `beta_expander` that generates a multi-element container that can be expanded and collapsed. Here is an example of all these items in action.

left_column, right_column = st.beta_columns(2)
with left_column:
    st.altair_chart(chart)
with right_column:
    st.altair_chart(chart)
with st.beta_container():
    st.altair_chart(chart)
with st.beta_expander("Some explanation"):
    st.write("This is an explanation of the two graphs..")

Authenticating Streamlit apps

At the moment, there’s no official support for authentication in Streamlit. However, there are workarounds. If you’re familiar with Flask, you can write your authentication function and weave it into Streamlit. 

Another hack is to use this Session State gist to add per-session state to Streamlit. You can use the `st.empty` widget to initialize a single-element container. It’s useful because once the user enters the correct password, you can discard the container and show the functionality you wanted to.

PASSWORD = config('PASSWORD')
session_state = SessionState.get(username='', password='')

if session_state.password == PASSWORD:
    # Already authenticated in this session
    your_function()
else:
    password_placeholder = st.empty()
    password = password_placeholder.text_input("Enter Password:", type="password")
    session_state.password = password

    if password and password == PASSWORD:
        # Remove the password box once the user is logged in
        password_placeholder.empty()
        st.success("Logged in successfully")
        your_function()
    elif password:
        st.error("Wrong password")
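
The example compares passwords with `==`. A common hardening step, which is my addition rather than part of the Session State gist, is to compare in constant time so that the comparison leaks less timing information. A minimal sketch using only the standard library, with the hypothetical helper name `password_matches`:

```python
import hashlib
import hmac


def password_matches(entered, expected):
    """Compare two passwords in constant time."""
    # Hash both values first so compare_digest always sees equal-length bytes.
    entered_digest = hashlib.sha256(entered.encode("utf-8")).digest()
    expected_digest = hashlib.sha256(expected.encode("utf-8")).digest()
    return hmac.compare_digest(entered_digest, expected_digest)


ok = password_matches("s3cret", "s3cret")
rejected = password_matches("guess", "s3cret")
```

In the snippet above, `password == PASSWORD` could then become `password_matches(password, PASSWORD)`.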

Uploading and processing files in Streamlit

Let’s take a look at how you can upload and process files in Streamlit. This example will focus on image data, although you can upload other files such as CSV files and so on. 

Build an image classification application with Streamlit

For this illustration, let’s use a pre-trained TensorFlow Hub model to build an application that can identify the type of disease given a plant’s leaf image. Here’s a demo of this application. Later on, you will work on bringing it to production. 

For this application, you need the following packages:

  • `streamlit`, obviously
  • `pillow` for resizing the image that the user will upload 
  • `matplotlib` for showing the image
  • `tensorflow` and `tensorflow_hub` for loading and running the pre-trained model
  • `numpy` for expanding the dimensions of the image

import streamlit as st
from PIL import Image
import matplotlib.pyplot as plt
import tensorflow as tf
import tensorflow_hub as hub
import numpy as np
from tensorflow.keras import preprocessing

This application will have two main functions:

  • main function: let the user upload an image
  • prediction function: run inference on uploaded image

Let’s start with the main function. You can use `header` to set the application heading as seen earlier. 

In the main function, you use `st.file_uploader` to let the user upload an image. You can also specify accepted file types in the file uploader. After uploading the image, `Image` from Pillow is used to open it. Next, you run the prediction function and display the result. You can plot the image below the result. 

st.header("Predict if plant is healthy")

def main():
    file_uploaded = st.file_uploader("Choose File", type=["png","jpg","jpeg"])
   
    if file_uploaded is not None:    
        image = Image.open(file_uploaded)
        fig = plt.figure()
        plt.imshow(image)
        plt.axis("off")
        predictions = predict(image)
        st.write(predictions)
        st.pyplot(fig)

Now to the prediction function. There are a couple of things that you need to do in this function:

  • load the pre-trained model from TensorFlow Hub
  • resize the image uploaded by the user to the required size (this model requires a 300 x 300 image)
  • convert the image to an array and standardize it
  • include the batch size in the image dimension (it’s just one image, so the batch size is 1)
  • run predictions and map them to class names

def predict(image):
    classifier_model = "https://tfhub.dev/agripredict/disease-classification/1"
    IMAGE_SHAPE = (300, 300, 3)
    model = tf.keras.Sequential([
        hub.KerasLayer(classifier_model, input_shape=IMAGE_SHAPE)
    ])
    # Force three channels so PNGs with an alpha channel or grayscale images still fit the model
    test_image = image.convert("RGB").resize((300, 300))
    test_image = preprocessing.image.img_to_array(test_image)
    test_image = test_image / 255.0  # scale pixel values to [0, 1]
    test_image = np.expand_dims(test_image, axis=0)  # add the batch dimension
    class_names = [
          'Tomato Healthy',
          'Tomato Septoria Leaf Spot',
          'Tomato Bacterial Spot', 
          'Tomato Blight', 
          'Cabbage Healthy',
          'Tomato Spider Mite', 
          'Tomato Leaf Mold',
          'Tomato_Yellow Leaf Curl Virus',
          'Soy_Frogeye_Leaf_Spot',
          'Soy_Downy_Mildew', 
          'Maize_Ravi_Corn_Rust',
          'Maize_Healthy', 
          'Maize_Grey_Leaf_Spot',
          'Maize_Lethal_Necrosis', 
          'Soy_Healthy',
          'Cabbage Black Rot']
    predictions = model.predict(test_image)
    scores = tf.nn.softmax(predictions[0]).numpy()

    result = f"{class_names[np.argmax(scores)]} with a { (100 * np.max(scores)).round(2) } percent confidence." 
    return result
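
The last step, turning raw model outputs into a label and a confidence, is easy to check in isolation. Here is a standard-library sketch of the same softmax-then-argmax logic. The function names, the three-class list, and the raw scores are made up for illustration; the real app uses `tf.nn.softmax` and `np.argmax` as shown above:

```python
import math


def softmax(values):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def top_prediction(raw_scores, class_names):
    """Return the most likely class name and its confidence in percent."""
    probabilities = softmax(raw_scores)
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return class_names[best], round(100 * probabilities[best], 2)


# Made-up raw outputs for three of the classes.
label, confidence = top_prediction(
    [1.0, 3.0, 2.0],
    ["Tomato Healthy", "Tomato Blight", "Maize Healthy"],
)
```

With these inputs the highest raw score belongs to the second class, so `label` is `'Tomato Blight'`.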

The final step is just to run the main function. 

if __name__ == "__main__":
    main()

Check out the complete example here

Use Hugging Face to develop a natural language processing application with Streamlit

Let’s take a look at another example: building a natural language processing app using Hugging Face. This app uses the `transformers` package from Hugging Face, so don’t forget to install it via `pip`. 

Some of the functionalities that you can perform using this package include:

  • text summarization 
  • text translation 
  • text classification
  • question answering 
  • named entity recognition 

Let’s transform some sentences.

You can start by importing the two packages required for this example, and then create a select box that will let the user select a task.

import streamlit as st
from transformers import pipeline

option = st.selectbox(
    "Select Option",
    [
        "Classify Text",
        "Question Answering",
        "Text Generation",
        "Named Entity Recognition",
        "Summarization",
        "Translation",
    ],
)

Running any of the tasks requires initializing `pipeline` with the name of the task. Below is an example of what that looks like for text classification and question answering. The others have been omitted for readability.

if option == "Classify Text":
    text = st.text_area(label="Enter a text here")
    if text:
        classifier = pipeline("sentiment-analysis")
        answer = classifier(text)
        st.write(answer)
elif option == "Question Answering":
    q_a = pipeline("question-answering")
    context = st.text_area(label="Enter the context")
    question = st.text_area(label="Enter the question")
    if context and question:
        answer = q_a({"question": question, "context": context})
        st.write(answer)
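
One way to handle the remaining options without a long `if`/`elif` chain is a lookup table from the select box label to the pipeline task name. This is a sketch of that idea rather than the code from the complete example; `TASKS` and `task_for` are hypothetical names, and the translation entry assumes English to French:

```python
# Select box label -> transformers pipeline task name.
# The translation entry assumes English to French; other language pairs use other task names.
TASKS = {
    "Classify Text": "sentiment-analysis",
    "Question Answering": "question-answering",
    "Text Generation": "text-generation",
    "Named Entity Recognition": "ner",
    "Summarization": "summarization",
    "Translation": "translation_en_to_fr",
}


def task_for(option):
    """Look up the pipeline task for a select box option."""
    try:
        return TASKS[option]
    except KeyError:
        raise ValueError(f"Unsupported option: {option!r}") from None


task = task_for("Summarization")
```

Calling `pipeline(task_for(option))` would then create the matching pipeline. Question answering still needs its own branch, because it takes a question and a context rather than a single text.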

You can find the complete Hugging Face example here

Deploying Streamlit applications

After you build an application, you want to host it somewhere to make it accessible to users. Let’s look at how you can host the plant disease application that we worked on earlier. There are a couple of options for doing that:

  • Streamlit sharing
  • Heroku

Streamlit sharing

Streamlit sharing is the easiest and quickest way to bring your Streamlit application to production. There’s one caveat: your code has to be hosted in a public GitHub repository.

If your application is meant to be public, then this is a great choice. However, at the moment, you have to request an invitation before you can gain access to the Streamlit sharing platform.

Once you receive your invitation, deploying your Streamlit app is done with the click of a button. You just select the repo and click ‘deploy’. Make sure that you push your app to GitHub along with its dependencies listed in a `requirements.txt` file.

Heroku

To deploy your Streamlit application to Heroku, you will need three files:

  • a `Procfile` that tells Heroku which command starts your application
  • a `requirements.txt` that lists all the packages needed for the application
  • a `setup.sh` file that contains information about Streamlit configurations

With that in place, you will continue with your usual Heroku deployment process. Check the deploy folder of this linked repo to see the contents of the files mentioned above. 

Streamlit and Neptune AI

You can also use Streamlit to build custom visualizations for your Neptune experiments. Let’s assume that you have a LightGBM experiment and would like to visualize the running time vs parameter boosting type. Of course, you can visualize as many items as you want; this is just an illustration of how to do that.

Okay, first, you need to get your dashboard data from Neptune. The first step is to initialize Neptune with your API key:

import neptune
project = neptune.init(project_qualified_name='mwitiderrick/LightGBM', api_token='YOUR_API_KEY')

Next, define a function that loads the data from Neptune. You can cache the data for a period that suits how often your experiments run. Here, `ttl=60` means that the cached result expires after 60 seconds. The function returns a data frame, which is then stored in the `df` variable.

@st.cache(ttl=60)
def get_leaderboard_data():
    leaderboard = project.get_leaderboard()
    return leaderboard
    
df = get_leaderboard_data()
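
To see what a time-to-live does, here is a small standalone sketch of expiry logic. It is an illustration of the concept, not how Streamlit implements `ttl`. The `TTLCache` class, the fake clock, and `load_leaderboard` are all hypothetical, and the clock is injected so that the example runs instantly instead of waiting a minute:

```python
import time


class TTLCache:
    """Keep one value and recompute it once it is older than `ttl` seconds."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock  # injectable so expiry can be shown without waiting
        self.value = None
        self.stored_at = None

    def get(self, compute):
        now = self.clock()
        expired = self.stored_at is None or now - self.stored_at >= self.ttl
        if expired:
            self.value = compute()
            self.stored_at = now
        return self.value


# Fake clock and loader, purely for illustration.
fake_now = [0.0]
loads = []


def load_leaderboard():
    loads.append(fake_now[0])  # records when a real load happened
    return f"leaderboard loaded at t={fake_now[0]:.0f}s"


cache = TTLCache(ttl=60, clock=lambda: fake_now[0])
first = cache.get(load_leaderboard)   # t=0: computed
fake_now[0] = 30.0
second = cache.get(load_leaderboard)  # t=30: still cached
fake_now[0] = 61.0
third = cache.get(load_leaderboard)   # t=61: expired, recomputed
```

Only the first and third calls actually run the loader; the second is answered from the cached value.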

The next step is to use your favorite visualization tool to plot the columns in this data frame. Let’s use Plotly here. 

def visualize_leaderboard_data():
    fig = px.pie(
        df,
        hole=0.2,
        values="running_time",
        names="parameter_boosting_type",
        title="Running time vs Parameter boosting type",
        color_discrete_sequence=px.colors.sequential.Blackbody,
    )
    st.plotly_chart(fig)

if __name__ == "__main__":
    visualize_leaderboard_data()

Check out the complete example here

Visualize Neptune project progress with Streamlit

With that background in place, let’s look at how you can use Streamlit to visualize the progress of a Neptune experiment. This example will build on the previous one because the leaderboard data is still required here. 

For this example, you will need to install `neptunecontrib`. 

After that, you can extract progress information from the leaderboard data frame. To do this, you use the `extract_project_progress_info` function from `neptunecontrib.api.utils`.

def get_progress_data():
    leaderboard = project.get_leaderboard()
    progress_df = extract_project_progress_info(leaderboard,
                                            metric_colname='running_time',
                                            time_colname='created')
    return progress_df

The function requires a metric column of your choice, and a time column in timestamp format. It then extracts information that’s relevant for analyzing project progress. 

Here’s what that progress data frame looks like:

[Figure: project progress data frame]

The progress data frame can then be visualized using the `project_progress` function from `neptunecontrib.viz.projects`. 

The function creates an interactive project progress exploration chart. Since it returns an Altair chart, you can display it in your Streamlit app using `st.altair_chart`. 

progress_df = get_progress_data()

def visualize_progress():
    plot = project_progress(progress_df, width=400, heights=[50, 200])
    st.altair_chart(plot)

Check out the complete example here

Final thoughts

In this article, we explored how to build applications with Streamlit and worked through several examples. The complete code for the examples has been omitted for readability; otherwise, the article would have been much too long.

However, you can check out all the complete examples in this repo. Since they’re Streamlit apps, you can just clone them and deploy them on Streamlit sharing to see them. Alternatively, you can just run them on your local machine. 

I can’t wait to see what you build!
