# Streamlit Guide: How to Build Machine Learning Applications

Building machine learning applications keeps getting easier. With Streamlit, you can develop machine learning apps quickly and easily. You can also use the Streamlit sharing platform to deploy your applications in just a couple of clicks.

It doesn’t take long to start developing with Streamlit, since you don’t even need any front-end web development experience. With Streamlit, you script everything with Python. Streamlit is also compatible with data science libraries that you probably know.

In this article, we’ll be taking a look at how you can leverage Streamlit to develop your machine learning applications.

## Streamlit installation

    $ pip install streamlit

If you’re just trying it, you can run the hello world example:

    $ streamlit hello

There are also bigger examples available in the official repo. When working with Streamlit, you will usually import it as st.

### How to run Streamlit applications

Assuming that you have written Streamlit code in a file called app.py, you can run the application with:

    $ streamlit run app.py

streamlit run app.py will start running the application on your local machine, and provide a link that you can use to access the app on the network.

## Streamlit widgets

Streamlit makes it easy to develop web apps by providing widgets. Let’s look at some of them.

### Displaying text in Streamlit

Streamlit has several widgets for displaying text, for example:

• st.text displays fixed-width and preformatted text
• st.markdown shows markdown text
• st.latex displays mathematical expressions formatted as LaTeX
• st.write behaves differently depending on the inputs given to it. For example:
  • when you pass a data frame to it, it prints the data frame as a table
  • displays information about a function when a function is passed to it
  • displays a Keras model when one is passed to it
• st.title displays text in title formatting
• st.header displays text in header formatting
• st.code displays code

Here is an example of all of them in action:

    import streamlit as st

    st.code("st.text()", language='python')
    st.text('Neptune AI Blog')
    st.code("st.markdown()", language='python')
    st.markdown('# This is Heading 1 in Markdown')
    st.code("st.title()", language='python')
    st.title('This is a title')
    st.code("st.header()", language='python')
    st.header('Header')
    st.code("st.subheader()", language='python')
    st.subheader('Sub Header')
    st.code("st.latex()", language='python')
    st.latex(r'''
        a + ar + a r^2 + a r^3 + \cdots + a r^{n-1} =
        \sum_{k=0}^{n-1} ar^k =
        a \left(\frac{1-r^{n}}{1-r}\right)
        ''')
    st.code("st.write()", language='python')
    st.write('Can display many things')

### Displaying data in Streamlit

Streamlit can also display data. Data can be displayed as either JSON, a table, or a data frame.

    import pandas as pd

    df = pd.read_csv("data.csv")
    st.dataframe(df)
    st.table(df)
    st.json(json_data)  # json_data: a dict or JSON string defined elsewhere

### Displaying media in Streamlit

Streamlit also lets you add media to your applications. You can add audio, video or images. To do this, you use the audio, video and image widgets.

    from PIL import Image

    icon = Image.open("icon.png")
    image = Image.open("logo.png")
    st.image(image)

    video_file = open("video.mp4", "rb")
    video_bytes = video_file.read()
    st.video(video_bytes)

    audio_file = open("video.wav", "rb")
    audio_bytes = audio_file.read()
    st.audio(audio_bytes)

### Displaying code in Streamlit

Use the with st.echo() command to display the code after it. For instance, this code will show the code, then display the data frame:

    with st.echo():
        df = pd.read_csv("data.csv")
        st.dataframe(df)

### Displaying progress and status in Streamlit

When building your application, it’s always good practice to show the user progress or some status. For example, when loading a large dataset, you can show a progress bar. Some other status and progress widgets that you can use in Streamlit include:

• st.spinner() displays a temporary message when executing a block of code
• st.balloons() shows celebratory balloons
• st.error() displays an error message
• st.warning shows a warning message
• st.info shows informational messages
• st.success shows success messages
• st.exception communicates an exception in your application

    import time

    # Progress bar that fills up over ten seconds
    my_bar = st.progress(0)
    for percent_complete in range(100):
        time.sleep(0.1)
        my_bar.progress(percent_complete + 1)

    # Temporary message shown while the block runs
    with st.spinner(text='In progress'):
        time.sleep(5)
        st.success('Done')

    st.balloons()
    st.error('Error message')
    st.warning('Warning message')
    st.info('Info message')
    st.success('Success message')
    e = RuntimeError('This is an exception of type RuntimeError')
    st.exception(e)

### Displaying charts in Streamlit

Streamlit supports visualizations from several libraries, including Matplotlib, Plotly, Vega-Lite, and Altair, which are covered later in this article.

Streamlit also provides a couple of functions to perform basic visualizations:

• st.line_chart(data) for line charts
• st.area_chart(data) for an area chart
• st.bar_chart(data) to display a bar chart
• st.map(data) for plotting data on a map

### Interactive widgets in Streamlit

Streamlit also has widgets that let users interact with your application, for example:

• you can use the select box to let the user choose between several options (say, enable the user to filter data depending on a certain category)
• the multi-select widget is similar to the select box, but allows multiple selections
• the text area and text input widgets can be used to collect user input
• the date and time input can be used to collect time and date input
• you can also let users upload files using the file uploader widget (this can come in handy when you’ve built an image classifier or object detection model, and want users to upload images and see the result)

    st.button('Click here')
    st.checkbox('Check')
    st.radio('Radio', [1,2,3])
    st.selectbox('Select', [1,2,3])
    st.multiselect('Multiple selection', [21,85,53])
    st.slider('Slide', min_value=10, max_value=20)
    st.select_slider('Slide to select', options=[1,2,3,4])
    st.text_input('Enter some text')
    st.number_input('Enter a number')
    st.text_area('Text area')
    st.date_input('Date input')
    st.time_input('Time input')
    st.file_uploader('File uploader')
    st.color_picker('Color Picker')

### Caching in Streamlit

In any application, caching improves user experience by ensuring that data and certain functionalities are available to the user on demand. For example, you can have your application cache data, to reduce time spent on fetching the data. You can also cache the result of functions that return data.

    @st.cache
    def fetch_data():
        df = pd.read_csv("data.csv")
        return df

    data = fetch_data()

The first time you run a function with @st.cache, the result will be stored in a local cache. If the code, input parameters, and name of the function don’t change the next time the function is called, Streamlit will skip execution and read the cache’s result.

## Personalizing Streamlit apps

In Streamlit, you can personalize:

• the title of the page
• the page’s icon
• the layout of the page (centered or wide)
• whether the sidebar will be initially loaded

    icon = Image.open("icon.png")
    st.set_page_config(
        page_title="Data Application",
        page_icon=icon,
        layout="centered",
        initial_sidebar_state="auto",
    )

## Streamlit configurations

Streamlit runs perfectly with the default configurations. You can check your current configurations using the command below.

    $ streamlit config show

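However, sometimes you encounter a situation that forces you to add or change the default settings. There are four different ways of doing that: in a global config file, in a per-project config file, through STREAMLIT_* environment variables, and as flags on the command line. Two of them are shown here.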

### In a global file

In this case, you edit the global configuration file: ~/.streamlit/config.toml on macOS/Linux, or %userprofile%/.streamlit/config.toml on Windows. For example, you can change the default port that Streamlit runs on.

    [server]
    # The port where the server will listen for browser connections.
    # Default: 8501
    port = 8502

### As flags on the command line

You can also set the configurations using a flag when executing the streamlit run command.

    $ streamlit run app.py --server.port 8502
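Environment variables are another way to set the same options. As far as the Streamlit documentation describes it, the variable name is the option name prefixed with STREAMLIT_, upper-cased, with dots and camelCase boundaries turned into underscores. The helper below is only a sketch of that naming rule; streamlit_env_var is a hypothetical name, not part of Streamlit.

```python
import re


def streamlit_env_var(option: str) -> str:
    """Map a dotted config option to the matching environment variable name.

    Hypothetical helper that illustrates the naming convention only.
    """
    # Insert "_" at camelCase boundaries: "gatherUsageStats" -> "gather_Usage_Stats"
    snake = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", option)
    # Dots become underscores, everything is upper-cased and prefixed
    return "STREAMLIT_" + snake.replace(".", "_").upper()


if __name__ == "__main__":
    for name in ("server.port", "browser.gatherUsageStats"):
        print(name, "->", streamlit_env_var(name))
```

With this rule, setting the variable for server.port to 8502 before calling streamlit run should have the same effect as the --server.port flag above.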

## Integrating visualization libraries to Streamlit

Let’s look at how you can use Streamlit with common visualization libraries.

### Using Matplotlib and Seaborn in Streamlit

When using Matplotlib and Seaborn in Streamlit, the only thing you have to do is define a figure and pass it to st.pyplot.

    fig = plt.figure(figsize=(12, 5))
    st.pyplot(fig)

### Integrating Plotly in Streamlit

When working with Plotly, you will define a figure and pass it to the plotly_chart Streamlit function.

    fig = px.scatter(
        ...
    )
    st.plotly_chart(fig)

### Using Vega-Lite in Streamlit

If you are using Vega-Lite, you will use the vega_lite_chart function as shown below:

    st.vega_lite_chart(
        df,
        {
            "mark": {...},
            "width": width,
            "height": height,
            "encoding": {
                "x": {...},
                "y": {...},
                "size": {...},
                "color": {...},
            },
        },
    )

### Using Altair in Streamlit

When using Altair, you will define a chart using altair.Chart() and then display it using st.altair_chart():

    chart = (
        alt.Chart(data)
        .mark_bar()
        .encode(x=alt.X(...))
        .properties(...)
        .interactive()
    )
    st.altair_chart(chart)

### Visualize maps with Streamlit

You can use st.map() to plot data on maps. It’s a wrapper around st.pydeck_chart that creates scatter plot charts on top of a map:

    map_data = df[["lat", "lon"]]
    st.map(map_data)

When using this function, you have to use a personal Mapbox token. You can set it in the ~/.streamlit/config.toml:

    [mapbox]
    token = "YOUR_MAPBOX_TOKEN"

## Streamlit components

You might find that Streamlit doesn’t officially support a certain functionality that you need. This is where Streamlit components come in handy. These are a set of packages built on top of Streamlit by the community. For instance, you can use the Streamlit embed code component to embed code snippets from GitHub Gist, CodePen, GitLab snippets, etc.

    from streamlit_embedcode import github_gist

    github_gist(gist_url)

Streamlit also has an API that you can use to build your own components.

## Laying out your Streamlit application

Streamlit lets you lay out your application using containers and columns. However, this functionality is still in beta. As you will see, methods have the beta_ prefix. Once these features become stable, all you have to do is remove that beta prefix.

beta_columns lays out containers side by side. beta_container inserts an invisible container that can be used to hold multiple elements. Another function that you can use is beta_expander, which generates a multi-element container that can be expanded and collapsed. Here is an example of all these items in action.

    left_column, right_column = st.beta_columns(2)
    with left_column:
        st.altair_chart(chart)
    with right_column:
        st.altair_chart(chart)
    with st.beta_container():
        st.altair_chart(chart)
    with st.beta_expander("Some explanation"):
        st.write("This is an explanation of the two graphs..")

## Authenticating Streamlit apps

At the moment, there’s no official support for authentication in Streamlit. However, there are workarounds. If you’re familiar with Flask, you can write your authentication function and weave it into Streamlit.

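Another hack is to use this Session State gist to add per-session state to Streamlit. You can use the st.empty widget to initialize a single-element container. It’s useful because once the user enters the correct password, you can just discard it and show the functionality you wanted to.

    PASSWORD = config('PASSWORD')  # config() is assumed to read the password from your settings
    session_state = SessionState.get(password='')  # SessionState comes from the gist
    if session_state.password == PASSWORD:
        your_function()
    else:
        box = st.empty()  # single-element container holding the password input
        session_state.password = box.text_input("Enter password", type="password")
        if session_state.password == PASSWORD:
            box.empty()  # discard the password input
            st.success("Logged in successfully")
            your_function()
        elif session_state.password:
            st.error("Wrong password")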

Let’s take a look at how you can upload and process files in Streamlit. This example will focus on image data, although you can upload other files such as CSV files and so on.

### Build an image classification application with Streamlit

For this illustration, let’s use a pre-trained TensorFlow Hub model to build an application that can identify the type of disease given a plant’s leaf image. Here’s a demo of this application. Later on, you will work on bringing it to production.

For this application, you need the following packages:

• streamlit, obviously
• pillow for resizing the image that the user will upload
• matplotlib for showing the image
• tensorflow_hub for loading the pre-trained model
• numpy for expanding the dimensions of the image

    import streamlit as st
    from PIL import Image
    import matplotlib.pyplot as plt
    import tensorflow as tf
    import tensorflow_hub as hub
    import numpy as np
    from tensorflow.keras import preprocessing

This application will have two main functions:

• main function: let the user upload an image
• prediction function: run inference on uploaded image

Let’s start with the main function. You can use header to set the application heading as seen earlier.

In the main function, you use st.file_uploader to let the user upload an image. You can also specify accepted file types in the file uploader. After uploading the image, Image from Pillow is used to open it. Next, you run the prediction function and display the result. You can plot the image below the result.

    st.header("Predict if plant is healthy")

    def main():
        # Only accept common image formats
        file_uploaded = st.file_uploader("Choose an image", type=["png", "jpg", "jpeg"])
        if file_uploaded is not None:
            image = Image.open(file_uploaded)
            fig = plt.figure()
            plt.imshow(image)
            plt.axis("off")
            predictions = predict(image)
            st.write(predictions)
            st.pyplot(fig)

Now to the prediction function. There are a couple of things that you need to do in this function:

• load the pre-trained model from TensorFlow Hub
• resize the image uploaded by the user to the required size (this model requires a 300 x 300 image)
• convert the image to an array and standardize it
• include the batch size in the image dimension (it’s just one image, so the batch size is 1)
• run predictions and map them to class names

    def predict(image):
        classifier_model = "https://tfhub.dev/agripredict/disease-classification/1"
        IMAGE_SHAPE = (300, 300, 3)
        model = tf.keras.Sequential([
            hub.KerasLayer(classifier_model, input_shape=IMAGE_SHAPE)])
        test_image = image.resize((300, 300))
        test_image = preprocessing.image.img_to_array(test_image)
        test_image = test_image / 255.0
        test_image = np.expand_dims(test_image, axis=0)
        class_names = [
            'Tomato Healthy',
            'Tomato Septoria Leaf Spot',
            'Tomato Bacterial Spot',
            'Tomato Blight',
            'Cabbage Healthy',
            'Tomato Spider Mite',
            'Tomato Leaf Mold',
            'Tomato_Yellow Leaf Curl Virus',
            'Soy_Frogeye_Leaf_Spot',
            'Soy_Downy_Mildew',
            'Maize_Ravi_Corn_Rust',
            'Maize_Healthy',
            'Maize_Grey_Leaf_Spot',
            'Maize_Lethal_Necrosis',
            'Soy_Healthy',
            'Cabbage Black Rot']
        predictions = model.predict(test_image)
        scores = tf.nn.softmax(predictions[0])
        scores = scores.numpy()
        results = {
            'Tomato Healthy': 0,
            'Tomato Septoria Leaf Spot': 0,
            'Tomato Bacterial Spot': 0,
            'Tomato Blight': 0,
            'Cabbage Healthy': 0,
            'Tomato Spider Mite': 0,
            'Tomato Leaf Mold': 0,
            'Tomato_Yellow Leaf Curl Virus': 0,
            'Soy_Frogeye_Leaf_Spot': 0,
            'Soy_Downy_Mildew': 0,
            'Maize_Ravi_Corn_Rust': 0,
            'Maize_Healthy': 0,
            'Maize_Grey_Leaf_Spot': 0,
            'Maize_Lethal_Necrosis': 0,
            'Soy_Healthy': 0,
            'Cabbage Black Rot': 0
        }

        result = f"{class_names[np.argmax(scores)]} with a { (100 * np.max(scores)).round(2) } percent confidence."
        return result
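The last step of the prediction function (turning raw model outputs into probabilities and picking the matching class name) can be tried out without TensorFlow. The sketch below redoes that step in plain Python; softmax and describe_prediction are hypothetical helper names, and the input numbers are made up for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def describe_prediction(logits, class_names):
    """Return the most likely class name with its confidence as a sentence."""
    scores = softmax(logits)
    best = max(range(len(scores)), key=lambda i: scores[i])  # argmax
    return f"{class_names[best]} with a {100 * scores[best]:.2f} percent confidence."


if __name__ == "__main__":
    # Made-up raw outputs for three of the classes
    names = ["Tomato Healthy", "Tomato Blight", "Maize_Healthy"]
    print(describe_prediction([0.3, 4.1, 1.2], names))
```

Keeping this step in a small function of its own makes it easy to check the label mapping before wiring it to the real model output.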

The final step is just to run the main function.

    if __name__ == "__main__":
        main()

Check out the complete example here

## Use Hugging Face to develop a natural language processing application with Streamlit

Let’s take a look at another example: building a natural language processing app using Hugging Face. This app uses the transformers package from Hugging Face, so don’t forget to install it via pip.

Some of the functionalities that you can perform using this package include:

• text summarization
• text translation
• text classification
• named entity recognition

Let’s transform some sentences.

You can start by importing the two packages required for this example, and then create a select box that will let the user select a task.

    import streamlit as st
    from transformers import pipeline

    option = st.selectbox(
        "Select Option",
        [
            "Classify Text",
            "Question Answering",
            "Text Generation",
            "Named Entity Recognition",
            "Summarization",
            "Translation",
        ],
    )

Running any of the tasks requires the initialization of pipeline alongside the task. Below is an example of what that would look like for text classification and question answering. The others have been omitted for readability.

    if option == "Classify Text":
        text = st.text_area(label="Enter a text here")
        if text:
            classifier = pipeline("sentiment-analysis")
            answer = classifier(text)
            st.write(answer)
    elif option == "Question Answering":
        q_a = pipeline("question-answering")
        context = st.text_area(label="Enter the context")
        question = st.text_area(label="Enter the question")
        if context and question:
            answer = q_a({"question": question, "context": context})
            st.write(answer)
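For the tasks that were omitted, one way to avoid a long if/elif chain is a lookup table from the select box label to the pipeline task name. This is only a sketch: TASK_FOR_OPTION and load_pipeline are hypothetical names, the question-answering entry assumes that option exists in your select box, and translation_en_to_fr is just one of the translation tasks you could pick.

```python
# Maps each select box label to a transformers pipeline task identifier.
TASK_FOR_OPTION = {
    "Classify Text": "sentiment-analysis",
    "Question Answering": "question-answering",
    "Text Generation": "text-generation",
    "Named Entity Recognition": "ner",
    "Summarization": "summarization",
    "Translation": "translation_en_to_fr",
}


def load_pipeline(option):
    """Create the pipeline that matches the selected option."""
    # Imported here so the lookup table can be inspected without transformers installed
    from transformers import pipeline

    return pipeline(TASK_FOR_OPTION[option])
```

In the app you would call load_pipeline(option) once the user has entered some text, then pass the text to the returned pipeline and show the result with st.write.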

You can find the complete Hugging Face example here

## Deploying Streamlit applications

After you build an application, you want to host it somewhere to make it accessible to users. Let’s look at how you can host that plant disease application that we worked on earlier. There are a couple of options for doing that:

• Streamlit sharing
• Heroku

### Streamlit sharing

Streamlit sharing is the easiest and quickest way to bring your Streamlit application to production. There’s one caveat: your code has to be hosted in a public GitHub repository.

If your application is meant to be public, then this is a great choice. However, at the moment, you have to request an invitation before you can gain access to the Streamlit sharing platform.

Once you receive your invitation, deploying your Streamlit app is done with the click of a button. You just select the repo and click ‘deploy’. You need to make sure that you push your app to GitHub along with your app’s requirements in a requirements.txt file.

### Heroku

To deploy your Streamlit application to Heroku, you will need three files:

• a Procfile that tells Heroku which command starts your application
• a requirements.txt that contains all the packages needed for the application (this is also how Heroku recognizes a Python app)
• a setup.sh file that contains information about Streamlit configurations

With that in place, you will continue with your usual Heroku deployment process. Check the deploy folder of this linked repo to see the contents of the files mentioned above.
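If you don’t have the linked repo at hand, the sketch below generates a commonly used version of the three files. It is an assumption about what such files typically look like, not a copy of the repo: heroku_files and write_heroku_files are hypothetical helpers, and the setup.sh contents simply write a config.toml that runs Streamlit headless on the port Heroku provides in $PORT.

```python
from pathlib import Path


def heroku_files(app_file="app.py", requirements=("streamlit",)):
    """Return {filename: contents} for a minimal Heroku deployment (assumed layout)."""
    setup_sh = (
        "mkdir -p ~/.streamlit/\n"
        'echo "[server]\n'
        "headless = true\n"
        "port = $PORT\n"
        "enableCORS = false\n"
        '" > ~/.streamlit/config.toml\n'
    )
    return {
        # Run the setup script first, then start the app
        "Procfile": f"web: sh setup.sh && streamlit run {app_file}\n",
        "requirements.txt": "\n".join(requirements) + "\n",
        "setup.sh": setup_sh,
    }


def write_heroku_files(target_dir, **kwargs):
    """Write the generated files into target_dir."""
    for name, contents in heroku_files(**kwargs).items():
        Path(target_dir, name).write_text(contents)
```

For the plant disease app, you would pass the real package list, for example requirements=("streamlit", "tensorflow", "tensorflow_hub", "pillow", "matplotlib").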

## Streamlit and Neptune AI

You can also use Streamlit to build custom visualizations for your Neptune experiments. Let’s assume that you have a LightGBM experiment and would like to visualize the running time vs parameter boosting type. Of course, you can visualize as many items as you want; this is just an illustration of how to do that.

Okay, first, you need to get your dashboard data from Neptune. The first step is to initialize Neptune with your API key:

    project = neptune.init(project_qualified_name='mwitiderrick/LightGBM', api_token='YOUR_API_KEY')

Next, define a function that will load the data from Neptune. You can also cache the data depending on the frequency of your experiments. In this case, a ttl of 60 is used, meaning that the cache will expire after 60 seconds. The function returns a data frame which is later stored in a df variable.

    @st.cache(ttl=60)
    def get_leaderboard_data():
        return project.get_leaderboard()

    df = get_leaderboard_data()
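To make the 60-second expiry concrete, here is a minimal sketch of time-based caching in plain Python. It is not how Streamlit implements @st.cache; ttl_cache and its clock parameter are hypothetical names, and the sketch only handles functions without arguments.

```python
import time
from functools import wraps


def ttl_cache(ttl_seconds, clock=time.monotonic):
    """Cache the result of a no-argument function for ttl_seconds."""

    def decorator(func):
        cached = {}  # holds "value" and "expires_at" once func has run

        @wraps(func)
        def wrapper():
            now = clock()
            if "value" not in cached or now >= cached["expires_at"]:
                # First call, or the cached value is too old: run the function again
                cached["value"] = func()
                cached["expires_at"] = now + ttl_seconds
            return cached["value"]

        return wrapper

    return decorator
```

A function decorated with @ttl_cache(60) runs once, returns the stored result for the next 60 seconds, and runs again on the first call after that, which is the behavior you want when new experiments only show up every so often.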

The next step is to use your favorite visualization tool to plot the columns in this data frame. Let’s use Plotly here.

    def visualize_leaderboard_data():
        fig = px.pie(
            df,
            hole=0.2,
            values="running_time",
            names="parameter_boosting_type",
            title="Running time vs Parameter boosting type",
            color_discrete_sequence=px.colors.sequential.Blackbody,
        )
        st.plotly_chart(fig)

    if __name__ == "__main__":
        visualize_leaderboard_data()

Check out the complete example here

### Visualize Neptune project progress with Streamlit

With that background in place, let’s look at how you can use Streamlit to visualize the progress of a Neptune experiment. This example will build on the previous one because the leaderboard data is still required here.

For this example, you will need to install neptunecontrib.

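After that, you can extract progress information from the leaderboard data frame. To do this, you use the extract_project_progress_info function from neptunecontrib.api.utils.

    from neptunecontrib.api.utils import extract_project_progress_info

    def get_progress_data():
        # df is the leaderboard data frame loaded in the previous example
        progress_df = extract_project_progress_info(
            df,
            metric_colname='running_time',
            time_colname='created')
        return progress_df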

The function requires a metric column of your choice, and a time column in timestamp format. It then extracts information that’s relevant for analyzing project progress.

The progress data frame can then be visualized using the project_progress function from neptunecontrib.viz.projects.

The function creates an interactive project progress exploration chart. Since it returns an Altair chart, you can display it in your Streamlit app using st.altair_chart.

    from neptunecontrib.viz.projects import project_progress

    progress_df = get_progress_data()

    def visualize_progress():
        plot = project_progress(progress_df, width=400, heights=[50, 200])
        st.altair_chart(plot)

Check out the complete example here

## Final thoughts

In this article, we explored how to build applications with Streamlit, and worked through several examples. Complete examples have been omitted for readability, otherwise the article would’ve been much too long.

However, you can check out all the complete examples in this repo. Since they’re Streamlit apps, you can just clone them and deploy them on Streamlit sharing to see them. Alternatively, you can just run them on your local machine.

I can’t wait to see what you build!
