Create a web app for your data science project in under an hour
Prefer to watch this? Check out my Streamlit tutorial on YouTube.
Having a portfolio is crucial to landing the data science job of your dreams. And as long as you get your hands dirty working with data and interesting use cases, you will impress your future employer. But you can always go the extra mile when it comes to presenting... and make an interactive web app out of the project you built.
Until very recently, this required learning web development and starting complex React or Angular projects. But no more!
Streamlit is an amazing tool that makes it extremely easy to build an interactive front-end. It is specially made with data science projects in mind and thus has a lot of useful functionality to show off your projects.
Let's walk through how to set up a very simple yet very impressive front-end that you can use to show off your project.
If you're thinking, "But Mısra, I don't have a project I can build a web app on yet!", fear not. I have just the thing for you. My Hands-on Data Science course is specifically designed to help you get your first data science project out. In the course, we get hands-on immediately and learn by doing. Go check it out. I promise you will be impressed by how fast you can progress when you put your practical skills to the test and experience working on a real-life project first hand. And then, you can build and be proud of your web app.
Download and install Streamlit
Make sure you have pip installed on your computer. Pip is the package installer for Python. Once you have pip, you can go ahead and install Streamlit using: pip install streamlit
You can check that it is installed with, for example: streamlit hello (this opens Streamlit's built-in demo app in your browser)
Now, create a folder that will hold everything for your front-end. Inside that folder, create another folder named data, where you will keep the data you worked with, and a Python file called main.py.
Decide on a design
Before we go any further and start creating the front-end, I think it’s a good idea to sketch out what you want it to look like. At this point it helps to know what Streamlit is capable of in terms of layout. By default, Streamlit displays things linearly, so everything you want to show goes below the previous thing. In addition, you can also have a sidebar. On this sidebar, you can have fields that collect input from the user, like sliders, text fields, drop-down selections, etc.
But with a recent update Streamlit now also lets you divide your content into containers and columns. Containers divide the page into horizontal sections and columns let you divide it into vertical sections. Columns go inside containers but you don’t have to have columns at all. You can just have one column in the middle.
So in light of this information, here is what I sketched for my app. I am not using the sidebar because I want to be able to use the whole width of the screen and also show two columns comfortably.
Set up the main structure of the page
Now it’s time to code! Open your main.py file and import streamlit. Then, right away, create all the horizontal sections you want.
Let’s add some of the text we want. Here is how you write text:
Very simple, right? And there are a couple of options when it comes to writing headings and titles.
This is what my code looks like after I add some text. Note that the ‘with’ statement lets me keep the code that belongs to a certain container inside that container’s block.
At this point I want to see what my app looks like. So I open a terminal window and make sure to navigate to the folder where I have my main.py file. Then I run: streamlit run main.py
The browser window automatically pops up and here is what I see:
That’s a great start! And actually, most of what we’re going to do next is normal Python code. Next up is actually using the data to build the first section. I get my data and put it into the data folder I created inside the folder for this streamlit application.
By the way, when you make changes to your application, you do not need to run streamlit run main.py again. If you have not closed the browser window, you will see the rerun options on the top right of your screen.
Bring in your data
I want to create a visualization for my dataset. Here is how to do this:
#1 import the pandas library (put this at the beginning of the file)
#2 read the data file
#3 count the number of times each pick-up location ID occurs in the data
#4 display the values in a bar chart
In the next section, you might want to talk about some of the features you came up with. One useful way to show them is to use a list. Streamlit lets you use Markdown text, so you can decide how you want your text to look. Here is a simple guide to Markdown syntax: https://www.markdownguide.org/basic-syntax/
In the new features container, I add the explanations:
Here is how my app looks now:
Collect user input
In the last section, I want to get some input from the user. I want to show you three different ways to do this:
- Slider
- Drop down menu
- Text input
You can create a slider simply with the function st.slider(). You need to set the minimum value the slider can take, the maximum value it can take, the default value the app should start with, and the step, meaning how much the slider moves at a time. I use it to determine the max_depth for my machine learning algorithm. And the way to get the input from the user is simply to assign the result of the slider function to a variable.
Similarly, we can have a drop-down menu. In Streamlit it goes by the name of select box. You need to give it the options to be displayed as a list and the index of the option to be shown by default.
As you can see you can have numerical and non-numerical values in the same list so it’s possible to have options like “All”, “No limit”. We read the selection by the user with the same technique.
Lastly, you can also get text input from the user. You need to give the text_input function the prompt to show the user. You can choose to include a default value too. I also included a way to show the user their options before they give an input.
In my original design, I wanted to have 2 columns in the last section. In order to do that, inside the container section I will define two columns.
But even though I create the columns, the input prompts still take up the whole width. To put them in one column, we need to assign them to that column. To do this, I replace the Streamlit identifier "st" with the name of the column I want to assign the component to.
So my code for the last section looks like this:
And the section looks like this:
Lastly, I want to report the performance of the model trained with the choices of the user. It is just like any other time you train a machine learning model, only the hyperparameters are determined by the user.
#1 set up the model with the selected inputs
#2 set the input features again selected by the user
#3 set the output feature
#4 and #5 Fit the model to the data and get the predictions. (P.S. I didn’t separate the data into train and test sets for the sake of keeping the code simple.)
#6 Display an explanation for the metric
#7 Display the calculated metric for the performance of these particular settings
Optimize your app's run time
One other very nice feature of streamlit is caching. You might not be aware of it when working with a small application built on small amounts of data, but every time the user changes an input, the whole application runs from the beginning. If your application includes a lot of calculations, works with large amounts of data, or makes calls to a database to retrieve data, it will quickly become too slow to interact with.
As a solution, the streamlit team developed “caching”. It is very simple to set up and when you assign a certain piece of code to be cached, the application saves the result of that piece of code and does not re-run it unless the input given directly to that piece of code has changed.
We can do this for reading the data in our case because once it is read, we do not make any changes to the data and we do not need it to be re-read from its source.
All we have to do is to turn the data reading code into a function and decorate it with the cache decorator.
Then we can call this function to load our data into the application.
Streamlit is designed to put interactive front-ends out there quickly. So that’s why there is not much flexibility when it comes to how the application looks. But there is still something we can do to personalize the application a little bit. All you need to do is to add this little section to your code.
And by populating the gap between the quotes with CSS code, you can personalize your application. What I would recommend, especially if you’re going to use multiple columns, is to set the width to be wider, so the application takes up more of the screen and is not crammed into the center of the webpage.
One example is changing the background color:
Let's wrap this up here. I will prepare a separate guide on how to deploy and share your streamlit apps with the world.