How to Create a Dashboard with Streamlit and Altair

Streamlit is an open-source Python framework for creating interactive dashboards. Currently, the most popular framework seems to be Dash, but as the graph below shows, Streamlit’s popularity on GitHub has been growing rapidly since 2020.

Although Streamlit does not allow visual customization as detailed as Dash does, it makes creating a dashboard very easy if you can accept some default settings. In this article, we explain how to create a dashboard that visualizes the cars dataset from vega_datasets using Streamlit and Altair, a visualization library.

The code presented in this article is available in the GitHub Repository here.

Dataset and Dashboard

The cars dataset stores the horsepower, displacement, and other attributes of automobiles, as shown below.

The goal is to create the following dashboard with the sidebar section on the left and the content section on the right.

For example, by manipulating the sidebar, you can easily create a graph comparing the displacement and horsepower of U.S. and Japanese cars over a selected range of years.

Installing Libraries and Datasets

Install Streamlit and Altair. Also, install vega_datasets to acquire data for visualization.

conda install -c conda-forge streamlit altair vega_datasets

Import the required modules and load the cars dataset.

import altair as alt
import streamlit as st
from vega_datasets import data

# Loading the cars dataset
df = data.cars()

Prepare the lists of items to be offered in the sidebar. Create a YYYY column so that the period can be selected in years, and set variables to store the minimum and maximum year.

# List of quantitative data items
item_list = [
    col for col in df.columns if df[col].dtype in ['float64', 'int64']]

# List of Origins
origin_list = list(df['Origin'].unique())

# Create the column of YYYY
df['YYYY'] = df['Year'].apply(lambda x: x.year)
min_year = df['YYYY'].min().item()
max_year = df['YYYY'].max().item()
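
As a side note, the same preparation can be written with pandas' built-in helpers. The following is a minimal sketch, not the article's code: the tiny DataFrame and its values are invented for illustration, and select_dtypes and .dt.year are swapped in for the dtype comparison and the apply call above.

```python
import pandas as pd

# Hypothetical miniature stand-in for the cars dataset (values are made up).
df = pd.DataFrame({
    "Name": ["car a", "car b", "car c"],
    "Miles_per_Gallon": [18.0, 31.5, 24.0],
    "Cylinders": [8, 4, 6],
    "Year": pd.to_datetime(["1970-01-01", "1976-01-01", "1982-01-01"]),
    "Origin": ["USA", "Japan", "Europe"],
})

# Select every numeric column, whatever its exact width (int32, float32, ...).
# Datetime and string columns are not matched by "number".
item_list = df.select_dtypes(include="number").columns.tolist()

# Vectorized year extraction; int() turns NumPy integers into plain Python ints,
# which is what the slider bounds need.
df["YYYY"] = df["Year"].dt.year
min_year = int(df["YYYY"].min())
max_year = int(df["YYYY"].max())
```

Because item_list is built before the YYYY column is added, the helper column does not show up among the selectable items, just as in the code above.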

Creating Components of the Sidebar Section

First, set up the following to avoid unnecessary white space on both sides of the screen:

st.set_page_config(layout="wide")

The following components are required for the sidebar section.

  • The title, “Dashboard of Cars Dataset”
  • The word “Settings” as an italic heading
  • Slider to specify period
  • Drop-down list to set Origin
  • Drop-down list for selecting the first item
  • Drop-down list for selecting the second item

These are expressed in Streamlit as the following code. Streamlit makes it easy to add dashboard components by simply writing st.<method name>.

# Sidebar
st.title("Dashboard of Cars Dataset")
st.markdown('###')
st.markdown("### *Settings*")
start_year, end_year = st.slider(
    "Period",
    min_value=min_year, max_value=max_year,
    value=(min_year, max_year))

st.markdown('###')
origins = st.multiselect('Origins', origin_list,
                          default=origin_list)
st.markdown('###')
item1 = st.selectbox('Item 1', item_list, index=0)
item2 = st.selectbox('Item 2', item_list, index=3)

Each method has the following roles.

  • st.title: Display the title
  • st.markdown: Display strings according to markdown notation
  • st.markdown('###'): Create a space the size of Heading 3
  • st.slider: Display the slider. The min_value and max_value arguments specify the minimum and maximum values, and the value argument sets the initial value. Passing a tuple creates a range slider; here, we default to the whole period.
  • st.multiselect: Display a multi-select drop-down. The first argument sets the label, the second the list of choices, and default the initially selected values.
  • st.selectbox: Display a single-select drop-down. The first two arguments are the same as for st.multiselect; index sets the position of the initially selected choice.

The items selected in the slider and drop-down list are used in the following step, so assign them to the variables start_year, end_year, origins, item1, and item2, respectively.
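
Since index is a position, index=3 silently depends on the column order. A minimal sketch of looking the position up by name instead is shown here; default_index is a hypothetical helper, and the list assumes the numeric columns of the cars dataset appear in this order.

```python
# Assumed order of the numeric columns in the cars dataset.
item_list = ["Miles_per_Gallon", "Cylinders", "Displacement",
             "Horsepower", "Weight_in_lbs", "Acceleration"]


def default_index(options, preferred, fallback=0):
    """Return the position of `preferred` in `options`, or `fallback` if absent."""
    return options.index(preferred) if preferred in options else fallback


item1_index = default_index(item_list, "Miles_per_Gallon")
item2_index = default_index(item_list, "Horsepower")
```

The results could then be passed on, for example as st.selectbox('Item 2', item_list, index=item2_index), so the default stays on the intended column even if the list changes.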

Next, extract the data corresponding to the Period and Origin specified in the slider and drop-down list above.

df_rng = df[(df['YYYY'] >= start_year) & (df['YYYY'] <= end_year)]
source = df_rng[df_rng['Origin'].isin(origins)]
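
The two filtering steps can also be combined into one function, which makes the logic easy to check outside the dashboard. This is a standalone sketch and not part of app.py: filter_cars is a hypothetical name and the sample rows are invented.

```python
import pandas as pd


def filter_cars(df, start_year, end_year, origins):
    """Keep rows whose YYYY is within [start_year, end_year] and whose Origin is selected."""
    in_period = df["YYYY"].between(start_year, end_year)  # inclusive on both ends
    in_origin = df["Origin"].isin(origins)
    return df[in_period & in_origin]


# Hypothetical sample rows, invented for illustration.
sample = pd.DataFrame({
    "YYYY": [1970, 1977, 1980, 1982],
    "Origin": ["USA", "Japan", "Europe", "USA"],
})
result = filter_cars(sample, 1977, 1982, ["USA", "Japan"])
```

Note that when no origin is selected, isin receives an empty list and the result is an empty DataFrame, so the charts will be blank.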

Save the code above in app.py and run streamlit run app.py in the terminal; the following screen will appear in your browser.

Layout of the Sidebar Section

To display the components created in the previous section in a sidebar, just add “sidebar” between st and the method name.

# Sidebar
st.sidebar.title("Dashboard of Cars Dataset")
st.sidebar.markdown('###')
st.sidebar.markdown("### *Settings*")
start_year, end_year = st.sidebar.slider(
    "Period",
    min_value=min_year, max_value=max_year,
    value=(min_year, max_year))

st.sidebar.markdown('###')
origins = st.sidebar.multiselect('Origins', origin_list,
                                 default=origin_list)
st.sidebar.markdown('###')
item1 = st.sidebar.selectbox('Item 1', item_list, index=0)
item2 = st.sidebar.selectbox('Item 2', item_list, index=3)

When the code is changed, a Rerun button will appear in the upper right corner of the browser, and pressing that button will refresh the screen.

The sidebar is easily created.
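
Because st and st.sidebar expose the same widget methods, the widget code can be written once against whichever container is passed in. The sketch below illustrates that idea; add_settings is a hypothetical helper and not part of Streamlit.

```python
def add_settings(container, min_year, max_year, origin_list, item_list):
    """Draw the settings widgets on `container` and return the selected values."""
    container.markdown("### *Settings*")
    start_year, end_year = container.slider(
        "Period",
        min_value=min_year, max_value=max_year,
        value=(min_year, max_year))
    origins = container.multiselect("Origins", origin_list,
                                    default=origin_list)
    item1 = container.selectbox("Item 1", item_list, index=0)
    item2 = container.selectbox("Item 2", item_list, index=3)
    return start_year, end_year, origins, item1, item2
```

In app.py, calling add_settings(st.sidebar, min_year, max_year, origin_list, item_list) would place the widgets in the sidebar, while passing st instead would place them in the main area.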

Creating Graphs of the Content Section

Create graphs in Altair to be displayed in the content section.

# Content
base = alt.Chart(source).properties(height=300)

bar = base.mark_bar().encode(
    x=alt.X('count(Origin):Q', title='Number of Records'),
    y=alt.Y('Origin:N', title='Origin'),
    color=alt.Color('Origin:N', legend=None)
)

point = base.mark_circle(size=50).encode(
    x=alt.X(item1 + ':Q', title=item1),
    y=alt.Y(item2 + ':Q', title=item2),
    color=alt.Color('Origin:N', title='',
                    legend=alt.Legend(orient='bottom-left'))
)

line1 = base.mark_line(size=5).encode(
    x=alt.X('yearmonth(Year):T', title='Date'),
    y=alt.Y('mean(' + item1 + '):Q', title=item1),
    color=alt.Color('Origin:N', title='',
                    legend=alt.Legend(orient='bottom-left'))
)

line2 = base.mark_line(size=5).encode(
    x=alt.X('yearmonth(Year):T', title='Date'),
    y=alt.Y('mean(' + item2 + '):Q', title=item2),
    color=alt.Color('Origin:N', title='',
                    legend=alt.Legend(orient='bottom-left'))
)

Now that the graphs to be displayed in the content section are assigned to bar, point, line1, and line2, respectively, we will set up the layout in the next section. The basic usage of Altair is described in the article below.

Layout of the Content Section

Streamlit allows us to split the screen into side-by-side columns. In this case, the number of columns is set to 2 since the four graphs are arranged in a 2 x 2 layout. The left and right columns are assigned to the variables left_column and right_column, respectively.

# Layout (Content)
left_column, right_column = st.columns(2)

The components to be displayed in the content section are as follows.

  • Bar chart title and body (upper left)
  • Scatterplot title and body (upper right)
  • Line chart title and body (bottom left)
  • Line chart title and body (bottom right)

These are expressed in Streamlit as the following code.

left_column.markdown(
    '**Number of Records (' + str(start_year) + '-' + str(end_year) + ')**')
left_column.altair_chart(bar, use_container_width=True)

right_column.markdown(
    '**Scatter Plot of _' + item1 + '_ and _' + item2 + '_**')
right_column.altair_chart(point, use_container_width=True)

left_column.markdown('**_' + item1 + '_ (Monthly Average)**')
left_column.altair_chart(line1, use_container_width=True)

right_column.markdown('**_' + item2 + '_ (Monthly Average)**')
right_column.altair_chart(line2, use_container_width=True)
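
The titles above are built by string concatenation. As a sketch of an equivalent form, f-strings produce the same Markdown and avoid the explicit str() calls; chart_titles is a hypothetical helper.

```python
def chart_titles(start_year, end_year, item1, item2):
    """Build the four Markdown titles shown above the charts."""
    return {
        "bar": f"**Number of Records ({start_year}-{end_year})**",
        "point": f"**Scatter Plot of _{item1}_ and _{item2}_**",
        "line1": f"**_{item1}_ (Monthly Average)**",
        "line2": f"**_{item2}_ (Monthly Average)**",
    }


titles = chart_titles(1977, 1982, "Miles_per_Gallon", "Displacement")
```

Each value could then be passed to markdown on the matching column, for example left_column.markdown(titles["bar"]).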

The entire code is as follows.

import altair as alt
import streamlit as st
from vega_datasets import data

# Loading the cars dataset
df = data.cars()

# List of quantitative data items
item_list = [
    col for col in df.columns if df[col].dtype in ['float64', 'int64']]

# List of Origins
origin_list = list(df['Origin'].unique())

# Create the column of YYYY 
df['YYYY'] = df['Year'].apply(lambda x: x.year)
min_year = df['YYYY'].min().item()
max_year = df['YYYY'].max().item()

st.set_page_config(layout="wide")

# Sidebar
st.sidebar.title("Dashboard of Cars Dataset")
st.sidebar.markdown('###')
st.sidebar.markdown("### *Settings*")
start_year, end_year = st.sidebar.slider(
    "Period",
    min_value=min_year, max_value=max_year,
    value=(min_year, max_year))

st.sidebar.markdown('###')
origins = st.sidebar.multiselect('Origins', origin_list,
                                 default=origin_list)
st.sidebar.markdown('###')
item1 = st.sidebar.selectbox('Item 1', item_list, index=0)
item2 = st.sidebar.selectbox('Item 2', item_list, index=3)

df_rng = df[(df['YYYY'] >= start_year) & (df['YYYY'] <= end_year)]
source = df_rng[df_rng['Origin'].isin(origins)]

# Content
base = alt.Chart(source).properties(height=300)

bar = base.mark_bar().encode(
    x=alt.X('count(Origin):Q', title='Number of Records'),
    y=alt.Y('Origin:N', title='Origin'),
    color=alt.Color('Origin:N', legend=None)
)

point = base.mark_circle(size=50).encode(
    x=alt.X(item1 + ':Q', title=item1),
    y=alt.Y(item2 + ':Q', title=item2),
    color=alt.Color('Origin:N', title='',
                    legend=alt.Legend(orient='bottom-left'))
)

line1 = base.mark_line(size=5).encode(
    x=alt.X('yearmonth(Year):T', title='Date'),
    y=alt.Y('mean(' + item1 + '):Q', title=item1),
    color=alt.Color('Origin:N', title='',
                    legend=alt.Legend(orient='bottom-left'))
)

line2 = base.mark_line(size=5).encode(
    x=alt.X('yearmonth(Year):T', title='Date'),
    y=alt.Y('mean(' + item2 + '):Q', title=item2),
    color=alt.Color('Origin:N', title='',
                    legend=alt.Legend(orient='bottom-left'))
)

# Layout (Content)
left_column, right_column = st.columns(2)

left_column.markdown(
    '**Number of Records (' + str(start_year) + '-' + str(end_year) + ')**')
left_column.altair_chart(bar, use_container_width=True)

right_column.markdown(
    '**Scatter Plot of _' + item1 + '_ and _' + item2 + '_**')
right_column.altair_chart(point, use_container_width=True)

left_column.markdown('**_' + item1 + '_ (Monthly Average)**')
left_column.altair_chart(line1, use_container_width=True)

right_column.markdown('**_' + item2 + '_ (Monthly Average)**')
right_column.altair_chart(line2, use_container_width=True)

Press the Rerun button in the browser, set the period from 1977 to 1982 in the slider, and select USA and Japan as Origins, Miles_per_Gallon as Item 1, and Displacement as Item 2 in the drop-downs; the dashboard will change as follows.

The scatter plot and line graphs show that in 1977-1982, the Japanese cars (blue) have better fuel economy and smaller engine displacement than the U.S. cars (orange).

Conclusion

We now have an interactive dashboard. Although Streamlit does not allow for detailed visual customization, we found that when combined with Altair, it is very easy to create a good-looking dashboard with a minimum of coding.

If you want to create dashboards with Streamlit and Plotly, check out this article.
