Before you get started
The list below covers what you will need to have set up before moving on to the next section.
- Google Cloud Project
- Google Cloud CLI installed and configured
- If you are coding locally, Docker installed (Docker Desktop is the easiest path to get started)
In my opinion, the easiest way to code against GCP is with the Cloud Shell Terminal, which you can find at shell.cloud.google.com.
The steps below assume a *nix environment or Cloud Shell. If you are coding locally on Windows, you will need to modify the commands as necessary. In my case, the application is a simple POC that uses Secret Manager to retrieve a token, which is then used to connect to a MotherDuck cloud data warehouse.
The app is a simple POC to help students learn Streamlit, but the focus of this post is to walk you through the steps to deploy a serverless Streamlit app on GCP.
Setup
First, you will need to ensure Streamlit is installed:
pip install streamlit
Now, let’s set up the project structure:
mkdir streamlit
cd streamlit
Finally, make sure you have enabled the required APIs on GCP:
gcloud services enable run.googleapis.com
gcloud services enable cloudbuild.googleapis.com
gcloud services enable artifactregistry.googleapis.com
gcloud services enable secretmanager.googleapis.com
gcloud services enable iam.googleapis.com
Files
Below we are going to define the files for our application.
Requirements
Modify this for your project and application’s needs.
streamlit/requirements.txt
streamlit
matplotlib
networkx
google-cloud-secret-manager
numpy
pandas
duckdb
Streamlit App
Below is my application. Modify for your needs.
streamlit/app.py
import streamlit as st
import duckdb
import pandas as pd
from google.cloud import secretmanager
import numpy as np
from itertools import combinations
import networkx as nx
from collections import Counter
import matplotlib.pyplot as plt
project_id = '_your project id_'
secret_id = 'mother_duck'  # <---------- this is the name of the Google Cloud Secret
version_id = 'latest'

db = 'awsblogs'
schema = "stage"
db_schema = f"{db}.{schema}"

sm = secretmanager.SecretManagerServiceClient()

name = f"projects/{project_id}/secrets/{secret_id}/versions/{version_id}"

response = sm.access_secret_version(request={"name": name})
md_token = response.payload.data.decode("UTF-8")

md = duckdb.connect(f'md:?motherduck_token={md_token}')

############################################ Streamlit App

st.set_page_config(page_title="My Fancy Streamlit App", layout="wide")

sql = """
select
    min(published) as min,
    max(published) as max
from
    awsblogs.stage.posts
"""
date_range = md.sql(sql).df()
start_date = date_range['min'].to_list()[0]
end_date = date_range['max'].to_list()[0]

st.title("Streamlit - Example")
st.subheader("To demonstrate Streamlit concepts - It's just a python script!")

st.sidebar.header("Filters")
st.sidebar.markdown("One option is to use sidebars for inputs")
author_filter = st.sidebar.text_input("Search by Author")
date_filter = st.sidebar.date_input("Post Start Date", (start_date, end_date))

st.sidebar.button("A button to control inputs")
st.sidebar.file_uploader("Users can upload files that your app analyzes!")
st.sidebar.markdown("These controls are not wired up to control data, just highlighting you have a lot of control!")

############ A simple line plot
num_days = 365  # Number of days for the time series
start_date = '2023-01-01'  # Start date
date_range = pd.date_range(start=start_date, periods=num_days, freq='D')

np.random.seed(42)  # For reproducibility
values = np.random.randint(50, 150, size=num_days)  # Example: random sales values between 50 and 150

time_series_data = pd.DataFrame({
    'date': date_range,
    'value': values
})

st.line_chart(time_series_data, x="date", y="value")

############ Graph of co-association of tags, a touch forward looking
st.markdown("---")

pt_sql = """
select post_id, term from awsblogs.stage.tags
"""
pt_df = md.sql(pt_sql).df()

st.markdown("### You _can_ show data tables")
st.dataframe(pt_df)

st.markdown("### A static network graph")
st.markdown("We can think of relationships as a graph")

cotag_pairs = []

for _, group in pt_df.groupby('post_id'):
    # Get the unique list of tags for each post
    terms = group['term'].unique()
    # Generate all possible pairs of co-occurring tags for this post
    pairs = combinations(terms, 2)
    cotag_pairs.extend(pairs)

cotag_counter = Counter(cotag_pairs)

G = nx.Graph()

for (term1, term2), weight in cotag_counter.items():
    G.add_edge(term1, term2, weight=weight)

degree_centrality = nx.degree_centrality(G)
node_sizes = [100 * degree_centrality[node] for node in G.nodes()]
edge_weights = [G[u][v]['weight'] for u, v in G.edges()]

pos = nx.spring_layout(G, k=0.3, seed=42)

fig = plt.figure(figsize=(12, 12))
nx.draw_networkx_nodes(G, pos, node_size=node_sizes, node_color='skyblue', alpha=0.7)
nx.draw_networkx_edges(G, pos, width=edge_weights, alpha=0.5, edge_color='gray')

plt.title("Tag Graph")
st.pyplot(fig)

############ There are some chat support features, more coming
st.markdown("---")
st.markdown("### There are even some chat features - more coming on the roadmap.")

prompt = st.chat_input("Say something")
if prompt:
    st.write(f"User has sent the following prompt: {prompt}")
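The tag graph in app.py weights each edge by how many posts carry both tags. To make that counting step easier to follow, here is a minimal sketch on a hand-made toy dataset. The post ids and tags are invented for illustration, and plain tuples stand in for the DataFrame rows. Sorting each post's tags before pairing is my own addition, so that a pair is counted the same way regardless of the order the rows arrive in.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Toy (post_id, term) rows standing in for awsblogs.stage.tags; values are made up.
rows = [
    (1, "lambda"), (1, "s3"), (1, "iam"),
    (2, "s3"), (2, "lambda"),
    (3, "iam"),
]


def count_cotags(rows):
    """Count how often each unordered pair of tags appears on the same post."""
    tags_by_post = defaultdict(set)
    for post_id, term in rows:
        tags_by_post[post_id].add(term)

    pair_counts = Counter()
    for terms in tags_by_post.values():
        # Sort so each pair has one canonical ordering regardless of row order.
        pair_counts.update(combinations(sorted(terms), 2))
    return pair_counts


cotag_counts = count_cotags(rows)
for (term1, term2), weight in sorted(cotag_counts.items()):
    print(term1, term2, weight)
```

Each counted pair becomes one edge in the graph, with the count as its weight. A post with a single tag, like post 3 here, contributes no edges.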
Dockerfile
Our solution uses a Docker container.
streamlit/Dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY . /app
RUN pip install --no-cache-dir -r requirements.txt
EXPOSE 8080
ENV STREAMLIT_SERVER_HEADLESS=true
ENV STREAMLIT_SERVER_ADDRESS=0.0.0.0
CMD ["streamlit", "run", "app.py", "--server.port", "8080", "--server.address", "0.0.0.0"]
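Cloud Run tells the container which port to listen on through the PORT environment variable, which defaults to 8080, and that is the port the CMD above passes to Streamlit. As a hedged alternative to hard-coding it, a small launcher could read PORT at startup. The sketch below only builds the command. The function name and the idea of a launcher script are my own illustration, not part of the original app.

```python
import os


def build_streamlit_command(env=None, script="app.py"):
    """Return the streamlit argv, taking the port from PORT (default 8080)."""
    env = os.environ if env is None else env
    port = env.get("PORT", "8080")
    return [
        "streamlit", "run", script,
        "--server.port", str(port),
        "--server.address", "0.0.0.0",
        "--server.headless", "true",
    ]


if __name__ == "__main__":
    # Print rather than execute, so the sketch is safe to run anywhere.
    print(" ".join(build_streamlit_command()))
```

To actually launch the app this way, you could hand the returned list to subprocess.run and use the launcher as the container's entry point.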
Deployment
The bash script below will build the image and deploy it to Google Cloud Run as a service.
streamlit/deploy.sh
gcloud config set project {your_project_here}
echo "======================================================"
echo "build (no cache)"
echo "======================================================"
docker build --no-cache -t gcr.io/{your_project_here}/streamlit-poc .
echo "======================================================"
echo "push"
echo "======================================================"
docker push gcr.io/{your_project_here}/streamlit-poc
echo "======================================================"
echo "deploy run"
echo "======================================================"
gcloud run deploy streamlit-poc \
--image gcr.io/{your_project_here}/streamlit-poc \
--platform managed \
--region us-central1 \
--allow-unauthenticated \
--service-account {service_account_name}@{your_project_here}.iam.gserviceaccount.com \
--memory 1Gi
A few notes on the deploy script
- You will need to replace {your_project_here} with your actual project id
- I am using the us-central1 region
- The script specifies a service account that Cloud Run will use to access other services on GCP. When you define the service account, make sure you grant it access to Secret Manager (for example, the Secret Manager Secret Accessor role)
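Because the token lookup depends on the service account's Secret Manager access, it can help to isolate that lookup in one function so a permissions problem surfaces in one place. The following is a sketch, assuming the same resource-name format used in app.py. The names get_md_token and secret_version_name, the MOTHERDUCK_TOKEN environment variable fallback for local runs, and the FakeClient used to exercise the code are all hypothetical and mine.

```python
import os
from types import SimpleNamespace


def secret_version_name(project_id, secret_id, version_id="latest"):
    """Build the Secret Manager resource name in the format app.py uses."""
    return f"projects/{project_id}/secrets/{secret_id}/versions/{version_id}"


def get_md_token(client, project_id, secret_id, env=None):
    """Prefer a local env var (assumed name), else read the secret via client."""
    env = os.environ if env is None else env
    local = env.get("MOTHERDUCK_TOKEN")
    if local:
        return local
    name = secret_version_name(project_id, secret_id)
    response = client.access_secret_version(request={"name": name})
    return response.payload.data.decode("UTF-8")


class FakeClient:
    """Stand-in for secretmanager.SecretManagerServiceClient, for local testing only."""

    def access_secret_version(self, request):
        self.requested = request["name"]
        return SimpleNamespace(payload=SimpleNamespace(data=b"fake-token"))


fake = FakeClient()
token = get_md_token(fake, "my-project", "mother_duck", env={})
print(fake.requested)
```

In the real app you would pass secretmanager.SecretManagerServiceClient() as the client. If the service account lacks access to the secret, the access_secret_version call is where the error will be raised.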
From your shell, deploy the application. The script builds from the current directory, so run it from inside the streamlit folder.
bash deploy.sh
When the process completes, you should see output similar to the following:
======================================================
deploy run
======================================================
Deploying container to Cloud Run service [streamlit-poc] in project [{your-project-here}] region [us-central1]
OK Deploying... Done.
OK Creating Revision...
OK Routing traffic...
OK Setting IAM Policy...
Done.
Service [streamlit-poc] revision [streamlit-poc-00003-588] has been deployed and is serving 100 percent of traffic.
Service URL: https://streamlit-poc-111222333444.us-central1.run.app
That’s it! Navigate to the URL and you have deployed a publicly available, serverless Streamlit app on GCP via Cloud Run.
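If you want to confirm the deployment from a script rather than a browser, a quick status check is enough. This is a sketch using only the standard library. check_service is a hypothetical helper of mine, and the opener parameter exists only so the function can be exercised without a network connection.

```python
import urllib.request


def check_service(url, opener=urllib.request.urlopen, timeout=10):
    """Return True if the service answers the request with HTTP 200."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # urllib's URLError and HTTPError are OSError subclasses,
        # so DNS failures, timeouts, and 4xx/5xx responses all land here.
        return False
```

Call it with the Service URL printed at the end of your own deploy output, for example check_service("https://streamlit-poc-111222333444.us-central1.run.app") with the sample URL above swapped for yours.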