Migrate a Three Tier Web Application to Kubernetes

Saturday, June 17, 2017

Multi-tier architectures are a common way to design web applications:

  1. A frontend - presentation - tier which provides the user interface
  2. An application - logic - tier where the processing happens
  3. A backend - data - tier where different storage technologies run

The different tiers could be deployed to a single virtual machine. Configuration management tools such as Ansible, Salt, Chef, or Puppet could be used to automate the deployment process. However, if the web application needs to start handling more traffic, it is only a matter of time before the resources of that single virtual machine are consumed. The single virtual machine could be scaled up by adding more CPU, RAM, and storage, but there is usually an upper limit to doing that.

Because of the multi-tier architecture, each tier could be moved to its own virtual machine with a load balancer in front of it, allowing each tier to be scaled independently. However, the additional virtual machines create a lot of overhead. Additionally, if you want to continue using configuration management tools to manage this now more complicated architecture, the deployment scripts will need to be heavily modified.

A potentially easier and more forward-thinking solution would be to containerize each tier and deploy, manage, and scale those containers using container orchestration software. In this post, you will be using Docker to containerize each tier, and Kubernetes to orchestrate all of those containers.

You will be deploying a three tier web application to Google Container Engine (GKE) due to its ability to quickly create a Kubernetes cluster. Google Cloud Container Registry and Google Cloud Container Builder will also be used.

If you prefer videos to blog posts, you can watch a recent episode of the GCP Online Meetup where I present a live demo of what this blog post will cover. However, I have improved upon some of the concepts presented during the live demo in this blog post, specifically concerning rolling updates.

The Three Tier Web Application

The three tier web application you will be deploying to Kubernetes is called SnapPass. Developed at Pinterest, it is a Python Flask application that uses redis as the storage backend to provide a simple, secure, and ephemeral way to share passwords. I am a big fan of SnapPass, and you can read more about why here.

Prerequisites

This post assumes the following:

  • You are familiar with Google Cloud Platform (GCP) and have already created a GCP Project
  • You have installed the Google Cloud SDK
  • You have authenticated the gcloud command against your Google Account

Create a GCP Project

If you have not yet created a GCP Project, follow these steps:

  1. Open a web browser, and create or log in to a Google Account
  2. Navigate to https://console.cloud.google.com
  3. If this is your first GCP Project, you will be prompted to create a GCP Project. Each Google Account gets $300 in credit to use within 12 months towards GCP. You are required to enter a credit card to create a GCP Project, but it will not be charged until the $300 credit is consumed or 12 months expire.
  4. If this is a new GCP Project, you will need to enable the Compute Engine API by navigating to the Compute Engine section of the GCP Console and waiting for initialization to complete

Install the Google Cloud SDK

If you have not yet installed the Google Cloud SDK, follow the instructions here.

Authenticate gcloud

Once you have created a GCP Project and installed the Google Cloud SDK, the last step is to authenticate the gcloud command to your Google Account. Open your terminal application and run the following command:

gcloud auth login

A web page will open in your web browser. Select your Google Account and give it permission to access GCP. Once completed, you will be authenticated and ready to move forward.

Set the GCP Project ID and Zone

The first thing to do is set a few variables, specifically the GCP Project ID and the GCP Zone to deploy in.

Open your terminal application and get a list of all your current GCP Projects:

gcloud projects list

The first column will contain your GCP Project IDs. Copy the GCP Project ID you want to deploy to, and set its value in an environment variable:

export GCP_PROJECT_ID="<PROJECT_ID>"

Then, configure the values using gcloud config:

gcloud config set project $GCP_PROJECT_ID

gcloud config set compute/zone us-central1-a
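The later commands all depend on GCP_PROJECT_ID being exported, so a small guard can catch a missing value before anything is built or deployed. The following is only a minimal sketch; the function name require_project_id is hypothetical and not part of gcloud:

```shell
# Hypothetical helper: stop early when the project ID from the previous
# step has not been exported in this shell session.
require_project_id() {
  if [ -z "${GCP_PROJECT_ID:-}" ]; then
    echo "GCP_PROJECT_ID is not set; export it before continuing" >&2
    return 1
  fi
  echo "Using project: ${GCP_PROJECT_ID}"
}
```

Running require_project_id and then gcloud config list should show the same project ID, along with the compute zone you just set.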

Build the Docker Images

With gcloud configured, the next step is to containerize the web application. The web application has three tiers:

  1. nginx
  2. Python
  3. redis

Each tier will be containerized using Docker. Despite there being three tiers, you will only be creating two Docker images - nginx and Python - because the redis Docker image will be pulled from redis’ official Docker Hub repository.

The two Docker images that need to be created could be built on your local workstation, but instead you will be using Google Cloud Container Builder.

Build the snappass-nginx Docker Image

git clone the snappass-nginx-blog-post repository:

git clone https://github.com/jameswthorne/snappass-nginx-blog-post.git

Change to the newly cloned directory:

cd snappass-nginx-blog-post

Next, you will need to generate a couple of cryptographic files:

  1. A Diffie-Hellman parameters file
  2. A self-signed SSL certificate

I host my own version of SnapPass, and generating unique Diffie-Hellman parameters is part of my deployment process to get a higher SSL score on Qualys SSL Labs. Generate them with the following command (this may take a few minutes to complete):

openssl dhparam -out nginx_configuration/dhparam.pem 2048

Create a directory to store the self-signed SSL certificates in:

mkdir ssl_certs

Create the Private Key and the Certificate Signing Request:

openssl req -new -newkey rsa:2048 -nodes -keyout ssl_certs/example.com.key -out ssl_certs/example.com.csr

You will be prompted to fill out the following fields. You can input bogus information - or none at all - in those fields since this is not meant to be a valid SSL certificate. Additionally, you can leave the challenge password empty:

Country Name (2 letter code) [AU]:
State or Province Name (full name) [Some-State]:
Locality Name (eg, city) []:
Organization Name (eg, company) [Internet Widgits Pty Ltd]:
Organizational Unit Name (eg, section) []:
Common Name (e.g. server FQDN or YOUR name) []:
Email Address []:

Please enter the following 'extra' attributes
to be sent with your certificate request
A challenge password []:
An optional company name []:

Create the SSL certificate - valid for 30 days - from the Certificate Signing Request and sign it with the Private Key:

openssl x509 -req -days 30 -in ssl_certs/example.com.csr -signkey ssl_certs/example.com.key -out ssl_certs/example.com.crt
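If you would like to confirm the validity window from the command line, openssl can print it directly from the certificate file. This is a small sketch; the helper names show_cert_dates and cert_valid_for_days are made up for illustration:

```shell
# Print the subject and the notBefore/notAfter dates of a certificate file.
show_cert_dates() {
  openssl x509 -in "$1" -noout -subject -dates
}

# Succeed only if the certificate in $1 is still valid $2 days from now.
# -checkend takes seconds, so the day count is converted first.
cert_valid_for_days() {
  openssl x509 -in "$1" -noout -checkend "$(( $2 * 86400 ))" >/dev/null
}
```

For example, show_cert_dates ssl_certs/example.com.crt should report a notAfter date roughly 30 days out, and cert_valid_for_days ssl_certs/example.com.crt 31 should fail for this certificate.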

Commit your latest changes to version control. Do not skip this step, or any other git steps, as the git SHA will be used later:

git add --all

git commit -m "Added dhparam and SSL certificates"

After you have committed your changes to git, store the git SHA of the latest commit in an environment variable:

export SNAPPASS_NGINX_GIT_SHA=$(git rev-parse HEAD)

Now, build the Docker image using Google Cloud Container Builder. The git SHA you just stored in an environment variable will be used to tag the Docker image so you know exactly what git commit the Docker image was built from:

gcloud container builds submit --tag us.gcr.io/$GCP_PROJECT_ID/snappass-nginx:$SNAPPASS_NGINX_GIT_SHA .
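Since the tag is only meaningful when the working tree matches the commit, it may help to check for uncommitted changes before building. The sketch below shows one way to do that and to compose the image reference used in this post; image_ref and clean_tree_sha are hypothetical helper names, not gcloud or git features:

```shell
# Compose the image reference used throughout the post:
#   us.gcr.io/<project>/<image>:<git sha>
image_ref() {
  # $1 = image name, $2 = git SHA; GCP_PROJECT_ID must already be exported.
  echo "us.gcr.io/${GCP_PROJECT_ID:?GCP_PROJECT_ID is not set}/$1:$2"
}

# Print the current commit SHA, but refuse if there are uncommitted changes,
# because the build would then contain files the SHA does not describe.
clean_tree_sha() {
  if [ -n "$(git status --porcelain)" ]; then
    echo "uncommitted changes present; commit before building" >&2
    return 1
  fi
  git rev-parse HEAD
}
```

With these in place, image_ref snappass-nginx "$(clean_tree_sha)" produces the same value that is passed to --tag above, provided everything has been committed.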

After about 30 seconds of build time, you should have one Docker image in your GCP Project’s Container Registry:

gcloud container images list --repository us.gcr.io/$GCP_PROJECT_ID

Build the snappass Docker Image

Change out of the snappass-nginx-blog-post directory by going back one directory:

cd ..

git clone the snappass repository:

git clone https://github.com/pinterest/snappass.git

Change to the newly cloned directory:

cd snappass

Store the git SHA of the latest commit in an environment variable:

export SNAPPASS_GIT_SHA=$(git rev-parse HEAD)

Now, build the Docker image using Google Cloud Container Builder. Once again, the git SHA you just stored in an environment variable will be used to tag the Docker image so you know exactly what git commit the Docker image was built from:

gcloud container builds submit --tag us.gcr.io/$GCP_PROJECT_ID/snappass:$SNAPPASS_GIT_SHA .

After about one minute of build time, you should now have two Docker images in your GCP Project’s Container Registry:

gcloud container images list --repository us.gcr.io/$GCP_PROJECT_ID

Build the redis Docker Image

There is nothing for you to build here since you will be pulling the redis Docker image from redis’ official Docker Hub repository later in the post.

Create the GKE Cluster

With the necessary Docker images created, it is time for you to create the GKE cluster. Run the following command to create a GKE cluster named cluster-1 with three worker nodes of machine type g1-small (1 shared CPU and 1.7 GB RAM):

gcloud container clusters create cluster-1 --num-nodes 3 --machine-type g1-small

After a few minutes, the GKE cluster will be up and running.

Configure Kubernetes Credentials

Once the GKE cluster is up and running, you need to obtain the necessary credentials to communicate with it via the kubectl command. The gcloud command makes this very easy:

gcloud container clusters get-credentials cluster-1

After the gcloud command completes, you can begin communicating with the GKE cluster using the kubectl command. Try running kubectl get nodes. The response should display three nodes which represent the three virtual machines running the Kubernetes software.

Deploy the Web Application to GKE

You are now ready to begin deploying the three tier web application to Kubernetes. Since each tier builds upon the previous tier, you are going to deploy everything backwards. This isn’t always necessary, but it helps to show how everything layers on top of each other.

The following steps will have you setting up three Kubernetes Deployments and three Kubernetes Services. A Kubernetes Deployment allows you to specify what Docker image you want to deploy as a Docker container and how many of those Docker containers to run. It also handles rolling updates, which will be covered later in the post. The two types of Kubernetes Services you will be creating are ClusterIP and LoadBalancer; these are essentially internal and external load balancers, respectively.

Deploy redis

Deploy the redis Docker image as a Docker container using a Kubernetes Deployment:

kubectl run snappass-redis --image=redis --port=6379 --replicas=1 --labels="name=snappass-redis,tier=backend,app=snappass"

Because redis is the storage backend, it is the stateful tier in the web application, and you will only be creating one replica. If you created more than one replica, the replicas would be independent of one another, which would cause problems for end users trying to access their passwords. In order for redis to be properly scaled out, it would need to be clustered, which is out of scope for this post.

The Python Flask application needs to communicate with redis, so create a Kubernetes Service of type ClusterIP. This is essentially an internal load balancer that is reachable only from within the GKE cluster, using its short name snappass-redis:

kubectl expose deployment snappass-redis --type=ClusterIP

Deploy snappass

Deploy the snappass Docker image as a Docker container using a Kubernetes Deployment:

kubectl run snappass --image=us.gcr.io/$GCP_PROJECT_ID/snappass:$SNAPPASS_GIT_SHA --replicas=1 --port=5000 --env="REDIS_HOST=snappass-redis" --labels="name=snappass,tier=backend,app=snappass"

Due to the large size of the snappass Docker image, it might take up to one minute for the Deployment to complete.

You might have noticed the additional command line switch, --env="REDIS_HOST=snappass-redis". This tells the Python Flask application how to communicate with the redis Kubernetes Service. Because snappass-redis is a short name registered in Kubernetes’ internal DNS, that short name will resolve to the internal IP address the snappass-redis Kubernetes Service is listening on.
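To make the short name concrete: Kubernetes DNS also registers each Service under a fully qualified name built from the Service name and its namespace. The sketch below composes that name, assuming the Service lives in the default namespace and the cluster uses the default cluster.local domain (both are assumptions; neither is stated above). The function name service_fqdn and the CLUSTER_DOMAIN variable are hypothetical:

```shell
# Hypothetical helper: build the fully qualified DNS name of a Service.
service_fqdn() {
  # $1 = Service name, $2 = namespace (assumed to be "default" if omitted)
  echo "$1.${2:-default}.svc.${CLUSTER_DOMAIN:-cluster.local}"
}
```

Here, service_fqdn snappass-redis prints snappass-redis.default.svc.cluster.local. Pods in the same namespace can use just snappass-redis, which is why REDIS_HOST can be set to the short name.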

nginx needs to communicate with the Python Flask application, so create a Kubernetes Service of type ClusterIP. As before, this is essentially an internal load balancer that is reachable only from within the GKE cluster, using its short name snappass:

kubectl expose deployment snappass --type=ClusterIP

Deploy nginx

Deploy the nginx Docker image as a Docker container using a Kubernetes Deployment:

kubectl run snappass-nginx --image=us.gcr.io/$GCP_PROJECT_ID/snappass-nginx:$SNAPPASS_NGINX_GIT_SHA --replicas=1 --port 443 --labels="name=snappass-nginx,tier=frontend,app=snappass"

Because the nginx Docker container is the frontend of the web application, it needs to be publicly facing. Instead of creating a Kubernetes Service of type ClusterIP, you will create a Kubernetes Service of type LoadBalancer which will create a Layer 4 Google Cloud Network Load Balancer with a public IP address (it may take up to a couple minutes for the network load balancer to deploy):

kubectl expose deployment snappass-nginx --type=LoadBalancer

Once the network load balancer is deployed, run kubectl get services snappass-nginx to see the public IP address assigned to it in the EXTERNAL-IP column.
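Because the EXTERNAL-IP column shows <pending> until the load balancer is provisioned, it can be convenient to poll for the address instead of re-running the command by hand. A minimal sketch follows; wait_for_external_ip and POLL_SECONDS are hypothetical names, while the jsonpath query reads the Service's status.loadBalancer.ingress field:

```shell
# Hypothetical helper: poll a LoadBalancer Service until it has a public IP.
wait_for_external_ip() {
  # $1 = Service name, $2 = maximum number of attempts (default 30)
  wfei_attempts=${2:-30}
  wfei_i=0
  while [ "$wfei_i" -lt "$wfei_attempts" ]; do
    wfei_ip=$(kubectl get service "$1" \
      -o jsonpath='{.status.loadBalancer.ingress[0].ip}' 2>/dev/null)
    if [ -n "$wfei_ip" ]; then
      echo "$wfei_ip"
      return 0
    fi
    wfei_i=$((wfei_i + 1))
    sleep "${POLL_SECONDS:-5}"   # pause between attempts
  done
  echo "no external IP assigned to $1 yet" >&2
  return 1
}
```

For example, EXTERNAL_IP=$(wait_for_external_ip snappass-nginx) stores the address for later use once it is available.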

If you wanted to assign a domain name to this web application, you would use that public IP address for the DNS A record.

Test the Web Application

With the web application deployed to the GKE cluster and the network load balancer in place, you can access the web application by going to the public IP address obtained in the previous step.

Open a web browser and navigate to https://EXTERNAL-IP. Click past the SSL warnings to get to the web application. You can verify the web application is functional by entering a password and TTL, clicking Generate URL, pasting the generated URL into your web browser, and seeing the password you entered. If you refresh the page, it should now return a 404 because the password has been deleted from redis.

Query Kubernetes Resources

After you have confirmed the web application is functional, go back to your terminal and look through the various Kubernetes components you just created.

Pods

You should currently have three Pods:

kubectl get pods

If you wanted to filter specific Pods based on the Labels you set when the Deployment was created, you can do so with the following commands:

kubectl get pods -l 'tier=frontend'

kubectl get pods -l 'tier=backend'

kubectl get pods -l 'app=snappass'

Deployments

You should have three Deployments:

kubectl get deployments

You can see the replica numbers in the nginx and snappass Deployment rows.

Services

You should have four Services. The Service titled kubernetes can be ignored because it is part of the Kubernetes system services:

kubectl get services

You can also filter Services using Labels:

kubectl get services -l 'tier=frontend'

Scale Out the Web Application

If the web application needs to handle more traffic, Kubernetes allows you to easily scale out the different tiers because they were originally created as separate Deployments.

Scale Out the nginx Deployment

The nginx Deployment was originally created with one replica, which created one Pod on one of the Kubernetes Nodes. If you scale the Deployment to three replicas, two more Pods will be created; the scheduler will typically place them on the remaining Kubernetes Nodes, although that placement is not guaranteed:

kubectl scale --replicas=3 deployment/snappass-nginx

Scale Out the snappass Deployment

The snappass Deployment was originally created with one replica, which created one Pod on one of the Kubernetes Nodes. If you scale the Deployment to three replicas, two more Pods will be created; again, the scheduler will typically place them on the remaining Kubernetes Nodes:

kubectl scale --replicas=3 deployment/snappass

Due to the large size of the snappass Docker image, it may take up to two minutes for the Pods to be created on the remaining Kubernetes Nodes. This is in contrast to the nginx Docker image which replicates almost instantly because it’s only about 10 megabytes in size.

Query Kubernetes Resources

With the nginx and snappass Deployments scaled out, you can see the additional Pods created and the Nodes they are running on with the following command:

kubectl get pods -o wide

Then, you can see the updated replica numbers in the nginx and snappass Deployments:

kubectl get deployments
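If you want a quick summary of how the Pods ended up distributed, the Node names can be counted per Deployment. This is a rough sketch; pods_per_node is a hypothetical helper, and it relies on the name Label that was set when each Deployment was created:

```shell
# Hypothetical helper: count Pods per Node for a given label selector.
pods_per_node() {
  # $1 = label selector, e.g. "name=snappass-nginx"
  kubectl get pods -l "$1" \
    -o jsonpath='{range .items[*]}{.spec.nodeName}{"\n"}{end}' |
    sort | uniq -c | awk '{print $2 ": " $1}'
}
```

Running pods_per_node name=snappass-nginx prints one line per Node with the number of matching Pods scheduled on it.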

Perform a Rolling Update

Rolling updates have been made much easier because of Kubernetes Deployments. A Kubernetes Deployment will handle creating a new ReplicaSet and scaling it up to the desired replica count while it scales down the original ReplicaSet to zero. The original ReplicaSet will remain - but at a zero replica count - in case you need to roll back an update.

For this particular rolling update scenario, the SSL certificate currently configured in the nginx Docker image needs to be updated and deployed across the GKE cluster.

Start by changing back into the snappass-nginx-blog-post directory. If you have been following the post step-by-step, that should mean changing back one directory and changing into the snappass-nginx-blog-post directory:

cd ..

cd snappass-nginx-blog-post

Rename the original SSL certificate so it will not be overwritten:

mv ssl_certs/example.com.crt ssl_certs/example.com.crt.old

Create a new SSL certificate with a 90 day expiration time:

openssl x509 -req -days 90 -in ssl_certs/example.com.csr -signkey ssl_certs/example.com.key -out ssl_certs/example.com.crt

Commit your latest changes to version control:

git add --all

git commit -m "Updated SSL certificate"

Update the SNAPPASS_NGINX_GIT_SHA environment variable with the latest git SHA:

export SNAPPASS_NGINX_GIT_SHA=$(git rev-parse HEAD)

Rebuild the nginx Docker image which will create a new Docker image with a new Docker tag corresponding to the latest git SHA:

gcloud container builds submit --tag us.gcr.io/$GCP_PROJECT_ID/snappass-nginx:$SNAPPASS_NGINX_GIT_SHA .

Then, tell Kubernetes to pull the new nginx Docker image based on the updated Docker tag:

kubectl set image deployment/snappass-nginx snappass-nginx=us.gcr.io/$GCP_PROJECT_ID/snappass-nginx:$SNAPPASS_NGINX_GIT_SHA

Because the nginx Docker image is only about 10 megabytes, Kubernetes will be able to re-deploy the three snappass-nginx Pods very quickly. You can confirm the new snappass-nginx Docker container is running within each of the three Pods by looking at the SSL certificate for the web application within your web browser. The expiration date should now be 90 days out instead of 30 days.

Additionally, you can see every step the Deployment took to deploy the new Docker image by running the following command:

kubectl describe deployment snappass-nginx
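The kubectl rollout subcommands offer another view of the same update, including a way back to the previous ReplicaSet. The wrappers below are only a sketch; watch_rollout and roll_back are hypothetical names around the rollout status, rollout history, and rollout undo subcommands:

```shell
# Block until the rolling update finishes (or fails), then list the
# revisions the Deployment has recorded.
watch_rollout() {
  kubectl rollout status "deployment/$1" &&
    kubectl rollout history "deployment/$1"
}

# Return to the previous revision if the new image misbehaves.
roll_back() {
  kubectl rollout undo "deployment/$1"
}
```

For example, watch_rollout snappass-nginx waits for the new Pods to become available, and roll_back snappass-nginx would restore the image with the 30 day certificate.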

Delete Everything

To ensure you do not consume any more of the $300 credit in your GCP Project than is necessary, the following commands will allow you to delete everything you just created.

Delete all of the Kubernetes Deployments and Services you created whose tier Label is frontend or backend:

kubectl delete deployments,services -l "tier in (frontend, backend)"

Destroy the GKE cluster (this will take a few minutes to complete):

gcloud container clusters delete cluster-1

Finally, delete the Docker images you uploaded by navigating to the Container Registry section of the GCP Console in your web browser and clicking into each repository and deleting all of the Docker images. The repositories will disappear after all Docker images inside of them are deleted.

Next Steps

You have now successfully deployed, scaled, and updated a three tier web application on top of Kubernetes, but this is just scratching the surface.

There are many more things and best practices that need to be considered before deploying production applications to Kubernetes, such as:

  1. Converting the command line Kubernetes Deployments used above to YAML files
  2. Scanning your Docker images for exploits and vulnerabilities
  3. Using SELinux or AppArmor to further secure your containers (GKE already uses Docker’s default AppArmor security profile)
  4. Building Continuous Deployment pipelines with something like Spinnaker
  5. Configuring Pod autoscaling within Kubernetes
  6. Monitoring Kubernetes clusters with something like Prometheus
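As a starting point for the first item, the redis Deployment created earlier with kubectl run could be written as a YAML file along these lines. Treat this as a sketch under assumptions rather than an exact equivalent: it assumes a cluster that serves the apps/v1 API, the file name is arbitrary, and the selector uses only the name Label:

```shell
# Write a Deployment manifest roughly matching the snappass-redis
# Deployment from this post (apps/v1 is an assumption about the cluster).
cat > snappass-redis-deployment.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: snappass-redis
  labels:
    name: snappass-redis
    tier: backend
    app: snappass
spec:
  replicas: 1
  selector:
    matchLabels:
      name: snappass-redis
  template:
    metadata:
      labels:
        name: snappass-redis
        tier: backend
        app: snappass
    spec:
      containers:
      - name: snappass-redis
        image: redis
        ports:
        - containerPort: 6379
EOF
```

The file can then be checked into version control and applied with kubectl apply -f snappass-redis-deployment.yaml.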