Kubernetes Configuration and Production Readiness
You've deployed applications to Kubernetes and watched them self-heal. You've set up networking with Services and performed zero-downtime updates. But your applications aren't quite ready for a shared production cluster yet.
Think about what happens when multiple teams share the same Kubernetes cluster. Without proper boundaries, one team's runaway application could consume all available memory, starving everyone else's workloads. When an application crashes, how does Kubernetes know whether to restart it or leave it alone? And what about sensitive configuration like database passwords - surely we don't want those hardcoded in our container images?
Today, we'll add the production safeguards that make applications good citizens in shared clusters. We'll implement health checks that tell Kubernetes when your application is actually ready for traffic, set resource boundaries to prevent noisy neighbor problems, and externalize configuration so you can change settings without rebuilding containers.
By the end of this tutorial, you'll be able to:
- Add health checks that prevent broken applications from receiving traffic
- Set resource limits to protect your cluster from runaway applications
- Run containers as non-root users for better security
- Use ConfigMaps and Secrets to manage configuration without rebuilding images
- Understand why these patterns matter for production workloads
Why Production Readiness Matters
Let's start with a scenario that shows why default Kubernetes settings aren't enough for production.
You deploy a new version of your ETL application. The container starts successfully, so Kubernetes marks it as ready and starts sending it traffic. But there's a problem: your application needs 30 seconds to warm up its database connection pool and load reference data into memory. During those 30 seconds, any requests fail with connection errors.
Or consider this: your application has a memory leak. Over several days, it slowly consumes more and more RAM until it uses all available memory on the node, causing other applications to crash. Without resource limits, one buggy application can take down everything else running on the same machine.
These aren't theoretical problems. Every production Kubernetes cluster deals with these challenges. The good news is that Kubernetes provides built-in solutions - you just need to configure them.
Health Checks: Teaching Kubernetes About Your Application
By default, Kubernetes considers a container "healthy" if its main process is running. But a running process doesn't mean your application is actually working. Maybe it's still initializing, maybe it lost its database connection, or maybe it's stuck in an infinite loop.
Probes let you teach Kubernetes how to check if your application is actually healthy. There are three types that solve different problems:
- Readiness probes answer: "Is this Pod ready to handle requests?" If the probe fails, Kubernetes stops sending traffic to that Pod but leaves it running. This prevents users from hitting broken instances during startup or temporary issues.
- Liveness probes answer: "Is this Pod still working?" If the probe fails repeatedly, Kubernetes restarts the Pod. This recovers from situations where your application is stuck but the process hasn't crashed.
- Startup probes disable the other probes until your application finishes initializing. Most data processing applications don't need this, but it's useful for applications that take several minutes to start.
The distinction between readiness and liveness is important. Readiness failures are often temporary (like during startup or when a database is momentarily unavailable), so we don't want to restart the Pod. Liveness failures indicate something is fundamentally broken and needs a fresh start.
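The two behaviors can be summed up in a few lines of Python. This is purely an illustrative sketch (the real decisions happen inside the kubelet), and react_to_probes is a hypothetical helper name, not a Kubernetes API:

```python
def react_to_probes(ready: bool, consecutive_liveness_failures: int,
                    failure_threshold: int = 3) -> str:
    """Illustrate how Kubernetes reacts to probe results for a Pod."""
    if consecutive_liveness_failures >= failure_threshold:
        # Liveness failure: something is fundamentally broken, so restart
        return "restart container"
    if not ready:
        # Readiness failure: often temporary, so just stop sending traffic
        return "remove from Service endpoints"
    return "serve traffic"
```

Notice that a readiness failure never triggers a restart on its own; the Pod simply stops receiving traffic until the probe passes again.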
Setting Up Your Environment
Let's add these production features to the ETL pipeline from previous tutorials. If you're continuing from the last tutorial, make sure your Minikube cluster is running:
minikube start
alias kubectl="minikube kubectl --"
If you're starting fresh, you'll need the ETL application from the previous tutorial. Clone the repository:
git clone https://github.com/dataquestio/tutorials.git
cd tutorials/kubernetes-services-starter
# Point Docker to Minikube's environment
eval $(minikube -p minikube docker-env)
# Build the ETL image (same as tutorial 2)
docker build -t etl-app:v1 .
Clean up any existing deployments so we can start fresh:
kubectl delete deployment etl-app postgres --ignore-not-found=true
kubectl delete service postgres --ignore-not-found=true
Building a Production-Ready Deployment
In this tutorial, we'll build up a single deployment file that incorporates all production best practices. This mirrors how you'd work in a real job - starting with a basic deployment and evolving it as you add features.
Create a file called etl-deployment.yaml with this basic structure:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: etl-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: etl-app
  template:
    metadata:
      labels:
        app: etl-app
    spec:
      containers:
      - name: etl-app
        image: etl-app:v1
        imagePullPolicy: Never
        env:
        - name: DB_HOST
          value: postgres
        - name: DB_PORT
          value: "5432"
        - name: DB_USER
          value: etl
        - name: DB_PASSWORD
          value: mysecretpassword
        - name: DB_NAME
          value: pipeline
        - name: APP_VERSION
          value: v1
This is our starting point. Now we'll add production features one by one.
Adding Health Checks
Kubernetes probes should use lightweight commands that run quickly and reliably. For our ETL application, we need two different types of checks: one to verify our database dependency is available, and another to confirm our processing script is actively working.
First, we need to modify our Python script to include a heartbeat mechanism. This lets us detect when the ETL process gets stuck or stops working, which a simple process check wouldn't catch.
Edit the app.py file and add this heartbeat code:
def update_heartbeat():
    """Write current timestamp to heartbeat file for liveness probe"""
    import time
    with open("/tmp/etl_heartbeat", "w") as f:
        f.write(str(int(time.time())))
        f.write("\n")

# In the main loop, add the heartbeat after successful ETL completion
if __name__ == "__main__":
    while True:
        run_etl()
        update_heartbeat()  # Add this line
        log("Sleeping for 30 seconds...")
        time.sleep(30)
We'll also need to update our Dockerfile because our readiness probe will use psql, but our base Python image doesn't include PostgreSQL client tools:
FROM python:3.10-slim
WORKDIR /app
# Install PostgreSQL client tools for health checks
RUN apt-get update && apt-get install -y postgresql-client && rm -rf /var/lib/apt/lists/*
COPY app.py .
RUN pip install psycopg2-binary
CMD ["python", "-u", "app.py"]
Now rebuild with the PostgreSQL client tools included:
# Make sure you're still in Minikube's Docker environment
eval $(minikube -p minikube docker-env)
docker build -t etl-app:v1 .
Now edit your etl-deployment.yaml file and add these health checks to the container spec, right after the env section. Make sure the readinessProbe: line starts at the same column as other container properties like image: and env:. YAML indentation errors are common here, so if you get stuck, you can reference the complete working file to check your spacing.
        readinessProbe:
          exec:
            command:
            - /bin/sh
            - -c
            - |
              PGPASSWORD="$DB_PASSWORD" \
              psql -h "$DB_HOST" -p "$DB_PORT" -U "$DB_USER" -d "$DB_NAME" -t -c "SELECT 1;" >/dev/null
          initialDelaySeconds: 10
          periodSeconds: 10
          timeoutSeconds: 3
        livenessProbe:
          exec:
            command:
            - /bin/sh
            - -c
            - |
              # get the current time in seconds since 1970
              now=$(date +%s)
              # read the last "heartbeat" timestamp from a file
              # if the file doesn't exist, just pretend it's 0
              hb=$(cat /tmp/etl_heartbeat 2>/dev/null || echo 0)
              # subtract: how many seconds since the last heartbeat?
              # check that it's less than 600 seconds (10 minutes)
              [ $((now - hb)) -lt 600 ]
          initialDelaySeconds: 60
          periodSeconds: 30
          failureThreshold: 2
Let's understand what these probes do:
- readinessProbe: Uses psql to test the actual database connection our application needs. This approach works reliably with the security settings we'll add later and tests the same connection path our ETL script uses.
- livenessProbe: Verifies our ETL script is actively processing by checking when it last updated a heartbeat file. This catches situations where the script gets stuck in an infinite loop or stops working entirely.
The liveness probe uses generous timing (check every 30 seconds, allow up to 10 minutes between heartbeats) because ETL jobs can legitimately take time to process data, and unnecessary restarts are expensive.
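The shell one-liner in the liveness probe is compact, so here is the same staleness check written out in Python. This is just a readable mirror of the probe's logic, not code you need to add to app.py:

```python
import time

def heartbeat_is_fresh(path: str = "/tmp/etl_heartbeat",
                       max_age_seconds: int = 600) -> bool:
    """Succeed if the heartbeat file was written within max_age_seconds,
    mirroring the shell liveness probe above."""
    try:
        with open(path) as f:
            last_beat = int(f.read().strip())
    except (FileNotFoundError, ValueError):
        last_beat = 0  # same fallback as `|| echo 0` in the probe
    return (int(time.time()) - last_beat) < max_age_seconds
```

If the ETL loop stops calling update_heartbeat(), the file's timestamp ages past 600 seconds and this check (and therefore the probe) starts failing.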
Web applications often use HTTP endpoints for probes (like /readyz for readiness and /livez for liveness, following Kubernetes component naming conventions), but data processing applications typically verify their connections to databases, message queues, or file systems directly.
The timing configuration tells Kubernetes:
- readinessProbe: Start checking after 10 seconds, check every 10 seconds with a 3-second timeout per attempt, mark unready after 3 consecutive failures (the default failureThreshold)
- livenessProbe: Start checking after 60 seconds (giving time for initialization), check every 30 seconds, restart after 2 consecutive failures
Timing Values in Practice: These numbers are example values chosen for this tutorial. In production, you should tune these values based on your actual application behavior. Consider how long your service actually takes to start up (for initialDelaySeconds), how reliable your network connections are (affecting periodSeconds and failureThreshold), and how disruptive false restarts would be to your users. A database might need 60+ seconds to initialize, while a simple API might be ready in 5 seconds. Network-dependent services in flaky environments might need higher failure thresholds to avoid unnecessary restarts.
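One useful way to reason about these numbers is the worst-case detection time: how long a failure can go unnoticed before Kubernetes reacts. A rough back-of-envelope calculation (not the kubelet's exact timing, which also depends on when in the period the failure happens):

```python
def worst_case_detection_seconds(period: int, failure_threshold: int,
                                 timeout: int = 0) -> int:
    """Rough upper bound on detection delay: we need failure_threshold
    consecutive failed checks, one per period, and each failed check
    can take up to `timeout` extra seconds before it's counted."""
    return failure_threshold * (period + timeout)

# Our liveness probe: checks every 30s, restarts after 2 failures
liveness = worst_case_detection_seconds(period=30, failure_threshold=2)    # 60s
# Our readiness probe: every 10s, 3s timeout, default threshold of 3
readiness = worst_case_detection_seconds(period=10, failure_threshold=3,
                                         timeout=3)                        # 39s
```

So with this configuration, a stuck ETL loop triggers a restart within roughly a minute of the heartbeat going stale, while a lost database connection stops traffic within well under a minute.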
Now deploy PostgreSQL and then apply your deployment:
# Deploy PostgreSQL
kubectl create deployment postgres --image=postgres:13
kubectl set env deployment/postgres POSTGRES_DB=pipeline POSTGRES_USER=etl POSTGRES_PASSWORD=mysecretpassword
kubectl expose deployment postgres --port=5432
# Deploy ETL app with probes
kubectl apply -f etl-deployment.yaml
# Check the initial status
kubectl get pods
You might initially see the ETL pods showing 0/1 in the READY column. This is expected! The readiness probe is checking if PostgreSQL is available, and it might take a moment for the database to fully start up. Watch the pods transition to 1/1 as PostgreSQL becomes ready:
kubectl get pods -w
Once both PostgreSQL and the ETL pods show 1/1 READY, press Ctrl+C and proceed to the next step.
Testing Probe Behavior
Let's see readiness probes in action. In one terminal, watch the Pod status:
kubectl get pods -w
In another terminal, break the database connection by scaling PostgreSQL to zero:
kubectl scale deployment postgres --replicas=0
Watch what happens to the ETL Pods. You'll see their READY column change from 1/1 to 0/1. The Pods are still running (STATUS remains "Running"), but Kubernetes has marked them as not ready because the readiness probe is failing.
Check the Pod details to see the probe failures:
kubectl describe pod -l app=etl-app | grep -A10 "Readiness"
You'll see events showing readiness probe failures. The output will include lines like:
Readiness probe failed: psql: error: connection to server at "postgres" (10.96.123.45), port 5432 failed: Connection refused
This shows that psql can't connect to the PostgreSQL service, which is exactly what we expect when the database isn't running.
Now restore PostgreSQL:
kubectl scale deployment postgres --replicas=1
Within about 15 seconds, the ETL Pods should return to READY status as their readiness probes start succeeding again. Press Ctrl+C to stop watching.
Understanding What Just Happened
This demonstrates the power of readiness probes:
- When PostgreSQL was available: ETL Pods were marked READY (1/1)
- When PostgreSQL went down: ETL Pods automatically became NOT READY (0/1), but kept running
- When PostgreSQL returned: ETL Pods automatically became READY again
If these ETL Pods were behind a Service (like a web API), Kubernetes would have automatically stopped routing traffic to them during the database outage, then resumed traffic when the database returned. The application didn't crash or restart unnecessarily. Instead, it just waited for its dependency to become available again.
The liveness probe continues running in the background. You can verify it's working by checking for successful probe events:
kubectl get events --field-selector reason=Unhealthy -o wide
If you don't see any recent "Unhealthy" events related to liveness probes, that means they're passing successfully. You can also verify the heartbeat mechanism by checking the Pod logs to confirm the ETL script is running its normal cycle:
kubectl logs deployment/etl-app --tail=10
You should see regular "ETL cycle complete" and "Sleeping for 30 seconds" messages, which indicates the script is actively running and would be updating its heartbeat file.
This demonstrates how probes enable intelligent application lifecycle management. Kubernetes makes smart decisions about what's broken and how to fix it.
Resource Management: Being a Good Neighbor
In a shared Kubernetes cluster, multiple applications run on the same nodes. Without resource limits, one application can monopolize CPU or memory, starving others. This is the "noisy neighbor" problem.
Kubernetes uses resource requests and limits to solve this:
- Requests tell Kubernetes how much CPU/memory your Pod needs to run properly. Kubernetes uses this for scheduling decisions.
- Limits set hard caps on how much CPU/memory your Pod can use. If a Pod exceeds its memory limit, it gets killed.
A note about ephemeral storage: You can also set requests and limits for ephemeral-storage, which controls temporary disk space inside containers. This becomes important for applications that generate lots of log files, cache data locally, or create temporary files during processing. Without ephemeral storage limits, a runaway process that fills up disk space can cause confusing Pod evictions that are hard to debug. While we won't add storage limits to our ETL example, keep this in mind for data processing jobs that work with large temporary files.
Adding Resource Controls
Now let's add resource controls to prevent our application from consuming too many cluster resources. Edit your etl-deployment.yaml file and add a resources section right after the environment variables. The resources section should align with other container properties like image and env. Make sure resources: starts at the same column as those properties (8 spaces from the left margin):
        resources:
          requests:
            memory: "128Mi"
            cpu: "100m"
          limits:
            memory: "256Mi"
            cpu: "500m"
Apply the updated configuration:
kubectl apply -f etl-deployment.yaml
The resource specifications mean:
- requests: The Pod needs at least 128MB RAM and 0.1 CPU cores to run
- limits: The Pod cannot use more than 256MB RAM or 0.5 CPU cores
CPU is measured in "millicores" where 1000m = 1 CPU core. Memory uses standard units (Mi = mebibytes).
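If you want to sanity-check these quantities, the conversions are simple arithmetic. This is a simplified converter for the forms used in this tutorial; real Kubernetes quantities also support decimal suffixes (M, G) and exponent notation:

```python
def cpu_to_millicores(quantity: str) -> int:
    """Convert a CPU quantity like "500m" or "1" to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(float(quantity) * 1000)  # whole cores -> millicores

def memory_to_bytes(quantity: str) -> int:
    """Convert a binary-suffixed memory quantity like "256Mi" to bytes."""
    units = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}
    for suffix, factor in units.items():
        if quantity.endswith(suffix):
            return int(quantity[:-2]) * factor
    return int(quantity)  # plain bytes

cpu_to_millicores("500m")    # 500 (half a core)
memory_to_bytes("256Mi")     # 268435456 bytes
```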
Check that Kubernetes scheduled your Pods with these constraints:
kubectl describe pod -l app=etl-app | grep -A3 "Limits"
You'll see output showing your resource configuration for each Pod. Kubernetes uses these requests to decide if a node has enough free resources to run your Pod. If your cluster doesn't have enough resources available, Pods stay in the Pending state until resources free up.
Understanding Resource Impact
Resources affect two critical behaviors:
- Scheduling: When Kubernetes needs to place a Pod, it only considers nodes with enough unreserved resources to meet your requests. If you request 4GB of RAM but all nodes only have 2GB free, your Pod won't schedule.
- Runtime enforcement: If your Pod tries to use more memory than its limit, Kubernetes kills it (OOMKilled status). CPU limits work differently - instead of killing the Pod, Kubernetes throttles it to stay within the limit. Be aware that heavy CPU throttling can slow down probe responses, which might cause Kubernetes to restart the Pod if health checks start timing out.
Quality of Service (QoS): Your resource configuration determines how Kubernetes prioritizes your Pod during resource pressure. You can see this in action:
kubectl describe pod -l app=etl-app | grep "QoS Class"
You'll likely see "Burstable" because our requests are lower than our limits. This means the Pod can use extra resources when available, but might get evicted if the node runs short. For critical production workloads, you often want "Guaranteed" QoS by setting requests equal to limits, which provides more predictable performance and better protection from eviction.
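The assignment rule can be sketched in a few lines. This is a simplified single-container version; the real rule requires every container in the Pod to have both CPU and memory limits with requests equal to limits for Guaranteed:

```python
def qos_class(requests: dict, limits: dict) -> str:
    """Simplified QoS assignment for a single-container Pod."""
    if not requests and not limits:
        return "BestEffort"   # no resources declared at all
    if requests and requests == limits:
        return "Guaranteed"   # fixed allocation, safest from eviction
    return "Burstable"        # can use spare capacity, evicted earlier

qos_class({"memory": "128Mi", "cpu": "100m"},
          {"memory": "256Mi", "cpu": "500m"})   # our Pod: "Burstable"
```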
This is why setting appropriate values matters. Too low and your application crashes or runs slowly. Too high and you waste resources that other applications could use.
Security: Running as Non-Root
By default, containers often run as root (user ID 0). This is a security risk - if someone exploits your application, they have root privileges inside the container. While container isolation provides some protection, defense in depth means we should run as non-root users whenever possible.
Configuring Non-Root Execution
Edit your etl-deployment.yaml file and add a securityContext section inside the existing Pod template spec. Find the section that looks like this:
  template:
    metadata:
      labels:
        app: etl-app
    spec:
      containers:
Add the securityContext right after the spec: line and before the containers: line:
  template:
    metadata:
      labels:
        app: etl-app
    spec:
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        fsGroup: 1000
      containers:
      # ... rest of container spec
Apply the secure configuration:
kubectl apply -f etl-deployment.yaml
The securityContext settings:
- runAsNonRoot: Prevents the container from running as root
- runAsUser: Specifies user ID 1000 (a non-privileged user)
- fsGroup: Sets the group ownership for mounted volumes
Since we changed the Pod template, Kubernetes needs to create new Pods with the security context. Check that the rollout completes:
kubectl rollout status deployment/etl-app
You should see "deployment successfully rolled out" when it's finished. Now verify the container is running as a non-root user:
kubectl exec deployment/etl-app -- id
You should see uid=1000, not uid=0(root).
Configuration Without Rebuilds
So far, we've hardcoded configuration like database passwords directly in our deployment YAML. This is problematic for several reasons:
- Changing configuration requires updating deployment files
- Sensitive values like passwords are visible in plain text
- Different environments (development, staging, production) need different values
Kubernetes provides ConfigMaps for non-sensitive configuration and Secrets for sensitive data. Both let you change configuration without rebuilding containers, but they offer different ways to deliver that configuration to your applications.
Creating ConfigMaps and Secrets
First, create a ConfigMap for non-sensitive configuration:
kubectl create configmap app-config \
--from-literal=DB_HOST=postgres \
--from-literal=DB_PORT=5432 \
--from-literal=DB_NAME=pipeline \
--from-literal=LOG_LEVEL=INFO
Now create a Secret for sensitive data:
kubectl create secret generic db-credentials \
--from-literal=DB_USER=etl \
--from-literal=DB_PASSWORD=mysecretpassword
Secrets are base64 encoded (not encrypted) by default. In production, you'd use additional tools for encryption at rest.
View what was created:
kubectl get configmap app-config -o yaml
kubectl get secret db-credentials -o yaml
Notice that the Secret values are base64 encoded. You can decode them:
echo "bXlzZWNyZXRwYXNzd29yZA==" | base64 -d
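You can reproduce the round-trip with Python's standard base64 module, which makes it obvious that no secret key is involved; encoding is a reversible transformation, not encryption:

```python
import base64

# Anyone who can read the Secret object can recover the value.
encoded = base64.b64encode(b"mysecretpassword").decode()
print(encoded)                             # bXlzZWNyZXRwYXNzd29yZA==
print(base64.b64decode(encoded).decode())  # mysecretpassword
```

This is why access to Secrets should be restricted with RBAC, and why production clusters typically layer encryption at rest on top.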
Using Environment Variables
Kubernetes gives you two main ways to use ConfigMaps and Secrets in your applications: as environment variables (which we'll use) or as mounted files inside your containers. Environment variables work well for simple key-value configuration like database connections. Volume mounts are better for complex configuration files, certificates, or when you need to rotate secrets without restarting containers. We'll stick with environment variables to keep things focused, but keep volume mounts in mind for more advanced scenarios.
Edit your etl-deployment.yaml file to use these external configurations. Replace the hardcoded env section with:
        envFrom:
        - configMapRef:
            name: app-config
        - secretRef:
            name: db-credentials
        env:
        - name: APP_VERSION
          value: v1
The key change is envFrom, which loads all key-value pairs from the ConfigMap and Secret as environment variables.
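From the application's point of view, nothing changes: the same environment lookups work whether the values were hardcoded in the manifest or injected from a ConfigMap or Secret. A sketch of how app.py might read them (the load_db_config helper name is ours, not from the tutorial's code):

```python
import os

def load_db_config() -> dict:
    """Read settings that envFrom injected as environment variables.
    The app doesn't know or care whether a value came from app-config,
    db-credentials, or a plain env entry in the manifest."""
    return {
        "host": os.environ.get("DB_HOST", "localhost"),
        "port": int(os.environ.get("DB_PORT", "5432")),
        "user": os.environ.get("DB_USER", ""),
        "password": os.environ.get("DB_PASSWORD", ""),
        "dbname": os.environ.get("DB_NAME", ""),
    }
```

This decoupling is exactly what lets you swap configuration per environment without touching application code or images.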
Apply the final configuration:
kubectl apply -f etl-deployment.yaml
Updating Configuration Without Rebuilds
Here's where ConfigMaps and Secrets shine. Let's change the log level without touching the container image:
kubectl edit configmap app-config
Change LOG_LEVEL from INFO to DEBUG and save.
ConfigMap changes don't automatically restart Pods, so trigger a rollout:
kubectl rollout restart deployment/etl-app
kubectl rollout status deployment/etl-app
Verify the new configuration is active:
kubectl exec deployment/etl-app -- env | grep LOG_LEVEL
You just changed application configuration without rebuilding the container image or modifying deployment files. This pattern becomes powerful when you have dozens of configuration values that differ between environments.
Cleaning Up
When you're done experimenting:
# Delete deployments and services
kubectl delete deployment etl-app postgres
kubectl delete service postgres
# Delete configuration
kubectl delete configmap app-config
kubectl delete secret db-credentials
# Stop Minikube
minikube stop
Production Patterns in Action
You've transformed a basic Kubernetes deployment into something ready for production. Your application now:
- Communicates its health to Kubernetes through readiness and liveness probes
- Respects resource boundaries to be a good citizen in shared clusters
- Runs securely as a non-root user
- Accepts configuration changes without rebuilding containers
These patterns follow real production practices you'll see in enterprise Kubernetes deployments. Health checks prevent cascading failures when dependencies have issues. Resource limits prevent cluster instability when applications misbehave. Non-root execution reduces security risks if vulnerabilities get exploited. External configuration enables GitOps workflows where you manage settings separately from code.
These same patterns scale from simple applications to complex microservices architectures. A small ETL pipeline uses the same production readiness features as a system handling millions of requests per day.
Every production Kubernetes deployment needs these safeguards. Without health checks, broken Pods receive traffic. Without resource limits, one application can destabilize an entire cluster. Without external configuration, simple changes require complex rebuilds.
Next Steps
Now that your applications are production-ready, you can explore advanced Kubernetes features:
- Horizontal Pod Autoscaling (HPA): Automatically scale replicas based on CPU/memory usage
- Persistent Volumes: Handle stateful applications that need durable storage
- Network Policies: Control which Pods can communicate with each other
- Pod Disruption Budgets: Ensure minimum availability during cluster maintenance
- Service Mesh: Add advanced networking features like circuit breakers and retries
The patterns you've learned here remain the same whether you're running on Minikube, Amazon EKS, Google GKE, or your own Kubernetes cluster. Start with these fundamentals, and add complexity only when your requirements demand it.
Remember that Kubernetes is a powerful tool, but not every application needs all its features. Use health checks and resource limits everywhere. Add other features based on actual requirements, not because they seem interesting. The best Kubernetes deployments are often the simplest ones that solve real problems.