Production Cluster
In the real world, Kubernetes clusters must be designed to handle production demands. This section dives into two critical aspects: High Availability and Load Balancing. We’ll walk through practical implementations using real-world scenarios and configurations that you can deploy immediately.
High Availability
High availability (HA) is non-negotiable in production environments. It ensures your services remain accessible even when individual components fail. In Kubernetes, HA is achieved through three key layers:
- Control Plane HA: The control plane must be resilient. A common pattern is to run it on multiple nodes (e.g., three control plane nodes) behind a shared API endpoint, backed by a replicated etcd cluster (stacked or external).
- Worker Node HA: Run enough worker nodes (ideally spread across failure zones) that pods can be rescheduled elsewhere when a node fails.
- Stateful Services: Critical services (like databases) require stateful sets with persistent storage and replication.
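As a concrete sketch of control plane HA, a stacked-etcd cluster can be bootstrapped with kubeadm by pointing every control plane node at a shared, load-balanced API endpoint. The endpoint address below is a placeholder; adjust the version to your cluster:

```yaml
# kubeadm-config.yaml — sketch of an HA control plane with stacked etcd
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: "1.28.0"
# Shared VIP or load balancer in front of all API servers
# (k8s-api.example.com is a placeholder for your environment)
controlPlaneEndpoint: "k8s-api.example.com:6443"
etcd:
  local:  # stacked etcd: one member runs on each control plane node
    dataDir: /var/lib/etcd
```

Each additional control plane node then joins with `kubeadm join --control-plane`, giving you an odd-sized etcd quorum that tolerates a node failure.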
Here’s a practical example of a stateful set for a PostgreSQL database (a common production use case):
```yaml
# postgres-statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  replicas: 3
  serviceName: "postgres-service"
  selector:            # required in apps/v1; must match the pod template labels
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:14
          env:
            - name: POSTGRES_PASSWORD
              value: "securepassword"  # for demos only; use a Secret in production
          ports:
            - containerPort: 5432
          volumeMounts:
            - name: postgres-data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
    - metadata:
        name: postgres-data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: "standard"
        resources:
          requests:
            storage: "10Gi"
```
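The plaintext password above is for illustration only; in production you would typically store it in a Secret and reference it from the pod spec. A minimal sketch (names are illustrative):

```yaml
# postgres-secret.yaml — keep the password out of the pod spec
apiVersion: v1
kind: Secret
metadata:
  name: postgres-secret
type: Opaque
stringData:
  password: "securepassword"  # supplied at deploy time, not committed to git
---
# In the StatefulSet, reference the Secret instead of a literal value:
#   env:
#     - name: POSTGRES_PASSWORD
#       valueFrom:
#         secretKeyRef:
#           name: postgres-secret
#           key: password
```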
This StatefulSet ensures:
- Each PostgreSQL pod has its own persistent volume (preventing data loss on failure)
- Pods get stable, ordered identities (postgres-0, postgres-1, postgres-2); note that database-level replication is not automatic and must be configured separately (e.g., via streaming replication or a Postgres operator)
- The service name `postgres-service` provides stable per-pod DNS resolution (e.g., postgres-0.postgres-service)
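Note that `serviceName` in a StatefulSet refers to a headless Service, which must be created separately for those per-pod DNS records to exist. A minimal version might look like:

```yaml
# postgres-service.yaml — headless Service backing the StatefulSet
apiVersion: v1
kind: Service
metadata:
  name: postgres-service
spec:
  clusterIP: None  # headless: gives each pod a stable DNS record
  selector:
    app: postgres
  ports:
    - port: 5432
```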
Pro tip: Always test your HA setup with controlled failures. Delete one PostgreSQL pod and verify the database continues operating with minimal downtime.
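Beyond ad-hoc failure testing, a PodDisruptionBudget can guard against voluntary disruptions (e.g., node drains during upgrades) taking down too many replicas at once. A sketch for the Postgres pods above:

```yaml
# postgres-pdb.yaml — keep at least 2 of the 3 replicas up during drains
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: postgres-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: postgres
```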
Load Balancing
In production, load balancing is essential for distributing traffic across multiple instances and handling scaling. Kubernetes provides two primary approaches:
- In-Cluster Load Balancing: Kubernetes Services distribute traffic across pods (ClusterIP for internal traffic; type `LoadBalancer` additionally provisions a cloud load balancer for external access)
- External Load Balancing: Ingress controllers route public HTTP(S) traffic to Services
Let’s walk through a real-world implementation:
Step 1: Deploy a web application (Node.js example)
```yaml
# web-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: my-web-app:1.0
          ports:
            - containerPort: 8080
```
Step 2: Expose the Deployment with a LoadBalancer Service
```yaml
# web-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  type: LoadBalancer
  ports:
    - port: 80
      targetPort: 8080
  selector:
    app: web
```
Step 3: Implement external HTTP routing (Nginx Ingress)
```yaml
# nginx-ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress
spec:
  rules:
    - http:
        paths:
          - path: "/"
            pathType: Prefix
            backend:
              service:
                name: web
                port:
                  number: 80
```
When the Service is applied, the cloud provider provisions a load balancer (e.g., an AWS ELB) that distributes traffic across your three pods. For more complex routing, the Ingress controller handles path-based routing, hostnames, and SSL termination.
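For example, host-based routing with SSL termination can be layered onto an Ingress like the one above. The hostname and Secret name below are placeholders, and the `kubernetes.io/tls` Secret (certificate plus key) must be created separately:

```yaml
# web-ingress-tls.yaml — host routing plus TLS termination (illustrative names)
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress
spec:
  tls:
    - hosts:
        - www.example.com
      secretName: web-tls  # Secret of type kubernetes.io/tls
  rules:
    - host: www.example.com
      http:
        paths:
          - path: "/"
            pathType: Prefix
            backend:
              service:
                name: web
                port:
                  number: 80
```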
Pro tip: Always configure health checks using livenessProbe and readinessProbe in your pod specifications to ensure only healthy instances receive traffic.
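For the web Deployment above, the probes might look like the following container-level snippet (the `/healthz` and `/ready` endpoints are assumptions about the application):

```yaml
# Probe snippet for the web container spec (endpoints are illustrative)
livenessProbe:          # restart the container if it stops responding
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 10
readinessProbe:         # remove the pod from Service endpoints until it is ready
  httpGet:
    path: /ready
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5
```

The distinction matters: a failing liveness probe restarts the container, while a failing readiness probe only stops traffic from reaching it.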
Here’s a quick comparison of load balancing approaches:
| Approach | When to Use | Example Use Case |
|---|---|---|
| In-Cluster (Service) | Internal traffic distribution within cluster | Microservice backend communication |
| External (Ingress) | Public HTTP(S) traffic from clients | Customer-facing web applications |
Summary
In this section, we’ve covered two critical production aspects of Kubernetes clusters:
- High Availability: Ensuring your cluster remains operational despite node failures through multi-node control planes and stateful services.
- Load Balancing: Distributing traffic efficiently using Kubernetes Services and Ingress controllers for both internal and external traffic.
These practices form the foundation of a resilient and scalable production cluster. 🔄🌐