Production Cluster
In the real world, Kubernetes clusters must be designed to handle production demands. This section dives into two critical aspects: High Availability and Load Balancing. We’ll walk through practical implementations using real-world scenarios and configurations that you can deploy immediately.
High Availability
High availability (HA) is non-negotiable in production environments. It ensures your services remain accessible even when individual components fail. In Kubernetes, HA is achieved through three key layers:
- Control Plane HA: The control plane must be resilient. A common pattern is to run it on multiple nodes (e.g., three control plane nodes) behind a shared API endpoint, backed by a replicated etcd cluster (stacked or external).
- Worker Node HA: Run enough worker nodes (ideally spread across failure zones) that pods can be rescheduled elsewhere when a node fails.
- Stateful Services: Critical services (like databases) require stateful sets with persistent storage and replication.
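As a concrete sketch of control plane HA, a stacked-etcd cluster can be bootstrapped with kubeadm by pointing every control plane node at a shared, load-balanced API endpoint. The endpoint address below is a placeholder; adjust the version to your cluster:

```yaml
# kubeadm-config.yaml — sketch of an HA control plane with stacked etcd
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: "1.28.0"
# Shared VIP or load balancer in front of all API servers
# (k8s-api.example.com is a placeholder for your environment)
controlPlaneEndpoint: "k8s-api.example.com:6443"
etcd:
  local:  # stacked etcd: one member runs on each control plane node
    dataDir: /var/lib/etcd
```

Each additional control plane node then joins with `kubeadm join --control-plane`, giving you an odd-sized etcd quorum that tolerates a node failure.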
Here’s a practical example of a stateful set for a PostgreSQL database (a common production use case):
```yaml
# postgres-statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  replicas: 3
  serviceName: "postgres-service"
  selector:            # required in apps/v1; must match the pod template labels
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:14
          env:
            - name: POSTGRES_PASSWORD
              value: "securepassword"  # for demos only; use a Secret in production
          ports:
            - containerPort: 5432
          volumeMounts:
            - name: postgres-data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
    - metadata:
        name: postgres-data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: "standard"
        resources:
          requests:
            storage: "10Gi"
```
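The plaintext password above is for illustration only; in production you would typically store it in a Secret and reference it from the pod spec. A minimal sketch (names are illustrative):

```yaml
# postgres-secret.yaml — keep the password out of the pod spec
apiVersion: v1
kind: Secret
metadata:
  name: postgres-secret
type: Opaque
stringData:
  password: "securepassword"  # supplied at deploy time, not committed to git
---
# In the StatefulSet, reference the Secret instead of a literal value:
#   env:
#     - name: POSTGRES_PASSWORD
#       valueFrom:
#         secretKeyRef:
#           name: postgres-secret
#           key: password
```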
This StatefulSet ensures:
- Each PostgreSQL pod has its own persistent volume (preventing data loss on failure)
- Pods get stable, ordered identities (postgres-0, postgres-1, postgres-2); note that database-level replication is not automatic and must be configured separately (e.g., via streaming replication or a Postgres operator)
- The service name `postgres-service` provides stable per-pod DNS resolution (e.g., postgres-0.postgres-service)
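Note that `serviceName` in a StatefulSet refers to a headless Service, which must be created separately for those per-pod DNS records to exist. A minimal version might look like:

```yaml
# postgres-service.yaml — headless Service backing the StatefulSet
apiVersion: v1
kind: Service
metadata:
  name: postgres-service
spec:
  clusterIP: None  # headless: gives each pod a stable DNS record
  selector:
    app: postgres
  ports:
    - port: 5432
```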
Pro tip: Always test your HA setup with controlled failures. Delete one PostgreSQL pod and verify the database continues operating with minimal downtime.
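Beyond ad-hoc failure testing, a PodDisruptionBudget can guard against voluntary disruptions (e.g., node drains during upgrades) taking down too many replicas at once. A sketch for the Postgres pods above:

```yaml
# postgres-pdb.yaml — keep at least 2 of the 3 replicas up during drains
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: postgres-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: postgres
```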
Load Balancing
In production, load balancing is essential for distributing traffic across multiple instances and handling scaling. Kubernetes provides two primary approaches:
- In-Cluster Load Balancing: Kubernetes Services distribute traffic across pods (ClusterIP for internal traffic; type `LoadBalancer` additionally provisions a cloud load balancer for external access)
- External Load Balancing: Ingress controllers route public HTTP(S) traffic to Services
Let’s walk through a real-world implementation:
Step 1: Deploy a web application (Node.js example)
```yaml
# web-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: my-web-app:1.0
          ports:
            - containerPort: 8080
```
Step 2: Expose the Deployment with a LoadBalancer Service
```yaml
# web-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  type: LoadBalancer
  ports:
    - port: 80
      targetPort: 8080
  selector:
    app: web
```
Step 3: Implement external HTTP routing (Nginx Ingress)
```yaml
# nginx-ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress
spec:
  rules:
    - http:
        paths:
          - path: "/"
            pathType: Prefix
            backend:
              service:
                name: web
                port:
                  number: 80
```
When the Service is applied, the cloud provider provisions a load balancer (e.g., an AWS ELB) that distributes traffic across your three pods. For more complex routing, the Ingress controller handles path-based routing, hostnames, and SSL termination.
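For example, host-based routing with SSL termination can be layered onto an Ingress like the one above. The hostname and Secret name below are placeholders, and the `kubernetes.io/tls` Secret (certificate plus key) must be created separately:

```yaml
# web-ingress-tls.yaml — host routing plus TLS termination (illustrative names)
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress
spec:
  tls:
    - hosts:
        - www.example.com
      secretName: web-tls  # Secret of type kubernetes.io/tls
  rules:
    - host: www.example.com
      http:
        paths:
          - path: "/"
            pathType: Prefix
            backend:
              service:
                name: web
                port:
                  number: 80
```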
Pro tip: Always configure health checks using livenessProbe and readinessProbe in your pod specifications to ensure only healthy instances receive traffic.
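For the web Deployment above, the probes might look like the following container-level snippet (the `/healthz` and `/ready` endpoints are assumptions about the application):

```yaml
# Probe snippet for the web container spec (endpoints are illustrative)
livenessProbe:          # restart the container if it stops responding
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 10
readinessProbe:         # remove the pod from Service endpoints until it is ready
  httpGet:
    path: /ready
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5
```

The distinction matters: a failing liveness probe restarts the container, while a failing readiness probe only stops traffic from reaching it.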
Here’s a quick comparison of load balancing approaches:
| Approach | When to Use | Example Use Case |
|---|---|---|
| In-Cluster (Service) | Internal traffic distribution within cluster | Microservice backend communication |
| External (Ingress) | Public HTTP(S) traffic from clients | Customer-facing web applications |
Summary
In this section, we’ve covered two critical production aspects of Kubernetes clusters:
- High Availability: Ensuring your cluster remains operational despite node failures through multi-node control planes and stateful services.
- Load Balancing: Distributing traffic efficiently using Kubernetes Services and Ingress controllers for both internal and external traffic.
These practices form the foundation of a resilient and scalable production cluster. 🔄🌐