Deployments

Deployments provide declarative updates for Pods and ReplicaSets. They enable you to describe the desired state of your application, and the Deployment controller changes the actual state to match at a controlled rate.

Basic Deployment

Here’s a complete Deployment using the RollingUpdate strategy:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nnwebserver
spec:
  selector:
    matchLabels:
      run: nnwebserver
  replicas: 2
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  template:
    metadata:
      labels:
        run: nnwebserver
    spec:
      containers:
        - name: nnwebserver
          image: lovelearnlinux/webserver:v2
          livenessProbe:
            exec:
              command:
              - cat
              - /var/www/html/index.html
            initialDelaySeconds: 10
            timeoutSeconds: 3
            periodSeconds: 20
            failureThreshold: 3
          resources:
            requests:
              cpu: "100m"
              memory: "128Mi"
            limits:
              cpu: "200m"
              memory: "256Mi"
          ports:
            - containerPort: 80
              name: http
              protocol: TCP

Deployment Components

  • selector: Defines how the Deployment finds Pods to manage
  • replicas: Number of Pod copies to maintain
  • strategy: How to replace old Pods with new ones
  • template: Pod template used to create new Pods
  • matchLabels: Labels that must match for the Deployment to manage the Pod

Rolling Update Strategy

Rolling updates allow you to update Pods with zero downtime.
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 1        # Maximum number of Pods above desired count
    maxUnavailable: 0  # Maximum number of Pods that can be unavailable
maxSurge specifies the maximum number of Pods that can be created above the desired replica count during an update; maxUnavailable specifies the maximum number of Pods that may be unavailable during it.
  • maxSurge: 1: can temporarily run replicas + 1 Pods
  • maxSurge: 25%: can temporarily run 25% more Pods than desired
Higher maxSurge values speed up rollouts but use more resources. Both fields accept absolute numbers or percentages, but you cannot set both maxSurge and maxUnavailable to 0.
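As a rough sketch of how these values resolve (assuming the documented rounding rules: maxSurge percentages round up, maxUnavailable percentages round down; the function below is illustrative, not the controller's actual code):

```python
import math

def resolve(value, replicas, round_up):
    """Resolve an absolute or percentage maxSurge/maxUnavailable value
    against the desired replica count."""
    if isinstance(value, str) and value.endswith("%"):
        fraction = int(value[:-1]) / 100
        return math.ceil(replicas * fraction) if round_up else math.floor(replicas * fraction)
    return int(value)

replicas = 2
surge = resolve(1, replicas, round_up=True)             # maxSurge: 1
unavailable = resolve("25%", replicas, round_up=False)  # maxUnavailable: 25%

print(surge)        # 1 -> up to replicas + 1 = 3 Pods may exist mid-rollout
print(unavailable)  # floor(2 * 0.25) = 0 -> no old Pod is removed before a new one is ready
```

With replicas=2, maxSurge=1, maxUnavailable=0 (as in the example above), the rollout brings up one new Pod, waits for it to become ready, then retires one old Pod, and repeats.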

Managing Deployments

Create and Inspect

# Create deployment
kubectl create -f deployment-one.yml

# View deployments
kubectl get deployment

# Describe deployment
kubectl describe deployment nnwebserver

# View ReplicaSets created by deployment
kubectl get rs --selector=run=nnwebserver

Scaling

# Scale deployment
kubectl scale deployment nnwebserver --replicas=3

# Verify scaling
kubectl get deployment nnwebserver
If you manually scale a ReplicaSet managed by a Deployment, the Deployment controller will reset it to match the Deployment’s replica count.

Rollout Management

# Check rollout status
kubectl rollout status deployment nnwebserver

# Pause rollout (useful for canary deployments)
kubectl rollout pause deployment nnwebserver

# Resume rollout
kubectl rollout resume deployment nnwebserver

# View rollout history
kubectl rollout history deployment nnwebserver

# Rollback to previous version
kubectl rollout undo deployment nnwebserver

# Rollback to specific revision
kubectl rollout undo deployment nnwebserver --to-revision=2

Updating Deployments

To update a Deployment, modify the YAML and apply:
kubectl apply -f deployment-one.yml
You can also update the image directly:
kubectl set image deployment/nnwebserver nnwebserver=lovelearnlinux/webserver:v2
Add annotations to track change reasons:
template:
  metadata:
    annotations:
      kubernetes.io/change-cause: "updated to new version"

Horizontal Pod Autoscaler (HPA)

HPA automatically scales the number of Pods based on observed metrics like CPU utilization.

Deployment for Autoscaling

apiVersion: apps/v1
kind: Deployment
metadata:
  name: k8s-autoscaler
spec:
  selector:
    matchLabels:
      run: k8s-autoscaler
  replicas: 2
  template:
    metadata:
      labels:
        run: k8s-autoscaler
    spec:
      containers:
      - name: k8s-autoscaler
        image: lovelearnlinux/webserver:v1
        ports:
        - containerPort: 80
        resources:
          limits:
            cpu: 500m
            memory: 256Mi
          requests:
            cpu: 200m
            memory: 128Mi

HPA Configuration (v2)

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 66  # Scale out when average CPU exceeds 66% of requests

HPA Management

# Create HPA
kubectl create -f hpa-for-deployment-v2.yaml

# View HPA status
kubectl get hpa

# Describe HPA
kubectl describe hpa my-app

# Quick create HPA via CLI
kubectl autoscale deployment my-app --cpu-percent=66 --min=1 --max=5
Prerequisites for HPA:
  • Metrics Server must be installed in the cluster
  • Pods must have resource requests defined
  • HPA checks metrics every 15 seconds by default
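Ignoring the controller's tolerance and stabilization windows, the documented scaling formula is desiredReplicas = ceil(currentReplicas × currentMetricValue / targetMetricValue), clamped to the min/max bounds. A quick illustrative sketch:

```python
import math

def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas, max_replicas):
    """Compute the HPA's desired replica count, clamped to the configured bounds."""
    desired = math.ceil(current_replicas * current_utilization / target_utilization)
    return max(min_replicas, min(max_replicas, desired))

# With 2 replicas averaging 100% CPU against the 66% target above:
print(desired_replicas(2, 100, 66, min_replicas=1, max_replicas=5))  # 4
```

So a sustained spike to 100% utilization roughly doubles the Deployment from 2 to 4 replicas, while a drop well below the target scales it back toward minReplicas.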

Deployment Strategies

RollingUpdate (Default)

Gradually replaces old Pods with new ones.
Pros: Zero downtime, gradual rollout
Cons: Both versions run simultaneously during the rollout
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 1
    maxUnavailable: 0
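The other built-in strategy type is Recreate, which terminates all existing Pods before starting new ones. It incurs downtime, but guarantees the two versions never run at the same time (useful when the application cannot tolerate mixed versions):

strategy:
  type: Recreate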

Best Practices

  • Always define resource requests and limits for HPA to work correctly
  • Set maxUnavailable: 0 for zero-downtime deployments
  • Use minReadySeconds so new Pods must remain Ready for a set period before being counted as available
  • Track changes with annotations for better rollout history
  • Test updates in staging before production
  • Monitor rollout status and have rollback plans ready
  • Use HPA for workloads with variable traffic patterns
  • Set conservative autoscaling thresholds to avoid flapping
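For example, a minReadySeconds setting (the value 10 here is illustrative) makes the Deployment wait until a new Pod has been Ready for that long before counting it as available and continuing the rollout:

spec:
  minReadySeconds: 10  # new Pod must stay Ready for 10s before it counts as available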

Cleanup

# Delete deployment (also deletes managed ReplicaSets and Pods)
kubectl delete deployment nnwebserver

# Delete HPA
kubectl delete hpa my-app
