⚖️ Blog 5: Autoscaling Your Angular App with Kubernetes HPA (Horizontal Pod Autoscaler)

Welcome to Part 5 of our 7-blog DevOps series! So far, you’ve deployed your Angular app on Minikube and exposed it with Ingress. Now it’s time to make your app scale automatically with demand using the Kubernetes Horizontal Pod Autoscaler (HPA).

🤔 What Does This Term Mean and Why Use It?

Horizontal Pod Autoscaler (HPA): A Kubernetes feature that automatically adjusts the number of pods in a workload such as a Deployment or StatefulSet, based on observed CPU utilization (or other selected metrics). "Horizontal" means it scales by adding or removing instances (pods) of your application, rather than giving existing pods more resources.
Why? ⚖️ HPA helps your application handle varying loads efficiently. When traffic increases, it scales up (adds pods) to maintain performance and responsiveness. When traffic drops, it scales down (removes pods) to save computing resources and costs. Your app keeps up with demand without manual intervention.

🛠️ Prerequisites

Requirement	Status
Minikube cluster running	✅
Metrics Server enabled	✅ (we enabled this in Blog 1)
Angular app deployed and running	✅

⚙️ Step 1: Check Metrics Server is Running

Metrics Server provides resource usage data to HPA.

Run:

kubectl get pods -n kube-system | grep metrics-server

You should see a metrics-server pod in Running state.

If not running, enable it:

minikube addons enable metrics-server

⚙️ Step 2: Create the HPA Resource

Create a file angular-app-hpa.yaml:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: angular-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: angular-app-deployment
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 20

Explanation:

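scaleTargetRef points to the Deployment to scale; the name must match your Deployment's metadata.name
minReplicas and maxReplicas define the scaling limits
Target CPU utilization is set to 20% of the pods' CPU request, averaged across all pods (you can adjust this); a low value like this makes scaling easy to trigger in a demo
The containers in your Deployment must declare resources.requests.cpu; without a CPU request, HPA cannot compute utilization and the target shows as <unknown>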
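
To get a feel for how the controller turns these numbers into a pod count, here is a minimal Python sketch of the replica formula described in the Kubernetes documentation: desired = ceil(current replicas × current utilization / target utilization). The calculation is skipped when the ratio is within a tolerance (0.1 by default), and the result is clamped to minReplicas and maxReplicas. The function name and the sample utilization values are illustrative assumptions, and the real controller also handles details such as pod readiness and missing metrics.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas=1, max_replicas=5, tolerance=0.1):
    """Approximate the HPA replica calculation for a single CPU metric."""
    ratio = current_utilization / target_utilization
    # Within the tolerance band the controller leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Never go below minReplicas or above maxReplicas.
    return max(min_replicas, min(max_replicas, desired))


# Hypothetical readings against the 20% target from angular-app-hpa.yaml:
print(desired_replicas(1, 60, 20))  # 3: utilization is 3x the target
print(desired_replicas(2, 90, 20))  # 5: formula gives 9, capped at maxReplicas
print(desired_replicas(3, 21, 20))  # 3: within tolerance, no change
```

With a 20% target, even modest load pushes the ratio well above 1, which is why this demo scales out quickly.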

⚙️ Step 3: Apply the HPA Configuration

Apply the HPA YAML:

kubectl apply -f angular-app-hpa.yaml

Check HPA status (watch it update live):

kubectl get hpa -w

⚙️ Step 4: Generate Load to Trigger Scaling

To see autoscaling in action, you need some CPU load on your app.

Option A: Using the hey load testing tool (if installed)

Run:

hey -z 2m -q 5 -c 10 http://myapp.local/

-z 2m → run for 2 minutes
-q 5 → rate limit of 5 requests per second per worker
-c 10 → 10 concurrent workers (so up to 50 requests per second in total)
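
Because hey applies the -q limit to each worker rather than to the whole run, it helps to work out the total traffic these flags allow. The small Python sketch below does that arithmetic; the function name is made up for illustration, and the result is only an upper bound, since slow responses will reduce the actual rate.

```python
def hey_request_budget(workers, qps_per_worker, duration_seconds):
    """Return (max requests per second, max total requests) for a hey run."""
    max_rate = workers * qps_per_worker
    return max_rate, max_rate * duration_seconds


# Flags from the command above: -c 10 -q 5 -z 2m
rate, total = hey_request_budget(workers=10, qps_per_worker=5, duration_seconds=120)
print(rate, total)  # 50 6000
```

If pods do not scale, raising -c or -q is the simplest way to increase the load.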

Option B: Using a simple PowerShell loop

Open PowerShell and run:

for ($i = 0; $i -lt 1000; $i++) { Invoke-WebRequest -Uri http://myapp.local/ -UseBasicParsing | Out-Null }

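This sends 1000 HTTP requests one after another. Because the loop is sequential, it generates less load than hey and may not be enough on its own to trigger scaling.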

⚙️ Step 5: Watch Pods Scale

In a separate terminal, keep this running:

kubectl get pods -w

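You will see new pods being created as CPU usage climbs. After the load stops, pods are removed again, but not immediately: by default HPA waits for a 5-minute stabilization window before scaling down.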
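
If pods seem to linger after the load ends, that is expected. According to the Kubernetes documentation, when scaling down HPA looks back over a stabilization window (300 seconds by default) and uses the highest replica recommendation from that interval. Here is a rough Python sketch of that rule; the function name and the timeline values are hypothetical, chosen only to illustrate the behavior.

```python
def stabilized_replicas(history, now, window_seconds=300):
    """Pick the replica count for scale-down from recent recommendations.

    history: list of (timestamp_seconds, recommended_replicas) tuples,
             including the recommendation computed at `now`.
    """
    recent = [replicas for ts, replicas in history if now - ts <= window_seconds]
    if not recent:
        raise ValueError("no recommendations inside the stabilization window")
    # Scale-down follows the highest recommendation still inside the window.
    return max(recent)


# Hypothetical timeline: load stops just after t=0, raw recommendation drops to 1.
history = [(0, 5), (100, 1), (200, 1), (300, 1), (400, 1)]
print(stabilized_replicas(history[:4], now=300))  # 5: the t=0 peak is still in the window
print(stabilized_replicas(history, now=400))      # 1: the peak has aged out
```

This delay prevents pods from flapping up and down when traffic is bursty.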

🧪 Step 6: Check Pod CPU Usage

Run:

kubectl top pods

You’ll see CPU and memory usage per pod, which helps you understand the load.
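
The CPU column is reported in millicores (for example 30m), while the HPA target is a percentage of the CPU request. As a rough illustration, the Python sketch below converts one into the other by dividing total usage by total requests across pods. The usage and request values are invented for the example, not real output from this app, and the helper names are assumptions.

```python
def parse_millicores(cpu):
    """Convert a Kubernetes CPU quantity such as '30m' or '0.5' to millicores."""
    if cpu.endswith("m"):
        return int(cpu[:-1])
    return round(float(cpu) * 1000)


def cpu_utilization_percent(usages, requests):
    """Utilization as total usage over total requests, in percent."""
    total_usage = sum(parse_millicores(u) for u in usages)
    total_request = sum(parse_millicores(r) for r in requests)
    return 100 * total_usage / total_request


# Hypothetical: two pods using 30m and 10m, each requesting 50m of CPU.
print(cpu_utilization_percent(["30m", "10m"], ["50m", "50m"]))  # 40.0
```

A result of 40% against a 20% target means HPA would try to roughly double the number of pods.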

📝 Notes & Tips

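HPA depends on metrics-server for CPU stats. These are sampled periodically rather than in real time, so expect a short delay between a load spike and a scaling decision.
Raise averageUtilization to scale later and run fewer pods, or lower it to scale earlier.
HPA can also scale on other metrics (memory, custom metrics), but CPU is the most common.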

✅ Recap

You have successfully:

Learned what Horizontal Pod Autoscaler (HPA) is
Created and applied an HPA resource for your Angular app
Triggered load and watched Kubernetes scale pods up/down automatically