⚖️ Blog 5: Autoscaling Your Angular App with Kubernetes HPA (Horizontal Pod Autoscaler)
Welcome to Part 5 of our 7-blog DevOps series! So far, you’ve deployed your Angular app on Minikube and exposed it with Ingress. Now it’s time to make your app scale automatically with demand using Kubernetes' Horizontal Pod Autoscaler (HPA).
🤔 What Does This Term Mean and Why Use It?
- Horizontal Pod Autoscaler (HPA): A Kubernetes feature that automatically adjusts the number of pods in a Deployment (or another scalable workload, such as a ReplicaSet or StatefulSet) based on observed CPU utilization or other selected metrics. "Horizontal" means it scales by adding or removing instances (pods) of your application, rather than giving existing pods more CPU or memory.
- Why? ⚖️ HPA helps your application handle varying loads efficiently. When traffic increases, it scales up (adds pods) to maintain performance and responsiveness. When traffic drops, it scales down (removes pods) to save computing resources and costs. The replica count follows demand without manual intervention.
🛠️ Prerequisites
| Requirement | Status |
|---|---|
| Minikube cluster running | ✅ |
| Metrics Server enabled | ✅ (we enabled this in Blog 1) |
| Angular app deployed and running | ✅ |
⚙️ Step 1: Check Metrics Server is Running
Metrics Server provides resource usage data to HPA.
Run:
kubectl get pods -n kube-system | grep metrics-server
You should see a metrics-server pod in Running state.
If not running, enable it:
minikube addons enable metrics-server
⚙️ Step 2: Create the HPA Resource
Create a file angular-app-hpa.yaml:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: angular-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: angular-app-deployment
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 20
Explanation:
- scaleTargetRef points to your deployment
- minReplicas and maxReplicas define scaling limits
- Target CPU utilization is set to 20% of each pod's requested CPU (you can adjust this)
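To get a feel for how the 20% target turns into a pod count, here is a rough sketch of the calculation the HPA performs: desired replicas = ceil(current replicas × current utilization / target utilization), kept within minReplicas and maxReplicas. The function name `desired_replicas` and its argument order are made up for illustration, and the sketch deliberately ignores the controller's tolerance band, pod readiness handling, and stabilization windows.

```shell
# Simplified sketch of the HPA replica calculation (not the real controller):
#   desired = ceil(current_replicas * current_util / target_util)
# clamped to the range [min_replicas, max_replicas].
desired_replicas() {
  current_replicas=$1   # pods currently running
  current_util=$2       # observed average CPU utilization, in percent of the request
  target_util=$3        # averageUtilization from the HPA spec
  min_replicas=$4       # minReplicas from the HPA spec
  max_replicas=$5       # maxReplicas from the HPA spec

  # Integer ceiling division: ceil(a / b) == (a + b - 1) / b
  desired=$(( (current_replicas * current_util + target_util - 1) / target_util ))

  if [ "$desired" -lt "$min_replicas" ]; then
    desired=$min_replicas
  fi
  if [ "$desired" -gt "$max_replicas" ]; then
    desired=$max_replicas
  fi
  echo "$desired"
}
```

For example, `desired_replicas 1 65 20 1 5` prints `4`: one pod running at 65% against a 20% target gives ceil(3.25) = 4 pods. With two pods at 90%, the raw result would be 9, so the function returns the maxReplicas value of 5.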
⚙️ Step 3: Apply the HPA Configuration
Apply the HPA YAML:
kubectl apply -f angular-app-hpa.yaml
Check HPA status (watch it update live):
kubectl get hpa -w
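If the TARGETS column keeps showing `<unknown>` instead of a percentage, one common cause is that the containers in the Deployment have no CPU request, because a Utilization target is calculated as a percentage of the requested CPU. The helper below is a small sketch for checking that; the function names `cpu_requests` and `has_cpu_requests` are hypothetical, and the deployment name is the one used in the HPA above.

```shell
# Print the CPU requests of all containers in a Deployment, space separated.
# An empty result means no container declares a CPU request.
cpu_requests() {
  kubectl get deployment "$1" \
    -o jsonpath='{.spec.template.spec.containers[*].resources.requests.cpu}'
}

# Succeed if at least one container declares a CPU request.
# Simplification: for multi-container pods, every container needs a request.
has_cpu_requests() {
  [ -n "$(cpu_requests "$1")" ]
}
```

Run `cpu_requests angular-app-deployment` to see the values. If the output is empty, you could add a request with `kubectl set resources deployment angular-app-deployment --requests=cpu=100m`, where `100m` is only an illustrative value and the change triggers a new rollout of the pods.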
⚙️ Step 4: Generate Load to Trigger Scaling
To see autoscaling in action, you need some CPU load on your app.
Option A: Using hey Load Testing Tool (if installed)
Run:
hey -z 2m -q 5 -c 10 http://myapp.local/
- -z 2m → run for 2 minutes
- -q 5 → rate limit of 5 requests per second per worker
- -c 10 → 10 concurrent workers (so up to about 50 requests per second in total)
Option B: Using a simple PowerShell loop
Open PowerShell and run:
for ($i=0; $i -lt 1000; $i++) {Invoke-WebRequest http://myapp.local/}
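This sends 1000 HTTP requests one after another. Because the requests are sequential, the load is lighter than with hey; if the HPA does not react, try running the loop in several PowerShell windows at once.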
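If you work in a Bash-style shell rather than PowerShell and do not have hey installed, a loop around curl can play the same role. This is only a sketch: the function name `generate_load` and its parameters are invented for illustration, and it assumes curl is installed and that http://myapp.local/ resolves as set up earlier in the series.

```shell
# Send requests to a URL from several background workers in parallel.
# Usage: generate_load URL WORKERS REQUESTS_PER_WORKER
generate_load() {
  url=$1
  workers=$2
  per_worker=$3

  w=0
  while [ "$w" -lt "$workers" ]; do
    (
      # Each worker runs in its own subshell and sends its requests sequentially.
      i=0
      while [ "$i" -lt "$per_worker" ]; do
        curl -s -o /dev/null "$url"
        i=$((i + 1))
      done
    ) &
    w=$((w + 1))
  done

  # Block until every worker has finished.
  wait
}
```

Calling `generate_load http://myapp.local/ 10 100` sends 1000 requests in total, spread over 10 parallel workers, which puts more pressure on the app than a single sequential loop.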
⚙️ Step 5: Watch Pods Scale
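In a second terminal, keep this running:
kubectl get pods -w
You will see new pods being created as CPU usage climbs. After the load stops, the extra pods are removed again, but not immediately: by default the HPA waits through a 5-minute stabilization window before scaling down.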
🧪 Step 6: Check Pod CPU Usage
Run:
kubectl top pods
You’ll see CPU and memory usage per pod, which helps you understand the load. CPU is reported in millicores (for example, 30m is 3% of one core).
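The HPA compares these numbers against the CPU request, not against a whole core. As a rough sketch of that relationship, the hypothetical helper below reads `kubectl top pods --no-headers` output on standard input and prints the average CPU usage as a percentage of a request you pass in millicores. It assumes every pod has the same request and that all listed pods belong to your app.

```shell
# Read "NAME CPU MEMORY" lines (as printed by `kubectl top pods --no-headers`)
# and print average CPU usage as a whole-number percentage of the given request.
# Usage: kubectl top pods --no-headers | cpu_utilization_percent REQUEST_MILLICORES
cpu_utilization_percent() {
  request_millicores=$1
  awk -v req="$request_millicores" '
    {
      sub(/m$/, "", $2)   # strip the millicore suffix, e.g. "30m" -> "30"
      total += $2
      pods++
    }
    END {
      if (pods == 0 || req <= 0) { print 0; exit }
      printf "%d\n", (total / pods) * 100 / req
    }'
}
```

For instance, with two pods using 30m and 50m and a request of 100m, the helper prints `40`, which is above the 20% target, so you would expect the HPA to add pods.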
📝 Notes & Tips
- HPA depends on metrics-server to get real-time CPU stats.
- You can adjust averageUtilization to a higher or lower value depending on your needs.
- HPA can also scale on other metrics (memory, custom metrics) but CPU is most common.
✅ Recap
You have successfully:
- Learned what Horizontal Pod Autoscaler (HPA) is
- Created and applied an HPA resource for your Angular app
- Triggered load and watched Kubernetes scale pods up/down automatically