Design 16: VMSS Auto-scaling (Scale Sets)

Summary

This design deploys an Azure Virtual Machine Scale Set (VMSS) behind a public load balancer.

Topology: The VMSS runs in a Spoke VNet. It scales out and in based on average CPU load.

1. Key Design Decisions (ADR)

ADR-01: Orchestration

Decision: Flexible Orchestration.
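Rationale: Flexible orchestration lets one scale set mix standard (on-demand) VMs and Spot VMs, and spreads instances across availability zones and fault domains for high availability.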

2. High-Level Design (HLD)

+--------------+           +--------------------------+           +--------------+
|  Internet    |           |        HUB VNet          |           |  SPOKE VNet  |
|  (Users)     |           |      (Firewall)          |           |  (Compute)   |
+------+-------+           +------------+-------------+           +------+-------+
       |                                |                                |
       v                                | (Peering)                      |
+------+-------+                        v                                v
|  Public      |           +------------+-------------+           +------+-------+
|  Load        |---------->| Azure Firewall           |<--------->|  VM Scale    |
|  Balancer    |           | (Egress/Mgmt)            |           |  Set         |
+--------------+           +--------------------------+           +------+-------+

3. Low-Level Design (LLD)

                               PRIMARY REGION (East US)
+-----------------------------------------------------------------------+
| HUB VNet: vnet-hub (10.0.0.0/16)                                      |
|   [Azure Firewall]                                                    |
+---------------|-------------------------------------------------------+
                | (Peering)
+---------------|-------------------------------------------------------+
| SPOKE VNet: vnet-scale-spoke (10.1.0.0/16)                            |
|   +-----------------------+                                           |
|   | Subnet: Compute       |                                           |
|   | [Public LB]           |                                           |
|   |   v                   |                                           |
|   | [VMSS]                |                                           |
|   |   [VM 1] [VM 2] ...   |                                           |
|   +-----------------------+                                           |
+-----------------------------------------------------------------------+
                               SECONDARY REGION (West US)
+-----------------------------------------------------------------------+
| DR STRATEGY                                                           |
|   +-----------------------+                                           |
|   | VMSS (DR)             |                                           |
|   | (Min Count: 0)        |                                           |
|   +-----------------------+                                           |
+-----------------------------------------------------------------------+

4. Component Rationale

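Autoscale: Adds one VM when average CPU across the scale set stays above 75% for 10 minutes. Removes one when it stays below 25% for 10 minutes. The instance count stays between 1 and 10.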
Public Load Balancer: Distributes inbound traffic to the VMSS instances.
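
The autoscale rule above can be sketched as a single decision function. This is a minimal illustration, not Azure's implementation: the names AutoscalePolicy and desired_count are hypothetical, and the thresholds, step, and bounds are the ones used in this design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AutoscalePolicy:
    # Values come from this design; the class itself is illustrative.
    scale_out_above: float = 75.0  # average CPU % that triggers scale out
    scale_in_below: float = 25.0   # average CPU % that triggers scale in
    minimum: int = 1
    maximum: int = 10
    step: int = 1                  # instances added or removed per evaluation


def desired_count(current: int, avg_cpu: float,
                  policy: AutoscalePolicy = AutoscalePolicy()) -> int:
    """Return the instance count after one rule evaluation."""
    if avg_cpu > policy.scale_out_above:
        target = current + policy.step
    elif avg_cpu < policy.scale_in_below:
        target = current - policy.step
    else:
        target = current
    # Clamp to the configured minimum and maximum.
    return max(policy.minimum, min(policy.maximum, target))
```

For example, desired_count(2, 80.0) returns 3, while desired_count(1, 10.0) stays at 1 because the minimum is already reached.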

5. Strategy: High Availability (HA)

Zones: Spread across Zones 1, 2, 3.
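
To reason about what a zone outage costs, the sketch below spreads instances evenly over the three zones and reports how many survive the loss of the fullest zone. It assumes a perfectly even spread, which is a simplification (actual placement is best effort); the function names are hypothetical.

```python
def spread_across_zones(instance_count, zones=("1", "2", "3")):
    """Distribute instances evenly; leftover instances go to the first zones."""
    base, extra = divmod(instance_count, len(zones))
    return {zone: base + (1 if i < extra else 0) for i, zone in enumerate(zones)}


def survivors_after_worst_zone_loss(instance_count, zones=("1", "2", "3")):
    """Instances left if the zone holding the most instances fails."""
    placement = spread_across_zones(instance_count, zones)
    return instance_count - max(placement.values())
```

With the initial count of 2, one zone is empty and a single zone failure leaves 1 instance. At the maximum of 10, the spread is 4/3/3 and the worst case leaves 6.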

6. Strategy: Disaster Recovery (DR)

Implementation: Active-Passive.
Process: Deploy a VMSS in West US with capacity 0. Scale it out when a disaster is declared.

7. Strategy: Backup

Image: Back up the custom image stored in Azure Compute Gallery.

8. Strategy: Security

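NSG: Allow inbound TCP 80 (the load-balanced port) and health probes from the AzureLoadBalancer service tag. Deny all other inbound traffic.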
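
NSG rules are evaluated in priority order (lowest number first) and the first match decides. The sketch below models that behavior for the inbound policy described here. The rule set, priorities, and the simplified string matching on source labels are assumptions for illustration, not an export of a real NSG.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    priority: int         # lower number is evaluated first
    source: str           # simplified: a service-tag-like label, or "*" for any
    port: Optional[int]   # None matches any destination port
    allow: bool


# Hypothetical inbound rules for the Compute subnet.
RULES = [
    Rule(100, "Internet", 80, True),              # client traffic on the load-balanced port
    Rule(110, "AzureLoadBalancer", None, True),   # health probes
    Rule(4096, "*", None, False),                 # explicit deny for everything else
]


def is_allowed(source: str, port: int, rules=RULES) -> bool:
    """Return the decision of the first matching rule, in priority order."""
    for rule in sorted(rules, key=lambda r: r.priority):
        source_ok = rule.source in ("*", source)
        port_ok = rule.port is None or rule.port == port
        if source_ok and port_ok:
            return rule.allow
    return False  # nothing matched: treat as denied
```

Here is_allowed("Internet", 80) is True, while is_allowed("Internet", 22) falls through to the deny rule.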

9. Well-Architected Framework Analysis

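Reliability: High. Instances are spread across three availability zones, with a standby scale set in West US for DR.
Security: High. The NSG restricts inbound traffic, and egress and management traffic pass through Azure Firewall in the hub.
Cost Optimization: High. Autoscale removes idle instances, Spot VMs can be mixed in, and the DR scale set idles at zero capacity.
Operational Excellence: High. Scaling is rule-driven and needs no manual intervention.
Performance Efficiency: High. Capacity follows CPU load between 1 and 10 instances.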

10. Detailed Traffic Flow

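1. Load: Inbound traffic spikes.

2. Metric: Average CPU across the scale set rises to 80%, above the 75% scale-out threshold.

3. Scale Out: Once the threshold has held for the 10-minute rule duration, autoscale adds 1 new VM (the "Increase by 1" rule). This repeats while CPU stays above 75%, up to the maximum of 10.

4. LB: The new instance joins the backend pool bep-vmss and receives traffic once the health probe succeeds.

5. Serve: Traffic is distributed across all healthy instances.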
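
The flow above can be replayed as a small time-stepped simulation. It assumes a fixed total CPU demand shared evenly by all instances, so the average falls as instances are added, and it treats each loop iteration as one 10-minute evaluation window. The function simulate_spike and the demand figures are hypothetical.

```python
def simulate_spike(total_demand, start=2, windows=4, minimum=1, maximum=10,
                   out_above=75.0, in_below=25.0, step=1):
    """Return (instance_count, average_cpu) observed in each evaluation window.

    total_demand is the summed CPU percentage the fleet must absorb,
    e.g. 160.0 means two instances would each run at 80%.
    """
    count = start
    timeline = []
    for _ in range(windows):
        avg_cpu = min(100.0, total_demand / count)  # a VM cannot exceed 100%
        timeline.append((count, round(avg_cpu, 1)))
        if avg_cpu > out_above and count < maximum:
            count += step   # scale out by one instance
        elif avg_cpu < in_below and count > minimum:
            count -= step   # scale in by one instance
    return timeline
```

With simulate_spike(160.0), the first window shows 2 instances at 80.0%, after which the count settles at 3 instances at 53.3%, inside the 25% to 75% band.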

11. Runbook: Deployment Guide (Azure Portal)

Phase 1: Create Spoke VNet

1. Create Resource Group: rg-design16-vmss. Region: East US.

2. Create VNet: vnet-scale-spoke (10.1.0.0/16) with subnet snet-compute (10.1.1.0/24).

3. Peer to vnet-hub.
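
Before peering, it can help to confirm that the address plan is consistent: peered VNets must not overlap, and the subnet must sit inside the spoke. The check below uses Python's standard ipaddress module with the ranges from this design. The helper names are hypothetical, and the usable-address count assumes Azure's reservation of 5 addresses per subnet.

```python
import ipaddress

hub = ipaddress.ip_network("10.0.0.0/16")      # vnet-hub
spoke = ipaddress.ip_network("10.1.0.0/16")    # vnet-scale-spoke
compute = ipaddress.ip_network("10.1.1.0/24")  # snet-compute


def validate_plan(hub_net, spoke_net, subnet):
    """Return a list of problems found in the address plan (empty if none)."""
    problems = []
    if hub_net.overlaps(spoke_net):
        problems.append("hub and spoke address spaces overlap")
    if not subnet.subnet_of(spoke_net):
        problems.append("subnet is outside the spoke address space")
    return problems


def usable_addresses(subnet, reserved=5):
    """Addresses left for workloads after the platform-reserved ones."""
    return subnet.num_addresses - reserved
```

For this design, validate_plan(hub, spoke, compute) returns an empty list, and usable_addresses(compute) is 251, well above the scale set maximum of 10.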

Phase 2: Create Public Load Balancer

1. Search: "Load Balancers" -> + Create.

2. Name: lb-vmss.

3. Type: Public.

4. SKU: Standard.

5. Frontend IP: Create new pip-lb-vmss.

6. Create.

7. Backend Pool: Go to LB -> Backend pools -> + Add. Name: bep-vmss. Save.

8. Load Balancing Rule:

* Name: rule-http.

* Frontend: pip-lb-vmss.

* Backend pool: bep-vmss.

* Protocol: TCP. Port: 80. Backend Port: 80.

* Probe: Create new probe-http (Port 80).

* Save.
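
Azure Load Balancer distributes flows with a five-tuple hash by default, so packets of one connection keep landing on the same instance. The sketch below illustrates the idea only: it uses SHA-256 rather than Azure's actual hash, and pick_backend and the sample addresses are hypothetical.

```python
import hashlib


def pick_backend(flow, backends):
    """Map a flow to one backend with a stable hash.

    flow is (source_ip, source_port, destination_ip, destination_port, protocol).
    backends is the list of instances currently passing the health probe.
    """
    key = "|".join(str(part) for part in flow).encode()
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes of the digest as an integer index into the pool.
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]
```

Calling pick_backend twice with the same flow and pool returns the same instance; a new source port (a new connection) may hash to a different one.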

Phase 3: Create VM Scale Set

1. Search: "Virtual machine scale sets" -> + Create.

2. Resource Group: rg-design16-vmss.

3. Name: vmss-web.

4. Region: East US.

5. Availability Zones: 1, 2, 3.

6. Orchestration: Flexible.

7. Image: Ubuntu Server 20.04 LTS.

8. Networking:

* Virtual network: vnet-scale-spoke. Subnet: snet-compute.

* Network interface: Edit.

* Load balancing options: Azure load balancer.

* Select load balancer: lb-vmss.

* Select backend pool: bep-vmss.

9. Scaling:

* Initial instance count: 2.

* Scaling policy: Custom.

* Minimum: 1. Maximum: 10.

* Scale out: CPU threshold 75%, Duration 10 min, Increase by 1.

* Scale in: CPU threshold 25%, Duration 10 min, Decrease by 1.

10. Create.

Phase 4: Stress Test

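1. Connect: SSH into each VMSS instance (via Bastion, or a public IP if allowed). Autoscale evaluates average CPU across the scale set, so loading only one of two instances may not cross the 75% threshold.

2. Install Stress: sudo apt-get update && sudo apt-get install -y stress.

3. Run Stress: stress --cpu 8 --timeout 900. The run should outlast the 10-minute scale-out duration; 600 seconds leaves no margin.

4. Monitor: Go to VMSS -> Monitoring -> Metrics. Watch "Percentage CPU" spike.

5. Verify Scale: After ~10 mins, check Instances. The instance count should increase by 1.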
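
When planning the stress test, it is worth estimating the fleet-wide average first, assuming the scale-out rule compares the average "Percentage CPU" across all instances with the 75% threshold. The helpers and the per-instance readings below are hypothetical.

```python
def fleet_average(per_instance_cpu):
    """Average CPU percentage across all instances in the scale set."""
    return sum(per_instance_cpu) / len(per_instance_cpu)


def would_scale_out(per_instance_cpu, threshold=75.0):
    """True if the fleet-wide average exceeds the scale-out threshold."""
    return fleet_average(per_instance_cpu) > threshold
```

With two instances, stressing only one (readings of 100.0 and 5.0) gives an average of 52.5, so would_scale_out returns False. Stressing both (100.0 and 100.0) returns True.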