
Design 33: AKS Advanced (Private Cluster)

Summary

This design implements a Private AKS Cluster for high security.

Topology: The API Server has no Public IP. It is accessed via a Private Endpoint in the Spoke VNet. Admins must be in the Hub VNet (or connected via VPN) to run kubectl.

1. Key Design Decisions (ADR)

ADR-01: Privacy

  • Decision: Private Cluster.
  • Rationale: API Server endpoint is internal only. Reduces attack surface.

ADR-02: Egress

  • Decision: Route via Azure Firewall (in Hub).
  • Rationale: Inspect all outbound traffic from Pods (e.g., block crypto mining).

2. High-Level Design (HLD)

+--------------+           +--------------------------------+           +--------------+
|  Admin       |           |            HUB VNet            |           |  SPOKE VNet  |
|  (VPN)       |           |           (Firewall)           |           |  (AKS)       |
+------+-------+           +---------------+----------------+           +------+-------+
       |                                   |                                   |
       v                                   | (Peering)                         |
+------+-------+                           v                                   v
|  Jumpbox VM  |           +---------------+----------------+           +------+-------+
|  (10.0.1.5)  |---------->| Private DNS Zone               |<----------|  AKS API     |
+--------------+           | (privatelink.eastus.azmk8s.io) |           |  (Private)   |
                           +--------------------------------+           +------+-------+

3. Low-Level Design (LLD)

                               PRIMARY REGION (East US)
+-----------------------------------------------------------------------+
| HUB VNet: vnet-hub (10.0.0.0/16)                                      |
|   +-----------------------+                                           |
|   | Azure Firewall        |                                           |
|   +-----------|-----------+                                           |
|               |                                                       |
|               v (Peering)                                             |
+---------------|-------------------------------------------------------+
                |
+---------------|-------------------------------------------------------+
| SPOKE VNet: vnet-aks-private (10.1.0.0/16)                            |
|   +-----------------------+                                           |
|   | Subnet: AKS           |                                           |
|   | Route Table: 0/0->FW  |                                           |
|   | [AKS Nodes]           |                                           |
|   | [Private Endpoint]    |                                           |
|   +-----------------------+                                           |
+---------------|-------------------------------------------------------+
                |
                | (GitOps)
                v
+-----------------------------------------------------------------------+
| SECONDARY REGION (West US) - DR Site                                  |
|                                                                       |
|   +-----------------------+                                           |
|   | Private AKS (Standby) |                                           |
|   +-----------------------+                                           |
+-----------------------------------------------------------------------+

4. Component Rationale

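  • UDR (User Defined Route): A route table on the AKS subnet sends 0.0.0.0/0 to the Hub Firewall's private IP, so outbound traffic from Pods is inspected before it leaves Azure (see ADR-02).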
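A minimal sketch of why the single 0.0.0.0/0 route captures only internet-bound traffic: Azure picks the route with the longest matching prefix, so traffic inside the spoke or to the peered hub still follows the more specific system routes. The VNet prefixes come from the LLD; the firewall address 10.0.0.4 and the sample destinations are hypothetical placeholders.

```python
import ipaddress

# Effective routes as seen by the AKS subnet. VNet prefixes come from the LLD;
# the firewall address 10.0.0.4 is a hypothetical placeholder.
ROUTES = [
    ("10.1.0.0/16", "VnetLocal"),                # system route: the spoke itself
    ("10.0.0.0/16", "VNetPeering"),              # system route: the peered hub
    ("0.0.0.0/0", "VirtualAppliance 10.0.0.4"),  # UDR: default route to firewall
]


def next_hop(destination: str) -> str:
    """Return the next hop of the most specific route that contains destination."""
    ip = ipaddress.ip_address(destination)
    candidates = []
    for prefix, hop in ROUTES:
        network = ipaddress.ip_network(prefix)
        if ip in network:
            candidates.append((network.prefixlen, hop))
    # Longest prefix wins; 0.0.0.0/0 matches everything, so candidates is never empty.
    return max(candidates, key=lambda c: c[0])[1]


if __name__ == "__main__":
    for dest in ("10.1.1.5", "10.0.1.5", "20.50.60.70"):
        print(dest, "->", next_hop(dest))
```

Only the last destination, a public address, is sent to the firewall. Traffic between the spoke and the hub bypasses it unless more specific routes are added.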

5. Strategy: High Availability (HA)

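  • SLA: Uptime SLA (Paid) guarantees 99.95% availability of the Kubernetes API Server for clusters deployed across Availability Zones, as this one is (zones 1, 2, 3).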
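To make the 99.95% figure concrete, the following sketch converts an SLA percentage into a downtime budget. The 30-day month and 365-day year are assumptions chosen for illustration, not terms taken from the SLA itself.

```python
def allowed_downtime_minutes(sla_percent: float, days: int) -> float:
    """Minutes of downtime permitted by an SLA over a period of the given length."""
    total_minutes = days * 24 * 60
    return total_minutes * (100 - sla_percent) / 100


if __name__ == "__main__":
    # Assumed periods: a 30-day month and a 365-day year.
    print(f"per month: {allowed_downtime_minutes(99.95, 30):.1f} min")
    print(f"per year:  {allowed_downtime_minutes(99.95, 365):.1f} min")
```

Under these assumptions the budget is roughly 21.6 minutes per month, or about 4.4 hours per year, of API Server unavailability.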

6. Strategy: Disaster Recovery (DR)

  • Implementation: Multi-Cluster.
  • Process: Same as Design 22. Use GitOps to keep West US cluster in sync.

7. Strategy: Backup

  • Velero: Backup namespaces to Blob Storage.

8. Strategy: Security

  • API Access: The API Server is reachable only from the Spoke VNet and networks connected to it (the peered Hub VNet, VPN).
  • Pod Identity: Use Workload Identity (Federated) instead of secrets.

9. Well-Architected Framework Analysis

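  • Reliability: High. Nodes span Availability Zones 1, 2, 3, the Uptime SLA covers the API Server, and a standby cluster in West US covers a regional outage.
  • Security: Excellent. The API Server is fully private, and outbound traffic is inspected by Azure Firewall.
  • Cost Optimization: Medium. Uptime SLA costs ~$70/mo.
  • Operational Excellence: High. GitOps keeps the standby cluster in sync, and Azure Policy is enabled on the cluster.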
  • Performance Efficiency: High.

10. Detailed Traffic Flow

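1. Admin: Connects to the Hub over VPN.

2. Command: Runs kubectl get pods.

3. DNS: The Private DNS Zone resolves aks-prod.privatelink... to 10.1.1.5.

4. Connect: kubectl connects to the API Server at that private IP over HTTPS (TCP 443).

5. Success: The API Server returns the pod list.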
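Step 3 is the one that most often goes wrong, so it can help to check what a name actually resolves to. The sketch below classifies a resolved address against the spoke range from the LLD. The function names are illustrative, and the FQDN is left as a parameter because the full name is not given in this design.

```python
import ipaddress
import socket

SPOKE_VNET = ipaddress.ip_network("10.1.0.0/16")  # spoke range from the LLD


def classify(resolved_ip: str) -> str:
    """Say where a resolved API Server address lives."""
    ip = ipaddress.ip_address(resolved_ip)
    if ip in SPOKE_VNET:
        return "private endpoint in spoke"
    if ip.is_private:
        return "private, but outside the spoke"
    return "public"


def resolve_and_classify(fqdn: str) -> str:
    """Resolve fqdn with the local resolver and classify the first IPv4 answer."""
    infos = socket.getaddrinfo(fqdn, 443, family=socket.AF_INET, type=socket.SOCK_STREAM)
    return classify(infos[0][4][0])


if __name__ == "__main__":
    print(classify("10.1.1.5"))  # the address from step 3
```

An answer other than "private endpoint in spoke" from the Jumpbox suggests the Private DNS Zone is not linked to the network doing the lookup.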

11. Runbook: Deployment Guide (Azure Portal)

Phase 1: Create Spoke VNet

1. Search: "Virtual networks" -> + Create.

2. Resource Group: rg-aks-private.

3. Name: vnet-aks-private.

4. Region: East US.

5. Address space: 10.1.0.0/16 (as in the LLD). Subnet: snet-aks-nodes (Range 10.1.0.0/22).

6. Create.

7. Peer this VNet to your Hub VNet (vnet-hub).
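Peering fails if the two VNets have overlapping address spaces, so the address plan is worth checking before step 7. This is a small sketch using the ranges from the LLD; the check_plan helper is hypothetical, not an Azure tool.

```python
import ipaddress

# Ranges taken from the LLD and Phase 1.
HUB = ipaddress.ip_network("10.0.0.0/16")
SPOKE = ipaddress.ip_network("10.1.0.0/16")
AKS_SUBNET = ipaddress.ip_network("10.1.0.0/22")


def check_plan(hub, spoke, subnet) -> list:
    """Return a list of problems found in the address plan (empty means OK)."""
    problems = []
    if hub.overlaps(spoke):
        problems.append(f"{hub} overlaps {spoke}: the VNets cannot be peered")
    if not subnet.subnet_of(spoke):
        problems.append(f"{subnet} is not inside {spoke}")
    return problems


if __name__ == "__main__":
    issues = check_plan(HUB, SPOKE, AKS_SUBNET)
    print("address plan OK" if not issues else "\n".join(issues))
```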

Phase 2: Create Private AKS Cluster

1. Search: "Kubernetes services" -> + Create.

2. Basics:

* Resource Group: rg-aks-private.

* Cluster name: aks-private-prod.

* Region: East US.

* Availability zones: 1, 2, 3.

3. Networking:

* Network configuration: Azure CNI Node Subnet.

* DNS name prefix: aks-prod.

* Enable private cluster: Check this box. Critical.

* Private DNS zone: Default (Azure creates a new one) or select existing.

* Virtual network: vnet-aks-private.

* Cluster subnet: snet-aks-nodes.

4. Integrations:

* Azure Policy: Enable.

5. Create.
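With Azure CNI Node Subnet, every Pod takes an IP from snet-aks-nodes, so the /22 caps the cluster size. The sketch below estimates that cap. It assumes Azure reserves 5 addresses per subnet, each node pre-allocates one IP per possible Pod plus one for itself, one spare node is needed during upgrades, and one address goes to the Private Endpoint. The max-pods values are illustrative; use the value configured on your node pool.

```python
import ipaddress

AZURE_RESERVED_PER_SUBNET = 5  # addresses Azure keeps back in every subnet


def max_nodes(subnet_cidr: str, max_pods_per_node: int,
              surge_nodes: int = 1, other_ips: int = 1) -> int:
    """Estimate how many nodes fit in the subnet under the stated assumptions."""
    subnet = ipaddress.ip_network(subnet_cidr)
    usable = subnet.num_addresses - AZURE_RESERVED_PER_SUBNET - other_ips
    ips_per_node = max_pods_per_node + 1  # one IP per Pod slot plus the node itself
    return max(usable // ips_per_node - surge_nodes, 0)


if __name__ == "__main__":
    for pods in (30, 110):  # illustrative max-pods settings
        print(f"max pods {pods}: up to {max_nodes('10.1.0.0/22', pods)} nodes")
```

If the estimate is too low for the planned workload, widen the subnet in Phase 1 before creating the cluster, since resizing it later is disruptive.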

Phase 3: Link DNS to Hub

1. Go to Private DNS zones.

2. Find the zone named something like privatelink.eastus.azmk8s.io (Created by AKS).

3. Virtual network links -> + Add.

4. Link name: link-to-hub.

5. Virtual network: vnet-hub.

6. OK.

* Reason: The Jumpbox in the Hub must resolve the API Server's private FQDN to its private IP.

Phase 4: Verify Access (from Jumpbox)

1. Log in to a Windows/Linux VM in the Hub VNet.

2. Install the Azure CLI and kubectl (az aks install-cli installs kubectl), then sign in: az login.

3. Get Credentials:

* az aks get-credentials --resource-group rg-aks-private --name aks-private-prod

4. Run Command: kubectl get nodes.

* Success: You see the nodes.

* Failure: If you run this from your laptop (outside Azure and off the VPN), the connection will time out. This confirms the API Server is private.
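When kubectl hangs, a plain TCP probe separates a network problem from a credentials problem. This is a minimal sketch, assuming the API Server listens on TCP 443. The host is passed on the command line because the cluster's FQDN is not spelled out in this design.

```python
import socket
import sys


def can_reach(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and DNS resolution failures.
        return False


if __name__ == "__main__" and len(sys.argv) > 1:
    target = sys.argv[1]  # API Server FQDN or private IP, supplied by the caller
    print("reachable" if can_reach(target) else "not reachable")
```

Run from the Jumpbox, the probe should report "reachable". From a laptop outside the VNet and off the VPN it should report "not reachable".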