Design 29: Event Hubs (Big Data)

Summary

This design uses Azure Event Hubs for high-scale, Kafka-style streaming data ingestion.

Topology: The Event Hub Namespace is exposed inside the Spoke VNet through a Private Endpoint. The Spoke VNet peers with the Hub VNet, which provides DNS resolution for the private link zone.

1. Key Design Decisions (ADR)

ADR-01: Connectivity

  • Decision: Private Endpoint.
  • Rationale: Prevent public access to the data stream.

ADR-02: Geo-Recovery

  • Decision: Use Geo-Recovery Alias.
  • Rationale: Provides a single connection string (the alias) that stays the same across regions. Once a failover is initiated, the alias points to the secondary namespace, so clients need no reconfiguration.

2. High-Level Design (HLD)

+--------------+           +--------------------------+           +--------------+
|  Producer    |           |        HUB VNet          |           |  SPOKE VNet  |
|  (IoT Dev)   |           |      (DNS Resolver)      |           |  (Consumer)  |
+------+-------+           +------------+-------------+           +------+-------+
       |                                |                                |
       v                                | (Peering)                      |
+------+-------+                        v                                v
|  VPN Gateway |           +------------+-------------+           +------+-------+
|  (Ingress)   |---------->| Private DNS Zone         |<----------|  Function    |
+--------------+           | (privatelink.servicebus) |           |  (Reader)    |
                           +--------------------------+           +------+-------+
                                                                         |
                                                                         v
                                                                  +--------------+
                                                                  |  Event Hub   |
                                                                  |  (Namespace) |
                                                                  +--------------+

3. Low-Level Design (LLD)

                               PRIMARY REGION (East US)
+-----------------------------------------------------------------------+
| HUB VNet: vnet-hub (10.0.0.0/16)                                      |
|   +-----------------------+                                           |
|   | Private DNS Zone      |                                           |
|   +-----------|-----------+                                           |
|               |                                                       |
|               v (Peering)                                             |
+---------------|-------------------------------------------------------+
                |
+---------------|-------------------------------------------------------+
| SPOKE VNet: vnet-data-spoke (10.1.0.0/16)                             |
|   +-----------------------+       +-----------------------+           |
|   | Subnet: Workload      |       | Subnet: PrivateLink   |           |
|   | [Function App]        |------>| [Private Endpoint]    |           |
|   |                       |       | (10.1.1.5)            |           |
|   +-----------------------+       +-----------|-----------+           |
+-----------------------------------------------|-----------------------+
                                                |
                                                v
                                    +-----------------------+
                                    | Event Hub Namespace   |
                                    | (Primary)             |
                                    +-----------------------+

                                      |
                                      | (Metadata Sync)
                                      v

                               SECONDARY REGION (West US)
+-----------------------------------------------------------------------+
| DR SPOKE VNet                                                         |
|   +-----------------------+                                           |
|   | Event Hub Namespace   |                                           |
|   | (Secondary)           |                                           |
|   +-----------------------+                                           |
+-----------------------------------------------------------------------+

4. Component Rationale

  • Capture: Automatically writes all ingested events to Blob Storage or Data Lake Storage for long-term archival.

5. Strategy: High Availability (HA)

  • SLA: 99.95% (Standard).
  • Zones: Enable Availability Zones.

6. Strategy: Disaster Recovery (DR)

  • Implementation: Geo-Disaster Recovery.
  • Process:

* Pair the East Namespace with the West Namespace.

* Use the Alias Connection String in your app.

* If East fails, you initiate the failover (manually or through your own automation); Azure does not trigger it for you. The Alias then points to West.

* *Note: Data is NOT replicated; only metadata (Event Hub names, Consumer Groups) is synced. Events not yet consumed or captured from East are unavailable after failover.*
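
Because failover only helps clients that connect through the alias, a pre-deployment check on the connection string can catch an app that still targets the East namespace directly. The following is a minimal sketch using only the Python standard library; the policy name `send-only` and the key value are made-up placeholders, and the alias name is taken from the runbook (evh-alias-corp).

```python
from urllib.parse import urlparse


def parse_connection_string(conn_str: str) -> dict:
    """Split 'Key=Value;Key=Value' pairs. Values may themselves contain '='."""
    parts = {}
    for segment in conn_str.strip().split(";"):
        if not segment:
            continue
        key, _, value = segment.partition("=")  # split on the first '=' only
        parts[key.strip()] = value.strip()
    return parts


def endpoint_host(conn_str: str) -> str:
    """Return the host name from the Endpoint=sb://... entry."""
    endpoint = parse_connection_string(conn_str)["Endpoint"]
    return urlparse(endpoint).hostname


def uses_alias(conn_str: str, alias_name: str) -> bool:
    """True if the connection string targets the Geo-DR alias host."""
    expected = f"{alias_name}.servicebus.windows.net".lower()
    return endpoint_host(conn_str) == expected


# Hypothetical connection string; the key is a placeholder, not a real secret.
EXAMPLE = (
    "Endpoint=sb://evh-alias-corp.servicebus.windows.net/;"
    "SharedAccessKeyName=send-only;SharedAccessKey=abc123="
)
```

Calling `uses_alias(EXAMPLE, "evh-alias-corp")` returns True, while a connection string that names the primary namespace host returns False and should be corrected before go-live.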

7. Strategy: Backup

  • Capture: Event Hubs is a stream, not a store, so there is no separate backup feature. Use Event Hubs Capture to save data to Blob Storage; that archive is your backup.
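
To restore from this backup you first have to find the right blobs. As a rough sketch, assuming Capture is left on its default naming convention ({Namespace}/{EventHub}/{PartitionId}/{Year}/{Month}/{Day}/{Hour}/{Minute}/{Second}) and that the path timestamps are in UTC, the helper below builds the blob prefix for one partition and one hour. The namespace name in the example is a hypothetical stand-in for evh-ns-corp-[uniqueid].

```python
from datetime import datetime, timezone


def capture_prefix(namespace: str, event_hub: str, partition_id: int,
                   hour: datetime) -> str:
    """Blob name prefix covering one hour of captured data for one partition.

    Assumes the default Capture naming convention and UTC timestamps.
    """
    if hour.tzinfo is None:
        raise ValueError("pass a timezone-aware datetime to avoid local-time mix-ups")
    utc = hour.astimezone(timezone.utc)
    return f"{namespace}/{event_hub}/{partition_id}/{utc:%Y/%m/%d/%H}/"


# Hypothetical namespace name; 'telemetry' is the Event Hub from the runbook.
prefix = capture_prefix(
    "evh-ns-corp-example", "telemetry", 0,
    datetime(2024, 3, 5, 7, tzinfo=timezone.utc),
)
```

The resulting prefix can be passed to whatever blob listing tool you use to enumerate the capture files for that hour.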

8. Strategy: Security

  • SAS Tokens: Use Shared Access Signatures scoped to the minimum rights each client needs (Send only for producers, Listen only for consumers).
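
For clients that cannot hold the policy key itself, a short-lived token can be derived from it. The sketch below follows the documented SAS token layout (`sr`, `sig`, `se`, `skn`) using only the standard library. The policy name `send-only`, the key, and the namespace name are placeholders. Note that the Send or Listen right comes from the shared access policy the key belongs to, not from the token.

```python
import base64
import hashlib
import hmac
import time
from urllib.parse import quote, quote_plus


def generate_sas_token(resource_uri: str, policy_name: str, policy_key: str,
                       ttl_seconds: int = 3600, now=None) -> str:
    """Build a SAS token for an Event Hubs resource URI.

    `now` is injectable only so the output can be reproduced in tests.
    """
    issued_at = int(time.time()) if now is None else int(now)
    expiry = str(issued_at + ttl_seconds)
    encoded_uri = quote_plus(resource_uri)
    # The signature is an HMAC-SHA256 over "<url-encoded-uri>\n<expiry>".
    string_to_sign = f"{encoded_uri}\n{expiry}".encode("utf-8")
    digest = hmac.new(policy_key.encode("utf-8"), string_to_sign,
                      hashlib.sha256).digest()
    signature = quote(base64.b64encode(digest).decode("ascii"), safe="")
    return (f"SharedAccessSignature sr={encoded_uri}"
            f"&sig={signature}&se={expiry}&skn={policy_name}")


# Placeholder values for illustration only.
token = generate_sas_token(
    "https://evh-ns-corp-example.servicebus.windows.net/telemetry",
    "send-only", "not-a-real-key", ttl_seconds=3600, now=1700000000,
)
```

Scoping the resource URI to a single Event Hub (rather than the whole namespace) and keeping the lifetime short both limit the damage if a token leaks.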

9. Well-Architected Framework Analysis

  • Reliability: High.
  • Security: High.
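  • Cost Optimization: Medium. Throughput Units (TUs) are billed for as long as they are provisioned. Enable Auto-inflate to absorb spikes; it scales TUs up but not back down, so review the TU count after a spike.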
  • Operational Excellence: High.
  • Performance Efficiency: Excellent. The service is built for millions of events per second; actual throughput is bounded by the provisioned TUs and partition count.
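
To make the cost and performance trade-off concrete, the sketch below estimates how many TUs a workload needs. It assumes the published Standard-tier limits per TU (ingress up to 1 MB/s or 1,000 events/s, egress up to 2 MB/s or 4,096 events/s, whichever is reached first); confirm these against current Azure documentation before sizing a production namespace.

```python
import math

# Assumed per-TU limits for the Standard tier.
INGRESS_MB_PER_TU = 1.0
INGRESS_EVENTS_PER_TU = 1000
EGRESS_MB_PER_TU = 2.0
EGRESS_EVENTS_PER_TU = 4096


def required_throughput_units(ingress_mb_s: float, ingress_events_s: float,
                              egress_mb_s: float = 0.0,
                              egress_events_s: float = 0.0) -> int:
    """Smallest whole number of TUs that satisfies every limit at once."""
    needed = max(
        ingress_mb_s / INGRESS_MB_PER_TU,
        ingress_events_s / INGRESS_EVENTS_PER_TU,
        egress_mb_s / EGRESS_MB_PER_TU,
        egress_events_s / EGRESS_EVENTS_PER_TU,
    )
    return max(1, math.ceil(needed))  # a namespace always has at least 1 TU


# Small events: the event-count limit dominates, not the byte limit.
tus_for_chatty_devices = required_throughput_units(0.2, 5500)
```

In this example 5,500 small events per second need 6 TUs even though the byte rate alone would fit in 1, which is why batching small events on the producer side can lower cost.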

10. Detailed Traffic Flow

1. Producer: Sends the event to the alias FQDN (evh-alias-corp.servicebus.windows.net).

2. DNS: Resolves to the Private Endpoint IP in East US while East is the primary.

3. Ingest: Event Hub accepts message.

4. Capture: Saves copy to Blob Storage.

5. Consumer: Function App reads message.
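
Step 1 of this flow could look roughly like the sketch below, which assumes the `azure-eventhub` package (`pip install azure-eventhub`) and the alias connection string. The payload shape and the `deviceId`/`reading` field names are illustrative assumptions, not part of this design. The SDK import sits inside the function so the payload helper can be used and tested without the package installed.

```python
import json


def build_payload(device_id: str, reading: float) -> str:
    """Serialize one telemetry reading as JSON (hypothetical schema)."""
    return json.dumps({"deviceId": device_id, "reading": reading})


def send_events(conn_str: str, eventhub_name: str, payloads: list) -> None:
    """Send the payloads as one batch to the Event Hub.

    conn_str should be the Geo-DR alias connection string so the producer
    follows a failover without a config change.
    """
    # Imported lazily; requires the azure-eventhub package.
    from azure.eventhub import EventData, EventHubProducerClient

    producer = EventHubProducerClient.from_connection_string(
        conn_str, eventhub_name=eventhub_name
    )
    with producer:
        batch = producer.create_batch()
        for payload in payloads:
            # add() raises ValueError when the batch is full; a production
            # sender would send the batch and start a new one at that point.
            batch.add(EventData(payload))
        producer.send_batch(batch)
```

A caller would read the connection string from a secret store or environment variable and call `send_events(conn_str, "telemetry", [build_payload("dev-1", 21.5)])` from a host that can reach the Private Endpoint.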

11. Runbook: Deployment Guide (Azure Portal)

Phase 1: Create Namespace

1. Search: "Event Hubs" -> + Create.

2. Resource Group: rg-data-spoke.

3. Namespace name: evh-ns-corp-[uniqueid].

4. Location: East US.

5. Pricing tier: Standard (the minimum tier for Private Link, Capture, and Geo-recovery; Basic does not support them).

6. Throughput Units: 1.

7. Create.

Phase 2: Create Event Hub

1. Go to the new Namespace -> Event Hubs (Left Menu) -> + Event Hub.

2. Name: telemetry.

3. Partition count: 2 (Default) or 4.

4. Message retention: 1 day (Default).

5. Capture: Off (Enable later if needed).

6. Create.

Phase 3: Enable Geo-Recovery (DR)

1. Go to Namespace -> Geo-recovery.

2. Initiate pairing.

3. Subscription: Select yours.

4. Secondary namespace:

* Select Create new.

* Name: evh-ns-corp-dr-[uniqueid].

* Location: West US.

5. Alias: evh-alias-corp.

6. Create.

* *Note: You now use the Alias connection string in your apps.*

Phase 4: Private Endpoint

1. Go to the Primary Namespace (evh-ns-corp...).

2. Networking.

3. Public network access: Select Disabled.

4. Private endpoint connections -> + Private endpoint.

5. Name: pe-evh.

6. Resource Group: rg-data-spoke.

7. Target sub-resource: namespace.

8. Virtual Network: vnet-data-spoke.

9. Subnet: snet-privatelink.

10. Integrate with private DNS zone: Yes.

* Zone: privatelink.servicebus.windows.net.

11. Create.

Phase 5: Verify

1. Log in to a VM in the Spoke.

2. Nslookup: nslookup evh-ns-corp-[uniqueid].servicebus.windows.net.

* Result should be 10.1.x.x (Private IP).

3. Send Event:

* Go to Portal -> Event Hub telemetry -> Data Explorer (Preview).

* Send events.

* Type {"test": "message"} -> Send.

* View events in the Events tab.
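
The nslookup check in step 2 can also be scripted, for example as part of a post-deployment smoke test run from the Spoke VM. This is a minimal standard-library sketch; the 10.1.0.0/16 range comes from the LLD, and the host name passed in would be your own namespace FQDN.

```python
import ipaddress
import socket

# Address space of vnet-data-spoke from the LLD.
SPOKE_CIDR = ipaddress.ip_network("10.1.0.0/16")


def is_in_spoke(ip: str, network=SPOKE_CIDR) -> bool:
    """True if the address falls inside the Spoke VNet range."""
    return ipaddress.ip_address(ip) in network


def resolve_and_check(hostname: str):
    """Resolve the namespace FQDN and report whether it maps to a Spoke IP.

    Returns (ip, ok). A public IP here usually means the private DNS zone
    is not linked to the VNet this machine resolves through.
    """
    ip = socket.gethostbyname(hostname)
    return ip, is_in_spoke(ip)
```

For example, `is_in_spoke("10.1.1.5")` is True for the Private Endpoint address shown in the LLD, while any public address returns False.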