Description
Key Focus Areas:
Enterprise Cloud Infrastructure Architecture
Hub-Spoke Network Segmentation
Infrastructure-as-Code Automation
High Availability & Disaster Recovery
Modular Terraform Architecture
Private Connectivity & Zero Trust
Executive Summary
Architected a scalable enterprise cloud infrastructure platform on Microsoft Azure combining hub-spoke network topology, three-tier application segmentation across dedicated spoke VNets, VM Scale Set-based web tier compute, private endpoint database connectivity, modular Terraform Infrastructure-as-Code automation, centralised security governance, and Azure Site Recovery disaster recovery.
The architecture establishes a modular, reusable enterprise infrastructure framework — designed as a deployable blueprint that can be consistently reproduced across environments (development, staging, production) through parameterised Terraform modules, eliminating the deployment inconsistency and configuration drift that manual portal-based provisioning creates at scale.
The design is differentiated from pure security architecture studies by its primary focus on IaC-driven infrastructure governance — demonstrating how Terraform modular architecture, remote state management, and existing resource import workflows combine with enterprise network and security design to create a maintainable, auditable, and consistently deployable cloud infrastructure platform.
Business Drivers
Organisations deploying enterprise workloads in Azure frequently encounter governance and consistency challenges caused by fragmented, manual infrastructure deployment practices — each environment provisioned differently, security controls applied inconsistently, and no auditable record of infrastructure configuration decisions.
This architecture was designed to address the enterprise cloud infrastructure requirements of organisations where existing practices result in:
Inconsistent infrastructure deployments across development, staging, and production environments — configuration drift creating security gaps and operational unpredictability
Weak segmentation between application tiers — flat network architectures allowing lateral movement between web, application, and database workloads
No Infrastructure-as-Code governance — manual portal deployments with no version control, no deployment history, and no consistent rollback capability
Insufficient disaster recovery preparedness — workloads deployed without cross-region replication or tested recovery plans
Operational inefficiencies from manual provisioning — repeated manual deployment effort for each environment with no reusable infrastructure components
Difficulty maintaining deployment consistency across teams — different engineers applying different configurations to nominally identical environments
Operational Constraints
The architecture was designed to operate within the following constraints typical of enterprise Azure deployments:
Enterprise applications require clear separation between presentation, application, and data layers — both for security isolation and operational maintainability
Network segmentation must provide centralised governance without creating operational complexity that degrades deployment velocity
Web tier compute must scale elastically under variable traffic — manual capacity management is not operationally viable
Security controls must be consistently enforced across all environments — security cannot be environment-specific
Existing Azure resources deployed manually must be importable into Terraform governance — greenfield-only IaC adoption is rarely realistic
Disaster recovery must leverage Azure-native services — no third-party DR tooling dependency
Terraform codebase must be modular and maintainable — monolithic configurations become unmanageable as infrastructure complexity grows
Objectives
Design a scalable multi-tier enterprise cloud architecture with dedicated spoke VNets per application tier
Implement hub-spoke network topology with Azure Firewall as the centralised traffic inspection and governance control plane
Enable elastic and highly available web tier compute through Azure VM Scale Sets with demand-driven autoscaling
Enforce private-only database connectivity through Azure SQL private endpoints — no public database access
Implement secure secret and credential management through Azure Key Vault with Managed Identity authentication
Deliver cross-region disaster recovery capability through Azure Site Recovery replication and recovery plans
Build a modular, parameterised Terraform codebase enabling consistent multi-environment deployment
Integrate existing Azure resources into Terraform state governance through import workflows
Establish centralised operational monitoring through Azure Monitor and Log Analytics
Architecture Principles
Segmentation by design — web, application, and database tiers isolated in dedicated spoke VNets, not just subnets
Centralised network governance — all inter-spoke and external traffic routed through hub Azure Firewall inspection
Modular and reusable infrastructure components — Terraform child modules encapsulating each infrastructure domain
Secure-by-default deployment — no public exposure of internal or database tier resources
Separation of infrastructure responsibilities — network, compute, database, and security managed as independent Terraform modules
Elastic scalability — web tier scales horizontally through VMSS without manual intervention
Identity-driven authentication — Managed Identity eliminating credential management for service-to-service communication
Infrastructure automation and repeatability — all resources defined as code, deployed consistently across environments
Built-in resilience — cross-region ASR replication as a baseline requirement, not a post-deployment addition
Architecture Overview
The solution is structured as a nine-layer enterprise cloud infrastructure platform integrating hub networking, spoke-based tier segmentation, scalable web compute, an isolated application tier, database protection, security governance, disaster recovery, Terraform automation, and observability.
1. Hub Network Layer
The hub VNet serves as the centralised governance and shared services layer — the single network control plane through which all inter-spoke and external traffic flows.
Hub VNet Subnet Architecture:
| Subnet | Purpose | Contents |
|---|---|---|
| AzureFirewallSubnet | Azure Firewall deployment | Azure Firewall instance |
| GatewaySubnet | VPN/ExpressRoute connectivity | Virtual Network Gateway |
| ManagementSubnet | Shared management services | Jump servers (Azure Bastion, if deployed, needs its own dedicated AzureBastionSubnet) |
| SharedServicesSubnet | Shared platform services | DNS, monitoring agents |
Azure Firewall — Centralised Traffic Inspection:
Application rule collections enforcing FQDN-based outbound access control for all spoke workloads
Network rule collections governing IP and port-based traffic flows between spokes and external destinations
DNAT rules for controlled inbound application traffic from internet to web tier
Centralised firewall policy management enabling consistent rule enforcement across the hub
Diagnostic logging to Log Analytics for traffic visibility and threat correlation
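The rule collection types listed above could be expressed in Terraform roughly as follows. This is a minimal sketch, not the project's actual policy: the variable names, address ranges, priorities, and the FQDNs in the application rule are all illustrative assumptions.

```hcl
# Hypothetical inputs: names and address ranges are illustrative only.
variable "firewall_policy_id" { type = string }

variable "app_spoke_cidr" {
  type    = string
  default = "10.2.0.0/16"
}

variable "db_spoke_cidr" {
  type    = string
  default = "10.3.0.0/16"
}

resource "azurerm_firewall_policy_rule_collection_group" "spoke_rules" {
  name               = "rcg-spoke-traffic"
  firewall_policy_id = var.firewall_policy_id
  priority           = 200

  # Network rule: IP and port-based flow from the app spoke to the db spoke.
  network_rule_collection {
    name     = "allow-app-to-db"
    priority = 210
    action   = "Allow"

    rule {
      name                  = "app-to-sql"
      protocols             = ["TCP"]
      source_addresses      = [var.app_spoke_cidr]
      destination_addresses = [var.db_spoke_cidr]
      destination_ports     = ["1433"]
    }
  }

  # Application rule: FQDN-based outbound access for spoke workloads.
  application_rule_collection {
    name     = "allow-os-updates"
    priority = 220
    action   = "Allow"

    rule {
      name              = "ubuntu-updates"
      source_addresses  = [var.app_spoke_cidr]
      destination_fqdns = ["archive.ubuntu.com", "security.ubuntu.com"]

      protocols {
        type = "Https"
        port = 443
      }
    }
  }
}
```

Keeping the rules in a firewall policy rather than on the firewall resource itself is what allows the same rule set to be attached consistently wherever the policy is referenced.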
Hub-Spoke VNet Peering:
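Each spoke VNet peered to the hub, with AllowGatewayTransit set on the hub-side peering and UseRemoteGateways set on the spoke-side peering so spokes can use the hub's Virtual Network Gateway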
All spoke-to-spoke traffic routed through the hub firewall — spokes cannot communicate directly
User-Defined Routes (UDRs) in each spoke subnet forcing default route (0.0.0.0/0) to Azure Firewall private IP
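The peering and forced-routing configuration described above might look like the following for a single spoke. It is a sketch under stated assumptions: every variable is a hypothetical input, both VNets are assumed to live in one resource group, and the hub is assumed to already host a Virtual Network Gateway (otherwise `use_remote_gateways` must stay false).

```hcl
# Hypothetical inputs for one hub/spoke pair.
variable "resource_group_name" { type = string }
variable "location" { type = string }
variable "hub_vnet_name" { type = string }
variable "hub_vnet_id" { type = string }
variable "spoke_vnet_name" { type = string }
variable "spoke_vnet_id" { type = string }
variable "spoke_subnet_id" { type = string }
variable "firewall_private_ip" { type = string }

# Hub side: offers its gateway to the spoke.
resource "azurerm_virtual_network_peering" "hub_to_spoke" {
  name                      = "peer-hub-to-${var.spoke_vnet_name}"
  resource_group_name       = var.resource_group_name
  virtual_network_name      = var.hub_vnet_name
  remote_virtual_network_id = var.spoke_vnet_id
  allow_forwarded_traffic   = true
  allow_gateway_transit     = true
}

# Spoke side: consumes the hub gateway and accepts firewall-forwarded traffic.
resource "azurerm_virtual_network_peering" "spoke_to_hub" {
  name                      = "peer-${var.spoke_vnet_name}-to-hub"
  resource_group_name       = var.resource_group_name
  virtual_network_name      = var.spoke_vnet_name
  remote_virtual_network_id = var.hub_vnet_id
  allow_forwarded_traffic   = true
  use_remote_gateways       = true
}

# UDR: send the default route to the Azure Firewall private IP.
resource "azurerm_route_table" "spoke" {
  name                = "rt-${var.spoke_vnet_name}"
  location            = var.location
  resource_group_name = var.resource_group_name

  route {
    name                   = "default-via-firewall"
    address_prefix         = "0.0.0.0/0"
    next_hop_type          = "VirtualAppliance"
    next_hop_in_ip_address = var.firewall_private_ip
  }
}

resource "azurerm_subnet_route_table_association" "spoke" {
  subnet_id      = var.spoke_subnet_id
  route_table_id = azurerm_route_table.spoke.id
}
```

No peering is created between spokes, so the firewall is the only next hop available for spoke-to-spoke traffic.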
2. Spoke Network Architecture
Three dedicated spoke VNets provide full VNet-level isolation between application tiers — stronger isolation than subnet-level separation within a shared VNet.
Why VNet-per-Tier over Subnet-per-Tier: Within a shared VNet, Azure's default system routes deliver traffic between subnets directly, so it bypasses the hub firewall unless every subnet carries UDRs that override the intra-VNet route. VNet-per-tier architecture, with no direct spoke-to-spoke peering, forces all inter-tier traffic through hub firewall inspection, ensuring every database query from the application tier to the database tier passes through centralised traffic governance and logging.
| Spoke VNet | Tier | Traffic Governance | Public Exposure |
|---|---|---|---|
| web-spoke-vnet | Presentation | Hub firewall + NSG | Controlled inbound through Load Balancer |
| app-spoke-vnet | Application | Hub firewall + NSG | None (internal only) |
| db-spoke-vnet | Database | Hub firewall + NSG | None (private endpoint only) |
3. Web Tier
The presentation layer supports scalable, highly available public-facing services through Azure Load Balancer and VM Scale Sets.
Azure Load Balancer:
Public Load Balancer distributing inbound HTTP/HTTPS traffic across VMSS instances
Health probes monitoring individual instance availability — unhealthy instances removed from rotation automatically
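Load balancing rules distributing traffic across available instances using Azure Load Balancer's default five-tuple hash distribution (source IP, source port, destination IP, destination port, protocol)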
Inbound NAT rules for controlled administrative access to specific instances when required
Azure VM Scale Sets (VMSS) — Web Tier Compute:
| Autoscaling Parameter | Configuration | Rationale |
|---|---|---|
| Scale-out metric | CPU percentage > 75% for 5 minutes | Responsive scaling under sustained load |
| Scale-out increment | +2 instances | Avoids single-instance increments that thrash under sustained load |
| Scale-in metric | CPU percentage < 25% for 15 minutes | Conservative scale-in preventing premature deallocation |
| Scale-in increment | -1 instance | Gradual scale-in reducing disruption risk |
| Minimum instances | 2 | Ensures availability zone redundancy at minimum capacity |
| Maximum instances | 10 | Cost ceiling preventing unbounded scaling |
| Cooldown period | 5 minutes | Prevents rapid successive scaling actions |
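The autoscaling parameters in the table map fairly directly onto an `azurerm_monitor_autoscale_setting` resource. The sketch below assumes the scale set already exists and is passed in by ID; the variable names and resource name are hypothetical.

```hcl
# Hypothetical inputs: the scale set is assumed to be created elsewhere.
variable "resource_group_name" { type = string }
variable "location" { type = string }
variable "vmss_id" { type = string }

resource "azurerm_monitor_autoscale_setting" "web" {
  name                = "autoscale-web-vmss"
  resource_group_name = var.resource_group_name
  location            = var.location
  target_resource_id  = var.vmss_id

  profile {
    name = "cpu-based"

    capacity {
      default = 2
      minimum = 2
      maximum = 10
    }

    # Scale out by 2 when average CPU exceeds 75% over 5 minutes.
    rule {
      metric_trigger {
        metric_name        = "Percentage CPU"
        metric_resource_id = var.vmss_id
        time_grain         = "PT1M"
        statistic          = "Average"
        time_window        = "PT5M"
        time_aggregation   = "Average"
        operator           = "GreaterThan"
        threshold          = 75
      }

      scale_action {
        direction = "Increase"
        type      = "ChangeCount"
        value     = "2"
        cooldown  = "PT5M"
      }
    }

    # Scale in by 1 when average CPU stays below 25% over 15 minutes.
    rule {
      metric_trigger {
        metric_name        = "Percentage CPU"
        metric_resource_id = var.vmss_id
        time_grain         = "PT1M"
        statistic          = "Average"
        time_window        = "PT15M"
        time_aggregation   = "Average"
        operator           = "LessThan"
        threshold          = 25
      }

      scale_action {
        direction = "Decrease"
        type      = "ChangeCount"
        value     = "1"
        cooldown  = "PT5M"
      }
    }
  }
}
```

The asymmetric windows (5 minutes out, 15 minutes in) are what make scale-in more conservative than scale-out.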
VMSS Instance Configuration:
Availability zone distribution across zones 1, 2, and 3 — zone failure does not take all instances offline
Uniform orchestration mode for stateless web workloads
Custom script extension deploying web application stack at instance initialisation
Automatic OS image upgrades through rolling upgrade policy — instances updated progressively without full pool disruption
4. Application Tier
The application tier hosts internal business logic workloads isolated from both public ingress and direct database access paths.
Capabilities:
Azure App Services or containerised application workloads depending on deployment model
VNet Integration connecting App Services to the app-spoke-vnet — outbound traffic from App Services routed through the spoke VNet and hub firewall
Internal Load Balancer for distribution across multiple application tier instances where VM-based deployment is used
No public IP addresses — all application tier resources accessible only from web tier spoke through hub firewall-governed traffic paths
Traffic Flow — Web to Application Tier: Web tier VMSS instances → NSG allow rule → hub firewall application rule → app-spoke-vnet → application tier resources. All traffic logged at firewall for operational visibility and security analysis.
5. Database Tier
The database tier leverages Azure SQL with private endpoint connectivity — eliminating all public database access paths.
Azure SQL Database:
Private endpoint deployed in db-spoke-vnet — Azure SQL accessible only through private IP within the VNet, not through public Azure SQL endpoint
Public network access disabled on the Azure SQL server — connection attempts to the public endpoint are rejected regardless of firewall rules
Transparent Data Encryption (TDE) with service-managed or customer-managed keys — data at rest encrypted by default
Azure AD authentication enabled — application tier authenticates to Azure SQL through Managed Identity rather than SQL credentials
Private DNS Zone Integration:
Private DNS zone privatelink.database.windows.net linked to all VNets requiring database access
DNS resolution for the Azure SQL FQDN returns the private endpoint IP rather than the public Azure SQL IP
On-premises DNS conditional forwarder required if on-premises systems require database access through the private endpoint
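A private endpoint with private DNS integration, as described above, could be sketched like this. The SQL server, subnet, and linked VNets are assumed to exist and are passed in through hypothetical variables; disabling public access is an attribute of the SQL server resource itself (`public_network_access_enabled = false`) and is not shown here.

```hcl
# Hypothetical inputs: the SQL server and networks are created elsewhere.
variable "resource_group_name" { type = string }
variable "location" { type = string }
variable "sql_server_id" { type = string }
variable "db_subnet_id" { type = string }

# Map of link name => VNet ID for every VNet that must resolve the private IP.
variable "linked_vnet_ids" { type = map(string) }

resource "azurerm_private_dns_zone" "sql" {
  name                = "privatelink.database.windows.net"
  resource_group_name = var.resource_group_name
}

resource "azurerm_private_dns_zone_virtual_network_link" "sql" {
  for_each = var.linked_vnet_ids

  name                  = "link-${each.key}"
  resource_group_name   = var.resource_group_name
  private_dns_zone_name = azurerm_private_dns_zone.sql.name
  virtual_network_id    = each.value
}

resource "azurerm_private_endpoint" "sql" {
  name                = "pe-sql"
  location            = var.location
  resource_group_name = var.resource_group_name
  subnet_id           = var.db_subnet_id

  private_service_connection {
    name                           = "psc-sql"
    private_connection_resource_id = var.sql_server_id
    subresource_names              = ["sqlServer"]
    is_manual_connection           = false
  }

  # Registers the endpoint's A record in the private DNS zone.
  private_dns_zone_group {
    name                 = "sql-dns"
    private_dns_zone_ids = [azurerm_private_dns_zone.sql.id]
  }
}
```

Linking the zone to the application spoke is the step that makes the ordinary Azure SQL FQDN resolve to the private IP for application workloads.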
6. Security Layer
The security layer integrates centralised secrets management, identity-driven authentication, and layered network traffic controls.
Azure Key Vault — Secrets Management:
Centralised storage for application secrets, connection strings, API keys, and TLS certificates
Private endpoint for Key Vault — no public Key Vault access from application workloads
Access policies or RBAC controlling which identities can read specific secrets
Key Vault diagnostic logging enabling audit of all secret access events
Managed Identity — Credential-Free Authentication:
System-assigned Managed Identities on VMSS instances and App Services
Managed Identity used for Azure SQL authentication (AAD token-based), Key Vault secret access, and Azure Monitor log forwarding
No hardcoded credentials in application configuration or infrastructure code — all service-to-service authentication through platform-managed identity tokens
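As one illustration of identity-driven access, a Managed Identity can be granted read access to Key Vault secrets with a role assignment. This sketch assumes the vault uses the Azure RBAC permission model rather than access policies, and that the identity's principal ID is exported by whichever module creates the workload; both variable names are hypothetical.

```hcl
# Hypothetical inputs: vault ID and the workload's managed identity principal ID.
variable "key_vault_id" { type = string }
variable "workload_principal_id" { type = string }

# Grants read-only access to secret values; assumes RBAC authorisation on the vault.
resource "azurerm_role_assignment" "kv_secrets_reader" {
  scope                = var.key_vault_id
  role_definition_name = "Key Vault Secrets User"
  principal_id         = var.workload_principal_id
}
```

Scoping the assignment to the vault (or to an individual secret) keeps the identity's permissions limited to what the workload actually reads.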
NSGs — Workload-Level Traffic Control:
| NSG | Applied To | Key Rules |
|---|---|---|
| web-nsg | Web tier subnet | Allow HTTP/HTTPS inbound from Load Balancer, deny all other inbound |
| app-nsg | App tier subnet | Allow inbound from web spoke only, deny all other inbound |
| db-nsg | DB tier subnet | Allow inbound from app spoke on SQL port only, deny all other inbound |
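The db-nsg row above might translate into Terraform along these lines. The address range, rule names, and priorities are illustrative assumptions, and the sketch assumes the firewall does not SNAT spoke-to-spoke traffic, so the app spoke range is still the visible source.

```hcl
# Hypothetical inputs: address range is illustrative only.
variable "resource_group_name" { type = string }
variable "location" { type = string }

variable "app_spoke_cidr" {
  type    = string
  default = "10.2.0.0/16"
}

resource "azurerm_network_security_group" "db" {
  name                = "db-nsg"
  location            = var.location
  resource_group_name = var.resource_group_name

  # Allow SQL traffic from the application spoke only.
  security_rule {
    name                       = "allow-sql-from-app-spoke"
    priority                   = 100
    direction                  = "Inbound"
    access                     = "Allow"
    protocol                   = "Tcp"
    source_port_range          = "*"
    destination_port_range     = "1433"
    source_address_prefix      = var.app_spoke_cidr
    destination_address_prefix = "*"
  }

  # Explicit deny ahead of the platform default rules.
  security_rule {
    name                       = "deny-all-other-inbound"
    priority                   = 4096
    direction                  = "Inbound"
    access                     = "Deny"
    protocol                   = "*"
    source_port_range          = "*"
    destination_port_range     = "*"
    source_address_prefix      = "*"
    destination_address_prefix = "*"
  }
}
```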
7. Resilience & Disaster Recovery Layer
Disaster recovery capability leverages Azure Site Recovery for cross-region VM replication with defined recovery objectives.
Azure Site Recovery Configuration:
| Parameter | Configuration |
|---|---|
| Replication target | Secondary Azure region (paired region) |
| Target RPO | ~1 minute (ASR continuous replication) |
| Target RTO | 2–4 hours (dependent on recovery plan complexity) |
| Replication scope | Web tier VMSS instances, application tier VMs |
| Database recovery | Azure SQL geo-replication (independent of ASR) |
Recovery Plan Structure:
Group 1: Network infrastructure in secondary region (VNet, NSGs, Load Balancer) — pre-failover scripted deployment
Group 2: Database tier failover — Azure SQL failover group promotion to secondary replica
Group 3: Application tier VM failover — ASR-replicated application VMs brought online in secondary region
Group 4: Web tier VM failover — ASR-replicated web VMs connected to secondary Load Balancer
Azure SQL Geo-Replication: Azure SQL database geo-replication operates independently of ASR. The secondary database replica in the paired region is kept in near-real-time synchronisation through asynchronous replication. Failover group configuration enables automatic or manual failover with connection string transparency: provided applications connect through the failover group listener endpoint rather than an individual server name, connection strings do not change during failover.
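A failover group of the kind described could be declared as below. This is a sketch only: the server and database IDs are hypothetical inputs, and the failover mode and 60-minute grace period are illustrative choices rather than values taken from this design.

```hcl
# Hypothetical inputs: both SQL servers and the databases are created elsewhere.
variable "primary_sql_server_id" { type = string }
variable "secondary_sql_server_id" { type = string }
variable "database_ids" { type = list(string) }

resource "azurerm_mssql_failover_group" "app_db" {
  name      = "fog-app-db"
  server_id = var.primary_sql_server_id
  databases = var.database_ids

  partner_server {
    id = var.secondary_sql_server_id
  }

  # Automatic failover after the grace period; "Manual" is the alternative mode.
  read_write_endpoint_failover_policy {
    mode          = "Automatic"
    grace_minutes = 60
  }
}
```

With this naming, applications would connect to `fog-app-db.database.windows.net`, the listener that follows whichever server currently holds the primary role.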
8. Infrastructure as Code — Terraform Architecture
The Terraform codebase is structured as a modular architecture enabling consistent, parameterised deployment across multiple environments.
Module Structure:
Module Design Principles:
Each module accepts input variables for environment-specific configuration — no hardcoded values within modules
Module outputs expose resource IDs and connection strings required by dependent modules — network module outputs subnet IDs consumed by compute module
Modules are environment-agnostic — the same module code deploys to development, staging, and production through different variable files
Module versioning through Git tags — production environments pin to specific module versions preventing unintended updates
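The principles above can be sketched as a root configuration that wires child modules together. Everything named here is hypothetical: the module paths, the `web_subnet_id` output, the Git URL, and the `v1.2.0` tag stand in for whatever the real codebase uses.

```hcl
# Root module: environment-specific values arrive through variable files.
variable "environment" { type = string }
variable "location" { type = string }

# Local path source, convenient during module development.
module "network" {
  source      = "./modules/network"
  environment = var.environment
  location    = var.location
}

# Compute consumes an output of the network module (hypothetical output name).
module "compute" {
  source        = "./modules/compute"
  environment   = var.environment
  location      = var.location
  web_subnet_id = module.network.web_subnet_id
}

# Git source pinned to a tag, as a production environment might do.
# The repository URL and tag are placeholders.
module "monitoring" {
  source      = "git::https://example.com/org/terraform-modules.git//monitoring?ref=v1.2.0"
  environment = var.environment
  location    = var.location
}
```

The same root configuration would then be applied per environment with a different variable file, for example `terraform apply -var-file=environments/prod.tfvars` (a hypothetical path).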
Remote State Backend:
Terraform state stored in Azure Blob Storage with state locking through Azure Storage lease
Separate state files per environment (dev, staging, prod) — preventing cross-environment state conflicts
State backend access restricted through Azure RBAC — only authorised deployment service principals can read or modify state
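A backend block for this arrangement might look as follows. The resource group, storage account, and container names are placeholders, and `use_azuread_auth` reflects the RBAC-based access described above rather than storage account keys.

```hcl
terraform {
  backend "azurerm" {
    # Placeholder names: substitute the real state storage resources.
    resource_group_name  = "rg-tfstate"
    storage_account_name = "sttfstateexample"
    container_name       = "tfstate"

    # One state blob per environment; the blob lease provides locking.
    key = "prod.terraform.tfstate"

    # Authenticate with Entra ID and RBAC instead of account keys.
    use_azuread_auth = true
  }
}
```

Backend blocks cannot reference variables, so one common way to switch environments is partial configuration at init time, for example `terraform init -backend-config="key=dev.terraform.tfstate"`.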
Existing Resource Import: Existing Azure resources deployed manually before IaC adoption are imported into Terraform state through terraform import workflows — bringing legacy resources under code governance without requiring recreation. Import workflows documented as runbooks enabling consistent execution across team members.
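On Terraform 1.5 or later, an import can also be declared in configuration rather than run as a one-off CLI command, which lets it pass through code review like any other change. The sketch below uses a placeholder subscription ID, resource group name, and region.

```hcl
# Declarative import (Terraform 1.5+). The ID and names are placeholders.
import {
  to = azurerm_resource_group.legacy
  id = "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/rg-legacy-app"
}

# The resource block must match the existing resource closely enough
# that the first plan after import shows no unintended changes.
resource "azurerm_resource_group" "legacy" {
  name     = "rg-legacy-app"
  location = "uksouth"
}
```

The CLI equivalent is `terraform import azurerm_resource_group.legacy <resource-id>`. When the resource block has not been written yet, `terraform plan -generate-config-out=generated.tf` can draft one from the imported resource for review.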
9. Observability Layer
Centralised monitoring provides operational visibility across infrastructure health, security events, and performance metrics.
Azure Monitor & Log Analytics:
Diagnostic settings on all deployed resources forwarding logs to centralised Log Analytics Workspace
VM insights for VMSS instance performance monitoring — CPU, memory, disk, and network utilisation per instance
Azure Firewall diagnostic logs providing traffic flow visibility and threat correlation capability
Alert rules for VMSS scaling events, unhealthy instance detection, and ASR replication health
Terraform-Managed Diagnostic Settings: All diagnostic setting configurations are managed through the monitoring Terraform module — ensuring every deployed resource consistently forwards logs to the centralised workspace. Manual diagnostic setting gaps — common in portal-managed environments — are eliminated through IaC enforcement.
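One way a monitoring module could enforce this is to iterate over a map of resource IDs. This is a sketch under assumptions: the variable names are hypothetical, and the `allLogs` category group is not offered by every Azure resource type, so some targets would need explicit log categories instead.

```hcl
# Hypothetical inputs: map of short name => resource ID to be monitored.
variable "log_analytics_workspace_id" { type = string }
variable "target_resource_ids" { type = map(string) }

resource "azurerm_monitor_diagnostic_setting" "central" {
  for_each = var.target_resource_ids

  name                       = "diag-${each.key}"
  target_resource_id         = each.value
  log_analytics_workspace_id = var.log_analytics_workspace_id

  # Assumes the target supports the allLogs category group.
  enabled_log {
    category_group = "allLogs"
  }
}
```

Because the map is supplied by the modules that create the resources, adding a resource without a diagnostic setting becomes a visible omission in code review.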
Architecture Diagram

Technologies Used
| Category | Technologies |
|---|---|
| Infrastructure as Code | Terraform, Azure Remote State Backend (Azure Blob Storage) |
| Cloud Platform | Microsoft Azure |
| Networking | Hub-Spoke VNet Architecture, Azure Firewall, NSGs, UDRs, VNet Peering |
| Compute & Scaling | Azure VM Scale Sets, Azure Load Balancer |
| Database | Azure SQL Database, Private Endpoints, Transparent Data Encryption, Geo-Replication |
| Secrets & Identity | Azure Key Vault, Managed Identity, Azure RBAC |
| Disaster Recovery | Azure Site Recovery, Azure SQL Failover Groups |
| Monitoring | Azure Monitor, Log Analytics Workspace, VM Insights |
Key Challenges Addressed
Designing secure communication flows between application tiers — addressed through VNet-per-tier spoke architecture forcing all inter-tier traffic through hub Azure Firewall inspection, with NSG rules providing workload-level traffic control within each spoke.
Scaling public-facing workloads dynamically — addressed through VMSS with metric-based autoscaling rules governing scale-out and scale-in thresholds, cooldown periods, and instance increment sizes — providing responsive scaling without manual intervention.
Integrating existing resources into Terraform governance — addressed through terraform import workflows bringing manually deployed Azure resources under IaC state management — enabling IaC adoption without requiring resource recreation.
Maintaining modular Terraform design without excessive complexity — addressed through child module architecture separating network, compute, database, security, monitoring, and DR into independent, parameterised modules with clean input/output interfaces between them.
Securing sensitive data services from public exposure: addressed through an Azure SQL private endpoint with public network access disabled, so the database is unreachable through the public Azure SQL endpoint regardless of firewall rule configuration.
Ensuring disaster recovery readiness without operational overhead — addressed through ASR continuous replication managed through Terraform DR module — replication configuration is version-controlled and consistently applied without manual ASR console configuration.
Design Decisions & Rationale
Hub-Spoke over Flat VNet Architecture : Flat VNet architectures without a central inspection point allow workloads to communicate directly without governance or logging. Hub-spoke forces all traffic through Azure Firewall — providing centralised traffic governance, consistent policy enforcement, and complete traffic logging across all workload-to-workload and workload-to-internet communication paths.
VNet-per-Tier over Subnet-per-Tier Segmentation : Subnet-level segmentation within a shared VNet allows inter-subnet traffic to bypass the hub firewall through direct VNet routing. VNet-per-tier with hub-spoke peering forces all inter-tier traffic through the hub firewall — ensuring database-bound queries from the application tier are inspected and logged at the centralised control plane rather than flowing directly between subnets.
VMSS over Static VM Deployment for Web Tier : Static VM deployments require manual scaling intervention and create single-instance failure risk. VMSS provides elastic horizontal scaling based on demand metrics and distributes instances across availability zones — providing both elasticity and resilience that static VM deployments cannot achieve without significant operational overhead.
Private Endpoints for Azure SQL : Azure SQL with firewall-based public access control still leaves the server's public endpoint reachable from the internet, with only IP allow-list rules and authentication standing between an attacker and the database. Combining a private endpoint with public network access disabled removes that path: connections to the public endpoint are rejected before authentication, so the database is not reachable from the internet even if an authentication bypass vulnerability exists, significantly reducing the database attack surface.
Managed Identity over Credential-Based Authentication : Application credentials stored in configuration files or environment variables create secret management risk — credentials can be extracted, leaked, or expire without rotation. Managed Identity provides platform-managed, short-lived token-based authentication — no credential storage required and no rotation management overhead.
Modular Terraform Architecture : Monolithic Terraform configurations become increasingly difficult to maintain as infrastructure complexity grows — changes in one area risk unintended impacts on unrelated resources. Module separation with clean interfaces enables independent module development, testing, and versioning — reducing the blast radius of configuration changes and enabling team members to work on different infrastructure domains without conflict.
Remote State with State Locking : Local Terraform state creates collaboration risk: concurrent deployments from different team members can corrupt state files. Remote state in Azure Blob Storage with lease-based locking prevents concurrent state modification and enables team collaboration. With blob versioning enabled on the storage account, it also provides state history for rollback and investigation.
Trade-offs & Design Constraints
Hub Firewall Throughput Bottleneck : Routing all inter-spoke traffic through Azure Firewall creates a potential throughput bottleneck for high-bandwidth workload-to-workload communication. Azure Firewall Standard supports up to 30 Gbps and Premium up to 100 Gbps, which is sufficient for most enterprise workloads but potentially constraining for high-throughput data processing scenarios. Workloads with very high inter-tier bandwidth requirements should evaluate whether the selected Azure Firewall SKU provides adequate throughput capacity.
VMSS Stateless Constraint for Web Tier : VMSS horizontal scaling assumes stateless web tier workloads — session state must not be stored locally on individual instances. Applications maintaining local session state cannot be scaled horizontally without session affinity configuration or externalised session state storage (Azure Cache for Redis). Web application architecture must be validated for statelessness before VMSS deployment.
Terraform Import Operational Complexity : Importing existing Azure resources into Terraform state requires careful mapping of resource attributes to Terraform resource configurations. Mismatched attributes between existing resource configuration and Terraform resource definitions cause plan drift — Terraform proposes changes to align existing resources with code definitions. Import workflows require thorough pre-import resource auditing to minimise unintended post-import plan changes.
ASR Recovery Time Dependency on Recovery Plan Complexity : Azure Site Recovery RTO targets depend heavily on recovery plan complexity — the number of VM groups, script steps, and manual intervention points in the recovery plan directly affects failover duration. Estimated 2–4 hour RTO assumes a well-tested recovery plan with pre-staged secondary region network infrastructure. Untested recovery plans consistently underperform RTO targets in actual failover scenarios — regular DR testing is essential for RTO validation.
VNet Peering Cost at Scale : Azure VNet peering charges apply to all traffic flowing between peered VNets — including hub-to-spoke and spoke-to-hub flows. In architectures with high inter-tier traffic volumes (frequent database queries, large application-to-web data transfers), VNet peering egress charges accumulate meaningfully at scale. VNet peering cost should be modelled against expected inter-tier traffic volumes during architecture sizing.
Terraform State Security : Terraform state files contain sensitive infrastructure details including resource IDs, configuration parameters, and potentially secret values if outputs are not carefully managed. Azure Blob Storage state backend access must be restricted through RBAC to authorised deployment identities only. Sensitive output values should never be written to Terraform state — use Key Vault references rather than outputting secret values through Terraform outputs.
Projected Outcomes
The architecture is designed to deliver the following operational and infrastructure outcomes in a production enterprise environment:
Scalable multi-tier enterprise cloud architecture with VNet-level isolation between web, application, and database tiers
Centralised network traffic governance through hub Azure Firewall with complete inter-tier traffic logging
Elastic web tier scaling through VMSS autoscaling responding to demand metrics without manual intervention
Private-only database connectivity through Azure SQL private endpoints eliminating public database exposure
Credential-free service-to-service authentication through Managed Identity across all application components
Cross-region disaster recovery readiness through ASR continuous replication and defined recovery plans
Consistent multi-environment deployment through modular parameterised Terraform codebase
Centralised operational monitoring through Azure Monitor with Terraform-managed diagnostic settings across all resources
Future Evolution
Kubernetes-based application tier orchestration replacing VM-based application hosting with AKS for containerised workload management
Service mesh integration (Istio) for mutual TLS between application tier microservices and fine-grained traffic management
Policy-as-Code governance through Azure Policy and OPA/Gatekeeper preventing non-compliant resource deployment
Active-active multi-region deployment replacing single-region primary with active traffic distribution across regions through Azure Traffic Manager or Azure Front Door
FinOps governance integration through Azure Cost Management budgets and Terraform cost estimation in deployment pipelines
Automated compliance validation pipelines running Terraform compliance scanning (Checkov, tfsec) on every pull request
AI-assisted infrastructure optimisation identifying right-sizing opportunities across VMSS instances and database service tiers
Zero Trust network automation through Azure Virtual Network Manager for centralised security admin rule management, alongside NSGs, at scale
Key Takeaways
Hub-spoke with VNet-per-tier segmentation provides stronger isolation than subnet-per-tier — inter-tier traffic through the hub firewall ensures consistent inspection and logging that subnet-only segmentation cannot guarantee
Private endpoints for database connectivity should be a default enterprise architecture requirement — firewall-based public access control is a weaker model than eliminating public database exposure entirely
Terraform modular architecture with remote state is the correct IaC model for enterprise teams — monolithic configurations and local state create maintainability and collaboration constraints that compound as infrastructure complexity grows
VMSS requires stateless web tier application design — session state externalisation is an application architecture prerequisite for horizontal scaling, not an infrastructure option
Terraform import workflows are essential for realistic IaC adoption — greenfield-only IaC ignores the reality that most enterprise environments contain existing manually deployed resources requiring governance integration
ASR RTO targets must be validated through regular recovery plan testing — estimated RTOs based on untested plans are unreliable and consistently optimistic
Managed Identity should be the default authentication model for all Azure service-to-service communication — credential-based authentication creates management overhead and security risk that Managed Identity eliminates
