Security-as-Code & DevSecOps Governance for Azure Infrastructure

Application Security Groups, Policy-as-Code & Security-Integrated IaC Pipelines

Description

This case study is an independent architecture design exercise that demonstrates a DevSecOps governance methodology for Azure infrastructure. It focuses on application-centric network segmentation through Application Security Groups, security scanning integrated into IaC CI/CD pipelines, and Azure Policy-as-Code compliance enforcement. It was not associated with a production deployment. Infrastructure automation and foundational Zero Trust patterns are covered in depth in the Infrastructure Automation with Terraform & Azure DevOps and Zero Trust Enterprise Network case studies respectively; this study builds on those foundations with a specific focus on security embedded into the infrastructure delivery lifecycle.

Key Focus Areas:

  • Application Security Groups & Workload-Centric Segmentation

  • DevSecOps Pipeline Integration

  • Policy-as-Code Governance

  • Security Scanning in IaC Workflows

  • Managed Identity Architecture

  • Compliance-as-Code

Executive Summary

Architected a Security-as-Code governance platform for Azure infrastructure — embedding security controls, compliance validation, and policy enforcement directly into the infrastructure delivery lifecycle rather than applying them as post-deployment audit layers.

The architecture introduces three capabilities not covered in the existing Zero Trust and IaC studies: Application Security Groups (ASGs) for workload-centric dynamic network segmentation that scales without IP address management; DevSecOps pipeline integration embedding tfsec, Checkov, and Microsoft Defender for DevOps security scanning into Terraform CI/CD workflows; and Azure Policy-as-Code managing compliance enforcement as version-controlled infrastructure definitions rather than manually configured portal policies.

The design demonstrates that security in cloud infrastructure is not achieved at deployment time — it is built into the engineering process through automated validation, policy enforcement, and continuously tested compliance that prevents insecure configurations from ever reaching production.

Business Drivers

Cloud infrastructure security failures most commonly originate not from sophisticated attacks but from misconfiguration — publicly exposed storage accounts, overprivileged identities, missing encryption, and absent diagnostic settings deployed through ungoverned manual processes or IaC code without security validation.

This architecture was designed to address the DevSecOps governance requirements of organisations where existing infrastructure operations result in:

  • Security misconfigurations reaching production because IaC code is not scanned before deployment

  • Azure Policy compliance controls configured manually in the portal — inconsistently applied across environments and not version-controlled

  • NSG rules referencing IP address ranges that become stale as environments evolve — requiring manual rule updates creating operational overhead and security gaps

  • No automated mechanism preventing insecure resource configurations from being deployed — security review is manual and post-deployment

  • Compliance evidence produced through point-in-time portal exports rather than continuously enforced and automatically reportable policy state

  • Security controls treated as deployment artefacts rather than engineering requirements — applied after infrastructure is built rather than validated before it is deployed

Operational Constraints

The architecture was designed to operate within the following constraints typical of enterprise Azure DevOps environments:

  • Security scanning must integrate into existing CI/CD pipelines without requiring separate security tooling infrastructure

  • NSG rule management must scale with environment complexity — IP-based rules become unmanageable as workload counts grow

  • Azure Policy compliance enforcement must be version-controlled and deployable through the same IaC pipelines as infrastructure

  • Security scanning failures must block deployment — findings cannot be advisory-only for critical misconfigurations

  • Policy-as-Code must support both Deny effects (preventing non-compliant deployments) and DeployIfNotExists effects (remediating missing configurations)

  • Managed Identity must be enforced as the default authentication model — service principal credentials should not be deployable without explicit justification

  • Compliance reporting must be producible on demand from Azure Policy state — not from manual evidence collection

Objectives

  • Implement Application Security Groups providing workload-centric dynamic network segmentation that scales without IP address management overhead

  • Integrate tfsec and Checkov security scanning into CI/CD pipeline validation stages — blocking deployment on critical findings

  • Manage Azure Policy definitions, assignments, and initiatives as version-controlled Terraform code

  • Enforce Managed Identity as the default service authentication model through Policy-as-Code controls

  • Implement Azure Bastion as the secure administrative access model — eliminating public IP and management port exposure

  • Establish continuous compliance monitoring through Azure Policy compliance dashboards

  • Demonstrate DevSecOps maturity through security controls embedded at every stage of the infrastructure delivery lifecycle

Architecture Principles

  • Shift-left security: security validation occurs before deployment, not after

  • Policy-as-Code: compliance controls are version-controlled infrastructure definitions, not manually configured portal settings

  • Application-centric segmentation: network rules reference workload identity (ASGs), not IP addresses

  • Never trust, always verify: identity verification is required for every access decision regardless of network position

  • Managed Identity by default: credential-based service authentication is a policy violation, not an option

  • Continuous compliance: Azure Policy enforces compliance state continuously, not at point-in-time audit intervals

  • DevSecOps integration: security is an engineering discipline embedded in delivery workflows, not a gate applied after delivery

Architecture Overview

The solution is structured as a four-layer Security-as-Code platform integrating application-centric network segmentation, DevSecOps pipeline governance, policy-as-code compliance, and secure operational access.

1. Application Security Groups — Workload-Centric Segmentation

Application Security Groups (ASGs) are the main network segmentation capability this study adds: traffic rules keyed to workload identity that scale without IP address management.

The Problem with IP-Based NSG Rules at Scale:

Traditional NSG rules reference source and destination IP addresses or CIDR ranges:

# Traditional IP-based NSG rule: does not scale
resource "azurerm_network_security_rule" "web_to_app" {
  source_address_prefix      = "10.1.1.0/24"   # Web tier subnet
  destination_address_prefix = "10.1.2.0/24"   # App tier subnet
  ...
}

As environments grow, IP ranges change, new subnets are added, and VM IPs are reassigned, so NSG rules that reference specific IPs require continuous manual updates. Stale IP-based rules create security gaps (rules still permitting traffic to decommissioned IPs) and operational complexity (tracking which IPs map to which workloads across environments).

ASG-Based Rules — Workload Identity Over IP Address:

ASGs group VM network interfaces by workload role, and NSG rules reference ASG membership rather than IP addresses:

hcl

# ASG definitions
resource "azurerm_application_security_group" "web_tier" {
  name                = "asg-web-tier"
  resource_group_name = var.resource_group_name
  location            = var.location
}

resource "azurerm_application_security_group" "app_tier" {
  name                = "asg-app-tier"
  resource_group_name = var.resource_group_name
  location            = var.location
}

resource "azurerm_application_security_group" "db_tier" {
  name                = "asg-db-tier"
  resource_group_name = var.resource_group_name
  location            = var.location
}

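# NSG rules reference ASG membership, not IP addresses.
# var.nsg_name is an assumed variable naming the NSG these rules attach to;
# azurerm_network_security_rule requires both the resource group and NSG name.
resource "azurerm_network_security_rule" "web_to_app" {
  name                                       = "allow-web-to-app"
  resource_group_name                        = var.resource_group_name
  network_security_group_name                = var.nsg_name
  priority                                   = 100
  direction                                  = "Inbound"
  access                                     = "Allow"
  protocol                                   = "Tcp"
  source_port_range                          = "*"
  destination_port_range                     = "8080"
  source_application_security_group_ids      = [azurerm_application_security_group.web_tier.id]
  destination_application_security_group_ids = [azurerm_application_security_group.app_tier.id]
}

resource "azurerm_network_security_rule" "app_to_db" {
  name                                       = "allow-app-to-db"
  resource_group_name                        = var.resource_group_name
  network_security_group_name                = var.nsg_name
  priority                                   = 110
  direction                                  = "Inbound"
  access                                     = "Allow"
  protocol                                   = "Tcp"
  source_port_range                          = "*"
  destination_port_range                     = "1433"
  source_application_security_group_ids      = [azurerm_application_security_group.app_tier.id]
  destination_application_security_group_ids = [azurerm_application_security_group.db_tier.id]
}

# Explicit default deny: all other inbound traffic is blocked.
# This rule also takes precedence over the platform default rules
# (AllowVnetInBound, AllowAzureLoadBalancerInBound), so required flows such as
# load balancer health probes need their own allow rules.
resource "azurerm_network_security_rule" "deny_all_inbound" {
  name                        = "deny-all-inbound"
  resource_group_name         = var.resource_group_name
  network_security_group_name = var.nsg_name
  priority                    = 4096
  direction                   = "Inbound"
  access                      = "Deny"
  protocol                    = "*"
  source_port_range           = "*"
  destination_port_range      = "*"
  source_address_prefix       = "*"
  destination_address_prefix  = "*"
}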

VM ASG Association:

hcl

# Assign web tier VMs to web ASG
resource "azurerm_network_interface_application_security_group_association" "web_vm_asg" {
  network_interface_id          = azurerm_network_interface.web_vm_nic.id
  application_security_group_id = azurerm_application_security_group.web_tier.id
}
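
Where a tier has several VMs, the single association above can be repeated with for_each. The following is a minimal sketch, not the study's actual code: the variable web_nic_ids is a hypothetical input (VM name mapped to network interface ID), and the block assumes the asg-web-tier ASG defined earlier.

```hcl
# Hypothetical input: VM name => network interface ID for each web tier VM.
variable "web_nic_ids" {
  type        = map(string)
  description = "Network interface IDs of web tier VMs, keyed by VM name"
}

# One association per NIC. Adding a VM to the map adds it to the ASG
# without touching any NSG rule.
resource "azurerm_network_interface_application_security_group_association" "web_tier" {
  for_each = var.web_nic_ids

  network_interface_id          = each.value
  application_security_group_id = azurerm_application_security_group.web_tier.id
}
```

Note that all network interfaces assigned to one ASG must sit in the same virtual network, so a tier that spans virtual networks needs one ASG per network.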

ASG Segmentation Design:

| ASG | Workload Members | Permitted Inbound | Permitted Outbound |
|---|---|---|---|
| asg-web-tier | Web application VMs | Load Balancer (HTTP/HTTPS) | asg-app-tier (8080) |
| asg-app-tier | Application logic VMs | asg-web-tier (8080) | asg-db-tier (1433) |
| asg-db-tier | Database VMs | asg-app-tier (1433) | None (deny all) |
| asg-management | Jumpbox/Bastion VMs | BastionSubnet (RDP/SSH) | asg-web-tier, asg-app-tier, asg-db-tier (RDP/SSH) |
| asg-monitoring | Monitoring agents | asg-* (metrics ports) | Log Analytics endpoints |
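
The management row of the segmentation design could be expressed as one rule per destination tier. This is a sketch under stated assumptions: the asg-management resource, the variable var.nsg_name, the priority range 200-202, and the use of ports 22 and 3389 for SSH and RDP are illustrative choices, not part of the original design.

```hcl
# Hypothetical management ASG, named after the asg-management row.
resource "azurerm_application_security_group" "management" {
  name                = "asg-management"
  resource_group_name = var.resource_group_name
  location            = var.location
}

locals {
  # Tiers reachable from the management ASG, keyed by a short label.
  managed_tiers = {
    app = azurerm_application_security_group.app_tier.id
    db  = azurerm_application_security_group.db_tier.id
    web = azurerm_application_security_group.web_tier.id
  }
}

# One rule per destination tier. keys() returns the labels in sorted order,
# so the priorities resolve to 200 (app), 201 (db), and 202 (web).
resource "azurerm_network_security_rule" "mgmt_to_tier" {
  for_each = local.managed_tiers

  name                        = "allow-mgmt-to-${each.key}"
  resource_group_name         = var.resource_group_name
  network_security_group_name = var.nsg_name # assumed variable
  priority                    = 200 + index(keys(local.managed_tiers), each.key)
  direction                   = "Inbound"
  access                      = "Allow"
  protocol                    = "Tcp"
  source_port_range           = "*"
  destination_port_ranges     = ["22", "3389"]

  source_application_security_group_ids      = [azurerm_application_security_group.management.id]
  destination_application_security_group_ids = [each.value]
}
```

Generating the rules from a map keeps the rule set identical across environments; only the ASG memberships differ.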

Operational Benefits of ASG-Based Segmentation:

  • New VMs automatically inherit correct security rules by joining the appropriate ASG — no NSG rule updates required

  • VM migrations and IP changes require no NSG rule modifications — ASG membership is independent of IP address

  • Security rules remain readable and auditable — "web tier can reach app tier on port 8080" is expressed directly rather than inferred from IP ranges

  • Environment promotion (dev → staging → prod) uses identical NSG rule sets with different ASG memberships — no environment-specific rule rewriting

2. DevSecOps Pipeline Integration — Security Scanning in IaC Workflows

Security scanning is embedded at every stage of the infrastructure delivery lifecycle — from developer workstation through pull request validation to pre-deployment pipeline gates.

Security Scanning Toolchain:

| Tool | Stage | Purpose | Blocking |
|---|---|---|---|
| tfsec | Local + PR + Pipeline | Terraform security misconfiguration detection | Yes (critical findings) |
| Checkov | Pipeline validation | CIS Azure benchmark compliance scanning | Yes (critical findings) |
| Microsoft Defender for DevOps | Pipeline | Azure security posture integration | Advisory |
| terraform validate | Pipeline | HCL syntax and provider schema validation | Yes |
| terrascan | Pipeline | Multi-framework policy scanning | Yes (critical findings) |

Pipeline Security Integration Architecture:

yaml

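stages:
  - stage: SecurityValidation
    displayName: 'Security & Compliance Validation'
    jobs:
      - job: TerraformSecurity
        steps:
          - script: |
              terraform init -backend=false
              terraform validate
            displayName: 'Terraform Validation'

          # Exits non-zero on HIGH or CRITICAL findings, failing the stage
          - script: |
              tfsec . --format junit \
                      --minimum-severity HIGH \
                      --out tfsec-results.xml
            displayName: 'tfsec Security Scan'

          # The check pattern is quoted so the shell does not expand it.
          # JUnit output is redirected to a file because --output-file-path
          # names a folder when only one output format is requested.
          # Severity thresholds such as --soft-fail-on MEDIUM rely on severity
          # metadata that Checkov generally obtains through a platform API key.
          - script: |
              checkov -d . \
                      --framework terraform \
                      --check 'CKV_AZURE_*' \
                      --output junitxml \
                      --soft-fail-on MEDIUM \
                      > checkov-results.xml
            displayName: 'Checkov Compliance Scan'

          - script: |
              terrascan scan -i terraform \
                            -t azure \
                            --severity HIGH \
                            --output junit-xml \
                            > terrascan-results.xml
            displayName: 'Terrascan Policy Scan'

          # Publish results even when an earlier scan step failed
          - task: PublishTestResults@2
            condition: succeededOrFailed()
            inputs:
              testResultsFormat: 'JUnit'
              testResultsFiles: '*-results.xml'
            displayName: 'Publish Security Scan Results'

  - stage: Plan
    dependsOn: SecurityValidation
    condition: succeeded()    # Plan stage blocked if security scans fail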

Security Finding Severity Treatment:

| Severity | tfsec | Checkov | Pipeline Action |
|---|---|---|---|
| Critical | Block deployment | Block deployment | Pipeline fails; no exceptions |
| High | Block deployment | Block deployment | Pipeline fails; requires fix or suppression with justification |
| Medium | Advisory warning | Advisory warning | Pipeline continues; tracked in results |
| Low | Advisory | Advisory | Logged only |

Suppression Management — Handling Justified Exceptions:

Not all security scanner findings represent actual vulnerabilities — some are intentional configurations appropriate for the specific context. Suppressions must be explicit, documented, and version-controlled:

hcl

# tfsec suppression: intentional public access for the demo environment
# tfsec:ignore:azure-storage-no-public-access
resource "azurerm_storage_account" "demo" {
  # Public access is intentional for the demo, not for production
  allow_nested_items_to_be_public = true
}

yaml

# .checkov.yaml — global suppression configuration
skip-check:
  - CKV_AZURE_206   # Storage HTTPS — enforced through Azure Policy instead

Suppressions are reviewed during pull request code review, and each requires a justification comment before merge approval. Suppression accumulation is monitored: a growing suppression list indicates security debt requiring remediation.
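
One way to keep suppressions from accumulating silently is to give each one an expiry date. The sketch below uses the optional :exp: suffix that tfsec accepts on ignore comments. The storage account name and the date are placeholders, and the resource arguments are illustrative rather than taken from the original design.

```hcl
# The :exp:YYYY-MM-DD suffix makes the suppression lapse on that date,
# after which tfsec reports the finding again. The date is a placeholder.
#tfsec:ignore:azure-storage-no-public-access:exp:2030-01-01
resource "azurerm_storage_account" "demo" {
  name                     = "stdemoexample001" # hypothetical name
  resource_group_name      = var.resource_group_name
  location                 = var.location
  account_tier             = "Standard"
  account_replication_type = "LRS"

  # Justification: public blob access is intentional for the demo only.
  allow_nested_items_to_be_public = true
}
```

An expiring suppression turns the pull request justification into a dated commitment: when the date passes, the pipeline fails again until the exception is renewed or removed.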

Developer Shift-Left Security:

tfsec and Checkov run locally through pre-commit hooks — developers receive security feedback before committing code:

yaml

# .pre-commit-config.yaml
repos:
  - repo: https://github.com/antonbabenko/pre-commit-terraform
    hooks:
      - id: terraform_tfsec
      - id

Security findings are caught at the developer workstation — the cheapest and fastest point in the delivery lifecycle to identify and fix misconfigurations.

3. Policy-as-Code — Azure Policy Managed Through Terraform

Azure Policy definitions, initiatives, and assignments are managed as version-controlled Terraform code — ensuring compliance controls are deployed consistently across environments through the same IaC governance model as infrastructure.

Why Policy-as-Code Matters:

Azure Policy configured manually through the portal creates the same governance problems as manually deployed infrastructure: inconsistent application across environments, no version history, no deployment audit trail, and no automated testing of policy definitions before production assignment.

Terraform-Managed Policy Architecture:

hcl

# Custom policy definition: deny public IP on VMs
resource "azurerm_policy_definition" "deny_public_ip_vm" {
  name         = "deny-public-ip-vm"
  policy_type  = "Custom"
  mode         = "All"
  display_name = "Deny public IP assignment to Virtual Machines"

  metadata = jsonencode({
    category = "Network"
    version  = "1.0.0"
  })

  policy_rule = jsonencode({
    if = {
      allOf = [
        {
          field  = "type"
          equals = "Microsoft.Network/networkInterfaces"
        },
        {
          not = {
            field  = "Microsoft.Network/networkInterfaces/ipconfigurations[*].publicIpAddress.id"
            exists = "false"
          }
        }
      ]
    }
    then = {
      effect = "Deny"
    }
  })
}

# Policy initiative grouping related controls
resource "azurerm_policy_set_definition" "zero_trust_baseline" {
  name         = "zero-trust-security-baseline"
  policy_type  = "Custom"
  display_name = "Zero Trust Security Baseline"

  policy_definition_reference {
    policy_definition_id = azurerm_policy_definition.deny_public_ip_vm.id
  }
  policy_definition_reference {
    policy_definition_id = azurerm_policy_definition.require_diagnostic_settings.id
  }
  policy_definition_reference {
    policy_definition_id = azurerm_policy_definition.enforce_managed_identity.id
  }
  policy_definition_reference {
    policy_definition_id = azurerm_policy_definition.require_tls_minimum.id
  }
}

# Policy assignment at subscription scope
resource "azurerm_subscription_policy_assignment" "zero_trust_baseline" {
  name                 = "zero-trust-baseline-assignment"
  policy_definition_id = azurerm_policy_set_definition.zero_trust_baseline.id
  subscription_id      = data.azurerm_subscription.current.id
  location             = var.location   # Required when an identity block is set

  identity {
    type = "SystemAssigned"   # Required for DeployIfNotExists effects
  }
}

Policy Initiative — Zero Trust Security Baseline:

| Policy | Effect | Enforcement Scope |
|---|---|---|
| Deny public IP assignment to VMs | Deny | All resource groups |
| Require diagnostic settings on all resources | DeployIfNotExists | All resource groups |
| Enforce Managed Identity (audit service principal secrets in Key Vault) | Audit | All resource groups |
| Require TLS 1.2 minimum on storage accounts | Deny | All resource groups |
| Restrict allowed Azure regions | Deny | Subscription |
| Require resource tagging (environment, owner, cost-centre) | Deny | All resource groups |
| Enforce NSG on all subnets | Audit | Network resource groups |

Managed Identity Enforcement Through Policy:

hcl

# Policy supporting Managed Identity adoption: audits Key Vault secrets whose
# names suggest stored service principal credentials (a name-based heuristic)
resource "azurerm_policy_definition" "enforce_managed_identity" {
  name         = "audit-service-principal-key-vault-secrets"
  policy_type  = "Custom"
  mode         = "All"
  display_name = "Audit service principal secrets stored in Key Vault"

  policy_rule = jsonencode({
    if = {
      allOf = [
        {
          field  = "type"
          equals = "Microsoft.KeyVault/vaults/secrets"
        },
        {
          field    = "name"
          contains = "client-secret"
        }
      ]
    }
    then = {
      effect = "Audit"
    }
  })
}

Policy Testing Before Production Assignment: Azure Policy definitions are tested in a development subscription before they are assigned to production. terraform plan previews the assignment changes, and Azure Policy compliance evaluation validates the policy logic against existing resources before enforcement begins.
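One way to check policy logic against existing resources without blocking deployments is to assign the initiative with enforcement switched off. The sketch below is illustrative only and is not taken from this design: the resource label zero_trust_baseline_dev and the variable enforce_policy are hypothetical, and it assumes the initiative, var.location, and the subscription data source used elsewhere in this case study.

```hcl
# Hypothetical toggle: false while validating in development, true once the
# compliance results have been reviewed.
variable "enforce_policy" {
  type        = bool
  description = "Whether the baseline initiative is enforced or only evaluated."
  default     = false
}

resource "azurerm_subscription_policy_assignment" "zero_trust_baseline_dev" {
  name                 = "zero-trust-baseline-dev"
  policy_definition_id = azurerm_policy_set_definition.zero_trust_baseline.id
  subscription_id      = data.azurerm_subscription.current.id
  location             = var.location

  # enforce = false corresponds to enforcementMode "DoNotEnforce": Deny effects
  # do not block requests, but compliance results are still reported.
  enforce = var.enforce_policy

  identity {
    type = "SystemAssigned"
  }
}
```

Reviewing the compliance results in the development subscription before setting enforce_policy to true shows which existing resources a Deny effect would have rejected.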

4. Secure Administrative Access — Azure Bastion

Azure Bastion provides browser-based RDP/SSH administrative access — eliminating public IP and management port exposure as a platform-enforced architectural requirement rather than a configuration guideline.

Azure Bastion Architecture:

hcl

resource "azurerm_bastion_host" "admin_access" {
  name                = "bastion-${var.environment}"
  resource_group_name = var.resource_group_name
  location            = var.location

  sku = "Standard"   # Standard SKU required for native client support and tunnelling

  ip_configuration {
    name                 = "configuration"
    subnet_id            = azurerm_subnet.bastion_subnet.id
    public_ip_address_id = azurerm_public_ip.bastion_pip.id
  }

  copy_paste_enabled     = true
  file_copy_enabled      = true    # Standard SKU feature
  shareable_link_enabled = false   # Disabled to prevent unauthorised session sharing
  tunneling_enabled      = true    # Standard SKU feature: enables native RDP/SSH client support
}
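The Bastion host references a subnet and a public IP that are not shown in this section. A minimal sketch of both follows, under stated assumptions: azurerm_virtual_network.hub is a hypothetical virtual network defined elsewhere, and the address range is an example value only.

```hcl
# Assumption: azurerm_virtual_network.hub is a hypothetical VNet defined elsewhere.
resource "azurerm_subnet" "bastion_subnet" {
  name                 = "AzureBastionSubnet"   # Azure requires this exact subnet name
  resource_group_name  = var.resource_group_name
  virtual_network_name = azurerm_virtual_network.hub.name
  address_prefixes     = ["10.0.255.0/26"]      # /26 or larger; example range only
}

resource "azurerm_public_ip" "bastion_pip" {
  name                = "pip-bastion-${var.environment}"
  resource_group_name = var.resource_group_name
  location            = var.location
  allocation_method   = "Static"     # Bastion requires static allocation
  sku                 = "Standard"   # Bastion requires a Standard SKU public IP
}
```

The public IP here belongs to the Bastion host rather than to a VM network interface, so it does not conflict with the policy that denies public IPs on VMs.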

Bastion Standard SKU Capabilities Used:

| Feature | Configuration | Purpose |
|---|---|---|
| Native client support | Enabled (tunnelling) | Full RDP/SSH client capability beyond browser |
| File copy | Enabled | Administrative file transfer without additional tools |
| Shareable links | Disabled | Prevents session URL sharing bypassing authentication |
| Session logging | Diagnostic logs to Log Analytics | Administrative activity audit trail |
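The Log Analytics audit trail listed above can be wired up with a diagnostic setting on the Bastion host. This is a minimal sketch rather than the configuration used in this design: azurerm_log_analytics_workspace.security is a hypothetical workspace, and the enabled_log block assumes a recent azurerm provider version.

```hcl
# Assumption: azurerm_log_analytics_workspace.security is a hypothetical workspace.
resource "azurerm_monitor_diagnostic_setting" "bastion_audit" {
  name                       = "diag-bastion-${var.environment}"
  target_resource_id         = azurerm_bastion_host.admin_access.id
  log_analytics_workspace_id = azurerm_log_analytics_workspace.security.id

  enabled_log {
    # Audit log category for Bastion sessions (connection metadata, not screen capture)
    category = "BastionAuditLogs"
  }
}
```

Declaring the diagnostic setting next to the Bastion resource keeps the audit trail in place from the first apply instead of waiting for Policy remediation.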

Policy Enforcement of Bastion-Only Access: Azure Policy denies public IP assignment to VMs — enforcing Bastion as the only administrative access path at the platform level rather than relying on configuration discipline:

hcl

# This policy makes Bastion mandatory: VMs cannot have public IPs
resource "azurerm_policy_definition" "deny_public_ip_vm" {
  # ... policy definition above ...
  # Effect: Deny (deployment rejected at the ARM layer)
}

5. Managed Identity Architecture

Managed Identity is enforced as the default authentication model through a combination of architectural pattern, Terraform module standards, and Azure Policy auditing.

System-Assigned vs User-Assigned Managed Identity:

| Type | Use Case | Lifecycle | This Architecture |
|---|---|---|---|
| System-Assigned | Single resource authentication | Tied to resource lifecycle | Default for VM and App Service workloads |
| User-Assigned | Shared identity across multiple resources | Independent lifecycle | Used for shared service authentication (monitoring agents, backup) |

Terraform Managed Identity Pattern:

hcl

# VM with system-assigned Managed Identity
resource "azurerm_linux_virtual_machine" "workload" {
  name = "vm-workload-${var.environment}"
  # ... other configuration ...

  identity {
    type = "SystemAssigned"
  }
}

# Key Vault access policy for Managed Identity
resource "azurerm_key_vault_access_policy" "vm_kv_access" {
  key_vault_id = azurerm_key_vault.secrets.id
  tenant_id    = data.azurerm_client_config.current.tenant_id
  object_id    = azurerm_linux_virtual_machine.workload.identity[0].principal_id

  secret_permissions = ["Get", "List"]   # Minimum required permissions only
}
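The user-assigned case from the table above could look like the following sketch. It is an illustration under assumptions, not part of the original design: the identity name is hypothetical, and the role assignment assumes the Key Vault uses Azure RBAC authorisation rather than access policies.

```hcl
# Shared identity with its own lifecycle, independent of any single resource.
resource "azurerm_user_assigned_identity" "monitoring" {
  name                = "id-monitoring-${var.environment}"   # hypothetical name
  resource_group_name = var.resource_group_name
  location            = var.location
}

# Assumes the vault has RBAC authorisation enabled instead of access policies.
resource "azurerm_role_assignment" "monitoring_kv_reader" {
  scope                = azurerm_key_vault.secrets.id
  role_definition_name = "Key Vault Secrets User"   # Built-in role: read secret contents
  principal_id         = azurerm_user_assigned_identity.monitoring.principal_id
}

# Attach to a workload with:
#   identity {
#     type         = "UserAssigned"
#     identity_ids = [azurerm_user_assigned_identity.monitoring.id]
#   }
```

Because the identity outlives the resources it is attached to, permissions granted to it do not need to be recreated when a VM is rebuilt.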

No Service Principal Credentials in Code — Enforced Through Module Standards: All Terraform modules enforce Managed Identity authentication — any module accepting service principal client secret parameters generates a Checkov finding that blocks pipeline execution.

Architecture Diagram

Technologies Used


| Category | Technologies |
|---|---|
| Network Segmentation | Application Security Groups (ASGs), NSGs, Azure CNI |
| Secure Administrative Access | Azure Bastion (Standard SKU) |
| Identity & Access | Microsoft Entra ID, Managed Identity, Azure RBAC |
| Security Scanning | tfsec, Checkov, terrascan, Microsoft Defender for DevOps |
| Infrastructure as Code | Terraform |
| Policy-as-Code | Azure Policy (Deny, Audit, DeployIfNotExists), Policy Initiatives |
| CI/CD | Azure DevOps, GitHub Actions, YAML Pipelines |
| Monitoring | Azure Monitor, Log Analytics |
| Compliance Frameworks | CIS Azure Benchmark, NIST SP 800-53, Zero Trust Architecture (NIST SP 800-207) |

Key Challenges Addressed

NSG rule management at scale without IP address tracking overhead: addressed through ASG-based segmentation, which replaces IP address references with ASG membership based on workload role. NSG rules remain correct as VMs are added, moved, or retired, without rule updates.

Security misconfiguration reaching production through ungoverned IaC: addressed by integrating multi-tool security scanning into the pipeline and blocking deployment on critical findings. tfsec, Checkov, and terrascan provide overlapping coverage across different rule sets.

Azure Policy applied inconsistently across environments: addressed through Terraform-managed Policy-as-Code. Policy definitions, initiatives, and assignments are version-controlled, environment-parameterised, and deployed through the same CI/CD governance as infrastructure.

Service principal credentials proliferating across workloads: addressed through Managed Identity as the default Terraform module pattern, with Azure Policy auditing that flags service principal credentials in Key Vault. This reduces but does not yet eliminate credential-based authentication.

Security findings treated as advisory rather than blocking: addressed through pipeline stage dependencies, where the Plan stage depends on the SecurityValidation stage succeeding. Critical security findings fail the pipeline, which prevents plan execution and deployment.

Design Decisions & Rationale

ASGs over IP-Based NSG Rules for Workload Segmentation: IP-based NSG rules create operational debt that compounds with environment complexity. As workloads scale, IPs change, and subnets are reorganised, IP-based rules require continuous maintenance to remain accurate. ASGs abstract segmentation from IP addresses: VMs join ASGs representing their role, and NSG rules govern ASG-to-ASG communication. New VMs automatically inherit the correct security policy through ASG membership without any NSG rule modification.

Multi-Tool Security Scanning over Single Scanner: No single security scanning tool has complete rule coverage; tfsec, Checkov, and terrascan each identify findings that the others miss. Overlapping but distinct rule sets provide more comprehensive misconfiguration detection than any single tool. The operational overhead of maintaining multiple scanner configurations is justified by the improved detection coverage, particularly for compliance-sensitive environments where missed findings carry regulatory consequences.

Policy-as-Code over Portal-Configured Azure Policy: Azure Policy configured manually through the portal is not version-controlled, so there is no reviewable change history showing who created a policy, when it was modified, or what it previously contained. Terraform-managed Policy-as-Code applies the same engineering discipline to compliance controls as to infrastructure: version history, code review, deployment pipeline governance, and consistent application across environments.

Deny Effect for Critical Controls, Audit for Softer Controls: The Azure Policy Deny effect prevents non-compliant resources from being deployed, which is appropriate for controls where non-compliance represents a security risk that must never reach production (public IP on VMs, missing TLS minimum). The Audit effect reports non-compliance without blocking deployment, which is appropriate for controls where non-compliance is a governance concern but not an immediate security risk. Using Deny excessively creates operational friction; using Audit for security-critical controls creates security gaps.

Azure Bastion Standard over Basic SKU: The Bastion Basic SKU provides browser-based access only, which limits administrative workflows for engineers who need native RDP/SSH client capabilities. Standard SKU tunnelling enables native client support, providing full RDP/SSH client functionality through the Bastion tunnel without public IP exposure. The cost premium for the Standard SKU is justified by the operational capability improvement for engineering teams with native client tooling dependencies.

Trade-offs & Design Constraints

ASG Regional Limitation: ASGs are regional resources, so an ASG in East US cannot be referenced in NSG rules applied to resources in West Europe. Multi-region architectures require separate ASG definitions per region, with identical naming conventions to maintain rule consistency. Terraform modules must account for regional ASG replication when designing multi-region deployments.
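Per-region ASG replication with a consistent naming convention can be expressed with for_each. The sketch below is a minimal illustration: the regions variable, its default values, and the asg-web naming pattern are hypothetical and not taken from this design.

```hcl
variable "regions" {
  type        = set(string)
  description = "Regions that each need their own copy of the ASG (hypothetical values)."
  default     = ["eastus", "westeurope"]
}

# One ASG per region, named identically apart from the region suffix.
resource "azurerm_application_security_group" "web" {
  for_each = var.regions

  name                = "asg-web-${each.key}"
  resource_group_name = var.resource_group_name
  location            = each.key
}
```

NSG rules in a given region then reference the matching instance, for example azurerm_application_security_group.web["westeurope"].id, so the rule definitions stay identical across regions.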

Security Scanner False Positive Management Overhead: Running three security scanners generates significantly more findings than a single scanner, including more false positives that require suppression management. Without a systematic suppression review process, suppression lists grow unchecked until engineers disable findings rather than address them. Suppression governance (justification comments, code review approval, and periodic suppression audits) is an ongoing operational requirement for multi-scanner DevSecOps pipelines.

Policy DeployIfNotExists Remediation Timing: DeployIfNotExists policies deploy remediation resources after the primary resource is created, with a processing delay. Infrastructure deployed through Terraform may complete successfully before Policy remediation runs, creating a window where resources exist without required configurations. Terraform should explicitly configure Policy-remediable settings (diagnostic settings, tag values) rather than relying on Policy remediation for Terraform-managed resources; Policy remediation is then reserved for resources deployed outside IaC governance.
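Where Policy remediation is still relied on for resources created outside IaC, the assignment's system-assigned identity needs role assignments of its own, and Terraform does not create them implicitly. The sketch below shows one way to grant them. The two role names are assumptions chosen for a diagnostic-settings policy; the authoritative list is the roleDefinitionIds in the policy definition being remediated.

```hcl
# Roles are assumptions for a diagnostic-settings policy; match them to the
# roleDefinitionIds declared in the policy definition being remediated.
resource "azurerm_role_assignment" "policy_remediation" {
  for_each = toset(["Monitoring Contributor", "Log Analytics Contributor"])

  scope                = data.azurerm_subscription.current.id
  role_definition_name = each.key
  principal_id         = azurerm_subscription_policy_assignment.zero_trust_baseline.identity[0].principal_id
}
```

Without these role assignments, a DeployIfNotExists remediation deployment has no permission to create the missing configuration.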

Azure Policy Deny Effect and Terraform Plan Accuracy: terraform plan does not evaluate Azure Policy. A Deny effect rejects the request at deployment time, so a plan against an environment with Deny policies can show resource creation that fails during apply when the resource violates a policy. Teams must understand that plan success does not guarantee apply success in Policy-governed environments.
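One partial mitigation is to mirror simple Deny rules as Terraform input validation, so that the most common violations fail at plan time. This is Terraform validation, not Azure Policy evaluation, and the sketch below is only an illustration: the allowed-region list is hypothetical and has to be kept in step with the region policy by hand.

```hcl
variable "location" {
  type        = string
  description = "Azure region for the deployment."

  validation {
    # Hypothetical allow-list mirroring the "Restrict allowed Azure regions" policy.
    condition     = contains(["uksouth", "ukwest"], var.location)
    error_message = "The region is not in the allowed list enforced by Azure Policy."
  }
}
```

This duplicates policy logic in two places, so it suits a small number of stable rules; anything more complex still needs explicit testing against the policy-governed environment.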

Bastion Standard SKU Cost: The Azure Bastion Standard SKU has an hourly charge regardless of active sessions. For environments with infrequent administrative activity, Bastion represents a significant per-hour cost for a service used occasionally. Bastion cannot currently be stopped and started; it runs continuously once deployed. Cost modelling against actual administrative session frequency should validate the Standard SKU against the Basic SKU or jumpbox VM alternatives.

Projected Outcomes

The architecture is designed to deliver the following security and operational outcomes in a production enterprise environment:

  • ASG-based network segmentation scaling with workload growth without IP address management overhead or stale rule accumulation

  • Security misconfiguration prevention through pipeline-blocking tfsec, Checkov, and terrascan scanning before any infrastructure reaches deployment

  • Consistent compliance enforcement across all environments through Terraform-managed Azure Policy deployed through the same CI/CD governance as infrastructure

  • Managed Identity as the default authentication model — no service principal credentials in workload configurations

  • Complete administrative activity audit trail through Azure Bastion session logging without public IP or management port exposure

  • Shift-left security feedback through pre-commit hooks providing developer-workstation security scanning before code is committed

  • Policy compliance state continuously reportable on demand — not requiring manual point-in-time evidence collection

Future Evolution

  • OPA/Conftest integration for Terraform plan-time policy validation — evaluating planned resource configurations against governance rules before apply, closing the plan-to-apply policy evaluation gap

  • Automated drift detection through scheduled Terraform plan runs detecting out-of-band resource modifications that bypass IaC governance

  • Microsoft Defender for DevOps dashboard integration providing unified security posture visibility across pipeline scan results, Azure Policy compliance, and Defender for Cloud recommendations

  • Continuous compliance validation pipelines running compliance scans against deployed infrastructure state rather than only pre-deployment code

  • AI-assisted misconfiguration detection through GitHub Copilot or similar tooling providing real-time IaC security guidance during development

  • Multi-cloud ASG equivalent governance — extending workload-centric segmentation patterns to AWS Security Groups and GCP firewall tags through unified Terraform module standards

  • Automated policy exemption lifecycle management — tracking, reviewing, and expiring Azure Policy exemptions through IaC-governed workflows
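As a starting point for the exemption lifecycle item above, exemptions can already be declared in Terraform with an expiry date, which puts them through the same code review as the policies themselves. The sketch below is a hypothetical example: the exemption name, date, and description are illustrative, and narrower scopes (resource group or single resource) have their own equivalent resource types.

```hcl
resource "azurerm_subscription_policy_exemption" "legacy_waiver" {
  name                 = "waiver-legacy-public-ip"   # hypothetical name
  subscription_id      = data.azurerm_subscription.current.id
  policy_assignment_id = azurerm_subscription_policy_assignment.zero_trust_baseline.id
  exemption_category   = "Waiver"                    # or "Mitigated"
  expires_on           = "2026-01-31T00:00:00Z"      # example date only
  description          = "Temporary waiver pending migration; reviewed through pull request."
}
```

An expiry date in code makes a lapsed exemption visible in version history, which is the raw material an automated review workflow would need.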

Key Takeaways

  • Application Security Groups eliminate IP address management overhead for NSG rule governance — workload identity-based segmentation scales with environment complexity where IP-based rules create compounding maintenance debt

  • DevSecOps security scanning must be blocking, not advisory — advisory-only findings are systematically ignored under delivery pressure; blocking pipeline failure is the only effective enforcement model

  • Multi-tool security scanning provides meaningfully better coverage than single-tool approaches — tfsec, Checkov, and terrascan each identify distinct finding categories that justify managing multiple scanner configurations

  • Policy-as-Code through Terraform is the correct governance model for Azure Policy — portal-configured policies are ungoverned, unversioned, and inconsistently applied across environments

  • Azure Policy Deny effect does not surface in Terraform plan output — teams must test policy behaviour explicitly rather than assuming plan success predicts apply success

  • Managed Identity enforcement requires both architectural pattern and policy auditing — architecture defines the default, policy detects violations, and pipeline scanning prevents misconfigurations from reaching deployment

  • Security-as-code is not a tool selection — it is a cultural and process commitment to embedding security validation at every stage of the infrastructure delivery lifecycle

Open to discussing infrastructure architecture, cloud transformation, or high-availability system design.

Whether the objective is infrastructure modernization, operational resilience, hybrid cloud transformation, or enterprise security architecture, I am always interested in discussing complex infrastructure environments and strategic technical initiatives.

ENTERPRISE INFRASTRUCTURE ARCHITECTURE

My work focuses on ensuring service continuity, optimizing performance, and supporting large-scale infrastructure transformations across multi-site and hybrid environments.
