diff --git a/test/infra/azure/OFFLINE-DEPLOYMENT.md b/test/infra/azure/OFFLINE-DEPLOYMENT.md new file mode 100644 index 0000000000..07d6355dee --- /dev/null +++ b/test/infra/azure/OFFLINE-DEPLOYMENT.md @@ -0,0 +1,272 @@ +# Offline/Air-gapped Deployment Guide for Radius on AKS + +This guide provides detailed instructions for deploying Radius in a completely offline/air-gapped environment where there is no internet connectivity. + +## Architecture Overview + +The offline deployment creates: + +- **Private AKS Cluster**: No public endpoints, API server accessible only through private network +- **Private Azure Container Registry (ACR)**: For hosting all required container images +- **Virtual Network**: Isolated network environment with controlled outbound access +- **NAT Gateway**: Optional controlled outbound connectivity for critical updates +- **Private DNS Zones**: For internal name resolution + +## Pre-deployment Preparation + +### 1. Container Image Preparation + +Before deploying in an offline environment, you must prepare all required container images: + +#### Required Images for Radius + +```bash +# Core Radius images (update versions as needed) +radius/applications-rp:latest +radius/ucp:latest +radius/deployment-engine:latest + +# Kubernetes dependencies +jetstack/cert-manager-controller:v1.12.0 +jetstack/cert-manager-webhook:v1.12.0 +jetstack/cert-manager-cainjector:v1.12.0 +azure/aad-pod-identity/mic:v1.8.15 +azure/aad-pod-identity/nmi:v1.8.15 + +# Dapr images (if enabled) +daprio/dapr:1.11.2 +daprio/dapr-placement-server:1.11.2 +daprio/dapr-sentry:1.11.2 +daprio/dapr-sidecar-injector:1.11.2 +``` + +#### Image Transfer Process + +1. 
**From a connected environment**, pull and save images: + +```bash +# Create a directory for images +mkdir -p offline-images + +# Pull and save each image +docker pull jetstack/cert-manager-controller:v1.12.0 +docker save jetstack/cert-manager-controller:v1.12.0 -o offline-images/cert-manager-controller.tar + +# Repeat for all required images... +``` + +1. **Transfer images** to your offline environment using secure media + +1. **In the offline environment**, load images into your private registry: + +```bash +# Load images from files +docker load -i offline-images/cert-manager-controller.tar + +# Tag for your private registry +docker tag jetstack/cert-manager-controller:v1.12.0 yourregistry.azurecr.io/cert-manager-controller:v1.12.0 + +# Push to private registry (must be done before full isolation) +docker push yourregistry.azurecr.io/cert-manager-controller:v1.12.0 +``` + +## Deployment Steps + +### 1. Network Connectivity Planning + +Before deployment, plan your network connectivity: + +- **Management Access**: How will you access the private cluster? (Bastion host, VPN, ExpressRoute) +- **Outbound Connectivity**: Do you need any outbound access? (NAT Gateway vs. completely isolated) +- **DNS Resolution**: How will internal DNS be handled? + +### 2. Deploy Infrastructure + +```bash +# Deploy with offline-specific parameters +az deployment group create --resource-group [Resource Group Name] --template-file main.bicep \ + --parameters \ + privateClusterEnabled=true \ + enableNatGateway=true \ + installKubernetesDependencies=false \ + grafanaEnabled=false \ + virtualNetworkAddressPrefix='10.0.0.0/8' \ + aksSubnetAddressPrefix='10.240.0.0/16' \ + privateEndpointSubnetAddressPrefix='10.241.0.0/24' +``` + +**Note**: For custom VNet deployments, Azure AKS requires either `loadBalancer` or `userDefinedRouting` outbound types. 
The template automatically configures: + +- `userDefinedRouting` when `enableNatGateway=true` (creates empty route table, NAT Gateway handles routing) +- `loadBalancer` when `enableNatGateway=false` (uses Azure Load Balancer) + +When using `userDefinedRouting`, an empty route table is created and associated with the AKS subnet. The NAT Gateway at the subnet level handles the actual outbound traffic routing. This approach provides controlled outbound connectivity while satisfying AKS requirements. + +Key parameters for offline deployment: + +- `enableNatGateway=true/false`: Controls outbound connectivity method + - `true`: Creates NAT Gateway for controlled outbound access (recommended for partial connectivity) + - `false`: Uses the `loadBalancer` outbound type (still allows egress through the load balancer's public IP, so full isolation needs additional NSG or firewall rules) +- `installKubernetesDependencies=false`: Skip online dependency installation +- `grafanaEnabled=false`: Reduce complexity in offline environment + +### 3. Post-Deployment Configuration + +#### Access the Private Cluster + +1. **Set up management connectivity** (choose one): + - Deploy a jump box VM in the same VNet + - Configure VPN Gateway for remote access + - Use Azure Bastion for secure access + +2. **Configure kubectl**: + +```bash +# From management host with VNet access +az aks get-credentials --resource-group [RG] --name [AKS] --admin +kubectl get nodes +``` + +#### Manual Installation of Dependencies + +Because `installKubernetesDependencies=false` skips the bootstrap script, install the following manually: + +1. **cert-manager** using images from your private ACR: + +```yaml +# cert-manager-deployment.yaml +apiVersion: apps/v1 +kind: Deployment +metadata: + name: cert-manager + namespace: cert-manager +spec: + replicas: 1 + selector: + matchLabels: + app: cert-manager + template: + metadata: + labels: + app: cert-manager + spec: + containers: + - name: cert-manager + image: yourregistry.azurecr.io/cert-manager-controller:v1.12.0 + # ... rest of configuration +``` + +1. 
**Configure image pull secrets** if needed: + +```bash +kubectl create secret docker-registry acr-secret \ + --docker-server=yourregistry.azurecr.io \ + --docker-username=[ACR-USERNAME] \ + --docker-password=[ACR-PASSWORD] \ + --namespace=cert-manager +``` + +## Monitoring and Maintenance + +### Health Checks + +Regular health checks for offline environments: + +```bash +# Check cluster health +kubectl get nodes +kubectl get pods --all-namespaces + +# Check ACR connectivity +kubectl run test-acr --image=yourregistry.azurecr.io/test:latest --rm -it + +# Check DNS resolution +kubectl run -it --rm debug --image=yourregistry.azurecr.io/busybox --restart=Never -- nslookup kubernetes.default +``` + +### Update Process + +For updates in offline environments: + +1. **Test updates** in a connected environment first +2. **Prepare new images** following the same offline process +3. **Transfer and load** new images +4. **Rolling updates** using kubectl or Helm + +## Security Considerations + +### Network Security + +- All ingress/egress traffic should be monitored and controlled +- Regular security scanning of container images before import +- Network segmentation between different application tiers + +### Image Security + +- Implement image scanning pipeline before importing to offline registry +- Use signed images where possible +- Regular vulnerability assessment of stored images + +### Access Control + +- Strict RBAC policies for cluster access +- Regular rotation of certificates and credentials +- Audit logging for all administrative actions + +## Troubleshooting + +### Common Issues in Offline Environments + +1. **Image Pull Failures**: + - Verify image exists in private ACR + - Check image pull secrets + - Validate network connectivity to ACR private endpoint + +2. **DNS Resolution Issues**: + - Check private DNS zone configuration + - Verify DNS forwarding rules + - Test with `nslookup` from within cluster + +3. 
**Certificate Issues**: + - Ensure cert-manager is properly configured for offline operation + - Check if certificate authorities are accessible + - Consider using internal CA for offline scenarios + +### Diagnostic Commands + +```bash +# Network connectivity test +kubectl run network-test --image=yourregistry.azurecr.io/busybox -it --rm -- /bin/sh + +# Check private endpoint status +az network private-endpoint list --resource-group [RG] --output table + +# ACR health check +az acr check-health --name [ACR-NAME] + +# DNS resolution test +kubectl run dns-test --image=yourregistry.azurecr.io/busybox -it --rm -- nslookup yourregistry.azurecr.io +``` + +## Maintenance Scripts + +Consider creating maintenance scripts for routine operations: + +```bash +#!/bin/bash +# check-offline-health.sh (expects ACR_NAME to be set to the registry name) + +echo "Checking AKS cluster health..." +kubectl get nodes + +echo "Checking critical pods..." +kubectl get pods -n kube-system +kubectl get pods -n cert-manager + +echo "Checking ACR connectivity..." +kubectl run acr-test --image="${ACR_NAME:?set ACR_NAME to your registry name}.azurecr.io/busybox" --rm -i --restart=Never -- echo "ACR connectivity OK" + +echo "Health check complete." +``` + +With the images pre-loaded and these checks in place, a Radius deployment can run in a completely offline environment without giving up security controls or routine maintenance. 
diff --git a/test/infra/azure/README.md b/test/infra/azure/README.md index e4fcc6afad..ae2bf3cf04 100644 --- a/test/infra/azure/README.md +++ b/test/infra/azure/README.md @@ -1,20 +1,32 @@ -# Build Radius infrastructure to Azure +# Build Radius infrastructure to Azure (Offline/Air-gapped Environment) -This directory includes the Bicep templates to deploy the following resources on Azure for running Radius: +This directory includes the Bicep templates to deploy the following resources on Azure for running Radius in an offline/air-gapped network environment: +- Virtual Network with private subnets and NAT Gateway for outbound connectivity +- Azure Container Registry (ACR) with private endpoints for container image storage - Log Analytics Workspace for log - Azure Monitor Workspace for metric -- AKS Cluster +- Private AKS Cluster + - Deployed in private subnet with no public endpoints - Installed extensions: Azure Keyvault CSI driver, Dapr -- Grafana dashboard -- Installed tools - - cert-manager v1.20.0 - - Azure workload identity mutating admission webhook controller v1.1.0 +- Grafana dashboard (optional) +- Network security groups for secure communication +- Private DNS zones for name resolution + +## Key Features for Offline Environment + +- **Private AKS Cluster**: API server is only accessible through private endpoints +- **Private Container Registry**: ACR with private endpoints for secure image storage +- **Network Isolation**: All components deployed in private subnets +- **Controlled Outbound Access**: Uses NAT Gateway for controlled internet access +- **Private DNS**: Custom DNS zones for internal name resolution ## Prerequisite 1. [Azure CLI](https://learn.microsoft.com/en-us/cli/azure/install-azure-cli) 2. [Azure subscription](https://azure.com) to which you have a Owner/Contributor role +3. **Network connectivity**: You'll need a connection to your Azure environment (VPN, ExpressRoute, or Azure Bastion) +4. 
**Pre-loaded container images**: For a fully offline environment, you'll need to pre-load required container images into the ACR ## Steps @@ -64,43 +76,163 @@ This directory includes the Bicep templates to deploy the following resources on 1. Deploy main.bicep: - By default, `grafanaEnabled` is false. We do not need to set any parameters unless you need Grafana dashboard. If you want to see Grafana dashboard later, you can redeploy main.bicep with `grafanaEnabled` and `grafanaAdminObjectId` later--bicep will install only Grafana dashboard with your existing cluster. + The template now includes parameters for configuring the private network environment: ```bash - az deployment group create --resource-group [Resource Group Name] --template-file main.bicep --parameters grafanaEnabled=[Grafana Dashboard Enabled] grafanaAdminObjectId='[Grafana Admin Object Id]' + az deployment group create --resource-group [Resource Group Name] --template-file main.bicep \ + --parameters \ + grafanaEnabled=[Grafana Dashboard Enabled] \ + grafanaAdminObjectId='[Grafana Admin Object Id]' \ + privateClusterEnabled=true \ + enableNatGateway=true \ + virtualNetworkAddressPrefix='10.0.0.0/8' \ + aksSubnetAddressPrefix='10.240.0.0/16' \ + privateEndpointSubnetAddressPrefix='10.241.0.0/24' ``` + Key parameters for offline deployment: + - **privateClusterEnabled**: Set to `true` for private AKS cluster (default: true) + - **enableNatGateway**: Set to `true` to enable NAT Gateway for controlled outbound connectivity (default: true) + - When `true`: Uses `userDefinedRouting` with custom NAT Gateway + - When `false`: Uses `loadBalancer` for outbound connectivity + - **virtualNetworkAddressPrefix**: Address space for the virtual network (default: 10.0.0.0/8) + - **aksSubnetAddressPrefix**: Subnet for AKS nodes (default: 10.240.0.0/16) + - **privateEndpointSubnetAddressPrefix**: Subnet for private endpoints (default: 10.241.0.0/24) - **[Grafana Dashboard Enabled]**: Set `true` if you want to see 
metrics and its dashboard with Azure managed Prometheus and Grafana dashboard. Otherwise, `false` is recommended to save the cost. - **[Grafana Admin Object Id]**: Set the object ID of the Grafana Admin user or group. To find the object id, search for the admin user or group name on [AAD Portal Overview search box](https://portal.azure.com/#view/Microsoft_AAD_IAM/ActiveDirectoryMenuBlade/~/Overview) and get the object id or run `az ad signed-in-user show` to get your own user object id. -## Monitor Radius +## Post-Deployment Steps for Offline Environment + +### 1. Access the Private AKS Cluster + +Since the AKS cluster is private, you'll need to access it from within the virtual network or through a connection that has access to the private network: -### Query Radius Logs +```bash +# If you have a VM in the same VNet or connected network +az aks get-credentials --resource-group [Resource Group Name] --name [AKS Cluster Name] --admin -1. Go to AKS cluster in your resource group on Azure Portal -1. Click AKS cluster resource -> `Monitoring` in Blade menu -> `Logs` -1. Enter the following kusto query to get the logs in 2 hours. +# Verify connectivity +kubectl get nodes +``` - ```kusto +### 2. 
Pre-load Container Images (For Fully Offline Environment) - ContainerLogV2 - | sort by TimeGenerated desc - | extend jsonEntry = parse_json(LogMessage) - | where isnotnull(jsonEntry) and TimeGenerated > ago(2hr) - | evaluate bag_unpack(jsonEntry) - | project TimeGenerated, timestamp, serviceName, hostName, severity, message, resourceId, traceId, spanId, caller, name, version, LogMessage - ``` +For a completely offline environment, you'll need to pre-load all required container images into your private ACR: + +```bash +# Set the ACR name (the acrName deployment output) +ACR_NAME="[Your ACR Name]" +az acr login --name "$ACR_NAME" + +# Import required images (examples) +az acr import --name "$ACR_NAME" --source docker.io/jetstack/cert-manager-controller:v1.12.0 --image cert-manager-controller:v1.12.0 +az acr import --name "$ACR_NAME" --source docker.io/jetstack/cert-manager-webhook:v1.12.0 --image cert-manager-webhook:v1.12.0 +az acr import --name "$ACR_NAME" --source docker.io/jetstack/cert-manager-cainjector:v1.12.0 --image cert-manager-cainjector:v1.12.0 + +# Add any other required images for your Radius applications +``` + +### 3. 
Configure kubectl for Private Cluster Access + +You may need to set up additional connectivity options: + +- **Azure Bastion**: For secure RDP/SSH access to a jump box VM +- **VPN Gateway**: For site-to-site or point-to-site connectivity +- **ExpressRoute**: For dedicated private connectivity + +## Network Architecture + +The deployment creates the following network topology: + +```text +Internet + │ + ▼ +┌─────────────┐ +│ NAT Gateway │ (Outbound only) +└─────────────┘ + │ + ▼ +┌───────────────────────────────────────┐ +│ Virtual Network │ +│ (10.0.0.0/8) │ +│ │ +│ ┌─────────────────────────────────┐ │ +│ │ AKS Subnet │ │ +│ │ (10.240.0.0/16) │ │ +│ │ │ │ +│ │ ┌─────────────────────────┐ │ │ +│ │ │ Private AKS Cluster │ │ │ +│ │ └─────────────────────────┘ │ │ +│ └─────────────────────────────────┘ │ +│ │ +│ ┌─────────────────────────────────┐ │ +│ │ Private Endpoint Subnet │ │ +│ │ (10.241.0.0/24) │ │ +│ │ │ │ +│ │ ┌─────────────────────────┐ │ │ +│ │ │ ACR Private Endpoint │ │ │ +│ │ └─────────────────────────┘ │ │ +│ └─────────────────────────────────┘ │ +└───────────────────────────────────────┘ +``` + +## Security Considerations + +This offline deployment provides enhanced security through: + +1. **Network Isolation**: AKS nodes are in private subnets with no direct internet access +2. **Private API Server**: Kubernetes API server is only accessible through private endpoints +3. **Private Container Registry**: ACR is only accessible through private endpoints +4. **Network Security Groups**: Restrict traffic between subnets +5. **Controlled Outbound Access**: Only necessary outbound traffic through NAT Gateway + +## Troubleshooting + +### Common Issues + +1. **Cannot access AKS cluster**: + - Ensure you're connecting from a network that has access to the private VNet + - Check if kubectl is configured correctly with admin credentials - > Visit to learn more about Kusto query syntax +2. 
**Pod image pull failures**: + - Verify ACR private endpoint is correctly configured + - Ensure required images are available in the private ACR + - Check if AKS has proper permissions to pull from ACR -### Use Grafana dashboard +3. **DNS resolution issues**: + - Verify private DNS zones are correctly linked to the VNet + - Check if custom DNS servers are properly configured -1. Go to Grafana dashboard resource in your resource group on Azure Portal -1. Find and browse the Grafana endpoint - > If you have no permission, go back to Grafana dashboard resource and assign `Grafana Admin` role to yourself. -1. Import [two dashboard templates](../../../grafana/). +### Useful Commands + +```bash +# Check AKS cluster connectivity +kubectl get nodes +kubectl get pods --all-namespaces + +# Check ACR connectivity +az acr check-health --name [ACR_NAME] + +# List images in ACR +az acr repository list --name [ACR_NAME] + +# Check private endpoint status +az network private-endpoint list --resource-group [Resource Group Name] +``` + +## Cost Optimization + +For development/testing environments, consider: + +- Setting `grafanaEnabled=false` to avoid Grafana costs +- Using smaller VM sizes for AKS nodes +- Implementing auto-scaling to reduce costs during idle periods +- Using Azure Spot instances for non-production workloads (if supported in your scenario) ## References -- -- +- [Azure Private AKS Cluster](https://docs.microsoft.com/en-us/azure/aks/private-clusters) +- [Azure Container Registry Private Endpoints](https://docs.microsoft.com/en-us/azure/container-registry/container-registry-private-link) +- [Azure NAT Gateway](https://docs.microsoft.com/en-us/azure/virtual-network/nat-gateway/nat-overview) +- [Azure Private DNS Zones](https://docs.microsoft.com/en-us/azure/dns/private-dns-overview) diff --git a/test/infra/azure/main.bicep b/test/infra/azure/main.bicep index a28c0a3936..35c7439294 100644 --- a/test/infra/azure/main.bicep +++ b/test/infra/azure/main.bicep @@ -47,6 
+47,8 @@ param azureMonitorWorkspaceName string = '${prefix}-azm-workspace' param azureMonitorWorkspaceLocation string = 'westus2' @description('Specifies the name of aks cluster. Default is {prefix}-aks.') +@minLength(1) +@maxLength(63) param aksClusterName string = '${prefix}-aks' @description('Enables Azure Monitoring and Grafana Dashboard. Default is false.') @@ -61,10 +63,68 @@ param grafanaDashboardName string = '${prefix}-dashboard' @description('Specifies whether to install the required tools for running Radius. Default is true.') param installKubernetesDependencies bool = true +@description('Specifies whether the AKS cluster should be private. Default is true for offline environments.') +param privateClusterEnabled bool = true + +@description('Specifies the name of the virtual network. Default is {prefix}-vnet.') +param virtualNetworkName string = '${prefix}-vnet' + +@description('Specifies the address prefix of the virtual network. Default is 10.0.0.0/8.') +param virtualNetworkAddressPrefix string = '10.0.0.0/8' + +@description('Specifies the name of the subnet for AKS nodes. Default is aks-subnet.') +param aksSubnetName string = 'aks-subnet' + +@description('Specifies the address prefix of the AKS subnet. Default is 10.240.0.0/16.') +param aksSubnetAddressPrefix string = '10.240.0.0/16' + +@description('Specifies the name of the subnet for private endpoints. Default is pe-subnet.') +param privateEndpointSubnetName string = 'pe-subnet' + +@description('Specifies the address prefix of the private endpoint subnet. Default is 10.241.0.0/24.') +param privateEndpointSubnetAddressPrefix string = '10.241.0.0/24' + +@description('Specifies the name of the Azure Container Registry. Default is {prefix}registry.') +@minLength(5) +@maxLength(50) +param acrName string = '${replace(prefix, '-', '')}registry' + +@description('Specifies whether to create NAT Gateway for outbound connectivity. 
Default is true.') +param enableNatGateway bool = true + param defaultTags object = { radius: 'infra' } +// Deploy Virtual Network for private connectivity +module virtualNetwork './modules/vnet.bicep' = { + name: virtualNetworkName + params: { + name: virtualNetworkName + location: location + addressPrefix: virtualNetworkAddressPrefix + aksSubnetName: aksSubnetName + aksSubnetAddressPrefix: aksSubnetAddressPrefix + privateEndpointSubnetName: privateEndpointSubnetName + privateEndpointSubnetAddressPrefix: privateEndpointSubnetAddressPrefix + enableNatGateway: enableNatGateway + tags: defaultTags + } +} + +// Deploy Azure Container Registry for private image storage +module containerRegistry './modules/acr.bicep' = { + name: acrName + params: { + name: acrName + location: location + sku: 'Premium' + privateEndpointSubnetId: virtualNetwork.outputs.privateEndpointSubnetId + vnetId: virtualNetwork.outputs.vnetId + tags: defaultTags + } +} + // Deploy Log Analytics Workspace for log. module logAnalyticsWorkspace './modules/loganalytics-workspace.bicep' = { name: logAnalyticsWorkspaceName @@ -110,6 +170,10 @@ module aksCluster './modules/akscluster.bicep' = { workloadIdentityEnabled: true imageCleanerEnabled: true imageCleanerIntervalHours: 24 + // Private cluster configuration + privateClusterEnabled: privateClusterEnabled + vnetSubnetId: virtualNetwork.outputs.aksSubnetId + outboundType: enableNatGateway ? 
'userDefinedRouting' : 'loadBalancer' tags: defaultTags } } @@ -170,7 +234,8 @@ module alertManagement './modules/alert-management.bicep' = if (grafanaEnabled) } // This is a workaround to get the AKS cluster resource created by aksCluster module -resource aks 'Microsoft.ContainerService/managedClusters@2023-05-01' existing = { +// Note: Accessing admin credentials may fail for private clusters during deployment +resource aks 'Microsoft.ContainerService/managedClusters@2023-10-01' existing = if (grafanaEnabled) { name: aksCluster.name } @@ -181,13 +246,13 @@ module promConfigMap './modules/ama-metrics-setting-configmap.bicep' = if (grafa kubeConfig: aks.listClusterAdminCredential().kubeconfigs[0].value } dependsOn: [ - aks, aksCluster, dataCollection, alertManagement + aks, dataCollection, alertManagement ] } // Run deployment script to bootstrap the cluster for Radius. -module deploymentScript './modules/deployment-script.bicep' = if (installKubernetesDependencies) { - name: 'deploymentScript' +module deploymentScript './modules/deployment-script-offline.bicep' = if (installKubernetesDependencies) { + name: 'offlineDeploymentScript' params: { name: 'installKubernetesDependencies' clusterName: aksCluster.outputs.name @@ -198,9 +263,22 @@ module deploymentScript './modules/deployment-script.bicep' = if (installKuberne tags: defaultTags } dependsOn: [ - aksCluster + containerRegistry ] } +module mongoDB './modules/mongodb.bicep' = { + name: 'mongodb' + params: { + name: '${prefix}-mongodb' + location: location + } +} + +output mongodbAccountID string = mongoDB.outputs.cosmosMongoAccountID output aksControlPlaneFQDN string = aksCluster.outputs.controlPlaneFQDN output grafanaDashboardFQDN string = grafanaEnabled ? 
grafanaDashboard.outputs.dashboardFQDN : '' +output acrLoginServer string = containerRegistry.outputs.loginServer +output acrName string = containerRegistry.outputs.name +output vnetId string = virtualNetwork.outputs.vnetId +output aksSubnetId string = virtualNetwork.outputs.aksSubnetId diff --git a/test/infra/azure/modules/acr.bicep b/test/infra/azure/modules/acr.bicep new file mode 100644 index 0000000000..0e7b720dc6 --- /dev/null +++ b/test/infra/azure/modules/acr.bicep @@ -0,0 +1,126 @@ +/* +Copyright 2023 The Radius Authors. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. 
+*/ + +@description('Specifies the name of the Azure Container Registry.') +param name string + +@description('Specifies the location of the Azure Container Registry.') +param location string = resourceGroup().location + +@description('Specifies the SKU of the Azure Container Registry.') +@allowed([ + 'Basic' + 'Standard' + 'Premium' +]) +param sku string = 'Premium' + +@description('Specifies the subnet ID for the private endpoint.') +param privateEndpointSubnetId string + +@description('Specifies the virtual network ID.') +param vnetId string + +@description('Specifies the resource tags.') +param tags object = {} + +// Azure Container Registry +resource containerRegistry 'Microsoft.ContainerRegistry/registries@2023-07-01' = { + name: name + location: location + tags: tags + sku: { + name: sku + } + properties: { + adminUserEnabled: false + networkRuleSet: { + defaultAction: 'Deny' + ipRules: [] + } + publicNetworkAccess: 'Disabled' + encryption: { + status: 'disabled' + } + dataEndpointEnabled: false + networkRuleBypassOptions: 'AzureServices' + } +} + +// Private DNS Zone for ACR +resource privateDnsZone 'Microsoft.Network/privateDnsZones@2020-06-01' = { + name: 'privatelink.azurecr.io' + location: 'global' + tags: tags +} + +// Link Private DNS Zone to VNet +resource privateDnsZoneVnetLink 'Microsoft.Network/privateDnsZones/virtualNetworkLinks@2020-06-01' = { + parent: privateDnsZone + name: '${name}-vnet-link' + location: 'global' + properties: { + registrationEnabled: false + virtualNetwork: { + id: vnetId + } + } +} + +// Private Endpoint for ACR +resource privateEndpoint 'Microsoft.Network/privateEndpoints@2023-06-01' = { + name: '${name}-pe' + location: location + tags: tags + properties: { + subnet: { + id: privateEndpointSubnetId + } + privateLinkServiceConnections: [ + { + name: '${name}-pe-connection' + properties: { + privateLinkServiceId: containerRegistry.id + groupIds: [ + 'registry' + ] + } + } + ] + } +} + +// Private DNS Zone Group for Private 
Endpoint +resource privateDnsZoneGroup 'Microsoft.Network/privateEndpoints/privateDnsZoneGroups@2023-06-01' = { + parent: privateEndpoint + name: 'default' + properties: { + privateDnsZoneConfigs: [ + { + name: 'acr-config' + properties: { + privateDnsZoneId: privateDnsZone.id + } + } + ] + } +} + +// Outputs +output id string = containerRegistry.id +output name string = containerRegistry.name +output loginServer string = containerRegistry.properties.loginServer +output resourceId string = containerRegistry.id diff --git a/test/infra/azure/modules/akscluster.bicep b/test/infra/azure/modules/akscluster.bicep index 9b33443009..47ba87ae2e 100644 --- a/test/infra/azure/modules/akscluster.bicep +++ b/test/infra/azure/modules/akscluster.bicep @@ -61,6 +61,12 @@ param dnsServiceIP string = '172.16.0.10' ]) param loadBalancerSku string = 'standard' +@description('Specifies whether the cluster is private or not.') +param privateClusterEnabled bool = false + +@description('Specifies the subnet resource ID for the AKS cluster.') +param vnetSubnetId string = '' + @description('Specifies outbound (egress) routing method. - loadBalancer or userDefinedRouting.') @allowed([ 'loadBalancer' @@ -373,6 +379,7 @@ resource aksCluster 'Microsoft.ContainerService/managedClusters@2023-10-01' = { nodeLabels: systemAgentPoolNodeLabels nodeTaints: systemAgentPoolNodeTaints kubeletDiskType: systemAgentPoolKubeletDiskType + vnetSubnetID: !empty(vnetSubnetId) ? vnetSubnetId : null } { name: toLower(userAgentPoolName) @@ -394,6 +401,7 @@ resource aksCluster 'Microsoft.ContainerService/managedClusters@2023-10-01' = { nodeLabels: userAgentPoolNodeLabels nodeTaints: userAgentPoolNodeTaints kubeletDiskType: userAgentPoolKubeletDiskType + vnetSubnetID: !empty(vnetSubnetId) ? 
vnetSubnetId : null } ] addonProfiles: { @@ -420,6 +428,11 @@ resource aksCluster 'Microsoft.ContainerService/managedClusters@2023-10-01' = { oidcIssuerProfile: { enabled: oidcIssuerProfileEnabled } + apiServerAccessProfile: privateClusterEnabled ? { + enablePrivateCluster: privateClusterEnabled + privateDNSZone: 'system' + enablePrivateClusterPublicFQDN: false + } : null enableRBAC: true networkProfile: { networkPlugin: networkPlugin @@ -501,5 +514,5 @@ resource daprExtension 'Microsoft.KubernetesConfiguration/extensions@2022-07-01' output id string = aksCluster.id output name string = aksCluster.name output location string = aksCluster.location -output controlPlaneFQDN string = aksCluster.properties.fqdn +output controlPlaneFQDN string = privateClusterEnabled ? aksCluster.properties.privateFQDN : aksCluster.properties.fqdn output principalIdentity string = aksCluster.identity.principalId diff --git a/test/infra/azure/modules/bootstrap-offline.sh b/test/infra/azure/modules/bootstrap-offline.sh new file mode 100644 index 0000000000..2bf42e73df --- /dev/null +++ b/test/infra/azure/modules/bootstrap-offline.sh @@ -0,0 +1,144 @@ +#!/bin/bash + +# Offline bootstrap script for Radius AKS cluster +# This script assumes all required container images are pre-loaded into the private ACR + +# Variables for offline deployment +CertManagerVersion="v1.12.0" +WorkloadIdentityVersion="v1.1.0" + +echo "Installing kubectl..." +az aks install-cli --only-show-errors + +# Get AKS credentials +echo "Getting AKS credentials..." +az aks get-credentials \ + --admin \ + --name $clusterName \ + --resource-group $resourceGroupName \ + --subscription $subscriptionId \ + --only-show-errors + +# Check if kubectl is working +echo "Verifying cluster connectivity..." +kubectl get nodes + +# Create namespace for cert-manager +echo "Creating cert-manager namespace..." 
+kubectl create namespace cert-manager --dry-run=client -o yaml | kubectl apply -f -
+
+# Install cert-manager using pre-loaded images from ACR
+# Note: This assumes cert-manager images have been pre-loaded into the ACR
+echo "Installing cert-manager from private registry..."
+cat <$AZ_SCRIPTS_OUTPUT_PATH
diff --git a/test/infra/azure/modules/datacollection.bicep b/test/infra/azure/modules/datacollection.bicep
index d96e8eadbf..e26d6f7e22 100644
--- a/test/infra/azure/modules/datacollection.bicep
+++ b/test/infra/azure/modules/datacollection.bicep
@@ -95,9 +95,6 @@ module azureMonitorMetricsDcraClusterResourceId './datacollection-dcra.bicep' =
     dcraName: dcraName
     clusterLocation: clusterLocation
   }
-  dependsOn: [
-    dce
-  ]
 }
 
 // Output
diff --git a/test/infra/azure/modules/deployment-script-offline.bicep b/test/infra/azure/modules/deployment-script-offline.bicep
new file mode 100644
index 0000000000..edecbf8f92
--- /dev/null
+++ b/test/infra/azure/modules/deployment-script-offline.bicep
@@ -0,0 +1,97 @@
+// For more information, see https://docs.microsoft.com/en-us/azure/azure-resource-manager/bicep/deployment-script-bicep
+@description('Specifies the name of the deployment script uri.')
+param name string = 'OfflineBashScript'
+
+@description('Specifies the name of the AKS cluster.')
+param clusterName string
+
+@description('Specifies the resource group name')
+param resourceGroupName string = resourceGroup().name
+
+@description('Specifies the subscription id.')
+param subscriptionId string = subscription().subscriptionId
+
+@description('Specifies the tenant id.')
+param tenantId string = subscription().tenantId
+
+@description('Specifies the current datetime')
+param utcValue string = utcNow()
+
+@description('Specifies the location.')
+param location string = resourceGroup().location
+
+@description('Specifies the resource tags.')
+param tags object
+
+// Variables
+var clusterAdminRoleDefinitionId = resourceId('Microsoft.Authorization/roleDefinitions', '0ab0b1a8-8aac-4efd-b8c2-3ee1fb270be8')
+
+// Resources
+resource aksCluster 'Microsoft.ContainerService/managedClusters@2023-05-01' existing = {
+  name: clusterName
+}
+
+resource managedIdentity 'Microsoft.ManagedIdentity/userAssignedIdentities@2023-01-31' = {
+  name: 'scriptManagedIdentity'
+  location: location
+  tags: tags
+}
+
+resource clusterAdminContributorRoleAssignment 'Microsoft.Authorization/roleAssignments@2022-04-01' = {
+  name: guid(managedIdentity.id, aksCluster.id, clusterAdminRoleDefinitionId)
+  scope: aksCluster
+  properties: {
+    roleDefinitionId: clusterAdminRoleDefinitionId
+    principalId: managedIdentity.properties.principalId
+    principalType: 'ServicePrincipal'
+  }
+}
+
+// Script
+resource deploymentScript 'Microsoft.Resources/deploymentScripts@2020-10-01' = {
+  name: name
+  location: location
+  kind: 'AzureCLI'
+  identity: {
+    type: 'UserAssigned'
+    userAssignedIdentities: {
+      '${managedIdentity.id}': {}
+    }
+  }
+  properties: {
+    forceUpdateTag: utcValue
+    azCliVersion: '2.50.0'
+    timeout: 'PT30M'
+    containerSettings: {
+      containerGroupName: '${name}-container-group'
+    }
+    environmentVariables: [
+      {
+        name: 'clusterName'
+        value: clusterName
+      }
+      {
+        name: 'resourceGroupName'
+        value: resourceGroupName
+      }
+      {
+        name: 'subscriptionId'
+        value: subscriptionId
+      }
+      {
+        name: 'tenantId'
+        value: tenantId
+      }
+    ]
+    scriptContent: loadTextContent('bootstrap-offline.sh')
+    cleanupPreference: 'OnSuccess'
+    retentionInterval: 'P1D'
+  }
+}
+
+resource log 'Microsoft.Resources/deploymentScripts/logs@2020-10-01' existing = {
+  parent: deploymentScript
+  name: 'default'
+}
+
+output log string = log.properties.log
diff --git a/test/infra/azure/modules/mongodb.bicep b/test/infra/azure/modules/mongodb.bicep
new file mode 100644
index 0000000000..c214e3d16e
--- /dev/null
+++ b/test/infra/azure/modules/mongodb.bicep
@@ -0,0 +1,27 @@
+param name string
+
+param location string = resourceGroup().location
+
+resource account 'Microsoft.DocumentDB/databaseAccounts@2020-04-01' = {
+  name: name
+  location: location
+  kind: 'MongoDB'
+  tags: {
+    radiustest: 'corerp-resources-mongodb'
+  }
+  properties: {
+    consistencyPolicy: {
+      defaultConsistencyLevel: 'Session'
+    }
+    locations: [
+      {
+        locationName: location
+        failoverPriority: 0
+        isZoneRedundant: false
+      }
+    ]
+    databaseAccountOfferType: 'Standard'
+  }
+}
+
+output cosmosMongoAccountID string = account.id
diff --git a/test/infra/azure/modules/vnet.bicep b/test/infra/azure/modules/vnet.bicep
new file mode 100644
index 0000000000..779757c459
--- /dev/null
+++ b/test/infra/azure/modules/vnet.bicep
@@ -0,0 +1,226 @@
+/*
+Copyright 2023 The Radius Authors.
+
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+*/
+
+@description('Specifies the name of the virtual network.')
+param name string
+
+@description('Specifies the location of the virtual network.')
+param location string = resourceGroup().location
+
+@description('Specifies the address prefix of the virtual network.')
+param addressPrefix string = '10.0.0.0/8'
+
+@description('Specifies the name of the subnet for AKS nodes.')
+param aksSubnetName string = 'aks-subnet'
+
+@description('Specifies the address prefix of the AKS subnet.')
+param aksSubnetAddressPrefix string = '10.240.0.0/16'
+
+@description('Specifies the name of the subnet for private endpoints.')
+param privateEndpointSubnetName string = 'pe-subnet'
+
+@description('Specifies the address prefix of the private endpoint subnet.')
+param privateEndpointSubnetAddressPrefix string = '10.241.0.0/24'
+
+@description('Specifies whether to create NAT Gateway for outbound connectivity.')
+param enableNatGateway bool = true
+
+@description('Specifies the resource tags.')
+param tags object = {}
+
+// NAT Gateway Public IP
+resource natGatewayPublicIP 'Microsoft.Network/publicIPAddresses@2023-06-01' = if (enableNatGateway) {
+  name: '${name}-natgw-pip'
+  location: location
+  tags: tags
+  sku: {
+    name: 'Standard'
+    tier: 'Regional'
+  }
+  properties: {
+    publicIPAllocationMethod: 'Static'
+    publicIPAddressVersion: 'IPv4'
+    idleTimeoutInMinutes: 4
+  }
+}
+
+// NAT Gateway
+resource natGateway 'Microsoft.Network/natGateways@2023-06-01' = if (enableNatGateway) {
+  name: '${name}-natgw'
+  location: location
+  tags: tags
+  sku: {
+    name: 'Standard'
+  }
+  properties: {
+    publicIpAddresses: [
+      {
+        id: natGatewayPublicIP.id
+      }
+    ]
+    idleTimeoutInMinutes: 4
+  }
+}
+
+// Route Table for AKS subnet (required for userDefinedRouting)
+resource aksRouteTable 'Microsoft.Network/routeTables@2023-06-01' = if (enableNatGateway) {
+  name: '${aksSubnetName}-rt'
+  location: location
+  tags: tags
+  properties: {
+    routes: []
+    disableBgpRoutePropagation: false
+  }
+}
+
+// Network Security Group for AKS subnet
+resource aksNsg 'Microsoft.Network/networkSecurityGroups@2023-06-01' = {
+  name: '${aksSubnetName}-nsg'
+  location: location
+  tags: tags
+  properties: {
+    securityRules: [
+      {
+        name: 'AllowAKSInternalTraffic'
+        properties: {
+          protocol: '*'
+          sourcePortRange: '*'
+          destinationPortRange: '*'
+          sourceAddressPrefix: aksSubnetAddressPrefix
+          destinationAddressPrefix: aksSubnetAddressPrefix
+          access: 'Allow'
+          priority: 100
+          direction: 'Inbound'
+        }
+      }
+      {
+        name: 'AllowAzureLoadBalancer'
+        properties: {
+          protocol: '*'
+          sourcePortRange: '*'
+          destinationPortRange: '*'
+          sourceAddressPrefix: 'AzureLoadBalancer'
+          destinationAddressPrefix: '*'
+          access: 'Allow'
+          priority: 200
+          direction: 'Inbound'
+        }
+      }
+      {
+        name: 'DenyAllInbound'
+        properties: {
+          protocol: '*'
+          sourcePortRange: '*'
+          destinationPortRange: '*'
+          sourceAddressPrefix: '*'
+          destinationAddressPrefix: '*'
+          access: 'Deny'
+          priority: 1000
+          direction: 'Inbound'
+        }
+      }
+    ]
+  }
+}
+
+// Network Security Group for Private Endpoints subnet
+resource peNsg 'Microsoft.Network/networkSecurityGroups@2023-06-01' = {
+  name: '${privateEndpointSubnetName}-nsg'
+  location: location
+  tags: tags
+  properties: {
+    securityRules: [
+      {
+        name: 'AllowVnetInbound'
+        properties: {
+          protocol: '*'
+          sourcePortRange: '*'
+          destinationPortRange: '*'
+          sourceAddressPrefix: 'VirtualNetwork'
+          destinationAddressPrefix: 'VirtualNetwork'
+          access: 'Allow'
+          priority: 100
+          direction: 'Inbound'
+        }
+      }
+      {
+        name: 'DenyAllInbound'
+        properties: {
+          protocol: '*'
+          sourcePortRange: '*'
+          destinationPortRange: '*'
+          sourceAddressPrefix: '*'
+          destinationAddressPrefix: '*'
+          access: 'Deny'
+          priority: 1000
+          direction: 'Inbound'
+        }
+      }
+    ]
+  }
+}
+
+// Virtual Network
+resource virtualNetwork 'Microsoft.Network/virtualNetworks@2023-06-01' = {
+  name: name
+  location: location
+  tags: tags
+  properties: {
+    addressSpace: {
+      addressPrefixes: [
+        addressPrefix
+      ]
+    }
+    subnets: [
+      {
+        name: aksSubnetName
+        properties: {
+          addressPrefix: aksSubnetAddressPrefix
+          networkSecurityGroup: {
+            id: aksNsg.id
+          }
+          routeTable: enableNatGateway ? {
+            id: aksRouteTable.id
+          } : null
+          natGateway: enableNatGateway ? {
+            id: natGateway.id
+          } : null
+          privateEndpointNetworkPolicies: 'Disabled'
+          privateLinkServiceNetworkPolicies: 'Enabled'
+        }
+      }
+      {
+        name: privateEndpointSubnetName
+        properties: {
+          addressPrefix: privateEndpointSubnetAddressPrefix
+          networkSecurityGroup: {
+            id: peNsg.id
+          }
+          privateEndpointNetworkPolicies: 'Disabled'
+          privateLinkServiceNetworkPolicies: 'Disabled'
+        }
+      }
+    ]
+  }
+}
+
+// Outputs
+output vnetId string = virtualNetwork.id
+output vnetName string = virtualNetwork.name
+output aksSubnetId string = resourceId('Microsoft.Network/virtualNetworks/subnets', virtualNetwork.name, aksSubnetName)
+output privateEndpointSubnetId string = resourceId('Microsoft.Network/virtualNetworks/subnets', virtualNetwork.name, privateEndpointSubnetName)
+output natGatewayId string = enableNatGateway ? natGateway.id : ''
+output routeTableId string = enableNatGateway ? aksRouteTable.id : ''