A client recently asked me to set up a GPT-4.1 deployment in Azure using Terraform—only to discover that the official documentation barely mentions the OpenAI endpoint or how to call it via the Responses API.
In fact, most examples focus on Chat Completions, which left me scrambling to figure out how to wire up `/openai/v1/responses`
against a custom deployment. I’ll walk through exactly what I did, including the Terraform code, a minimal Python test, and tips on region checks and deployment naming.
Prerequisites
- Terraform ≥ 1.5.0 installed and configured.
- An existing Azure Key Vault (we’ll store the OpenAI key there).
- An Azure Resource Group where you have write permissions.
- The Terraform AzureRM provider configured (service principal or managed identity).
- Python ≥ 3.8 and the `openai` package (`pip install openai`).
Checking Your Region
Before you start, confirm that your target region supports both the Azure AI Services (OpenAI) resource and the desired model for the Responses API (GPT-4.1). When I tried to spin this up, I discovered that not every region offers the “DataZoneStandard” SKU or that specific model version. See: https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/responses?tabs=python-secure
Terraform Configuration
Below is the Terraform code I used to:
- Create an Azure AI Services account
- Deploy GPT-4.1 as a cognitive deployment (named `gpt41`)
- Store the AI Services primary key in Key Vault
```hcl
#-----------------------------------------
# 1. AI Services Account
#-----------------------------------------
resource "azurerm_ai_services" "ai_services" {
  name                  = "ai-services"
  location              = "swedencentral"
  resource_group_name   = azurerm_resource_group.rg.name
  sku_name              = "DataZoneStandard" # account-level SKU; note that some provider versions only accept values like "S0" here
  custom_subdomain_name = "demo-ai-custom-subdomain"

  lifecycle {
    prevent_destroy = true
  }
}

#-----------------------------------------
# 2. Cognitive Deployment for GPT-4.1
#-----------------------------------------
resource "azurerm_cognitive_deployment" "gpt41" {
  name                 = "gpt41"
  cognitive_account_id = azurerm_ai_services.ai_services.id

  model {
    format  = "OpenAI"
    name    = "gpt-4.1"
    version = "2025-04-14"
  }

  sku {
    name     = "DataZoneStandard"
    capacity = 300
  }

  version_upgrade_option = "OnceCurrentVersionExpired"

  lifecycle {
    prevent_destroy = true
  }
}

#-----------------------------------------
# 3. Store Primary Access Key in Key Vault
#-----------------------------------------
resource "azurerm_key_vault_secret" "ai_key" {
  name         = "AiServices--Key"
  value        = azurerm_ai_services.ai_services.primary_access_key
  key_vault_id = azurerm_key_vault.kv.id
}
```
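To avoid copy/pasting from the Portal later, you can also expose the base URL and deployment name as Terraform outputs. This is a small addition of my own (the output names are arbitrary); it derives the URL from the custom subdomain rather than relying on any endpoint attribute:

```hcl
#-----------------------------------------
# Optional: outputs for client configuration
#-----------------------------------------
output "openai_v1_base_url" {
  description = "Base URL for the v1 API surface, derived from the custom subdomain"
  value       = "https://${azurerm_ai_services.ai_services.custom_subdomain_name}.openai.azure.com/openai/v1/"
}

output "gpt41_deployment_name" {
  description = "Pass this as the model parameter in client code"
  value       = azurerm_cognitive_deployment.gpt41.name
}
```

After `terraform apply`, `terraform output openai_v1_base_url` gives you the exact string to paste into your client.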
3.1. azurerm_ai_services
- `location`: Must match a region that supports your desired model.
- `sku_name`: I chose `DataZoneStandard`, but there are other choices depending on your needs: https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models?tabs=global-standard%2Cstandard-chat-completions#model-summary-table-and-region-availability
- `custom_subdomain_name`: Defines the subdomain (`https://demo-ai-custom-subdomain.openai.azure.com`), which you’ll need to call `/openai/v1/responses` later. If you omit this, Azure generates a random subdomain that’s harder to remember.
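As a quick sanity check, the full Responses URL is derivable from the subdomain alone. A throwaway helper of my own (not part of the deployment) that just composes the string:

```python
def responses_url(subdomain: str) -> str:
    """Build the Responses API URL for a given custom subdomain.

    Illustrative helper only; /openai/v1/ is the preview v1 API
    surface used later in this article.
    """
    return f"https://{subdomain}.openai.azure.com/openai/v1/responses"

print(responses_url("demo-ai-custom-subdomain"))
```

Because the subdomain is fixed in Terraform, this URL never changes between rebuilds of the resource.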
3.2. azurerm_cognitive_deployment
- `name = "gpt41"`: This is your deployment name. It must match exactly what you pass as `model=...` in your client code.
- `model { format, name, version }`: `format = "OpenAI"` tells Azure we’re using an OpenAI-compatible model. `name = "gpt-4.1"`, `version = "2025-04-14"`. Pin to a specific version so you don’t get unintended updates.
- `sku { name, capacity }`: `name = "DataZoneStandard"`, `capacity = 300`. Adjust based on your performance/throughput needs.
Deploying with Terraform
- Initialize Terraform: `terraform init`
- Review the plan: `terraform plan`
- Apply the plan: `terraform apply`
- Verify in the Portal:
  - Go to Resource Groups → your-rg → ai-services.
  - Under Deployments, confirm `gpt41` appears.
  - Under Keys and Endpoint, copy the Primary Access Key and note the endpoint URL (e.g. `https://demo-ai-custom-subdomain.openai.azure.com`).
Minimal Test Script (Python)
With Terraform done, let’s verify via the Responses API. Azure’s docs mostly cover Chat Completions, but you can hit `/openai/v1/responses` directly. Create a simple Python script (`test_gpt.py`):
```python
import os
from openai import OpenAI

# Retrieve the key from an environment variable
subscription_key = os.getenv("AZURE_OPENAI_API_KEY")

client = OpenAI(
    api_key=subscription_key,
    base_url="https://demo-ai-custom-subdomain.openai.azure.com/openai/v1/",
    default_query={"api-version": "preview"},
)

response = client.responses.create(
    model="gpt41",  # the deployment name from Terraform, not "gpt-4.1"
    input=[
        {"role": "user", "content": "This is a test."}
    ],
)

print(response.model_dump_json(indent=2))
```
- Set the environment variable to the Key Vault secret you stored (fetch it from Key Vault, or hardcode it for a quick test): `export AZURE_OPENAI_API_KEY="<primary-key-from-key-vault>"`
- Run the script: `python3 test_gpt.py`
- Expected output: a JSON object with the GPT-4.1 response. If you see an error such as `"invalid_client"`, double-check your key, endpoint, and deployment name.
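Once the call succeeds, you usually just want the text, not the whole JSON dump. The Responses API returns a list of output items; here is a minimal extraction sketch of my own (the `sample` payload below is a hand-written illustration of the shape, not captured output):

```python
# Hand-written example of a Responses API payload (illustrative only)
sample = {
    "output": [
        {
            "type": "message",
            "role": "assistant",
            "content": [
                {"type": "output_text", "text": "Hello! This is a test reply."}
            ],
        }
    ]
}

def extract_text(resp: dict) -> str:
    """Concatenate all output_text parts from a Responses API payload."""
    parts = []
    for item in resp.get("output", []):
        for content in item.get("content", []):
            if content.get("type") == "output_text":
                parts.append(content.get("text", ""))
    return "".join(parts)

print(extract_text(sample))
```

If you keep the SDK response object instead of raw JSON, recent `openai` versions also expose a `response.output_text` convenience property that does roughly the same thing.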
Why the Custom Name Matters
- Azure’s documentation mostly shows `/openai/deployments/{name}/chat/completions`. It rarely mentions `/openai/v1/responses`, which is why I hit several dead ends.
- The `custom_subdomain_name` ensures your endpoint is predictable: `https://demo-ai-custom-subdomain.openai.azure.com/openai/v1/`. If you don’t set it, Azure generates a GUID-like subdomain, and you constantly have to copy/paste from the Portal.
- The deployment name (`gpt41`) is what you pass as `model` in `client.responses.create(model=…)`. If you pass `model="gpt-4.1"`, it won’t work: Azure expects exactly the deployment name you created in Terraform.

Whenever you see `model=…` in Azure samples, it maps to your custom deployment, not the underlying model name. Always double-check that the `model` value in your client code matches exactly what you created.
Conclusion
I hope this guide saves you time—when I first tackled this, Azure’s docs were sparse on the Responses API and custom deployment naming, forcing me into a lot of trial and error. With Terraform you can:
- Provision the AI Services resource in a supported region
- Deploy GPT-4.1 under a custom name
- Store the access key securely in Key Vault
- Hit `/openai/v1/responses` with a minimal Python script
Huge thanks to Marc Rufer and Cédric Mendelin for reviewing the Terraform code and logic. Once you’ve verified your region, subdomain, and deployment name, you’ll have a rock‐solid GPT-4.1 endpoint up and running in minutes.