My Journey of Building a Terraform AI Agent — Automating Cloud Infrastructure with AI

By Anoop Kumar, Lead DevOps Consultant, Rackspace Technology

The vision: Why I built this AI agent

As a DevOps Engineer who is passionate about AI, I always seek ways to optimize and automate cloud infrastructure. Managing Terraform code manually is time-consuming, error-prone and requires constant validation. I wanted to build something that could help my team and customers deploy resources faster, smarter and with fewer mistakes.

I shared my idea with my manager, Marc Bloom, who encouraged me to explore the idea further. I also sought feedback from my colleagues, who helped me refine my approach. This led to the creation of the Terraform AI Agent, a generative AI-powered tool that generates, validates and deploys Terraform code in Microsoft® Azure® automatically.

The journey: From idea to execution

Two months ago, I started researching how to automate Terraform with AI. Initially, my goal was simple: Generate Terraform code automatically and save it into .tf files.

The challenge: AI hallucination and duplicate code

However, I soon faced a major issue. AI was generating the same code repeatedly for similar requests, leading to duplicate resources in Azure. This could create conflicts and unnecessary costs.

The solution: Smart code generation and validation

To fix this, I implemented a function that checks existing Terraform infrastructure before generating new code. This helps ensure:

  • No duplicate resources are created
  • Unique and necessary code is generated
  • Custom modules and templates can be used to meet compliance requirements
  • AI hallucinations (incorrect code generation) are eliminated

After solving the code generation problem, I focused on validating and deploying the Terraform code seamlessly

 

How the Terraform AI agent works

Step 1: Enter infrastructure request

Instead of manually writing Terraform code, users enter their infrastructure request in a simple prompt, including:

  • Example: “Create a resource group named aiagent-terraform-rg in uksouth”
  • The AI instantly generates Terraform code while checking for existing infrastructure to prevent duplicates
  • Supports custom modules and templates to ensure compliance

Step 2: Validate and auto-fix code

Before deployment, the AI performs multiple checks:

  • Runs terraform fmt to format the code
  • Uses terraform validate to ensure correctness
  • Applies TFLint to detect security misconfigurations
  • Auto-fixes errors to maintain best practices

Step 3: Deploy via GitHub Actions

With a single click on “Save & Deploy,” the Terraform AI agent pushes validated Terraform code to GitHub and triggers GitHub Actions, which run:

  • terraform init
  • terraform validate
  • terraform plan
  • terraform apply

It then deploys infrastructure in Azure and stores the Terraform state in Blob Storage.

 

Terraform AI agent architecture and workflow

GitHub repository: https://github.com/anoopkum/terraform-automation

Step-by-step implementation

The heart of our solution is the AI agent that converts natural language into Terraform code. Looking at the repository, we can see how this is implemented in aiagent.py.

Initially, I use OpenAI’s GPT-4o model. But I later switched to O3-mini, which is good at reasoning/code to convert natural language prompts into structured Terraform code.

User interface workflow

UI has three main action buttons that guide users through the infrastructure creation process:

1. Generate Terraform code button

When users enter their infrastructure requirements as a natural-language prompt, they can click the “Generate Terraform Code” button. This triggers the AI agent to:

  • Process the natural language description
  • Convert it into properly structured Terraform code
  • Display the generated code for review

The generated code appears in a code editor interface where users can review it and make any desired adjustments.

Terraform AI Agent Pic 2

2. Validate Terraform code button

After generating the code, users can click the “Validate Terraform Code” button to initiate the validation process, which triggers:

  • Code formatting with terraform fmt
  • Syntax checking with terraform validate
  • Best practices verification with tflint

Green checkmarks indicate successful validations, while red alerts highlight areas needing attention.

Terraform AI Agent Pic 3

3. Save & deploy button:

Once the code passes validation, users can click the “Save & Deploy” button to initiate the deployment process, which: 

  • Saves the code in a terraform file (main.tf)
  • Creates a new pull request in the GitHub repository
  • Triggers the GitHub Actions workflow
  • Automates the approval and merge process
  • Deploys the infrastructure to Azure

The interface provides real-time status updates during the deployment process, showing the progress from plan generation to resource creation.

Terraform AI Agent Pic 4

Terraform state management

A critical aspect of Terraform infrastructure management is the state file. In our architecture, the Terraform state is stored securely in Azure Blob Storage.

The GitHub Actions workflow initializes Terraform with backend configuration that points to the Azure storage account and container. The storage account access key is securely fetched from Azure. 

Key vault during deployment

Terraform AI Agent Pic 5
Terraform AI Agent Pic 6

Frontend implementation

The frontend interface is implemented using a modern React application. Let’s look at how the core user interface component is structured:

UI implementation (streamlit_app.py)

This Streamlit app implements the step-by-step workflow with the three main buttons and manages the entire process from prompt input to deployment.

View the live demo of Terraform AI agent here
 

Business impact

  • Reduces deployment time from hours to minutes
  • Eliminates human errors with auto-validation and fixes
  • Ensures compliance with pre-configured modules and security checks
  • Improves efficiency for DevOps and cloud engineers

 

Future enhancements

Currently, the Terraform AI agent supports resource creation. The next phase will include:

  • Modify and delete capabilities for the full infrastructure lifecycle management
  • Support for additional cloud platforms beyond Azure

 

Tech stack and AI models

  • Azure OpenAI (GPT-4o/O3-mini) for code generation
  • Terraform for infrastructure as code
  • GitHub Actions for CI/CD automation
  • Python (FastAPI, Streamlit) for AI agent logic and UI
  • GitHub as a repository
  • Azure cloud

 

Check out the terraform-automation repository to get your hands dirty and learn to create the AI agent.

Learn more about our cloud automation and AI services