Terraform and Azure DevOps

With this and the upcoming posts, we continue the series on managing Azure infrastructure with Terraform. After covering basic Infrastructure as Code (IaC) concepts, options for dynamic configuration, and working with modules, we take another step towards automation – implementation within deployment pipelines. In this post, we look at what is generally needed as a foundation for developing an infrastructure pipeline.

In the DevOps/pipeline space there are several options, with two prominent representatives in particular: Azure DevOps and GitHub Actions. The principles are very similar and transferable; this post focuses on Azure DevOps. If you are interested in an implementation with GitHub Actions, you will find a good starting point here[1].

Azure DevOps is a popular Microsoft platform for automating software builds and deployments. With Azure DevOps, IT teams can deploy their applications quickly, securely, and reliably by setting up continuous integration and continuous deployment (CI/CD) pipelines. Its feature set is very broad and covers far more than just the deployment of infrastructure. Here, however, we stick to the Terraform-specific topics; task management, branching strategies for repositories, package management, etc. are left out.

Why CI/CD pipelines for IaC?

Automation pipelines aim to ensure that deployments are driven by code (changes) and to integrate additional steps for stability and security. This naturally applies to Terraform-based IaC deployments as well. Once the pipeline is created and configured correctly, the focus is solely on code development. Deployments are usually meant to target multiple environments without duplicating the entire code base. A commit to the repository, for example, directly triggers a deployment of the infrastructure to a development stage, taking many manual steps off our hands. Combined with pull request strategies and approval gates, the code finally finds its way to production stages. The whole process is repeatable and not tied to fixed, possibly lengthy release cycles.

Azure DevOps Pipelines for Terraform

In this section, we’ll show how to set up Azure DevOps Pipelines for Terraform deployments. We first clarify, in general terms, which areas of Azure DevOps are relevant for this and what needs to be considered there. Then we turn to the individual steps of the pipeline.

Now, in order to have Terraform code delivered through a DevOps pipeline, we follow the same mechanisms as for other software development. We need an Azure DevOps project with a code repository as a base, which is checked out on a virtual agent machine during a pipeline run. Depending on the agent pool, additional software may have to be installed on that machine. The pipeline includes the previously mentioned manual steps as tasks, and together with a service principal that has the appropriate rights, the deployment can be executed. We will now take a closer look at the individual areas, with a particular focus on Terraform and topics such as state file handling.
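
To make this more concrete, here is a minimal sketch of what an azure-pipelines.yml for such a project could look like; the trigger branch and VM image are examples, and the actual Terraform tasks follow later:

# Minimal sketch of an azure-pipelines.yml for a Terraform project
trigger:
  - main                       # a commit to main triggers the pipeline

pool:
  vmImage: 'ubuntu-latest'     # Microsoft-hosted agent, Terraform preinstalled

steps:
  - checkout: self             # check out the repository onto the agent VM
  - script: terraform version
    displayName: 'Show preinstalled Terraform version'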

Azure DevOps project and organization

For testing and demo purposes, both the overarching Azure DevOps organization and a DevOps project within it can be created for free, if you don’t already have one. For the project, Git should be chosen as the repository type; the work item process doesn’t really matter. The service connection described later is also created at this level, in the project settings. At the organization level, the agent pools for the pipelines are configured if necessary, and you can see how many parallel jobs are available for pipeline runs. To start with a private repository and Microsoft-hosted agents, the one free parallel job with 1,800 minutes of run time per month is easily sufficient before more capacity has to be purchased. More on agents in a moment.

Repository

The repository can be initialized directly after the project has been created, preferably right away with the proposed Terraform-specific .gitignore file. As the name says, this file ensures that certain files or directories are excluded from commits and never checked into the Git repository – typically temporary or machine-specific files created by build processes, operating systems, or users. For Terraform, a common .gitignore covers local state files and the entire .terraform directory: providers and modules are downloaded “fresh” during a pipeline run, and in a DevOps context the state should always be configured remotely anyway. State files can also contain sensitive information that has no place in a repository.

For a Terraform project, a typical .gitignore file might look like this:

# Local .terraform directories
**/.terraform/*

# .tfstate files
*.tfstate
*.tfstate.*

# Ignore CLI configuration files
.terraformrc
terraform.rc

In keeping with the topic of repositories, branching strategies and pull request policies should also be defined, but that would go beyond the scope of this article. In short: outside of a demo scope, even Terraform IaC code does not belong directly as a commit on the main branch.

Connection to Azure

In order to be able to deploy resources through a pipeline, we need to set up a service connection. Behind it are essentially a name for the connection and, in the classic approach, a service principal that has the necessary RBAC rights in the Azure environment. The principal either already exists or can be created by the wizard in the DevOps portal. When creating a connection in a corporate DevOps organization, the user often lacks the Entra ID permissions for the app registration required in the background. There is also frequent confusion because, depending on the DevOps project settings, a user may not be able to see service connections but can still use them in pipelines.

The newer, recommended approach is Workload Identity Federation[2]. It also works via an app registration or a user-assigned managed identity, which can be linked with external identity providers. This eliminates the need to rotate app secrets and reduces the risk of secret theft. However, this approach is not yet supported in all scenarios. For the pipeline definition there is no difference in the end – what matters is that the service connection works.
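
In the pipeline definition, the service connection is then simply referenced by its name. A minimal sketch, assuming a connection named my-azure-connection (a placeholder), using the built-in AzureCLI task:

steps:
  - task: AzureCLI@2
    inputs:
      azureSubscription: 'my-azure-connection'  # name of the service connection (placeholder)
      scriptType: 'bash'
      scriptLocation: 'inlineScript'
      inlineScript: |
        az account show  # verifies that the job is authenticated against Azure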

State file handling

The Terraform state file is also indispensable in DevOps processes. The question is rather where it should live. For several obvious reasons, such as working in a team, the state must be configured remotely, in our case typically in an existing Azure Storage Account. This storage usually lives outside the Terraform configuration; the famous chicken-and-egg problem has already been described earlier in this series. A local state file would be recreated on every “fresh” agent VM and thus conflict with existing resources, and checking it into the repository is not an option either, especially from a security perspective.

In principle, the remote backend settings such as resource group, storage account, container, and file (key) name can be specified in the Terraform code or passed as parameters to the init task in the pipeline. Here we choose the second variant in order to move as much of the provider configuration as possible into the pipeline. For details on the partial configuration recommended by HashiCorp, see[3].
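
A sketch of such an init step, with the backend parameters passed as -backend-config arguments; the resource names are placeholders, and the authentication wiring (e.g. ARM_* environment variables fed from the service connection) is omitted here and picked up in the next post:

steps:
  - script: |
      terraform init \
        -backend-config="resource_group_name=rg-terraform-state" \
        -backend-config="storage_account_name=sttfstatedemo" \
        -backend-config="container_name=tfstate" \
        -backend-config="key=demo.tfstate"
    displayName: 'Terraform Init (remote backend)'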

In this case, however, we still need a backend "azurerm" {} configuration block in our Terraform files. For the necessary access to the state file, it is recommended to use authorization via roles in Entra ID (formerly Azure Active Directory) instead of access keys or shared access signatures (SAS). The key/SAS variant also works, but according to good practice it then requires the keys to be injected cleanly, for example via Azure Key Vault references, so that they do not end up in the repository – and an access key is also quite powerful. We therefore add the parameter use_azuread_auth = true to the backend configuration block.[4]

terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~>3.99.0"
    }
  }
  backend "azurerm" {
    use_azuread_auth = true
  }
}

For access via Entra ID, the service principal must also be assigned the Storage Blob Data Contributor role on the storage account containing the state file. This can be done, for example, via a new role assignment in the Access Control (IAM) section of the storage account.
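
Alternatively, the assignment can be made once with the Azure CLI; the object ID and scope below are placeholders:

az role assignment create \
  --assignee "<service-principal-object-id>" \
  --role "Storage Blob Data Contributor" \
  --scope "/subscriptions/<subscription-id>/resourceGroups/rg-terraform-state/providers/Microsoft.Storage/storageAccounts/sttfstatedemo"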

Managed and unmanaged agents

The last point in this section briefly looks at the different types of agents for pipeline deployments. In Azure DevOps, there are two types of build and release agents: managed (also known as Microsoft-hosted) and unmanaged (also known as self-hosted or private).

The managed variant is publicly available via Azure DevOps. The agents are managed and hosted by Microsoft and are always available and up to date, as they are reset after each job – every pipeline run gets a “fresh” VM. Current images of Windows, Linux, and macOS are available with many tools preinstalled, including Terraform. Other tools can be installed, but this has to be done on every run. Changes to the file system, such as checked-out code or build artifacts, are not directly available across jobs. With several parallel jobs, this variant can also become more expensive than the self-hosted one. An overview of the Microsoft-hosted agents with links to the installed software can be found here[5].

Unmanaged (Self-Hosted) Agents

As the name suggests, these agents are managed and hosted by the users themselves, including the responsibility for patching the operating system and framework updates. The agents can run on a VM or, nowadays, also in containers. Essentially, the DevOps agent software and a corresponding agent pool configured in the DevOps organization settings are needed for them to be available in pipelines. Besides specific software requirements and agent persistence across pipeline jobs, it is primarily network integration for which self-hosted agents are used. For example, if resources are to be deployed or configured in environments that are not reachable from public networks, VNet-integrated private agents are needed so that access for the pipelines works. Depending on the protection requirements of the environment and service, an alternative approach is possible, e.g. for a storage account: relax the public endpoint and firewall settings in a pipeline task, allow the agent’s IP address, and close it again afterwards.
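
In the pipeline definition itself, the difference between the two variants essentially comes down to the pool selection; the pool name below is a placeholder for a pool configured in the organization settings:

# Microsoft-hosted agent from a current Ubuntu image
pool:
  vmImage: 'ubuntu-latest'

# Alternative: self-hosted agent from a configured agent pool
# pool:
#   name: 'my-selfhosted-pool'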

Microsoft-hosted agents are used for this article because no VNet integration is required and the preinstalled software is sufficient. However, we will see that this still requires a few additional tasks in the pipeline.

In this post, we covered the basics needed to set up Azure DevOps for Terraform.

The next post will be about pipeline implementation considerations.


[1] https://github.com/Azure-Samples/terraform-github-actions

[2] https://learn.microsoft.com/en-us/entra/workload-id/workload-identity-federation

[3] https://developer.hashicorp.com/terraform/language/settings/backends/configuration#partial-configuration

[4] https://developer.hashicorp.com/terraform/language/settings/backends/azurerm

[5] https://learn.microsoft.com/en-us/azure/devops/pipelines/agents/hosted?view=azure-devops&tabs=yaml
