We now arrive at one of Terraform's most characteristic (and initially confusing) concepts: state. State is what lets Terraform behave idempotently, but it is also a frequent source of problems when it is not well understood. Let's demystify it.

The Problem State Solves

Remember the declarative approach (subchapter 9.2): you declare what you want and Terraform makes reality match. But for that, Terraform needs to answer a crucial question:

"What have I already created, and what is missing or extra?"

Without an answer, Terraform wouldn't know whether it should create a new resource, modify an existing one, or do nothing. It needs to remember which resources it manages and what their current state is. That memory is the state file.

What the State File Is

The state is a file (by default called terraform.tfstate) where Terraform records all the resources it manages and their information: what it created, with which IDs, with which configuration.

Analogy: The state is Terraform's inventory. It's like the notebook where a warehouse manager writes down everything they have: "I have 3 servers with these IDs, 1 bucket named this, 2 subnets with these ranges...". Without that notebook, they wouldn't know what's there or what changed.

The file is in JSON format and maps what you wrote in your code to the real resources in AWS:

Your code:          State (tfstate):           Reality in AWS:
aws_instance.web  ←→  id = "i-0abc123..."    ←→  [Real instance i-0abc123]
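
To make that mapping concrete, here is a minimal sketch in Python that reads a heavily trimmed, hypothetical state document and prints which real IDs are recorded for each resource address. The sample content is an assumption for illustration (a real terraform.tfstate carries many more fields), and the ID is the shortened placeholder used above, not a real instance.

```python
import json

# Hypothetical, heavily trimmed state content. A real state file has many
# more fields (lineage, provider, schema versions, outputs, and so on).
SAMPLE_STATE = """
{
  "version": 4,
  "serial": 1,
  "resources": [
    {
      "mode": "managed",
      "type": "aws_instance",
      "name": "web",
      "instances": [
        {"attributes": {"id": "i-0abc123", "instance_type": "t3.micro"}}
      ]
    }
  ]
}
"""


def list_managed_resources(state_text):
    """Map each resource address (type.name) to the real IDs recorded for it."""
    state = json.loads(state_text)
    result = {}
    for resource in state.get("resources", []):
        if resource.get("mode") != "managed":
            continue  # data sources are read-only lookups, not managed resources
        address = f'{resource["type"]}.{resource["name"]}'
        result[address] = [
            instance["attributes"].get("id")
            for instance in resource.get("instances", [])
        ]
    return result


print(list_managed_resources(SAMPLE_STATE))
# {'aws_instance.web': ['i-0abc123']}
```

This is only a way to look inside the file and understand it. In day-to-day work you would ask Terraform itself (for example with terraform state list) rather than parse the file.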

How Terraform Uses State

Every time you run plan or apply, Terraform:

  1. Reads the state to know what it thinks exists.
  2. Checks reality in AWS (what actually exists).
  3. Compares the three things: your code (the desired state), the state (what it thinks is there), and reality (what is actually there).
  4. Calculates the differences and decides what to create, modify, or destroy.

   Your .tf code ──────┐
   (desired state)     │
                       ▼
   tfstate state ──► COMPARE ──► Change plan
   (recorded)          ▲
                       │
   AWS reality ────────┘
   (what exists)

That's why state is essential: it's the reference that allows Terraform to detect changes and be idempotent.
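
The comparison can be sketched as a toy Python model. This is not how Terraform is implemented: the plan function, the three dictionaries, and the resource names and IDs are all assumptions made up for illustration, and real Terraform compares attributes in far more detail.

```python
def plan(desired, recorded, actual):
    """Toy three-way comparison.

    desired:  resource address -> configuration written in code
    recorded: resource address -> real ID stored in the state
    actual:   real ID -> configuration that exists in the cloud
    """
    actions = {}
    for address, config in desired.items():
        resource_id = recorded.get(address)
        if resource_id is None or resource_id not in actual:
            actions[address] = "create"   # not tracked, or deleted outside Terraform
        elif actual[resource_id] != config:
            actions[address] = "update"   # exists, but differs from the code
        else:
            actions[address] = "no-op"    # already matches: nothing to do
    for address, resource_id in recorded.items():
        if address not in desired:
            # Tracked in state but no longer in code.
            actions[address] = "destroy" if resource_id in actual else "forget"
    return actions


# Hypothetical example data.
desired = {
    "aws_instance.web": {"instance_type": "t3.small"},
    "aws_s3_bucket.logs": {"bucket": "example-logs"},
}
recorded = {"aws_instance.web": "i-0abc123", "aws_subnet.old": "subnet-0123"}
actual = {
    "i-0abc123": {"instance_type": "t3.micro"},
    "subnet-0123": {"cidr_block": "10.0.1.0/24"},
}

print(plan(desired, recorded, actual))
# {'aws_instance.web': 'update', 'aws_s3_bucket.logs': 'create', 'aws_subnet.old': 'destroy'}
```

Two things are worth noticing in this sketch. When reality already matches the code, every action comes back as "no-op", which is what idempotence means in practice. And if you pass an empty recorded mapping, everything comes back as "create" even though the resources exist, because without state there is nothing linking your code to them.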

Why State Is So Important (and Delicate)

State is not just any file. It has several critical properties:

  1. It Is Terraform's Source of Truth

Terraform relies on state to know what it manages. If the state is lost or corrupted, Terraform "forgets" about the resources it created and can behave dangerously (try to recreate things, or leave orphaned resources that keep costing money).

  2. It Contains Sensitive Data ⚠️

State can include sensitive information in plain text: database passwords, keys, private resource data. Therefore:

⚠️ State Security Rules:

  • NEVER upload terraform.tfstate to a public Git repository (nor to a private one without encryption). You could leak secrets.
  • Always add terraform.tfstate and *.tfstate.* to your .gitignore.
  • For teams, store state in an encrypted remote backend (subchapter 11.3).
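
To see why these rules matter, here is a small Python sketch that walks a state-like document and reports attributes whose names suggest a secret. The list of hints, the function name, and the sample data are assumptions for illustration. It is a rough heuristic, not a security scanner.

```python
# Hypothetical name fragments that often indicate a secret value.
SENSITIVE_HINTS = ("password", "secret", "private_key", "token")


def find_sensitive_paths(node, path=""):
    """Recursively collect the paths of string values whose key looks sensitive."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else key
            looks_sensitive = any(hint in key.lower() for hint in SENSITIVE_HINTS)
            if isinstance(value, str) and looks_sensitive:
                found.append(child)
            else:
                found.extend(find_sensitive_paths(value, child))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            found.extend(find_sensitive_paths(item, f"{path}[{index}]"))
    return found


# Hypothetical, trimmed state content with a database password in plain text.
sample_state = {
    "resources": [
        {
            "type": "aws_db_instance",
            "name": "main",
            "instances": [
                {"attributes": {"username": "admin", "password": "not-a-real-password"}}
            ],
        }
    ]
}

print(find_sensitive_paths(sample_state))
# ['resources[0].instances[0].attributes.password']
```

The point is not the scanner itself: anyone who can read the file can read the value, with no decryption step in between.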

  3. It Is Not Edited by Hand

The state file should not be modified manually: it is meant to be managed by Terraform, and editing it by hand can corrupt it. If you need to manipulate it, use the commands made for that purpose, such as terraform state mv and terraform state rm.

State and "Drift"

State also helps detect drift (remember Chapter 9): if someone changes something manually in the AWS console, reality no longer matches the state. The next time you run plan, Terraform will detect it and warn you:

"Hey, this resource has changed outside of Terraform. Your code says one thing, but reality is another."

This allows you to reconcile the situation: either adjust your code, or let Terraform return the resource to the declared state. It's one of the great advantages of having state.
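
As a rough illustration of what detecting drift means, the following Python sketch compares the attributes recorded in state with the attributes found in reality and reports each difference. The function name and the sample values are hypothetical, and Terraform's real comparison is considerably more detailed.

```python
def detect_drift(recorded, actual):
    """Return {attribute: (recorded_value, actual_value)} for every difference."""
    drift = {}
    for key in sorted(recorded.keys() | actual.keys()):
        if recorded.get(key) != actual.get(key):
            drift[key] = (recorded.get(key), actual.get(key))
    return drift


# Hypothetical example: someone resized the instance in the AWS console.
recorded = {"instance_type": "t3.micro", "monitoring": False}
actual = {"instance_type": "t3.large", "monitoring": False}

print(detect_drift(recorded, actual))
# {'instance_type': ('t3.micro', 't3.large')}
```

An empty result means state and reality agree. A non-empty one is the situation described above, where you decide whether the code or the real resource should win.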

Useful Commands Related to State

So you know they exist (you'll use them later):

Command                            What it's for
---------------------------------  ----------------------------------------------------------
terraform state list               See all resources managed by Terraform
terraform state show <resource>    See the details of a resource in the state
terraform refresh                  Update the state with AWS reality (deprecated; prefer terraform apply -refresh-only)
terraform state rm                 Remove a resource from the state (without destroying it)
terraform import                   Bring an existing resource into the state (Chapter 20)

What You Should Remember

  • The state (terraform.tfstate) is the inventory where Terraform records the resources it manages and their information.
  • It is essential for Terraform to compare the desired state (code), the recorded state (state), and the real state (AWS), and thus calculate changes. It is the basis of idempotence.
  • It is delicate: if it is lost or corrupted, Terraform "forgets" your resources. Handle it with care.
  • It contains sensitive data: never upload it to Git; add it to .gitignore and, in a team, use an encrypted remote backend.
  • Do not edit it by hand: use the terraform state ... commands if you need to manipulate it.
  • State allows you to detect drift (changes made manually outside of Terraform).

In the next subchapter, we will look at a key topic for teamwork: local vs remote state, and how to store it securely and share it with S3 and DynamoDB.

Cloud, AWS & Terraform — From Zero to Expert

© Copyright 2024. All rights reserved