In the previous subchapter, we saw how Compute Optimizer recommends the optimal size for our resources. That technique, adjusting the size of what you use to what you really need, is called rightsizing, and it is one of the most direct ways to save money in the cloud without losing performance. It is important enough to deserve its own subchapter. The idea is simple: don’t pay for capacity you don’t use.

The problem: oversizing “just in case”

A very common mistake, inherited from the physical server mentality, is to choose resources larger than necessary “just in case.” In the era of physical servers this made some sense (buying a server was an investment for years, and upgrading later was difficult). But in the cloud it is a waste of money:

Provisioned server: ████████████████  (capacity: 16 GB RAM, 8 CPUs)
Actual usage:       ██                (uses: 2 GB RAM, 1 CPU)
                    └─ you pay for the whole top bar, but use only the bottom one
   → you’re paying for ~8 times the capacity you need

You pay for all that capacity every hour, whether you use it or not. Multiplied by many resources and many months, it’s a huge amount of wasted money.
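To put a number on that waste, here is a minimal Python sketch. It assumes that price scales roughly linearly with capacity, and the hourly price, the fleet size, and the function name `monthly_waste` are hypothetical values chosen only for illustration, not AWS quotes.

```python
# Hypothetical figures: the hourly price below is illustrative, not an AWS quote.
HOURS_PER_MONTH = 730  # common approximation: 8760 hours / 12 months


def monthly_waste(hourly_price: float, used_fraction: float, count: int = 1) -> float:
    """Monthly spend on idle capacity, assuming price scales with capacity."""
    if not 0.0 <= used_fraction <= 1.0:
        raise ValueError("used_fraction must be between 0 and 1")
    idle_fraction = 1.0 - used_fraction
    return hourly_price * HOURS_PER_MONTH * idle_fraction * count


# The server in the diagram: 2 GB of 16 GB used, i.e. 1/8 of the capacity.
used = 2 / 16
one_server = monthly_waste(hourly_price=0.40, used_fraction=used)
fleet = monthly_waste(hourly_price=0.40, used_fraction=used, count=20)

print(f"idle spend, one server: ${one_server:.2f}/month")
print(f"idle spend, 20 servers: ${fleet:.2f}/month")
```

Under these assumed numbers, seven eighths of every hourly charge buys capacity that is never used, and the amount grows linearly with the number of servers.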

What is rightsizing

Rightsizing means adjusting the size of each resource to what it really needs: no more (you waste money) and no less (it falls short and runs slowly). It’s about finding the size that is “just right” based on actual usage, not assumptions.

BEFORE (oversized):   large server, ~15% used                → wasteful
AFTER (rightsized):   server a quarter the size, ~60% used   → efficient
   → same performance for users, LOWER cost
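The effect of a resize on utilization is simple arithmetic. As a rough sketch, assuming the workload itself stays the same and utilization scales inversely with capacity (the function name and the vCPU counts are hypothetical):

```python
def projected_utilization(current_percent: float,
                          current_capacity: float,
                          new_capacity: float) -> float:
    """Utilization after a resize, assuming the workload does not change."""
    if new_capacity <= 0:
        raise ValueError("new_capacity must be positive")
    return current_percent * current_capacity / new_capacity


# A workload using 15% of 8 vCPUs, moved to a 2 vCPU server.
print(projected_utilization(15.0, 8, 2))  # 60.0
```

If the projected figure comes out close to or above 100%, the smaller size is too tight and a larger one should be considered.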

Analogy: rightsizing is like choosing the right clothing size. If you buy an XXL when you wear an M, the clothes are way too big (you paid extra for fabric you don’t need). If you buy an S, it’s tight and uncomfortable (too small). The M size, the one that fits you, is perfect: comfortable and without waste. Rightsizing is giving each resource “its size.”

Another analogy: it’s like renting the right car for your trip. If you’re just going to the office alone, you don’t rent a 50-seat bus (you pay for the whole thing just for yourself). But if you’re going with the whole family and luggage, a tiny compact car won’t do. You choose the car that fits your real need.

How rightsizing is done

Rightsizing is based on real usage data, not intuition. The typical process:

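  1. Measure the actual usage of your resources over time (with CloudWatch, subchapter 24.1, and with Compute Optimizer, subchapter 25.2): how much CPU, memory, etc., do they really use? For EC2, keep in mind that CloudWatch reports CPU by default, while memory metrics require installing the CloudWatch agent.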
  2. Identify the oversized ones: those that use only a small fraction of their capacity.
  3. Adjust the size to a smaller one that still covers actual usage with some margin.
  4. Verify that after the change everything still performs well.
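The steps above can be sketched in a few lines of Python. This is only an illustration on made-up samples: the `UsageReport` type, the thresholds (40% and 80%), and the resource names are assumptions, and in practice the samples would come from a monitoring tool such as CloudWatch. Step 3 (choosing the new size) is left to a person or a sizing rule.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class UsageReport:
    name: str
    cpu_percent: list[float]  # utilization samples collected over time


def summarize(report: UsageReport) -> dict:
    """Step 1 (measure): reduce raw samples to average and peak utilization."""
    return {
        "name": report.name,
        "avg": mean(report.cpu_percent),
        "peak": max(report.cpu_percent),
    }


def find_oversized(reports, peak_limit=40.0):
    """Step 2 (identify): flag resources whose peak stays under peak_limit percent."""
    return [s["name"] for s in map(summarize, reports) if s["peak"] < peak_limit]


def still_healthy(samples_after, ceiling=80.0):
    """Step 4 (verify): after resizing, the peak should stay under the ceiling."""
    return max(samples_after) < ceiling


reports = [
    UsageReport("web-1", [8.0, 12.0, 10.0, 15.0]),
    UsageReport("db-1", [55.0, 70.0, 62.0, 85.0]),
    UsageReport("batch-1", [5.0, 6.0, 30.0, 4.0]),
]
candidates = find_oversized(reports)
print(candidates)  # ['web-1', 'batch-1']

# After resizing web-1, check that the new samples leave room for spikes.
print(still_healthy([30.0, 45.0, 60.0]))  # True
```

Looking at the peak rather than the average is a deliberate choice here: a resource that idles most of the day but spikes to 85% is not a safe candidate for downsizing.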

⚠️ Be careful not to overdo it: the goal is not to choose the smallest and cheapest resource possible, but the right one. If you cut too much, the resource falls short, runs slowly, and worsens the user experience (or crashes). Rightsizing is about balance: the smallest size that comfortably covers your real need, leaving room for spikes.
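One way to express “the smallest size that comfortably covers your real need” is to add a margin to the observed peak and pick the first size that fits. The sketch below assumes a hypothetical size catalog (`SIZES`) and a 30% headroom; neither comes from AWS, and the right margin depends on how spiky the workload is.

```python
# Hypothetical size catalog: (name, vCPUs, memory in GiB), ordered small to large.
SIZES = [
    ("small", 1, 2),
    ("medium", 2, 4),
    ("large", 2, 8),
    ("xlarge", 4, 16),
    ("2xlarge", 8, 32),
]


def pick_size(peak_vcpus: float, peak_mem_gib: float, headroom: float = 0.3) -> str:
    """Return the smallest catalog size that covers the peak plus headroom."""
    need_cpu = peak_vcpus * (1 + headroom)
    need_mem = peak_mem_gib * (1 + headroom)
    for name, vcpus, mem in SIZES:  # relies on the small-to-large ordering
        if vcpus >= need_cpu and mem >= need_mem:
            return name
    raise ValueError("no size in the catalog covers this workload")


print(pick_size(1, 2))                # medium (room for spikes)
print(pick_size(1, 2, headroom=0.0))  # small (cheapest, but no margin at all)
print(pick_size(3, 10))               # xlarge
```

The second call shows the trap described above: with no headroom the cheapest size technically fits, but any spike would push it past its limits.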

Rightsizing and cloud elasticity

Rightsizing is possible and safe thanks to a key advantage of the cloud: changing size is easy and reversible. Remember elasticity (subchapter 1.3) and autoscaling (subchapter 13.3):

  • If you fall short, increasing the size takes just minutes (for an EC2 instance, typically a stop, a change of instance type, and a start), unlike buying a new physical server.
  • With autoscaling, you don’t even have to guess: the system adds or removes resources according to real demand, doing “automatic rightsizing” of the quantity.

This ease of readjustment is what makes rightsizing low risk: if you make a mistake, you can fix it right away. That’s why you can afford to choose tight sizes instead of inflating them “just in case.”
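To illustrate how autoscaling adjusts the quantity, here is a simplified proportional rule in Python: scale the number of instances so that average CPU moves toward a target. This is a sketch of the general idea only, not the exact algorithm AWS uses; the function name, the 60% target, and the minimum and maximum limits are assumptions.

```python
import math


def desired_instances(current: int, avg_cpu: float, target_cpu: float = 60.0,
                      minimum: int = 1, maximum: int = 10) -> int:
    """Instance count that would bring average CPU close to target_cpu."""
    raw = math.ceil(current * avg_cpu / target_cpu)
    return max(minimum, min(maximum, raw))  # clamp to the allowed range


print(desired_instances(4, 90.0))   # 6  (overloaded: add instances)
print(desired_instances(4, 15.0))   # 1  (mostly idle: remove instances)
print(desired_instances(8, 120.0))  # 10 (capped at the maximum)
```

The minimum and maximum act as guard rails: the group never shrinks to zero and never grows beyond what the budget allows.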

Real-world example: a company migrated its applications to AWS by copying the sizes of its physical servers (which were huge “just in case”). After a few months, it reviewed them with Compute Optimizer and discovered that most of its servers used less than 15% of their capacity. It rightsized them: each server was reduced to a size matching its actual usage, leaving room for spikes. Result: the compute bill dropped by almost half, and users didn’t notice any difference (performance was the same, because the removed capacity wasn’t being used). The company simply stopped throwing money away.

Beyond servers

Rightsizing applies to many resources, not just EC2 servers:

  • Databases (RDS, Chapter 8): choosing the right instance size.
  • Lambda (Chapter 14): assigning the right amount of memory (which also affects performance and cost).
  • Storage: using the appropriate storage class (remember the S3 classes, subchapter 5.x, for infrequently accessed data).
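The Lambda case is worth a small worked example, because more memory is not always more expensive. Lambda bills compute by memory multiplied by execution time (GB-seconds), and it allocates more CPU as memory grows, so a function may finish sooner with a larger setting. In the sketch below the price per GB-second and both durations are illustrative assumptions, and the per-request fee is ignored; measure your own function before deciding.

```python
PRICE_PER_GB_SECOND = 0.0000166667  # illustrative rate; check current pricing


def gb_seconds(memory_mb: int, duration_ms: float) -> float:
    """Billable compute for one invocation, in GB-seconds."""
    return (memory_mb / 1024) * (duration_ms / 1000)


def cost_per_million(memory_mb: int, duration_ms: float,
                     price: float = PRICE_PER_GB_SECOND) -> float:
    """Compute cost of one million invocations (request fee not included)."""
    return gb_seconds(memory_mb, duration_ms) * price * 1_000_000


# Hypothetical measurements of the same function at two memory settings.
small = cost_per_million(memory_mb=512, duration_ms=800)
large = cost_per_million(memory_mb=1024, duration_ms=350)

print(f"512 MB  @ 800 ms: ${small:.2f} per million invocations")
print(f"1024 MB @ 350 ms: ${large:.2f} per million invocations")
```

With these assumed timings, the larger setting is both faster and cheaper, which is why rightsizing Lambda means testing several memory values rather than picking the lowest one.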

The philosophy is always the same: pay for what you need, not for what’s left over.

What you should remember

  • Oversizing “just in case” (inherited from physical servers) is a waste of money in the cloud: you pay for capacity you don’t use, every hour.
  • Rightsizing is adjusting the size of each resource to what it really needs: neither too much (wasteful) nor too little (falls short). Like choosing the right clothing size or the right car for the trip.
  • It’s based on real usage data (CloudWatch, Compute Optimizer), not intuition: measure → identify oversized → adjust → verify.
  • ⚠️ The goal is the right size, not the cheapest: cutting too much worsens performance. It’s about balance, leaving room for spikes.
  • It’s safe thanks to the elasticity of the cloud: changing size is easy and reversible, and autoscaling adjusts the quantity automatically.
  • Applies to servers, databases, Lambda, and storage. Philosophy: pay for what you need, not for what’s left over.

In the next subchapter we’ll see another great lever for savings, but through a different path: committing to usage in exchange for discounts, with Savings Plans and Reserved Instances.

Cloud, AWS & Terraform — From Zero to Expert

© Copyright 2024. All rights reserved