We close Part IV with EKS (Elastic Kubernetes Service), the most powerful and most complex option for running containers on AWS. This section covers what Kubernetes is, what EKS adds, and above all when it is worth choosing over the simplicity of ECS. Spoiler: not always.
What is Kubernetes
Kubernetes (often abbreviated as K8s) is the most popular container orchestrator in the world. It does the same job as ECS (run, scale, repair, and coordinate containers), but with two important differences:
- It’s open source and an industry standard: it’s used by companies of all sizes and works on any cloud (AWS, Azure, Google...) or even on your own servers.
- It’s much more powerful and flexible... but also much more complex.
In short:
- ECS = AWS's orchestrator: simple, AWS only
- K8s = standard orchestrator: extremely powerful, runs anywhere, but complex
Analogy: if ECS is an automatic car (easy to drive, AWS manages almost everything for you), Kubernetes is a Formula 1 car: incredibly capable and configurable, but requires an expert driver and lots of maintenance. To go to the supermarket, the automatic is more than enough.
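The "run, scale, repair" job that both ECS and Kubernetes perform can be pictured as a loop that keeps comparing the desired state with what is actually running. The following is a toy sketch of that idea only, not how either orchestrator is implemented; the `Service`, `Container`, and `reconcile` names are hypothetical and exist just for this illustration.

```python
from dataclasses import dataclass, field
from itertools import count

_ids = count(1)  # hypothetical ID generator for the toy containers


@dataclass
class Container:
    id: int
    healthy: bool = True


@dataclass
class Service:
    name: str
    desired: int  # how many copies should be running
    running: list = field(default_factory=list)


def reconcile(service: Service) -> list:
    """One pass of the control loop: make reality match the desired state."""
    actions = []

    # Repair: remove containers that have failed.
    for c in [c for c in service.running if not c.healthy]:
        service.running.remove(c)
        actions.append(f"removed unhealthy container {c.id}")

    # Scale up: start containers until the desired count is reached.
    while len(service.running) < service.desired:
        c = Container(id=next(_ids))
        service.running.append(c)
        actions.append(f"started container {c.id}")

    # Scale down: stop surplus containers.
    while len(service.running) > service.desired:
        c = service.running.pop()
        actions.append(f"stopped container {c.id}")

    return actions


web = Service(name="web", desired=3)
reconcile(web)                   # starts three containers
web.running[0].healthy = False   # simulate a crash
reconcile(web)                   # replaces the failed container
web.desired = 2                  # ask for fewer copies
reconcile(web)                   # stops one container
```

You declare how many copies you want and the orchestrator does the rest. What differs between ECS and Kubernetes is how much of the surrounding machinery you can, and must, configure yourself.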
What is EKS
Managing Kubernetes on your own is very difficult: you have to install and maintain its “brain” (the control plane), which is complex and delicate. EKS (Elastic Kubernetes Service) solves this: it’s Kubernetes managed by AWS. AWS takes care of the hardest part (the control plane), and you use Kubernetes without having to maintain its core.
- Kubernetes on your own: you maintain EVERYTHING (very difficult)
- EKS: AWS maintains the "brain"; you use Kubernetes
Even so, and this is key, EKS is still more complex than ECS. AWS takes on part of the work, but Kubernetes itself has a steep learning curve: its own concepts (pods, deployments, services, ingress...), its own way of configuring things, and its own ecosystem of tools.
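To give a feel for "its own way of configuring", here is a minimal sketch that builds a Kubernetes Deployment manifest as a plain Python dictionary and prints it as JSON. The field names follow the `apps/v1` Deployment schema; the application name `web`, the image `nginx:1.27`, and the helper `deployment_manifest` are placeholders chosen for this example, not something from the chapter.

```python
import json


def deployment_manifest(name: str, image: str, replicas: int, port: int) -> dict:
    """Build a Kubernetes Deployment (apps/v1) as a plain dict."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,                 # desired number of pods
            "selector": {"matchLabels": labels},  # which pods this Deployment owns
            "template": {                         # the pod template
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "ports": [{"containerPort": port}],
                        }
                    ]
                },
            },
        },
    }


# Placeholder values for illustration only.
manifest = deployment_manifest("web", "nginx:1.27", replicas=3, port=80)
print(json.dumps(manifest, indent=2))
```

Because `kubectl` accepts JSON as well as YAML, the printed output could be saved to a file and applied with `kubectl apply -f`. Even this small example already involves a Deployment, a pod template, labels, and a selector, and exposing it to traffic would still need a Service and possibly an Ingress.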
ECS vs EKS: The Key Comparison
| | ECS | EKS (Kubernetes) |
|---|---|---|
| Complexity | Low, easy to learn | High, steep curve |
| Power/flexibility | Enough for most | Huge, highly configurable |
| Portability | AWS only | Any cloud or on-prem |
| Ecosystem | AWS’s | Huge open source ecosystem |
| Required team | Any team | Ideally, Kubernetes expertise |
| Ideal for | Getting started, simplicity | Complex needs, multi-cloud |
When to Choose Kubernetes (EKS)?
EKS makes sense in these cases:
- You already use Kubernetes or your team has experience with it. You leverage that knowledge.
- You need multi-cloud or want to avoid “vendor lock-in”: you want your application to run on AWS, another cloud, or your own servers without rewriting it. Kubernetes is the common standard.
- Very complex needs: advanced network configurations, sophisticated deployments, or you want to use the huge Kubernetes ecosystem of tools (Helm, operators, service meshes...).
- Large organization with many teams and applications, where Kubernetes’s power outweighs its complexity.
When NOT to Choose Kubernetes?
And here’s the most valuable advice of the subchapter. Many people choose Kubernetes because it’s trendy, without needing it, and end up paying a huge complexity cost for nothing. Do NOT choose EKS if:
- You’re just starting with containers. Start with ECS/Fargate; it’s much simpler.
- Your team doesn’t know Kubernetes. The learning curve and maintenance will steal time you should spend on your product.
- Your needs are normal: running a few applications, scaling them, distributing traffic. ECS does this perfectly and with much less effort.
- You want operational simplicity. Fargate (with ECS) gives you containers with almost nothing to manage.
The classic mistake: a small team sets up a complex Kubernetes cluster to run three microservices, and ends up spending more time maintaining Kubernetes than developing their product. With ECS + Fargate they would have had the same thing running in a fraction of the time. Choose the tool for real need, not for fashion.
The Golden Rule
Do I really need the power and portability of Kubernetes, and do I have (or want to acquire) the expertise to manage it?
- YES → EKS
- NO → ECS + Fargate (the answer for most)
Always start with the simple option (ECS/Fargate). Move to EKS only when you have a concrete and justified reason to need Kubernetes. Complexity must earn its place, not be adopted by default.
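The golden rule can be written down as a small decision function. This is one possible reading of the rule, not an official checklist: the `Situation` fields and the way they are grouped into "a concrete reason" and "expertise" are assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Situation:
    # All field names are hypothetical, chosen to mirror the cases in this section.
    already_uses_kubernetes: bool = False
    team_knows_kubernetes: bool = False
    willing_to_learn_kubernetes: bool = False
    needs_multi_cloud: bool = False
    needs_k8s_ecosystem: bool = False
    large_multi_team_org: bool = False


def choose_orchestrator(s: Situation) -> str:
    """Return "EKS" only when there is both a concrete reason and the expertise."""
    concrete_reason = (
        s.already_uses_kubernetes
        or s.needs_multi_cloud
        or s.needs_k8s_ecosystem
        or s.large_multi_team_org
    )
    expertise = (
        s.already_uses_kubernetes
        or s.team_knows_kubernetes
        or s.willing_to_learn_kubernetes
    )
    if concrete_reason and expertise:
        return "EKS"
    return "ECS + Fargate"  # the default answer for most teams


small_team = Situation()
print(choose_orchestrator(small_team))
```

Note that knowing Kubernetes is not, by itself, treated as a reason here: a team that has the skills but none of the needs still gets the simpler answer.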
What You Should Remember
- Kubernetes (K8s) is the industry-standard container orchestrator: extremely powerful, open source, and portable to any cloud, but very complex. Think of it as a Formula 1 car next to the automatic car that is ECS.
- EKS (Elastic Kubernetes Service) is Kubernetes managed by AWS: AWS maintains the “brain,” but it’s still more complex than ECS.
- Choose EKS if you already use Kubernetes, need multi-cloud or want to avoid lock-in, have very complex needs, or are a large organization with the expertise.
- Do NOT choose EKS if you’re starting out, your team doesn’t know Kubernetes, or your needs are normal: ECS + Fargate covers those needs with much less effort.
- Golden rule: start with the simple option (ECS/Fargate) and move to EKS only with a justified reason. Choose by need, not by trend.
You’ve finished Chapter 17 and Part IV! You now master intermediate AWS architectures: load balancing and auto-scaling, serverless, messaging, content delivery, and containers. In Part V we’ll take Terraform up a level, starting with modules, the key to reusing and organizing your infrastructure like a pro.
