S3 stores your data, but who can see or modify it? Controlling access to your buckets is one of the most critical aspects of cloud security. Data breaches due to misconfigured buckets have made headlines many times. In this subchapter, you’ll learn how to control who accesses what, and the mistakes you should never make.
The starting point: everything is locked
Good news to start: by default, a new bucket is completely private. Only the account that owns it can access it. For anyone else (another account, a service, or the public) to get in, you have to grant access explicitly. AWS applies the principle of “deny by default” here.
The problem almost always arises when someone accidentally opens access too widely. Let’s look at the tools for opening it (carefully).
Ways to control access
There are several ways to manage permissions in S3. The main ones:
- Bucket policies (the main tool)
A bucket policy is a document (in JSON format) attached to a bucket that defines detailed rules about who can do what with the objects.
Analogy: It’s like the access rules of a building, posted at the entrance: “delivery people can enter the warehouse; the public only to the reception; no one enters the offices without a card.”
A bucket policy specifies, in each rule:
- Who (which account, user, or service, or “everyone”).
- What action (read, write, delete…).
- On which objects (the whole bucket or a part).
- Allow or deny.
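Conceptual example of a policy: “Allow anyone to read the objects in the public/ folder of this bucket; everything else stays private.”
This would be useful, for example, for a static website (subchapter 5.5) where you want everyone to see certain files but not others.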
Here’s a real example of a bucket policy that allows anyone to read the objects under the public/ prefix (S3’s equivalent of a folder):
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "PublicReadWeb",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::my-bucket-web/public/*"
    }
  ]
}
Don’t worry about understanding every line now; we’ll see it in detail with IAM in Chapter 7. The important parts: Effect is allow/deny, Principal is who, Action is what they can do, and Resource is what it applies to.
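To make those four fields concrete, here is a minimal Python sketch of how a request could be checked against the policy above. It is a deliberately simplified model, not AWS’s actual evaluation logic: it ignores Condition blocks, IAM policies, and Block Public Access, and it uses shell-style wildcard matching as an approximation. The function name `is_allowed` and the request values are hypothetical.

```python
from fnmatch import fnmatchcase

# The bucket policy from the text, written as a Python dict.
POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "PublicReadWeb",
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::my-bucket-web/public/*",
        }
    ],
}


def _as_list(value):
    """Policy fields may hold a single value or a list of values."""
    return value if isinstance(value, list) else [value]


def _matches(stmt, principal, action, resource):
    """True if one statement applies to the request (toy model)."""
    who = stmt["Principal"]
    if isinstance(who, dict):
        principals = _as_list(who.get("AWS", []))
    else:
        principals = [who]
    principal_ok = "*" in principals or principal in principals
    action_ok = any(fnmatchcase(action, p) for p in _as_list(stmt["Action"]))
    resource_ok = any(fnmatchcase(resource, p) for p in _as_list(stmt["Resource"]))
    return principal_ok and action_ok and resource_ok


def is_allowed(policy, principal, action, resource):
    """Deny by default; an explicit Deny always beats an Allow."""
    allowed = False
    for stmt in _as_list(policy["Statement"]):
        if not _matches(stmt, principal, action, resource):
            continue
        if stmt["Effect"] == "Deny":
            return False
        allowed = True
    return allowed


base = "arn:aws:s3:::my-bucket-web"
print(is_allowed(POLICY, "anonymous", "s3:GetObject", f"{base}/public/index.html"))   # True
print(is_allowed(POLICY, "anonymous", "s3:GetObject", f"{base}/private/report.pdf"))  # False
print(is_allowed(POLICY, "anonymous", "s3:PutObject", f"{base}/public/index.html"))   # False
```

The last two calls return False without any Deny statement being present: nothing in the policy allows them, so the default applies.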
- ACLs (the old method)
ACLs (Access Control Lists) are an older and simpler mechanism for giving basic permissions to objects or buckets.
AWS Recommendation: Avoid ACLs. Nowadays, AWS recommends using bucket policies and IAM instead, because they are clearer and more powerful. In fact, AWS now disables ACLs by default on new buckets. You should know they exist (you’ll see them in old documentation), but don’t use them if you can avoid it.
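For reference, the bucket setting behind “ACLs disabled” is called Object Ownership, and the value that turns ACLs off is `BucketOwnerEnforced`. The following is a minimal sketch of how that setting could be applied to an older bucket with boto3. It assumes boto3 is installed and configured with credentials; the helper names and any bucket name you pass in are hypothetical.

```python
# Object Ownership value that disables ACLs on a bucket.
OWNERSHIP_CONTROLS = {"Rules": [{"ObjectOwnership": "BucketOwnerEnforced"}]}


def acls_disabled(ownership_controls):
    """True if every ownership rule uses BucketOwnerEnforced (ACLs off)."""
    rules = ownership_controls.get("Rules", [])
    return bool(rules) and all(
        rule.get("ObjectOwnership") == "BucketOwnerEnforced" for rule in rules
    )


def disable_acls(bucket_name):
    """Apply the setting to a bucket. Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the helper above works without it

    s3 = boto3.client("s3")
    s3.put_bucket_ownership_controls(
        Bucket=bucket_name, OwnershipControls=OWNERSHIP_CONTROLS
    )


print(acls_disabled(OWNERSHIP_CONTROLS))  # True
```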
- IAM Policies
You can also control access from the user side with IAM policies (Chapter 7): “this user can read from these buckets.” This complements bucket policies. The difference: bucket policies are attached to the resource (the bucket), IAM policies are attached to the identity (the user or role).
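The difference shows up in the documents themselves. In the sketch below, both policies grant read access to the same bucket, but only the bucket policy has a Principal field; in the IAM policy, the “who” is whichever user or role the policy is attached to. The account ID and role name are made-up placeholders.

```python
import json

BUCKET_ARN = "arn:aws:s3:::my-bucket-web"  # bucket name from the earlier example

# Identity-based (IAM) policy: attached to a user or role, so there is
# no "Principal" field. The identity it is attached to is the "who".
iam_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            # ListBucket applies to the bucket, GetObject to the objects in it.
            "Resource": [BUCKET_ARN, f"{BUCKET_ARN}/*"],
        }
    ],
}

# Resource-based (bucket) policy: attached to the bucket, so it must
# name the "who" explicitly. The ARN below is a hypothetical role.
bucket_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::111122223333:role/report-reader"},
            "Action": "s3:GetObject",
            "Resource": f"{BUCKET_ARN}/*",
        }
    ],
}

print(json.dumps(iam_policy, indent=2))
print(json.dumps(bucket_policy, indent=2))
```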
The safety net: Block Public Access
AWS learned from the many incidents of exposed buckets and created an extra protection called Block Public Access.
It’s a security switch, enabled by default on new buckets, that prevents the bucket from being made public even if someone mistakenly writes a policy that allows it. It’s like a double lock.
⚠️ Golden rule: Leave Block Public Access enabled unless you have a very specific and conscious reason to disable it (such as hosting a public website). And even then, consider putting CloudFront in front instead of exposing the bucket directly. Disabling it “just to test” and forgetting about it is exactly how breaches happen.
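Under the hood, Block Public Access is made up of four individual settings. Here is a minimal sketch, assuming boto3 is installed and configured with credentials, of how you might check that all four are on for a bucket and switch them back on if not. The helper names and any bucket name you pass in are hypothetical.

```python
# The four Block Public Access settings, all switched on.
FULL_BLOCK = {
    "BlockPublicAcls": True,
    "IgnorePublicAcls": True,
    "BlockPublicPolicy": True,
    "RestrictPublicBuckets": True,
}


def is_fully_blocked(config):
    """True only if all four settings are present and enabled."""
    return all(config.get(key) is True for key in FULL_BLOCK)


def check_bucket(bucket_name):
    """Read a bucket's current settings. Needs boto3 and AWS credentials.

    Note: the call raises a ClientError if the bucket has no
    Block Public Access configuration at all.
    """
    import boto3  # imported lazily so is_fully_blocked works without it

    s3 = boto3.client("s3")
    response = s3.get_public_access_block(Bucket=bucket_name)
    return is_fully_blocked(response["PublicAccessBlockConfiguration"])


def enable_block_public_access(bucket_name):
    """Turn all four settings on for a bucket."""
    import boto3

    s3 = boto3.client("s3")
    s3.put_public_access_block(
        Bucket=bucket_name, PublicAccessBlockConfiguration=FULL_BLOCK
    )


# A bucket where someone disabled one setting "just to test":
print(is_fully_blocked({**FULL_BLOCK, "BlockPublicPolicy": False}))  # False
```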
Mistakes you should NEVER make
These are the errors that have caused famous data breaches:
- Accidentally making a bucket public. If you put sensitive data (customer data, backups) in a public bucket, anyone on the internet can download it. This has exposed millions of records at real companies.
- Disabling Block Public Access unnecessarily.
- Giving overly broad permissions (“anyone can write/delete”). An attacker could delete or hijack your data.
- Using legacy ACLs instead of clear policies.
Real case (repeated pattern): Numerous companies have suffered breaches because a developer left a bucket with customer data configured as public “temporarily” and forgot about it. Security researchers (and attackers) scan the internet looking for these open buckets. The lesson: treat every bucket as private by default and open only what is strictly necessary.
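The first and third mistakes can be spotted mechanically by looking for Allow statements whose Principal is everyone. The sketch below is a rough illustration of that idea, not a replacement for a real auditing tool: it ignores Condition blocks and NotPrincipal/NotAction, and the function name, severity labels, and second statement are invented for the example.

```python
def _as_list(value):
    """Policy fields may hold a single value or a list of values."""
    return value if isinstance(value, list) else [value]


def _is_public(principal):
    """True for principals written as "*" or {"AWS": "*"}."""
    if isinstance(principal, dict):
        return any("*" in _as_list(v) for v in principal.values())
    return principal == "*"


def audit_policy(policy):
    """Return (sid, severity) for each Allow statement open to everyone."""
    findings = []
    for stmt in _as_list(policy.get("Statement", [])):
        if stmt.get("Effect") != "Allow" or not _is_public(stmt.get("Principal")):
            continue
        actions = _as_list(stmt.get("Action", []))
        writable = any(
            a in ("*", "s3:*") or a.startswith(("s3:Put", "s3:Delete"))
            for a in actions
        )
        severity = "public-write" if writable else "public-read"
        findings.append((stmt.get("Sid", "<no Sid>"), severity))
    return findings


example = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "PublicReadWeb",
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::my-bucket-web/public/*",
        },
        {
            "Sid": "AnyoneCanWrite",  # hypothetical, overly broad statement
            "Effect": "Allow",
            "Principal": {"AWS": "*"},
            "Action": ["s3:PutObject", "s3:DeleteObject"],
            "Resource": "arn:aws:s3:::my-bucket-web/*",
        },
    ],
}

print(audit_policy(example))
# [('PublicReadWeb', 'public-read'), ('AnyoneCanWrite', 'public-write')]
```

A public-read finding may be intentional, as with the website folder; a public-write finding almost never is.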
Best practices summarized
- Private by default: open access only when strictly necessary.
- Least privilege: give only the permissions that are needed and nothing more (a key concept we’ll see in Chapter 7).
- Use bucket policies and IAM, not ACLs.
- Keep Block Public Access enabled unless there’s a justified exception.
- Encrypt your data (S3 encrypts at rest by default; we’ll see this with KMS in Chapter 23).
- For public websites, serve through CloudFront instead of exposing the bucket directly.
What you should remember
- Buckets are private by default; you grant access explicitly.
- Bucket policies (JSON) are the main and recommended way to control access: they define who, what action, on what, and whether it’s allowed or denied.
- ACLs are the old method: avoid them; AWS disables them by default on new buckets.
- Block Public Access is a safety net that prevents exposing the bucket by mistake: leave it enabled.
- The most serious and frequent mistake is making a bucket with sensitive data public. Private by default, least privilege.
In the last S3 subchapter, we’ll see a very practical and popular use: hosting a static website directly on S3.