🦝

Rackoon Tycoon

Build your cloud empire. Tame the traffic.

AWS Certified Solutions Architect – Associate (SAA-C03) study guide. Gap-focused prep for distributed systems veterans: skips the basics, targets 2023–2026 additions and exam traps.

Exam: SAA-C03 · 65 questions · 130 min · 720/1000 passing · Scenario-based

🎮 Play the companion game → (serve the repo, then open via localhost)

🔴 Priority Gaps: From Your Interview

Weak spots found during the gap interview (2026-06-16). Highest ROI before exam day. Pattern across all of them: you diagnose the problem but don't name the exact AWS fix. The exam rewards the mechanism, not the concept. Use the Q/A pair under each item to self-quiz.

1 Gateway VPC Endpoint cost net

Named NAT Gateway as the cost, didn't name the fix.
Gateway VPC Endpoint for S3/DynamoDB: FREE, routes over the private AWS backbone, and eliminates the NAT Gateway data-processing charge ($0.045/GB).

Gateway endpoints = S3 + DynamoDB ONLY, always free. Interface endpoints (PrivateLink) = everything else, cost money. Reflexive answer for "EC2 in private subnet + S3 cost".

Q: EC2 in private subnet reads 500GB/day from same-region S3 through a NAT Gateway. Cheapest fix?
A: Add a Gateway VPC Endpoint for S3. Free, removes NAT data-processing charge. S3 same-region transfer itself is already free.
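
The size of the saving can be checked with quick arithmetic. This is a minimal sketch using the $0.045/GB figure quoted above; the 30-day month and the function name are assumptions for illustration, and actual rates vary by region.

```python
NAT_PROCESSING_USD_PER_GB = 0.045  # rate quoted above; check your region's pricing


def monthly_nat_processing_cost(gb_per_day: float, days: int = 30) -> float:
    """Data-processing charge for traffic pushed through a NAT Gateway."""
    return gb_per_day * days * NAT_PROCESSING_USD_PER_GB


before = monthly_nat_processing_cost(500)  # S3 reads routed via the NAT Gateway
after = 0.0                                # Gateway endpoint has no per-GB charge
print(f"via NAT: ${before:,.2f}/month, via Gateway endpoint: ${after:,.2f}/month")
```

At 500 GB/day the processing charge alone is roughly $675 per month, which is why the exam treats the Gateway endpoint as the reflexive answer.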

2 VPC Lattice vs PrivateLink net

No VPC Lattice experience.
Lattice = internal service mesh across YOUR accounts/VPCs (no peering/TGW). PrivateLink = expose a service to an EXTERNAL org/consumer.

Exam trigger: "microservices across accounts, same org" → Lattice. "endpoint service + interface endpoint" / external partner → PrivateLink.

Q: ECS service A (account 1) must call ECS service B (account 2), same org, with built-in auth + weighted routing, no peering. What?
A: VPC Lattice. (PrivateLink is for cross-org exposure, not internal mesh.)

3 Kinesis Streams vs Firehose data

Said Firehose can replay; said Streams retention = "minutes/hours".
Firehose CANNOT replay (one-way delivery, failed records → error S3 prefix). Streams retention = 24h default, up to 365 days.

"Minutes" = Firehose's delivery BUFFER (60s–15min / 1–128MB), not stream retention. Need replay/multiple consumers → Streams. Load to S3/Redshift/OpenSearch, no code → Firehose.

Q: Real-time anomaly detection (sub-second, replayable) + separate S3 data-lake load (no processing). Which Kinesis for each?
A: Data Streams for the real-time/replay consumer; Firehose for the S3 load. Firehose has no replay.

4 Aurora Limitless vs Serverless v2 db

Picked Limitless for a modest-write spiky workload.
Serverless v2 = vertical auto-scale (ACUs) for spikes + zero capacity planning; right for spiky/unpredictable. Limitless = horizontal write SHARDING, only when a single writer is maxed.

Reach for Limitless only when you can prove the single-node writer is the bottleneck (single writer ≈ up to 256 ACUs). It does NOT solve spike elasticity; that's v2's job.

Q: 800K writes/day, 10x growth, flash-sale spikes, zero capacity planning. v2 or Limitless?
A: Serverless v2. It fits one writer easily and auto-scales for spikes. Limitless is overkill until writes exceed a single node.

5 Lambda-in-VPC inherits EC2 networking compute

Vague on whether moving S3 access to Lambda changes the cost answer.
Lambda with NO VPC config → S3 via public AWS endpoints, free path, no NAT. VPC-attached Lambda → same as EC2: needs Gateway VPC Endpoint or it pays NAT.

"Serverless" ≠ "no networking constraints". Attach Lambda to a VPC and it inherits subnet/route-table/NAT behavior.

Q: VPC-attached Lambda pulling from S3 racks up NAT charges. Fix?
A: Gateway VPC Endpoint for S3, the same fix as EC2. (Or detach from VPC if it doesn't need VPC resources.)

6 RCP = hard resource ceiling sec

Right outcome ("can't delete") but reasoning was just "most restrictive wins".
An RCP (Resource Control Policy) is a guardrail on the RESOURCE. Explicit Deny there beats any identity policy, SCP allow, or even account root.

SCP restricts what principals CAN DO. RCP restricts what CAN BE DONE TO a resource. Both are org-level Deny ceilings; explicit Deny always wins.

Q: Identity policy Allows s3:DeleteBucket, no SCP mentions S3, an org RCP Denies s3:DeleteBucket. Can they delete?
A: No. RCP explicit Deny is a hard ceiling on the resource, independent of identity-side evaluation.
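
The "explicit Deny wins" rule can be sketched as a toy evaluator. This is a deliberately simplified model, not the real IAM engine: it matches exact action names or `*` only, ignores resources and conditions, and the `Statement` and `decide` names are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    effect: str  # "Allow" or "Deny"
    action: str  # exact action name or "*"; real wildcards are not modeled


def matches(stmt: Statement, action: str) -> bool:
    return stmt.action in ("*", action)


def decide(action, identity, scp, rcp) -> str:
    """Toy model: any explicit Deny wins; otherwise every layer must Allow."""
    everything = [*identity, *scp, *rcp]
    if any(s.effect == "Deny" and matches(s, action) for s in everything):
        return "Deny (explicit)"
    for layer in (identity, scp, rcp):
        if not any(s.effect == "Allow" and matches(s, action) for s in layer):
            return "Deny (implicit)"
    return "Allow"


identity = [Statement("Allow", "s3:DeleteBucket")]
scp = [Statement("Allow", "*")]  # default full-access SCP, nothing about S3
rcp = [Statement("Allow", "*"), Statement("Deny", "s3:DeleteBucket")]
print(decide("s3:DeleteBucket", identity, scp, rcp))
```

With the RCP Deny in place the result is an explicit deny regardless of the identity-side Allow; drop that one statement and the same request is allowed.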

7 Cognito User Pools vs Identity Pools sec

Strong on Identity Center vs Cognito, but didn't split Cognito's two halves.
User Pools = authentication + JWT (who are you). Identity Pools (federated identities) = vend temporary AWS credentials (STS) so users hit AWS resources directly.

Workforce SSO across 15 accounts → IAM Identity Center. 2M app users via Google → Cognito (User Pool to auth, Identity Pool if they need direct S3/DynamoDB access).

Q: Mobile users sign in with Google, then must upload directly to S3 with scoped creds. Which Cognito piece vends the AWS creds?
A: Identity Pool (federated identities). User Pool handles the login/JWT.

8 RDS cross-AZ cost fixes cost db

Math right ($0.01/GB each way) but no fix options.
Options: (a) co-locate EC2+RDS same AZ (loses HA), (b) read replica in same AZ: reads local/free, writes still cross-AZ, (c) accept it as the price of HA. No free Gateway endpoint exists for RDS.

When the exam says "reduce cost WITHOUT sacrificing availability" → the same-AZ read replica is usually the answer.

Q: Heavy cross-AZ RDS read traffic costs are high; must keep HA. Best fix?
A: Add a read replica in the EC2's AZ; route reads locally. Writes still cross to primary (accepted), HA preserved.
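
A rough before/after estimate shows why the replica helps. This sketch uses the $0.01/GB-each-way rate from above; the traffic volumes are hypothetical, and any cost of running the replica instance itself is left out.

```python
CROSS_AZ_USD_PER_GB_EACH_WAY = 0.01  # rate quoted above


def cross_az_monthly_cost(gb_per_month: float) -> float:
    """Each GB is billed once leaving one AZ and once entering the other."""
    return gb_per_month * CROSS_AZ_USD_PER_GB_EACH_WAY * 2


read_gb, write_gb = 9_000, 1_000  # hypothetical monthly app-to-DB traffic
before = cross_az_monthly_cost(read_gb + write_gb)  # everything crosses AZs
after = cross_az_monthly_cost(write_gb)             # reads served by same-AZ replica
print(f"before: ${before:.2f}/month, after: ${after:.2f}/month")
```

The saving scales with the read share of the traffic, so the fix is strongest for read-heavy workloads like the one in the question.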

Exam Domains

Weights

Domain 1 · 30% · Design Secure Architectures
IAM, encryption, network security, data protection

Domain 2 · 26% · Design Resilient Architectures
HA, DR, fault tolerance, decoupling

Domain 3 · 24% · Design High-Performing Architectures
Scaling, caching, optimized storage/compute

Domain 4 · 20% · Design Cost-Optimized Architectures
Pricing models, rightsizing, storage tiers

Gap Analysis: 17yr Veteran

Review these

AI/ML Services (Likely Blind Spot)

Bedrock, SageMaker Canvas, Comprehend, Kendra, Q Business, Rekognition positioning on the exam. Not about using them; about when to choose which.

  • Bedrock = managed FM API, no infra, pay per token
  • SageMaker = full MLOps pipeline, bring your own model
  • Comprehend = NLP (sentiment, entities, PII detection)
  • Kendra = intelligent enterprise search over docs
  • Rekognition = image/video analysis (faces, objects, content moderation)

New Networking: VPC Lattice & Verified Access (2023)

Both likely on exam. VPC Lattice replaces many PrivateLink patterns for service-to-service. Verified Access is zero-trust app access without VPN.

Resource Control Policies (RCPs) (2024)

New SCP-like control at Organizations level but applies to resources not identities. Different from SCPs. Controls what external principals can do to your resources even if their IAM allows it.

S3 Express One Zone (2023)

10x faster than S3 Standard. Single AZ only. Used for ML training, HPC, latency-sensitive analytics. Different pricing model: charged per request + storage per GB.

Aurora Limitless Database (2024)

Horizontal write scaling beyond a single Aurora instance. Distributed sharding, managed by Aurora. Exam may contrast Aurora Global vs Aurora Serverless v2 vs Limitless.

IAM Identity Center (was SSO): naming/features evolved

Now integrates with external IdPs, trusted identity propagation to analytics services. Replaces per-account IAM role federation for most multi-account patterns.

New Services 2023–2026

High exam weight

Amazon Bedrock Generative AI Platform

AI/ML 2023
  • Managed access to foundation models (Claude, Llama, Titan, Mistral, Stable Diffusion)
  • No infra to manage, pay per token
  • Bedrock Agents: multi-step orchestration with tool use
  • Bedrock Knowledge Bases: RAG with S3/managed vector store
  • Bedrock Guardrails: content filtering, PII redaction
  • Bedrock Model Evaluation: compare model outputs
Use Bedrock when question says "managed", "no ML expertise", "foundation model". Use SageMaker when "custom training", "MLOps pipeline", "bring own model".

VPC Lattice Service Networking

Networking 2023
  • Logical application-layer network across VPCs and accounts
  • Service-to-service communication without VPC peering complexity
  • Built-in auth, observability, traffic management
  • Works with EC2, ECS, EKS, Lambda targets
  • Cheaper and simpler than PrivateLink for internal services
PrivateLink = expose to external consumers (SaaS). VPC Lattice = internal service mesh across your own accounts.

AWS Verified Access Zero-Trust App Access

Security Networking 2023
  • Provides VPN-less secure access to corporate apps
  • Evaluates each request against identity + device posture
  • Integrates with IAM Identity Center, Okta, JumpCloud
  • Integrates with CrowdStrike/Jamf for device trust signals
  • Access logs to S3/CloudWatch/Firehose
Question: "eliminate VPN, verify user identity + device health per request" → Verified Access, not Client VPN.

S3 Express One Zone High-Speed Object Storage

Storage 2023
  • 10x lower latency than S3 Standard (single-digit ms)
  • Single AZ: not cross-AZ replicated
  • Ideal: ML training datasets, financial analytics, HPC
  • Uses "directory buckets" (different API path from S3)
  • Charged per request + GB, no free tier
Single AZ = not durable to AZ failure. Exam will tempt you to pick this for HA scenarios; that's wrong. Also: NOT a replacement for EFS for shared file access.

Aurora Limitless Database

Database 2024
  • Horizontal write scaling beyond single Aurora writer limit
  • AWS manages distributed sharding transparently
  • Still Aurora-compatible SQL interface
  • Compare: Aurora Serverless v2 = scale down to zero, Aurora Global = multi-region reads, Limitless = multi-shard writes
Don't confuse with Aurora Serverless. Limitless = massive write scale. Serverless = variable, spiky workloads that need scale-to-zero.

ElastiCache Serverless

Database 2023
  • No cluster sizing decisions, auto-scales instantly
  • Supports both Redis and Memcached engines
  • Sub-millisecond latency maintained at scale
  • Pay per ECPU (ElastiCache Processing Unit) consumed + GB stored
Exam may frame as "unpredictable cache demand" → Serverless. Predictable steady-state → provisioned (cheaper).

Resource Control Policies (RCPs)

Security 2024
  • Organization-level guardrails on resources (not identities)
  • Prevents cross-account access even if destination IAM allows
  • Complements SCPs: SCPs control principals, RCPs control resources
  • Example: "deny S3 bucket access from outside org" in one policy
SCP denies what org members CAN DO. RCP denies what CAN BE DONE TO your resources regardless of who grants permission.
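
The "deny S3 bucket access from outside org" example can be sketched as a policy document. This is an illustrative sketch only: the organization ID is a placeholder, and the exact statement should be checked against current AWS documentation before use. The second condition is there so AWS service principals acting on your behalf are not caught by the deny.

```python
import json

ORG_ID = "o-exampleorgid"  # hypothetical organization ID

rcp = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyS3FromOutsideOrg",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": "*",
            "Condition": {
                # Deny only when the caller is NOT a principal in this org...
                "StringNotEqualsIfExists": {"aws:PrincipalOrgID": ORG_ID},
                # ...and is not an AWS service principal.
                "BoolIfExists": {"aws:PrincipalIsAWSService": "false"},
            },
        }
    ],
}

print(json.dumps(rcp, indent=2))
```

Note that the policy only ever denies. An RCP, like an SCP, grants nothing by itself.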

AWS Graviton4 ARM Compute

Compute 2024
  • 40% better price-performance vs x86 for general workloads
  • R8g instances: memory-optimized, database workloads
  • M8g, C8g instances available
  • Fully compatible with containers, Java, Python, Go, Rust
  • Graviton3: M7g, C7g, R7g; still heavily used/tested
Exam: "reduce EC2 cost, no application changes required" → Graviton (if app is already Linux/compiled). Wrong answer: Reserved Instances alone without instance type change.

AWS Trainium2 / Inferentia2

AI/ML Compute 2024
  • Trainium2 (Trn2): ML training chip, cheaper than GPU for large models
  • Inferentia2 (Inf2): ML inference, optimized throughput/latency/cost
  • Capacity Blocks for ML: reserved GPU/accelerator capacity for time-boxed training runs
  • P5 instances: H100 GPUs for extreme training (P5e/P5en: H200)
Training = Trn2 (cheapest). Inference at scale = Inf2. Need NVIDIA compatibility = P4/P5. Exam may distinguish these.

Amazon Q Business Enterprise AI

AI/ML 2024
  • AI assistant grounded in your enterprise data
  • Connects to S3, Confluence, SharePoint, Salesforce, Jira, etc.
  • Respects existing IAM/SAML permissions when answering
  • Different from Kendra: Q Business = conversational AI, Kendra = document search API
"Build internal ChatGPT over company docs" → Q Business. "Search API over document corpus" → Kendra. "Custom AI product" → Bedrock.

AWS Clean Rooms

Security 2023
  • Collaborate on data analysis without sharing raw data
  • Each party brings data, run joint queries, no raw export
  • Built-in controls: aggregation-only, noise injection, column restrictions
  • For: ad measurement, financial analytics, healthcare research
"Analyze joint data with partner without exposing PII to each other" → Clean Rooms. Not Lake Formation, not just S3 permissions.

IAM Identity Center (evolved from SSO)

Security Updated 2023+
  • Central SSO for all AWS accounts + business apps
  • Trusted identity propagation: pass user identity to EMR, Redshift, S3 Access Grants
  • Integrates with external IdPs via SAML 2.0 / OIDC
  • Permission sets = IAM roles deployed across accounts
  • Replaces per-account federation for multi-account setups
Multi-account + SSO = Identity Center, not Cognito. Cognito = app user pools (B2C). Identity Center = workforce/employee access.

AWS Private 5G

Networking 2023
  • Deploy private 5G/LTE network in your facility
  • AWS provides hardware + SIM cards + management
  • Low latency industrial/manufacturing connectivity
  • Integrates with AWS Outposts

Amazon DataZone Data Governance

Database 2023
  • Data catalog + marketplace + governance platform
  • Publish, discover, and subscribe to data assets
  • Works with Redshift, Glue, S3, Athena
  • Lake Formation = row/column access control. DataZone = catalog + discovery + business glossary
Lake Formation for fine-grained access. DataZone for data discovery and cross-team sharing workflows.

Amazon OpenSearch Serverless

Database 2023
  • No cluster sizing for OpenSearch
  • Auto-scales index + search capacity independently
  • Collections: time-series or search (different optimizations)
  • Pay per OCU (OpenSearch Compute Unit) consumed

AWS Supply Chain

AI/ML 2023
  • ML-powered supply chain visibility and risk management
  • Connects to ERP, WMS, TMS systems
  • Demand planning, inventory optimization recommendations

Compute: Exam Focus

EC2 · Lambda · ECS · EKS · Batch

EC2 Instance Types: When to Pick What

  • M-family: general purpose, balanced CPU/RAM (web servers, app servers)
  • C-family: compute optimized (HPC, batch, gaming servers, media encoding)
  • R-family: memory optimized (in-memory DBs, Redis, real-time analytics)
  • I-family: storage optimized NVMe (NoSQL DBs, data warehousing, Elasticsearch)
  • P/G-family: GPU (ML training, graphics rendering)
  • Trn/Inf: ML-specific training/inference (cheaper than P-family for compatible workloads)
  • Graviton (g suffix: m7g, c7g): ARM, 20-40% cost reduction
  • Mac instances: Xcode CI/CD on actual macOS

EC2 Pricing Models

  • On-Demand: no commitment, highest rate
  • Reserved (1yr/3yr): up to 72% off, Standard RI or Convertible RI
  • Savings Plans (Compute): flexible, applies to EC2+Lambda+Fargate, hourly commitment
  • Savings Plans (EC2): region+family locked, higher discount
  • Spot: up to 90% off, interruptible 2min notice
  • Dedicated Host: physical server, BYOL (Oracle, Windows Server)
  • Dedicated Instance: single-tenant, no visibility into host
Compute Savings Plans > EC2 Savings Plans in flexibility. RI Standard = can't change instance type. RI Convertible = can change, lower discount. Dedicated Host ≠ Dedicated Instance.
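
To get a feel for the "up to" discounts, here is a small best-case comparison. The hourly rate is a made-up placeholder, not a real price; only the 72% and 90% ceilings come from the list above, and real discounts are often lower.

```python
ON_DEMAND_HOURLY = 0.10  # hypothetical USD rate, for illustration only
HOURS_PER_MONTH = 730

# Best-case ("up to") discounts taken from the list above.
MAX_DISCOUNT = {"on_demand": 0.0, "reserved": 0.72, "spot": 0.90}


def best_case_monthly(model: str) -> float:
    """Monthly cost of one instance if the maximum discount applied."""
    return ON_DEMAND_HOURLY * HOURS_PER_MONTH * (1 - MAX_DISCOUNT[model])


for model in MAX_DISCOUNT:
    print(f"{model:10s} ${best_case_monthly(model):6.2f}/month")
```

The numbers only rank the options on price; the exam usually decides between them on interruption tolerance (Spot) and commitment flexibility (Savings Plans vs RIs).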

Auto Scaling Policies

  • Target tracking: maintain metric at target (e.g., CPU 60%); simplest
  • Step scaling: add N instances based on alarm breach magnitude
  • Simple scaling: one action per alarm, has cooldown; legacy
  • Scheduled: predictable load patterns
  • Predictive: ML forecast of future load, pre-scales
  • Warm pools: pre-initialized stopped instances, fast scale-out
Target tracking is usually the right answer for "minimize costs while maintaining performance". Predictive = known traffic pattern. Warm pools = slow boot time problem.
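
A useful mental model for target tracking is proportional sizing: scale the fleet by the ratio of the observed metric to the target. The sketch below is that simplified model only; it is an assumption for intuition, not the algorithm AWS actually runs, and the function name and defaults are invented.

```python
import math


def desired_capacity(current: int, metric: float, target: float,
                     min_size: int = 1, max_size: int = 10) -> int:
    """Proportional sizing, clamped to the group's min/max bounds."""
    raw = math.ceil(current * metric / target)
    return max(min_size, min(max_size, raw))


print(desired_capacity(4, metric=90.0, target=60.0))   # overloaded: scale out
print(desired_capacity(4, metric=30.0, target=60.0))   # underused: scale in
print(desired_capacity(8, metric=120.0, target=60.0))  # capped by max_size
```

Step scaling differs in that you spell out the increments per alarm threshold yourself instead of letting the service derive them from the target.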

Lambda Limits & Patterns

  • Max execution: 15 minutes
  • Memory: 128MB–10GB (CPU scales with memory)
  • Ephemeral storage (/tmp): up to 10GB
  • Concurrency: 1000 default per region (can increase)
  • Cold start mitigations: Provisioned Concurrency, SnapStart (Java)
  • Lambda SnapStart: 10x faster Java cold starts, pre-initialized snapshots
  • Layers: share code/deps up to 250MB total
  • Container images: up to 10GB (no layer limit constraint)
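Lambda timeout = 15min max. Need longer? Use Fargate or ECS. SnapStart launched as Java-only; Python and .NET managed runtimes were added in late 2024. Provisioned Concurrency ≠ reserved concurrency: provisioned keeps execution environments pre-initialized, reserved caps (and guarantees) a function's share of account concurrency.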
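
The hard limits above lend themselves to a quick sanity check. This is a hypothetical helper for self-quizzing, not an AWS API; the class and function names are invented, and the limits are the ones listed in this section.

```python
from dataclasses import dataclass

MAX_TIMEOUT_S = 15 * 60    # 15 minutes
MIN_MEMORY_MB = 128
MAX_MEMORY_MB = 10_240     # 10 GB
MAX_EPHEMERAL_MB = 10_240  # /tmp ceiling


@dataclass
class LambdaConfig:
    timeout_s: int
    memory_mb: int
    ephemeral_mb: int = 512


def violations(cfg: LambdaConfig) -> list[str]:
    """Return the limits a proposed function configuration would break."""
    problems = []
    if cfg.timeout_s > MAX_TIMEOUT_S:
        problems.append("timeout over 15 min: consider Fargate/ECS or Batch")
    if cfg.memory_mb not in range(MIN_MEMORY_MB, MAX_MEMORY_MB + 1):
        problems.append("memory outside 128 MB to 10 GB")
    if cfg.ephemeral_mb > MAX_EPHEMERAL_MB:
        problems.append("/tmp over 10 GB: consider EFS or S3")
    return problems


print(violations(LambdaConfig(timeout_s=3_600, memory_mb=2_048)))
print(violations(LambdaConfig(timeout_s=60, memory_mb=512)))
```

An hour-long job fails the timeout check, which is the usual exam cue to move off Lambda.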

ECS vs EKS vs Fargate

  • ECS (Fargate): AWS-managed, no node management, simpler, AWS-native
  • ECS (EC2): control over host, GPU support, spot, custom AMIs
  • EKS (Fargate): Kubernetes API, serverless pods, no node mgmt
  • EKS (EC2/managed nodes): K8s, full control, daemonsets, stateful
  • EKS Anywhere: run K8s on-prem with EKS tooling
  • ECS Anywhere: run ECS tasks on-prem servers
"We use Kubernetes" → EKS. "Migrate Docker containers with minimal overhead" → ECS Fargate. "On-prem container orchestration" → ECS Anywhere or EKS Anywhere.

AWS Batch

  • Fully managed batch computing on EC2/Spot/Fargate
  • Job queues, compute environments, job definitions
  • Array jobs: parallel processing of independent work items
  • Fair-share scheduling: multi-tenant priority allocation
  • AWS Batch Multi-Node: distributed ML training
Batch vs Lambda: Lambda = event-driven, short. Batch = long-running, high-compute, parallel. Batch vs Step Functions: Step Functions = workflow orchestration with state. Batch = embarrassingly parallel compute.

Storage

S3 · EBS · EFS · FSx · Storage Gateway

S3 Storage Classes: Decision Tree

  • Standard: frequent access, ms latency
  • Standard-IA: infrequent access, per-GB retrieval fee, min 30 days, 128KB min
  • One Zone-IA: like IA, single AZ, 20% cheaper, not HA
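  • Glacier Instant Retrieval: archive with ms retrieval (a newer class, not the renamed original Glacier)
  • Glacier Flexible Retrieval: minutes-to-hours retrieval, cheapest active archive (this is the renamed original Glacier)
  • Glacier Deep Archive: retrieval within 12 hr (standard) or 48 hr (bulk), lowest cost storage in AWS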
  • Intelligent-Tiering: auto-moves between tiers, no retrieval fees, monitoring fee per object
  • Express One Zone: 10x faster than Standard, single AZ, premium price
Intelligent-Tiering has NO retrieval fees but has per-object monitoring cost. Bad for millions of tiny files. Standard-IA has a 30-day minimum billing period: delete before 30 days and you are still charged for 30.
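
The Standard-IA minimums are easy to misjudge, so here is a minimal sketch of how they round up what you are billed for. The function name is invented; the 30-day and 128KB floors are the ones from the list above.

```python
IA_MIN_DAYS = 30        # minimum storage duration for Standard-IA
IA_MIN_OBJECT_KB = 128  # minimum billable object size for Standard-IA


def ia_billable(size_kb: float, days_stored: int) -> tuple[float, int]:
    """Return the (size in KB, days) that Standard-IA would bill for."""
    return max(size_kb, IA_MIN_OBJECT_KB), max(days_stored, IA_MIN_DAYS)


print(ia_billable(size_kb=16, days_stored=10))     # small, short-lived object
print(ia_billable(size_kb=4_096, days_stored=90))  # large, long-lived object
```

A 16KB object deleted after 10 days is billed as 128KB for 30 days, which is why IA is a poor fit for small or short-lived objects.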

S3 Key Features

  • Object Lock: WORM. Compliance mode (nobody can delete) vs Governance mode (privileged can override)
  • Replication: CRR (cross-region) / SRR (same-region). Requires versioning on both buckets.
  • Transfer Acceleration: CloudFront edge → private AWS backbone → S3. For large uploads from far clients.
  • Multipart Upload: required >5GB, recommended >100MB
  • S3 Access Points: named network endpoints with separate policies per app/team
  • S3 Access Grants: delegate fine-grained access with identity propagation from IAM Identity Center
  • Event Notifications: to Lambda, SQS, SNS, EventBridge
  • Lifecycle rules: transition/expire objects based on age or prefix
Replication does NOT replicate existing objects, only new ones written after the rule is created (use S3 Batch Replication to backfill existing objects). Cross-account replication needs bucket policy on destination. Object Lock requires versioning.
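
The multipart thresholds can be turned into a small planning helper. This is a sketch under stated assumptions: the 100MB recommendation comes from the list above, the 5 MiB minimum part size and 10,000-part cap are standard S3 multipart limits, and the function itself is hypothetical (the SDKs do this for you).

```python
MIB = 1024 ** 2
GIB = 1024 ** 3
RECOMMEND_MULTIPART_ABOVE = 100 * MIB  # guidance from the list above
MIN_PART_SIZE = 5 * MIB                # S3 minimum for all but the last part
MAX_PARTS = 10_000                     # S3 cap on parts per upload


def ceil_div(a: int, b: int) -> int:
    return -(-a // b)


def plan_upload(size_bytes: int, part_size: int = 100 * MIB) -> tuple[str, int]:
    """Pick single PUT vs multipart and return the number of requests/parts."""
    if RECOMMEND_MULTIPART_ABOVE >= size_bytes:
        return "single PUT", 1
    part_size = max(part_size, MIN_PART_SIZE)
    parts = ceil_div(size_bytes, part_size)
    if parts > MAX_PARTS:
        # Grow the part size until the upload fits under the part cap.
        part_size = ceil_div(size_bytes, MAX_PARTS)
        parts = ceil_div(size_bytes, part_size)
    return "multipart", parts


print(plan_upload(50 * MIB))
print(plan_upload(6 * GIB))
```

Anything above 5GB has no single-PUT option at all, so the helper's multipart branch is the only valid path there.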

EBS Volume Types

  • gp3: general SSD, 3000 IOPS baseline, up to 16K IOPS, 1GB-16TB. Default.
  • gp2: older gen, IOPS scales with size (3 IOPS/GB), max 16K
  • io2/io2 Block Express: provisioned IOPS SSD, up to 256K IOPS, multi-attach capable, 99.999% durability
  • io1: older provisioned IOPS, max 64K IOPS
  • st1: throughput HDD, sequential large block (data warehouses, ETL). Can't be boot volume.
  • sc1: cold HDD, lowest cost, infrequent access. Can't be boot volume.
gp3 vs gp2: gp3 you configure IOPS independently (cheaper for high-IOPS low-storage). Multi-attach = io1/io2 only, same AZ, use with cluster-aware apps. Boot volume can only be SSD (gp2/gp3/io1/io2).
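
The gp2 versus gp3 difference is clearest as a formula. A minimal sketch: the 3 IOPS/GB rate, 16K cap, and 3000 baseline come from the list above, while the 100 IOPS floor for small gp2 volumes is an added detail not stated there.

```python
GP3_BASELINE_IOPS = 3_000  # included with every gp3 volume, regardless of size


def gp2_baseline_iops(size_gib: int) -> int:
    """gp2 baseline: 3 IOPS per GiB, floored at 100 and capped at 16,000."""
    return max(100, min(3 * size_gib, 16_000))


for size in (20, 100, 1_000, 6_000):
    print(f"{size:>5} GiB  gp2={gp2_baseline_iops(size):>6}  gp3={GP3_BASELINE_IOPS}")
```

A gp2 volume needs 1,000 GiB to reach the baseline a gp3 volume gets at any size, which is the "high IOPS, low storage" case where gp3 is cheaper.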

EFS vs FSx

  • EFS: NFS, Linux-only, multi-AZ, scales automatically, POSIX. Bursting vs Provisioned throughput.
  • EFS One Zone: single AZ, 47% cheaper, same API
  • FSx for Windows: SMB, Active Directory integrated, Windows ACLs, DFS
  • FSx for Lustre: parallel high-performance file system for HPC/ML. Native S3 integration. Sub-ms latency.
  • FSx for NetApp ONTAP: NFS+SMB+iSCSI, snapshots, replication, dedup/compression, multi-protocol
  • FSx for OpenZFS: NFS-compatible, ZFS snapshots, data compression, 1M IOPS
Linux shared storage = EFS. Windows file server = FSx for Windows. HPC/ML parallelism = FSx for Lustre. Mixed-OS or NetApp migration = FSx for ONTAP. ZFS snapshots = FSx OpenZFS.

Storage Gateway

  • File Gateway: NFS/SMB to S3. Local cache. Files stored as S3 objects.
  • Volume Gateway (Cached): iSCSI block, primary data in S3, frequent data cached locally
  • Volume Gateway (Stored): entire volume on-prem, async backup to S3
  • Tape Gateway: VTL for backup software (Veeam, Veritas). Archives to Glacier.
File Gateway = NFS on-prem to S3 (file share). Volume = block/iSCSI. Tape = backup software VTL. "Replace tape backup" → Tape Gateway. "On-prem NFS to cloud" → File Gateway.

Database

RDS · Aurora · DynamoDB · ElastiCache · Redshift

RDS: Exam Points

  • Read replicas: async replication, can be promoted, cross-region
  • Multi-AZ: sync replication, automatic failover, NOT for read scaling
  • Multi-AZ DB Cluster: 2 readable standby instances (new in 2022+)
  • RDS Proxy: connection pooling, reduces DB connections for Lambda
  • Performance Insights: DB load monitoring, wait events
  • Automated backups: 1-35 days retention, PITR
  • Storage auto-scaling: enabled separately from compute
Multi-AZ = HA failover, NOT read scale. Read replicas = read scale, NOT automatic failover (must promote manually). Lambda + RDS = must use RDS Proxy (connection exhaustion).

Aurora: Specifics

  • Shared storage layer, 6 copies across 3 AZs, quorum writes
  • Up to 15 read replicas with sub-10ms replica lag
  • Aurora Global: 1 primary + up to 5 secondary regions, <1s replication
  • Aurora Serverless v2: scales in 0.5 ACU increments, scales to 0
  • Aurora Limitless: horizontal write sharding (2024)
  • Aurora I/O-Optimized: predictable pricing for I/O-heavy workloads
  • Backtrack: rewind DB without restore (MySQL only)
Aurora Global RTO <1min for promoted secondary. Aurora Multi-AZ != RDS Multi-AZ (different architecture). Serverless v2 does NOT scale to 0 during active connections. Backtrack = not a backup, limited window.
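
The "6 copies across 3 AZs, quorum writes" design can be reasoned about with simple counting. This sketch assumes the commonly described Aurora quorum sizes (4 of 6 for writes, 3 of 6 for reads), which are not stated in the list above; the function name is invented.

```python
TOTAL_COPIES = 6  # 2 copies in each of 3 AZs
WRITE_QUORUM = 4  # assumption: commonly described Aurora write quorum
READ_QUORUM = 3   # assumption: commonly described Aurora read quorum


def storage_availability(failed_copies: int) -> dict[str, bool]:
    """Which operations still reach quorum after losing some copies."""
    alive = TOTAL_COPIES - failed_copies
    return {"writes": alive >= WRITE_QUORUM, "reads": alive >= READ_QUORUM}


print("lose one AZ (2 copies):", storage_availability(2))
print("lose one AZ + 1 copy:  ", storage_availability(3))
```

Under these assumptions, losing a whole AZ still leaves both reads and writes available, and losing an AZ plus one more copy still leaves reads.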

DynamoDB: Exam Deep Cuts

  • On-demand vs Provisioned (with Auto Scaling)
  • Global tables: multi-active multi-region, eventual consistency across regions
  • DynamoDB Streams: change data capture, triggers Lambda
  • TTL: auto-delete expired items (no cost for deletes)
  • DAX: in-memory cache, microsecond reads, write-through
  • Transactions: ACID across multiple items (TransactWriteItems)
  • GSI: different partition key, eventual consistency, no uniqueness
  • LSI: same partition key, different sort key, must be at creation, strongly consistent reads
  • PartiQL: SQL-compatible query language for DynamoDB
LSI only at table creation. GSI can be added later. Hot partition = design problem (not fixable with more capacity). DAX = write-through cache that serves eventually consistent reads; strongly consistent reads bypass the cache and go to DynamoDB. Global Tables require on-demand or the same provisioned settings in all regions.

ElastiCache

  • Redis: data structures, pub/sub, sorted sets, persistence, replication, cluster mode, Lua scripting
  • Memcached: simple K/V, no persistence, multi-threaded, horizontal scale (no replication)
  • Redis Cluster Mode: sharding across up to 500 node groups
  • Redis Serverless: auto-scale (2023)
  • Global Datastore: Redis cross-region replication
Need replication/failover = Redis. Need multi-threaded pure caching = Memcached. Need persistence = Redis. Need leaderboard/sorted set = Redis. Memcached = no persistence, no failover, no pub/sub.

Redshift

  • Columnar, MPP data warehouse (PostgreSQL-compatible)
  • RA3 nodes: managed storage (S3), scale compute/storage independently
  • Redshift Serverless: auto-scales, pay per RPU-second
  • Redshift Spectrum: query S3 directly from Redshift without loading
  • Data Sharing: live access to Redshift data across accounts/clusters without copy
  • Redshift ML: CREATE MODEL trains SageMaker model via SQL
  • Auto-copy from S3: continuously loads new S3 files into tables
Redshift = analytical/OLAP, not transactional OLTP. "Query S3 without loading" = Spectrum or Athena (Athena if no Redshift cluster needed). Redshift Serverless ≠ Aurora Serverless (different engines/purposes).

Database Selection Guide

  • Relational, known schema → RDS (MySQL/Postgres/SQL Server/Oracle)
  • Relational, massive scale, managed → Aurora
  • Key-value, millisecond, massive scale → DynamoDB
  • In-memory cache → ElastiCache (Redis or Memcached)
  • In-memory data store with durability → MemoryDB for Redis
  • Analytics/OLAP → Redshift
  • Time series → Timestream
  • Graph → Neptune (fraud, social, knowledge graphs)
  • Document store → DocumentDB (MongoDB-compatible)
  • Ledger/immutable audit → QLDB
  • Search → OpenSearch

MemoryDB for Redis

  • Durable Redis-compatible in-memory DB (not just cache)
  • Multi-AZ transaction log ensures durability
  • Microsecond reads, single-digit ms writes
  • Primary data store (not just cache layer)
  • Use when: Redis as primary DB, need durability, need fast reads
ElastiCache Redis = cache. MemoryDB = primary DB with Redis API. MemoryDB is durable. ElastiCache = optional persistence (less durable).

Networking

VPC · CloudFront · Route 53 · Direct Connect · Transit Gateway

VPC: Advanced Concepts

  • CIDR non-overlapping required for peering/TGW attachments
  • VPC Peering: non-transitive (A↔B, B↔C ≠ A↔C)
  • Transit Gateway: hub-and-spoke, transitive routing, up to 5000 VPCs
  • AWS Network Firewall: stateful/stateless inspection, IDS/IPS in VPC
  • Security Groups: stateful (return traffic auto-allowed), instance-level
  • NACLs: stateless (need explicit inbound + outbound), subnet-level, numbered rules
  • Gateway endpoints: S3 and DynamoDB; free, no NAT needed
  • Interface endpoints (PrivateLink): ENI in subnet, charges apply, most other services
NACLs are stateless: you MUST allow ephemeral ports (1024-65535) for return traffic. SGs stateful. VPC peering = non-transitive. Need transitive = Transit Gateway or VPN.
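
The non-overlapping CIDR requirement is easy to check up front with the standard library. A minimal sketch; the VPC names and ranges are made-up examples.

```python
import ipaddress
from itertools import combinations


def overlapping_pairs(cidrs: dict[str, str]) -> list[tuple[str, str]]:
    """Return every pair of VPCs whose CIDR blocks overlap."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in cidrs.items()}
    return [(a, b) for a, b in combinations(nets, 2) if nets[a].overlaps(nets[b])]


# Hypothetical VPCs: "legacy" sits inside the "prod" range.
vpcs = {"prod": "10.0.0.0/16", "dev": "10.1.0.0/16", "legacy": "10.0.128.0/17"}
print(overlapping_pairs(vpcs))
```

Any pair this reports cannot be peered as-is, which is the scenario where the exam expects re-addressing or a PrivateLink-style workaround rather than peering.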

VPC Lattice 2023

  • App-layer service network across VPCs/accounts
  • Service directory: register services, auto-discover
  • Auth policies: who can call which service
  • Traffic management: weighted routing, path-based
  • Observability: CloudWatch metrics, access logs
  • Works with EC2, Lambda, ECS, EKS, IP targets

Direct Connect

  • Dedicated 1/10/100 Gbps private connection to AWS
  • DX Gateway: connect to multiple VPCs/regions from one DX connection
  • Virtual Interfaces: Private VIF (to VPC), Public VIF (to AWS public services), Transit VIF (to TGW)
  • Hosted Connection: via partner, 50Mbps–10Gbps
  • Resilient: dual DX connections across locations for HA
  • DX + VPN: encrypted DX traffic (DX is not encrypted by default)
DX NOT encrypted. Add VPN over DX for encryption. DX Gateway doesn't enable VPC-to-VPC routing. SLA requires 2 DX connections from different locations.

Route 53 Routing Policies

  • Simple: single resource, no health check routing
  • Failover: active-passive, health check required
  • Weighted: split traffic by %, blue/green deploys
  • Latency: route to lowest-latency region
  • Geolocation: route based on user country/continent
  • Geoproximity: route based on geography + bias adjustment (requires Traffic Flow)
  • Multi-value: return multiple healthy IPs, client-side random
  • IP-based: route based on client CIDR (2023)
Geolocation vs Geoproximity: Geo = exact country match. Geoproximity = distance with bias shifting. Multi-value ≠ load balancer (no stickiness, no connection draining).
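
Weighted routing shares are just each record's weight over the total. This sketch shows the arithmetic for a blue/green split; the record names are hypothetical, and the all-zero case follows Route 53's documented behavior of treating records equally.

```python
def traffic_share(weights: dict[str, int]) -> dict[str, float]:
    """Expected fraction of DNS responses per record under weighted routing."""
    total = sum(weights.values())
    if total == 0:
        # All weights zero: every record is returned with equal probability.
        return {name: 1 / len(weights) for name in weights}
    return {name: weight / total for name, weight in weights.items()}


print(traffic_share({"blue": 90, "green": 10}))  # canary: 10% to green
print(traffic_share({"blue": 0, "green": 1}))    # full cutover to green
```

Because the split happens at DNS resolution, resolver caching and TTLs mean the observed traffic only approximates these fractions.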

CloudFront

  • Origins: S3, ALB, EC2, custom HTTP, MediaStore
  • OAC (Origin Access Control): S3 auth, replaces OAI (Origin Access Identity)
  • Cache behaviors: per path pattern, TTL, query strings, headers
  • Edge Functions: CloudFront Functions (lightweight JS at edge, sub-ms) vs Lambda@Edge (full runtime, regional)
  • Field-level encryption: encrypt specific form fields at edge
  • Signed URLs / Signed Cookies: time-limited access
  • Price classes: limit edge locations to reduce cost
  • Real-time logs: to Kinesis Data Streams
OAI is deprecated, use OAC. CloudFront Functions = header manipulation, URL rewrites, light auth. Lambda@Edge = full Node/Python, response body manipulation. CF Functions are ~10x cheaper.

Load Balancers

  • ALB: HTTP/HTTPS/WebSocket, path/host/header routing, Lambda targets, user auth via Cognito/OIDC
  • NLB: TCP/UDP/TLS, ultra-low latency, static IPs/Elastic IPs, PrivateLink
  • GWLB: transparent inspection at L3, sends traffic to appliances (firewalls, IDS)
  • Target groups: EC2, ECS, Lambda, IPs
  • Connection draining / deregistration delay: graceful shutdown
  • Sticky sessions: ALB (app cookie or LB cookie)
Need static IP for whitelist = NLB. Need path routing = ALB. Need transparent network appliance = GWLB. ALB with Lambda target = free response body transformation. NLB for UDP (gaming, VoIP).

Global Accelerator vs CloudFront

  • Global Accelerator: TCP/UDP, static Anycast IPs, routes via AWS backbone, all traffic types, DDoS protection
  • CloudFront: HTTP(S) only, caching at edge, content delivery
  • GA: good for non-HTTP (gaming, VoIP, IoT), or HTTP that can't be cached
  • CF: good for static assets, cacheable HTTP APIs
"Static IPs for global app" → Global Accelerator. "Cache images globally" → CloudFront. GA ≠ caching. Both use AWS global network.

Security

IAM · KMS · WAF · Shield · GuardDuty

IAM: Exam Nuances

  • Identity-based policies: attached to user/group/role
  • Resource-based policies: attached to resource (S3, SQS, KMS)
  • Permission boundaries: max permissions, doesn't grant itself
  • Session policies: passed during AssumeRole, further restricts
  • SCPs: org-level max, affects member accounts including their root users (never applies to the management account)
  • RCPs: org-level resource control (2024); see Gap Analysis
  • Evaluation: explicit deny → org RCPs → org SCPs → resource policy → identity policy → permission boundary → session policy
Explicit deny always wins. SCP doesn't grant permissions; it only restricts. Permission boundary restricts but doesn't grant. Cross-account: BOTH the caller's identity policy and the target's resource-based policy must allow. Same-account: an Allow in either one is enough.

KMS

  • AWS managed keys (aws/service): free, automatic rotation, limited control
  • Customer managed keys (CMK): full control, key policies, $1/month/key
  • Key policies: resource-based, primary access control for KMS
  • Grants: temporary delegated access to keys
  • Envelope encryption: KMS encrypts data key, data key encrypts data
  • Multi-region keys: replicate key material, same key ID prefix
  • External key material (BYOK): import your own key material
  • KMS CloudHSM-backed: FIPS 140-2 Level 3
KMS max encrypt payload = 4KB. For larger data: envelope encryption. CloudHSM = FIPS 140-2 Level 3 (KMS standard = Level 2). "Audit key usage" = KMS CloudTrail. Multi-region keys share key material but are independent keys.
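
The envelope encryption flow can be walked through end to end with a stand-in cipher. This is a toy sketch: `toy_cipher` is NOT real cryptography (a real system would use AES-GCM via a crypto library and the KMS GenerateDataKey/Decrypt APIs), and `KMS_KEY` merely stands in for a key that never leaves KMS.

```python
import hashlib
import secrets


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """NOT secure: XOR with a SHA-256 keystream, for illustration only."""
    blocks = -(-len(data) // 32)  # ceil(len / 32) digest blocks
    stream = b"".join(
        hashlib.sha256(key + i.to_bytes(8, "big")).digest() for i in range(blocks)
    )
    return bytes(a ^ b for a, b in zip(data, stream))


KMS_KEY = secrets.token_bytes(32)  # stand-in for the key held inside KMS


def envelope_encrypt(plaintext: bytes) -> tuple[bytes, bytes]:
    data_key = secrets.token_bytes(32)            # like GenerateDataKey
    ciphertext = toy_cipher(data_key, plaintext)  # bulk data encrypted locally
    wrapped_key = toy_cipher(KMS_KEY, data_key)   # only 32 bytes go to "KMS"
    return wrapped_key, ciphertext


def envelope_decrypt(wrapped_key: bytes, ciphertext: bytes) -> bytes:
    data_key = toy_cipher(KMS_KEY, wrapped_key)   # like Decrypt
    return toy_cipher(data_key, ciphertext)


payload = b"x" * 10_000  # well over the 4KB direct-encrypt limit
wrapped, blob = envelope_encrypt(payload)
print(len(wrapped), len(blob), envelope_decrypt(wrapped, blob) == payload)
```

The point of the pattern is visible in the sizes: however large the payload, only the small data key ever needs to pass through the 4KB-limited KMS call.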

WAF, Shield, Firewall Manager

  • WAF: L7 rules (SQL injection, XSS, geo-block, rate limit, IP sets, managed rule groups). Attaches to ALB, CloudFront, API GW, AppSync.
  • Shield Standard: free, always on, DDoS protection L3/L4
  • Shield Advanced: $3K/month, L7 DDoS, cost protection, 24/7 Shield Response Team (SRT, formerly DRT), near real-time visibility
  • Firewall Manager: centrally deploy WAF rules, Shield, SGs, Network Firewall across org
  • Network Firewall: stateful/stateless inspection in VPC, IDS/IPS
WAF on ALB protects only that ALB. Firewall Manager manages WAF across hundreds of accounts centrally. Shield Advanced required for WAF cost protection during attacks.

Threat Detection & Monitoring

  • GuardDuty: threat detection from CloudTrail, VPC Flow Logs, DNS, S3 data events. ML-based anomaly detection.
  • Macie: discover and protect sensitive data in S3 (PII, credentials, financial)
  • Inspector: vulnerability scanning for EC2 (CVEs), Lambda (function code), container images in ECR
  • Security Hub: aggregates findings from GuardDuty, Inspector, Macie, Firewall Manager, etc. CSPM. CIS benchmarks.
  • Detective: investigate security incidents with graph analysis of GuardDuty/CloudTrail findings
GuardDuty = detection. Inspector = vulnerabilities. Macie = data classification. Detective = investigation. Security Hub = aggregation/CSPM. These are different things โ€” exam will mix them.

Secrets & Certificate Management

  • Secrets Manager: store/rotate secrets, auto-rotation via Lambda, cross-account, $0.40/secret/month
  • SSM Parameter Store: free (standard), hierarchical, no auto-rotation, SecureString uses KMS
  • ACM (Certificate Manager): free TLS certs for ALB/CloudFront/API GW. Auto-renews.
  • ACM Private CA: issue private certs for internal services. Not free.
DB passwords needing rotation = Secrets Manager. Config values/flags = SSM Parameter Store. "Auto-rotate RDS password" โ†’ Secrets Manager. SSM Standard = free, max 10K params, 4KB value.

Encryption In Transit/At Rest

  • S3: SSE-S3 (AWS managed), SSE-KMS (CMK), SSE-C (customer key, client provides), client-side
  • EBS: KMS encryption, encrypted AMI = encrypted snapshots
  • RDS/Aurora: at-rest KMS, in-transit SSL/TLS
  • DynamoDB: always encrypted at rest (can specify KMS CMK)
  • "Require HTTPS only" on S3: bucket policy condition aws:SecureTransport
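S3 SSE-KMS: you control key + audit via CloudTrail. SSE-S3: AWS manages everything. "Customer manages encryption keys fully" = SSE-C or client-side. An encrypted EBS snapshot can be shared cross-account only if it uses a customer managed key (not the default aws/ebs key): share the key as well, and the recipient typically re-encrypts its copy with its own key.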
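The "require HTTPS only" bullet usually translates into a Deny statement keyed on `aws:SecureTransport`. A minimal sketch that builds such a bucket policy as a Python dict; the bucket name and the `https_only_bucket_policy` helper are hypothetical.

```python
import json


def https_only_bucket_policy(bucket_name: str) -> dict:
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",                     # explicit deny wins over any allow
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
                # Matches requests that did NOT arrive over TLS.
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            }
        ],
    }


policy = https_only_bucket_policy("example-study-bucket")  # hypothetical bucket
print(json.dumps(policy, indent=2))
```

Both the bucket ARN and the object ARN pattern are listed so the deny covers bucket-level and object-level calls.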

AI / ML Services

Bedrock · SageMaker · Comprehend · Rekognition · Kendra

Service Selection: AI/ML Decision Tree

  • No ML experience + use foundation model → Bedrock
  • Build/train custom model, MLOps → SageMaker
  • Analyze text (entities, sentiment, language, PII) → Comprehend
  • Search documents/internal knowledge base → Kendra (API) or Q Business (conversational)
  • Analyze images/video (faces, objects, moderation) → Rekognition
  • Text-to-speech → Polly
  • Speech-to-text → Transcribe
  • Translation → Translate
  • Fraud detection → Fraud Detector
  • Personalization/recommendations → Personalize
  • Forecasting → Forecast
  • OCR / document extraction → Textract

Amazon Bedrock: Deep Dive

  • Foundation models: Anthropic Claude, Meta Llama, Mistral, AI21, Amazon Titan, Stability AI
  • Bedrock Agents: multi-step orchestration, tool use (call Lambda/APIs)
  • Knowledge Bases: RAG with vector embeddings, S3 data source, managed vector store
  • Guardrails: content filtering, grounding checks, PII redaction, topic denial
  • Model Evaluation: compare models on custom datasets
  • Custom model fine-tuning: fine-tune Titan or Cohere models on your data
  • Bedrock Studio: no-code prototyping for Bedrock apps

SageMaker Key Components

  • Studio: web IDE for ML development
  • Training jobs: managed distributed training on EC2
  • Endpoints: deploy models for real-time inference
  • Batch Transform: offline inference on large datasets
  • Pipelines: CI/CD for ML workflows
  • Feature Store: centralized ML feature repository
  • Model Registry: versioning, approval workflow
  • JumpStart: pre-built models, fine-tuning templates
  • Canvas: no-code ML for business analysts
  • Ground Truth: data labeling with human workforce

Serverless & Integration

Lambda · API GW · SQS · SNS · EventBridge · Step Functions

API Gateway

  • REST API: full features, stages, caching, usage plans, API keys
  • HTTP API: cheaper, faster, OIDC/JWT auth, no usage plans, no WebSocket
  • WebSocket API: persistent connections, bidirectional
  • Authorizers: Lambda (custom), Cognito, JWT (HTTP API only)
  • Caching: per stage/method, TTL 0-3600s, up to 237GB
  • Throttling: account default 10K RPS, burst 5K, per-stage/method overrides
  • VPC Link: private integration to NLB in VPC
HTTP API = cheaper for simple cases. REST API = usage plans and API keys (per-client throttling), caching, X-Ray tracing, WAF. "Rate limit per customer" = REST API usage plans. WebSocket = WebSocket API only (a separate API type; neither REST API nor HTTP API handles WebSocket).
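The rate/burst pair in the throttling bullet behaves like a token bucket. A minimal sketch using the account defaults listed above (10K RPS, burst 5K); the `TokenBucket` class is a hypothetical illustration with an explicit clock, not API Gateway's implementation.

```python
class TokenBucket:
    """Steady refill at `rate_per_sec`, capped at `burst` tokens."""

    def __init__(self, rate_per_sec: float, burst: int):
        self.rate = rate_per_sec
        self.capacity = burst
        self.tokens = float(burst)   # bucket starts full
        self.last = 0.0

    def allow(self, now: float) -> bool:
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False                 # a real gateway would answer 429 here


bucket = TokenBucket(rate_per_sec=10_000, burst=5_000)
accepted_in_spike = sum(bucket.allow(now=0.0) for _ in range(6_000))
accepted_later = sum(bucket.allow(now=0.25) for _ in range(3_000))
print(accepted_in_spike, accepted_later)  # 5000 2500
```

A 6,000-request spike drains the burst allowance; a quarter second later only 2,500 tokens have been refilled.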

SQS

  • Standard: at-least-once, best-effort ordering, unlimited throughput
  • FIFO: exactly-once, strict ordering, 300 msg/s (3K with batching)
  • Visibility timeout: lock message during processing (default 30s, max 12hr)
  • DLQ: after maxReceiveCount failed attempts
  • Long polling: ReceiveMessageWaitTimeSeconds (up to 20s, reduces API calls)
  • Message retention: 1 min to 14 days
  • Max message size: 256KB (Extended Client Library → S3 for larger)
  • Delay queues: postpone delivery 0-15min
FIFO: 300 API calls/s baseline, i.e. up to 3,000 msg/s with batches of 10; high-throughput mode raises the quota further. Need to decouple + exactly-once = FIFO. Standard = higher scale. Visibility timeout must be > processing time or the message reappears and is processed twice.
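Why the visibility timeout must exceed processing time is easier to see in a simulation. The sketch below is an in-memory toy with an explicit clock; `ToyQueue` and `Message` are hypothetical classes, not the SQS API, and the DLQ rule (move on the attempt after `max_receive_count` receives) is a simplification.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Message:
    body: str
    receive_count: int = 0
    invisible_until: float = 0.0


class ToyQueue:
    """In-memory stand-in for a queue with a redrive policy."""

    def __init__(self, visibility_timeout: float = 30.0, max_receive_count: int = 3):
        self.visibility_timeout = visibility_timeout
        self.max_receive_count = max_receive_count
        self.messages: List[Message] = []
        self.dlq: List[Message] = []

    def send(self, body: str) -> None:
        self.messages.append(Message(body))

    def receive(self, now: float) -> Optional[Message]:
        for msg in list(self.messages):
            if msg.invisible_until > now:
                continue                       # locked by an in-flight consumer
            if msg.receive_count >= self.max_receive_count:
                self.messages.remove(msg)      # too many failed attempts
                self.dlq.append(msg)
                continue
            msg.receive_count += 1
            msg.invisible_until = now + self.visibility_timeout
            return msg
        return None

    def delete(self, msg: Message) -> None:
        self.messages.remove(msg)              # consumer finished successfully


queue = ToyQueue(visibility_timeout=30, max_receive_count=3)
queue.send("resize-image-42")
first = queue.receive(now=0)     # consumer A starts a 45s job
hidden = queue.receive(now=10)   # None: message is still invisible
second = queue.receive(now=31)   # timeout expired before A finished: redelivered
```

Consumer A never called `delete`, so the same message is handed out again at t=31 and would be processed twice.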

SNS

  • Pub/sub fan-out to SQS, Lambda, HTTP, email, SMS, mobile push
  • FIFO topics: ordered delivery, dedup, only SQS FIFO subscribers
  • Message filtering: subscribers receive only messages matching filter policy
  • SNS + SQS fan-out: one topic โ†’ multiple SQS queues for parallel processing
  • Message archival: SNS → Kinesis Data Firehose → S3
SNS fan-out pattern: one publish → multiple consumers. Filter policies on subscriber (not publisher). FIFO SNS = only FIFO SQS subscribers.
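Subscriber-side filtering can be sketched as a per-subscription match against message attributes. This covers only exact-match string values, a small subset of real SNS filter policies; the subscription names and the `matches_filter` and `fan_out` helpers are hypothetical.

```python
def matches_filter(filter_policy: dict, attributes: dict) -> bool:
    """Every policy key must be present with one of the listed values."""
    return all(attributes.get(key) in allowed for key, allowed in filter_policy.items())


# One topic, three subscriptions; an empty policy means "receive everything".
subscriptions = {
    "billing-queue": {"event_type": ["order_placed"]},
    "shipping-queue": {"event_type": ["order_placed", "order_cancelled"]},
    "audit-queue": {},
}


def fan_out(attributes: dict) -> list:
    return [name for name, policy in subscriptions.items()
            if matches_filter(policy, attributes)]


print(fan_out({"event_type": "order_cancelled"}))  # ['shipping-queue', 'audit-queue']
```

The publisher sends once and sets attributes; each subscription decides on its own whether the message is delivered.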

EventBridge

  • Event bus: default (AWS services), custom (your events), partner (SaaS)
  • Rules: match events → route to targets (Lambda, SQS, Step Functions, etc.)
  • Scheduler: cron/rate-based and one-time scheduled invocations (successor to scheduled rules)
  • Pipes: point-to-point with filtering/enrichment (SQS/DynamoDB/Kinesis → target)
  • Schema Registry: discover and store event schemas
  • Cross-account/cross-region event buses
EventBridge Scheduler is the successor to scheduled rules (originally CloudWatch Events): same cron syntax, more targets, better scaling. EventBridge Pipes = filter + enrich before routing. Use Pipes to avoid Lambda glue code.

Step Functions

  • Standard: durable, max 1 year, exactly-once, audit history, $0.025/1K transitions
  • Express: high-volume, max 5 min, at-least-once, cheaper ($1/M executions)
  • Integrations: 200+ AWS services via optimized or SDK integrations
  • Map state: parallel processing of array items
  • Wait for task token: pause until external system calls back
  • Distributed map: process millions of S3 objects in parallel
Standard = long workflows, audit needed. Express = IoT/streaming/high-volume, short. "Human approval step" = Step Functions + SNS + wait for task token. Express is NOT exactly-once.
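The pricing gap between the two workflow types shows up quickly at volume. A rough arithmetic sketch using only the two prices listed above; the workload (10M executions a month, 5 transitions each) is an assumption, and Express duration and memory charges are not modeled.

```python
STANDARD_PRICE_PER_TRANSITION = 0.025 / 1_000     # USD, from the list above
EXPRESS_PRICE_PER_EXECUTION = 1.00 / 1_000_000    # USD, request charge only


def standard_cost(executions: int, transitions_per_execution: int) -> float:
    return executions * transitions_per_execution * STANDARD_PRICE_PER_TRANSITION


def express_request_cost(executions: int) -> float:
    # Assumption: duration/memory billing for Express is ignored here.
    return executions * EXPRESS_PRICE_PER_EXECUTION


monthly_executions = 10_000_000   # hypothetical workload
print(round(standard_cost(monthly_executions, 5), 2))       # 1250.0
print(round(express_request_cost(monthly_executions), 2))   # 10.0
```

Standard bills per state transition, so chatty high-volume workflows are where Express pays off.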

Kinesis

  • Data Streams: real-time, shards (1MB/s in, 2MB/s out per shard), 24 hr to 365 day retention
  • Data Firehose: load to S3/Redshift/OpenSearch/Splunk/HTTP, micro-batching, no replay
  • Data Analytics (managed Flink): SQL/Flink on streaming data
  • Video Streams: ingest/store/process video at scale
  • Enhanced fan-out: 2MB/s per consumer per shard (push model)
  • On-demand mode: auto-scales capacity
Firehose = no replay capability (delivers then gone). Data Streams = replay within retention window. "Real-time analytics" = Data Streams → Flink/Lambda. "Load to S3 for batch" = Firehose. Firehose buffers (not truly real-time: 60s or 5MB minimum).
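Provisioned shard counts follow from the per-shard limits above. A sizing sketch: the 1MB/s in and 2MB/s out figures come from the list, while the 1,000 records/s per-shard write limit is an added assumption not stated in this guide, and `shards_needed` is a hypothetical helper.

```python
import math

INGRESS_MB_PER_SHARD = 1.0      # from the list above
EGRESS_MB_PER_SHARD = 2.0       # shared across consumers without enhanced fan-out
RECORDS_PER_SHARD = 1_000       # assumption: per-shard write record limit


def shards_needed(ingress_mb_s: float, egress_mb_s: float, records_per_s: float = 0) -> int:
    by_ingress = math.ceil(ingress_mb_s / INGRESS_MB_PER_SHARD)
    by_egress = math.ceil(egress_mb_s / EGRESS_MB_PER_SHARD)
    by_records = math.ceil(records_per_s / RECORDS_PER_SHARD)
    return max(1, by_ingress, by_egress, by_records)


# 5 MB/s written, three consumers each reading the full stream (15 MB/s total).
print(shards_needed(ingress_mb_s=5, egress_mb_s=15))  # 8
```

Here the readers, not the writers, set the shard count, which is the case enhanced fan-out is meant to relieve.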

Cost Optimization

20% of exam

Compute Cost Strategy

  • Spot for fault-tolerant: HPC, batch, CI, stateless web (up to 90% savings)
  • Spot Fleet/EC2 Fleet: mixed on-demand + spot, capacity-optimized allocation
  • Savings Plans > Reserved Instances for flexibility
  • Graviton: 20-40% cheaper for same workload
  • Lambda: ephemeral, pay-per-invocation (vs always-on EC2)
  • Fargate Spot: spot pricing for Fargate tasks
  • Right-sizing: Compute Optimizer recommendations

Storage Cost Strategy

  • S3 Lifecycle → IA → Glacier → Deep Archive
  • Intelligent-Tiering for unknown access patterns
  • S3 Storage Lens: visibility into storage usage/costs
  • EBS: gp3 over gp2 (cheaper, separate IOPS config)
  • EBS snapshots: incremental, only changed blocks
  • Delete unattached EBS volumes (common waste)
  • EFS: use Infrequent Access lifecycle + One Zone where AZ redundancy not needed
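The S3 lifecycle chain in the first bullet can be written as a rule document. A sketch in the dict shape that S3 lifecycle configurations use; the rule ID, the `logs/` prefix, and the 30/90/180-day thresholds are assumptions chosen for illustration, and `transitions_are_ordered` is a hypothetical sanity check.

```python
lifecycle_configuration = {
    "Rules": [
        {
            "ID": "tier-down-old-objects",          # hypothetical rule name
            "Status": "Enabled",
            "Filter": {"Prefix": "logs/"},          # hypothetical prefix
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
                {"Days": 180, "StorageClass": "DEEP_ARCHIVE"},
            ],
        }
    ]
}


def transitions_are_ordered(rule: dict) -> bool:
    """Each later tier must start on a strictly later day."""
    days = [t["Days"] for t in rule["Transitions"]]
    return days == sorted(days) and len(set(days)) == len(days)


print(transitions_are_ordered(lifecycle_configuration["Rules"][0]))  # True
```

Objects move to colder tiers as they age, so day counts only increase down the list.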

Cost Monitoring Tools

  • Cost Explorer: visualize, filter, forecast spending
  • Budgets: alerts at thresholds (actual or forecast), Actions to stop spending
  • Cost Anomaly Detection: ML-based spending anomaly alerts
  • Compute Optimizer: rightsizing recommendations (EC2, Lambda, EBS, ECS Fargate, RDS)
  • Trusted Advisor: cost, security, performance, fault tolerance checks
  • Cost Allocation Tags: tag resources → report by tag in Cost Explorer
  • Savings Plans recommendations: built into Cost Explorer

Data Transfer Costs (Common Gotchas)

  • Inbound to AWS: FREE
  • Same AZ, same region, same service: FREE
  • Cross-AZ (EC2-to-EC2): $0.01/GB each direction
  • Cross-region: varies, ~$0.02-0.09/GB
  • EC2 to internet: $0.09/GB (first 10TB)
  • S3 to CloudFront: FREE (then CF to internet: ~$0.085/GB)
  • VPC Endpoints (Gateway): FREE for S3/DynamoDB vs NAT costs
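The NAT versus Gateway endpoint gap is plain arithmetic. A sketch using the $0.045/GB NAT processing charge and the 500GB/day scenario quoted earlier in this guide; the 30-day month is an assumption, and NAT Gateway hourly charges are left out.

```python
NAT_PROCESSING_PER_GB = 0.045     # USD, figure quoted in this guide
GATEWAY_ENDPOINT_PER_GB = 0.0     # Gateway endpoints for S3/DynamoDB are free


def monthly_processing_cost(gb_per_day: float, price_per_gb: float, days: int = 30) -> float:
    return gb_per_day * days * price_per_gb


via_nat = monthly_processing_cost(500, NAT_PROCESSING_PER_GB)
via_endpoint = monthly_processing_cost(500, GATEWAY_ENDPOINT_PER_GB)
print(round(via_nat, 2), round(via_endpoint, 2))  # 675.0 0.0
```

The endpoint removes the per-GB processing charge entirely for same-region S3 traffic.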
"Reduce data transfer costs for S3 access from EC2" → Gateway VPC Endpoint (free). Architecture diagram shows cross-AZ calls → identify hidden transfer costs. CF in front of S3 = eliminates S3 → internet charges.

Migration & Hybrid

DMS · Snowball · MGN · Outposts

Database Migration

  • DMS (Database Migration Service): homogeneous or heterogeneous DB migration, CDC, minimal downtime
  • SCT (Schema Conversion Tool): convert schema from one engine to another (Oracle → PostgreSQL)
  • DMS + SCT: together for heterogeneous migrations
  • DMS supports: Oracle, SQL Server, MySQL, PostgreSQL, MongoDB, S3 as source/target
  • CDC: continuous replication for near-zero downtime cutover

Data Transfer to AWS

  • Snowball Edge: 80TB, compute + storage, data center to S3
  • Snowball Edge (Compute Optimized): GPU, edge ML inference
  • Snowcone: 8-14TB, smallest, field rugged, portable
  • Snowmobile: 100PB, exabyte migrations, truck
  • Rule of thumb: if data transfer takes >1 week on existing bandwidth → physical transfer device
  • DataSync: online transfer, NFS/SMB/S3/EFS/FSx, scheduled/automated, up to 10Gbps
DataSync = online, ongoing sync. Snowball = offline, one-time large migration. DataSync can also transfer between AWS storage services (EFS to S3, etc.).
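The one-week rule of thumb is a bandwidth calculation. A sketch assuming decimal terabytes and 80% usable link utilization; both assumptions and the `transfer_days` and `recommend` helpers are illustrative, not AWS guidance.

```python
def transfer_days(data_tb: float, link_mbps: float, utilization: float = 0.8) -> float:
    bits = data_tb * 1e12 * 8                          # decimal terabytes to bits
    seconds = bits / (link_mbps * 1e6 * utilization)   # usable bits per second
    return seconds / 86_400


def recommend(data_tb: float, link_mbps: float) -> str:
    if transfer_days(data_tb, link_mbps) > 7:          # the >1 week rule of thumb
        return "physical device (Snow family)"
    return "online transfer (DataSync)"


print(round(transfer_days(80, 1_000), 1), recommend(80, 1_000))
print(round(transfer_days(10, 1_000), 1), recommend(10, 1_000))
```

On a 1Gbps link, 80TB takes a little over nine days under these assumptions, while 10TB finishes in just over one day.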

Server Migration: MGN

  • MGN (Application Migration Service): lift-and-shift, continuous block replication
  • Agent-based, continuous replication, minimal downtime cutover
  • Replaced SMS (Server Migration Service)
  • VMware Cloud on AWS: run VMware workloads on AWS hardware
  • EC2 VM Import/Export: simple import of VMs (no continuous replication)

Outposts & Edge

  • Outposts: AWS rack in your data center, run ECS/EKS/RDS/S3 locally
  • Connected to parent AWS region via Direct Connect
  • Outposts Servers: 1U/2U for branch offices (smaller form factor)
  • Local Zones: AWS infrastructure closer to metro areas, low latency
  • Wavelength: AWS compute at telecom 5G edge (ultra-low latency mobile apps)
Outposts = on-prem data sovereignty + AWS APIs. Local Zones = nearby AWS infrastructure (not on-prem). Wavelength = 5G/mobile edge (ms latency for mobile games, AR/VR). These are three different things.

Comparison Tables

High value for exam

SQS vs SNS vs EventBridge vs Kinesis

| Feature | SQS | SNS | EventBridge | Kinesis Streams |
| --- | --- | --- | --- | --- |
| Pattern | Queue (pull) | Pub/Sub (push) | Event routing | Stream (replay) |
| Consumers | One | Many (fan-out) | Many rules | Many (shards) |
| Replay | No | No | No* | Yes (retention) |
| Ordering | FIFO only | FIFO topic | No | Per-shard |
| Latency | Near-real-time | Near-real-time | ~0.5s | Real-time |
| Filter | Partial | Yes (per subscriber) | Yes (rules) | Must code |
| Best for | Decouple work | Fan-out notify | SaaS + AWS events | High-volume stream |

ECS vs EKS vs Fargate

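| Feature | ECS (EC2) | ECS (Fargate) | EKS (EC2) | EKS (Fargate) |
| --- | --- | --- | --- | --- |
| Node management | You | AWS | You | AWS |
| Kubernetes API | No | No | Yes | Yes |
| Spot support | Yes | Yes (Spot) | Yes | Limited |
| GPU | Yes | No | Yes | No |
| DaemonSets | No | No | Yes | No |
| Best for | AWS-native, GPU | Serverless containers | K8s expertise | K8s serverless |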

DR Strategy Comparison

| Strategy | RTO | RPO | Cost | How |
| --- | --- | --- | --- | --- |
| Backup & Restore | Hours | Hours | Lowest | Backup to S3, restore on disaster |
| Pilot Light | 10-30 min | Minutes | Low | Core DB running, scale compute on fail |
| Warm Standby | Minutes | Seconds | Medium | Scaled-down full env, promote on fail |
| Multi-Site Active-Active | Near-zero | Near-zero | Highest | Full prod in 2+ regions, live traffic split |

S3 Storage Classes

| Class | Durability | AZs | Min Duration | Retrieval | Best for |
| --- | --- | --- | --- | --- | --- |
| Standard | 11 9s | ≥3 | None | Instant (free) | Frequent access |
| Standard-IA | 11 9s | ≥3 | 30 days | Instant (fee) | Backups, DR |
| One Zone-IA | 11 9s | 1 | 30 days | Instant (fee) | Reproducible infrequent |
| Glacier Instant | 11 9s | ≥3 | 90 days | Instant (fee) | Archive quarterly access |
| Glacier Flexible | 11 9s | ≥3 | 90 days | 1-12 hrs | Archive annual access |
| Glacier Deep Archive | 11 9s | ≥3 | 180 days | 12-48 hrs | Compliance long-term |
| Intelligent-Tiering | 11 9s | ≥3 | None | Instant (no fee) | Unknown access pattern |
| Express One Zone | 11 9s (99.95% availability) | 1 | None | Single-digit ms | ML training, analytics |

IAM Policy Evaluation Order

| Step | Policy Type | Effect if Deny |
| --- | --- | --- |
| 1 | Explicit Deny (any policy) | Stop: denied |
| 2 | Organization SCPs | Stop if not allowed |
| 3 | Resource-based policies | Grant if allows (same account) |
| 4 | Identity-based policies | Grant if allows |
| 5 | Permission boundaries | Restrict max permissions |
| 6 | Session policies | Further restrict |
| Default | Implicit deny | Denied if nothing allowed |

Common Scenario Patterns

Read these

Scenario: High Availability Web App

  • ALB → Auto Scaling Group → EC2 across multiple AZs
  • RDS Multi-AZ for database (standby sync, auto failover)
  • Read replicas for read-heavy queries
  • ElastiCache (Redis) for session state
  • Route 53 health checks for regional failover
  • CloudFront for static assets / cache

Scenario: Event-Driven Processing

  • S3 upload → S3 Event → SQS → Lambda (decoupled, retry-able)
  • SQS DLQ for failed processing
  • SNS → multiple SQS queues for fan-out parallel processing
  • EventBridge for cross-service event routing
  • Kinesis for ordered high-throughput stream processing

Scenario: Secure Cross-Account Access

  • IAM role in account B with trust policy for account A
  • Account A principal calls AssumeRole → gets temp credentials
  • Permissions: role policy AND requesting principal must allow
  • For S3: resource policy (bucket policy) can grant cross-account alone
  • SCPs must allow in both accounts
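The two sides of the AssumeRole handshake can be written out as policy documents. A sketch with placeholder account IDs and a hypothetical role name: the trust policy lives on the role in account B, the identity policy on the caller in account A.

```python
import json

TRUSTED_ACCOUNT_A = "111111111111"      # hypothetical account IDs
RESOURCE_ACCOUNT_B = "222222222222"
ROLE_NAME = "CrossAccountReadRole"      # hypothetical role name

# Attached to the role in account B: who may assume it.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": f"arn:aws:iam::{TRUSTED_ACCOUNT_A}:root"},
            "Action": "sts:AssumeRole",
        }
    ],
}

# Attached to the user or role in account A: permission to request the role.
caller_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "sts:AssumeRole",
            "Resource": f"arn:aws:iam::{RESOURCE_ACCOUNT_B}:role/{ROLE_NAME}",
        }
    ],
}

print(json.dumps(trust_policy, indent=2))
print(json.dumps(caller_policy, indent=2))
```

Trusting the account root principal delegates the decision to account A, which is why the caller-side policy is still required.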

Scenario: Cost Reduction

  • Spiky → On-demand + Spot (with Spot Fleet)
  • Steady-state → Compute Savings Plans or EC2 Reserved
  • Dev/test shutdown nights/weekends → EventBridge Scheduler
  • S3 old objects → Lifecycle to Glacier
  • EC2 over-provisioned → Compute Optimizer → Graviton
  • NAT Gateway costs high → VPC Endpoints for S3/DynamoDB
  • CloudFront → reduce S3 egress costs
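The dev/test shutdown bullet is worth quantifying. A sketch assuming instances run 12 hours on weekdays only and are billed per hour of uptime; the two cron expressions are illustrative schedule strings in EventBridge cron format (times in UTC by default), and `weekly_savings_fraction` is a hypothetical helper.

```python
HOURS_PER_WEEK = 7 * 24

# Illustrative schedules: start at 08:00 and stop at 20:00, Monday to Friday.
START_SCHEDULE = "cron(0 8 ? * MON-FRI *)"
STOP_SCHEDULE = "cron(0 20 ? * MON-FRI *)"


def weekly_savings_fraction(hours_on_per_weekday: float, weekdays: int = 5) -> float:
    hours_on = hours_on_per_weekday * weekdays
    return 1 - hours_on / HOURS_PER_WEEK


print(f"{weekly_savings_fraction(12):.1%}")  # 64.3%
```

Running 60 of 168 hours cuts instance-hours by roughly two thirds; attached EBS volumes keep billing while instances are stopped.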

Scenario: Disaster Recovery

  • RTO/RPO requirements drive strategy (Backup → Pilot Light → Warm → Active-Active)
  • Aurora Global = <1s RPO, <1min RTO between regions
  • DynamoDB Global Tables = multi-region active-active
  • Route 53 failover routing + health checks = DNS-level failover
  • CloudFormation / Infrastructure as Code = fast env recreation

Scenario: Data Lake Architecture

  • S3 (raw) → Glue ETL → S3 (curated) → Athena (query)
  • Lake Formation for column/row-level security + data catalog
  • Redshift Spectrum for SQL over S3 from Redshift
  • DataZone for cross-team data discovery and governance
  • Kinesis Firehose → S3 for streaming ingest

Scenario: Serverless API

  • API Gateway (HTTP API for low cost) → Lambda → DynamoDB
  • Cognito for auth (user pool → JWT → API GW authorizer)
  • Lambda Layers for shared dependencies
  • DAX for DynamoDB read caching
  • CloudFront in front of API GW for geographic distribution
  • SQS to offload long-running work asynchronously when the Lambda timeout is a concern

Scenario: Security & Compliance

  • GuardDuty → EventBridge → Lambda → auto-remediation
  • Config Rules → track compliance drift
  • Security Hub → aggregate findings, CSPM score
  • Macie → scan S3 for PII before sharing
  • Inspector → scan EC2/Lambda/containers for CVEs
  • CloudTrail → all API calls, immutable to S3
  • WAF → rate limit, geo-block, OWASP rules
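The GuardDuty to EventBridge hop in the first bullet depends on an event pattern. A sketch of such a pattern plus a simplified local matcher; `matches` handles only exact values and nested keys, a small subset of real EventBridge matching, and the sample event is hand-written for illustration.

```python
GUARDDUTY_FINDING_PATTERN = {
    "source": ["aws.guardduty"],
    "detail-type": ["GuardDuty Finding"],
}


def matches(pattern: dict, event: dict) -> bool:
    """Exact-value matcher: lists mean "any of", dicts recurse into nested keys."""
    for key, expected in pattern.items():
        actual = event.get(key)
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


sample_event = {   # hand-written example, trimmed to the fields used here
    "source": "aws.guardduty",
    "detail-type": "GuardDuty Finding",
    "detail": {"severity": 8, "type": "UnauthorizedAccess:EC2/SSHBruteForce"},
}

print(matches(GUARDDUTY_FINDING_PATTERN, sample_event))                        # True
print(matches(GUARDDUTY_FINDING_PATTERN, {**sample_event, "source": "aws.config"}))  # False
```

A rule with this pattern would hand matching findings to the remediation Lambda as its target.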

Pre-Exam Checklist

Click to track
SAA-C03 · 65 questions · 130 min · Pass: 720/1000 · Scenario-based multiple choice
Sources: AWS Exam Guide, AWS Documentation, Tutorials Dojo, AWS re:Invent sessions 2023-2025