Cloud Engineering

4

The AWS VPC Foundation That Runs Dev, Staging, and Prod Without Your NAT Gateway Bill Spiralling

A production-ready multi-environment VPC in CDK TypeScript. CIDR allocation, private endpoints for S3/DynamoDB, multi-AZ design, and the decision behind every choice.

5 min read

Building an AWS Chaos Engineering Platform: Architecture, Experiments, and Real-World Resilience Testing

A production-ready AWS Chaos Engineering Platform that automates failure injection, blast radius control, resilience testing, GameDays, and observability. Built with serverless, Terraform, and AWS best practices to improve system reliability and fault tolerance.

4 min read

Building a Cloud-Native APM Platform with Distributed Profiling on AWS

A cloud-native APM platform with distributed profiling, flame graphs, and performance monitoring built on AWS. Covers full architecture, VPC design, observability, and IaC with CDK to enable scalable, secure, multi-environment performance analysis.

4 min read

Building a Petabyte-Scale Log Analytics Platform on AWS

A petabyte-scale log analytics platform built on AWS using OpenSearch, S3, Kinesis, and Firehose. It delivers real-time search, long-term storage, and cost-efficient observability. Designed with Terraform IaC for high scalability, security, and enterprise readiness.

5 min read

Every week: one AWS failure broken down + the fix that worked