Observability

3

Cloud-Native SIEM on AWS: Architecture Decisions, Cost Model, and What We Would Change

A cloud-native SIEM delivering real-time threat detection, log correlation, and automated incident response, with a frank cost breakdown and the decisions we'd revisit.

7 min read

The AWS VPC Foundation That Runs Dev, Staging, and Prod Without Your NAT Gateway Bill Spiralling

A production-ready multi-environment VPC in CDK TypeScript. CIDR allocation, private endpoints for S3/DynamoDB, multi-AZ design, and the decision behind every choice.

5 min read

Building an AWS Chaos Engineering Platform: Architecture, Experiments, and Real-World Resilience Testing

A production-ready AWS Chaos Engineering Platform that automates failure injection, blast radius control, resilience testing, GameDays, and observability. Built with serverless, Terraform, and AWS best practices to improve system reliability and fault tolerance.

4 min read
