How We Cut AWS Bills by 40% Without Sacrificing Performance

Cloud cost optimisation is one of those topics that sounds unglamorous until a founder sees their AWS bill and realises they're spending ₹8 lakh a month on infrastructure for a product with 5,000 active users. At that point, it becomes very interesting very quickly.

We've run cost audits on a dozen AWS deployments over the past three years. In every single case, we've found meaningful savings — typically between 30% and 55% of the current bill — without downgrading performance, reliability, or capacity. The savings come from the same handful of places, almost every time. Here's the audit process we use and the specific optimisations that have the highest impact.

Step 1: Get Full Cost Visibility

Before you can optimise anything, you need to understand what you're spending money on. AWS Cost Explorer is the starting point, but by default it shows you service-level costs (EC2, RDS, data transfer) rather than resource-level costs. Apply Cost Allocation Tags from day one, and activate them in the Billing console so they appear in Cost Explorer. You can then break costs down by environment (production vs staging), service (API vs background workers), and team or feature.

# Tag your resources consistently
aws ec2 create-tags \
  --resources i-0abc123def456 \
  --tags Key=Environment,Value=production \
          Key=Service,Value=api-server \
          Key=Team,Value=backend
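
Once resources are tagged, the breakdown itself is simple aggregation. The sketch below is a minimal illustration, assuming you have exported cost line items into a list of dictionaries; the record shape, resource names, and amounts are hypothetical and not an AWS export format.

```python
from collections import defaultdict

def cost_by_tag(line_items, tag_key):
    """Sum cost per value of tag_key; resources missing the tag land under 'untagged'."""
    totals = defaultdict(float)
    for item in line_items:
        value = item.get("tags", {}).get(tag_key, "untagged")
        totals[value] += item["cost"]
    return dict(totals)

# Hypothetical line items; in practice these would come from a billing export.
line_items = [
    {"resource": "api-1", "cost": 310.0,
     "tags": {"Environment": "production", "Service": "api-server"}},
    {"resource": "api-staging-1", "cost": 95.0,
     "tags": {"Environment": "staging", "Service": "api-server"}},
    {"resource": "old-volume", "cost": 40.0, "tags": {}},
]

by_environment = cost_by_tag(line_items, "Environment")
by_service = cost_by_tag(line_items, "Service")
```

The "untagged" bucket is worth watching: anything that lands there is spend nobody has claimed.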

AWS Compute Optimizer analyses your resource utilisation and makes specific rightsizing recommendations. Enable it (the default tier is free) and look at its recommendations before making any other changes. It will tell you which EC2 instances are over-provisioned based on actual CPU and network utilisation over the past 14 days; memory utilisation is only factored in if the CloudWatch agent is publishing it.

Step 2: Rightsize EC2 Instances

Over-provisioned EC2 instances are the single biggest source of wasted cloud spend in almost every deployment we've audited. The pattern is always the same: a developer chose an instance size during the initial build based on worst-case assumptions, the application launched, actual utilisation came in at 15-25% CPU, and nobody changed anything.

Common Finding

In a recent audit, a client was running their API servers on m5.2xlarge instances (8 vCPU, 32GB RAM) with average CPU utilisation of 12% and memory utilisation of 18%. Moving to m5.large instances (2 vCPU, 8GB RAM) with auto-scaling reduced that line item by 68% with no measurable impact on response times.

The correct process for rightsizing:

Pull 30 days of CloudWatch metrics for CPU, memory (requires the CloudWatch agent), and network utilisation for each instance.
Identify the P99 utilisation over that period: the level that 99% of samples stay at or below. It captures near-worst-case load, which the average hides.
Choose the smallest instance type where P99 CPU stays under 70% and P99 memory stays under 80%. Leave headroom for spikes.
Test in staging before applying to production. Watch for memory-bound workloads that show low CPU but are actually constrained by RAM.
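
The selection step above can be sketched in a few lines. This is a rough illustration, not a sizing tool: it assumes utilisation samples are percentages measured on the current instance, and that load scales linearly with vCPU count and RAM, which is a simplification. The function names and sample data are hypothetical.

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def pick_instance(cpu_samples, mem_samples, current, candidates,
                  cpu_limit=70.0, mem_limit=80.0):
    """Return the smallest candidate whose projected P99 CPU and memory stay under the limits."""
    # Convert P99 percentages on the current instance into absolute demand.
    p99_vcpus = percentile(cpu_samples, 99) / 100 * current["vcpu"]
    p99_ram_gb = percentile(mem_samples, 99) / 100 * current["ram_gb"]
    for cand in sorted(candidates, key=lambda c: (c["vcpu"], c["ram_gb"])):
        cpu_pct = p99_vcpus / cand["vcpu"] * 100
        mem_pct = p99_ram_gb / cand["ram_gb"] * 100
        if cpu_pct < cpu_limit and mem_pct < mem_limit:
            return cand["name"]
    return current["name"]  # nothing smaller fits; stay where you are

M5 = [
    {"name": "m5.large", "vcpu": 2, "ram_gb": 8},
    {"name": "m5.xlarge", "vcpu": 4, "ram_gb": 16},
    {"name": "m5.2xlarge", "vcpu": 8, "ram_gb": 32},
]

# Hypothetical samples from an m5.2xlarge: mostly quiet, with a few spikes.
cpu_samples = [12.0] * 98 + [16.0, 30.0]
mem_samples = [18.0] * 99 + [22.0]

choice = pick_instance(cpu_samples, mem_samples, current=M5[2], candidates=M5)
```

With these samples the P99 CPU is 16% of 8 vCPUs and the P99 memory is 18% of 32GB, which projects to 64% CPU and 72% memory on an m5.large, so the sketch picks m5.large. Treat the result as a candidate to test in staging, not a decision.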

Step 3: Purchase Savings Plans or Reserved Instances

If you're running any workload continuously — your primary API servers, your production database — you should not be paying On-Demand prices. On-Demand is priced for maximum flexibility; Savings Plans and Reserved Instances trade that flexibility for significant discounts.

Compute Savings Plans: 1 or 3-year commitment to a consistent amount of compute usage (measured in $/hour). In return, you get up to 66% discount vs On-Demand on EC2, Fargate, and Lambda. The most flexible option — the commitment applies across instance families, regions, and operating systems.
EC2 Instance Savings Plans: Higher discount (up to 72%) but locked to a specific instance family in a specific region.
Reserved Instances for RDS: 1 or 3-year reservations for RDS can save 30-60% vs On-Demand database pricing.

Only commit to Savings Plans after rightsizing. A commitment sized for twice the compute you actually need is still wasteful, even at a discount.
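
A quick worked example shows why the order matters. The sketch below models a Savings Plan as a fixed hourly commitment that is paid whether or not it is used, with any usage beyond it billed at On-Demand rates. The 30% discount and the dollar figures are assumptions chosen for illustration, not quoted AWS prices.

```python
def hourly_cost(on_demand_usage, commitment, discount):
    """Effective hourly spend under a Savings Plan.

    on_demand_usage: what the hour's usage would cost at On-Demand rates ($/hour)
    commitment: committed spend at discounted rates ($/hour), paid whether used or not
    discount: fractional discount vs On-Demand, e.g. 0.30 for 30%
    """
    covered = commitment / (1 - discount)            # On-Demand value the commitment absorbs
    overflow = max(0.0, on_demand_usage - covered)   # the remainder is billed at On-Demand
    return commitment + overflow

DISCOUNT = 0.30           # assumed discount, for illustration only
rightsized_usage = 10.0   # $/hour at On-Demand rates after rightsizing (hypothetical)

no_plan = hourly_cost(rightsized_usage, commitment=0.0, discount=DISCOUNT)
well_sized = hourly_cost(rightsized_usage, commitment=7.0, discount=DISCOUNT)
over_committed = hourly_cost(rightsized_usage, commitment=14.0, discount=DISCOUNT)
```

Under these assumptions the well-sized commitment brings $10/hour down to about $7/hour, while a commitment sized for the pre-rightsizing fleet costs $14/hour, more than paying On-Demand with no plan at all.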

Step 4: Audit Data Transfer Costs

Data transfer costs are the most counter-intuitive part of AWS pricing, and they're consistently underestimated. The key rules:

Inbound data transfer to AWS is free. Outbound to the internet is charged at $0.09/GB (varies by region).
Transfer between AWS services in different regions incurs data transfer charges. Keep resources in the same region unless you have a specific reason not to.
Transfer between availability zones (AZs) within the same region is charged at $0.01/GB each way. For high-throughput services, this adds up.
CloudFront in front of your public assets dramatically reduces data transfer costs because the CDN charges less per GB than EC2 egress, and the cache hit rate means much of your content is served without touching origin.
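
To get a feel for how quickly the inter-AZ rule adds up, here is a back-of-the-envelope estimate. It assumes the $0.01/GB rate above applies once on the way out of the source AZ and once on the way into the destination AZ, and it uses decimal units for simplicity. The traffic figures are hypothetical.

```python
INTER_AZ_RATE_PER_GB = 0.01  # per direction, from the rule above

def inter_az_monthly_cost(requests_per_second, kb_per_request, days=30):
    """Estimate the monthly charge for traffic that crosses an AZ boundary.

    kb_per_request is the combined request and response payload.
    Uses decimal units (1 GB = 1,000,000 KB) to keep the arithmetic simple.
    """
    seconds = days * 24 * 60 * 60
    gb_per_month = requests_per_second * kb_per_request * seconds / 1_000_000
    # Each GB is billed leaving one AZ and again entering the other.
    return gb_per_month * INTER_AZ_RATE_PER_GB * 2

# Hypothetical service: 1,000 requests/second, 35 KB exchanged with a cache per request.
cross_az = inter_az_monthly_cost(requests_per_second=1000, kb_per_request=35)
```

That modest-looking workload moves roughly 90 TB a month across the AZ boundary, which comes to about $1,814 under these assumptions.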

"One client was paying $1,800/month in inter-AZ data transfer costs because their application was making API calls to a Redis cluster in a different availability zone on every request. Moving the cache to the same AZ eliminated the cost entirely."

Step 5: Optimise Lambda Functions

Lambda is billed on duration (GB-seconds) and number of invocations. Two issues come up most often, one about cost and one about latency:

Over-allocated memory

Lambda lets you allocate 128MB to 10,240MB of memory. More memory also means more CPU. Many teams default to 512MB or 1024MB for everything. AWS Lambda Power Tuning is an open-source tool (deployed as a Step Functions state machine) that runs your function at multiple memory settings and reports the cost-optimal configuration.

# Illustrative results per memory setting, in the style of a Lambda Power Tuning run (simplified)
{
  "128": { "averageDuration": 2847, "averageCost": 0.0000000298 },
  "256": { "averageDuration": 1423, "averageCost": 0.0000000298 },
  "512": { "averageDuration": 718,  "averageCost": 0.0000000300 },
  "1024": { "averageDuration": 412, "averageCost": 0.0000000343 }
}
# Optimal: 256MB - same cost as 128MB but 2x faster
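
The reasoning behind that result is just the billing formula: allocated memory in GB, times billed seconds, times the price per GB-second. The sketch below applies it to the durations above. The price constant is an assumption (the commonly published x86 rate); it varies by region and architecture, so read the output as relative rather than absolute. The per-request charge is the same at every memory setting, so it is left out.

```python
PRICE_PER_GB_SECOND = 0.0000166667  # assumed x86 rate; varies by region and architecture

def invocation_cost(memory_mb, duration_ms, price=PRICE_PER_GB_SECOND):
    """Duration cost of one invocation: allocated GB x billed seconds x price."""
    return (memory_mb / 1024) * (duration_ms / 1000) * price

def cheapest(durations_by_memory):
    """Return the memory setting with the lowest duration cost; ties go to the faster one."""
    return min(
        durations_by_memory,
        key=lambda mb: (invocation_cost(mb, durations_by_memory[mb]),
                        durations_by_memory[mb]),
    )

# Average durations in milliseconds, taken from the results above.
measured = {128: 2847, 256: 1423, 512: 718, 1024: 412}

best = cheapest(measured)
costs = {mb: invocation_cost(mb, ms) for mb, ms in measured.items()}
```

Doubling memory from 128MB to 256MB halves the duration, so the GB-seconds (and the cost) stay almost identical while latency drops. Past 512MB the speed-up no longer keeps pace with the extra memory, and cost starts to climb.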

Cold starts on VPC-attached Lambdas

Lambda functions attached to a VPC have historically had long cold start times (500ms-2s) because of the ENI provisioning process. With the Hyperplane ENI changes AWS shipped in 2019, this is much less of an issue for most functions — but VPC-attached Lambdas that haven't been invoked in a while can still experience 1-2 second cold starts. Provisioned Concurrency eliminates this but adds cost. For latency-critical endpoints, the tradeoff is usually worth it.

Step 6: S3 Storage Class Optimisation

S3 Standard costs $0.023/GB/month. S3 Intelligent-Tiering automatically moves objects between access tiers based on usage patterns: objects that haven't been accessed in 30 days move to the Infrequent Access tier ($0.0125/GB), and objects inactive for 90+ days move to the Archive Instant Access tier ($0.004/GB). Enable S3 Intelligent-Tiering on any bucket where you store objects you don't access daily, such as log archives, backups, and media assets. Objects under 128KB are never auto-tiered and stay at the Frequent Access rate, and each monitored object carries a small monthly monitoring charge, so the savings come mainly from buckets of larger objects.

# Apply Intelligent-Tiering to existing objects in a bucket
aws s3 cp s3://your-bucket/ s3://your-bucket/ \
  --recursive \
  --storage-class INTELLIGENT_TIERING

# Or have a bucket lifecycle rule move objects into Intelligent-Tiering automatically
# (note: this call replaces any lifecycle configuration already on the bucket)
aws s3api put-bucket-lifecycle-configuration \
  --bucket your-bucket \
  --lifecycle-configuration \
    '{"Rules":[{"ID":"entire-bucket","Status":"Enabled","Filter":{"Prefix":""},"Transitions":[{"Days":0,"StorageClass":"INTELLIGENT_TIERING"}]}]}'
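
To estimate what tiering is worth before enabling it, you can price a bucket under an assumed split across tiers. The sketch below uses the per-GB rates quoted above and treats the Frequent Access tier as costing the same as Standard. The 10TB bucket and the 20/30/50 split are hypothetical, and the per-object monitoring charge is left out for simplicity.

```python
PRICE_PER_GB = {
    "standard": 0.023,
    "frequent": 0.023,         # Intelligent-Tiering Frequent Access, same rate as Standard
    "infrequent": 0.0125,
    "archive_instant": 0.004,
}

def monthly_storage_cost(total_gb, tier_shares):
    """Monthly storage cost given the fraction of data sitting in each tier."""
    if abs(sum(tier_shares.values()) - 1.0) > 1e-9:
        raise ValueError("tier shares must sum to 1")
    return sum(total_gb * share * PRICE_PER_GB[tier]
               for tier, share in tier_shares.items())

# Hypothetical 10 TB bucket of logs and backups.
all_standard = monthly_storage_cost(10_000, {"standard": 1.0})
tiered = monthly_storage_cost(
    10_000, {"frequent": 0.2, "infrequent": 0.3, "archive_instant": 0.5}
)
```

Under that split the bill drops from about $230 to about $103.50 a month. The real split depends entirely on your access patterns, which is why the automatic tiering is useful: you don't have to guess it up front.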

The Audit Checklist

Run through this list on any AWS account:

Enable AWS Cost Explorer and Cost Allocation Tags
Enable AWS Compute Optimizer (free)
Rightsize all EC2 instances based on 30-day P99 utilisation
Purchase Compute Savings Plans for all steady-state workloads
Purchase Reserved Instances for all RDS instances
Audit inter-AZ and cross-region data transfer costs in Cost Explorer
Move publicly served static assets behind CloudFront
Run Lambda Power Tuning on all Lambda functions
Enable S3 Intelligent-Tiering on log, backup, and media buckets
Delete unattached EBS volumes, idle Elastic IPs, and unused load balancers
Check for idle RDS instances (low connection count in CloudWatch for 30 days)

The last item on that list — idle RDS instances — is where we found the single largest saving in a recent audit: a production-sized PostgreSQL RDS instance (db.r5.2xlarge) that was provisioned for a project that had been cancelled six months earlier, still running and costing $580/month with zero connections.
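
A check like that is easy to automate once you have the connection metrics. The sketch below is a minimal filter, assuming you have already pulled a daily peak connection count per instance (for example from CloudWatch) into plain lists. The instance names and numbers are hypothetical.

```python
def idle_instances(daily_peak_connections, window_days=30, threshold=0):
    """Flag instances whose daily peak connections never exceeded threshold across the window.

    Instances with fewer than window_days of data are skipped, so a database
    created last week is not reported as idle.
    """
    idle = []
    for name, peaks in daily_peak_connections.items():
        recent = peaks[-window_days:]
        if len(recent) == window_days and max(recent) <= threshold:
            idle.append(name)
    return sorted(idle)

# Hypothetical metrics: one busy database, one abandoned, one too new to judge.
metrics = {
    "orders-db": [40, 55, 38] * 10,
    "cancelled-project-db": [0] * 30,
    "new-reporting-db": [0] * 5,
}

flagged = idle_instances(metrics)
```

A flagged instance is a prompt to ask its owner, not a signal to delete it. Take a final snapshot before removing anything.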

Cloud cost discipline is a process, not a one-time project. Build a monthly cost review into your engineering rituals, use budget alerts in AWS Budgets to catch unexpected spikes, and treat cost regressions the same way you treat performance regressions — as issues to be fixed, not accepted.
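
One way to make the monthly review concrete is to diff this month's per-service spend against last month's and flag anything that grew meaningfully. The sketch below is one possible rule, assuming a 20% growth threshold and a $50 minimum increase to filter out noise. Both thresholds and all the figures are hypothetical.

```python
def cost_regressions(previous, current, min_growth=0.20, min_increase=50.0):
    """Return (service, increase) pairs that grew past both thresholds, largest first.

    A service with no spend last month is flagged on absolute increase alone.
    """
    flagged = []
    for service, cost in current.items():
        before = previous.get(service, 0.0)
        increase = cost - before
        if increase < min_increase:
            continue
        if before == 0 or increase / before > min_growth:
            flagged.append((service, increase))
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)

# Hypothetical monthly totals per service, in dollars.
last_month = {"EC2": 4000.0, "RDS": 1200.0, "Data Transfer": 300.0}
this_month = {"EC2": 4100.0, "RDS": 1250.0, "Data Transfer": 900.0, "Lambda": 80.0}

regressions = cost_regressions(last_month, this_month)
```

Here EC2 and RDS drift up by a few percent and are ignored, while the tripled data transfer line and the brand-new Lambda spend are both surfaced for someone to explain.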

If you want us to run a cost audit on your AWS account, get in touch. We typically identify the full audit savings within the first session.
