Cloud Cost Optimisation Strategies
Eight proven strategies ranked by effort and potential savings. Each includes realistic savings percentages, scope, and implementation steps.
- Average waste recoverable: 28-35%
- Fastest win: Days
- Biggest single saving: 60-90%
- Typical 1-year total: 20-30%
| Strategy | Saving | Scope | Effort | Time to Value |
|---|---|---|---|---|
| Reserved Instances and Savings Plans (Commitment Discounts) | 30-45% | Committed compute spend | Medium | 1-2 weeks |
| Rightsizing Compute Instances (Resource Optimisation) | 15-30% | Total compute spend | Low-Medium | 1-4 weeks |
| Spot and Preemptible Instances (Spot Pricing) | 60-90% | Interruptible workload compute | High | 2-8 weeks |
| Storage Tiering and Lifecycle Policies (Storage Optimisation) | 40-70% | Object and block storage spend | Low | Days |
| Idle and Orphaned Resource Cleanup (Waste Elimination) | 5-15% | Total cloud spend | Low | Days to 1 week |
| Data Transfer and Egress Optimisation (Network Costs) | 20-60% | Data transfer and egress spend | High | 4-12 weeks |
| Serverless and Container Optimisation (Compute Architecture) | 30-70% | Event-driven and batch workloads | High | 4-16 weeks |
| Database Rightsizing and Autoscaling (Database Optimisation) | 20-40% | Database and data warehouse spend | Medium | 2-6 weeks |
Reserved Instances and Savings Plans
Commitment Discounts
Commit to a consistent amount of compute usage for 1 or 3 years in exchange for significant discounts. AWS Compute Savings Plans offer up to 66% off on-demand EC2 rates; Azure Reserved VM Instances offer up to 72% savings. A 1-year commitment with no upfront payment still typically saves 30-40% over on-demand.
Implementation Steps
1. Analyse your baseline on-demand spend over the last 90 days
2. Identify stable, predictable workloads suitable for commitment
3. Start with 1-year Compute Savings Plans (most flexible) covering 50-60% of compute
4. Review coverage quarterly and adjust commitments as your workload mix changes
Scope
Committed compute spend
Risk level
Low - convertible options exist if architecture changes
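The sizing logic in the steps above can be sketched in a few lines. This is a minimal illustration rather than a pricing tool: the function name, the `coverage` default of 55%, and the 30% `discount` are assumptions chosen to match the ranges quoted in this section, and the model treats the commitment as a fixed hourly charge that is paid whether or not it is used.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class CommitmentEstimate:
    hourly_commitment_usd: float  # paid every hour, whether used or not
    on_demand_cost_usd: float     # cost of the period with no commitment
    committed_cost_usd: float     # cost of the same period with the commitment

    @property
    def saving_usd(self) -> float:
        return self.on_demand_cost_usd - self.committed_cost_usd


def estimate_commitment(hourly_spend, coverage=0.55, discount=0.30):
    """Estimate the effect of committing to a share of average hourly spend.

    hourly_spend: on-demand compute spend per hour in USD (e.g. 90 days of samples).
    coverage:     share of the average hourly spend to cover (assumed default).
    discount:     assumed discount on committed usage; replace with your real rate.
    """
    hourly_spend = list(hourly_spend)
    covered = coverage * mean(hourly_spend)   # on-demand value covered each hour
    commitment = covered * (1 - discount)     # discounted price of that coverage
    on_demand = sum(hourly_spend)
    # Each hour costs the commitment plus any usage above the covered level.
    committed = sum(commitment + max(0.0, h - covered) for h in hourly_spend)
    return CommitmentEstimate(commitment, on_demand, committed)


# Toy baseline: steady daytime spend with a quieter overnight period.
estimate = estimate_commitment([10.0] * 18 + [4.0] * 6)
print(f"commit {estimate.hourly_commitment_usd:.2f} USD/h, "
      f"save {estimate.saving_usd:.2f} USD over the sample")
```

Hours where usage falls below the covered level still pay the full commitment, which is why covering only part of the baseline is the cautious starting point.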
Rightsizing Compute Instances
Resource Optimisation
Most cloud instances are provisioned for peak load but run at 10-20% average CPU utilisation. Moving oversized instances to the next size down typically saves 35-50% per instance. AWS Trusted Advisor and Azure Advisor surface rightsizing recommendations automatically.
Implementation Steps
1. Pull utilisation data for all EC2 / VMs over the last 14-30 days
2. Flag any instance with average CPU below 20% and memory below 40%
3. Test downsized instances in staging environments first
4. Automate recommendations via Spot.io, CloudHealth, or native tools
Scope
Total compute spend
Risk level
Medium - requires testing; schedule during low-traffic windows
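The flagging rule from step 2 is simple enough to express directly. The sketch below assumes utilisation has already been exported as `(cpu_percent, memory_percent)` samples per instance; the data shape and the function name are illustrative, and memory figures usually require a monitoring agent because they are not always collected by default.

```python
from statistics import mean


def flag_oversized(utilisation, cpu_max=20.0, mem_max=40.0):
    """Return IDs of instances whose average CPU and memory are both below the limits.

    utilisation maps an instance ID to a list of (cpu_percent, memory_percent) samples.
    """
    flagged = []
    for instance_id, samples in utilisation.items():
        if not samples:
            continue  # no data collected: do not guess
        avg_cpu = mean(cpu for cpu, _ in samples)
        avg_mem = mean(mem for _, mem in samples)
        if avg_cpu < cpu_max and avg_mem < mem_max:
            flagged.append(instance_id)
    return sorted(flagged)


candidates = flag_oversized({
    "web-1": [(12.0, 30.0), (18.0, 35.0)],   # low CPU and low memory
    "db-1": [(15.0, 70.0)],                  # low CPU but memory-bound
})
print(candidates)
```

Averages hide short peaks, so a flagged instance is a candidate for the staging test in step 3, not an automatic downsize.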
Spot and Preemptible Instances
Spot Pricing
AWS Spot Instances, Azure Spot VMs, and GCP Preemptible VMs offer unused capacity at 60-90% discounts. They suit batch jobs, CI/CD pipelines, data processing, and stateless web tiers. Tools such as Spot.io's Elastigroup automate Spot management to reduce interruption risk.
Implementation Steps
1. Identify batch and stateless workloads that can tolerate interruption
2. Implement Spot instance diversification across instance families
3. Use Spot interruption handlers to drain workloads gracefully
4. Consider managed Kubernetes with Spot for worker node pools
Scope
Interruptible workload compute
Risk level
High - interruptions require application-level resilience
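Graceful draining (step 3) comes down to checking for an interruption signal between units of work and saving progress before stopping. The sketch below keeps the signal abstract: `interrupted` and `checkpoint` are hypothetical callables you would supply, for example a function that polls your provider's interruption notice and one that writes the resume position to durable storage.

```python
def run_batch(items, process, interrupted, checkpoint, start=0):
    """Process items from `start`, stopping cleanly if an interruption is signalled.

    process(item)   does one unit of work.
    interrupted()   returns True once the instance has been told it will be reclaimed.
    checkpoint(i)   persists the index to resume from.
    Returns the index of the next unprocessed item.
    """
    for index in range(start, len(items)):
        if interrupted():
            checkpoint(index)   # save progress before the instance goes away
            return index
        process(items[index])
    checkpoint(len(items))      # everything finished
    return len(items)


# Example run with an interruption that never fires.
processed = []
next_index = run_batch(["a", "b", "c"], processed.append, lambda: False, lambda i: None)
print(processed, next_index)
```

A replacement instance can call `run_batch` again with `start` set to the saved index, so units of work should be small enough to finish inside the provider's warning window.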
Storage Tiering and Lifecycle Policies
Storage Optimisation
S3 Standard costs about $0.023 per GB per month, S3 Standard-IA about $0.0125, and S3 Glacier about $0.004 (list prices vary by region). Moving infrequently accessed data to cheaper tiers with lifecycle policies is one of the easiest wins in cloud cost optimisation.
Implementation Steps
1. Enable S3 Storage Class Analysis to identify access patterns
2. Set lifecycle policies: move to IA after 30 days, Glacier after 90 days
3. Enable S3 Intelligent-Tiering for buckets with unpredictable access
4. Delete unattached EBS volumes and old snapshots (often 10-15% of storage cost)
Scope
Object and block storage spend
Risk level
Low - lifecycle policies are reversible; test retrieval times for Glacier
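To see what the 30-day and 90-day policy from step 2 is worth, the per-GB prices quoted above can be applied to data grouped by age. This is a rough sketch under stated assumptions: it uses only the storage prices from this section and ignores retrieval fees, transition request charges, and minimum storage durations, all of which reduce the real saving.

```python
# USD per GB per month, taken from the figures quoted in this section.
PRICES = {"STANDARD": 0.023, "STANDARD_IA": 0.0125, "GLACIER": 0.004}


def tier_for_age(age_days, ia_after=30, glacier_after=90):
    """Pick the storage class a lifecycle policy would have moved the data to."""
    if age_days >= glacier_after:
        return "GLACIER"
    if age_days >= ia_after:
        return "STANDARD_IA"
    return "STANDARD"


def monthly_costs(objects):
    """objects: iterable of (size_gb, age_days).

    Returns (all_standard_usd, tiered_usd) per month.
    """
    objects = list(objects)
    all_standard = sum(size * PRICES["STANDARD"] for size, _ in objects)
    tiered = sum(size * PRICES[tier_for_age(age)] for size, age in objects)
    return all_standard, tiered


# 1 TB of fresh data, 1 TB aged 60 days, 1 TB aged 200 days.
standard, tiered = monthly_costs([(1000, 10), (1000, 60), (1000, 200)])
print(f"all Standard: {standard:.2f} USD, tiered: {tiered:.2f} USD")
```

For this example split the tiered bill is a little over 40% lower, which sits at the bottom of the range in the summary table; buckets dominated by old data save more.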
Idle and Orphaned Resource Cleanup
Waste Elimination
Typical culprits are unattached Elastic IPs, load balancers with no targets, stopped EC2 instances that still have EBS volumes attached, and old AMIs and snapshots. These accumulate silently and can represent 10-15% of cloud spend in organisations without active governance.
Implementation Steps
1. Run AWS Trusted Advisor or Azure Advisor idle resource checks
2. Tag all resources with a LastActiveDate and owner
3. Set automated policies to stop untagged resources after 7 days
4. Schedule monthly idle resource review as a FinOps ritual
Scope
Total cloud spend
Risk level
Low - tag before deleting; maintain 30-day snapshot before removal
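The tagging policy in steps 2 and 3 can be prototyped against an exported resource inventory before it is wired into any automation. In this sketch the inventory format (dicts with `id`, `created`, and `tags`) and the lowercase `owner` tag key are assumptions; only `LastActiveDate` and the 7-day grace period come from the steps above.

```python
from datetime import datetime, timedelta, timezone

REQUIRED_TAGS = ("owner", "LastActiveDate")


def resources_to_stop(resources, now, grace=timedelta(days=7)):
    """Return IDs of resources that have lacked a required tag for longer than `grace`.

    resources: list of dicts with "id", "created" (timezone-aware datetime), "tags" (dict).
    """
    overdue = []
    for resource in resources:
        missing = [tag for tag in REQUIRED_TAGS if not resource["tags"].get(tag)]
        if missing and now - resource["created"] > grace:
            overdue.append(resource["id"])
    return overdue


now = datetime(2024, 1, 31, tzinfo=timezone.utc)
inventory = [
    {"id": "vol-old", "created": now - timedelta(days=30), "tags": {}},
    {"id": "vol-new", "created": now - timedelta(days=2), "tags": {}},
    {"id": "vol-ok", "created": now - timedelta(days=30),
     "tags": {"owner": "data-team", "LastActiveDate": "2024-01-20"}},
]
print(resources_to_stop(inventory, now))
```

Running this in report-only mode for a cycle or two gives owners time to add tags before anything is actually stopped.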
Data Transfer and Egress Optimisation
Network Costs
AWS charges about $0.09/GB for data transferred out to the internet. Cross-AZ and cross-region transfers also incur charges. For data-heavy workloads, egress can represent 20-40% of total cloud spend. VPC endpoints, CDNs, and architectural changes reduce these costs significantly.
Implementation Steps
1. Analyse your VPC Flow Logs and cost allocation reports for data transfer charges
2. Replace internet-routed calls to cloud services with VPC Endpoints (Gateway endpoints are free)
3. Use a CDN such as CloudFront or Cloudflare to serve static assets instead of serving them directly from S3
4. Co-locate services in the same region and AZ where latency allows
Scope
Data transfer and egress spend
Risk level
High - architecture changes required; latency testing needed
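A quick way to judge whether a CDN is worth the work is to compare direct egress with the cost of serving through a cache. The sketch below uses the $0.09/GB figure from this section as the default origin price; `cdn_price_per_gb` and `cache_hit_ratio` are inputs you must supply from your own CDN contract and traffic data, and the 0.05 used in the example is a placeholder, not a quoted price.

```python
def egress_cost(gb, price_per_gb=0.09):
    """Monthly cost of serving `gb` directly from the origin to the internet."""
    return gb * price_per_gb


def egress_cost_with_cdn(gb, cache_hit_ratio, cdn_price_per_gb, origin_price_per_gb=0.09):
    """Monthly cost when a CDN serves all traffic and only cache misses reach the origin.

    cache_hit_ratio:     share of traffic answered from the CDN cache (0.0-1.0).
    cdn_price_per_gb:    what the CDN charges per GB delivered (from your own pricing).
    origin_price_per_gb: what the origin charges for cache-miss traffic; pass 0 if your
                         provider does not bill origin-to-CDN transfer.
    """
    origin_gb = gb * (1 - cache_hit_ratio)
    return origin_gb * origin_price_per_gb + gb * cdn_price_per_gb


direct = egress_cost(10_000)
via_cdn = egress_cost_with_cdn(10_000, cache_hit_ratio=0.9, cdn_price_per_gb=0.05)
print(f"direct: {direct:.2f} USD, via CDN: {via_cdn:.2f} USD")
```

The saving depends almost entirely on the cache hit ratio, so measure it on real traffic before committing to the change.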
Serverless and Container Optimisation
Compute Architecture
AWS Lambda and similar serverless platforms bill per invocation and per millisecond of execution. Moving bursty, event-driven workloads from always-on EC2 to Lambda can cut compute costs by 70% or more. Kubernetes bin-packing with Karpenter or Cluster Autoscaler reduces idle node spend.
Implementation Steps
1. Profile Lambda memory settings; over-allocation is common
2. Use Lambda Power Tuning to find the optimal memory/cost/performance balance
3. Implement Karpenter for Kubernetes node provisioning efficiency
4. Review ECS/Fargate task sizes; Fargate is priced per vCPU and per GB of memory, billed per second
Scope
Event-driven and batch workloads
Risk level
High - significant architectural changes; requires engineering capacity
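The trade-off that memory tuning explores (steps 1 and 2) follows from the billing model: cost scales with memory multiplied by duration, so more memory only pays off if it shortens execution enough. The sketch below is an illustration of that arithmetic, not the Lambda Power Tuning tool itself; the default prices are assumptions based on commonly published x86 list prices and should be replaced with the rates for your region and architecture.

```python
def lambda_monthly_cost(invocations, avg_duration_ms, memory_mb,
                        price_per_gb_second=0.0000166667,   # assumed list price
                        price_per_million_requests=0.20):   # assumed list price
    """Estimate monthly cost from invocation count, average duration, and memory size."""
    gb_seconds = invocations * (avg_duration_ms / 1000) * (memory_mb / 1024)
    request_cost = invocations / 1_000_000 * price_per_million_requests
    return gb_seconds * price_per_gb_second + request_cost


def cheapest_setting(invocations, measurements):
    """measurements maps memory_mb to the average duration (ms) measured at that size."""
    return min(
        measurements,
        key=lambda mb: lambda_monthly_cost(invocations, measurements[mb], mb),
    )


# Hypothetical measurements: doubling memory from 512 MB more than halves the duration.
measured = {512: 400, 1024: 150, 2048: 140}
best = cheapest_setting(5_000_000, measured)
print(best, round(lambda_monthly_cost(5_000_000, measured[best], best), 2))
```

In this made-up example the middle setting wins because the jump to 2048 MB barely improves duration, which is the pattern profiling is meant to reveal.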
Database Rightsizing and Autoscaling
Database Optimisation
RDS, Aurora, and Redshift instances are often provisioned for peak query load but sit idle most of the time. Aurora Serverless v2 scales capacity with demand and, in recent versions, can pause to zero when idle. Redshift Serverless bills only for the compute used while queries run. Multi-AZ deployments and read replicas in dev/staging environments often cost more than the development value they provide.
Implementation Steps
1. Review RDS Enhanced Monitoring for actual utilisation vs provisioned capacity
2. Migrate dev/staging databases to Aurora Serverless or smaller instance types
3. Use Redshift Serverless for ad-hoc analytics rather than provisioned clusters
4. Purchase Reserved Instances for production databases running 24/7
Scope
Database and data warehouse spend
Risk level
Medium - test performance impact before production changes
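The four steps above amount to a small decision rule per database. The sketch below encodes one possible version of it; the field names and the 30% and 50% thresholds are illustrative assumptions, not recommendations from any provider, and should be tuned to your own workloads.

```python
def recommend(db, low_peak=0.30, steady_peak=0.50):
    """Suggest a next action for one database.

    db: dict with "environment" ("production", "staging", "dev"),
        "peak_utilisation" (0.0-1.0 of provisioned capacity) and "always_on" (bool).
    """
    if db["environment"] == "production":
        if db["always_on"] and db["peak_utilisation"] >= steady_peak:
            return "reserve"                 # steady 24/7 load: commitment candidate
        return "review"                      # rightsize first, then consider committing
    if db["peak_utilisation"] < low_peak:
        return "downsize-or-serverless"      # non-production and mostly idle
    return "keep"


fleet = {
    "orders-prod": {"environment": "production", "peak_utilisation": 0.8, "always_on": True},
    "orders-staging": {"environment": "staging", "peak_utilisation": 0.1, "always_on": True},
}
for name, db in fleet.items():
    print(name, recommend(db))
```

Treat the output as a shortlist for the performance testing noted under the risk level, since peak utilisation alone does not capture latency-sensitive queries.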