Cloud Cost Optimization Dashboard

Avg. Waste: 15–32% of cloud spend
Right-Size Savings: 30–60% per instance
Spot Discount: 60–91% vs on-demand
Storage Tiering: 30–50% on infrequent data
[Figure A05: Cost Tagging Workflow. 8-step tagging workflow: schema → tag enforcement → optimization]
01 The FinOps Framework — Crawl / Walk / Run [ AWS · Azure · GCP ]

The FinOps Foundation defines three maturity stages. Each maps to specific tooling and process changes. The goal at every stage is accountability: knowing who spends what, and why.

Stage | Focus | Key Actions | Typical Time
Crawl | Visibility | Cost baseline, tagging schema, idle resource scan | 0–30 days
Walk | Optimization | RI/Savings Plan purchase, right-sizing, budget alerts | 30–90 days
Run | Automation | Predictive modeling, auto-scaling policies, showback | 90+ days
02 Right-Sizing Compute [ HIGH ROI ]

If average CPU utilization stays below 40% over a 30-day baseline, the instance is probably oversized and is a candidate for downsizing. Industry data suggests that 60–70% of cloud instances are provisioned at roughly 2× the capacity they need.

[Figure A06: FinOps Glossary Hub. Key terms for cloud cost optimization]
Utilization Signal | Threshold | Action
CPU avg (30d) | < 40% | Downsize one size tier
Memory avg (30d) | < 50% | Review instance family
Network throughput | < 20% peak | Evaluate smaller ENI
Disk I/O | < 30% avg | Switch to lower-tier volume
  • Export CPU + memory metrics from CloudWatch / Azure Monitor / GCP Monitoring
  • Identify instances with < 40% avg CPU over 30 days
  • Test at target size before terminating original instance
  • Make one change at a time to attribute performance correctly
  • Set a monitoring alert on the new instance before closing the old one
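
The screening step above can be sketched in a few lines of Python. This is a minimal illustration, assuming metrics have already been exported as one average value per day; the `InstanceMetrics` container and the `rightsizing_actions` function are hypothetical names, not part of any provider SDK. The thresholds and suggested actions are the CPU and memory rows of the utilization table.

```python
from dataclasses import dataclass
from statistics import mean

# Thresholds (percent) taken from the CPU and memory rows of the table.
CPU_AVG_THRESHOLD = 40.0
MEMORY_AVG_THRESHOLD = 50.0


@dataclass
class InstanceMetrics:
    """Hypothetical container for metrics exported from a monitoring tool."""
    instance_id: str
    cpu_daily_avg: list[float]     # one average CPU % per day
    memory_daily_avg: list[float]  # one average memory % per day


def rightsizing_actions(m: InstanceMetrics, min_days: int = 30) -> list[str]:
    """Return suggested actions, or an empty list if the baseline is too short."""
    if len(m.cpu_daily_avg) < min_days or len(m.memory_daily_avg) < min_days:
        return []  # not enough data for a full baseline
    actions = []
    # Only the most recent min_days values count toward the baseline.
    if mean(m.cpu_daily_avg[-min_days:]) < CPU_AVG_THRESHOLD:
        actions.append("Downsize one size tier")
    if mean(m.memory_daily_avg[-min_days:]) < MEMORY_AVG_THRESHOLD:
        actions.append("Review instance family")
    return actions


if __name__ == "__main__":
    sample = InstanceMetrics("i-example", [22.0] * 30, [65.0] * 30)
    print(sample.instance_id, rightsizing_actions(sample))
```

Returning nothing for a short baseline is a deliberate choice here: an instance with only a few days of data should not be flagged on the 30-day rule.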
03 Reserved Instances vs. Savings Plans [ COMMITMENT REQUIRED ]
Feature | Savings Plans | Reserved Instances
Flexibility | High (any instance family, OS, AZ) | Low (specific instance type + AZ)
Discount | Up to 72% vs on-demand | Up to 75% vs on-demand
Commitment | 1-yr or 3-yr USD amount | 1-yr or 3-yr specific instance
Best for | Baseline predictable workload | Stable critical production
Recommendation: Start with a Compute Savings Plan for baseline workload. Layer specific RIs for your most stable, highest-utilization instances.
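
A commitment only pays off if the committed capacity is actually used. The sketch below works through that arithmetic under a simplifying assumption: the commitment is billed for every hour of the month, used or not, while on-demand is billed only for hours used. The function names and the rate, hours, and discount values are illustrative, not provider pricing.

```python
def break_even_utilization(discount: float) -> float:
    """Fraction of committed hours that must be used for the commitment
    to cost no more than on-demand for the hours actually used."""
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must be in [0, 1)")
    return 1.0 - discount


def monthly_costs(on_demand_rate: float, hours_in_month: float,
                  hours_used: float, discount: float) -> tuple[float, float]:
    """Return (on_demand_cost, committed_cost) for one instance."""
    on_demand_cost = on_demand_rate * hours_used
    # Assumption: the commitment is billed for every hour, used or not.
    committed_cost = on_demand_rate * (1.0 - discount) * hours_in_month
    return on_demand_cost, committed_cost


if __name__ == "__main__":
    # Illustrative numbers: $0.10/hour, 730-hour month, 40% discount.
    for used in (365, 438, 730):
        od, committed = monthly_costs(0.10, 730, used, 0.40)
        print(f"{used} h used: on-demand ${od:.2f}, committed ${committed:.2f}")
    print("break-even utilization:", break_even_utilization(0.40))
```

With a 40% discount the break-even point is 60% utilization, which is why the recommendation above reserves commitments for the baseline and the most stable instances.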
04 Spot Instances for Fault-Tolerant Workloads [ BATCH / CI-CD ]
Provider | Max Discount | Use Case
AWS EC2 Spot | 60–91% off | Batch processing, CI/CD agents
Azure Spot | Up to 90% off | Data pipelines, rendering
GCP Spot / Preemptible | Up to 91% off | Non-production workloads
Do NOT use spot for: databases, persistent APIs, any workload requiring consistent uptime.
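
What makes a batch job "fault-tolerant" enough for spot is usually that it can resume after the instance disappears. The following is a minimal sketch of that idea using a checkpoint file; the `Interrupted` exception and the `interrupt_at` parameter only simulate a reclaim and are not a provider API. In a real deployment the checkpoint would need to live on durable storage outside the instance, which this sketch does not show.

```python
import json
import os
import tempfile


class Interrupted(Exception):
    """Stands in for a spot reclaim; hypothetical, not a provider API."""


def run_batch(items, checkpoint_path, process, interrupt_at=None):
    """Process items in order, persisting progress after each one.

    On restart, work resumes from the index stored in checkpoint_path.
    interrupt_at simulates losing the instance before that index is processed.
    """
    start = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            start = json.load(f)["next_index"]
    for i in range(start, len(items)):
        if interrupt_at is not None and i == interrupt_at:
            raise Interrupted(f"reclaimed before item {i}")
        process(items[i])
        # Record progress so a replacement instance can skip finished work.
        with open(checkpoint_path, "w") as f:
            json.dump({"next_index": i + 1}, f)


if __name__ == "__main__":
    done = []
    path = os.path.join(tempfile.mkdtemp(), "checkpoint.json")
    try:
        run_batch(list(range(5)), path, done.append, interrupt_at=3)
    except Interrupted:
        pass  # a replacement instance would pick up from here
    run_batch(list(range(5)), path, done.append)
    print(done)
```

Workloads that cannot be restructured this way, such as the databases and persistent APIs listed above, are the ones to keep off spot.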
05 Storage Tiering [ LIFECYCLE POLICIES ]
Tier | Access | Cost | Retrieval Delay
Hot / Standard | Real-time | Baseline | None
Cool / Infrequent | Monthly | 40–60% lower storage cost | Hours
Archive / Cold | Quarterly | 70–80% lower storage cost | 12–48 hours
Glacier / Deep Archive | Annual | 95% lower storage cost | Hours to days

Implement lifecycle policies to auto-transition:
  • Hot → Cool (90d)
  • Cool → Glacier (1yr)
  • Glacier → Deep Archive (3yr)
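
As one possible concrete form of the schedule above, the sketch below builds an Amazon S3 lifecycle configuration as a plain dictionary. Mapping "Cool" to `STANDARD_IA`, "Glacier" to `GLACIER`, and "Deep Archive" to `DEEP_ARCHIVE` is an assumption, as are the rule ID and the function names; other providers use different tier names and policy formats. The helper `storage_class_for_age` is only an approximation for checking the schedule, not how S3 evaluates rules internally.

```python
def build_lifecycle_configuration(prefix: str = "") -> dict:
    """Build an S3 lifecycle configuration for the 90d / 1yr / 3yr schedule."""
    return {
        "Rules": [
            {
                "ID": "tiering-90d-1y-3y",      # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},   # empty prefix = whole bucket
                "Transitions": [
                    {"Days": 90, "StorageClass": "STANDARD_IA"},
                    {"Days": 365, "StorageClass": "GLACIER"},
                    {"Days": 1095, "StorageClass": "DEEP_ARCHIVE"},
                ],
            }
        ]
    }


def storage_class_for_age(age_days: int, config: dict) -> str:
    """Approximate the storage class an object of this age would be in."""
    current = "STANDARD"
    transitions = config["Rules"][0]["Transitions"]
    for t in sorted(transitions, key=lambda t: t["Days"]):
        if age_days >= t["Days"]:
            current = t["StorageClass"]
    return current


if __name__ == "__main__":
    cfg = build_lifecycle_configuration()
    for age in (10, 90, 400, 2000):
        print(age, storage_class_for_age(age, cfg))
```

With boto3, a dictionary of this shape can be passed as the `LifecycleConfiguration` argument of the S3 client's `put_bucket_lifecycle_configuration` call; test it on a non-production bucket first, since transitions and early retrievals can carry their own charges.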

06 Tagging Strategy [ COST ALLOCATION ]
Required Tag | Example | Purpose
Environment | production | Isolate prod vs. dev spend
Owner / Team | platform-eng | Chargeback by team
CostCenter | CC-12345 | Finance allocation
Project | payments-api | Per-project visibility
ServiceName | postgres-main | Resource identification
Enforcement: Use service control policies (SCPs) on AWS, or the equivalent policy mechanism on your provider, to block resource creation when mandatory tags are missing. Audit weekly with: aws resourcegroupstaggingapi get-resources
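
The weekly audit can be partly automated by scanning the JSON that `aws resourcegroupstaggingapi get-resources` returns. The sketch below assumes the output has been saved and parsed into a dictionary with the `ResourceTagMappingList` structure that command produces. The tag keys follow the table above, with "Owner / Team" assumed to be a single key named `Owner`; the function name and the sample ARNs are made up for illustration.

```python
# Required tag keys from the table; "Owner" is an assumed key for "Owner / Team".
REQUIRED_TAGS = {"Environment", "Owner", "CostCenter", "Project", "ServiceName"}


def find_untagged(response: dict, required=REQUIRED_TAGS) -> dict[str, list[str]]:
    """Map each resource ARN to its sorted list of missing required tag keys."""
    missing = {}
    for mapping in response.get("ResourceTagMappingList", []):
        present = {tag["Key"] for tag in mapping.get("Tags", [])}
        gap = sorted(required - present)
        if gap:
            missing[mapping["ResourceARN"]] = gap
    return missing


if __name__ == "__main__":
    # Made-up sample in the shape of the get-resources output.
    sample = {
        "ResourceTagMappingList": [
            {
                "ResourceARN": "arn:aws:ec2:us-east-1:111122223333:instance/i-example1",
                "Tags": [
                    {"Key": "Environment", "Value": "production"},
                    {"Key": "Owner", "Value": "platform-eng"},
                    {"Key": "CostCenter", "Value": "CC-12345"},
                    {"Key": "Project", "Value": "payments-api"},
                    {"Key": "ServiceName", "Value": "postgres-main"},
                ],
            },
            {
                "ResourceARN": "arn:aws:s3:::example-bucket",
                "Tags": [{"Key": "Environment", "Value": "dev"}],
            },
        ]
    }
    for arn, keys in find_untagged(sample).items():
        print(arn, "is missing", ", ".join(keys))
```

Tag keys are compared case-sensitively here, so `environment` would count as missing; whether that is the right behavior depends on the tagging schema you enforce.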
[Figure A10: Waste Detection Checklist. 15-point waste detection checklist]
Disclaimer: This guide provides general informational content about cloud infrastructure cost management. Figures and benchmarks are based on publicly available industry averages (e.g., Gartner, IDC, cloud provider documentation) and may vary by provider, region, and workload. This content is not a substitute for professional financial, legal, or technical advice specific to your organization.