Resource-Based vs Spend-Based CUDs
Google Cloud's CUD programme has two fundamentally different structures — resource-based and spend-based — that apply to different services and suit different consumption profiles. Understanding the distinction is the starting point for any CUD optimisation strategy.
Resource-Based CUDs
Resource-based CUDs commit to specific amounts of Compute Engine resources — vCPUs and GB of RAM — in a specific region for 1 or 3 years. The discounts are applied at the VM level, reducing the per-vCPU and per-GB cost for committed resources. Committed resources are applied automatically to any VM that matches the committed resource configuration — the discount applies as long as a matching VM is running, regardless of the specific VM name or purpose.
Resource-based CUD discounts are substantial: 20% for 1-year commitments and 57% for 3-year commitments off equivalent on-demand pricing for N2 and N2D compute. The 3-year rate is among the most aggressive self-service compute discounts available from a major hyperscaler. The 1-year rate is lower than the approximately 40% offered by AWS's 1-year Reserved Instances, but those require an instance-type-specific commitment rather than a flexible resource commitment.
The key limitation is resource-specificity. Resource-based CUDs are committed to a specific machine type family (N2, N2D, C2, etc.) and region. If your workloads shift to a different machine family or region, the committed resources may not apply — and you pay both the on-demand rate for the new configuration and the committed rate for unused capacity. Machine type flexibility (committing to resources rather than specific instance types) partially mitigates this, but regional specificity remains a real risk for organisations with changing geographic footprints.
Spend-Based CUDs
Spend-based CUDs commit to a minimum monthly spend on specific managed services — BigQuery, Cloud Run, Cloud SQL, Cloud Spanner, Google Kubernetes Engine, and others — and apply a percentage discount across all qualifying consumption of that service, regardless of specific resource configuration. The discount rate for spend-based CUDs is typically 20–25%, lower than the top-tier resource-based CUD rate, but the flexibility is significantly higher.
Spend-based CUDs are best suited for services where resource consumption is variable — e.g., BigQuery query volumes that vary significantly month to month — but total service spend is predictable. They are also better suited to managed services where resource-level commitment is not meaningful (you cannot commit to specific BigQuery slot configurations the way you commit to Compute Engine vCPUs).
Flex CUDs: When Flexibility Beats Discount
Google Cloud introduced Flexible Committed Use Discounts (Flex CUDs) to address the commitment risk of standard 1-year and 3-year CUDs. A Flex CUD has a minimum term of 60 days and can be cancelled at any time after the minimum term with a defined notice period (typically 30 days).
The trade-off is discount depth: Flex CUDs provide approximately 15% discount on compute resources, versus 20% for 1-year standard CUDs and 57% for 3-year standard CUDs. You sacrifice 5 percentage points of discount compared to a standard 1-year CUD in exchange for the ability to reduce or cancel the commitment when circumstances change.
When Flex CUDs Make Commercial Sense
Flex CUDs are appropriate in specific commercial situations:
- Active cloud migrations: When workloads are being migrated to GCP and the future compute profile is not yet stable, Flex CUDs allow initial commitment coverage without locking into configurations that will be obsolete once the migration completes.
- Business uncertainty: Organisations facing potential mergers, divestitures, or major restructurings where future cloud consumption is uncertain can use Flex CUDs to maintain some cost optimisation without the full downside risk of standard CUD commitments.
- New service adoption: When trialling a new GCP service or workload type, Flex CUDs provide early-stage discount coverage without the commitment lock-in of standard CUDs until consumption patterns stabilise.
For stable, mature workloads with predictable resource consumption, standard CUDs are almost always the correct choice: the 5 percentage point discount premium over Flex CUDs adds up materially over a 1-year period.
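The trade-off between a Flex CUD and a 1-year standard CUD can be sketched as a break-even calculation. The sketch below is illustrative only: the rates and terms are the approximate figures quoted in this article, the 60-day minimum and 30-day notice are rounded to whole months, and it assumes cancellation notice is given at the moment the workload is no longer needed. The function names are hypothetical.

```python
# Approximate rates and terms as quoted in this article (not an official price list).
STANDARD_1Y_DISCOUNT = 0.20
FLEX_DISCOUNT = 0.15
FLEX_MIN_TERM_MONTHS = 2   # 60-day minimum term, rounded to whole months
FLEX_NOTICE_MONTHS = 1     # 30-day notice period, rounded to whole months
HORIZON_MONTHS = 12


def standard_cost(on_demand_monthly: float) -> float:
    """A 1-year commitment is billed for all 12 months, used or not."""
    return on_demand_monthly * (1 - STANDARD_1Y_DISCOUNT) * HORIZON_MONTHS


def flex_cost(on_demand_monthly: float, months_needed: int) -> float:
    """Billed until cancellation takes effect (assumes notice is given when the need ends)."""
    billed = max(months_needed + FLEX_NOTICE_MONTHS, FLEX_MIN_TERM_MONTHS)
    return on_demand_monthly * (1 - FLEX_DISCOUNT) * min(billed, HORIZON_MONTHS)


def break_even_month(on_demand_monthly: float = 1.0) -> int:
    """Shortest workload lifetime (months) at which the 1-year CUD is no more expensive."""
    for months in range(1, HORIZON_MONTHS + 1):
        if standard_cost(on_demand_monthly) <= flex_cost(on_demand_monthly, months):
            return months
    return HORIZON_MONTHS + 1  # Flex stays cheaper over the whole horizon


if __name__ == "__main__":
    print(break_even_month())
```

Under these assumptions the standard commitment only comes out ahead when the workload is expected to last most of the year. If there is a realistic chance it ends earlier, the lower Flex rate costs less in total.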
How CUDs Interact with Sustained Use Discounts
Google Cloud automatically applies Sustained Use Discounts (SUDs) to Compute Engine instances that run for more than 25% of a calendar month, without any commitment or configuration change. SUDs reach up to 30% for instances running 100% of the month.
CUDs and SUDs interact in a specific way that affects optimisation strategy: for committed resources, the CUD replaces the SUD rather than stacking on top of it. If you have resources that would naturally qualify for the full SUD (running 100% of the month, generating ~30% discount), a 1-year resource-based CUD provides only a 20% discount on those resources, which is worse than the automatic discount they were already receiving.
This means the best CUD candidates are not necessarily your highest-utilisation resources. The best candidates are resources that run predictably but for less than the full month, where the SUD is only partial (roughly 10–19% when running 50–70% of the month) and the 20% CUD rate represents an improvement. Resources that run all month may be better left on SUDs without CUD commitment overhead.
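To make the rate comparison concrete, the following sketch computes the effective SUD for a given fraction of the month. It assumes the incremental tier schedule that produces the 30% ceiling mentioned above (each successive quarter of the month billed at 100%, 80%, 60% and 40% of the base rate); other machine families may use a different schedule, so treat the tiers and the function names as assumptions.

```python
# Assumed incremental SUD tiers: (upper bound as fraction of month, price multiplier).
SUD_TIERS = [(0.25, 1.00), (0.50, 0.80), (0.75, 0.60), (1.00, 0.40)]


def sud_effective_discount(usage_fraction: float) -> float:
    """Effective discount vs the base rate for an instance running `usage_fraction` of the month."""
    if not 0 < usage_fraction <= 1:
        raise ValueError("usage_fraction must be in (0, 1]")
    billed = 0.0
    lower = 0.0
    for upper, multiplier in SUD_TIERS:
        span = min(usage_fraction, upper) - lower
        if span <= 0:
            break
        billed += span * multiplier
        lower = upper
    return 1 - billed / usage_fraction


def cud_beats_sud_on_rate(usage_fraction: float, cud_rate: float = 0.20) -> bool:
    """Rate-only comparison. A commitment is billed for the whole month whether or
    not a matching VM is running, so total cost should be modelled separately."""
    return cud_rate > sud_effective_discount(usage_fraction)


if __name__ == "__main__":
    for usage in (0.5, 0.6, 0.7, 1.0):
        print(usage, round(sud_effective_discount(usage), 3), cud_beats_sud_on_rate(usage))
```

The comparison is deliberately limited to discount rates, which is the framing used above. It does not replace a full cost model of a commitment that is paid for regardless of usage.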
"Most enterprises apply CUDs to their most-used resources without modelling SUD interaction. We routinely find that 20–30% of a client's CUD commitment is applied to resources already receiving better SUD discounts — representing wasted commitment capacity that could be generating incremental savings elsewhere."
CUD Sizing: The Most Common Mistake
CUD over-commitment is the most frequent and costly CUD mistake. An over-committed CUD — committing to more vCPUs or RAM than your workloads actually consume — results in paying for committed resources that are not generating any matched discount, because there are no running VMs to apply them to. Unlike AWS Reserved Instances (which can be sold on the RI Marketplace if over-committed), Google Cloud CUDs cannot be transferred or sold — over-commitment is a pure cost with no recovery mechanism.
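The cost of over-commitment follows from simple arithmetic: the commitment is billed in full, so the discount actually realised depends on how much of it is matched to running VMs. The sketch below is a simplified model with hypothetical function names. It treats utilisation as the share of committed resources that are matched over the period.

```python
def effective_discount(cud_discount: float, utilisation: float) -> float:
    """Discount realised on the resources actually used.

    Committed cost is proportional to (1 - cud_discount); the on-demand cost of
    what was really consumed is proportional to `utilisation`. A negative result
    means the commitment cost more than staying on-demand.
    """
    if not 0 < utilisation <= 1:
        raise ValueError("utilisation must be in (0, 1]")
    return 1 - (1 - cud_discount) / utilisation


def break_even_utilisation(cud_discount: float) -> float:
    """Utilisation below which the commitment is worse than on-demand."""
    return 1 - cud_discount


if __name__ == "__main__":
    # 1-year rate quoted in this article
    print(break_even_utilisation(0.20))
    print(effective_discount(0.20, 0.70))
```

With the 20% 1-year rate quoted in this article, a commitment needs to be matched at least 80% of the time just to break even against on-demand pricing. That is why conservative sizing matters more for 1-year commitments than the headline discount suggests.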
The CUD Sizing Framework
Optimal CUD sizing follows a conservative approach:
- Baseline at P10: Set your initial CUD commitment at the 10th percentile of your historical resource consumption — the level you are confident you will consume even in low-demand periods. This minimises over-commitment risk while ensuring the CUD is always being applied.
- Layer with Flex CUDs for variance: Cover the variance between your P10 baseline and your expected average consumption with Flex CUDs. If average consumption is 40% higher than P10, apply Flex CUDs at 30% of baseline to cover most of the gap with limited commitment risk.
- Leave peak consumption on-demand: The peak of your resource consumption — the 5–10% of time when your workloads spike — should remain on-demand pricing. Committing to cover peak consumption means paying for resources at all other times that are only occasionally needed.
- Review quarterly: CUD utilisation data is available in the Google Cloud console. Review your CUD utilisation rate (committed resources matched to running VMs vs. committed resources idle) each quarter and adjust commitments at renewal.
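The sizing steps above can be sketched as a small calculation over historical usage samples. This is a minimal illustration, not a sizing tool: the sample data, the nearest-rank percentile method, the `CudPlan` structure and the function names are all assumptions made for the example. The 30% Flex layer is the figure suggested in the framework.

```python
from dataclasses import dataclass


def percentile(values: list[int], pct: int) -> int:
    """Nearest-rank percentile using integer arithmetic."""
    ordered = sorted(values)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # ceiling of pct% of n
    return ordered[rank - 1]


@dataclass
class CudPlan:
    standard_vcpus: int   # covered by a standard 1-year or 3-year CUD
    flex_vcpus: int       # covered by Flex CUDs
    on_demand_vcpus: int  # observed peak left on on-demand pricing


def size_commitment(vcpu_samples: list[int], flex_share_pct: int = 30) -> CudPlan:
    """Baseline at P10, Flex layer as a percentage of baseline, remainder on-demand."""
    baseline = percentile(vcpu_samples, 10)
    flex = baseline * flex_share_pct // 100
    peak = max(vcpu_samples)
    return CudPlan(baseline, flex, max(0, peak - baseline - flex))


if __name__ == "__main__":
    # Hypothetical vCPU usage samples, e.g. one per measurement interval.
    samples = [100, 110, 120, 130, 140, 150, 160, 170, 180, 200]
    print(size_commitment(samples))
```

In practice the samples would come from billing or monitoring exports covering a representative period, and the same calculation would be repeated per region and machine family because resource-based commitments are scoped to both.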
BigQuery and Cloud Run Spend-Based CUDs
Spend-based CUDs for BigQuery and Cloud Run warrant specific attention because these services are frequently high-spend areas for data-intensive and microservices-heavy enterprises.
BigQuery Spend-Based CUDs
BigQuery is Google Cloud's enterprise data warehouse, and for organisations with large analytics workloads, it often represents 20–40% of total GCP spend. BigQuery's pricing model has two dimensions: on-demand query pricing (charged per TB scanned) and flat-rate pricing (charged per slot — a unit of compute capacity). BigQuery spend-based CUDs apply to on-demand query costs and provide a 20% discount on committed monthly spend.
For organisations with consistent BigQuery query volumes, committing to spend-based CUDs on the predictable base load — leaving variable or seasonal query spikes uncovered — is a standard optimisation. Alternatively, organisations with large, consistent BigQuery usage should evaluate flat-rate pricing with slot reservations, which can be more cost-effective than on-demand with CUDs for sustained high-volume workloads.
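The effect of committing only to the predictable base load can be illustrated with a simplified billing model. The sketch assumes the commitment is expressed as on-demand-equivalent spend per month, is billed at the discounted rate whether or not it is consumed, and that any spend above it is charged at on-demand rates. The 20% rate is the figure quoted above. The monthly spend figures and function names are hypothetical.

```python
def monthly_bill(on_demand_spend: float, commitment: float, discount: float = 0.20) -> float:
    """Bill for one month under a spend-based commitment (simplified model)."""
    committed_fee = commitment * (1 - discount)         # paid even if under-used
    overage = max(0.0, on_demand_spend - commitment)    # charged at on-demand rates
    return committed_fee + overage


def total_bill(monthly_spend: list[float], commitment: float, discount: float = 0.20) -> float:
    """Sum of monthly bills for a fixed commitment level."""
    return sum(monthly_bill(spend, commitment, discount) for spend in monthly_spend)


if __name__ == "__main__":
    # Hypothetical on-demand-equivalent spend, with one seasonal spike.
    spend = [10_000, 12_000, 9_000, 20_000]
    for commitment in (0, 9_000, 12_000):
        print(commitment, total_bill(spend, commitment))
```

In this illustrative data set, committing at the lowest month's spend produces a lower total than committing at a higher level that is only reached in some months, which matches the advice to leave variable or seasonal spikes uncovered.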
Cloud Run Spend-Based CUDs
Cloud Run's serverless pricing model — charging per CPU-second and GB-second of actual consumption — is variable by nature, making resource-based CUDs impractical. Spend-based CUDs for Cloud Run commit to a minimum monthly spend and provide a 17% discount across all Cloud Run consumption above a free tier. For organisations running production workloads on Cloud Run with predictable aggregate demand, spend-based CUDs provide a simple, low-risk optimisation.
When to Negotiate a Private Pricing Agreement
Google Cloud CUDs are self-service — they can be purchased without any negotiation, and for organisations with annual GCP spend below $1–2M, they are often the right and only commercial tool needed. But for larger GCP consumers, CUDs leave significant value on the table compared to what is achievable through a negotiated private pricing agreement.
A Google Cloud private pricing agreement (equivalent to an AWS EDP) provides:
- Broader discounts across the GCP service catalogue — not just compute and specific managed services covered by CUDs
- Migration credits for organisations moving workloads from other providers or on-premises
- Professional services credits for Google Cloud architecture and optimisation work
- Training credits for Google Cloud certification programmes
- A dedicated enterprise commercial team relationship with access to Google Cloud roadmap and executive engagement
The threshold for serious private agreement engagement is approximately $1M–$2M in annual GCP spend, but Google Cloud's commercial team will engage with high-growth customers or organisations undertaking significant migrations below this threshold. Google Cloud is typically the most commercially aggressive of the three hyperscalers for competitive wins — private agreement discounts of 25–40% are achievable for organisations displacing AWS or Azure workloads, significantly above what CUDs alone can deliver.
If your organisation is at or approaching $1M annual GCP spend, the conversation to initiate a private pricing agreement should start 3–6 months before your next CUD renewal. The timing matters — GCP's commercial team has more flexibility to include new CUDs within a private agreement structure at the point of agreement inception than as modifications after the fact. Our cloud contract negotiation service covers GCP private agreement negotiations alongside AWS and Azure engagements. See also our hyperscaler comparison and the Cloud Contract Framework.