CI/CD pipelines are one of the best workloads in Azure for spot instances. Short-lived, stateless, fault-tolerant. If a build agent gets evicted, the build fails and retries on a new instance. Annoying, yes. Catastrophic, not remotely.
And yet, most teams I work with are running their DevOps agents on full pay-as-you-go compute. Paying premium rates for infrastructure that exists for minutes at a time and produces nothing that can't be reproduced.
This builds on two earlier posts in this series: one on Managed DevOps Pools with scale sets and containers, and one on zero-standby agent pools. Spot is the next lever.
What are Spot VMs?
Azure Spot VMs give you access to unused capacity at 60-90% off pay-as-you-go pricing. The trade-off is simple: Azure can reclaim the capacity with 30 seconds' notice when demand from regular customers increases.
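To make the 30-second notice concrete, here is a minimal sketch of how a process on the VM could spot a pending eviction. It assumes the notice is read from the Azure Scheduled Events metadata endpoint, where a spot eviction appears as an event of type `Preempt`. The function names and the two-second timeout are illustrative choices, not part of any agent product.

```python
import json
import urllib.request

# Azure Instance Metadata Service endpoint for Scheduled Events.
# It is only reachable from inside an Azure VM.
SCHEDULED_EVENTS_URL = (
    "http://169.254.169.254/metadata/scheduledevents?api-version=2020-07-01"
)


def fetch_scheduled_events(timeout_seconds: float = 2.0) -> dict:
    """Fetch the current Scheduled Events document from the metadata service."""
    request = urllib.request.Request(
        SCHEDULED_EVENTS_URL, headers={"Metadata": "true"}
    )
    with urllib.request.urlopen(request, timeout=timeout_seconds) as response:
        return json.loads(response.read().decode("utf-8"))


def has_preempt_event(document: dict) -> bool:
    """Return True if any listed event is a spot eviction ("Preempt")."""
    return any(
        event.get("EventType") == "Preempt"
        for event in document.get("Events", [])
    )
```

A wrapper on the agent VM might call `has_preempt_event(fetch_scheduled_events())` every few seconds and stop picking up new jobs once it returns `True`. Whether that is worth building depends on how your agents are managed; for most build workloads, simply letting the job fail and retry is enough.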
For workloads that can handle interruption, this is an extraordinary deal. For workloads that can't, it's a non-starter. CI/CD sits firmly in the first category.
Why CI/CD is the perfect spot workload
Not every workload tolerates eviction well. Databases don't. Long-running stateful services don't. But build agents are practically designed for it.
Builds are stateless. Each pipeline run starts from a clean checkout. No session state, no accumulated data, nothing that can't be recreated from source control.
Builds are short-lived. Most pipelines complete in 5-45 minutes. The shorter the job, the lower the probability of eviction during execution.
Builds are idempotent. If a build fails for any reason, including eviction, you trigger it again and get the same result.
Failed builds aren't outages. A build that fails and retries five minutes later is a minor inconvenience. Nobody's production environment is affected.
Compare this to running a production API on spot: an eviction means dropped requests, potential data loss, and a degraded customer experience. The risk profiles aren't even in the same postcode.
The numbers
Take a D4s_v5, a solid mid-range VM commonly used for build agents.
| Pricing Model | Cost per Minute | 30-Minute Build |
|---|---|---|
| On-demand (pay-as-you-go) | ~£0.15 | ~£4.50 |
| Spot | ~£0.02-0.04 | ~£0.60-1.20 |
Spot pricing varies by region and current demand, but discounts of 70-85% on D-series VMs are typical in UK South and West Europe.
Scale that to a real team running 20 builds per day across 22 working days. 440 builds per month, average 30 minutes each:
| Model | Monthly Cost |
|---|---|
| On-demand agents | ~£1,980 |
| Spot agents | ~£264-528 |
| Monthly saving | ~£1,450-1,716 |
That's roughly £17,400-20,600 a year on agent compute alone. For larger organisations running hundreds of builds daily, the numbers get substantially bigger.
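The arithmetic behind these tables is easy to rerun with your own figures. The sketch below uses the example values from this post (20 builds a day, 22 working days, 30-minute builds, and the approximate per-minute prices above). Prices are held in pence so the sums stay in whole numbers, and the variable names are just illustrative.

```python
BUILDS_PER_DAY = 20
WORKING_DAYS_PER_MONTH = 22
BUILD_MINUTES = 30

# Approximate prices from the example, in pence per minute.
ON_DEMAND_PENCE_PER_MINUTE = 15
SPOT_PENCE_PER_MINUTE_RANGE = (2, 4)


def monthly_cost_pounds(pence_per_minute: int) -> float:
    """Monthly agent compute cost in pounds for the example team."""
    builds_per_month = BUILDS_PER_DAY * WORKING_DAYS_PER_MONTH  # 440 builds
    total_pence = builds_per_month * BUILD_MINUTES * pence_per_minute
    return total_pence / 100


on_demand = monthly_cost_pounds(ON_DEMAND_PENCE_PER_MINUTE)
spot_low, spot_high = (monthly_cost_pounds(p) for p in SPOT_PENCE_PER_MINUTE_RANGE)

# The smallest saving comes from the highest spot price, and vice versa.
saving_low = on_demand - spot_high
saving_high = on_demand - spot_low

print(f"On-demand:      GBP {on_demand:,.0f}")
print(f"Spot:           GBP {spot_low:,.0f}-{spot_high:,.0f}")
print(f"Monthly saving: GBP {saving_low:,.0f}-{saving_high:,.0f}")
print(f"Annual saving:  GBP {12 * saving_low:,.0f}-{12 * saving_high:,.0f}")
```

Swap in your own build counts, durations, and regional prices to see whether the saving is large enough to justify the change.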
The eviction reality
The obvious concern is eviction. What happens when Azure reclaims your spot capacity mid-build?
The build fails. The agent is terminated, the pipeline reports a failure, and if configured for it, the build retries automatically on a new instance. The developer sees a failed run followed by a successful one. A few minutes of delay.
Eviction rates vary by VM SKU and region, but for D-series VMs in UK South, rates typically sit in Azure's two lowest published bands, 0-5% or 5-10%. On any individual build, the chance of eviction is low. Over hundreds of builds per month, you'll see the occasional one. The question is whether that occasional retry is worth saving around £1,500 a month.
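For a rough feel of what those bands mean per build, here is a back-of-the-envelope sketch. It rests on two assumptions that are mine rather than anything measured here: that the published rate is a chance of eviction per hour of runtime, and that the risk is constant and independent from one minute to the next. Real eviction behaviour is burstier than that, so treat the output as an order-of-magnitude guide.

```python
def eviction_probability(hourly_rate: float, build_minutes: float) -> float:
    """Chance that a build of the given length is evicted at least once.

    Assumes a constant, independent hourly eviction rate (a simplification).
    """
    return 1 - (1 - hourly_rate) ** (build_minutes / 60)


def expected_evictions(hourly_rate: float, build_minutes: float, builds: int) -> float:
    """Expected number of evicted builds in a month, under the same assumption."""
    return builds * eviction_probability(hourly_rate, build_minutes)


# Top of each band, for 30-minute builds and 440 builds a month.
for rate in (0.05, 0.10):
    per_build = eviction_probability(rate, 30)
    per_month = expected_evictions(rate, 30, 440)
    print(f"{rate:.0%} hourly rate: {per_build:.1%} per build, about {per_month:.0f} retries a month")
```

The same function shows why job length matters: under these assumptions, a two-hour build at a 5% hourly rate has a 1 - 0.95² ≈ 9.75% chance of being interrupted, roughly four times the risk of a 30-minute build.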
The hybrid pipeline pattern
The smartest approach isn't all-spot or all-on-demand. It's both, strategically, within the same pipeline.
Build stage is stateless compilation and packaging. Perfect for spot. If evicted, rebuild from scratch; nothing is lost.
Test stage is running test suites against build artefacts. Also fine for spot. Tests are idempotent, so a retry produces the same results.
Deploy stage (production) is applying changes to live infrastructure. This is where you want guaranteed completion.
In Azure DevOps, you can assign different agent pools to different stages. One pool configured for spot (build and test stages), one configured for on-demand (production deploys). Your YAML pipeline references the appropriate pool per stage.
The result: 80-90% of your pipeline compute runs on spot pricing, while the critical production deploy stage runs on guaranteed capacity.
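As a sketch of the stage-to-pool split, the snippet below holds the mapping in one place and prints a skeleton multi-stage pipeline with a `pool` set on each stage. The pool names `spot-agents` and `ondemand-agents` are hypothetical, as are the stage and job names and the placeholder `echo` steps. Your real pipeline would carry its own jobs, approvals, and dependencies.

```python
# Hypothetical pool names; substitute the agent pools defined in your organisation.
SPOT_POOL = "spot-agents"
ON_DEMAND_POOL = "ondemand-agents"

# Stage name -> agent pool, following the build/test/deploy split.
STAGE_POOLS = {
    "Build": SPOT_POOL,
    "Test": SPOT_POOL,
    "DeployProduction": ON_DEMAND_POOL,
}


def render_pipeline(stage_pools: dict) -> str:
    """Render a skeleton Azure Pipelines YAML document with one pool per stage."""
    lines = ["stages:"]
    for stage, pool in stage_pools.items():
        lines += [
            f"- stage: {stage}",
            "  pool:",
            f"    name: {pool}",
            "  jobs:",
            f"  - job: {stage}Job",
            "    steps:",
            f"    - script: echo Running {stage}",  # placeholder step
        ]
    return "\n".join(lines)


print(render_pipeline(STAGE_POOLS))
```

Keeping the mapping in a single dictionary makes the policy easy to audit: anything that touches production points at the on-demand pool, and everything else defaults to spot.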
Combining with zero-standby pools
The real cost floor comes from combining two techniques. Zero-standby pools ensure you pay nothing when nobody is building. Spot instances ensure you pay 60-90% less when someone is. Together, they represent the absolute minimum cost for self-hosted CI/CD agents.
Take the earlier example of 440 builds per month at 30 minutes each:
| Configuration | Monthly Cost |
|---|---|
| Traditional always-on agents (on-demand) | ~£1,980 |
| Zero-standby pools (on-demand) | ~£660 |
| Zero-standby pools + spot | ~£130-265 |
From nearly two thousand a month to under three hundred. That's roughly an 87-93% reduction in agent compute costs, with no change to your pipeline logic, no change to your build scripts, and no impact on build output quality.
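If you want to plug in your own monthly figures, the percentage reduction is a one-line calculation. This sketch takes the approximate costs from the table above as given and compares each configuration with the always-on baseline. The names are illustrative.

```python
BASELINE = 1980  # always-on, on-demand agents (GBP per month, from the table)

# (low, high) monthly cost in GBP for each configuration, from the table.
CONFIGURATIONS = {
    "Zero-standby (on-demand)": (660, 660),
    "Zero-standby + spot": (130, 265),
}


def reduction_percent(baseline: float, cost: float) -> float:
    """Percentage saved relative to the baseline cost."""
    return 100 * (baseline - cost) / baseline


for name, (low, high) in CONFIGURATIONS.items():
    smallest = reduction_percent(BASELINE, high)  # highest cost, smallest saving
    largest = reduction_percent(BASELINE, low)    # lowest cost, largest saving
    if low == high:
        print(f"{name}: about {largest:.0f}% lower than always-on")
    else:
        print(f"{name}: about {smallest:.0f}-{largest:.0f}% lower than always-on")
```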
The only trade-off is a cold-start delay (2-5 minutes when no agents are warm) plus an occasional eviction retry. For the vast majority of teams, that's a trade-off worth making.
When to avoid spot
Production release pipelines where you need guaranteed completion. Very long-running builds (2+ hours) where cumulative eviction probability climbs. Time-critical hotfix deployments where every minute counts.
For everything else, spot is worth a look.
Spending more than you should on DevOps agent compute? Get in touch. We help teams slash CI/CD costs with spot instances, zero-standby pools, and right-sized agent infrastructure.