
Synapse Spark Write Patterns: How Default Parallelism Inflates Your Storage Bill


Synapse Spark, by default, generates far more concurrent storage operations than most workloads need. The bill creeps up, the natural response is a Premium tier upgrade, and most of the time that upgrade isn't actually needed.

The pattern

We see this regularly in environments running Synapse Spark against ADLS Gen2. The pipeline looks simple: read from one or more sources, transform, merge into a consolidated dataset. The merge step is where the bill quietly inflates.

By default, Spark distributes the merge across 200 shuffle partitions (the default value of spark.sql.shuffle.partitions). Each partition is written by its own task, so a single merge can fire up to 200 write operations at a single storage endpoint in a tight burst, as many at once as the pool has executor cores. During these bursts, server latency spikes from single-digit milliseconds to multiple seconds per operation. The storage throttles, operations queue, and what should take two minutes takes nine.

This isn't a storage tier problem. Standard ADLS Gen2 handles sustained throughput fine. What it can't absorb gracefully is hundreds of small write operations arriving in a tight burst.

Where it inflates the bill

The obvious cost is time. Spark pools bill for every node for every minute the pool is running, so a merge that takes 4-5x longer than it should costs 4-5x as much in compute.
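As a rough illustration of that scaling, the sketch below prices a merge from its duration. The node count and the per-node rate are placeholder assumptions, not Synapse list prices; only the two-minute and nine-minute durations come from the example above.

```python
def merge_cost(minutes: float, nodes: int, rate_per_node_minute: float) -> float:
    """Compute cost of one merge run on a pool billed per node per minute."""
    return minutes * nodes * rate_per_node_minute


# Hypothetical pool: 5 nodes at 0.01 currency units per node-minute.
NODES = 5
RATE = 0.01

expected = merge_cost(2, NODES, RATE)  # merge without throttled writes
observed = merge_cost(9, NODES, RATE)  # merge with throttled writes

print(f"expected: {expected:.2f}, observed: {observed:.2f}")
print(f"cost multiplier: {observed / expected:.1f}x")
```

The multiplier depends only on the ratio of the two durations, so it holds whatever the real pool size or rate turns out to be.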

The expensive cost is the misdiagnosis. When merges slow down, the natural response is to upgrade storage to Premium. For four ADLS Gen2 accounts at 1.3 TB, that upgrade runs around £780/month, or roughly £9,360/year. If the bottleneck is write parallelism rather than storage throughput, that money buys nothing.

What we found

In one environment, tuning Spark's write parallelism cut merge time from nine minutes to under four. The storage tier stayed on Standard. The £9,360/year Premium upgrade that had been on the proposal got pulled.
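The exact settings changed in that environment are not listed here, so the following is only a sketch of one common way to reduce write parallelism: derive the number of output partitions from the data volume instead of accepting the 200-partition default. The helper name `target_write_partitions`, the 128 MiB target file size and the cap of 200 are illustrative assumptions, and the commented lines show where the result might be applied in a PySpark job.

```python
import math


def target_write_partitions(dataset_bytes: int,
                            target_file_bytes: int = 128 * 1024 * 1024,
                            max_partitions: int = 200) -> int:
    """Pick a partition count so each output file is roughly target_file_bytes.

    Both defaults are assumptions for illustration; tune them per workload.
    """
    if dataset_bytes <= 0:
        return 1
    wanted = math.ceil(dataset_bytes / target_file_bytes)
    # Never go below one partition or above the configured cap.
    return max(1, min(wanted, max_partitions))


# Example: size the write for a 2 GiB merge output.
n = target_write_partitions(2 * 1024**3)
print(n)

# Applying it needs a live Spark session, so it is shown as comments.
# `spark`, `merged_df` and `output_path` are hypothetical names:
#   spark.conf.set("spark.sql.shuffle.partitions", str(n))
#   merged_df.coalesce(n).write.mode("overwrite").parquet(output_path)
```

Whether the shuffle setting or an explicit `coalesce` is the right lever depends on how the merge is written, so treat this as a starting point to measure against rather than a fixed recipe.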

The remaining time, just under four minutes, was down to network latency between the Spark cluster and the storage endpoint. That is an architectural issue (cluster and storage in different regions, or traffic routed through network appliances) rather than a Spark configuration issue.

The principle

Storage tier upgrades are easy to authorise and hard to reverse. A short review of Spark partition settings and storage server-latency metrics is usually enough to tell you whether the bottleneck is the storage or the workload writing to it. Measure first, spend second.
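One way to make that review concrete is sketched below. It assumes you have exported paired samples of the storage account's `SuccessServerLatency` and `SuccessE2ELatency` metrics (both in milliseconds) from Azure Monitor. The function name, the thresholds and the sample values are illustrative assumptions, not measured data.

```python
from statistics import median


def diagnose(server_ms: list[float], e2e_ms: list[float],
             burst_factor: float = 20.0, network_share: float = 0.5) -> str:
    """Rough triage of slow writes from paired latency samples.

    server_ms: time the storage service spent processing requests.
    e2e_ms:    end-to-end time for the same intervals, including network.
    The thresholds are assumed starting points, not recommendations.
    """
    if not server_ms or len(server_ms) != len(e2e_ms):
        raise ValueError("need equal-length, non-empty sample lists")

    baseline = median(server_ms)
    peak = max(server_ms)
    # Latency spent outside the storage service: network hops, appliances, client.
    outside = median([e - s for s, e in zip(server_ms, e2e_ms)])

    if baseline > 0 and peak / baseline >= burst_factor:
        return "write burst: server latency spikes far above its baseline"
    if outside >= network_share * median(e2e_ms):
        return "network path: most latency is outside the storage service"
    return "no clear signal: gather more samples"


# Made-up samples: quiet baseline with two multi-second spikes during a merge.
burst = diagnose([5, 6, 4, 5, 2400, 3100, 6, 5],
                 [9, 10, 8, 9, 2410, 3115, 10, 9])
# Made-up samples: storage is fast, but most of the time is spent in transit.
network = diagnose([5, 6, 5, 4], [40, 45, 38, 42])
print(burst)
print(network)
```

Neither result on its own justifies or rules out a tier change; the sketch only separates "the storage is slow under bursts" from "the path to the storage is slow", which is the distinction the review needs to establish.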

If you do have a workload that legitimately needs high concurrent write parallelism, Premium tier is the right answer. The point is knowing that's actually true before signing off.

