
SQL Server 2022 Upgrade Gone Wrong: How Tempdb on the Wrong Disk Caused 4-Second Write Latency

SQL Server · Azure VMs · Performance · Troubleshooting · IaaS

A routine SQL Server 2022 upgrade on an Azure VM turned into nearly a week of degraded performance and finger-pointing at the new version. The actual problem had nothing to do with SQL Server 2022. It was a disk that was never fit for purpose.

The Setup

SQL Server on Azure IaaS, running on a general-purpose VM. Had been on SQL Server 2019 for a couple of years without major complaints. The upgrade to 2022 went smoothly — installer ran, databases came online, compatibility levels updated. Textbook.

For the first few days, everything seemed fine. Then the trouble started.

When Jobs Started Hanging

Four to five days post-upgrade, database jobs began taking significantly longer. Queries that normally completed in seconds were hanging. Agent jobs that ran overnight were still running come morning. Users reported sluggish performance across the board.

The immediate assumption: SQL Server 2022 had introduced a regression. New version, new problems. But obvious conclusions are often wrong.

Chasing the Wrong Signals

Wait statistics showed SOS_WORK_DISPATCHER sitting prominently near the top. If you've been around SQL Server long enough, you know this one sends you down a rabbit hole. It looks alarming but it's almost entirely benign — an internal scheduling wait that accumulates on idle schedulers.
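Filtering the benign waits out up front avoids that rabbit hole. A minimal sketch against `sys.dm_os_wait_stats` — the exclusion list here is illustrative, not exhaustive, and you'd extend it for your environment:

```sql
-- Top waits since instance startup, excluding common benign/idle waits
SELECT TOP (10)
    wait_type,
    wait_time_ms / 1000.0        AS wait_time_s,
    signal_wait_time_ms / 1000.0 AS signal_wait_s,
    waiting_tasks_count
FROM sys.dm_os_wait_stats
WHERE wait_type NOT IN (
    N'SOS_WORK_DISPATCHER', N'SLEEP_TASK', N'LAZYWRITER_SLEEP',
    N'BROKER_TO_FLUSH', N'XE_TIMER_EVENT', N'CHECKPOINT_QUEUE',
    N'REQUEST_FOR_DEADLOCK_SEARCH', N'HADR_FILESTREAM_IOMGR_IOCOMPLETION'
)
ORDER BY wait_time_ms DESC;
```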

What we expected to see was PAGEIOLATCH waits — the classic I/O bottleneck indicators. They were present but weren't dominating, because the bottleneck was concentrated on one specific set of files rather than spread across the instance.

The real clue came from the error logs: "I/O requests taking longer than 15 seconds to complete." These correlated with backup operations and heavy tempdb usage. When backups kicked off, I/O contention spiked and everything ground to a halt.
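Those warnings are easy to pull straight from the error log. A sketch using `xp_readerrorlog` (undocumented but widely used; the first parameter is the log number, 0 = current, and the second is the log type, 1 = SQL Server error log):

```sql
-- Search the current error log for the slow-I/O warning
EXEC sys.xp_readerrorlog 0, 1, N'I/O requests taking longer than', NULL;
```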

Finding the Real Problem

Per-file I/O latency told the story immediately. User databases were showing read and write latency under 100 milliseconds — not brilliant, but acceptable. Tempdb was showing write latency of over 4,000 milliseconds. Four full seconds per write operation.
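Per-file latency comes from `sys.dm_io_virtual_file_stats`. Something like the following — note the numbers are cumulative since instance startup, so a recently restarted instance will understate a long-running problem:

```sql
-- Average read/write latency per database file since instance startup
SELECT
    DB_NAME(vfs.database_id) AS database_name,
    mf.physical_name,
    1.0 * vfs.io_stall_read_ms  / NULLIF(vfs.num_of_reads, 0)  AS avg_read_latency_ms,
    1.0 * vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS avg_write_latency_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN sys.master_files AS mf
  ON vfs.database_id = mf.database_id
 AND vfs.file_id = mf.file_id
ORDER BY avg_write_latency_ms DESC;
```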

For a database engine that relies on tempdb for sort spills, hash joins, version store, temporary tables, and a dozen other internal operations, that's catastrophic.

The cause was obvious once we checked the disk layout. Tempdb files were sitting on the same Standard SSD E30 as the user data files. That's a baseline of 500 IOPS shared between every database on the disk. When backups hit the data files and tempdb activity spiked simultaneously, the disk was completely saturated. Tempdb writes queued.
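Checking the layout is one query against `sys.master_files`; a sketch:

```sql
-- Which drive each database file lives on
SELECT
    UPPER(LEFT(physical_name, 1)) AS drive,
    DB_NAME(database_id)          AS database_name,
    type_desc,
    physical_name
FROM sys.master_files
ORDER BY drive, database_name;
```

If tempdb and user data files come back on the same drive letter, you've found a shared disk.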

Why It Worked Before (Sort Of)

If the disk was always this slow, why did the problem only surface after the 2022 upgrade?

It was probably always a bottleneck — just not a visible one. SQL Server 2022 makes heavier use of tempdb in certain scenarios, particularly around query processing improvements and system page concurrency. The upgrade didn't create the problem. It amplified a pre-existing weakness past the point of tolerance.

This is a pattern we see repeatedly with SQL Server upgrades on Azure IaaS. The upgrade itself is fine. The underlying infrastructure that was "good enough" turns out to be inadequate for the new version. Nobody re-validates the storage tier after an upgrade. They assume if it worked before, it'll work now.

The Politics of Tempdb Placement

Microsoft's best practice for SQL Server on Azure VMs is clear: put tempdb on the local temporary disk (D: drive). It's fast, it's included, and it's ephemeral — perfect for tempdb since it gets rebuilt every time SQL Server starts.

We raised this. Management pushed back.

The concern was valid on the surface: if the VM gets deallocated, the temporary disk is wiped. SQL Server expects tempdb files where it left them. If those files are on an erased disk, SQL Server fails to start. This had happened to the team before, causing an outage requiring manual intervention.

The answer wasn't to avoid the local disk — it was to mitigate the deallocation risk. Microsoft's own guidance covers this: a startup script that recreates the tempdb folder structure on the D: drive before the SQL Server service starts. If the VM gets deallocated and the local disk is wiped, the script rebuilds the directory, SQL Server creates fresh tempdb files on startup, and the service comes up cleanly. No manual intervention.

Armed with Microsoft's documentation and the mitigation script, management agreed. The local temporary disk became the primary recommendation.

For environments where the local disk genuinely isn't suitable — perhaps it's too small, or the VM SKU doesn't have one — a dedicated Premium SSD is the fallback.

The Fix

Tempdb moved to the local temporary disk. For comparison, here's what the numbers looked like against the original shared disk — and what a dedicated Premium SSD would deliver as an alternative:

| Metric | Standard SSD E30 (before) | Premium SSD P20 (alternative) |
| --- | --- | --- |
| Baseline IOPS | 500 | 2,300 |
| Max burst IOPS | 1,500 | 3,500 |
| Max throughput | 60 MB/s | 150 MB/s |
| Shared with data files | Yes | No (dedicated) |

The key insight with tempdb file moves: ALTER DATABASE for tempdb only updates system metadata — it doesn't physically move anything. SQL Server rebuilds tempdb from scratch on every service restart. So the metadata change runs in milliseconds during business hours with zero impact. The actual move happens on the next planned restart.
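The change itself is a pair of `ALTER DATABASE ... MODIFY FILE` statements. A sketch — the logical file names and target path are illustrative assumptions (most instances have several tempdb data files; list yours with `SELECT name, physical_name FROM sys.master_files WHERE database_id = 2;`):

```sql
-- Updates system metadata only; files are created at the new
-- location when the service next restarts.
ALTER DATABASE tempdb
MODIFY FILE (NAME = tempdev, FILENAME = 'D:\tempdb\tempdb.mdf');

ALTER DATABASE tempdb
MODIFY FILE (NAME = templog, FILENAME = 'D:\tempdb\templog.ldf');
```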

Total downtime for the restart: under two minutes.

The Result

The difference was immediate. Tempdb write latency dropped from 4,000+ ms to under 5 ms. The 15-second I/O warnings vanished from the error log. Database jobs returned to normal execution times. Backups no longer caused cascading slowdowns.

The Takeaway

After any SQL Server upgrade on Azure IaaS, validate your storage configuration. Don't just check that databases came online — check that disk performance is adequate for the new version's workload patterns. Anything over 20ms for tempdb writes should raise a flag.

The upgrade was never the problem. The disk was always the bottleneck. SQL Server 2022 just stopped being polite about it.

If your tempdb is sharing a disk with anything else on Azure, fix it before you find out the hard way. A dedicated Premium SSD costs a fraction of the downtime and troubleshooting hours you'll spend when it inevitably becomes a problem.


Running SQL Server on Azure IaaS and not sure your storage is up to scratch? Get in touch and we'll review your configuration before it becomes an incident.
