From: Sasha Levin <sashal@kernel.org>
To: patches@lists.linux.dev, stable@vger.kernel.org
Cc: Jan Kiszka <jan.kiszka@siemens.com>,
Florian Bezdeka <florian.bezdeka@siemens.com>,
Michael Kelley <mhklinux@outlook.com>,
"Martin K. Petersen" <martin.petersen@oracle.com>,
Sasha Levin <sashal@kernel.org>,
kys@microsoft.com, haiyangz@microsoft.com, wei.liu@kernel.org,
decui@microsoft.com, longli@microsoft.com,
James.Bottomley@HansenPartnership.com,
linux-hyperv@vger.kernel.org, linux-scsi@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: [PATCH AUTOSEL 6.19-5.10] scsi: storvsc: Fix scheduling while atomic on PREEMPT_RT
Date: Thu, 5 Mar 2026 10:36:53 -0500 [thread overview]
Message-ID: <20260305153704.106918-10-sashal@kernel.org> (raw)
In-Reply-To: <20260305153704.106918-1-sashal@kernel.org>
From: Jan Kiszka <jan.kiszka@siemens.com>
[ Upstream commit 57297736c08233987e5d29ce6584c6ca2a831b12 ]
This resolves the follow splat and lock-up when running with PREEMPT_RT
enabled on Hyper-V:
[ 415.140818] BUG: scheduling while atomic: stress-ng-iomix/1048/0x00000002
[ 415.140822] INFO: lockdep is turned off.
[ 415.140823] Modules linked in: intel_rapl_msr intel_rapl_common intel_uncore_frequency_common intel_pmc_core pmt_telemetry pmt_discovery pmt_class intel_pmc_ssram_telemetry intel_vsec ghash_clmulni_intel aesni_intel rapl binfmt_misc nls_ascii nls_cp437 vfat fat snd_pcm hyperv_drm snd_timer drm_client_lib drm_shmem_helper snd sg soundcore drm_kms_helper pcspkr hv_balloon hv_utils evdev joydev drm configfs efi_pstore nfnetlink vsock_loopback vmw_vsock_virtio_transport_common hv_sock vmw_vsock_vmci_transport vsock vmw_vmci efivarfs autofs4 ext4 crc16 mbcache jbd2 sr_mod sd_mod cdrom hv_storvsc serio_raw hid_generic scsi_transport_fc hid_hyperv scsi_mod hid hv_netvsc hyperv_keyboard scsi_common
[ 415.140846] Preemption disabled at:
[ 415.140847] [<ffffffffc0656171>] storvsc_queuecommand+0x2e1/0xbe0 [hv_storvsc]
[ 415.140854] CPU: 8 UID: 0 PID: 1048 Comm: stress-ng-iomix Not tainted 6.19.0-rc7 #30 PREEMPT_{RT,(full)}
[ 415.140856] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 09/04/2024
[ 415.140857] Call Trace:
[ 415.140861] <TASK>
[ 415.140861] ? storvsc_queuecommand+0x2e1/0xbe0 [hv_storvsc]
[ 415.140863] dump_stack_lvl+0x91/0xb0
[ 415.140870] __schedule_bug+0x9c/0xc0
[ 415.140875] __schedule+0xdf6/0x1300
[ 415.140877] ? rtlock_slowlock_locked+0x56c/0x1980
[ 415.140879] ? rcu_is_watching+0x12/0x60
[ 415.140883] schedule_rtlock+0x21/0x40
[ 415.140885] rtlock_slowlock_locked+0x502/0x1980
[ 415.140891] rt_spin_lock+0x89/0x1e0
[ 415.140893] hv_ringbuffer_write+0x87/0x2a0
[ 415.140899] vmbus_sendpacket_mpb_desc+0xb6/0xe0
[ 415.140900] ? rcu_is_watching+0x12/0x60
[ 415.140902] storvsc_queuecommand+0x669/0xbe0 [hv_storvsc]
[ 415.140904] ? HARDIRQ_verbose+0x10/0x10
[ 415.140908] ? __rq_qos_issue+0x28/0x40
[ 415.140911] scsi_queue_rq+0x760/0xd80 [scsi_mod]
[ 415.140926] __blk_mq_issue_directly+0x4a/0xc0
[ 415.140928] blk_mq_issue_direct+0x87/0x2b0
[ 415.140931] blk_mq_dispatch_queue_requests+0x120/0x440
[ 415.140933] blk_mq_flush_plug_list+0x7a/0x1a0
[ 415.140935] __blk_flush_plug+0xf4/0x150
[ 415.140940] __submit_bio+0x2b2/0x5c0
[ 415.140944] ? submit_bio_noacct_nocheck+0x272/0x360
[ 415.140946] submit_bio_noacct_nocheck+0x272/0x360
[ 415.140951] ext4_read_bh_lock+0x3e/0x60 [ext4]
[ 415.140995] ext4_block_write_begin+0x396/0x650 [ext4]
[ 415.141018] ? __pfx_ext4_da_get_block_prep+0x10/0x10 [ext4]
[ 415.141038] ext4_da_write_begin+0x1c4/0x350 [ext4]
[ 415.141060] generic_perform_write+0x14e/0x2c0
[ 415.141065] ext4_buffered_write_iter+0x6b/0x120 [ext4]
[ 415.141083] vfs_write+0x2ca/0x570
[ 415.141087] ksys_write+0x76/0xf0
[ 415.141089] do_syscall_64+0x99/0x1490
[ 415.141093] ? rcu_is_watching+0x12/0x60
[ 415.141095] ? finish_task_switch.isra.0+0xdf/0x3d0
[ 415.141097] ? rcu_is_watching+0x12/0x60
[ 415.141098] ? lock_release+0x1f0/0x2a0
[ 415.141100] ? rcu_is_watching+0x12/0x60
[ 415.141101] ? finish_task_switch.isra.0+0xe4/0x3d0
[ 415.141103] ? rcu_is_watching+0x12/0x60
[ 415.141104] ? __schedule+0xb34/0x1300
[ 415.141106] ? hrtimer_try_to_cancel+0x1d/0x170
[ 415.141109] ? do_nanosleep+0x8b/0x160
[ 415.141111] ? hrtimer_nanosleep+0x89/0x100
[ 415.141114] ? __pfx_hrtimer_wakeup+0x10/0x10
[ 415.141116] ? xfd_validate_state+0x26/0x90
[ 415.141118] ? rcu_is_watching+0x12/0x60
[ 415.141120] ? do_syscall_64+0x1e0/0x1490
[ 415.141121] ? do_syscall_64+0x1e0/0x1490
[ 415.141123] ? rcu_is_watching+0x12/0x60
[ 415.141124] ? do_syscall_64+0x1e0/0x1490
[ 415.141125] ? do_syscall_64+0x1e0/0x1490
[ 415.141127] ? irqentry_exit+0x140/0x7e0
[ 415.141129] entry_SYSCALL_64_after_hwframe+0x76/0x7e
get_cpu() disables preemption while the spinlock hv_ringbuffer_write is
using is converted to an rt-mutex under PREEMPT_RT.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Tested-by: Florian Bezdeka <florian.bezdeka@siemens.com>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Link: https://patch.msgid.link/0c7fb5cd-fb21-4760-8593-e04bade84744@siemens.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
Now I have enough context to analyze this commit thoroughly.
## Analysis
### 1. Problem Description
The commit fixes a **"BUG: scheduling while atomic"** crash and
**lockup** on Hyper-V VMs running with `PREEMPT_RT` enabled. The stack
trace in the commit message clearly shows the issue:
- `storvsc_queuecommand()` calls `get_cpu()` which disables preemption
- It then calls `storvsc_do_io()` → `vmbus_sendpacket_mpb_desc()` →
`hv_ringbuffer_write()`
- `hv_ringbuffer_write()` takes a spinlock that, under PREEMPT_RT, is
converted to an rt-mutex
- rt-mutexes can sleep/schedule, but preemption is disabled →
**scheduling while atomic BUG**
### 2. The Fix
The fix replaces:
```c
ret = storvsc_do_io(dev, cmd_request, get_cpu());
put_cpu();
```
with:
```c
migrate_disable();
ret = storvsc_do_io(dev, cmd_request, smp_processor_id());
migrate_enable();
```
The purpose of `get_cpu()` here was to get a stable CPU number to use as
a channel index in `storvsc_do_io()`. The actual requirement is just to
prevent migration (so the CPU number stays valid), not to disable
preemption entirely. `migrate_disable()` achieves this while allowing
scheduling under PREEMPT_RT.
### 3. Correctness
- `migrate_disable()` prevents the task from being migrated to another
CPU, so `smp_processor_id()` remains valid throughout the call
- On non-PREEMPT_RT kernels, this is functionally equivalent
(migrate_disable maps to preempt_disable)
- On PREEMPT_RT, it allows the rt-mutex in `hv_ringbuffer_write()` to
sleep as needed
### 4. Scope and Risk
- **3 lines changed** - extremely small and surgical
- Only affects `storvsc_queuecommand()` in the Hyper-V storage driver
- Well-understood transformation pattern
(`get_cpu()`→`migrate_disable()`+`smp_processor_id()`) used
extensively across the kernel for PREEMPT_RT fixes
- Has been **Tested-by** two people and **Reviewed-by** the Hyper-V
subsystem expert (Michael Kelley)
### 5. User Impact
- **Hyper-V VMs with PREEMPT_RT**: This is a hard crash/lockup during
normal I/O operations (ext4 writes), making the system completely
unusable
- PREEMPT_RT is increasingly used in enterprise and embedded
deployments, including on Hyper-V/Azure
- Without this fix, PREEMPT_RT kernels on Hyper-V are effectively broken
for any storage I/O
### 6. Stable Criteria Assessment
- **Obviously correct**: Yes - standard PREEMPT_RT fix pattern
- **Fixes a real bug**: Yes - kernel BUG + lockup
- **Important issue**: Yes - system lockup during normal I/O
- **Small and contained**: Yes - 3-line change in one file
- **No new features**: Correct - pure bug fix
- **Tested**: Yes - two Tested-by tags
### Verification
- Read `storvsc_drv.c:1858` confirming the old code uses
`get_cpu()`/`put_cpu()` (matches the diff)
- Read `storvsc_do_io()` at line 1453: confirms `q_num` is used as a CPU
index into `stor_chns[]` array and passed to
`cpumask_of_node(cpu_to_node(q_num))` - only needs migration
stability, not preemption disabled
- Confirmed the call chain: `storvsc_do_io()` →
`vmbus_sendpacket_mpb_desc()` (line 1547) → `hv_ringbuffer_write()`
which takes a spinlock (as shown in the stack trace)
- The stack trace shows `rt_spin_lock` → `rtlock_slowlock_locked` →
`schedule_rtlock` confirming the spinlock-to-rt-mutex conversion is
the trigger
- `git log` confirmed this is in a well-maintained driver with recent
activity
**YES**
drivers/scsi/storvsc_drv.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
index b43d876747b76..68c837146b9ea 100644
--- a/drivers/scsi/storvsc_drv.c
+++ b/drivers/scsi/storvsc_drv.c
@@ -1855,8 +1855,9 @@ static int storvsc_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *scmnd)
cmd_request->payload_sz = payload_sz;
/* Invokes the vsc to start an IO */
- ret = storvsc_do_io(dev, cmd_request, get_cpu());
- put_cpu();
+ migrate_disable();
+ ret = storvsc_do_io(dev, cmd_request, smp_processor_id());
+ migrate_enable();
if (ret)
scsi_dma_unmap(scmnd);
--
2.51.0
next prev parent reply other threads:[~2026-03-05 15:37 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-05 15:36 [PATCH AUTOSEL 6.19-6.18] scsi: ufs: core: Reset urgent_bkops_lvl to allow runtime PM power mode Sasha Levin
2026-03-05 15:36 ` [PATCH AUTOSEL 6.19-5.10] unshare: fix unshare_fs() handling Sasha Levin
2026-03-05 15:36 ` [PATCH AUTOSEL 6.19-6.12] drm/amdgpu/vcn5: Add SMU dpm interface type Sasha Levin
2026-03-05 15:36 ` [PATCH AUTOSEL 6.19-6.1] wifi: mac80211: set default WMM parameters on all links Sasha Levin
2026-03-05 15:36 ` [PATCH AUTOSEL 6.19-5.15] ALSA: usb-audio: Check max frame size for implicit feedback mode, too Sasha Levin
2026-03-05 15:36 ` [PATCH AUTOSEL 6.19-5.10] scsi: ses: Fix devices attaching to different hosts Sasha Levin
2026-03-05 15:36 ` [PATCH AUTOSEL 6.19-6.6] ASoC: cs42l43: Report insert for exotic peripherals Sasha Levin
2026-03-05 15:36 ` [PATCH AUTOSEL 6.19-5.15] ALSA: usb-audio: Avoid implicit feedback mode on DIYINHK USB Audio 2.0 Sasha Levin
2026-03-05 15:36 ` [PATCH AUTOSEL 6.19-5.10] ACPI: PM: Save NVS memory on Lenovo G70-35 Sasha Levin
2026-03-05 15:36 ` Sasha Levin [this message]
2026-03-05 15:36 ` [PATCH AUTOSEL 6.19-6.1] ASoC: amd: yc: Add ASUS EXPERTBOOK BM1503CDA to quirk table Sasha Levin
2026-03-05 15:36 ` [PATCH AUTOSEL 6.19-5.10] ACPI: OSI: Add DMI quirk for Acer Aspire One D255 Sasha Levin
2026-03-05 15:36 ` [PATCH AUTOSEL 6.19-6.18] fs: init flags_valid before calling vfs_fileattr_get Sasha Levin
2026-03-05 15:36 ` [PATCH AUTOSEL 6.19-6.6] scsi: ufs: core: Fix possible NULL pointer dereference in ufshcd_add_command_trace() Sasha Levin
2026-03-05 15:36 ` [PATCH AUTOSEL 6.19-6.18] perf/core: Fix refcount bug and potential UAF in perf_mmap Sasha Levin
2026-03-05 15:36 ` [PATCH AUTOSEL 6.19-6.6] scsi: ufs: core: Fix shift out of bounds when MAXQ=32 Sasha Levin
2026-03-05 15:37 ` [PATCH AUTOSEL 6.19-5.15] scsi: mpi3mr: Add NULL checks when resetting request and reply queues Sasha Levin
2026-03-05 15:37 ` [PATCH AUTOSEL 6.19-6.12] ALSA: hda/realtek: Fix speaker pop on Star Labs StarFighter Sasha Levin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260305153704.106918-10-sashal@kernel.org \
--to=sashal@kernel.org \
--cc=James.Bottomley@HansenPartnership.com \
--cc=decui@microsoft.com \
--cc=florian.bezdeka@siemens.com \
--cc=haiyangz@microsoft.com \
--cc=jan.kiszka@siemens.com \
--cc=kys@microsoft.com \
--cc=linux-hyperv@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-scsi@vger.kernel.org \
--cc=longli@microsoft.com \
--cc=martin.petersen@oracle.com \
--cc=mhklinux@outlook.com \
--cc=patches@lists.linux.dev \
--cc=stable@vger.kernel.org \
--cc=wei.liu@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox