* [PATCH] virtio_pci_modern: Use GFP_ATOMIC with spin_lock_irqsave held in virtqueue_exec_admin_cmd()
@ 2026-04-13 7:22 Jinhui Guo
From: Jinhui Guo @ 2026-04-13 7:22 UTC (permalink / raw)
To: Michael S. Tsirkin, Jason Wang, Xuan Zhuo, Eugenio Pérez,
Jiri Pirko
Cc: virtualization, linux-kernel, Jinhui Guo, stable
virtqueue_exec_admin_cmd() holds admin_vq->lock with spin_lock_irqsave(),
which disables interrupts. Using GFP_KERNEL inside this critical section
is unsafe because kmalloc() may sleep, leading to potential deadlocks or
scheduling violations.

Switch to GFP_ATOMIC to ensure the allocation is non-blocking.

Fixes: 4c3b54af907e ("virtio_pci_modern: use completion instead of busy loop to wait on admin cmd result")
Cc: stable@vger.kernel.org
Signed-off-by: Jinhui Guo <guojinhui.liam@bytedance.com>
---
drivers/virtio/virtio_pci_modern.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/virtio/virtio_pci_modern.c b/drivers/virtio/virtio_pci_modern.c
index 6d8ae2a6a8ca..db8e4f88b749 100644
--- a/drivers/virtio/virtio_pci_modern.c
+++ b/drivers/virtio/virtio_pci_modern.c
@@ -101,7 +101,7 @@ static int virtqueue_exec_admin_cmd(struct virtio_pci_admin_vq *admin_vq,
 		return -EIO;
 
 	spin_lock_irqsave(&admin_vq->lock, flags);
-	ret = virtqueue_add_sgs(vq, sgs, out_num, in_num, cmd, GFP_KERNEL);
+	ret = virtqueue_add_sgs(vq, sgs, out_num, in_num, cmd, GFP_ATOMIC);
 	if (ret < 0) {
 		if (ret == -ENOSPC) {
 			spin_unlock_irqrestore(&admin_vq->lock, flags);
--
2.20.1
* Re: [PATCH] virtio_pci_modern: Use GFP_ATOMIC with spin_lock_irqsave held in virtqueue_exec_admin_cmd()
From: Michael S. Tsirkin @ 2026-04-13  7:45 UTC (permalink / raw)
To: Jinhui Guo
Cc: Jason Wang, Xuan Zhuo, Eugenio Pérez, Jiri Pirko,
	virtualization, linux-kernel, stable

On Mon, Apr 13, 2026 at 03:22:49PM +0800, Jinhui Guo wrote:
> virtqueue_exec_admin_cmd() holds admin_vq->lock with spin_lock_irqsave(),
> which disables interrupts. Using GFP_KERNEL inside this critical section
> is unsafe because kmalloc() may sleep, leading to potential deadlocks or
> scheduling violations.
>
> Switch to GFP_ATOMIC to ensure the allocation is non-blocking.
>
> Fixes: 4c3b54af907e ("virtio_pci_modern: use completion instead of busy loop to wait on admin cmd result")
> Cc: stable@vger.kernel.org
> Signed-off-by: Jinhui Guo <guojinhui.liam@bytedance.com>
> ---
>  drivers/virtio/virtio_pci_modern.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/virtio/virtio_pci_modern.c b/drivers/virtio/virtio_pci_modern.c
> index 6d8ae2a6a8ca..db8e4f88b749 100644
> --- a/drivers/virtio/virtio_pci_modern.c
> +++ b/drivers/virtio/virtio_pci_modern.c
> @@ -101,7 +101,7 @@ static int virtqueue_exec_admin_cmd(struct virtio_pci_admin_vq *admin_vq,
>  		return -EIO;
>
>  	spin_lock_irqsave(&admin_vq->lock, flags);
> -	ret = virtqueue_add_sgs(vq, sgs, out_num, in_num, cmd, GFP_KERNEL);
> +	ret = virtqueue_add_sgs(vq, sgs, out_num, in_num, cmd, GFP_ATOMIC);
>  	if (ret < 0) {
>  		if (ret == -ENOSPC) {
>  			spin_unlock_irqrestore(&admin_vq->lock, flags);

GFP_ATOMIC allocations can and will fail. If using them, one must
retry, not just propagate failures.

Or just switch admin_vq->lock to a mutex?

> --
> 2.20.1
* Re: [PATCH] virtio_pci_modern: Use GFP_ATOMIC with spin_lock_irqsave held in virtqueue_exec_admin_cmd()
From: David Laight @ 2026-04-13  9:17 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Jinhui Guo, Jason Wang, Xuan Zhuo, Eugenio Pérez, Jiri Pirko,
	virtualization, linux-kernel, stable

On Mon, 13 Apr 2026 03:45:20 -0400
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Mon, Apr 13, 2026 at 03:22:49PM +0800, Jinhui Guo wrote:
> > virtqueue_exec_admin_cmd() holds admin_vq->lock with spin_lock_irqsave(),
> > which disables interrupts. Using GFP_KERNEL inside this critical section
> > is unsafe because kmalloc() may sleep, leading to potential deadlocks or
> > scheduling violations.
[...]
> >  	spin_lock_irqsave(&admin_vq->lock, flags);
> > -	ret = virtqueue_add_sgs(vq, sgs, out_num, in_num, cmd, GFP_KERNEL);
> > +	ret = virtqueue_add_sgs(vq, sgs, out_num, in_num, cmd, GFP_ATOMIC);
[...]
>
> GFP_ATOMIC allocations can and will fail. If using them, one must
> retry, not just propagate failures.
>
> Or just switch admin_vq->lock to a mutex?

Or do the allocation before acquiring the lock (and free it in the
error path if it goes unused).

	David
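David's allocate-before-lock pattern is easiest to see in a userspace analogue. The sketch below uses a pthread mutex in place of the spinlock; `exec_cmd()`, `submit_locked()`, and the 64-byte buffer size are illustrative stand-ins, not kernel APIs:

```c
#include <pthread.h>
#include <stdlib.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

/* Stand-in for a queue-submit step that may need a scratch buffer but
 * must not allocate while the lock is held. */
static int submit_locked(void *scratch, int need_buf)
{
	return (need_buf && !scratch) ? -1 : 0;
}

/* Allocate (the only sleepable step) with no lock held; free on the
 * error path if the buffer went unused. */
int exec_cmd(int need_buf)
{
	void *buf = need_buf ? malloc(64) : NULL;
	int ret;

	if (need_buf && !buf)
		return -1;

	pthread_mutex_lock(&lock);
	ret = submit_locked(buf, need_buf);	/* no allocation under the lock */
	pthread_mutex_unlock(&lock);

	free(buf);	/* demo only: a real queue would free on completion */
	return ret;
}
```

The point of the pattern is that the sleepable allocation is finished before the lock is taken, so the critical section itself never blocks.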
* Re: [PATCH] virtio_pci_modern: Use GFP_ATOMIC with spin_lock_irqsave held in virtqueue_exec_admin_cmd()
From: Jinhui Guo @ 2026-04-13 12:22 UTC (permalink / raw)
To: David Laight
Cc: Michael S. Tsirkin, Eugenio Pérez, Jinhui Guo, Jason Wang,
	Jiri Pirko, Xuan Zhuo, linux-kernel, stable, virtualization

On Mon, Apr 13, 2026 at 10:17:59 +0100, David Laight wrote:
> Or do the allocate before acquiring the lock (and free it not used
> in the error path).

Hi David,

Thanks for the suggestion.

Pre-allocating the memory outside the lock is indeed a good practice,
but unfortunately it doesn't work in this specific virtqueue context.

The kmalloc() in question is not happening at the
virtqueue_exec_admin_cmd() level. Instead, it is deeply embedded inside
virtqueue_add_sgs() (specifically, in functions like
alloc_indirect_split() or virtqueue_add_indirect_packed()) to allocate
indirect descriptors when multiple SG elements are provided.

As a caller, we have no mechanism to pre-allocate this indirect
descriptor memory and pass it down to virtqueue_add_sgs(). Furthermore,
virtqueue_add_sgs() needs to atomically check the queue's num_free
status, allocate the indirect table if necessary, and update the queue
pointers. All these operations must be protected by admin_vq->lock to
prevent concurrent admin command submissions from corrupting the
virtqueue state.

Therefore, allocating before acquiring the lock isn't feasible here, and
replacing GFP_KERNEL with GFP_ATOMIC (with a proper sleepable retry upon
failure) seems to be the more viable fix.

Does this make sense?

Thanks,
Jinhui
* Re: [PATCH] virtio_pci_modern: Use GFP_ATOMIC with spin_lock_irqsave held in virtqueue_exec_admin_cmd()
From: David Laight @ 2026-04-13 13:33 UTC (permalink / raw)
To: Jinhui Guo
Cc: Michael S. Tsirkin, Eugenio Pérez, Jason Wang, Jiri Pirko,
	Xuan Zhuo, linux-kernel, stable, virtualization

On Mon, 13 Apr 2026 20:22:44 +0800
"Jinhui Guo" <guojinhui.liam@bytedance.com> wrote:

> On Mon, Apr 13, 2026 at 10:17:59 +0100, David Laight wrote:
> > Or do the allocate before acquiring the lock (and free it not used
> > in the error path).
>
> Hi David,
>
> Thanks for the suggestion.
>
> Pre-allocating the memory outside the lock is indeed a good practice,
> but unfortunately it doesn't work in this specific virtqueue context.
>
> The kmalloc() in question is not happening at the
> virtqueue_exec_admin_cmd() level. Instead, it is deeply embedded inside
> virtqueue_add_sgs() (specifically, in functions like
> alloc_indirect_split() or virtqueue_add_indirect_packed()) to allocate
> indirect descriptors when multiple SG elements are provided.
>
> As a caller, we have no mechanism to pre-allocate this indirect
> descriptor memory and pass it down to virtqueue_add_sgs(). Furthermore,
> virtqueue_add_sgs() needs to atomically check the queue's num_free
> status, allocate the indirect table if necessary, and update the queue
> pointers. All these operations must be protected by admin_vq->lock to
> prevent concurrent admin command submissions from corrupting the
> virtqueue state.

It just sounds non-trivial...

> Therefore, allocating before acquiring the lock isn't feasible here, and
> replacing GFP_KERNEL with GFP_ATOMIC (with a proper sleepable retry upon
> failure) seems to be the more viable fix.

The sleep-retry isn't really ideal - and may not make progress.

An 'interesting' solution would be to return the size of the kmalloc()
that failed, then kmalloc() and kfree() a buffer of that size and hope
it is still available for the retry. From a quick read of the code, the
size is always a constant multiplied by the number of fragments.

Although I only found kmalloc() in the 'indirect' paths.
I didn't spot what happens if the ring itself is full.

	David

> Does this make sense?
>
> Thanks,
> Jinhui
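The scheme David outlines — fail under the lock, learn the size that was needed, allocate and free a buffer of that size outside the lock, then retry — might look like the following userspace sketch. All names are hypothetical; in the real code the size would be the per-descriptor constant times the number of SG fragments:

```c
#include <stdlib.h>

/* Fake inner call standing in for virtqueue_add_sgs(): fails while
 * failures_left > 0, reporting how many bytes it wanted. */
static int fake_add_sgs(size_t *needed, int *failures_left)
{
	if (*failures_left > 0) {
		(*failures_left)--;
		*needed = 16 * 32;	/* descriptor size * fragment count */
		return -1;		/* stand-in for an atomic-allocation failure */
	}
	return 0;
}

/* Retry loop: on failure, allocate and free a buffer of the reported
 * size outside the (elided) lock, hoping it is still available on the
 * retry. Returns the number of attempts made, or -1 on OOM. */
int submit_with_probe(int failures)
{
	size_t needed = 0;
	int attempts = 0;

	for (;;) {
		attempts++;
		/* lock would be taken here */
		if (fake_add_sgs(&needed, &failures) == 0)
			return attempts;
		/* lock would be dropped here */
		void *probe = malloc(needed);	/* sleepable in the kernel case */
		if (!probe)
			return -1;
		free(probe);
	}
}
```

As David notes, whether the freed buffer is still available on the retry is only a hope — another CPU can win the race for it, so the loop is not guaranteed to make progress either.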
* Re: [PATCH] virtio_pci_modern: Use GFP_ATOMIC with spin_lock_irqsave held in virtqueue_exec_admin_cmd()
From: Eugenio Perez Martin @ 2026-04-13 14:14 UTC (permalink / raw)
To: Jinhui Guo
Cc: David Laight, Michael S. Tsirkin, Jason Wang, Jiri Pirko,
	Xuan Zhuo, linux-kernel, stable, virtualization

On Mon, Apr 13, 2026 at 2:23 PM Jinhui Guo <guojinhui.liam@bytedance.com> wrote:
>
> On Mon, Apr 13, 2026 at 10:17:59 +0100, David Laight wrote:
> > Or do the allocate before acquiring the lock (and free it not used
> > in the error path).
>
> Hi David,
>
> Thanks for the suggestion.
>
> Pre-allocating the memory outside the lock is indeed a good practice,
> but unfortunately it doesn't work in this specific virtqueue context.
>
> The kmalloc() in question is not happening at the
> virtqueue_exec_admin_cmd() level. Instead, it is deeply embedded inside
> virtqueue_add_sgs() (specifically, in functions like
> alloc_indirect_split() or virtqueue_add_indirect_packed()) to allocate
> indirect descriptors when multiple SG elements are provided.
>
> As a caller, we have no mechanism to pre-allocate this indirect
> descriptor memory and pass it down to virtqueue_add_sgs(). Furthermore,
> virtqueue_add_sgs() needs to atomically check the queue's num_free
> status, allocate the indirect table if necessary, and update the queue
> pointers. All these operations must be protected by admin_vq->lock to
> prevent concurrent admin command submissions from corrupting the
> virtqueue state.
>

Sounds like a big chunk of that is achieved with virtqueue_map_* and
virtqueue_add_{in,out}buf_premapped functions, isn't it? Or am I
missing something?

> Therefore, allocating before acquiring the lock isn't feasible here, and
> replacing GFP_KERNEL with GFP_ATOMIC (with a proper sleepable retry upon
> failure) seems to be the more viable fix.
>
> Does this make sense?
>
> Thanks,
> Jinhui
* Re: [PATCH] virtio_pci_modern: Use GFP_ATOMIC with spin_lock_irqsave held in virtqueue_exec_admin_cmd()
From: Jinhui Guo @ 2026-04-13 10:00 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Eugenio Pérez, Jinhui Guo, Jason Wang, Jiri Pirko, Xuan Zhuo,
	linux-kernel, stable, virtualization

On Mon, Apr 13, 2026 at 03:45:20 -0400, "Michael S. Tsirkin" wrote:
> GFP_ATOMIC allocations can and will fail. If using them, one must
> retry, not just propagate failures.
>
> Or just switch admin_vq->lock to a mutex?

Hi Michael,

Thank you for the review.

Regarding the suggestion to switch admin_vq->lock to a mutex:

The virtqueue callback vp_modern_avq_done() holds admin_vq->lock and
runs in an interrupt handler context, making it impractical to replace
the spinlock with a mutex directly.

I considered deferring the completion to a workqueue so we could safely
use a mutex, but since this is a bug fix destined for
stable@vger.kernel.org, doing so would introduce significant code churn
(e.g., handling INIT_WORK, cancel_work_sync during cleanup, etc.) and
increase the risk for backports. Therefore, using GFP_ATOMIC with the
existing spinlock seems to be the most minimal and safest approach for
a fix.

However, just replacing GFP_KERNEL with GFP_ATOMIC isn't entirely safe
because of how virtqueue_add_sgs() handles allocation failures. If
kmalloc() fails under memory pressure with GFP_ATOMIC, the function
falls back to using direct descriptors. If there are not enough free
direct descriptors, it ultimately returns -ENOSPC.

In the current code, -ENOSPC is handled with a busy loop:

	if (ret == -ENOSPC) {
		spin_unlock_irqrestore(&admin_vq->lock, flags);
		cpu_relax();
		goto again;
	}

If the -ENOSPC is actually caused by a GFP_ATOMIC allocation failure
under memory pressure, this cpu_relax() loop will never yield the CPU
to memory reclaim mechanisms (like kswapd), potentially leading to a
soft lockup.

To properly handle both actual queue-full conditions and GFP_ATOMIC
failures, I propose replacing cpu_relax() with a sleep (e.g.,
usleep_range(10, 100)). This allows memory reclaim to run while we
wait. I plan to send out a v2 patch with this modification:

--- a/drivers/virtio/virtio_pci_modern.c
+++ b/drivers/virtio/virtio_pci_modern.c
@@ -101,11 +101,11 @@ static int virtqueue_exec_admin_cmd(struct virtio_pci_admin_vq *admin_vq,
 		return -EIO;
 
 	spin_lock_irqsave(&admin_vq->lock, flags);
-	ret = virtqueue_add_sgs(vq, sgs, out_num, in_num, cmd, GFP_KERNEL);
+	ret = virtqueue_add_sgs(vq, sgs, out_num, in_num, cmd, GFP_ATOMIC);
 	if (ret < 0) {
 		if (ret == -ENOSPC) {
 			spin_unlock_irqrestore(&admin_vq->lock, flags);
-			cpu_relax();
+			usleep_range(10, 100);
 			goto again;
 		}
 		goto unlock_err;

Does this approach align with your expectations for the fix?

Thanks,
Jinhui
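The unlock-sleep-retry shape of the proposed v2 can be modelled in userspace as follows. This is only a sketch: `nanosleep()` stands in for `usleep_range()`, the injected failures stand in for -ENOSPC from a full ring or a failed atomic allocation, and the lock placement is marked by comments:

```c
#include <errno.h>
#include <time.h>

/* Retry loop: on -ENOSPC, the (elided) lock is dropped and we sleep
 * briefly instead of spinning, so other work — in the kernel case,
 * memory reclaim — gets a chance to run. `failures` injects that many
 * -ENOSPC results before the submit succeeds. */
int submit_retry(int failures)
{
	struct timespec ts = { .tv_sec = 0, .tv_nsec = 50 * 1000 }; /* ~50us */
	int ret;

	for (;;) {
		/* spin_lock_irqsave() equivalent would go here */
		ret = (failures > 0) ? (failures--, -ENOSPC) : 0;
		/* spin_unlock_irqrestore() equivalent would go here */
		if (ret != -ENOSPC)
			return ret;
		nanosleep(&ts, NULL);	/* yield, rather than cpu_relax() spinning */
	}
}
```

The sleep happens only after the lock is released, which is what makes it legal in the process-context caller while keeping the critical section atomic.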