* [PATCH] virtio_pci_modern: Use GFP_ATOMIC with spin_lock_irqsave held in virtqueue_exec_admin_cmd()
From: Jinhui Guo @ 2026-04-13 7:22 UTC
To: Michael S. Tsirkin, Jason Wang, Xuan Zhuo, Eugenio Pérez,
Jiri Pirko
Cc: virtualization, linux-kernel, Jinhui Guo, stable
virtqueue_exec_admin_cmd() holds admin_vq->lock with spin_lock_irqsave(),
which disables interrupts. Using GFP_KERNEL inside this critical section
is unsafe because kmalloc() may sleep, leading to potential deadlocks or
scheduling violations.

Switch to GFP_ATOMIC to ensure the allocation is non-blocking.

Fixes: 4c3b54af907e ("virtio_pci_modern: use completion instead of busy loop to wait on admin cmd result")
Cc: stable@vger.kernel.org
Signed-off-by: Jinhui Guo <guojinhui.liam@bytedance.com>
---
drivers/virtio/virtio_pci_modern.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/virtio/virtio_pci_modern.c b/drivers/virtio/virtio_pci_modern.c
index 6d8ae2a6a8ca..db8e4f88b749 100644
--- a/drivers/virtio/virtio_pci_modern.c
+++ b/drivers/virtio/virtio_pci_modern.c
@@ -101,7 +101,7 @@ static int virtqueue_exec_admin_cmd(struct virtio_pci_admin_vq *admin_vq,
 		return -EIO;
 
 	spin_lock_irqsave(&admin_vq->lock, flags);
-	ret = virtqueue_add_sgs(vq, sgs, out_num, in_num, cmd, GFP_KERNEL);
+	ret = virtqueue_add_sgs(vq, sgs, out_num, in_num, cmd, GFP_ATOMIC);
 	if (ret < 0) {
 		if (ret == -ENOSPC) {
 			spin_unlock_irqrestore(&admin_vq->lock, flags);
--
2.20.1
* Re: [PATCH] virtio_pci_modern: Use GFP_ATOMIC with spin_lock_irqsave held in virtqueue_exec_admin_cmd()
From: Michael S. Tsirkin @ 2026-04-13 7:45 UTC
To: Jinhui Guo
Cc: Jason Wang, Xuan Zhuo, Eugenio Pérez, Jiri Pirko,
virtualization, linux-kernel, stable
On Mon, Apr 13, 2026 at 03:22:49PM +0800, Jinhui Guo wrote:
> virtqueue_exec_admin_cmd() holds admin_vq->lock with spin_lock_irqsave(),
> which disables interrupts. Using GFP_KERNEL inside this critical section
> is unsafe because kmalloc() may sleep, leading to potential deadlocks or
> scheduling violations.
>
> Switch to GFP_ATOMIC to ensure the allocation is non-blocking.
>
> Fixes: 4c3b54af907e ("virtio_pci_modern: use completion instead of busy loop to wait on admin cmd result")
> Cc: stable@vger.kernel.org
> Signed-off-by: Jinhui Guo <guojinhui.liam@bytedance.com>
> ---
> drivers/virtio/virtio_pci_modern.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/virtio/virtio_pci_modern.c b/drivers/virtio/virtio_pci_modern.c
> index 6d8ae2a6a8ca..db8e4f88b749 100644
> --- a/drivers/virtio/virtio_pci_modern.c
> +++ b/drivers/virtio/virtio_pci_modern.c
> @@ -101,7 +101,7 @@ static int virtqueue_exec_admin_cmd(struct virtio_pci_admin_vq *admin_vq,
>  		return -EIO;
>  
>  	spin_lock_irqsave(&admin_vq->lock, flags);
> -	ret = virtqueue_add_sgs(vq, sgs, out_num, in_num, cmd, GFP_KERNEL);
> +	ret = virtqueue_add_sgs(vq, sgs, out_num, in_num, cmd, GFP_ATOMIC);
>  	if (ret < 0) {
>  		if (ret == -ENOSPC) {
>  			spin_unlock_irqrestore(&admin_vq->lock, flags);
GFP_ATOMIC allocations can and will fail. If using them, one must
retry, not just propagate failures.

Or just switch admin_vq->lock to a mutex?
> --
> 2.20.1
* Re: [PATCH] virtio_pci_modern: Use GFP_ATOMIC with spin_lock_irqsave held in virtqueue_exec_admin_cmd()
From: David Laight @ 2026-04-13 9:17 UTC
To: Michael S. Tsirkin
Cc: Jinhui Guo, Jason Wang, Xuan Zhuo, Eugenio Pérez, Jiri Pirko,
virtualization, linux-kernel, stable
On Mon, 13 Apr 2026 03:45:20 -0400
"Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Mon, Apr 13, 2026 at 03:22:49PM +0800, Jinhui Guo wrote:
> > virtqueue_exec_admin_cmd() holds admin_vq->lock with spin_lock_irqsave(),
> > which disables interrupts. Using GFP_KERNEL inside this critical section
> > is unsafe because kmalloc() may sleep, leading to potential deadlocks or
> > scheduling violations.
> >
> > Switch to GFP_ATOMIC to ensure the allocation is non-blocking.
> >
> > Fixes: 4c3b54af907e ("virtio_pci_modern: use completion instead of busy loop to wait on admin cmd result")
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Jinhui Guo <guojinhui.liam@bytedance.com>
> > ---
> > drivers/virtio/virtio_pci_modern.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/virtio/virtio_pci_modern.c b/drivers/virtio/virtio_pci_modern.c
> > index 6d8ae2a6a8ca..db8e4f88b749 100644
> > --- a/drivers/virtio/virtio_pci_modern.c
> > +++ b/drivers/virtio/virtio_pci_modern.c
> > @@ -101,7 +101,7 @@ static int virtqueue_exec_admin_cmd(struct virtio_pci_admin_vq *admin_vq,
> >  		return -EIO;
> >  
> >  	spin_lock_irqsave(&admin_vq->lock, flags);
> > -	ret = virtqueue_add_sgs(vq, sgs, out_num, in_num, cmd, GFP_KERNEL);
> > +	ret = virtqueue_add_sgs(vq, sgs, out_num, in_num, cmd, GFP_ATOMIC);
> >  	if (ret < 0) {
> >  		if (ret == -ENOSPC) {
> >  			spin_unlock_irqrestore(&admin_vq->lock, flags);
>
>
> GFP_ATOMIC allocations can and will fail. If using them, one must
> retry, not just propagate failures.
> Or just switch admin_vq->lock to a mutex?
Or do the allocation before acquiring the lock (and free it in the error
path if it isn't used).

David
>
>
> > --
> > 2.20.1
>
>
* Re: [PATCH] virtio_pci_modern: Use GFP_ATOMIC with spin_lock_irqsave held in virtqueue_exec_admin_cmd()
From: Jinhui Guo @ 2026-04-13 10:00 UTC
To: Michael S. Tsirkin
Cc: Eugenio Pérez, Jinhui Guo, Jason Wang, Jiri Pirko, Xuan Zhuo,
linux-kernel, stable, virtualization
On Mon, Apr 13, 2026 at 03:45:20 -0400, "Michael S. Tsirkin" wrote:
> GFP_ATOMIC allocations can and will fail. If using them, one must
> retry, not just propagate failures.
> Or just switch admin_vq->lock to a mutex?
Hi Michael,

Thank you for the review.

Regarding the suggestion to switch admin_vq->lock to a mutex:

The virtqueue callback vp_modern_avq_done() takes admin_vq->lock and
runs in interrupt context, making it impractical to replace the
spinlock with a mutex directly.

I considered deferring the completion to a workqueue so we could safely
use a mutex, but since this is a bug fix destined for stable@vger.kernel.org,
doing so would introduce significant code churn (e.g., handling INIT_WORK,
cancel_work_sync during cleanup, etc.) and increase the risk for backports.
Therefore, using GFP_ATOMIC with the existing spinlock seems to be the
minimal, safest approach for a fix.

However, just replacing GFP_KERNEL with GFP_ATOMIC isn't entirely safe,
because of how virtqueue_add_sgs() handles allocation failures. If kmalloc()
fails under memory pressure with GFP_ATOMIC, the function falls back to
using direct descriptors; if there are not enough free direct descriptors,
it ultimately returns -ENOSPC.

In the current code, -ENOSPC is handled with a busy loop:

	if (ret == -ENOSPC) {
		spin_unlock_irqrestore(&admin_vq->lock, flags);
		cpu_relax();
		goto again;
	}

If the -ENOSPC is actually caused by a GFP_ATOMIC allocation failure under
memory pressure, this cpu_relax() loop will never yield the CPU to memory
reclaim mechanisms (like kswapd), potentially leading to a soft lockup.

To properly handle both an actual queue-full condition and a GFP_ATOMIC
failure, I propose replacing cpu_relax() with a sleep (e.g.,
usleep_range(10, 100)), which allows memory reclaim to run while we wait.
I plan to send out a v2 patch with this modification:
--- a/drivers/virtio/virtio_pci_modern.c
+++ b/drivers/virtio/virtio_pci_modern.c
@@ -101,11 +101,11 @@ static int virtqueue_exec_admin_cmd(struct virtio_pci_admin_vq *admin_vq,
 		return -EIO;
 
 	spin_lock_irqsave(&admin_vq->lock, flags);
-	ret = virtqueue_add_sgs(vq, sgs, out_num, in_num, cmd, GFP_KERNEL);
+	ret = virtqueue_add_sgs(vq, sgs, out_num, in_num, cmd, GFP_ATOMIC);
 	if (ret < 0) {
 		if (ret == -ENOSPC) {
 			spin_unlock_irqrestore(&admin_vq->lock, flags);
-			cpu_relax();
+			usleep_range(10, 100);
 			goto again;
 		}
 		goto unlock_err;
Does this approach align with your expectations for the fix?

Thanks,
Jinhui
* Re: [PATCH] virtio_pci_modern: Use GFP_ATOMIC with spin_lock_irqsave held in virtqueue_exec_admin_cmd()
From: Jinhui Guo @ 2026-04-13 12:22 UTC
To: David Laight
Cc: Michael S. Tsirkin, Eugenio Pérez, Jinhui Guo, Jason Wang,
Jiri Pirko, Xuan Zhuo, linux-kernel, stable, virtualization
On Mon, Apr 13, 2026 at 10:17:59 +0100, David Laight wrote:
> Or do the allocation before acquiring the lock (and free it in the error
> path if it isn't used).
Hi David,

Thanks for the suggestion.

Pre-allocating the memory outside the lock is indeed good practice, but
unfortunately it doesn't work in this specific virtqueue context.

The kmalloc() in question does not happen at the virtqueue_exec_admin_cmd()
level. Instead, it is buried deep inside virtqueue_add_sgs() (specifically,
in functions like alloc_indirect_split() or virtqueue_add_indirect_packed()),
which allocate the indirect descriptor table when multiple SG elements are
provided.

As a caller, we have no mechanism to pre-allocate this indirect descriptor
memory and pass it down to virtqueue_add_sgs(). Furthermore, virtqueue_add_sgs()
needs to atomically check the queue's num_free count, allocate the indirect
table if necessary, and update the queue pointers. All of these operations
must be protected by admin_vq->lock to prevent concurrent admin command
submissions from corrupting the virtqueue state.

Therefore, allocating before acquiring the lock isn't feasible here, and
replacing GFP_KERNEL with GFP_ATOMIC (with a proper sleepable retry on
failure) seems to be the more viable fix.
Does this make sense?
Thanks,
Jinhui
* Re: [PATCH] virtio_pci_modern: Use GFP_ATOMIC with spin_lock_irqsave held in virtqueue_exec_admin_cmd()
From: David Laight @ 2026-04-13 13:33 UTC
To: Jinhui Guo
Cc: Michael S. Tsirkin, Eugenio Pérez, Jason Wang, Jiri Pirko,
Xuan Zhuo, linux-kernel, stable, virtualization
On Mon, 13 Apr 2026 20:22:44 +0800
"Jinhui Guo" <guojinhui.liam@bytedance.com> wrote:
> On Mon, Apr 13, 2026 at 10:17:59 +0100, David Laight wrote:
> > Or do the allocation before acquiring the lock (and free it in the error
> > path if it isn't used).
>
> Hi David,
>
> Thanks for the suggestion.
>
> Pre-allocating the memory outside the lock is indeed a good practice,
> but unfortunately it doesn't work in this specific virtqueue context.
>
> The kmalloc() in question is not happening at the virtqueue_exec_admin_cmd()
> level. Instead, it is deeply embedded inside virtqueue_add_sgs()
> (specifically, in functions like alloc_indirect_split() or
> virtqueue_add_indirect_packed()) to allocate indirect descriptors when
> multiple SG elements are provided.
>
> As a caller, we have no mechanism to pre-allocate this indirect descriptor
> memory and pass it down to virtqueue_add_sgs(). Furthermore, virtqueue_add_sgs()
> needs to atomically check the queue's num_free status, allocate the indirect
> table if necessary, and update the queue pointers. All these operations
> must be protected by admin_vq->lock to prevent concurrent admin command
> submissions from corrupting the virtqueue state.
It just sounds non-trivial...
>
> Therefore, allocating before acquiring the lock isn't feasible here, and
> replacing GFP_KERNEL with GFP_ATOMIC (with a proper sleepable retry upon
> failure) seems to be the more viable fix.
The sleep-retry isn't really ideal - and may not make progress.

An 'interesting' solution would be to return the size of the kmalloc()
that failed, then kmalloc() and kfree() a buffer of that size and hope
it is still available for the retry.

From a quick read of the code, it is always a constant multiplied by the
number of fragments, although I only found kmalloc() in the 'indirect'
paths. I didn't spot what happens if the ring itself is full.
David
>
> Does this make sense?
>
> Thanks,
> Jinhui
* Re: [PATCH] virtio_pci_modern: Use GFP_ATOMIC with spin_lock_irqsave held in virtqueue_exec_admin_cmd()
From: Eugenio Perez Martin @ 2026-04-13 14:14 UTC
To: Jinhui Guo
Cc: David Laight, Michael S. Tsirkin, Jason Wang, Jiri Pirko,
Xuan Zhuo, linux-kernel, stable, virtualization
On Mon, Apr 13, 2026 at 2:23 PM Jinhui Guo <guojinhui.liam@bytedance.com> wrote:
>
> On Mon, Apr 13, 2026 at 10:17:59 +0100, David Laight wrote:
> > Or do the allocation before acquiring the lock (and free it in the error
> > path if it isn't used).
>
> Hi David,
>
> Thanks for the suggestion.
>
> Pre-allocating the memory outside the lock is indeed a good practice,
> but unfortunately it doesn't work in this specific virtqueue context.
>
> The kmalloc() in question is not happening at the virtqueue_exec_admin_cmd()
> level. Instead, it is deeply embedded inside virtqueue_add_sgs()
> (specifically, in functions like alloc_indirect_split() or
> virtqueue_add_indirect_packed()) to allocate indirect descriptors when
> multiple SG elements are provided.
>
> As a caller, we have no mechanism to pre-allocate this indirect descriptor
> memory and pass it down to virtqueue_add_sgs(). Furthermore, virtqueue_add_sgs()
> needs to atomically check the queue's num_free status, allocate the indirect
> table if necessary, and update the queue pointers. All these operations
> must be protected by admin_vq->lock to prevent concurrent admin command
> submissions from corrupting the virtqueue state.
>
Sounds like a big chunk of that is achieved with virtqueue_map_* and
virtqueue_add_{in,out}buf_premapped functions, isn't it? Or am I
missing something?
> Therefore, allocating before acquiring the lock isn't feasible here, and
> replacing GFP_KERNEL with GFP_ATOMIC (with a proper sleepable retry upon
> failure) seems to be the more viable fix.
>
> Does this make sense?
>
> Thanks,
> Jinhui
>