From: David Laight <david.laight.linux@gmail.com>
To: "Jinhui Guo" <guojinhui.liam@bytedance.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>,
"Eugenio Pérez" <eperezma@redhat.com>,
"Jason Wang" <jasowang@redhat.com>,
"Jiri Pirko" <jiri@resnulli.us>,
"Xuan Zhuo" <xuanzhuo@linux.alibaba.com>,
linux-kernel@vger.kernel.org, stable@vger.kernel.org,
virtualization@lists.linux.dev
Subject: Re: [PATCH] virtio_pci_modern: Use GFP_ATOMIC with spin_lock_irqsave held in virtqueue_exec_admin_cmd()
Date: Mon, 13 Apr 2026 14:33:00 +0100
Message-ID: <20260413143300.16922e4f@pumpkin>
In-Reply-To: <20260413122244.534-1-guojinhui.liam@bytedance.com>
On Mon, 13 Apr 2026 20:22:44 +0800
"Jinhui Guo" <guojinhui.liam@bytedance.com> wrote:
> On Mon, Apr 13, 2026 at 10:17:59 +0100, David Laight wrote:
> > Or do the allocation before acquiring the lock (and free it if not used
> > in the error path).
>
> Hi David,
>
> Thanks for the suggestion.
>
> Pre-allocating the memory outside the lock is indeed a good practice,
> but unfortunately it doesn't work in this specific virtqueue context.
>
> The kmalloc() in question is not happening at the virtqueue_exec_admin_cmd()
> level. Instead, it is deeply embedded inside virtqueue_add_sgs()
> (specifically, in functions like alloc_indirect_split() or
> virtqueue_add_indirect_packed()) to allocate indirect descriptors when
> multiple SG elements are provided.
>
> As a caller, we have no mechanism to pre-allocate this indirect descriptor
> memory and pass it down to virtqueue_add_sgs(). Furthermore, virtqueue_add_sgs()
> needs to atomically check the queue's num_free status, allocate the indirect
> table if necessary, and update the queue pointers. All these operations
> must be protected by admin_vq->lock to prevent concurrent admin command
> submissions from corrupting the virtqueue state.
It just sounds non-trivial...
>
> Therefore, allocating before acquiring the lock isn't feasible here, and
> replacing GFP_KERNEL with GFP_ATOMIC (with a proper sleepable retry upon
> failure) seems to be the more viable fix.
The sleep-retry isn't really ideal - and may not make progress.
An 'interesting' solution would be to return the size of the kmalloc()
that failed, kmalloc() and kfree() a buffer of that size and hope
it is still available for the retry.
From a quick read of the code, the size is always a constant multiplied by
the number of fragments, although I only found kmalloc() in the 'indirect'
paths.
I didn't spot what happens if the ring itself is full.
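
As a rough illustration of the "return the size, kmalloc()/kfree() it, retry"
idea, here is a userspace sketch. It is a simulation only: every name in it
(sim_vq, sim_add_sgs(), sim_prewarm(), sim_exec_cmd(), SIM_DESC_SIZE) is
hypothetical and not a kernel API, malloc()/free() stand in for
kmalloc(GFP_KERNEL)/kfree(), and the "atomic pool" counter is an assumption
that models memory which happens to be available without sleeping. It also
assumes the freed buffer stays available for the retry, which real code could
only hope for.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumed constant per-fragment descriptor size (illustrative value). */
#define SIM_DESC_SIZE 16

/* Hypothetical stand-in for the admin virtqueue state. */
struct sim_vq {
	int lock_held;      /* models admin_vq->lock being held */
	size_t atomic_pool; /* bytes obtainable without sleeping */
};

/*
 * Models the add path: runs with the lock held and must not sleep.
 * On allocation failure it reports the size it needed via *need.
 */
static int sim_add_sgs(struct sim_vq *vq, unsigned int nfrags, size_t *need)
{
	size_t size = (size_t)nfrags * SIM_DESC_SIZE;

	assert(vq->lock_held);
	if (vq->atomic_pool < size) {
		*need = size;
		return -ENOMEM;
	}
	vq->atomic_pool -= size;
	return 0;
}

/*
 * Models a sleepable allocate-and-free done outside the lock. The
 * simulation assumes the freed memory then stays available.
 */
static void sim_prewarm(struct sim_vq *vq, size_t size)
{
	void *p;
	int ok;

	assert(!vq->lock_held);
	p = malloc(size); /* stands in for kmalloc(size, GFP_KERNEL) */
	ok = (p != NULL);
	free(p);          /* stands in for kfree() */
	if (ok && vq->atomic_pool < size)
		vq->atomic_pool = size;
}

/* Bounded retry loop: try under the lock, prewarm outside it on -ENOMEM. */
static int sim_exec_cmd(struct sim_vq *vq, unsigned int nfrags, int max_tries)
{
	size_t need = 0;
	int ret = -ENOMEM;

	for (int i = 0; i < max_tries; i++) {
		vq->lock_held = 1;
		ret = sim_add_sgs(vq, nfrags, &need);
		vq->lock_held = 0;
		if (ret != -ENOMEM)
			break;
		sim_prewarm(vq, need);
	}
	return ret;
}
```

The retry count is bounded so the caller still sees -ENOMEM if the buffer
never becomes available, which is the "may not make progress" case.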
David
>
> Does this make sense?
>
> Thanks,
> Jinhui