From: Stefan Hajnoczi <stefanha@redhat.com>
To: "Philippe Mathieu-Daudé" <philmd@linaro.org>
Cc: "Daniel P. Berrangé" <berrange@redhat.com>,
	qemu-devel@nongnu.org, qemu-block@nongnu.org,
	"Michael S. Tsirkin" <mst@redhat.com>
Subject: Re: [PATCH] virtio: avoid cost of -ftrivial-auto-var-init in hot path
Date: Thu, 5 Jun 2025 08:50:15 -0400
Message-ID: <20250605125015.GB417071@fedora>
In-Reply-To: <42276df1-4267-4038-8685-c7a193259e67@linaro.org>

On Thu, Jun 05, 2025 at 01:28:49PM +0200, Philippe Mathieu-Daudé wrote:
> On 5/6/25 10:34, Daniel P. Berrangé wrote:
> > On Wed, Jun 04, 2025 at 03:18:43PM -0400, Stefan Hajnoczi wrote:
> > > Since commit 7ff9ff039380 ("meson: mitigate against use of uninitialize
> > > stack for exploits") the -ftrivial-auto-var-init=zero compiler option is
> > > used to zero local variables. While this reduces security risks
> > > associated with uninitialized stack data, it introduced a measurable
> > > bottleneck in the virtqueue_split_pop() and virtqueue_packed_pop()
> > > functions.
> > > 
> > > These virtqueue functions are in the hot path. They are called for each
> > > element (request) that is popped from a VIRTIO device's virtqueue. Using
> > > __attribute__((uninitialized)) on large stack variables in these
> > > functions improves fio randread bs=4k iodepth=64 performance from 304k
> > > to 332k IOPS (+9%).
> > 
> > IIUC, the 'hwaddr addr' array is 8k in size, and the 'struct iovec iov'
> > array is 16k in size, so we are clearing 24k of stack and then
> > overwriting it with the real values. Makes sense that this would have a
> > perf impact in a hot path.
> > 
> > > This issue was found using perf-top(1). virtqueue_split_pop() was one of
> > > the top CPU consumers and the "annotate" feature showed that the memory
> > > zeroing instructions at the beginning of the functions were hot.
> > 
> > When you say you found it with 'perf-top', was that discovered by
> > accident, or were you running perf-top in response to users reporting
> > a performance degradation compared with earlier QEMU?
> 
> Would it make sense to move these to VirtQueue (since the structure
> definition is local anyway)?
> 
> -- >8 --
> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> index 85110bce374..b96c6ec603c 100644
> --- a/hw/virtio/virtio.c
> +++ b/hw/virtio/virtio.c
> @@ -153,6 +153,12 @@ struct VirtQueue
>      EventNotifier host_notifier;
>      bool host_notifier_enabled;
>      QLIST_ENTRY(VirtQueue) node;
> +
> +    /* Only used by virtqueue_pop() */
> +    struct {
> +        hwaddr addr[VIRTQUEUE_MAX_SIZE];
> +        struct iovec iov[VIRTQUEUE_MAX_SIZE];
> +    } pop;

That is one alternative. Using g_alloca() is another.

I chose __attribute__((uninitialized)) because it clearly documents why
these variables need special treatment. In your patch, the "Only used by
virtqueue_pop()" comment does not explain why the arrays live in
VirtQueue. Someone who is unaware of the reason might move them back
into the virtqueue_split_pop() and virtqueue_packed_pop() functions in
the future.

I'm happy to change approaches based on the pros/cons. Why do you prefer
moving the local variables into VirtQueue?
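
To make the attribute approach concrete, here is a minimal sketch of how
it could look. It is not the code from the patch: EXAMPLE_UNINITIALIZED,
EXAMPLE_MAX_SIZE, example_hwaddr and example_pop() are hypothetical names
chosen for illustration, and the loop bodies are placeholders. The macro
expands to nothing on compilers that do not report support for the
attribute.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>            /* struct iovec (POSIX) */

/* Hypothetical stand-ins for VIRTQUEUE_MAX_SIZE and hwaddr. */
#define EXAMPLE_MAX_SIZE 1024
typedef uint64_t example_hwaddr;

/* Compilers without __has_attribute are treated as lacking the attribute. */
#ifndef __has_attribute
#define __has_attribute(x) 0
#endif

/*
 * Hypothetical macro name. When -ftrivial-auto-var-init is in effect, the
 * attribute asks the compiler to skip automatic initialization of the
 * marked variable. The name also documents why the variable is special.
 */
#if __has_attribute(uninitialized)
#define EXAMPLE_UNINITIALIZED __attribute__((uninitialized))
#else
#define EXAMPLE_UNINITIALIZED
#endif

/*
 * Placeholder for a hot-path function with large scratch arrays. Only the
 * first n entries are written, and only those entries are read back, so
 * skipping the automatic zeroing does not expose stale stack data here.
 */
size_t example_pop(size_t n, example_hwaddr base, size_t len,
                   example_hwaddr *last_addr)
{
    EXAMPLE_UNINITIALIZED example_hwaddr addr[EXAMPLE_MAX_SIZE];
    EXAMPLE_UNINITIALIZED struct iovec iov[EXAMPLE_MAX_SIZE];
    size_t total = 0;

    assert(n >= 1 && n <= EXAMPLE_MAX_SIZE);

    for (size_t i = 0; i < n; i++) {
        addr[i] = base + i * len;
        iov[i].iov_base = NULL;
        iov[i].iov_len = len;
    }
    for (size_t i = 0; i < n; i++) {
        total += iov[i].iov_len;
    }
    *last_addr = addr[n - 1];
    return total;
}
```

The point of the sketch is the placement: the annotation sits on the
declaration itself, so moving the arrays elsewhere would require
deleting it deliberately.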

>  };
> 
>  const char *virtio_device_names[] = {
> @@ -1680,8 +1686,8 @@ static void *virtqueue_split_pop(VirtQueue *vq, size_t sz)
>      VirtIODevice *vdev = vq->vdev;
>      VirtQueueElement *elem = NULL;
>      unsigned out_num, in_num, elem_entries;
> -    hwaddr addr[VIRTQUEUE_MAX_SIZE];
> -    struct iovec iov[VIRTQUEUE_MAX_SIZE];
> +    hwaddr *addr = vq->pop.addr;
> +    struct iovec *iov = vq->pop.iov;
>      VRingDesc desc;
>      int rc;
> 
> @@ -1826,8 +1832,8 @@ static void *virtqueue_packed_pop(VirtQueue *vq, size_t sz)
>      VirtIODevice *vdev = vq->vdev;
>      VirtQueueElement *elem = NULL;
>      unsigned out_num, in_num, elem_entries;
> -    hwaddr addr[VIRTQUEUE_MAX_SIZE];
> -    struct iovec iov[VIRTQUEUE_MAX_SIZE];
> +    hwaddr *addr = vq->pop.addr;
> +    struct iovec *iov = vq->pop.iov;
>      VRingPackedDesc desc;
>      uint16_t id;
>      int rc;
> ---
> 
> > 
> > > 
> > > Fixes: 7ff9ff039380 ("meson: mitigate against use of uninitialize stack for exploits")
> > > Cc: Daniel P. Berrangé <berrange@redhat.com>
> > > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> > > ---
> > >   include/qemu/compiler.h | 12 ++++++++++++
> > >   hw/virtio/virtio.c      |  8 ++++----
> > >   2 files changed, 16 insertions(+), 4 deletions(-)
> 
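
For comparison, the alternative from the quoted diff (scratch arrays
stored in the queue structure rather than on the stack) can be sketched
as below. All names (sketch_queue, sketch_queue_pop(),
sketch_scratch_bytes(), SKETCH_MAX_SIZE, sketch_hwaddr) are hypothetical
and the loop body is a placeholder. The helper also reproduces the size
arithmetic from the thread: 1024 entries of an 8-byte address plus 1024
entries of struct iovec, which is 16 bytes on typical 64-bit Linux,
giving 8k + 16k = 24k.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>            /* struct iovec (POSIX) */

/* Hypothetical stand-ins for VIRTQUEUE_MAX_SIZE and hwaddr. */
#define SKETCH_MAX_SIZE 1024
typedef uint64_t sketch_hwaddr;

/* Hypothetical queue type; a real VirtQueue has many more fields. */
struct sketch_queue {
    unsigned next;
    /* Scratch space for pop, kept off the stack so it is never auto-zeroed. */
    struct {
        sketch_hwaddr addr[SKETCH_MAX_SIZE];
        struct iovec iov[SKETCH_MAX_SIZE];
    } pop;
};

/* Bytes that would otherwise be stack-allocated on every call. */
size_t sketch_scratch_bytes(void)
{
    return sizeof(((struct sketch_queue *)0)->pop);
}

/* Placeholder pop: uses the per-queue scratch arrays through pointers. */
size_t sketch_queue_pop(struct sketch_queue *q, size_t n, size_t len)
{
    sketch_hwaddr *addr = q->pop.addr;
    struct iovec *iov = q->pop.iov;
    size_t total = 0;

    assert(n <= SKETCH_MAX_SIZE);

    for (size_t i = 0; i < n; i++) {
        addr[i] = (sketch_hwaddr)i * len;
        iov[i].iov_base = NULL;
        iov[i].iov_len = len;
        total += iov[i].iov_len;
    }
    q->next++;
    return total;
}
```

In this layout the buffers become per-queue state that persists between
calls, so the sketch assumes pop is never re-entered on the same queue.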

Thread overview: 11+ messages
2025-06-04 19:18 [PATCH] virtio: avoid cost of -ftrivial-auto-var-init in hot path Stefan Hajnoczi
2025-06-05  8:34 ` Daniel P. Berrangé
2025-06-05 11:28   ` Philippe Mathieu-Daudé
2025-06-05 12:50     ` Stefan Hajnoczi [this message]
2025-06-05 16:16       ` Philippe Mathieu-Daudé
2025-06-05 16:30         ` Peter Maydell
2025-06-05 12:44   ` Stefan Hajnoczi
2025-06-05 18:54 ` Daniel P. Berrangé
2025-06-06  9:33   ` Kevin Wolf
2025-06-10 16:41 ` Michael S. Tsirkin
2025-06-10 16:52   ` Daniel P. Berrangé