From: "Philippe Mathieu-Daudé" <philmd@linaro.org>
To: Stefan Hajnoczi <stefanha@redhat.com>
Cc: "Daniel P. Berrangé" <berrange@redhat.com>,
	qemu-devel@nongnu.org, qemu-block@nongnu.org,
	"Michael S. Tsirkin" <mst@redhat.com>
Subject: Re: [PATCH] virtio: avoid cost of -ftrivial-auto-var-init in hot path
Date: Thu, 5 Jun 2025 18:16:58 +0200
Message-ID: <ff2a5d50-0908-4736-b664-523b2ed09f30@linaro.org>
In-Reply-To: <20250605125015.GB417071@fedora>

On 5/6/25 14:50, Stefan Hajnoczi wrote:
> On Thu, Jun 05, 2025 at 01:28:49PM +0200, Philippe Mathieu-Daudé wrote:
>> On 5/6/25 10:34, Daniel P. Berrangé wrote:
>>> On Wed, Jun 04, 2025 at 03:18:43PM -0400, Stefan Hajnoczi wrote:
>>>> Since commit 7ff9ff039380 ("meson: mitigate against use of uninitialize
>>>> stack for exploits") the -ftrivial-auto-var-init=zero compiler option is
>>>> used to zero local variables. While this reduces security risks
>>>> associated with uninitialized stack data, it introduced a measurable
>>>> bottleneck in the virtqueue_split_pop() and virtqueue_packed_pop()
>>>> functions.
>>>>
>>>> These virtqueue functions are in the hot path. They are called for each
>>>> element (request) that is popped from a VIRTIO device's virtqueue. Using
>>>> __attribute__((uninitialized)) on large stack variables in these
>>>> functions improves fio randread bs=4k iodepth=64 performance from 304k
>>>> to 332k IOPS (+9%).
>>>
>>> IIUC, the 'hwaddr addr' variable is 8k in size, and the 'struct iovec iov'
>>> array is 16k in size, so we have 24k on the stack that we're clearing and
>>> then later writing the real value. Makes sense that this would have a
>>> perf impact in a hotpath.
>>>
>>>> This issue was found using perf-top(1). virtqueue_split_pop() was one of
>>>> the top CPU consumers and the "annotate" feature showed that the memory
>>>> zeroing instructions at the beginning of the functions were hot.
>>>
>>> When you say you found it with 'perf-top' was that just discovered by
>>> accident, or was this usage of perf-top in response to users reporting
>>> a performance degradation vs earlier QEMU ?
>>
>> Would it make sense to move these to VirtQueue (since the structure
>> definition is local anyway)?
>>
>> -- >8 --
>> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
>> index 85110bce374..b96c6ec603c 100644
>> --- a/hw/virtio/virtio.c
>> +++ b/hw/virtio/virtio.c
>> @@ -153,6 +153,12 @@ struct VirtQueue
>>       EventNotifier host_notifier;
>>       bool host_notifier_enabled;
>>       QLIST_ENTRY(VirtQueue) node;
>> +
>> +    /* Only used by virtqueue_pop() */
>> +    struct {
>> +        hwaddr addr[VIRTQUEUE_MAX_SIZE];
>> +        struct iovec iov[VIRTQUEUE_MAX_SIZE];
>> +    } pop;
> 
> This is an alternative. Using g_alloca() is another alternative.

There are not many g_alloca() users in the tree:

$ git grep -w g_alloca
backends/tpm/tpm_emulator.c:136:        buf = g_alloca(n);
tests/unit/test-char.c:1012:        be = g_alloca(sizeof(CharBackend));

The tpm_emulator.c use could be replaced by a g_autofree buffer
allocated with g_malloc().

> I chose __attribute__((uninitialized)) because it clearly documents the
> reason why these variables need special treatment. In your patch the
> "Only used by virtqueue_pop()" comment isn't enough to explain why these
> variables should be located here. Someone might accidentally move them
> back into virtqueue_pop() functions in the future if they are unaware of
> the reason.

The only safety net is a better comment.
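
For illustration, a comment along these lines might make the intent
harder to miss. This is only a sketch under stated assumptions:
DemoVirtQueue, DEMO_QUEUE_MAX, demo_hwaddr and demo_pop() are
hypothetical stand-ins, not the real hw/virtio/virtio.c definitions,
and hwaddr is assumed to be a 64-bit integer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>             /* struct iovec */

#define DEMO_QUEUE_MAX 1024      /* hypothetical stand-in for VIRTQUEUE_MAX_SIZE */

typedef uint64_t demo_hwaddr;    /* assumption: hwaddr is a 64-bit integer */

typedef struct DemoVirtQueue {
    unsigned int num;            /* descriptors in the ring */

    /*
     * Scratch space for demo_pop(). Do NOT move these arrays back onto
     * the stack: with -ftrivial-auto-var-init=zero every call would
     * zero them before use, which is expensive in the hot path.
     */
    struct {
        demo_hwaddr addr[DEMO_QUEUE_MAX];
        struct iovec iov[DEMO_QUEUE_MAX];
    } pop;
} DemoVirtQueue;

/* Fill the first n scratch slots and return how many bytes they describe. */
static size_t demo_pop(DemoVirtQueue *vq, unsigned int n, size_t len)
{
    size_t total = 0;

    assert(n <= DEMO_QUEUE_MAX);
    for (unsigned int i = 0; i < n; i++) {
        vq->pop.addr[i] = (demo_hwaddr)i * len;
        vq->pop.iov[i].iov_base = NULL;   /* no real mapping in this sketch */
        vq->pop.iov[i].iov_len = len;
        total += vq->pop.iov[i].iov_len;
    }
    return total;
}
```

Scratch space shared per queue like this is only safe if the pop
function never runs concurrently or re-entrantly on the same queue;
the sketch does not check that.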

> I'm happy to change approaches based on the pros/cons. Why do you prefer
> moving the local variables into VirtQueue?

I don't have a particular preference, I'm just wondering why these
variables have to be handled differently from the rest, by introducing
QEMU_UNINITIALIZED.
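
For reference, a minimal sketch of how such a macro could be defined
and applied. The definition below is an assumption for illustration and
may differ from the one in the patch; DEMO_MAX and demo_sum() are
hypothetical names. The attribute only matters when building with
-ftrivial-auto-var-init; otherwise it has no effect.

```c
#include <assert.h>
#include <stdint.h>

/* Compilers without __has_attribute: treat every attribute as absent. */
#ifndef __has_attribute
#define __has_attribute(x) 0
#endif

/* Assumed definition, for illustration only. */
#if __has_attribute(uninitialized)
#define QEMU_UNINITIALIZED __attribute__((uninitialized))
#else
#define QEMU_UNINITIALIZED
#endif

#define DEMO_MAX 1024            /* hypothetical array size */

/* Sum 0..n-1 through a large local array that is never auto-zeroed. */
static uint64_t demo_sum(unsigned int n)
{
    /* Opt this array out of -ftrivial-auto-var-init zeroing. */
    QEMU_UNINITIALIZED uint64_t addr[DEMO_MAX];
    uint64_t sum = 0;

    assert(n <= DEMO_MAX);

    /* Every element is written before it is read. */
    for (unsigned int i = 0; i < n; i++) {
        addr[i] = i;
    }
    for (unsigned int i = 0; i < n; i++) {
        sum += addr[i];
    }
    return sum;
}
```

The point of the annotation is that only the slots actually used get
written, so the compiler-inserted zeroing of the whole array is pure
overhead for a function like this.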

Anyway, no objection to this patch :)

Regards,

Phil.

