* [PATCH 01/31] include/qemu/compiler: add QEMU_UNINITIALIZED attribute macro
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
@ 2025-06-10 12:36 ` Daniel P. Berrangé
2025-06-10 12:36 ` [PATCH 02/31] hw/virtio/virtio: avoid cost of -ftrivial-auto-var-init in hot path Daniel P. Berrangé
` (32 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:36 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
From: Stefan Hajnoczi <stefanha@redhat.com>
The QEMU_UNINITIALIZED macro is to be used to skip the default compiler
variable initialization done by -ftrivial-auto-var-init=zero.
Use this in cases where there is a method in the device I/O path (or
other important hot paths) that has large variables on the stack. A rule
of thumb is that "large" means a method with 4 KB of data in the local
stack frame. Any variables which are kilobytes in size should be
annotated with this attribute, to pre-emptively eliminate any potential
overhead from the compiler zeroing memory.
Given that this turns off a security hardening feature, when using this
to flag variables, it is important that the code is double-checked to
ensure there is no possible use of uninitialized data in the method.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
[DB: split off patch & rewrite guidance on when to use the annotation]
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
include/qemu/compiler.h | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/include/qemu/compiler.h b/include/qemu/compiler.h
index 496dac5ac1..65b89958d3 100644
--- a/include/qemu/compiler.h
+++ b/include/qemu/compiler.h
@@ -207,6 +207,26 @@
# define QEMU_USED
#endif
+/*
+ * Disable -ftrivial-auto-var-init on a local variable.
+ *
+ * Use this in cases where there is a method in the device I/O path (or
+ * other important hot paths) that has large variables on the stack. A rule
+ * of thumb is that "large" means a method with 4 KB of data in the local
+ * stack frame. Any variables which are kilobytes in size should be
+ * annotated with this attribute, to pre-emptively eliminate any potential
+ * overhead from the compiler's implicit zeroing of memory.
+ *
+ * Given that this turns off a security hardening feature, when using this
+ * to flag variables, it is important that the code is double-checked to
+ * ensure there is no possible use of uninitialized data in the method.
+ */
+#if __has_attribute(uninitialized)
+# define QEMU_UNINITIALIZED __attribute__((uninitialized))
+#else
+# define QEMU_UNINITIALIZED
+#endif
+
/*
* http://clang.llvm.org/docs/ThreadSafetyAnalysis.html
*
--
2.49.0
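
For illustration only, a minimal standalone sketch of how the macro from this patch might be applied. The __has_attribute fallback, DEMO_BUF_LEN and demo_checksum() are assumptions made up for this example and are not part of the patch. The point is that every byte of the annotated buffer that is later read has been written first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: fallback so the sketch builds on compilers without __has_attribute */
#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

/* Same shape as the definition added by the patch */
#if __has_attribute(uninitialized)
# define QEMU_UNINITIALIZED __attribute__((uninitialized))
#else
# define QEMU_UNINITIALIZED
#endif

#define DEMO_BUF_LEN 4096 /* hypothetical size, matching the 4 KB rule of thumb */

/* Hypothetical helper: sums 'len' bytes of 'src' via a large stack bounce buffer */
static uint32_t demo_checksum(const uint8_t *src, size_t len)
{
    QEMU_UNINITIALIZED uint8_t buf[DEMO_BUF_LEN];
    uint32_t sum = 0;

    assert(len <= sizeof(buf));
    memcpy(buf, src, len);      /* bytes [0, len) are initialized here */
    for (size_t i = 0; i < len; i++) {
        sum += buf[i];          /* only initialized bytes are ever read */
    }
    return sum;
}
```

When the compiler lacks the attribute, the macro expands to nothing and the variable is declared as usual.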
* [PATCH 02/31] hw/virtio/virtio: avoid cost of -ftrivial-auto-var-init in hot path
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
2025-06-10 12:36 ` [PATCH 01/31] include/qemu/compiler: add QEMU_UNINITIALIZED attribute macro Daniel P. Berrangé
@ 2025-06-10 12:36 ` Daniel P. Berrangé
2025-06-10 12:36 ` [PATCH 03/31] block: skip automatic zero-init of large array in ioq_submit Daniel P. Berrangé
` (31 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:36 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
From: Stefan Hajnoczi <stefanha@redhat.com>
Since commit 7ff9ff039380 ("meson: mitigate against use of uninitialize
stack for exploits") the -ftrivial-auto-var-init=zero compiler option is
used to zero local variables. While this reduces security risks
associated with uninitialized stack data, it introduced a measurable
bottleneck in the virtqueue_split_pop() and virtqueue_packed_pop()
functions.
These virtqueue functions are in the hot path. They are called for each
element (request) that is popped from a VIRTIO device's virtqueue. Using
__attribute__((uninitialized)) on large stack variables in these
functions improves fio randread bs=4k iodepth=64 performance from 304k
to 332k IOPS (+9%).
This issue was found using perf-top(1). virtqueue_split_pop() was one of
the top CPU consumers and the "annotate" feature showed that the memory
zeroing instructions at the beginning of the functions were hot.
Fixes: 7ff9ff039380 ("meson: mitigate against use of uninitialize stack for exploits")
Cc: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
hw/virtio/virtio.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 5534251e01..82a285a31d 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -1689,8 +1689,8 @@ static void *virtqueue_split_pop(VirtQueue *vq, size_t sz)
VirtIODevice *vdev = vq->vdev;
VirtQueueElement *elem = NULL;
unsigned out_num, in_num, elem_entries;
- hwaddr addr[VIRTQUEUE_MAX_SIZE];
- struct iovec iov[VIRTQUEUE_MAX_SIZE];
+ hwaddr QEMU_UNINITIALIZED addr[VIRTQUEUE_MAX_SIZE];
+ struct iovec QEMU_UNINITIALIZED iov[VIRTQUEUE_MAX_SIZE];
VRingDesc desc;
int rc;
@@ -1836,8 +1836,8 @@ static void *virtqueue_packed_pop(VirtQueue *vq, size_t sz)
VirtIODevice *vdev = vq->vdev;
VirtQueueElement *elem = NULL;
unsigned out_num, in_num, elem_entries;
- hwaddr addr[VIRTQUEUE_MAX_SIZE];
- struct iovec iov[VIRTQUEUE_MAX_SIZE];
+ hwaddr QEMU_UNINITIALIZED addr[VIRTQUEUE_MAX_SIZE];
+ struct iovec QEMU_UNINITIALIZED iov[VIRTQUEUE_MAX_SIZE];
VRingPackedDesc desc;
uint16_t id;
int rc;
--
2.49.0
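
As a rough sketch of the pattern this patch relies on (not the actual virtqueue code): a per-request function keeps a large scratch array on the stack, fills only the first n entries, and copies exactly those n entries into the returned element. DemoElem, demo_pop() and DEMO_QUEUE_MAX are hypothetical names invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#ifndef __has_attribute
# define __has_attribute(x) 0
#endif
#if __has_attribute(uninitialized)
# define QEMU_UNINITIALIZED __attribute__((uninitialized))
#else
# define QEMU_UNINITIALIZED
#endif

#define DEMO_QUEUE_MAX 1024 /* hypothetical upper bound on entries per request */

/* Hypothetical element type: a count followed by that many addresses */
typedef struct DemoElem {
    unsigned num;
    uint64_t addr[];
} DemoElem;

/* Collects 'avail' addresses into stack scratch space, then copies them out */
static DemoElem *demo_pop(const uint64_t *ring, unsigned avail)
{
    QEMU_UNINITIALIZED uint64_t addr[DEMO_QUEUE_MAX]; /* 8 KB, not zeroed */
    unsigned n = 0;
    DemoElem *elem;

    assert(avail <= DEMO_QUEUE_MAX);
    while (n < avail) {
        addr[n] = ring[n];      /* entries [0, n) become initialized */
        n++;
    }

    elem = malloc(sizeof(*elem) + n * sizeof(elem->addr[0]));
    if (!elem) {
        return NULL;
    }
    elem->num = n;
    memcpy(elem->addr, addr, n * sizeof(addr[0])); /* reads only [0, n) */
    return elem;
}
```

Without the annotation, a build with -ftrivial-auto-var-init=zero would clear the whole scratch array on every call, even when a request uses only a few entries.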
* [PATCH 03/31] block: skip automatic zero-init of large array in ioq_submit
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
2025-06-10 12:36 ` [PATCH 01/31] include/qemu/compiler: add QEMU_UNINITIALIZED attribute macro Daniel P. Berrangé
2025-06-10 12:36 ` [PATCH 02/31] hw/virtio/virtio: avoid cost of -ftrivial-auto-var-init in hot path Daniel P. Berrangé
@ 2025-06-10 12:36 ` Daniel P. Berrangé
2025-06-10 12:36 ` [PATCH 04/31] chardev/char-fd: skip automatic zero-init of large array Daniel P. Berrangé
` (30 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:36 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'ioq_submit' method has a struct array that is 8k in size.
Skip the automatic zero-init of this array to eliminate the
performance overhead in the I/O hot path.
The 'iocbs' array will be selectively initialized when processing
the I/O data.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
block/linux-aio.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/block/linux-aio.c b/block/linux-aio.c
index 407369f5c9..c200e7ad20 100644
--- a/block/linux-aio.c
+++ b/block/linux-aio.c
@@ -291,7 +291,7 @@ static void ioq_submit(LinuxAioState *s)
{
int ret, len;
struct qemu_laiocb *aiocb;
- struct iocb *iocbs[MAX_EVENTS];
+ QEMU_UNINITIALIZED struct iocb *iocbs[MAX_EVENTS];
QSIMPLEQ_HEAD(, qemu_laiocb) completed;
do {
--
2.49.0
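
A simplified sketch of what "selectively initialized" could look like, assuming a pending list that is walked to fill a batch of pointers. DemoReq, demo_submit_batch() and demo_ioq_submit() are hypothetical stand-ins, not the linux-aio code. Only the first 'len' slots are written, and only those are handed to the consumer.

```c
#include <assert.h>
#include <stddef.h>

#ifndef __has_attribute
# define __has_attribute(x) 0
#endif
#if __has_attribute(uninitialized)
# define QEMU_UNINITIALIZED __attribute__((uninitialized))
#else
# define QEMU_UNINITIALIZED
#endif

#define DEMO_MAX_EVENTS 1024 /* hypothetical batch limit */

/* Hypothetical request on a singly linked pending list */
typedef struct DemoReq {
    int id;
    struct DemoReq *next;
} DemoReq;

/* Hypothetical stand-in for a batch submit call: consumes exactly n entries */
static int demo_submit_batch(DemoReq **batch, int n)
{
    int sum = 0;

    for (int i = 0; i < n; i++) {
        sum += batch[i]->id;
    }
    return sum;
}

static int demo_ioq_submit(DemoReq *pending)
{
    QEMU_UNINITIALIZED DemoReq *batch[DEMO_MAX_EVENTS]; /* 8 KB of pointers */
    int len = 0;

    for (DemoReq *r = pending; r && len < DEMO_MAX_EVENTS; r = r->next) {
        batch[len++] = r;       /* slots [0, len) are initialized */
    }
    assert(len <= DEMO_MAX_EVENTS);
    return demo_submit_batch(batch, len); /* slots >= len are never read */
}
```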
* [PATCH 04/31] chardev/char-fd: skip automatic zero-init of large array
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (2 preceding siblings ...)
2025-06-10 12:36 ` [PATCH 03/31] block: skip automatic zero-init of large array in ioq_submit Daniel P. Berrangé
@ 2025-06-10 12:36 ` Daniel P. Berrangé
2025-06-10 12:36 ` [PATCH 05/31] chardev/char-pty: " Daniel P. Berrangé
` (29 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:36 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'fd_chr_read' method has a 4k byte array used for copying
data between the socket and device. Skip the automatic zero-init
of this array to eliminate the performance overhead in the I/O
hot path.
The 'buf' array will be fully initialized when reading data off
the network socket.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
chardev/char-fd.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/chardev/char-fd.c b/chardev/char-fd.c
index 23bfe3c0b1..6f03adf872 100644
--- a/chardev/char-fd.c
+++ b/chardev/char-fd.c
@@ -50,7 +50,7 @@ static gboolean fd_chr_read(QIOChannel *chan, GIOCondition cond, void *opaque)
Chardev *chr = CHARDEV(opaque);
FDChardev *s = FD_CHARDEV(opaque);
int len;
- uint8_t buf[CHR_READ_BUF_LEN];
+ QEMU_UNINITIALIZED uint8_t buf[CHR_READ_BUF_LEN];
ssize_t ret;
len = sizeof(buf);
--
2.49.0
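
A minimal sketch of the read-buffer pattern, assuming a plain POSIX file descriptor rather than a QIOChannel. demo_chr_read() and DEMO_READ_BUF_LEN are hypothetical names. Strictly speaking only the bytes returned by the read are initialized, so the consumer must look at no more than 'ret' bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

#ifndef __has_attribute
# define __has_attribute(x) 0
#endif
#if __has_attribute(uninitialized)
# define QEMU_UNINITIALIZED __attribute__((uninitialized))
#else
# define QEMU_UNINITIALIZED
#endif

#define DEMO_READ_BUF_LEN 4096 /* hypothetical read buffer size */

/* Reads once from 'fd' and forwards the bytes received to 'out' */
static ssize_t demo_chr_read(int fd, uint8_t *out, size_t out_len)
{
    QEMU_UNINITIALIZED uint8_t buf[DEMO_READ_BUF_LEN];
    size_t want = sizeof(buf) < out_len ? sizeof(buf) : out_len;
    ssize_t ret;

    ret = read(fd, buf, want);  /* on success, bytes [0, ret) are written */
    if (ret <= 0) {
        return ret;             /* nothing in buf may be consumed */
    }
    assert((size_t)ret <= want);
    memcpy(out, buf, (size_t)ret); /* forwards only the bytes that were read */
    return ret;
}
```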
* [PATCH 05/31] chardev/char-pty: skip automatic zero-init of large array
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (3 preceding siblings ...)
2025-06-10 12:36 ` [PATCH 04/31] chardev/char-fd: skip automatic zero-init of large array Daniel P. Berrangé
@ 2025-06-10 12:36 ` Daniel P. Berrangé
2025-06-10 12:36 ` [PATCH 06/31] chardev/char-socket: " Daniel P. Berrangé
` (28 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:36 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'pty_chr_read' method has a 4k byte array used for copying
data between the PTY and device. Skip the automatic zero-init
of this array to eliminate the performance overhead in the I/O
hot path.
The 'buf' array will be fully initialized when reading data off
the PTY.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
chardev/char-pty.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/chardev/char-pty.c b/chardev/char-pty.c
index c28554e6e0..674e9b3f14 100644
--- a/chardev/char-pty.c
+++ b/chardev/char-pty.c
@@ -154,7 +154,7 @@ static gboolean pty_chr_read(QIOChannel *chan, GIOCondition cond, void *opaque)
Chardev *chr = CHARDEV(opaque);
PtyChardev *s = PTY_CHARDEV(opaque);
gsize len;
- uint8_t buf[CHR_READ_BUF_LEN];
+ QEMU_UNINITIALIZED uint8_t buf[CHR_READ_BUF_LEN];
ssize_t ret;
len = sizeof(buf);
--
2.49.0
* [PATCH 06/31] chardev/char-socket: skip automatic zero-init of large array
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (4 preceding siblings ...)
2025-06-10 12:36 ` [PATCH 05/31] chardev/char-pty: " Daniel P. Berrangé
@ 2025-06-10 12:36 ` Daniel P. Berrangé
2025-06-10 12:36 ` [PATCH 07/31] hw/audio/ac97: skip automatic zero-init of large arrays Daniel P. Berrangé
` (27 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:36 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'tcp_chr_read' method has a 4k byte array used for copying
data between the socket and device. Skip the automatic zero-init
of this array to eliminate the performance overhead in the I/O
hot path.
The 'buf' array will be fully initialized when reading data off
the network socket.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
chardev/char-socket.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/chardev/char-socket.c b/chardev/char-socket.c
index e8dd2931dc..1e8313915b 100644
--- a/chardev/char-socket.c
+++ b/chardev/char-socket.c
@@ -497,7 +497,7 @@ static gboolean tcp_chr_read(QIOChannel *chan, GIOCondition cond, void *opaque)
{
Chardev *chr = CHARDEV(opaque);
SocketChardev *s = SOCKET_CHARDEV(opaque);
- uint8_t buf[CHR_READ_BUF_LEN];
+ QEMU_UNINITIALIZED uint8_t buf[CHR_READ_BUF_LEN];
int len, size;
if ((s->state != TCP_CHARDEV_STATE_CONNECTED) ||
--
2.49.0
* [PATCH 07/31] hw/audio/ac97: skip automatic zero-init of large arrays
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (5 preceding siblings ...)
2025-06-10 12:36 ` [PATCH 06/31] chardev/char-socket: " Daniel P. Berrangé
@ 2025-06-10 12:36 ` Daniel P. Berrangé
2025-06-10 12:36 ` [PATCH 08/31] hw/audio/cs4231a: " Daniel P. Berrangé
` (26 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:36 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'read_audio' & 'write_audio' methods have a 4k byte array used
for copying data between the audio backend and device. Skip the
automatic zero-init of these arrays to eliminate the performance
overhead in the I/O hot path.
The 'tmpbuf' array will be fully initialized when reading data from
the audio backend and/or device memory.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
hw/audio/ac97.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/hw/audio/ac97.c b/hw/audio/ac97.c
index 669a0463cc..eb7a847080 100644
--- a/hw/audio/ac97.c
+++ b/hw/audio/ac97.c
@@ -886,7 +886,7 @@ static void nabm_writel(void *opaque, uint32_t addr, uint32_t val)
static int write_audio(AC97LinkState *s, AC97BusMasterRegs *r,
int max, int *stop)
{
- uint8_t tmpbuf[4096];
+ QEMU_UNINITIALIZED uint8_t tmpbuf[4096];
uint32_t addr = r->bd.addr;
uint32_t temp = r->picb << 1;
uint32_t written = 0;
@@ -959,7 +959,7 @@ static void write_bup(AC97LinkState *s, int elapsed)
static int read_audio(AC97LinkState *s, AC97BusMasterRegs *r,
int max, int *stop)
{
- uint8_t tmpbuf[4096];
+ QEMU_UNINITIALIZED uint8_t tmpbuf[4096];
uint32_t addr = r->bd.addr;
uint32_t temp = r->picb << 1;
uint32_t nread = 0;
--
2.49.0
* [PATCH 08/31] hw/audio/cs4231a: skip automatic zero-init of large arrays
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (6 preceding siblings ...)
2025-06-10 12:36 ` [PATCH 07/31] hw/audio/ac97: skip automatic zero-init of large arrays Daniel P. Berrangé
@ 2025-06-10 12:36 ` Daniel P. Berrangé
2025-06-10 12:36 ` [PATCH 09/31] hw/audio/es1370: skip automatic zero-init of large array Daniel P. Berrangé
` (25 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:36 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'cs_write_audio' method has a pair of byte arrays, one 4k in size
and one 8k, which are used in converting audio samples. Skip the
automatic zero-init of these arrays to eliminate the performance
overhead in the I/O hot path.
The 'tmpbuf' array will be fully initialized when reading a block of
data from the guest. The 'linbuf' array will be fully initialized
when converting the audio samples.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
hw/audio/cs4231a.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/hw/audio/cs4231a.c b/hw/audio/cs4231a.c
index eb9a45805b..6dfff202ff 100644
--- a/hw/audio/cs4231a.c
+++ b/hw/audio/cs4231a.c
@@ -528,7 +528,7 @@ static int cs_write_audio (CSState *s, int nchan, int dma_pos,
int dma_len, int len)
{
int temp, net;
- uint8_t tmpbuf[4096];
+ QEMU_UNINITIALIZED uint8_t tmpbuf[4096];
IsaDmaClass *k = ISADMA_GET_CLASS(s->isa_dma);
temp = len;
@@ -547,7 +547,7 @@ static int cs_write_audio (CSState *s, int nchan, int dma_pos,
copied = k->read_memory(s->isa_dma, nchan, tmpbuf, dma_pos, to_copy);
if (s->tab) {
int i;
- int16_t linbuf[4096];
+ QEMU_UNINITIALIZED int16_t linbuf[4096];
for (i = 0; i < copied; ++i)
linbuf[i] = s->tab[tmpbuf[i]];
--
2.49.0
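
A small sketch of the two-buffer conversion described above, assuming a 256-entry lookup table that maps 8-bit samples to 16-bit linear samples. demo_convert() and its parameters are hypothetical. The conversion loop writes linbuf[0..copied) and nothing beyond that range is read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#ifndef __has_attribute
# define __has_attribute(x) 0
#endif
#if __has_attribute(uninitialized)
# define QEMU_UNINITIALIZED __attribute__((uninitialized))
#else
# define QEMU_UNINITIALIZED
#endif

#define DEMO_SAMPLES 4096 /* hypothetical block size, in samples */

/* Converts 'copied' 8-bit samples through 'tab' and writes them to 'dst' */
static size_t demo_convert(const int16_t *tab, const uint8_t *src,
                           size_t copied, int16_t *dst)
{
    QEMU_UNINITIALIZED int16_t linbuf[DEMO_SAMPLES]; /* 8 KB, not zeroed */

    assert(copied <= DEMO_SAMPLES);
    for (size_t i = 0; i < copied; i++) {
        linbuf[i] = tab[src[i]];    /* entries [0, copied) are initialized */
    }
    memcpy(dst, linbuf, copied * sizeof(linbuf[0])); /* reads [0, copied) */
    return copied;
}
```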
* [PATCH 09/31] hw/audio/es1370: skip automatic zero-init of large array
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (7 preceding siblings ...)
2025-06-10 12:36 ` [PATCH 08/31] hw/audio/cs4231a: " Daniel P. Berrangé
@ 2025-06-10 12:36 ` Daniel P. Berrangé
2025-06-10 12:36 ` [PATCH 10/31] hw/audio/gus: " Daniel P. Berrangé
` (24 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:36 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'es1370_transfer_audio' method has a 4k byte array used for
copying data between the audio backend and device. Skip the automatic
zero-init of this array to eliminate the performance overhead in
the I/O hot path.
The 'tmpbuf' array will be fully initialized when reading data from
the audio backend and/or device memory.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
hw/audio/es1370.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/audio/es1370.c b/hw/audio/es1370.c
index 8efb969212..a6a32a6348 100644
--- a/hw/audio/es1370.c
+++ b/hw/audio/es1370.c
@@ -604,7 +604,7 @@ static uint64_t es1370_read(void *opaque, hwaddr addr, unsigned size)
static void es1370_transfer_audio (ES1370State *s, struct chan *d, int loop_sel,
int max, bool *irq)
{
- uint8_t tmpbuf[4096];
+ QEMU_UNINITIALIZED uint8_t tmpbuf[4096];
size_t to_transfer;
uint32_t addr = d->frame_addr;
int sc = d->scount & 0xffff;
--
2.49.0
* [PATCH 10/31] hw/audio/gus: skip automatic zero-init of large array
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (8 preceding siblings ...)
2025-06-10 12:36 ` [PATCH 09/31] hw/audio/es1370: skip automatic zero-init of large array Daniel P. Berrangé
@ 2025-06-10 12:36 ` Daniel P. Berrangé
2025-06-10 12:36 ` [PATCH 11/31] " Daniel P. Berrangé
` (23 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:36 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'GUS_read_DMA' method has a 4k byte array used for copying
data between the audio backend and device. Skip the automatic
zero-init of this array to eliminate the performance overhead in
the I/O hot path.
The 'tmpbuf' array will be fully initialized when reading data
from device memory.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
hw/audio/gus.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/audio/gus.c b/hw/audio/gus.c
index 87e8634893..c36df0240f 100644
--- a/hw/audio/gus.c
+++ b/hw/audio/gus.c
@@ -183,7 +183,7 @@ static int GUS_read_DMA (void *opaque, int nchan, int dma_pos, int dma_len)
{
GUSState *s = opaque;
IsaDmaClass *k = ISADMA_GET_CLASS(s->isa_dma);
- char tmpbuf[4096];
+ QEMU_UNINITIALIZED char tmpbuf[4096];
int pos = dma_pos, mode, left = dma_len - dma_pos;
ldebug ("read DMA %#x %d\n", dma_pos, dma_len);
--
2.49.0
* [PATCH 11/31] hw/audio/gus: skip automatic zero-init of large array
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (9 preceding siblings ...)
2025-06-10 12:36 ` [PATCH 10/31] hw/audio/gus: " Daniel P. Berrangé
@ 2025-06-10 12:36 ` Daniel P. Berrangé
2025-06-10 14:23 ` Daniel P. Berrangé
2025-06-10 12:36 ` [PATCH 12/31] hw/audio/sb16: " Daniel P. Berrangé
` (22 subsequent siblings)
33 siblings, 1 reply; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:36 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'mv88w8618_audio_callback' method has a 4k byte array used for
copying data between the audio backend and device. Skip the automatic
zero-init of this array to eliminate the performance overhead in
the I/O hot path.
The 'buf' array will be fully initialized when reading data from
device memory.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
hw/audio/marvell_88w8618.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/audio/marvell_88w8618.c b/hw/audio/marvell_88w8618.c
index 6d3ebbb0c8..c5c79d083a 100644
--- a/hw/audio/marvell_88w8618.c
+++ b/hw/audio/marvell_88w8618.c
@@ -66,7 +66,7 @@ static void mv88w8618_audio_callback(void *opaque, int free_out, int free_in)
{
mv88w8618_audio_state *s = opaque;
int16_t *codec_buffer;
- int8_t buf[4096];
+ QEMU_UNINITIALIZED int8_t buf[4096];
int8_t *mem_buffer;
int pos, block_size;
--
2.49.0
* Re: [PATCH 11/31] hw/audio/gus: skip automatic zero-init of large array
2025-06-10 12:36 ` [PATCH 11/31] " Daniel P. Berrangé
@ 2025-06-10 14:23 ` Daniel P. Berrangé
0 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 14:23 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf
Mistake in $SUBJECT - should be 'hw/audio/marvell_88w8618:' prefix
On Tue, Jun 10, 2025 at 01:36:49PM +0100, Daniel P. Berrangé wrote:
> The 'mv88w8618_audio_callback' method has a 4k byte array used for
> copying data between the audio backend and device. Skip the automatic
> zero-init of this array to eliminate the performance overhead in
> the I/O hot path.
>
> The 'buf' array will be fully initialized when reading data from
> device memory.
>
> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> ---
> hw/audio/marvell_88w8618.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/hw/audio/marvell_88w8618.c b/hw/audio/marvell_88w8618.c
> index 6d3ebbb0c8..c5c79d083a 100644
> --- a/hw/audio/marvell_88w8618.c
> +++ b/hw/audio/marvell_88w8618.c
> @@ -66,7 +66,7 @@ static void mv88w8618_audio_callback(void *opaque, int free_out, int free_in)
> {
> mv88w8618_audio_state *s = opaque;
> int16_t *codec_buffer;
> - int8_t buf[4096];
> + QEMU_UNINITIALIZED int8_t buf[4096];
> int8_t *mem_buffer;
> int pos, block_size;
>
> --
> 2.49.0
>
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
* [PATCH 12/31] hw/audio/sb16: skip automatic zero-init of large array
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (10 preceding siblings ...)
2025-06-10 12:36 ` [PATCH 11/31] " Daniel P. Berrangé
@ 2025-06-10 12:36 ` Daniel P. Berrangé
2025-06-10 12:36 ` [PATCH 13/31] hw/audio/via-ac97: " Daniel P. Berrangé
` (21 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:36 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'write_audio' method has a 4k byte array used for copying data
between the audio backend and device. Skip the automatic zero-init
of this array to eliminate the performance overhead in the I/O hot
path.
The 'tmpbuf' array will be fully initialized when reading data from
device memory.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
hw/audio/sb16.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/audio/sb16.c b/hw/audio/sb16.c
index 19fd3b9020..bac64118fe 100644
--- a/hw/audio/sb16.c
+++ b/hw/audio/sb16.c
@@ -1181,7 +1181,7 @@ static int write_audio (SB16State *s, int nchan, int dma_pos,
IsaDma *isa_dma = nchan == s->dma ? s->isa_dma : s->isa_hdma;
IsaDmaClass *k = ISADMA_GET_CLASS(isa_dma);
int temp, net;
- uint8_t tmpbuf[4096];
+ QEMU_UNINITIALIZED uint8_t tmpbuf[4096];
temp = len;
net = 0;
--
2.49.0
* [PATCH 13/31] hw/audio/via-ac97: skip automatic zero-init of large array
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (11 preceding siblings ...)
2025-06-10 12:36 ` [PATCH 12/31] hw/audio/sb16: " Daniel P. Berrangé
@ 2025-06-10 12:36 ` Daniel P. Berrangé
2025-06-10 12:36 ` [PATCH 14/31] hw/char/sclpconsole-lm: " Daniel P. Berrangé
` (20 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:36 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'out_cb' method has a 4k byte array used for copying data
between the audio backend and device. Skip the automatic zero-init
of this array to eliminate the performance overhead in the I/O hot
path.
The 'tmpbuf' array will be fully initialized when reading data from
device memory.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
hw/audio/via-ac97.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/audio/via-ac97.c b/hw/audio/via-ac97.c
index 1e0a5c7398..d5231e1cf2 100644
--- a/hw/audio/via-ac97.c
+++ b/hw/audio/via-ac97.c
@@ -175,7 +175,7 @@ static void out_cb(void *opaque, int avail)
ViaAC97SGDChannel *c = &s->aur;
int temp, to_copy, copied;
bool stop = false;
- uint8_t tmpbuf[4096];
+ QEMU_UNINITIALIZED uint8_t tmpbuf[4096];
if (c->stat & STAT_PAUSED) {
return;
--
2.49.0
* [PATCH 14/31] hw/char/sclpconsole-lm: skip automatic zero-init of large array
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (12 preceding siblings ...)
2025-06-10 12:36 ` [PATCH 13/31] hw/audio/via-ac97: " Daniel P. Berrangé
@ 2025-06-10 12:36 ` Daniel P. Berrangé
2025-06-10 12:36 ` [PATCH 15/31] hw/dma/xlnx_csu_dma: " Daniel P. Berrangé
` (19 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:36 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'process_mdb' method has a 4k byte array used for copying data
between the guest and the chardev backend. Skip the automatic zero-init
of this array to eliminate the performance overhead in the I/O hot
path.
The 'buffer' array will be selectively initialized when data is converted
between EBCDIC and ASCII.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
hw/char/sclpconsole-lm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/char/sclpconsole-lm.c b/hw/char/sclpconsole-lm.c
index e9580aacba..3e40d5e434 100644
--- a/hw/char/sclpconsole-lm.c
+++ b/hw/char/sclpconsole-lm.c
@@ -214,7 +214,7 @@ static int process_mdb(SCLPEvent *event, MDBO *mdbo)
{
int rc;
int len;
- uint8_t buffer[SIZE_BUFFER];
+ QEMU_UNINITIALIZED uint8_t buffer[SIZE_BUFFER];
len = be16_to_cpu(mdbo->length);
len -= sizeof(mdbo->length) + sizeof(mdbo->type)
--
2.49.0
* [PATCH 15/31] hw/dma/xlnx_csu_dma: skip automatic zero-init of large array
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (13 preceding siblings ...)
2025-06-10 12:36 ` [PATCH 14/31] hw/char/sclpconsole-lm: " Daniel P. Berrangé
@ 2025-06-10 12:36 ` Daniel P. Berrangé
2025-06-10 12:36 ` [PATCH 16/31] hw/display/vmware_vga: skip automatic zero-init of large struct Daniel P. Berrangé
` (18 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:36 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'xlnx_csu_dma_src_notify' method has a 4k byte array used for
copying DMA data. Skip the automatic zero-init of this array to
eliminate the performance overhead in the I/O hot path.
The 'buf' array will be fully initialized when data is copied.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
hw/dma/xlnx_csu_dma.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/dma/xlnx_csu_dma.c b/hw/dma/xlnx_csu_dma.c
index 3db3904d83..d8c7da1a50 100644
--- a/hw/dma/xlnx_csu_dma.c
+++ b/hw/dma/xlnx_csu_dma.c
@@ -287,7 +287,7 @@ static uint32_t xlnx_csu_dma_advance(XlnxCSUDMA *s, uint32_t len)
static void xlnx_csu_dma_src_notify(void *opaque)
{
XlnxCSUDMA *s = XLNX_CSU_DMA(opaque);
- unsigned char buf[4 * 1024];
+ QEMU_UNINITIALIZED unsigned char buf[4 * 1024];
size_t rlen = 0;
ptimer_transaction_begin(s->src_timer);
--
2.49.0
* [PATCH 16/31] hw/display/vmware_vga: skip automatic zero-init of large struct
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (14 preceding siblings ...)
2025-06-10 12:36 ` [PATCH 15/31] hw/dma/xlnx_csu_dma: " Daniel P. Berrangé
@ 2025-06-10 12:36 ` Daniel P. Berrangé
2025-06-10 12:36 ` [PATCH 17/31] hw/hyperv/syndbg: skip automatic zero-init of large array Daniel P. Berrangé
` (17 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:36 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'vmsvga_fifo_run' method has a struct which is a little over 20k
in size, used for holding image data for cursor changes. Skip the
automatic zero-init of this struct to eliminate the performance
overhead in the I/O hot path.
The cursor variable will be fully initialized only when processing
a cursor definition message from the guest.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
hw/display/vmware_vga.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/display/vmware_vga.c b/hw/display/vmware_vga.c
index 544bb65320..bc1a8ed466 100644
--- a/hw/display/vmware_vga.c
+++ b/hw/display/vmware_vga.c
@@ -618,7 +618,7 @@ static void vmsvga_fifo_run(struct vmsvga_state_s *s)
uint32_t cmd, colour;
int args, len, maxloop = 1024;
int x, y, dx, dy, width, height;
- struct vmsvga_cursor_definition_s cursor;
+ QEMU_UNINITIALIZED struct vmsvga_cursor_definition_s cursor;
uint32_t cmd_start;
len = vmsvga_fifo_length(s);
--
2.49.0
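
A hedged sketch of the conditional case described in this patch: a large struct is declared at function scope but is written and read in only one command branch. DemoCursor, the DEMO_CMD_* values and demo_fifo_run() are hypothetical and far simpler than the real FIFO handler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#ifndef __has_attribute
# define __has_attribute(x) 0
#endif
#if __has_attribute(uninitialized)
# define QEMU_UNINITIALIZED __attribute__((uninitialized))
#else
# define QEMU_UNINITIALIZED
#endif

#define DEMO_CURSOR_PIXELS 4096 /* hypothetical maximum cursor area */

/* Hypothetical cursor definition: 16 KB of image data plus dimensions */
typedef struct DemoCursor {
    int width;
    int height;
    uint32_t image[DEMO_CURSOR_PIXELS];
} DemoCursor;

enum { DEMO_CMD_NOP = 0, DEMO_CMD_DEFINE_CURSOR = 1 };

/* Returns the number of cursor pixels defined by the command, or 0 */
static int demo_fifo_run(int cmd, int w, int h)
{
    QEMU_UNINITIALIZED DemoCursor cursor; /* untouched for most commands */

    switch (cmd) {
    case DEMO_CMD_DEFINE_CURSOR:
        assert(w >= 0 && h >= 0);
        assert((size_t)w * (size_t)h <= DEMO_CURSOR_PIXELS);
        cursor.width = w;
        cursor.height = h;
        memset(cursor.image, 0xff,
               (size_t)w * (size_t)h * sizeof(cursor.image[0]));
        return cursor.width * cursor.height; /* reads only fields set above */
    default:
        return 0;                            /* cursor is never read here */
    }
}
```

The review burden in this shape of code is checking that no other branch reads the struct before the defining branch has run.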
* [PATCH 17/31] hw/hyperv/syndbg: skip automatic zero-init of large array
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (15 preceding siblings ...)
2025-06-10 12:36 ` [PATCH 16/31] hw/display/vmware_vga: skip automatic zero-init of large struct Daniel P. Berrangé
@ 2025-06-10 12:36 ` Daniel P. Berrangé
2025-06-10 12:36 ` [PATCH 18/31] hw/misc/aspeed_hace: " Daniel P. Berrangé
` (16 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:36 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'handle_recv_msg' method has a 4k byte array used for copying
data between the network socket and guest memory. Skip the automatic
zero-init of this array to eliminate the performance overhead in the
I/O hot path.
The 'data_buf' array will be fully initialized when data is read
off the network socket.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
hw/hyperv/syndbg.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/hyperv/syndbg.c b/hw/hyperv/syndbg.c
index 8b8a14750d..ac7e15f6f1 100644
--- a/hw/hyperv/syndbg.c
+++ b/hw/hyperv/syndbg.c
@@ -192,7 +192,7 @@ static uint16_t handle_recv_msg(HvSynDbg *syndbg, uint64_t outgpa,
{
uint16_t ret;
g_assert(MSG_BUFSZ >= qemu_target_page_size());
- uint8_t data_buf[MSG_BUFSZ];
+ QEMU_UNINITIALIZED uint8_t data_buf[MSG_BUFSZ];
hwaddr out_len;
void *out_data;
ssize_t recv_byte_count;
--
2.49.0
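Illustrative note: a receive buffer of this kind is only valid up to the byte count returned by the read call, so the safety argument depends on never reading past that count. The following is a minimal POSIX sketch of that discipline. demo_recv_copy() and demo_roundtrip() are hypothetical helpers, and a pipe stands in for the network socket.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

#define DEMO_BUFSZ 4096

/*
 * Read from fd into a stack buffer and copy only the bytes that were
 * actually received. In QEMU the buffer would carry QEMU_UNINITIALIZED;
 * it is left as a plain local here so the sketch builds anywhere.
 */
static ssize_t demo_recv_copy(int fd, uint8_t *out, size_t out_len)
{
    uint8_t data_buf[DEMO_BUFSZ];
    ssize_t n = read(fd, data_buf, sizeof(data_buf));
    size_t valid;

    if (n <= 0) {
        return n; /* error or EOF: data_buf holds nothing valid */
    }
    valid = (size_t)n < out_len ? (size_t)n : out_len;
    memcpy(out, data_buf, valid); /* never touch bytes beyond 'valid' */
    return (ssize_t)valid;
}

/* Demo helper: push msg through a pipe and receive it back. */
static ssize_t demo_roundtrip(const char *msg, uint8_t *out, size_t out_len)
{
    int fds[2];
    size_t len = strlen(msg);
    ssize_t n;

    if (pipe(fds) != 0) {
        return -1;
    }
    if (write(fds[1], msg, len) != (ssize_t)len) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    close(fds[1]);
    n = demo_recv_copy(fds[0], out, out_len);
    close(fds[0]);
    return n;
}
```

Sending "hello" through demo_roundtrip() yields 5 valid bytes. The remaining bytes of the stack buffer are indeterminate and are never read.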
* [PATCH 18/31] hw/misc/aspeed_hace: skip automatic zero-init of large array
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (16 preceding siblings ...)
2025-06-10 12:36 ` [PATCH 17/31] hw/hyperv/syndbg: skip automatic zero-init of large array Daniel P. Berrangé
@ 2025-06-10 12:36 ` Daniel P. Berrangé
2025-06-10 12:36 ` [PATCH 19/31] hw/net/rtl8139: " Daniel P. Berrangé
` (15 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:36 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'do_hash_operation' method has a 256 element iovec array used for
holding pointers to data that is to be hashed. Skip the automatic
zero-init of this array to eliminate the performance overhead in the
I/O hot path.
The 'iovec' array will be selectively initialized based on data that
needs to be hashed.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
hw/misc/aspeed_hace.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/misc/aspeed_hace.c b/hw/misc/aspeed_hace.c
index 8924a30eff..726368fbbc 100644
--- a/hw/misc/aspeed_hace.c
+++ b/hw/misc/aspeed_hace.c
@@ -419,7 +419,7 @@ static void hash_execute_acc_mode(AspeedHACEState *s, int algo,
static void do_hash_operation(AspeedHACEState *s, int algo, bool sg_mode,
bool acc_mode)
{
- struct iovec iov[ASPEED_HACE_MAX_SG];
+ QEMU_UNINITIALIZED struct iovec iov[ASPEED_HACE_MAX_SG];
bool acc_final_request = false;
int iov_idx = -1;
--
2.49.0
* [PATCH 19/31] hw/net/rtl8139: skip automatic zero-init of large array
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (17 preceding siblings ...)
2025-06-10 12:36 ` [PATCH 18/31] hw/misc/aspeed_hace: " Daniel P. Berrangé
@ 2025-06-10 12:36 ` Daniel P. Berrangé
2025-06-10 12:36 ` [PATCH 20/31] hw/net/tulip: " Daniel P. Berrangé
` (14 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:36 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'rtl8139_transmit_one' method has an 8k byte array used for
copying data between guest and host. Skip the automatic zero-init
of this array to eliminate the performance overhead in the I/O
hot path.
The 'txbuffer' will be fully initialized when reading PCI DMA
buffers.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
hw/net/rtl8139.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/net/rtl8139.c b/hw/net/rtl8139.c
index 15b8f7501a..654a087d80 100644
--- a/hw/net/rtl8139.c
+++ b/hw/net/rtl8139.c
@@ -1816,7 +1816,7 @@ static int rtl8139_transmit_one(RTL8139State *s, int descriptor)
PCIDevice *d = PCI_DEVICE(s);
int txsize = s->TxStatus[descriptor] & 0x1fff;
- uint8_t txbuffer[0x2000];
+ QEMU_UNINITIALIZED uint8_t txbuffer[0x2000];
DPRINTF("+++ transmit reading %d bytes from host memory at 0x%08x\n",
txsize, s->TxAddr[descriptor]);
--
2.49.0
* [PATCH 20/31] hw/net/tulip: skip automatic zero-init of large array
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (18 preceding siblings ...)
2025-06-10 12:36 ` [PATCH 19/31] hw/net/rtl8139: " Daniel P. Berrangé
@ 2025-06-10 12:36 ` Daniel P. Berrangé
2025-06-10 12:36 ` [PATCH 21/31] hw/net/virtio-net: skip automatic zero-init of large arrays Daniel P. Berrangé
` (13 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:36 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'tulip_setup_frame' method has a 4k byte array used for copying
DMA data from the device. Skip the automatic zero-init of this array
to eliminate the performance overhead in the I/O hot path.
The 'buf' array will be fully initialized when reading data from the
device.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
hw/net/tulip.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/net/tulip.c b/hw/net/tulip.c
index 63fe513458..319af906c8 100644
--- a/hw/net/tulip.c
+++ b/hw/net/tulip.c
@@ -629,7 +629,7 @@ static void tulip_setup_filter_addr(TULIPState *s, uint8_t *buf, int n)
static void tulip_setup_frame(TULIPState *s,
struct tulip_descriptor *desc)
{
- uint8_t buf[4096];
+ QEMU_UNINITIALIZED uint8_t buf[4096];
int len = (desc->control >> TDES1_BUF1_SIZE_SHIFT) & TDES1_BUF1_SIZE_MASK;
int i;
--
2.49.0
* [PATCH 21/31] hw/net/virtio-net: skip automatic zero-init of large arrays
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (19 preceding siblings ...)
2025-06-10 12:36 ` [PATCH 20/31] hw/net/tulip: " Daniel P. Berrangé
@ 2025-06-10 12:36 ` Daniel P. Berrangé
2025-06-10 12:37 ` [PATCH 22/31] hw/net/xgamc: skip automatic zero-init of large array Daniel P. Berrangé
` (12 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:36 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'virtio_net_receive_rcu' method has three arrays with
VIRTQUEUE_MAX_SIZE elements, which are approximately 32k in
size, used for copying data between guest and host. Skip the
automatic zero-init of these arrays to eliminate the
performance overhead in the I/O hot path.
The three arrays will be selectively initialized as required
when processing network buffers.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
hw/net/virtio-net.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 221252e00a..eb93607b8c 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -1911,9 +1911,9 @@ static ssize_t virtio_net_receive_rcu(NetClientState *nc, const uint8_t *buf,
VirtIONet *n = qemu_get_nic_opaque(nc);
VirtIONetQueue *q;
VirtIODevice *vdev = VIRTIO_DEVICE(n);
- VirtQueueElement *elems[VIRTQUEUE_MAX_SIZE];
- size_t lens[VIRTQUEUE_MAX_SIZE];
- struct iovec mhdr_sg[VIRTQUEUE_MAX_SIZE];
+ QEMU_UNINITIALIZED VirtQueueElement *elems[VIRTQUEUE_MAX_SIZE];
+ QEMU_UNINITIALIZED size_t lens[VIRTQUEUE_MAX_SIZE];
+ QEMU_UNINITIALIZED struct iovec mhdr_sg[VIRTQUEUE_MAX_SIZE];
struct virtio_net_hdr_v1_hash extra_hdr;
unsigned mhdr_cnt = 0;
size_t offset, i, guest_offset, j;
--
2.49.0
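Illustrative note: "selectively initialized" arrays stay safe only if every read is bounded by a count of the elements written so far. The following is a minimal sketch of that invariant. demo_sum_lens() and DEMO_MAX are hypothetical names and do not come from virtio-net.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX 1024

/*
 * Collect the non-zero lengths from src into a scratch array and sum
 * them. Only lens[0..count) is ever written, and only that range is
 * ever read, so the rest of the array may stay uninitialized. In QEMU
 * the scratch array would carry QEMU_UNINITIALIZED.
 */
static size_t demo_sum_lens(const size_t *src, size_t n)
{
    size_t lens[DEMO_MAX];
    size_t count = 0, total = 0, i;

    for (i = 0; i < n && count < DEMO_MAX; i++) {
        if (src[i] == 0) {
            continue; /* skipped entries leave no gap in lens[] */
        }
        lens[count++] = src[i];
    }
    for (i = 0; i < count; i++) {
        total += lens[i]; /* reads are bounded by count, not DEMO_MAX */
    }
    return total;
}
```

The check to make when auditing such code is that every loop over the array is bounded by the running count, not by the array's capacity.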
* [PATCH 22/31] hw/net/xgamc: skip automatic zero-init of large array
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (20 preceding siblings ...)
2025-06-10 12:36 ` [PATCH 21/31] hw/net/virtio-net: skip automatic zero-init of large arrays Daniel P. Berrangé
@ 2025-06-10 12:37 ` Daniel P. Berrangé
2025-06-10 12:37 ` [PATCH 23/31] hw/nvme/ctrl: skip automatic zero-init of large arrays Daniel P. Berrangé
` (11 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:37 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'xgmac_enet_send' method has an 8k byte array used for copying
data between guest and host. Skip the automatic zero-init of this
array to eliminate the performance overhead in the I/O hot path.
The 'frame' buffer will be fully initialized when reading guest
memory to fetch the data to send.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
hw/net/xgmac.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/net/xgmac.c b/hw/net/xgmac.c
index 9c87c4e70f..d45f872467 100644
--- a/hw/net/xgmac.c
+++ b/hw/net/xgmac.c
@@ -207,7 +207,7 @@ static void xgmac_enet_send(XgmacState *s)
struct desc bd;
int frame_size;
int len;
- uint8_t frame[8192];
+ QEMU_UNINITIALIZED uint8_t frame[8192];
uint8_t *ptr;
ptr = frame;
--
2.49.0
* [PATCH 23/31] hw/nvme/ctrl: skip automatic zero-init of large arrays
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (21 preceding siblings ...)
2025-06-10 12:37 ` [PATCH 22/31] hw/net/xgamc: skip automatic zero-init of large array Daniel P. Berrangé
@ 2025-06-10 12:37 ` Daniel P. Berrangé
2025-06-11 8:55 ` Klaus Jensen
2025-06-10 12:37 ` [PATCH 24/31] hw/ppc/pnv_occ: skip automatic zero-init of large struct Daniel P. Berrangé
` (10 subsequent siblings)
33 siblings, 1 reply; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:37 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'nvme_map_sgl' method has a 256 element array used for copying
data from the device. Skip the automatic zero-init of this array
to eliminate the performance overhead in the I/O hot path.
The 'segment' array will be fully initialized when reading data from
the device.
The 'nvme_changed_nslist' method has a 4k byte array that is manually
initialized with memset(). The compiler ought to be intelligent
enough to turn the memset() into a static initialization operation,
and thus not duplicate the automatic zero-init. Replacing memset()
with '{}' makes it unambiguous that the array is statically initialized.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
hw/nvme/ctrl.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index fd935507bc..220002830d 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -1057,7 +1057,8 @@ static uint16_t nvme_map_sgl(NvmeCtrl *n, NvmeSg *sg, NvmeSglDescriptor sgl,
*/
#define SEG_CHUNK_SIZE 256
- NvmeSglDescriptor segment[SEG_CHUNK_SIZE], *sgld, *last_sgld;
+ QEMU_UNINITIALIZED NvmeSglDescriptor segment[SEG_CHUNK_SIZE];
+ NvmeSglDescriptor *sgld, *last_sgld;
uint64_t nsgld;
uint32_t seg_len;
uint16_t status;
@@ -5128,7 +5129,7 @@ static uint16_t nvme_error_info(NvmeCtrl *n, uint8_t rae, uint32_t buf_len,
static uint16_t nvme_changed_nslist(NvmeCtrl *n, uint8_t rae, uint32_t buf_len,
uint64_t off, NvmeRequest *req)
{
- uint32_t nslist[1024];
+ uint32_t nslist[1024] = {};
uint32_t trans_len;
int i = 0;
uint32_t nsid;
@@ -5138,7 +5139,6 @@ static uint16_t nvme_changed_nslist(NvmeCtrl *n, uint8_t rae, uint32_t buf_len,
return NVME_INVALID_FIELD | NVME_DNR;
}
- memset(nslist, 0x0, sizeof(nslist));
trans_len = MIN(sizeof(nslist) - off, buf_len);
while ((nsid = find_first_bit(n->changed_nsids, NVME_CHANGED_NSID_SIZE)) !=
--
2.49.0
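Illustrative note: the equivalence that the nslist change depends on can be checked directly. This minimal sketch compares an array zeroed with memset() against one zeroed by an initializer. It uses '{0}' for portability, on the assumption that older compilers may be in use: the empty '{}' form used in the patch is standard only from C23 and is otherwise a GNU extension. The demo_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_NSLIST_LEN 1024

/* Return 1 if every element of p[0..n) is zero, else 0. */
static int demo_all_zero(const uint32_t *p, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (p[i] != 0) {
            return 0;
        }
    }
    return 1;
}

/* Return 1 if both zeroing styles produce byte-identical arrays. */
static int demo_init_styles_match(void)
{
    uint32_t by_memset[DEMO_NSLIST_LEN];
    uint32_t by_init[DEMO_NSLIST_LEN] = {0}; /* zeroed at declaration */

    memset(by_memset, 0, sizeof(by_memset)); /* zeroed as a separate step */

    return memcmp(by_memset, by_init, sizeof(by_init)) == 0 &&
           demo_all_zero(by_init, DEMO_NSLIST_LEN);
}
```

Both forms leave the array fully zeroed. The initializer form ties the zeroing to the declaration, so the intent is visible at the point where the variable is introduced.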
* Re: [PATCH 23/31] hw/nvme/ctrl: skip automatic zero-init of large arrays
2025-06-10 12:37 ` [PATCH 23/31] hw/nvme/ctrl: skip automatic zero-init of large arrays Daniel P. Berrangé
@ 2025-06-11 8:55 ` Klaus Jensen
0 siblings, 0 replies; 42+ messages in thread
From: Klaus Jensen @ 2025-06-11 8:55 UTC (permalink / raw)
To: Daniel P. Berrangé
Cc: qemu-devel, Philippe Mathieu-Daudé, Michael S. Tsirkin,
Peter Maydell, qemu-block, Stefan Hajnoczi, Kevin Wolf
On Jun 10 13:37, Daniel P. Berrangé wrote:
> The 'nvme_map_sgl' method has a 256 element array used for copying
> data from the device. Skip the automatic zero-init of this array
> to eliminate the performance overhead in the I/O hot path.
>
> The 'segment' array will be fully initialized when reading data from
> the device.
>
> The 'nvme_changed_nslist' method has a 4k byte array that is manually
> initialized with memset(). The compiler ought to be intelligent
> enough to turn the memset() into a static initialization operation,
> and thus not duplicate the automatic zero-init. Replacing memset()
> with '{}' makes it unambiguous that the array is statically initialized.
>
> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> ---
> hw/nvme/ctrl.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
> index fd935507bc..220002830d 100644
> --- a/hw/nvme/ctrl.c
> +++ b/hw/nvme/ctrl.c
> @@ -1057,7 +1057,8 @@ static uint16_t nvme_map_sgl(NvmeCtrl *n, NvmeSg *sg, NvmeSglDescriptor sgl,
> */
> #define SEG_CHUNK_SIZE 256
>
> - NvmeSglDescriptor segment[SEG_CHUNK_SIZE], *sgld, *last_sgld;
> + QEMU_UNINITIALIZED NvmeSglDescriptor segment[SEG_CHUNK_SIZE];
> + NvmeSglDescriptor *sgld, *last_sgld;
> uint64_t nsgld;
> uint32_t seg_len;
> uint16_t status;
> @@ -5128,7 +5129,7 @@ static uint16_t nvme_error_info(NvmeCtrl *n, uint8_t rae, uint32_t buf_len,
> static uint16_t nvme_changed_nslist(NvmeCtrl *n, uint8_t rae, uint32_t buf_len,
> uint64_t off, NvmeRequest *req)
> {
> - uint32_t nslist[1024];
> + uint32_t nslist[1024] = {};
> uint32_t trans_len;
> int i = 0;
> uint32_t nsid;
> @@ -5138,7 +5139,6 @@ static uint16_t nvme_changed_nslist(NvmeCtrl *n, uint8_t rae, uint32_t buf_len,
> return NVME_INVALID_FIELD | NVME_DNR;
> }
>
> - memset(nslist, 0x0, sizeof(nslist));
> trans_len = MIN(sizeof(nslist) - off, buf_len);
>
> while ((nsid = find_first_bit(n->changed_nsids, NVME_CHANGED_NSID_SIZE)) !=
> --
> 2.49.0
>
>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
* [PATCH 24/31] hw/ppc/pnv_occ: skip automatic zero-init of large struct
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (22 preceding siblings ...)
2025-06-10 12:37 ` [PATCH 23/31] hw/nvme/ctrl: skip automatic zero-init of large arrays Daniel P. Berrangé
@ 2025-06-10 12:37 ` Daniel P. Berrangé
2025-06-11 9:09 ` Harsh Prateek Bora
2025-06-10 12:37 ` [PATCH 25/31] hw/ppc/spapr_tpm_proxy: skip automatic zero-init of large arrays Daniel P. Berrangé
` (9 subsequent siblings)
33 siblings, 1 reply; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:37 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'occ_model_tick' method has a 12k struct used for copying
data between guest and host. Skip the automatic zero-init of this
struct to eliminate the performance overhead in the I/O hot path.
The 'dynamic_data' buffer will be fully initialized when reading
data from the guest.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
hw/ppc/pnv_occ.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/ppc/pnv_occ.c b/hw/ppc/pnv_occ.c
index fa6f31cb8d..24b789c191 100644
--- a/hw/ppc/pnv_occ.c
+++ b/hw/ppc/pnv_occ.c
@@ -789,7 +789,7 @@ static bool occ_opal_process_command(PnvOCC *occ,
static bool occ_model_tick(PnvOCC *occ)
{
- struct occ_dynamic_data dynamic_data;
+ QEMU_UNINITIALIZED struct occ_dynamic_data dynamic_data;
if (!occ_read_dynamic_data(occ, &dynamic_data, NULL)) {
/* Can't move OCC state field to safe because we can't map it! */
--
2.49.0
* Re: [PATCH 24/31] hw/ppc/pnv_occ: skip automatic zero-init of large struct
2025-06-10 12:37 ` [PATCH 24/31] hw/ppc/pnv_occ: skip automatic zero-init of large struct Daniel P. Berrangé
@ 2025-06-11 9:09 ` Harsh Prateek Bora
0 siblings, 0 replies; 42+ messages in thread
From: Harsh Prateek Bora @ 2025-06-11 9:09 UTC (permalink / raw)
To: Daniel P. Berrangé, qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf
On 6/10/25 18:07, Daniel P. Berrangé wrote:
> The 'occ_model_tick' method has a 12k struct used for copying
> data between guest and host. Skip the automatic zero-init of this
> struct to eliminate the performance overhead in the I/O hot path.
>
> The 'dynamic_data' buffer will be fully initialized when reading
> data from the guest.
>
> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> ---
> hw/ppc/pnv_occ.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/hw/ppc/pnv_occ.c b/hw/ppc/pnv_occ.c
> index fa6f31cb8d..24b789c191 100644
> --- a/hw/ppc/pnv_occ.c
> +++ b/hw/ppc/pnv_occ.c
> @@ -789,7 +789,7 @@ static bool occ_opal_process_command(PnvOCC *occ,
>
> static bool occ_model_tick(PnvOCC *occ)
> {
> - struct occ_dynamic_data dynamic_data;
> + QEMU_UNINITIALIZED struct occ_dynamic_data dynamic_data;
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
>
> if (!occ_read_dynamic_data(occ, &dynamic_data, NULL)) {
> /* Can't move OCC state field to safe because we can't map it! */
* [PATCH 25/31] hw/ppc/spapr_tpm_proxy: skip automatic zero-init of large arrays
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (23 preceding siblings ...)
2025-06-10 12:37 ` [PATCH 24/31] hw/ppc/pnv_occ: skip automatic zero-init of large struct Daniel P. Berrangé
@ 2025-06-10 12:37 ` Daniel P. Berrangé
2025-06-11 9:20 ` Harsh Prateek Bora
2025-06-10 12:37 ` [PATCH 26/31] hw/usb/hcd-ohci: skip automatic zero-init of large array Daniel P. Berrangé
` (8 subsequent siblings)
33 siblings, 1 reply; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:37 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'tpm_execute' method has a pair of 4k arrays used for copying
data between guest and host. Skip the automatic zero-init of these
arrays to eliminate the performance overhead in the I/O hot path.
The two arrays will be fully initialized when reading data from
guest memory or reading data from the proxy FD.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
hw/ppc/spapr_tpm_proxy.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/hw/ppc/spapr_tpm_proxy.c b/hw/ppc/spapr_tpm_proxy.c
index 862eeaa50a..1297b3ad56 100644
--- a/hw/ppc/spapr_tpm_proxy.c
+++ b/hw/ppc/spapr_tpm_proxy.c
@@ -41,8 +41,8 @@ static ssize_t tpm_execute(SpaprTpmProxy *tpm_proxy, target_ulong *args)
target_ulong data_in_size = args[2];
uint64_t data_out = ppc64_phys_to_real(args[3]);
target_ulong data_out_size = args[4];
- uint8_t buf_in[TPM_SPAPR_BUFSIZE];
- uint8_t buf_out[TPM_SPAPR_BUFSIZE];
+ QEMU_UNINITIALIZED uint8_t buf_in[TPM_SPAPR_BUFSIZE];
+ QEMU_UNINITIALIZED uint8_t buf_out[TPM_SPAPR_BUFSIZE];
ssize_t ret;
trace_spapr_tpm_execute(data_in, data_in_size, data_out, data_out_size);
--
2.49.0
* Re: [PATCH 25/31] hw/ppc/spapr_tpm_proxy: skip automatic zero-init of large arrays
2025-06-10 12:37 ` [PATCH 25/31] hw/ppc/spapr_tpm_proxy: skip automatic zero-init of large arrays Daniel P. Berrangé
@ 2025-06-11 9:20 ` Harsh Prateek Bora
0 siblings, 0 replies; 42+ messages in thread
From: Harsh Prateek Bora @ 2025-06-11 9:20 UTC (permalink / raw)
To: Daniel P. Berrangé, qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf
On 6/10/25 18:07, Daniel P. Berrangé wrote:
> The 'tpm_execute' method has a pair of 4k arrays used for copying
> data between guest and host. Skip the automatic zero-init of these
> arrays to eliminate the performance overhead in the I/O hot path.
>
> The two arrays will be fully initialized when reading data from
> guest memory or reading data from the proxy FD.
>
> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> ---
> hw/ppc/spapr_tpm_proxy.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/hw/ppc/spapr_tpm_proxy.c b/hw/ppc/spapr_tpm_proxy.c
> index 862eeaa50a..1297b3ad56 100644
> --- a/hw/ppc/spapr_tpm_proxy.c
> +++ b/hw/ppc/spapr_tpm_proxy.c
> @@ -41,8 +41,8 @@ static ssize_t tpm_execute(SpaprTpmProxy *tpm_proxy, target_ulong *args)
> target_ulong data_in_size = args[2];
> uint64_t data_out = ppc64_phys_to_real(args[3]);
> target_ulong data_out_size = args[4];
> - uint8_t buf_in[TPM_SPAPR_BUFSIZE];
> - uint8_t buf_out[TPM_SPAPR_BUFSIZE];
> + QEMU_UNINITIALIZED uint8_t buf_in[TPM_SPAPR_BUFSIZE];
> + QEMU_UNINITIALIZED uint8_t buf_out[TPM_SPAPR_BUFSIZE];
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
> ssize_t ret;
>
> trace_spapr_tpm_execute(data_in, data_in_size, data_out, data_out_size);
* [PATCH 26/31] hw/usb/hcd-ohci: skip automatic zero-init of large array
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (24 preceding siblings ...)
2025-06-10 12:37 ` [PATCH 25/31] hw/ppc/spapr_tpm_proxy: skip automatic zero-init of large arrays Daniel P. Berrangé
@ 2025-06-10 12:37 ` Daniel P. Berrangé
2025-06-10 12:37 ` [PATCH 27/31] hw/scsi/lsi53c895a: " Daniel P. Berrangé
` (7 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:37 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'ohci_service_iso_td' method has an 8k byte array used for copying
data between guest and host. Skip the automatic zero-init of this
array to eliminate the performance overhead in the I/O hot path.
The 'buf' array will be fully initialized when reading data from guest
memory.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
hw/usb/hcd-ohci.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/usb/hcd-ohci.c b/hw/usb/hcd-ohci.c
index 71b54914d3..72a9f9f474 100644
--- a/hw/usb/hcd-ohci.c
+++ b/hw/usb/hcd-ohci.c
@@ -577,7 +577,7 @@ static int ohci_service_iso_td(OHCIState *ohci, struct ohci_ed *ed)
USBDevice *dev;
USBEndpoint *ep;
USBPacket *pkt;
- uint8_t buf[8192];
+ QEMU_UNINITIALIZED uint8_t buf[8192];
bool int_req;
struct ohci_iso_td iso_td;
uint32_t addr;
--
2.49.0
* [PATCH 27/31] hw/scsi/lsi53c895a: skip automatic zero-init of large array
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (25 preceding siblings ...)
2025-06-10 12:37 ` [PATCH 26/31] hw/usb/hcd-ohci: skip automatic zero-init of large array Daniel P. Berrangé
@ 2025-06-10 12:37 ` Daniel P. Berrangé
2025-06-10 12:37 ` [PATCH 28/31] hw/scsi/megasas: skip automatic zero-init of large arrays Daniel P. Berrangé
` (6 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:37 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'lsi_memcpy' method has a 4k byte array used for copying data
to/from the device. Skip the automatic zero-init of this array to
eliminate the performance overhead in the I/O hot path.
The 'buf' array will be fully initialized when data is copied.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
hw/scsi/lsi53c895a.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/scsi/lsi53c895a.c b/hw/scsi/lsi53c895a.c
index f4f2ef321e..9ea4aa0a85 100644
--- a/hw/scsi/lsi53c895a.c
+++ b/hw/scsi/lsi53c895a.c
@@ -1112,7 +1112,7 @@ bad:
static void lsi_memcpy(LSIState *s, uint32_t dest, uint32_t src, int count)
{
int n;
- uint8_t buf[LSI_BUF_SIZE];
+ QEMU_UNINITIALIZED uint8_t buf[LSI_BUF_SIZE];
trace_lsi_memcpy(dest, src, count);
while (count) {
--
2.49.0
* [PATCH 28/31] hw/scsi/megasas: skip automatic zero-init of large arrays
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (26 preceding siblings ...)
2025-06-10 12:37 ` [PATCH 27/31] hw/scsi/lsi53c895a: " Daniel P. Berrangé
@ 2025-06-10 12:37 ` Daniel P. Berrangé
2025-06-10 12:37 ` [PATCH 29/31] hw/ufs/lu: skip automatic zero-init of large array Daniel P. Berrangé
` (5 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:37 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'megasas_dcmd_pd_get_list' and 'megasas_dcmd_get_properties'
methods have 4k structs used for copying data from the device.
Skip the automatic zero-init of these structs to eliminate the
performance overhead in the I/O hot path.
The 'info' structs are manually initialized with memset(). The
compiler ought to be intelligent enough to turn the memset()
into a static initialization operation, and thus not duplicate
the automatic zero-init. Replacing memset() with '{}' makes it
unambiguous that the structs are statically initialized.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
hw/scsi/megasas.c | 7 ++-----
1 file changed, 2 insertions(+), 5 deletions(-)
diff --git a/hw/scsi/megasas.c b/hw/scsi/megasas.c
index 55cd188bd5..844643d916 100644
--- a/hw/scsi/megasas.c
+++ b/hw/scsi/megasas.c
@@ -981,13 +981,11 @@ static int megasas_event_wait(MegasasState *s, MegasasCmd *cmd)
static int megasas_dcmd_pd_get_list(MegasasState *s, MegasasCmd *cmd)
{
- struct mfi_pd_list info;
- size_t dcmd_size = sizeof(info);
+ struct mfi_pd_list info = {};
BusChild *kid;
uint32_t offset, dcmd_limit, num_pd_disks = 0, max_pd_disks;
dma_addr_t residual;
- memset(&info, 0, dcmd_size);
offset = 8;
dcmd_limit = offset + sizeof(struct mfi_pd_address);
if (cmd->iov_size < dcmd_limit) {
@@ -1429,11 +1427,10 @@ static int megasas_dcmd_cfg_read(MegasasState *s, MegasasCmd *cmd)
static int megasas_dcmd_get_properties(MegasasState *s, MegasasCmd *cmd)
{
- struct mfi_ctrl_props info;
+ struct mfi_ctrl_props info = {};
size_t dcmd_size = sizeof(info);
dma_addr_t residual;
- memset(&info, 0x0, dcmd_size);
if (cmd->iov_size < dcmd_size) {
trace_megasas_dcmd_invalid_xfer_len(cmd->index, cmd->iov_size,
dcmd_size);
--
2.49.0
* [PATCH 29/31] hw/ufs/lu: skip automatic zero-init of large array
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (27 preceding siblings ...)
2025-06-10 12:37 ` [PATCH 28/31] hw/scsi/megasas: skip automatic zero-init of large arrays Daniel P. Berrangé
@ 2025-06-10 12:37 ` Daniel P. Berrangé
2025-06-10 12:37 ` [PATCH 30/31] net/socket: " Daniel P. Berrangé
` (4 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:37 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'ufs_emulate_scsi_cmd' method has a 4k byte array used for
copying data from the device. Skip the automatic zero-init of
this array to eliminate the performance overhead in the I/O hot
path.
The 'outbuf' array will be fully initialized when data is copied
from the guest.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
hw/ufs/lu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/ufs/lu.c b/hw/ufs/lu.c
index 57b307ea56..2d8ffd72c5 100644
--- a/hw/ufs/lu.c
+++ b/hw/ufs/lu.c
@@ -194,7 +194,7 @@ static int ufs_emulate_wlun_inquiry(UfsRequest *req, uint8_t *outbuf,
static UfsReqResult ufs_emulate_scsi_cmd(UfsLu *lu, UfsRequest *req)
{
uint8_t lun = lu->lun;
- uint8_t outbuf[4096];
+ QEMU_UNINITIALIZED uint8_t outbuf[4096];
uint8_t sense_buf[UFS_SENSE_SIZE];
uint8_t scsi_status;
int len = 0;
--
2.49.0
* [PATCH 30/31] net/socket: skip automatic zero-init of large array
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (28 preceding siblings ...)
2025-06-10 12:37 ` [PATCH 29/31] hw/ufs/lu: skip automatic zero-init of large array Daniel P. Berrangé
@ 2025-06-10 12:37 ` Daniel P. Berrangé
2025-06-10 12:37 ` [PATCH 31/31] net/stream: " Daniel P. Berrangé
` (3 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:37 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'net_socket_send' method has a 68k byte array used for copying
data between guest and host. Skip the automatic zero-init of this
array to eliminate the performance overhead in the I/O hot path.
The 'buf1' array will be fully initialized when reading data off
the network socket.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
net/socket.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/socket.c b/net/socket.c
index 8e3702e1f3..784dda686f 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -157,7 +157,7 @@ static void net_socket_send(void *opaque)
NetSocketState *s = opaque;
int size;
int ret;
- uint8_t buf1[NET_BUFSIZE];
+ QEMU_UNINITIALIZED uint8_t buf1[NET_BUFSIZE];
const uint8_t *buf;
size = recv(s->fd, buf1, sizeof(buf1), 0);
--
2.49.0
* [PATCH 31/31] net/stream: skip automatic zero-init of large array
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (29 preceding siblings ...)
2025-06-10 12:37 ` [PATCH 30/31] net/socket: " Daniel P. Berrangé
@ 2025-06-10 12:37 ` Daniel P. Berrangé
2025-06-10 12:49 ` [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Philippe Mathieu-Daudé
` (2 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:37 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'net_stream_send' method has a 68k byte array used for copying
data between guest and host. Skip the automatic zero-init of this
array to eliminate the performance overhead in the I/O hot path.
The 'buf1' array will be fully initialized when reading data off
the network socket.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
net/stream.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/stream.c b/net/stream.c
index 4de5613844..6152d2a05e 100644
--- a/net/stream.c
+++ b/net/stream.c
@@ -148,7 +148,7 @@ static gboolean net_stream_send(QIOChannel *ioc,
     NetStreamState *s = data;
     int size;
     int ret;
-    char buf1[NET_BUFSIZE];
+    QEMU_UNINITIALIZED char buf1[NET_BUFSIZE];
     const char *buf;
 
     size = qio_channel_read(s->ioc, buf1, sizeof(buf1), NULL);
--
2.49.0
* Re: [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (30 preceding siblings ...)
2025-06-10 12:37 ` [PATCH 31/31] net/stream: " Daniel P. Berrangé
@ 2025-06-10 12:49 ` Philippe Mathieu-Daudé
2025-06-10 12:56 ` Daniel P. Berrangé
2025-06-10 14:04 ` Stefan Hajnoczi
2025-06-11 19:19 ` Stefan Hajnoczi
33 siblings, 1 reply; 42+ messages in thread
From: Philippe Mathieu-Daudé @ 2025-06-10 12:49 UTC (permalink / raw)
To: Daniel P. Berrangé, qemu-devel
Cc: Michael S. Tsirkin, Peter Maydell, qemu-block, Stefan Hajnoczi,
Kevin Wolf
On 10/6/25 14:36, Daniel P. Berrangé wrote:
> This series is an extension of Stefan's proposal:
>
> https://lists.nongnu.org/archive/html/qemu-devel/2025-06/msg00736.html
>
> It used '-Wframe-larger-than=4096' to identify all code locations
> with more than 4k on the stack. Any locations in the I/O paths
> were chosen to avoid automatic zero-init, to eliminate the performance
> overhead of the automatic initialization.
Should we eventually add it to our default CFLAGS?
> Note, although all the changed locations are I/O paths, this does not
> imply that the changes make a measurement performance difference in
> every case.
>
> This is because many of the emulated devices are likely limited by the
> emulation impl, rather than any implicit memory zero'ing overhead of
> stack data.
>
> None the less the memory zero'ing is still a redundant CPU burn in
> all these cases, so I felt it worth setting the general precedent
> that any data over 4k on a stack in a device I/O path should be
> skipping zero-init.
>
> I did reasonable review in each case to identify that the data was
> indeed initialized explicitly later in the method.
* Re: [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths
2025-06-10 12:49 ` [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Philippe Mathieu-Daudé
@ 2025-06-10 12:56 ` Daniel P. Berrangé
2025-06-10 15:00 ` Philippe Mathieu-Daudé
0 siblings, 1 reply; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:56 UTC (permalink / raw)
To: Philippe Mathieu-Daudé
Cc: qemu-devel, Michael S. Tsirkin, Peter Maydell, qemu-block,
Stefan Hajnoczi, Kevin Wolf
On Tue, Jun 10, 2025 at 02:49:02PM +0200, Philippe Mathieu-Daudé wrote:
> On 10/6/25 14:36, Daniel P. Berrangé wrote:
> > This series is an extension of Stefan's proposal:
> >
> > https://lists.nongnu.org/archive/html/qemu-devel/2025-06/msg00736.html
> >
> > It used '-Wframe-larger-than=4096' to identify all code locations
> > with more than 4k on the stack. Any locations in the I/O paths
> > were chosen to avoid automatic zero-init, to eliminate the performance
> > overhead of the automatic initialization.
>
> Should we eventually add it to our default CFLAGS?
You mean as a way to impose a hard limit on stack frame size?
FWIW, the VNC server is the biggest at around 100 KB, but we have a
large number of other big frames. Here's the list sorted by frame
size: the first column is the number of occurrences, the second is
the frame size in bytes.
1 103696
2 69664
46 49184
4 42608
23 32880
23 32864
1 32832
1 28688
36 24800
10 24784
1 22240
21 21760
1 20624
1 17536
2 17168
1 16592
1 16576
1 16544
1 16512
2 16480
2 16464
1 16448
4 16432
1 15136
1 12960
1 12336
1 12048
1 10064
5 9184
1 8832
1 8784
1 8576
1 8448
2 8400
1 8384
2 8304
1 8272
4 8256
1 8240
4 8224
7 8208
2 7120
1 6800
1 5856
1 4928
2 4560
1 4464
1 4336
2 4256
1 4240
2 4224
4 4208
9 4176
9 4160
40 4144
68 4128
9 4112
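To gauge what a hard limit would flag, the head of this list can be tallied against a candidate limit. The sketch below is only an illustration: it hard-codes the first seven rows of the list above, the limits are picked for the example rather than proposed, and the function name is made up. Because rows below 32832 are left out, it refuses limits under 28688 (the next row in the list), where it would undercount.

```c
#include <assert.h>
#include <stddef.h>

/* One row of the list above: occurrence count and frame size in bytes. */
struct sketch_frame_row {
    unsigned count;
    unsigned long size;
};

/* First seven rows of the list, copied verbatim. */
static const struct sketch_frame_row sketch_rows[] = {
    { 1, 103696 },
    { 2, 69664 },
    { 46, 49184 },
    { 4, 42608 },
    { 23, 32880 },
    { 23, 32864 },
    { 1, 32832 },
};

/*
 * Count how many frames are strictly larger than the given limit,
 * which is the condition -Wframe-larger-than=<limit> warns about.
 */
unsigned sketch_frames_over(unsigned long limit)
{
    unsigned total = 0;

    /* Smaller limits would need the rows omitted from this sketch. */
    assert(limit >= 28688);
    for (size_t i = 0; i < sizeof(sketch_rows) / sizeof(sketch_rows[0]); i++) {
        if (sketch_rows[i].size > limit) {
            total += sketch_rows[i].count;
        }
    }
    return total;
}
```

With these rows, a limit of 50000 flags 3 frames, 40000 flags 53, and 30000 flags 100, which shows how quickly the number of affected locations grows as the limit drops.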
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
* Re: [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths
2025-06-10 12:56 ` Daniel P. Berrangé
@ 2025-06-10 15:00 ` Philippe Mathieu-Daudé
2025-06-10 15:56 ` Daniel P. Berrangé
0 siblings, 1 reply; 42+ messages in thread
From: Philippe Mathieu-Daudé @ 2025-06-10 15:00 UTC (permalink / raw)
To: Daniel P. Berrangé
Cc: qemu-devel, Michael S. Tsirkin, Peter Maydell, qemu-block,
Stefan Hajnoczi, Kevin Wolf
On 10/6/25 14:56, Daniel P. Berrangé wrote:
> On Tue, Jun 10, 2025 at 02:49:02PM +0200, Philippe Mathieu-Daudé wrote:
>> On 10/6/25 14:36, Daniel P. Berrangé wrote:
>>> This series is an extension of Stefan's proposal:
>>>
>>> https://lists.nongnu.org/archive/html/qemu-devel/2025-06/msg00736.html
>>>
>>> It used '-Wframe-larger-than=4096' to identify all code locations
>>> with more than 4k on the stack. Any locations in the I/O paths
>>> were chosen to avoid automatic zero-init, to eliminate the performance
>>> overhead of the automatic initialization.
>>
>> Should we eventually add it to our default CFLAGS?
>
> You mean as a way to impose a hard limit on stack frame size ?
Yes, we could use -Wframe-larger-than=50000 for this release, then
aim to reduce it by 5k or 10k every release until we reach some
reasonable limit...
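As a rough illustration of what such a flag reports, here is a minimal sketch of a function whose frame exceeds 4096 bytes. The function and the 8 KiB buffer size are hypothetical and not taken from QEMU. Building it without optimization and with -Wframe-larger-than=4096 (for example "gcc -O0 -Wframe-larger-than=4096 -c sketch.c", where sketch.c is an assumed file name) should produce a frame-size warning for sketch_bounce_copy; with optimization enabled the compiler may remove the bounce buffer and the warning along with it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical 8 KiB bounce buffer, larger than a 4096-byte frame limit. */
#define SKETCH_COPY_BUF 8192

/*
 * Copy len bytes from src to dst through an on-stack bounce buffer,
 * one chunk at a time. Returns the number of bytes copied.
 */
size_t sketch_bounce_copy(void *dst, const void *src, size_t len)
{
    unsigned char bounce[SKETCH_COPY_BUF];
    size_t done = 0;

    assert(dst != NULL && src != NULL);
    while (done < len) {
        size_t chunk = len - done;

        if (chunk > sizeof(bounce)) {
            chunk = sizeof(bounce);
        }
        /* Each chunk is written to the buffer before it is read back. */
        memcpy(bounce, (const unsigned char *)src + done, chunk);
        memcpy((unsigned char *)dst + done, bounce, chunk);
        done += chunk;
    }
    return done;
}
```

A buffer like this is the kind of case the series targets: every byte that is read from it was written first, so zero-initializing it on entry is wasted work.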
> FWIW, the VNC server is biggest at 100kb, but we have a large
> number of other big frames - here's a sorted list of frame
> sizes vs frequency
>
> 1 103696
> 2 69664
> 46 49184
> 4 42608
> 23 32880
> 23 32864
* Re: [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths
2025-06-10 15:00 ` Philippe Mathieu-Daudé
@ 2025-06-10 15:56 ` Daniel P. Berrangé
0 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 15:56 UTC (permalink / raw)
To: Philippe Mathieu-Daudé
Cc: qemu-devel, Michael S. Tsirkin, Peter Maydell, qemu-block,
Stefan Hajnoczi, Kevin Wolf
On Tue, Jun 10, 2025 at 05:00:43PM +0200, Philippe Mathieu-Daudé wrote:
> On 10/6/25 14:56, Daniel P. Berrangé wrote:
> > On Tue, Jun 10, 2025 at 02:49:02PM +0200, Philippe Mathieu-Daudé wrote:
> > > On 10/6/25 14:36, Daniel P. Berrangé wrote:
> > > > This series is an extension of Stefan's proposal:
> > > >
> > > > https://lists.nongnu.org/archive/html/qemu-devel/2025-06/msg00736.html
> > > >
> > > > It used '-Wframe-larger-than=4096' to identify all code locations
> > > > with more than 4k on the stack. Any locations in the I/O paths
> > > > were chosen to avoid automatic zero-init, to eliminate the performance
> > > > overhead of the automatic initialization.
> > >
> > > Should we eventually add it to our default CFLAGS?
> >
> > You mean as a way to impose a hard limit on stack frame size ?
>
> Yes, we could use -Wframe-larger-than=50000 for this release, then
> aim to reduce by 5/10k every release until some reasonable limit...
I guess the key thing is to define approximately what our target
should be?
There are some large stack frames in QEMU device I/O paths that are
not unreasonable to have, because they avoid the need to malloc in the
I/O path. A great many devices have 4/8k buffers for copying
data around, and some bigger ones are fairly valid to have too.
On the flip side there's still way too much use of PATH_MAX that
is gratuitous and should be removed, but it's hard to enforce the
latter without negatively impacting the former.
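To make the PATH_MAX point concrete, here is a minimal sketch of the alternative: instead of declaring "char path[PATH_MAX]" on the stack, allocate exactly as many bytes as the resulting string needs. The function name is hypothetical and the code uses plain libc so that it stands alone; QEMU code would more likely reach for GLib's g_strdup_printf() for the same job.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Build "dir/name" in a heap buffer sized to fit, rather than in a
 * PATH_MAX-sized stack array. Returns NULL if allocation fails.
 * The caller owns the result and must free() it.
 */
char *sketch_join_path(const char *dir, const char *name)
{
    size_t len;
    char *path;

    assert(dir != NULL && name != NULL);
    /* dir + '/' + name + terminating NUL */
    len = strlen(dir) + 1 + strlen(name) + 1;
    path = malloc(len);
    if (path == NULL) {
        return NULL;
    }
    snprintf(path, len, "%s/%s", dir, name);
    return path;
}
```

This trades a few kilobytes of stack for a small allocation, which is fine outside hot paths. That is the distinction drawn above: a bounce buffer in an I/O path has a reason to live on the stack, while a path string built once usually does not.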
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
* Re: [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (31 preceding siblings ...)
2025-06-10 12:49 ` [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Philippe Mathieu-Daudé
@ 2025-06-10 14:04 ` Stefan Hajnoczi
2025-06-11 19:19 ` Stefan Hajnoczi
33 siblings, 0 replies; 42+ messages in thread
From: Stefan Hajnoczi @ 2025-06-10 14:04 UTC (permalink / raw)
To: Daniel P. Berrangé
Cc: qemu-devel, Philippe Mathieu-Daudé, Michael S. Tsirkin,
Peter Maydell, qemu-block, Kevin Wolf
On Tue, Jun 10, 2025 at 01:36:38PM +0100, Daniel P. Berrangé wrote:
> This series is an extension of Stefan's proposal:
>
> https://lists.nongnu.org/archive/html/qemu-devel/2025-06/msg00736.html
>
> It used '-Wframe-larger-than=4096' to identify all code locations
> with more than 4k on the stack. Any locations in the I/O paths
> were chosen to avoid automatic zero-init, to eliminate the performance
> overhead of the automatic initialization.
>
> Note, although all the changed locations are I/O paths, this does not
> imply that the changes make a measurement performance difference in
> every case.
>
> This is because many of the emulated devices are likely limited by the
> emulation impl, rather than any implicit memory zero'ing overhead of
> stack data.
>
> None the less the memory zero'ing is still a redundant CPU burn in
> all these cases, so I felt it worth setting the general precedent
> that any data over 4k on a stack in a device I/O path should be
> skipping zero-init.
>
> I did reasonable review in each case to identify that the data was
> indeed initialized explicitly later in the method.
>
> Daniel P. Berrangé (29):
> block: skip automatic zero-init of large array in ioq_submit
> chardev/char-fd: skip automatic zero-init of large array
> chardev/char-pty: skip automatic zero-init of large array
> chardev/char-socket: skip automatic zero-init of large array
> hw/audio/ac97: skip automatic zero-init of large arrays
> hw/audio/cs4231a: skip automatic zero-init of large arrays
> hw/audio/es1370: skip automatic zero-init of large array
> hw/audio/gus: skip automatic zero-init of large array
> hw/audio/gus: skip automatic zero-init of large array
> hw/audio/sb16: skip automatic zero-init of large array
> hw/audio/via-ac97: skip automatic zero-init of large array
> hw/char/sclpconsole-lm: skip automatic zero-init of large array
> hw/dma/xlnx_csu_dma: skip automatic zero-init of large array
> hw/display/vmware_vga: skip automatic zero-init of large struct
> hw/hyperv/syndbg: skip automatic zero-init of large array
> hw/misc/aspeed_hace: skip automatic zero-init of large array
> hw/net/rtl8139: skip automatic zero-init of large array
> hw/net/tulip: skip automatic zero-init of large array
> hw/net/virtio-net: skip automatic zero-init of large arrays
> hw/net/xgamc: skip automatic zero-init of large array
> hw/nvme/ctrl: skip automatic zero-init of large arrays
> hw/ppc/pnv_occ: skip automatic zero-init of large struct
> hw/ppc/spapr_tpm_proxy: skip automatic zero-init of large arrays
> hw/usb/hcd-ohci: skip automatic zero-init of large array
> hw/scsi/lsi53c895a: skip automatic zero-init of large array
> hw/scsi/megasas: skip automatic zero-init of large arrays
> hw/ufs/lu: skip automatic zero-init of large array
> net/socket: skip automatic zero-init of large array
> net/stream: skip automatic zero-init of large array
>
> Stefan Hajnoczi (2):
> include/qemu/compiler: add QEMU_UNINITIALIZED attribute macro
> hw/virtio/virtio: avoid cost of -ftrivial-auto-var-init in hot path
>
> block/linux-aio.c | 2 +-
> chardev/char-fd.c | 2 +-
> chardev/char-pty.c | 2 +-
> chardev/char-socket.c | 2 +-
> hw/audio/ac97.c | 4 ++--
> hw/audio/cs4231a.c | 4 ++--
> hw/audio/es1370.c | 2 +-
> hw/audio/gus.c | 2 +-
> hw/audio/marvell_88w8618.c | 2 +-
> hw/audio/sb16.c | 2 +-
> hw/audio/via-ac97.c | 2 +-
> hw/char/sclpconsole-lm.c | 2 +-
> hw/display/vmware_vga.c | 2 +-
> hw/dma/xlnx_csu_dma.c | 2 +-
> hw/hyperv/syndbg.c | 2 +-
> hw/misc/aspeed_hace.c | 2 +-
> hw/net/rtl8139.c | 2 +-
> hw/net/tulip.c | 2 +-
> hw/net/virtio-net.c | 6 +++---
> hw/net/xgmac.c | 2 +-
> hw/nvme/ctrl.c | 6 +++---
> hw/ppc/pnv_occ.c | 2 +-
> hw/ppc/spapr_tpm_proxy.c | 4 ++--
> hw/scsi/lsi53c895a.c | 2 +-
> hw/scsi/megasas.c | 7 ++-----
> hw/ufs/lu.c | 2 +-
> hw/usb/hcd-ohci.c | 2 +-
> hw/virtio/virtio.c | 8 ++++----
> include/qemu/compiler.h | 20 ++++++++++++++++++++
> net/socket.c | 2 +-
> net/stream.c | 2 +-
> 31 files changed, 61 insertions(+), 44 deletions(-)
>
> --
> 2.49.0
>
I will merge this series but want to give others a chance to review it
first:
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
* Re: [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (32 preceding siblings ...)
2025-06-10 14:04 ` Stefan Hajnoczi
@ 2025-06-11 19:19 ` Stefan Hajnoczi
33 siblings, 0 replies; 42+ messages in thread
From: Stefan Hajnoczi @ 2025-06-11 19:19 UTC (permalink / raw)
To: Daniel P. Berrangé
Cc: qemu-devel, Philippe Mathieu-Daudé, Michael S. Tsirkin,
Peter Maydell, qemu-block, Kevin Wolf
On Tue, Jun 10, 2025 at 01:36:38PM +0100, Daniel P. Berrangé wrote:
> This series is an extension of Stefan's proposal:
>
> https://lists.nongnu.org/archive/html/qemu-devel/2025-06/msg00736.html
>
> It used '-Wframe-larger-than=4096' to identify all code locations
> with more than 4k on the stack. Any locations in the I/O paths
> were chosen to avoid automatic zero-init, to eliminate the performance
> overhead of the automatic initialization.
>
> Note, although all the changed locations are I/O paths, this does not
> imply that the changes make a measurement performance difference in
> every case.
>
> This is because many of the emulated devices are likely limited by the
> emulation impl, rather than any implicit memory zero'ing overhead of
> stack data.
>
> None the less the memory zero'ing is still a redundant CPU burn in
> all these cases, so I felt it worth setting the general precedent
> that any data over 4k on a stack in a device I/O path should be
> skipping zero-init.
>
> I did reasonable review in each case to identify that the data was
> indeed initialized explicitly later in the method.
>
> Daniel P. Berrangé (29):
> block: skip automatic zero-init of large array in ioq_submit
> chardev/char-fd: skip automatic zero-init of large array
> chardev/char-pty: skip automatic zero-init of large array
> chardev/char-socket: skip automatic zero-init of large array
> hw/audio/ac97: skip automatic zero-init of large arrays
> hw/audio/cs4231a: skip automatic zero-init of large arrays
> hw/audio/es1370: skip automatic zero-init of large array
> hw/audio/gus: skip automatic zero-init of large array
> hw/audio/gus: skip automatic zero-init of large array
> hw/audio/sb16: skip automatic zero-init of large array
> hw/audio/via-ac97: skip automatic zero-init of large array
> hw/char/sclpconsole-lm: skip automatic zero-init of large array
> hw/dma/xlnx_csu_dma: skip automatic zero-init of large array
> hw/display/vmware_vga: skip automatic zero-init of large struct
> hw/hyperv/syndbg: skip automatic zero-init of large array
> hw/misc/aspeed_hace: skip automatic zero-init of large array
> hw/net/rtl8139: skip automatic zero-init of large array
> hw/net/tulip: skip automatic zero-init of large array
> hw/net/virtio-net: skip automatic zero-init of large arrays
> hw/net/xgamc: skip automatic zero-init of large array
> hw/nvme/ctrl: skip automatic zero-init of large arrays
> hw/ppc/pnv_occ: skip automatic zero-init of large struct
> hw/ppc/spapr_tpm_proxy: skip automatic zero-init of large arrays
> hw/usb/hcd-ohci: skip automatic zero-init of large array
> hw/scsi/lsi53c895a: skip automatic zero-init of large array
> hw/scsi/megasas: skip automatic zero-init of large arrays
> hw/ufs/lu: skip automatic zero-init of large array
> net/socket: skip automatic zero-init of large array
> net/stream: skip automatic zero-init of large array
>
> Stefan Hajnoczi (2):
> include/qemu/compiler: add QEMU_UNINITIALIZED attribute macro
> hw/virtio/virtio: avoid cost of -ftrivial-auto-var-init in hot path
>
> block/linux-aio.c | 2 +-
> chardev/char-fd.c | 2 +-
> chardev/char-pty.c | 2 +-
> chardev/char-socket.c | 2 +-
> hw/audio/ac97.c | 4 ++--
> hw/audio/cs4231a.c | 4 ++--
> hw/audio/es1370.c | 2 +-
> hw/audio/gus.c | 2 +-
> hw/audio/marvell_88w8618.c | 2 +-
> hw/audio/sb16.c | 2 +-
> hw/audio/via-ac97.c | 2 +-
> hw/char/sclpconsole-lm.c | 2 +-
> hw/display/vmware_vga.c | 2 +-
> hw/dma/xlnx_csu_dma.c | 2 +-
> hw/hyperv/syndbg.c | 2 +-
> hw/misc/aspeed_hace.c | 2 +-
> hw/net/rtl8139.c | 2 +-
> hw/net/tulip.c | 2 +-
> hw/net/virtio-net.c | 6 +++---
> hw/net/xgmac.c | 2 +-
> hw/nvme/ctrl.c | 6 +++---
> hw/ppc/pnv_occ.c | 2 +-
> hw/ppc/spapr_tpm_proxy.c | 4 ++--
> hw/scsi/lsi53c895a.c | 2 +-
> hw/scsi/megasas.c | 7 ++-----
> hw/ufs/lu.c | 2 +-
> hw/usb/hcd-ohci.c | 2 +-
> hw/virtio/virtio.c | 8 ++++----
> include/qemu/compiler.h | 20 ++++++++++++++++++++
> net/socket.c | 2 +-
> net/stream.c | 2 +-
> 31 files changed, 61 insertions(+), 44 deletions(-)
>
> --
> 2.49.0
>
Thanks, applied to my block tree:
https://gitlab.com/stefanha/qemu/commits/block
Stefan