* [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths
@ 2025-06-10 12:36 Daniel P. Berrangé
2025-06-10 12:36 ` [PATCH 01/31] include/qemu/compiler: add QEMU_UNINITIALIZED attribute macro Daniel P. Berrangé
` (33 more replies)
0 siblings, 34 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:36 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
This series is an extension of Stefan's proposal:
https://lists.nongnu.org/archive/html/qemu-devel/2025-06/msg00736.html
It used '-Wframe-larger-than=4096' to identify all code locations
with more than 4k on the stack. Any locations in the I/O paths
were chosen to avoid automatic zero-init, to eliminate the performance
overhead of the automatic initialization.
Note, although all the changed locations are I/O paths, this does not
imply that the changes make a measurable performance difference in
every case.
This is because many of the emulated devices are likely limited by the
emulation impl, rather than any implicit memory zero'ing overhead of
stack data.
Nonetheless the memory zero'ing is still a redundant CPU burn in
all these cases, so I felt it worth setting the general precedent
that any data over 4k on a stack in a device I/O path should be
skipping zero-init.
I did a reasonable review in each case to confirm that the data was
indeed initialized explicitly later in the method.
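As a rough illustration of how the warning surfaces candidates, the
sketch below has a single function whose frame is larger than 4k. It is
not taken from the QEMU tree: demo_big_frame() is a hypothetical name,
and the exact compiler invocation is an assumption. Building it with
something like 'cc -O0 -Wframe-larger-than=4096 -c demo.c' would
typically be expected to flag the function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical example, not QEMU code. The 8k local array pushes the
 * stack frame past a 4096 byte threshold, so a build that passes
 * -Wframe-larger-than=4096 would be expected to warn about this function.
 */
static size_t demo_big_frame(const char *src)
{
    char scratch[8192];
    size_t len = strlen(src);

    /* Clamp so the copy plus the terminator always fits in scratch. */
    if (len >= sizeof(scratch)) {
        len = sizeof(scratch) - 1;
    }
    memcpy(scratch, src, len);
    scratch[len] = '\0';

    /* Only the bytes written above are ever read back. */
    assert(strlen(scratch) == len);
    return len;
}
```

Each function reported this way still needs a manual check that the
large variable is written before it is read.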
Daniel P. Berrangé (29):
block: skip automatic zero-init of large array in ioq_submit
chardev/char-fd: skip automatic zero-init of large array
chardev/char-pty: skip automatic zero-init of large array
chardev/char-socket: skip automatic zero-init of large array
hw/audio/ac97: skip automatic zero-init of large arrays
hw/audio/cs4231a: skip automatic zero-init of large arrays
hw/audio/es1370: skip automatic zero-init of large array
hw/audio/gus: skip automatic zero-init of large array
hw/audio/gus: skip automatic zero-init of large array
hw/audio/sb16: skip automatic zero-init of large array
hw/audio/via-ac97: skip automatic zero-init of large array
hw/char/sclpconsole-lm: skip automatic zero-init of large array
hw/dma/xlnx_csu_dma: skip automatic zero-init of large array
hw/display/vmware_vga: skip automatic zero-init of large struct
hw/hyperv/syndbg: skip automatic zero-init of large array
hw/misc/aspeed_hace: skip automatic zero-init of large array
hw/net/rtl8139: skip automatic zero-init of large array
hw/net/tulip: skip automatic zero-init of large array
hw/net/virtio-net: skip automatic zero-init of large arrays
hw/net/xgmac: skip automatic zero-init of large array
hw/nvme/ctrl: skip automatic zero-init of large arrays
hw/ppc/pnv_occ: skip automatic zero-init of large struct
hw/ppc/spapr_tpm_proxy: skip automatic zero-init of large arrays
hw/usb/hcd-ohci: skip automatic zero-init of large array
hw/scsi/lsi53c895a: skip automatic zero-init of large array
hw/scsi/megasas: skip automatic zero-init of large arrays
hw/ufs/lu: skip automatic zero-init of large array
net/socket: skip automatic zero-init of large array
net/stream: skip automatic zero-init of large array
Stefan Hajnoczi (2):
include/qemu/compiler: add QEMU_UNINITIALIZED attribute macro
hw/virtio/virtio: avoid cost of -ftrivial-auto-var-init in hot path
block/linux-aio.c | 2 +-
chardev/char-fd.c | 2 +-
chardev/char-pty.c | 2 +-
chardev/char-socket.c | 2 +-
hw/audio/ac97.c | 4 ++--
hw/audio/cs4231a.c | 4 ++--
hw/audio/es1370.c | 2 +-
hw/audio/gus.c | 2 +-
hw/audio/marvell_88w8618.c | 2 +-
hw/audio/sb16.c | 2 +-
hw/audio/via-ac97.c | 2 +-
hw/char/sclpconsole-lm.c | 2 +-
hw/display/vmware_vga.c | 2 +-
hw/dma/xlnx_csu_dma.c | 2 +-
hw/hyperv/syndbg.c | 2 +-
hw/misc/aspeed_hace.c | 2 +-
hw/net/rtl8139.c | 2 +-
hw/net/tulip.c | 2 +-
hw/net/virtio-net.c | 6 +++---
hw/net/xgmac.c | 2 +-
hw/nvme/ctrl.c | 6 +++---
hw/ppc/pnv_occ.c | 2 +-
hw/ppc/spapr_tpm_proxy.c | 4 ++--
hw/scsi/lsi53c895a.c | 2 +-
hw/scsi/megasas.c | 7 ++-----
hw/ufs/lu.c | 2 +-
hw/usb/hcd-ohci.c | 2 +-
hw/virtio/virtio.c | 8 ++++----
include/qemu/compiler.h | 20 ++++++++++++++++++++
net/socket.c | 2 +-
net/stream.c | 2 +-
31 files changed, 61 insertions(+), 44 deletions(-)
--
2.49.0
* [PATCH 01/31] include/qemu/compiler: add QEMU_UNINITIALIZED attribute macro
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
@ 2025-06-10 12:36 ` Daniel P. Berrangé
2025-06-10 12:36 ` [PATCH 02/31] hw/virtio/virtio: avoid cost of -ftrivial-auto-var-init in hot path Daniel P. Berrangé
` (32 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:36 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
From: Stefan Hajnoczi <stefanha@redhat.com>
The QEMU_UNINITIALIZED macro is to be used to skip the default compiler
variable initialization done by -ftrivial-auto-var-init=zero.
Use this in cases where there is a method in the device I/O path (or other
important hot paths) that has large variables on the stack. A rule of
thumb is that "large" means a method with 4kb data in the local stack
frame. Any variables which are KB in size should be annotated with this
attribute, to pre-emptively eliminate any potential overhead from the
compiler zero'ing memory.
Given that this turns off a security hardening feature, when using this
to flag variables, it is important that the code is double-checked to
ensure there is no possible use of uninitialized data in the method.
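A minimal standalone sketch of the intended usage follows. It mirrors
the shape of the macro added by this patch, but DEMO_UNINITIALIZED and
demo_checksum() are hypothetical names used only for illustration, and
the extra guard around __has_attribute is an assumption made so the
snippet also builds with compilers that lack that builtin.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for QEMU_UNINITIALIZED, for illustration only. */
#if defined(__has_attribute)
# if __has_attribute(uninitialized)
#  define DEMO_UNINITIALIZED __attribute__((uninitialized))
# endif
#endif
#ifndef DEMO_UNINITIALIZED
# define DEMO_UNINITIALIZED
#endif

/*
 * Hypothetical hot-path helper: stages up to 4k of input in a stack
 * buffer and sums it. The buffer is annotated so that a build using
 * -ftrivial-auto-var-init=zero does not zero all 4096 bytes on entry.
 */
static unsigned demo_checksum(const unsigned char *src, size_t len)
{
    DEMO_UNINITIALIZED unsigned char buf[4096];
    unsigned sum = 0;
    size_t i;

    if (len > sizeof(buf)) {
        len = sizeof(buf);
    }
    /* Explicit initialization of exactly the bytes that are read below. */
    memcpy(buf, src, len);
    for (i = 0; i < len; i++) {
        sum += buf[i];
    }
    assert(len <= sizeof(buf));
    return sum;
}
```

The key property to check by hand is that no byte of 'buf' beyond 'len'
is ever read, since the compiler no longer guarantees it is zero.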
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
[DB: split off patch & rewrite guidance on when to use the annotation]
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
include/qemu/compiler.h | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/include/qemu/compiler.h b/include/qemu/compiler.h
index 496dac5ac1..65b89958d3 100644
--- a/include/qemu/compiler.h
+++ b/include/qemu/compiler.h
@@ -207,6 +207,26 @@
# define QEMU_USED
#endif
+/*
+ * Disable -ftrivial-auto-var-init on a local variable.
+ *
+ * Use this in cases where there is a method in the device I/O path (or
+ * other important hot paths) that has large variables on the stack. A rule
+ * of thumb is that "large" means a method with 4kb data in the local stack
+ * frame. Any variables which are KB in size should be annotated with this
+ * attribute, to pre-emptively eliminate any potential overhead from the
+ * compiler's implicit zero'ing of memory.
+ *
+ * Given that this turns off a security hardening feature, when using this
+ * to flag variables, it is important that the code is double-checked to
+ * ensure there is no possible use of uninitialized data in the method.
+ */
+#if __has_attribute(uninitialized)
+# define QEMU_UNINITIALIZED __attribute__((uninitialized))
+#else
+# define QEMU_UNINITIALIZED
+#endif
+
/*
* http://clang.llvm.org/docs/ThreadSafetyAnalysis.html
*
--
2.49.0
* [PATCH 02/31] hw/virtio/virtio: avoid cost of -ftrivial-auto-var-init in hot path
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
2025-06-10 12:36 ` [PATCH 01/31] include/qemu/compiler: add QEMU_UNINITIALIZED attribute macro Daniel P. Berrangé
@ 2025-06-10 12:36 ` Daniel P. Berrangé
2025-06-10 12:36 ` [PATCH 03/31] block: skip automatic zero-init of large array in ioq_submit Daniel P. Berrangé
` (31 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:36 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
From: Stefan Hajnoczi <stefanha@redhat.com>
Since commit 7ff9ff039380 ("meson: mitigate against use of uninitialize
stack for exploits") the -ftrivial-auto-var-init=zero compiler option is
used to zero local variables. While this reduces security risks
associated with uninitialized stack data, it introduced a measurable
bottleneck in the virtqueue_split_pop() and virtqueue_packed_pop()
functions.
These virtqueue functions are in the hot path. They are called for each
element (request) that is popped from a VIRTIO device's virtqueue. Using
__attribute__((uninitialized)) on large stack variables in these
functions improves fio randread bs=4k iodepth=64 performance from 304k
to 332k IOPS (+9%).
This issue was found using perf-top(1). virtqueue_split_pop() was one of
the top CPU consumers and the "annotate" feature showed that the memory
zeroing instructions at the beginning of the functions were hot.
Fixes: 7ff9ff039380 ("meson: mitigate against use of uninitialize stack for exploits")
Cc: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
hw/virtio/virtio.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 5534251e01..82a285a31d 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -1689,8 +1689,8 @@ static void *virtqueue_split_pop(VirtQueue *vq, size_t sz)
VirtIODevice *vdev = vq->vdev;
VirtQueueElement *elem = NULL;
unsigned out_num, in_num, elem_entries;
- hwaddr addr[VIRTQUEUE_MAX_SIZE];
- struct iovec iov[VIRTQUEUE_MAX_SIZE];
+ hwaddr QEMU_UNINITIALIZED addr[VIRTQUEUE_MAX_SIZE];
+ struct iovec QEMU_UNINITIALIZED iov[VIRTQUEUE_MAX_SIZE];
VRingDesc desc;
int rc;
@@ -1836,8 +1836,8 @@ static void *virtqueue_packed_pop(VirtQueue *vq, size_t sz)
VirtIODevice *vdev = vq->vdev;
VirtQueueElement *elem = NULL;
unsigned out_num, in_num, elem_entries;
- hwaddr addr[VIRTQUEUE_MAX_SIZE];
- struct iovec iov[VIRTQUEUE_MAX_SIZE];
+ hwaddr QEMU_UNINITIALIZED addr[VIRTQUEUE_MAX_SIZE];
+ struct iovec QEMU_UNINITIALIZED iov[VIRTQUEUE_MAX_SIZE];
VRingPackedDesc desc;
uint16_t id;
int rc;
--
2.49.0
* [PATCH 03/31] block: skip automatic zero-init of large array in ioq_submit
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
2025-06-10 12:36 ` [PATCH 01/31] include/qemu/compiler: add QEMU_UNINITIALIZED attribute macro Daniel P. Berrangé
2025-06-10 12:36 ` [PATCH 02/31] hw/virtio/virtio: avoid cost of -ftrivial-auto-var-init in hot path Daniel P. Berrangé
@ 2025-06-10 12:36 ` Daniel P. Berrangé
2025-06-10 12:36 ` [PATCH 04/31] chardev/char-fd: skip automatic zero-init of large array Daniel P. Berrangé
` (30 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:36 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'ioq_submit' method has a struct array that is 8k in size.
Skip the automatic zero-init of this array to eliminate the
performance overhead in the I/O hot path.
The 'iocbs' array will be selectively initialized when processing
the I/O data.
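To illustrate what "selectively initialized" means here, the sketch
below fills only the first 'len' slots of a pointer array and hands
exactly that many to the consumer. This is not the code from
block/linux-aio.c: demo_req, demo_submit(), demo_ioq_submit() and
DEMO_MAX_EVENTS (assumed to be 1024, giving 8k of pointers on a 64-bit
host) are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for QEMU_UNINITIALIZED, for illustration only. */
#if defined(__has_attribute)
# if __has_attribute(uninitialized)
#  define DEMO_UNINITIALIZED __attribute__((uninitialized))
# endif
#endif
#ifndef DEMO_UNINITIALIZED
# define DEMO_UNINITIALIZED
#endif

#define DEMO_MAX_EVENTS 1024    /* assumed batch limit */

struct demo_req {
    int id;
    struct demo_req *next;
};

/* Stand-in for the real submission call: reads only the first n slots. */
static int demo_submit(struct demo_req **batch, int n)
{
    int i, sum = 0;

    for (i = 0; i < n; i++) {
        sum += batch[i]->id;
    }
    return sum;
}

static int demo_ioq_submit(struct demo_req *pending)
{
    DEMO_UNINITIALIZED struct demo_req *batch[DEMO_MAX_EVENTS];
    struct demo_req *req;
    int len = 0;

    /* Only slots [0, len) are written; the rest stay uninitialized. */
    for (req = pending; req != NULL && len < DEMO_MAX_EVENTS;
         req = req->next) {
        batch[len++] = req;
    }
    assert(len <= DEMO_MAX_EVENTS);

    /* The consumer is told the exact count, so it never reads past len. */
    return demo_submit(batch, len);
}
```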
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
block/linux-aio.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/block/linux-aio.c b/block/linux-aio.c
index 407369f5c9..c200e7ad20 100644
--- a/block/linux-aio.c
+++ b/block/linux-aio.c
@@ -291,7 +291,7 @@ static void ioq_submit(LinuxAioState *s)
{
int ret, len;
struct qemu_laiocb *aiocb;
- struct iocb *iocbs[MAX_EVENTS];
+ QEMU_UNINITIALIZED struct iocb *iocbs[MAX_EVENTS];
QSIMPLEQ_HEAD(, qemu_laiocb) completed;
do {
--
2.49.0
* [PATCH 04/31] chardev/char-fd: skip automatic zero-init of large array
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (2 preceding siblings ...)
2025-06-10 12:36 ` [PATCH 03/31] block: skip automatic zero-init of large array in ioq_submit Daniel P. Berrangé
@ 2025-06-10 12:36 ` Daniel P. Berrangé
2025-06-10 12:36 ` [PATCH 05/31] chardev/char-pty: " Daniel P. Berrangé
` (29 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:36 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'fd_chr_read' method has a 4k byte array used for copying
data between the socket and device. Skip the automatic zero-init
of this array to eliminate the performance overhead in the I/O
hot path.
The 'buf' array will be fully initialized when reading data off
the network socket.
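The safety argument for a read buffer like this can be sketched as
below, assuming a POSIX host: read(2) reports how many bytes it stored,
and only that many bytes are passed on. The names demo_fd_read(),
demo_consume() and DEMO_READ_BUF_LEN are hypothetical and do not come
from chardev/char-fd.c.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for QEMU_UNINITIALIZED, for illustration only. */
#if defined(__has_attribute)
# if __has_attribute(uninitialized)
#  define DEMO_UNINITIALIZED __attribute__((uninitialized))
# endif
#endif
#ifndef DEMO_UNINITIALIZED
# define DEMO_UNINITIALIZED
#endif

#define DEMO_READ_BUF_LEN 4096  /* assumed size, mirroring the 4k buffer */

/* Hypothetical consumer: counts printable ASCII bytes in data[0..len). */
static size_t demo_consume(const unsigned char *data, size_t len)
{
    size_t i, printable = 0;

    for (i = 0; i < len; i++) {
        if (data[i] >= 0x20 && data[i] < 0x7f) {
            printable++;
        }
    }
    return printable;
}

static ssize_t demo_fd_read(int fd, size_t *consumed)
{
    DEMO_UNINITIALIZED unsigned char buf[DEMO_READ_BUF_LEN];
    ssize_t ret = read(fd, buf, sizeof(buf));

    *consumed = 0;
    if (ret > 0) {
        /* Pass on only the bytes that read() actually stored in buf. */
        *consumed = demo_consume(buf, (size_t)ret);
    }
    return ret;
}
```

On error or end of file nothing is read from 'buf' at all, so the
skipped zero-init is never observable.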
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
chardev/char-fd.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/chardev/char-fd.c b/chardev/char-fd.c
index 23bfe3c0b1..6f03adf872 100644
--- a/chardev/char-fd.c
+++ b/chardev/char-fd.c
@@ -50,7 +50,7 @@ static gboolean fd_chr_read(QIOChannel *chan, GIOCondition cond, void *opaque)
Chardev *chr = CHARDEV(opaque);
FDChardev *s = FD_CHARDEV(opaque);
int len;
- uint8_t buf[CHR_READ_BUF_LEN];
+ QEMU_UNINITIALIZED uint8_t buf[CHR_READ_BUF_LEN];
ssize_t ret;
len = sizeof(buf);
--
2.49.0
* [PATCH 05/31] chardev/char-pty: skip automatic zero-init of large array
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (3 preceding siblings ...)
2025-06-10 12:36 ` [PATCH 04/31] chardev/char-fd: skip automatic zero-init of large array Daniel P. Berrangé
@ 2025-06-10 12:36 ` Daniel P. Berrangé
2025-06-10 12:36 ` [PATCH 06/31] chardev/char-socket: " Daniel P. Berrangé
` (28 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:36 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'pty_chr_read' method has a 4k byte array used for copying
data between the PTY and device. Skip the automatic zero-init
of this array to eliminate the performance overhead in the I/O
hot path.
The 'buf' array will be fully initialized when reading data off
the PTY.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
chardev/char-pty.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/chardev/char-pty.c b/chardev/char-pty.c
index c28554e6e0..674e9b3f14 100644
--- a/chardev/char-pty.c
+++ b/chardev/char-pty.c
@@ -154,7 +154,7 @@ static gboolean pty_chr_read(QIOChannel *chan, GIOCondition cond, void *opaque)
Chardev *chr = CHARDEV(opaque);
PtyChardev *s = PTY_CHARDEV(opaque);
gsize len;
- uint8_t buf[CHR_READ_BUF_LEN];
+ QEMU_UNINITIALIZED uint8_t buf[CHR_READ_BUF_LEN];
ssize_t ret;
len = sizeof(buf);
--
2.49.0
* [PATCH 06/31] chardev/char-socket: skip automatic zero-init of large array
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (4 preceding siblings ...)
2025-06-10 12:36 ` [PATCH 05/31] chardev/char-pty: " Daniel P. Berrangé
@ 2025-06-10 12:36 ` Daniel P. Berrangé
2025-06-10 12:36 ` [PATCH 07/31] hw/audio/ac97: skip automatic zero-init of large arrays Daniel P. Berrangé
` (27 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:36 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'tcp_chr_read' method has a 4k byte array used for copying
data between the socket and device. Skip the automatic zero-init
of this array to eliminate the performance overhead in the I/O
hot path.
The 'buf' array will be fully initialized when reading data off
the network socket.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
chardev/char-socket.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/chardev/char-socket.c b/chardev/char-socket.c
index e8dd2931dc..1e8313915b 100644
--- a/chardev/char-socket.c
+++ b/chardev/char-socket.c
@@ -497,7 +497,7 @@ static gboolean tcp_chr_read(QIOChannel *chan, GIOCondition cond, void *opaque)
{
Chardev *chr = CHARDEV(opaque);
SocketChardev *s = SOCKET_CHARDEV(opaque);
- uint8_t buf[CHR_READ_BUF_LEN];
+ QEMU_UNINITIALIZED uint8_t buf[CHR_READ_BUF_LEN];
int len, size;
if ((s->state != TCP_CHARDEV_STATE_CONNECTED) ||
--
2.49.0
* [PATCH 07/31] hw/audio/ac97: skip automatic zero-init of large arrays
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (5 preceding siblings ...)
2025-06-10 12:36 ` [PATCH 06/31] chardev/char-socket: " Daniel P. Berrangé
@ 2025-06-10 12:36 ` Daniel P. Berrangé
2025-06-10 12:36 ` [PATCH 08/31] hw/audio/cs4231a: " Daniel P. Berrangé
` (26 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:36 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'read_audio' & 'write_audio' methods have a 4k byte array used
for copying data between the audio backend and device. Skip the
automatic zero-init of these arrays to eliminate the performance
overhead in the I/O hot path.
The 'tmpbuf' array will be fully initialized when reading data from
the audio backend and/or device memory.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
hw/audio/ac97.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/hw/audio/ac97.c b/hw/audio/ac97.c
index 669a0463cc..eb7a847080 100644
--- a/hw/audio/ac97.c
+++ b/hw/audio/ac97.c
@@ -886,7 +886,7 @@ static void nabm_writel(void *opaque, uint32_t addr, uint32_t val)
static int write_audio(AC97LinkState *s, AC97BusMasterRegs *r,
int max, int *stop)
{
- uint8_t tmpbuf[4096];
+ QEMU_UNINITIALIZED uint8_t tmpbuf[4096];
uint32_t addr = r->bd.addr;
uint32_t temp = r->picb << 1;
uint32_t written = 0;
@@ -959,7 +959,7 @@ static void write_bup(AC97LinkState *s, int elapsed)
static int read_audio(AC97LinkState *s, AC97BusMasterRegs *r,
int max, int *stop)
{
- uint8_t tmpbuf[4096];
+ QEMU_UNINITIALIZED uint8_t tmpbuf[4096];
uint32_t addr = r->bd.addr;
uint32_t temp = r->picb << 1;
uint32_t nread = 0;
--
2.49.0
* [PATCH 08/31] hw/audio/cs4231a: skip automatic zero-init of large arrays
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (6 preceding siblings ...)
2025-06-10 12:36 ` [PATCH 07/31] hw/audio/ac97: skip automatic zero-init of large arrays Daniel P. Berrangé
@ 2025-06-10 12:36 ` Daniel P. Berrangé
2025-06-10 12:36 ` [PATCH 09/31] hw/audio/es1370: skip automatic zero-init of large array Daniel P. Berrangé
` (25 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:36 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'cs_write_audio' method has a pair of byte arrays, one 4k in size
and one 8k, which are used in converting audio samples. Skip the
automatic zero-init of these arrays to eliminate the performance
overhead in the I/O hot path.
The 'tmpbuf' array will be fully initialized when reading a block of
data from the guest. The 'linbuf' array will be fully initialized
when converting the audio samples.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
hw/audio/cs4231a.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/hw/audio/cs4231a.c b/hw/audio/cs4231a.c
index eb9a45805b..6dfff202ff 100644
--- a/hw/audio/cs4231a.c
+++ b/hw/audio/cs4231a.c
@@ -528,7 +528,7 @@ static int cs_write_audio (CSState *s, int nchan, int dma_pos,
int dma_len, int len)
{
int temp, net;
- uint8_t tmpbuf[4096];
+ QEMU_UNINITIALIZED uint8_t tmpbuf[4096];
IsaDmaClass *k = ISADMA_GET_CLASS(s->isa_dma);
temp = len;
@@ -547,7 +547,7 @@ static int cs_write_audio (CSState *s, int nchan, int dma_pos,
copied = k->read_memory(s->isa_dma, nchan, tmpbuf, dma_pos, to_copy);
if (s->tab) {
int i;
- int16_t linbuf[4096];
+ QEMU_UNINITIALIZED int16_t linbuf[4096];
for (i = 0; i < copied; ++i)
linbuf[i] = s->tab[tmpbuf[i]];
--
2.49.0
* [PATCH 09/31] hw/audio/es1370: skip automatic zero-init of large array
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (7 preceding siblings ...)
2025-06-10 12:36 ` [PATCH 08/31] hw/audio/cs4231a: " Daniel P. Berrangé
@ 2025-06-10 12:36 ` Daniel P. Berrangé
2025-06-10 12:36 ` [PATCH 10/31] hw/audio/gus: " Daniel P. Berrangé
` (24 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:36 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'es1370_transfer_audio' method has a 4k byte array used for
copying data between the audio backend and device. Skip the automatic
zero-init of this array to eliminate the performance overhead in
the I/O hot path.
The 'tmpbuf' array will be fully initialized when reading data from
the audio backend and/or device memory.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
hw/audio/es1370.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/audio/es1370.c b/hw/audio/es1370.c
index 8efb969212..a6a32a6348 100644
--- a/hw/audio/es1370.c
+++ b/hw/audio/es1370.c
@@ -604,7 +604,7 @@ static uint64_t es1370_read(void *opaque, hwaddr addr, unsigned size)
static void es1370_transfer_audio (ES1370State *s, struct chan *d, int loop_sel,
int max, bool *irq)
{
- uint8_t tmpbuf[4096];
+ QEMU_UNINITIALIZED uint8_t tmpbuf[4096];
size_t to_transfer;
uint32_t addr = d->frame_addr;
int sc = d->scount & 0xffff;
--
2.49.0
* [PATCH 10/31] hw/audio/gus: skip automatic zero-init of large array
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (8 preceding siblings ...)
2025-06-10 12:36 ` [PATCH 09/31] hw/audio/es1370: skip automatic zero-init of large array Daniel P. Berrangé
@ 2025-06-10 12:36 ` Daniel P. Berrangé
2025-06-10 12:36 ` [PATCH 11/31] " Daniel P. Berrangé
` (23 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:36 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'GUS_read_DMA' method has a 4k byte array used for copying
data between the audio backend and device. Skip the automatic
zero-init of this array to eliminate the performance overhead in
the I/O hot path.
The 'tmpbuf' array will be fully initialized when reading data
from device memory.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
hw/audio/gus.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/audio/gus.c b/hw/audio/gus.c
index 87e8634893..c36df0240f 100644
--- a/hw/audio/gus.c
+++ b/hw/audio/gus.c
@@ -183,7 +183,7 @@ static int GUS_read_DMA (void *opaque, int nchan, int dma_pos, int dma_len)
{
GUSState *s = opaque;
IsaDmaClass *k = ISADMA_GET_CLASS(s->isa_dma);
- char tmpbuf[4096];
+ QEMU_UNINITIALIZED char tmpbuf[4096];
int pos = dma_pos, mode, left = dma_len - dma_pos;
ldebug ("read DMA %#x %d\n", dma_pos, dma_len);
--
2.49.0
* [PATCH 11/31] hw/audio/gus: skip automatic zero-init of large array
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (9 preceding siblings ...)
2025-06-10 12:36 ` [PATCH 10/31] hw/audio/gus: " Daniel P. Berrangé
@ 2025-06-10 12:36 ` Daniel P. Berrangé
2025-06-10 14:23 ` Daniel P. Berrangé
2025-06-10 12:36 ` [PATCH 12/31] hw/audio/sb16: " Daniel P. Berrangé
` (22 subsequent siblings)
33 siblings, 1 reply; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:36 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'mv88w8618_audio_callback' method has a 4k byte array used for
copying data between the audio backend and device. Skip the automatic
zero-init of this array to eliminate the performance overhead in
the I/O hot path.
The 'buf' array will be fully initialized when reading data from
device memory.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
hw/audio/marvell_88w8618.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/audio/marvell_88w8618.c b/hw/audio/marvell_88w8618.c
index 6d3ebbb0c8..c5c79d083a 100644
--- a/hw/audio/marvell_88w8618.c
+++ b/hw/audio/marvell_88w8618.c
@@ -66,7 +66,7 @@ static void mv88w8618_audio_callback(void *opaque, int free_out, int free_in)
{
mv88w8618_audio_state *s = opaque;
int16_t *codec_buffer;
- int8_t buf[4096];
+ QEMU_UNINITIALIZED int8_t buf[4096];
int8_t *mem_buffer;
int pos, block_size;
--
2.49.0
* [PATCH 12/31] hw/audio/sb16: skip automatic zero-init of large array
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (10 preceding siblings ...)
2025-06-10 12:36 ` [PATCH 11/31] " Daniel P. Berrangé
@ 2025-06-10 12:36 ` Daniel P. Berrangé
2025-06-10 12:36 ` [PATCH 13/31] hw/audio/via-ac97: " Daniel P. Berrangé
` (21 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:36 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'write_audio' method has a 4k byte array used for copying data
between the audio backend and device. Skip the automatic zero-init
of this array to eliminate the performance overhead in the I/O hot
path.
The 'tmpbuf' array will be fully initialized when reading data from
device memory.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
hw/audio/sb16.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/audio/sb16.c b/hw/audio/sb16.c
index 19fd3b9020..bac64118fe 100644
--- a/hw/audio/sb16.c
+++ b/hw/audio/sb16.c
@@ -1181,7 +1181,7 @@ static int write_audio (SB16State *s, int nchan, int dma_pos,
IsaDma *isa_dma = nchan == s->dma ? s->isa_dma : s->isa_hdma;
IsaDmaClass *k = ISADMA_GET_CLASS(isa_dma);
int temp, net;
- uint8_t tmpbuf[4096];
+ QEMU_UNINITIALIZED uint8_t tmpbuf[4096];
temp = len;
net = 0;
--
2.49.0
* [PATCH 13/31] hw/audio/via-ac97: skip automatic zero-init of large array
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (11 preceding siblings ...)
2025-06-10 12:36 ` [PATCH 12/31] hw/audio/sb16: " Daniel P. Berrangé
@ 2025-06-10 12:36 ` Daniel P. Berrangé
2025-06-10 12:36 ` [PATCH 14/31] hw/char/sclpconsole-lm: " Daniel P. Berrangé
` (20 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:36 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'out_cb' method has a 4k byte array used for copying data
between the audio backend and device. Skip the automatic zero-init
of this array to eliminate the performance overhead in the I/O hot
path.
The 'tmpbuf' array will be fully initialized when reading data from
device memory.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
hw/audio/via-ac97.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/audio/via-ac97.c b/hw/audio/via-ac97.c
index 1e0a5c7398..d5231e1cf2 100644
--- a/hw/audio/via-ac97.c
+++ b/hw/audio/via-ac97.c
@@ -175,7 +175,7 @@ static void out_cb(void *opaque, int avail)
ViaAC97SGDChannel *c = &s->aur;
int temp, to_copy, copied;
bool stop = false;
- uint8_t tmpbuf[4096];
+ QEMU_UNINITIALIZED uint8_t tmpbuf[4096];
if (c->stat & STAT_PAUSED) {
return;
--
2.49.0
* [PATCH 14/31] hw/char/sclpconsole-lm: skip automatic zero-init of large array
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (12 preceding siblings ...)
2025-06-10 12:36 ` [PATCH 13/31] hw/audio/via-ac97: " Daniel P. Berrangé
@ 2025-06-10 12:36 ` Daniel P. Berrangé
2025-06-10 12:36 ` [PATCH 15/31] hw/dma/xlnx_csu_dma: " Daniel P. Berrangé
` (19 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:36 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'process_mdb' method has a 4k byte array used for copying data
between the guest and the chardev backend. Skip the automatic zero-init
of this array to eliminate the performance overhead in the I/O hot
path.
The 'buffer' array will be selectively initialized when data is converted
between EBCDIC and ASCII.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
hw/char/sclpconsole-lm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/char/sclpconsole-lm.c b/hw/char/sclpconsole-lm.c
index e9580aacba..3e40d5e434 100644
--- a/hw/char/sclpconsole-lm.c
+++ b/hw/char/sclpconsole-lm.c
@@ -214,7 +214,7 @@ static int process_mdb(SCLPEvent *event, MDBO *mdbo)
{
int rc;
int len;
- uint8_t buffer[SIZE_BUFFER];
+ QEMU_UNINITIALIZED uint8_t buffer[SIZE_BUFFER];
len = be16_to_cpu(mdbo->length);
len -= sizeof(mdbo->length) + sizeof(mdbo->type)
--
2.49.0
* [PATCH 15/31] hw/dma/xlnx_csu_dma: skip automatic zero-init of large array
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (13 preceding siblings ...)
2025-06-10 12:36 ` [PATCH 14/31] hw/char/sclpconsole-lm: " Daniel P. Berrangé
@ 2025-06-10 12:36 ` Daniel P. Berrangé
2025-06-10 12:36 ` [PATCH 16/31] hw/display/vmware_vga: skip automatic zero-init of large struct Daniel P. Berrangé
` (18 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:36 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'xlnx_csu_dma_src_notify' method has a 4k byte array used for
copying DMA data. Skip the automatic zero-init of this array to
eliminate the performance overhead in the I/O hot path.
The 'buf' array will be fully initialized when data is copied.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
hw/dma/xlnx_csu_dma.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/dma/xlnx_csu_dma.c b/hw/dma/xlnx_csu_dma.c
index 3db3904d83..d8c7da1a50 100644
--- a/hw/dma/xlnx_csu_dma.c
+++ b/hw/dma/xlnx_csu_dma.c
@@ -287,7 +287,7 @@ static uint32_t xlnx_csu_dma_advance(XlnxCSUDMA *s, uint32_t len)
static void xlnx_csu_dma_src_notify(void *opaque)
{
XlnxCSUDMA *s = XLNX_CSU_DMA(opaque);
- unsigned char buf[4 * 1024];
+ QEMU_UNINITIALIZED unsigned char buf[4 * 1024];
size_t rlen = 0;
ptimer_transaction_begin(s->src_timer);
--
2.49.0
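For readers following the series, here is a minimal sketch of the pattern this patch relies on. It is not the QEMU code. DEMO_UNINITIALIZED is a hypothetical stand-in for QEMU_UNINITIALIZED (added in PATCH 01/31, which is not shown in this part of the thread); it is assumed to expand to the compiler's 'uninitialized' variable attribute where that exists and to nothing otherwise. demo_bounce_copy() is likewise invented for illustration. The point is that every byte read from the buffer was written in the same loop iteration, so a compiler-inserted zero fill (for example from -ftrivial-auto-var-init=zero) would be redundant work.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: stand-in for QEMU_UNINITIALIZED, a no-op on older compilers. */
#if defined(__has_attribute)
# if __has_attribute(uninitialized)
#  define DEMO_UNINITIALIZED __attribute__((uninitialized))
# endif
#endif
#ifndef DEMO_UNINITIALIZED
# define DEMO_UNINITIALIZED
#endif

#define DEMO_CHUNK (4 * 1024)

/* Hypothetical helper: copy 'len' bytes from src to dst via a 4k bounce buffer. */
size_t demo_bounce_copy(unsigned char *dst, const unsigned char *src, size_t len)
{
    DEMO_UNINITIALIZED unsigned char buf[DEMO_CHUNK];
    size_t done = 0;

    assert(dst != NULL && src != NULL);

    while (done < len) {
        size_t n = len - done;

        if (n > sizeof(buf)) {
            n = sizeof(buf);
        }
        memcpy(buf, src + done, n);   /* writes buf[0..n) */
        memcpy(dst + done, buf, n);   /* reads only buf[0..n) */
        done += n;
    }
    return done;
}
```

Calling demo_bounce_copy(dst, src, len) copies the data in chunks of at most 4k; no byte of 'buf' is read before it has been written.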
* [PATCH 16/31] hw/display/vmware_vga: skip automatic zero-init of large struct
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (14 preceding siblings ...)
2025-06-10 12:36 ` [PATCH 15/31] hw/dma/xlnx_csu_dma: " Daniel P. Berrangé
@ 2025-06-10 12:36 ` Daniel P. Berrangé
2025-06-10 12:36 ` [PATCH 17/31] hw/hyperv/syndbg: skip automatic zero-init of large array Daniel P. Berrangé
` (17 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:36 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'vmsvga_fifo_run' method has a struct which is a little over 20k
in size, used for holding image data for cursor changes. Skip the
automatic zero-init of this struct to eliminate the performance
overhead in the I/O hot path.
The cursor variable will be fully initialized only when processing
a cursor definition message from the guest.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
hw/display/vmware_vga.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/display/vmware_vga.c b/hw/display/vmware_vga.c
index 544bb65320..bc1a8ed466 100644
--- a/hw/display/vmware_vga.c
+++ b/hw/display/vmware_vga.c
@@ -618,7 +618,7 @@ static void vmsvga_fifo_run(struct vmsvga_state_s *s)
uint32_t cmd, colour;
int args, len, maxloop = 1024;
int x, y, dx, dy, width, height;
- struct vmsvga_cursor_definition_s cursor;
+ QEMU_UNINITIALIZED struct vmsvga_cursor_definition_s cursor;
uint32_t cmd_start;
len = vmsvga_fifo_length(s);
--
2.49.0
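The "initialized only when processing a cursor definition message" case can be sketched as follows. This is an illustration under assumptions, not the vmware_vga code: struct demo_cursor, the command values and demo_run() are all hypothetical, and DEMO_UNINITIALIZED stands in for QEMU_UNINITIALIZED from PATCH 01/31. The large struct lives at function scope, but it is written and read only on the one command path that needs it, so zero-filling it on every call would be wasted work.

```c
#include <string.h>

/* Assumption: stand-in for QEMU_UNINITIALIZED, a no-op on older compilers. */
#if defined(__has_attribute)
# if __has_attribute(uninitialized)
#  define DEMO_UNINITIALIZED __attribute__((uninitialized))
# endif
#endif
#ifndef DEMO_UNINITIALIZED
# define DEMO_UNINITIALIZED
#endif

/* Hypothetical cursor definition, a little over 16k in size. */
struct demo_cursor {
    int width;
    int height;
    unsigned int image[4096];
};

enum {
    DEMO_CMD_NOP = 0,
    DEMO_CMD_DEFINE_CURSOR = 1,
};

/* Returns the pixel count for a cursor definition, 0 for any other command. */
int demo_run(int cmd, int width, int height)
{
    DEMO_UNINITIALIZED struct demo_cursor cursor;

    switch (cmd) {
    case DEMO_CMD_DEFINE_CURSOR:
        /* Every field is written here before anything reads it. */
        cursor.width = width;
        cursor.height = height;
        memset(cursor.image, 0xff, sizeof(cursor.image));
        return cursor.width * cursor.height;
    default:
        /* The struct is never touched on this path. */
        return 0;
    }
}
```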
* [PATCH 17/31] hw/hyperv/syndbg: skip automatic zero-init of large array
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (15 preceding siblings ...)
2025-06-10 12:36 ` [PATCH 16/31] hw/display/vmware_vga: skip automatic zero-init of large struct Daniel P. Berrangé
@ 2025-06-10 12:36 ` Daniel P. Berrangé
2025-06-10 12:36 ` [PATCH 18/31] hw/misc/aspeed_hace: " Daniel P. Berrangé
` (16 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:36 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'handle_recv_msg' method has a 4k byte array used for copying
data between the network socket and guest memory. Skip the automatic
zero-init of this array to eliminate the performance overhead in the
I/O hot path.
The 'data_buf' array will be fully initialized when data is read
off the network socket.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
hw/hyperv/syndbg.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/hyperv/syndbg.c b/hw/hyperv/syndbg.c
index 8b8a14750d..ac7e15f6f1 100644
--- a/hw/hyperv/syndbg.c
+++ b/hw/hyperv/syndbg.c
@@ -192,7 +192,7 @@ static uint16_t handle_recv_msg(HvSynDbg *syndbg, uint64_t outgpa,
{
uint16_t ret;
g_assert(MSG_BUFSZ >= qemu_target_page_size());
- uint8_t data_buf[MSG_BUFSZ];
+ QEMU_UNINITIALIZED uint8_t data_buf[MSG_BUFSZ];
hwaddr out_len;
void *out_data;
ssize_t recv_byte_count;
--
2.49.0
* [PATCH 18/31] hw/misc/aspeed_hace: skip automatic zero-init of large array
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (16 preceding siblings ...)
2025-06-10 12:36 ` [PATCH 17/31] hw/hyperv/syndbg: skip automatic zero-init of large array Daniel P. Berrangé
@ 2025-06-10 12:36 ` Daniel P. Berrangé
2025-06-10 12:36 ` [PATCH 19/31] hw/net/rtl8139: " Daniel P. Berrangé
` (15 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:36 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'do_hash_operation' method has a 256 element iovec array used for
holding pointers to data that is to be hashed. Skip the automatic
zero-init of this array to eliminate the performance overhead in the
I/O hot path.
The 'iov' array will be selectively initialized based on the data
that needs to be hashed.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
hw/misc/aspeed_hace.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/misc/aspeed_hace.c b/hw/misc/aspeed_hace.c
index 8924a30eff..726368fbbc 100644
--- a/hw/misc/aspeed_hace.c
+++ b/hw/misc/aspeed_hace.c
@@ -419,7 +419,7 @@ static void hash_execute_acc_mode(AspeedHACEState *s, int algo,
static void do_hash_operation(AspeedHACEState *s, int algo, bool sg_mode,
bool acc_mode)
{
- struct iovec iov[ASPEED_HACE_MAX_SG];
+ QEMU_UNINITIALIZED struct iovec iov[ASPEED_HACE_MAX_SG];
bool acc_final_request = false;
int iov_idx = -1;
--
2.49.0
* [PATCH 19/31] hw/net/rtl8139: skip automatic zero-init of large array
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (17 preceding siblings ...)
2025-06-10 12:36 ` [PATCH 18/31] hw/misc/aspeed_hace: " Daniel P. Berrangé
@ 2025-06-10 12:36 ` Daniel P. Berrangé
2025-06-10 12:36 ` [PATCH 20/31] hw/net/tulip: " Daniel P. Berrangé
` (14 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:36 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'rtl8139_transmit_one' method has an 8k byte array used for
copying data between guest and host. Skip the automatic zero-init
of this array to eliminate the performance overhead in the I/O
hot path.
The 'txbuffer' will be fully initialized when reading PCI DMA
buffers.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
hw/net/rtl8139.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/net/rtl8139.c b/hw/net/rtl8139.c
index 15b8f7501a..654a087d80 100644
--- a/hw/net/rtl8139.c
+++ b/hw/net/rtl8139.c
@@ -1816,7 +1816,7 @@ static int rtl8139_transmit_one(RTL8139State *s, int descriptor)
PCIDevice *d = PCI_DEVICE(s);
int txsize = s->TxStatus[descriptor] & 0x1fff;
- uint8_t txbuffer[0x2000];
+ QEMU_UNINITIALIZED uint8_t txbuffer[0x2000];
DPRINTF("+++ transmit reading %d bytes from host memory at 0x%08x\n",
txsize, s->TxAddr[descriptor]);
--
2.49.0
* [PATCH 20/31] hw/net/tulip: skip automatic zero-init of large array
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (18 preceding siblings ...)
2025-06-10 12:36 ` [PATCH 19/31] hw/net/rtl8139: " Daniel P. Berrangé
@ 2025-06-10 12:36 ` Daniel P. Berrangé
2025-06-10 12:36 ` [PATCH 21/31] hw/net/virtio-net: skip automatic zero-init of large arrays Daniel P. Berrangé
` (13 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:36 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'tulip_setup_frame' method has a 4k byte array used for copying
DMA data from the device. Skip the automatic zero-init of this array
to eliminate the performance overhead in the I/O hot path.
The 'buf' array will be fully initialized when reading data from the
device.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
hw/net/tulip.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/net/tulip.c b/hw/net/tulip.c
index 63fe513458..319af906c8 100644
--- a/hw/net/tulip.c
+++ b/hw/net/tulip.c
@@ -629,7 +629,7 @@ static void tulip_setup_filter_addr(TULIPState *s, uint8_t *buf, int n)
static void tulip_setup_frame(TULIPState *s,
struct tulip_descriptor *desc)
{
- uint8_t buf[4096];
+ QEMU_UNINITIALIZED uint8_t buf[4096];
int len = (desc->control >> TDES1_BUF1_SIZE_SHIFT) & TDES1_BUF1_SIZE_MASK;
int i;
--
2.49.0
* [PATCH 21/31] hw/net/virtio-net: skip automatic zero-init of large arrays
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (19 preceding siblings ...)
2025-06-10 12:36 ` [PATCH 20/31] hw/net/tulip: " Daniel P. Berrangé
@ 2025-06-10 12:36 ` Daniel P. Berrangé
2025-06-10 12:37 ` [PATCH 22/31] hw/net/xgmac: skip automatic zero-init of large array Daniel P. Berrangé
` (12 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:36 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'virtio_net_receive_rcu' method has three arrays with
VIRTQUEUE_MAX_SIZE elements, which are approximately 32k in
size, used for copying data between guest and host. Skip the
automatic zero-init of these arrays to eliminate the
performance overhead in the I/O hot path.
The three arrays will be selectively initialized as required
when processing network buffers.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
hw/net/virtio-net.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 221252e00a..eb93607b8c 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -1911,9 +1911,9 @@ static ssize_t virtio_net_receive_rcu(NetClientState *nc, const uint8_t *buf,
VirtIONet *n = qemu_get_nic_opaque(nc);
VirtIONetQueue *q;
VirtIODevice *vdev = VIRTIO_DEVICE(n);
- VirtQueueElement *elems[VIRTQUEUE_MAX_SIZE];
- size_t lens[VIRTQUEUE_MAX_SIZE];
- struct iovec mhdr_sg[VIRTQUEUE_MAX_SIZE];
+ QEMU_UNINITIALIZED VirtQueueElement *elems[VIRTQUEUE_MAX_SIZE];
+ QEMU_UNINITIALIZED size_t lens[VIRTQUEUE_MAX_SIZE];
+ QEMU_UNINITIALIZED struct iovec mhdr_sg[VIRTQUEUE_MAX_SIZE];
struct virtio_net_hdr_v1_hash extra_hdr;
unsigned mhdr_cnt = 0;
size_t offset, i, guest_offset, j;
--
2.49.0
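The "selectively initialized as required" wording above can be pictured with a small sketch. It is not the virtio-net code: DEMO_MAX is a hypothetical stand-in for VIRTQUEUE_MAX_SIZE, demo_sum_nonzero() is invented for the example, and DEMO_UNINITIALIZED stands in for QEMU_UNINITIALIZED from PATCH 01/31. A counter tracks how many slots have been filled, and only those slots are read back, so the untouched tail of the array never needs a defined value.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: stand-in for QEMU_UNINITIALIZED, a no-op on older compilers. */
#if defined(__has_attribute)
# if __has_attribute(uninitialized)
#  define DEMO_UNINITIALIZED __attribute__((uninitialized))
# endif
#endif
#ifndef DEMO_UNINITIALIZED
# define DEMO_UNINITIALIZED
#endif

#define DEMO_MAX 1024   /* hypothetical upper bound on queue entries */

/* Hypothetical helper: gather the non-zero lengths, then add them up. */
size_t demo_sum_nonzero(const size_t *in, size_t count)
{
    DEMO_UNINITIALIZED size_t lens[DEMO_MAX];
    size_t used = 0;
    size_t total = 0;

    assert(count <= DEMO_MAX);

    for (size_t i = 0; i < count; i++) {
        if (in[i] != 0) {
            lens[used++] = in[i];   /* only slots [0, used) are written */
        }
    }
    for (size_t j = 0; j < used; j++) {
        total += lens[j];           /* and only those slots are read */
    }
    return total;
}
```

The safety argument rests entirely on the 'used' counter: any later change that reads lens[] beyond 'used' would read indeterminate data once the zero-init is skipped.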
* [PATCH 22/31] hw/net/xgmac: skip automatic zero-init of large array
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (20 preceding siblings ...)
2025-06-10 12:36 ` [PATCH 21/31] hw/net/virtio-net: skip automatic zero-init of large arrays Daniel P. Berrangé
@ 2025-06-10 12:37 ` Daniel P. Berrangé
2025-06-10 12:37 ` [PATCH 23/31] hw/nvme/ctrl: skip automatic zero-init of large arrays Daniel P. Berrangé
` (11 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:37 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'xgmac_enet_send' method has an 8k byte array used for copying
data between guest and host. Skip the automatic zero-init of this
array to eliminate the performance overhead in the I/O hot path.
The 'frame' buffer will be fully initialized when reading guest
memory to fetch the data to send.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
hw/net/xgmac.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/net/xgmac.c b/hw/net/xgmac.c
index 9c87c4e70f..d45f872467 100644
--- a/hw/net/xgmac.c
+++ b/hw/net/xgmac.c
@@ -207,7 +207,7 @@ static void xgmac_enet_send(XgmacState *s)
struct desc bd;
int frame_size;
int len;
- uint8_t frame[8192];
+ QEMU_UNINITIALIZED uint8_t frame[8192];
uint8_t *ptr;
ptr = frame;
--
2.49.0
* [PATCH 23/31] hw/nvme/ctrl: skip automatic zero-init of large arrays
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (21 preceding siblings ...)
2025-06-10 12:37 ` [PATCH 22/31] hw/net/xgmac: skip automatic zero-init of large array Daniel P. Berrangé
@ 2025-06-10 12:37 ` Daniel P. Berrangé
2025-06-11 8:55 ` Klaus Jensen
2025-06-10 12:37 ` [PATCH 24/31] hw/ppc/pnv_occ: skip automatic zero-init of large struct Daniel P. Berrangé
` (10 subsequent siblings)
33 siblings, 1 reply; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:37 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'nvme_map_sgl' method has a 256 element array used for copying
data from the device. Skip the automatic zero-init of this array
to eliminate the performance overhead in the I/O hot path.
The 'segment' array will be fully initialized when reading data from
the device.
The 'nvme_changed_nslist' method has a 4k byte array that is manually
initialized with memset(). The compiler ought to be intelligent
enough to turn the memset() into a static initialization operation,
and thus not duplicate the automatic zero-init. Replacing memset()
with '{}' makes it unambiguous that the array is statically initialized.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
hw/nvme/ctrl.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index fd935507bc..220002830d 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -1057,7 +1057,8 @@ static uint16_t nvme_map_sgl(NvmeCtrl *n, NvmeSg *sg, NvmeSglDescriptor sgl,
*/
#define SEG_CHUNK_SIZE 256
- NvmeSglDescriptor segment[SEG_CHUNK_SIZE], *sgld, *last_sgld;
+ QEMU_UNINITIALIZED NvmeSglDescriptor segment[SEG_CHUNK_SIZE];
+ NvmeSglDescriptor *sgld, *last_sgld;
uint64_t nsgld;
uint32_t seg_len;
uint16_t status;
@@ -5128,7 +5129,7 @@ static uint16_t nvme_error_info(NvmeCtrl *n, uint8_t rae, uint32_t buf_len,
static uint16_t nvme_changed_nslist(NvmeCtrl *n, uint8_t rae, uint32_t buf_len,
uint64_t off, NvmeRequest *req)
{
- uint32_t nslist[1024];
+ uint32_t nslist[1024] = {};
uint32_t trans_len;
int i = 0;
uint32_t nsid;
@@ -5138,7 +5139,6 @@ static uint16_t nvme_changed_nslist(NvmeCtrl *n, uint8_t rae, uint32_t buf_len,
return NVME_INVALID_FIELD | NVME_DNR;
}
- memset(nslist, 0x0, sizeof(nslist));
trans_len = MIN(sizeof(nslist) - off, buf_len);
while ((nsid = find_first_bit(n->changed_nsids, NVME_CHANGED_NSID_SIZE)) !=
--
2.49.0
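To make the memset() versus initializer point concrete, here is a small sketch showing that both spellings leave the array all-zero. The function names are hypothetical and this is not the nvme code. The patch writes '{}', which GCC and Clang accept and which C23 standardizes; the sketch uses '{0}', the equivalent spelling that is valid in every C version.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_ENTRIES 1024

/* Returns 1 if all 'n' entries are zero, 0 otherwise. */
static int demo_all_zero(const uint32_t *p, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (p[i] != 0) {
            return 0;
        }
    }
    return 1;
}

/* Old style: declare, then clear explicitly with memset(). */
int demo_memset_is_zero(void)
{
    uint32_t nslist[DEMO_ENTRIES];

    memset(nslist, 0, sizeof(nslist));
    return demo_all_zero(nslist, DEMO_ENTRIES);
}

/* New style: zero-initialize at the declaration. */
int demo_brace_is_zero(void)
{
    uint32_t nslist[DEMO_ENTRIES] = {0};

    return demo_all_zero(nslist, DEMO_ENTRIES);
}
```

With the initializer form the zeroing is part of the declaration itself, so there is a single, explicit initialization rather than an automatic one followed by a manual one.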
* [PATCH 24/31] hw/ppc/pnv_occ: skip automatic zero-init of large struct
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (22 preceding siblings ...)
2025-06-10 12:37 ` [PATCH 23/31] hw/nvme/ctrl: skip automatic zero-init of large arrays Daniel P. Berrangé
@ 2025-06-10 12:37 ` Daniel P. Berrangé
2025-06-11 9:09 ` Harsh Prateek Bora
2025-06-10 12:37 ` [PATCH 25/31] hw/ppc/spapr_tpm_proxy: skip automatic zero-init of large arrays Daniel P. Berrangé
` (9 subsequent siblings)
33 siblings, 1 reply; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:37 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'occ_model_tick' method has a 12k struct used for copying
data between guest and host. Skip the automatic zero-init of this
struct to eliminate the performance overhead in the I/O hot path.
The 'dynamic_data' buffer will be fully initialized when reading
data from the guest.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
hw/ppc/pnv_occ.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/ppc/pnv_occ.c b/hw/ppc/pnv_occ.c
index fa6f31cb8d..24b789c191 100644
--- a/hw/ppc/pnv_occ.c
+++ b/hw/ppc/pnv_occ.c
@@ -789,7 +789,7 @@ static bool occ_opal_process_command(PnvOCC *occ,
static bool occ_model_tick(PnvOCC *occ)
{
- struct occ_dynamic_data dynamic_data;
+ QEMU_UNINITIALIZED struct occ_dynamic_data dynamic_data;
if (!occ_read_dynamic_data(occ, &dynamic_data, NULL)) {
/* Can't move OCC state field to safe because we can't map it! */
--
2.49.0
* [PATCH 25/31] hw/ppc/spapr_tpm_proxy: skip automatic zero-init of large arrays
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (23 preceding siblings ...)
2025-06-10 12:37 ` [PATCH 24/31] hw/ppc/pnv_occ: skip automatic zero-init of large struct Daniel P. Berrangé
@ 2025-06-10 12:37 ` Daniel P. Berrangé
2025-06-11 9:20 ` Harsh Prateek Bora
2025-06-10 12:37 ` [PATCH 26/31] hw/usb/hcd-ohci: skip automatic zero-init of large array Daniel P. Berrangé
` (8 subsequent siblings)
33 siblings, 1 reply; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:37 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'tpm_execute' method has a pair of 4k arrays used for copying
data between guest and host. Skip the automatic zero-init of these
arrays to eliminate the performance overhead in the I/O hot path.
The two arrays will be fully initialized when reading data from
guest memory or reading data from the proxy FD.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
hw/ppc/spapr_tpm_proxy.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/hw/ppc/spapr_tpm_proxy.c b/hw/ppc/spapr_tpm_proxy.c
index 862eeaa50a..1297b3ad56 100644
--- a/hw/ppc/spapr_tpm_proxy.c
+++ b/hw/ppc/spapr_tpm_proxy.c
@@ -41,8 +41,8 @@ static ssize_t tpm_execute(SpaprTpmProxy *tpm_proxy, target_ulong *args)
target_ulong data_in_size = args[2];
uint64_t data_out = ppc64_phys_to_real(args[3]);
target_ulong data_out_size = args[4];
- uint8_t buf_in[TPM_SPAPR_BUFSIZE];
- uint8_t buf_out[TPM_SPAPR_BUFSIZE];
+ QEMU_UNINITIALIZED uint8_t buf_in[TPM_SPAPR_BUFSIZE];
+ QEMU_UNINITIALIZED uint8_t buf_out[TPM_SPAPR_BUFSIZE];
ssize_t ret;
trace_spapr_tpm_execute(data_in, data_in_size, data_out, data_out_size);
--
2.49.0
* [PATCH 26/31] hw/usb/hcd-ohci: skip automatic zero-init of large array
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (24 preceding siblings ...)
2025-06-10 12:37 ` [PATCH 25/31] hw/ppc/spapr_tpm_proxy: skip automatic zero-init of large arrays Daniel P. Berrangé
@ 2025-06-10 12:37 ` Daniel P. Berrangé
2025-06-10 12:37 ` [PATCH 27/31] hw/scsi/lsi53c895a: " Daniel P. Berrangé
` (7 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:37 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'ohci_service_iso_td' method has an 8k byte array used for copying
data between guest and host. Skip the automatic zero-init of this
array to eliminate the performance overhead in the I/O hot path.
The 'buf' array will be fully initialized when reading data from guest
memory.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
hw/usb/hcd-ohci.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/usb/hcd-ohci.c b/hw/usb/hcd-ohci.c
index 71b54914d3..72a9f9f474 100644
--- a/hw/usb/hcd-ohci.c
+++ b/hw/usb/hcd-ohci.c
@@ -577,7 +577,7 @@ static int ohci_service_iso_td(OHCIState *ohci, struct ohci_ed *ed)
USBDevice *dev;
USBEndpoint *ep;
USBPacket *pkt;
- uint8_t buf[8192];
+ QEMU_UNINITIALIZED uint8_t buf[8192];
bool int_req;
struct ohci_iso_td iso_td;
uint32_t addr;
--
2.49.0
* [PATCH 27/31] hw/scsi/lsi53c895a: skip automatic zero-init of large array
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (25 preceding siblings ...)
2025-06-10 12:37 ` [PATCH 26/31] hw/usb/hcd-ohci: skip automatic zero-init of large array Daniel P. Berrangé
@ 2025-06-10 12:37 ` Daniel P. Berrangé
2025-06-10 12:37 ` [PATCH 28/31] hw/scsi/megasas: skip automatic zero-init of large arrays Daniel P. Berrangé
` (6 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:37 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'lsi_memcpy' method has a 4k byte array used for copying data
to/from the device. Skip the automatic zero-init of this array to
eliminate the performance overhead in the I/O hot path.
The 'buf' array will be fully initialized when data is copied.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
hw/scsi/lsi53c895a.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/scsi/lsi53c895a.c b/hw/scsi/lsi53c895a.c
index f4f2ef321e..9ea4aa0a85 100644
--- a/hw/scsi/lsi53c895a.c
+++ b/hw/scsi/lsi53c895a.c
@@ -1112,7 +1112,7 @@ bad:
static void lsi_memcpy(LSIState *s, uint32_t dest, uint32_t src, int count)
{
int n;
- uint8_t buf[LSI_BUF_SIZE];
+ QEMU_UNINITIALIZED uint8_t buf[LSI_BUF_SIZE];
trace_lsi_memcpy(dest, src, count);
while (count) {
--
2.49.0
* [PATCH 28/31] hw/scsi/megasas: skip automatic zero-init of large arrays
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (26 preceding siblings ...)
2025-06-10 12:37 ` [PATCH 27/31] hw/scsi/lsi53c895a: " Daniel P. Berrangé
@ 2025-06-10 12:37 ` Daniel P. Berrangé
2025-06-10 12:37 ` [PATCH 29/31] hw/ufs/lu: skip automatic zero-init of large array Daniel P. Berrangé
` (5 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:37 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'megasas_dcmd_pd_get_list' and 'megasas_dcmd_get_properties'
methods have 4k structs used for copying data from the device.
Avoid zeroing these structs twice to eliminate the performance
overhead in the I/O hot path.
The 'info' structs are manually initialized with memset(). The
compiler ought to be intelligent enough to turn the memset()
into a static initialization operation, and thus not duplicate
the automatic zero-init. Replacing memset() with '{}' makes it
unambiguous that the structs are statically initialized.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
hw/scsi/megasas.c | 7 ++-----
1 file changed, 2 insertions(+), 5 deletions(-)
diff --git a/hw/scsi/megasas.c b/hw/scsi/megasas.c
index 55cd188bd5..844643d916 100644
--- a/hw/scsi/megasas.c
+++ b/hw/scsi/megasas.c
@@ -981,13 +981,11 @@ static int megasas_event_wait(MegasasState *s, MegasasCmd *cmd)
static int megasas_dcmd_pd_get_list(MegasasState *s, MegasasCmd *cmd)
{
- struct mfi_pd_list info;
- size_t dcmd_size = sizeof(info);
+ struct mfi_pd_list info = {};
BusChild *kid;
uint32_t offset, dcmd_limit, num_pd_disks = 0, max_pd_disks;
dma_addr_t residual;
- memset(&info, 0, dcmd_size);
offset = 8;
dcmd_limit = offset + sizeof(struct mfi_pd_address);
if (cmd->iov_size < dcmd_limit) {
@@ -1429,11 +1427,10 @@ static int megasas_dcmd_cfg_read(MegasasState *s, MegasasCmd *cmd)
static int megasas_dcmd_get_properties(MegasasState *s, MegasasCmd *cmd)
{
- struct mfi_ctrl_props info;
+ struct mfi_ctrl_props info = {};
size_t dcmd_size = sizeof(info);
dma_addr_t residual;
- memset(&info, 0x0, dcmd_size);
if (cmd->iov_size < dcmd_size) {
trace_megasas_dcmd_invalid_xfer_len(cmd->index, cmd->iov_size,
dcmd_size);
--
2.49.0
* [PATCH 29/31] hw/ufs/lu: skip automatic zero-init of large array
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (27 preceding siblings ...)
2025-06-10 12:37 ` [PATCH 28/31] hw/scsi/megasas: skip automatic zero-init of large arrays Daniel P. Berrangé
@ 2025-06-10 12:37 ` Daniel P. Berrangé
2025-06-10 12:37 ` [PATCH 30/31] net/socket: " Daniel P. Berrangé
` (4 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:37 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'ufs_emulate_scsi_cmd' method has a 4k byte array used for
copying data from the device. Skip the automatic zero-init of
this array to eliminate the performance overhead in the I/O hot
path.
The 'outbuf' array will be fully initialized when data is copied
from the guest.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
hw/ufs/lu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/ufs/lu.c b/hw/ufs/lu.c
index 57b307ea56..2d8ffd72c5 100644
--- a/hw/ufs/lu.c
+++ b/hw/ufs/lu.c
@@ -194,7 +194,7 @@ static int ufs_emulate_wlun_inquiry(UfsRequest *req, uint8_t *outbuf,
static UfsReqResult ufs_emulate_scsi_cmd(UfsLu *lu, UfsRequest *req)
{
uint8_t lun = lu->lun;
- uint8_t outbuf[4096];
+ QEMU_UNINITIALIZED uint8_t outbuf[4096];
uint8_t sense_buf[UFS_SENSE_SIZE];
uint8_t scsi_status;
int len = 0;
--
2.49.0
* [PATCH 30/31] net/socket: skip automatic zero-init of large array
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (28 preceding siblings ...)
2025-06-10 12:37 ` [PATCH 29/31] hw/ufs/lu: skip automatic zero-init of large array Daniel P. Berrangé
@ 2025-06-10 12:37 ` Daniel P. Berrangé
2025-06-10 12:37 ` [PATCH 31/31] net/stream: " Daniel P. Berrangé
` (3 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:37 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'net_socket_send' method has a 68k byte array used for copying
data between guest and host. Skip the automatic zero-init of this
array to eliminate the performance overhead in the I/O hot path.
The 'buf1' array will be fully initialized when reading data off
the network socket.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
net/socket.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/socket.c b/net/socket.c
index 8e3702e1f3..784dda686f 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -157,7 +157,7 @@ static void net_socket_send(void *opaque)
NetSocketState *s = opaque;
int size;
int ret;
- uint8_t buf1[NET_BUFSIZE];
+ QEMU_UNINITIALIZED uint8_t buf1[NET_BUFSIZE];
const uint8_t *buf;
size = recv(s->fd, buf1, sizeof(buf1), 0);
--
2.49.0
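A note on what "fully initialized when reading data off the network socket" means in practice, shown as a sketch under assumptions rather than the net/socket code. A receive call fills only the first 'size' bytes of the buffer, so the safety of skipping zero-init depends on the caller never reading past that length. DEMO_BUFSIZE, demo_read_and_sum() and demo_pipe_roundtrip() are hypothetical, DEMO_UNINITIALIZED stands in for QEMU_UNINITIALIZED from PATCH 01/31, and a POSIX pipe stands in for the socket so the example is self-contained.

```c
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumption: stand-in for QEMU_UNINITIALIZED, a no-op on older compilers. */
#if defined(__has_attribute)
# if __has_attribute(uninitialized)
#  define DEMO_UNINITIALIZED __attribute__((uninitialized))
# endif
#endif
#ifndef DEMO_UNINITIALIZED
# define DEMO_UNINITIALIZED
#endif

#define DEMO_BUFSIZE (68 * 1024)   /* hypothetical, matches the 68k in the text */

/* Read once from 'fd' and sum the bytes received; returns -1 on error or EOF. */
long demo_read_and_sum(int fd)
{
    DEMO_UNINITIALIZED unsigned char buf1[DEMO_BUFSIZE];
    ssize_t size = read(fd, buf1, sizeof(buf1));
    long sum = 0;

    if (size <= 0) {
        return -1;
    }
    /* Only buf1[0..size) was written by read(), so only that range is used. */
    for (ssize_t i = 0; i < size; i++) {
        sum += buf1[i];
    }
    return sum;
}

/* Small driver: push four bytes through a pipe and read them back. */
long demo_pipe_roundtrip(void)
{
    static const unsigned char msg[] = {1, 2, 3, 4};
    int fds[2];
    long sum = -1;

    if (pipe(fds) != 0) {
        return -1;
    }
    if (write(fds[1], msg, sizeof(msg)) == (ssize_t)sizeof(msg)) {
        sum = demo_read_and_sum(fds[0]);
    }
    close(fds[0]);
    close(fds[1]);
    return sum;
}
```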
* [PATCH 31/31] net/stream: skip automatic zero-init of large array
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (29 preceding siblings ...)
2025-06-10 12:37 ` [PATCH 30/31] net/socket: " Daniel P. Berrangé
@ 2025-06-10 12:37 ` Daniel P. Berrangé
2025-06-10 12:49 ` [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Philippe Mathieu-Daudé
` (2 subsequent siblings)
33 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:37 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf, Daniel P. Berrangé
The 'net_stream_send' method has a 68k byte array used for copying
data between guest and host. Skip the automatic zero-init of this
array to eliminate the performance overhead in the I/O hot path.
The 'buf1' array will be fully initialized when reading data off
the network socket.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
net/stream.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/stream.c b/net/stream.c
index 4de5613844..6152d2a05e 100644
--- a/net/stream.c
+++ b/net/stream.c
@@ -148,7 +148,7 @@ static gboolean net_stream_send(QIOChannel *ioc,
NetStreamState *s = data;
int size;
int ret;
- char buf1[NET_BUFSIZE];
+ QEMU_UNINITIALIZED char buf1[NET_BUFSIZE];
const char *buf;
size = qio_channel_read(s->ioc, buf1, sizeof(buf1), NULL);
--
2.49.0
* Re: [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (30 preceding siblings ...)
2025-06-10 12:37 ` [PATCH 31/31] net/stream: " Daniel P. Berrangé
@ 2025-06-10 12:49 ` Philippe Mathieu-Daudé
2025-06-10 12:56 ` Daniel P. Berrangé
2025-06-10 14:04 ` Stefan Hajnoczi
2025-06-11 19:19 ` Stefan Hajnoczi
33 siblings, 1 reply; 42+ messages in thread
From: Philippe Mathieu-Daudé @ 2025-06-10 12:49 UTC (permalink / raw)
To: Daniel P. Berrangé, qemu-devel
Cc: Michael S. Tsirkin, Peter Maydell, qemu-block, Stefan Hajnoczi,
Kevin Wolf
On 10/6/25 14:36, Daniel P. Berrangé wrote:
> This series is an extension of Stefan's proposal:
>
> https://lists.nongnu.org/archive/html/qemu-devel/2025-06/msg00736.html
>
> It used '-Wframe-larger-than=4096' to identify all code locations
> with more than 4k on the stack. Any locations in the I/O paths
> were chosen to avoid automatic zero-init, to eliminate the performance
> overhead of the automatic initialization.
Should we eventually add it to our default CFLAGS?
> Note, although all the changed locations are I/O paths, this does not
> imply that the changes make a measurement performance difference in
> every case.
>
> This is because many of the emulated devices are likely limited by the
> emulation impl, rather than any implicit memory zero'ing overhead of
> stack data.
>
> None the less the memory zero'ing is still a redundant CPU burn in
> all these cases, so I felt it worth setting the general precedent
> that any data over 4k on a stack in a device I/O path should be
> skipping zero-init.
>
> I did reasonable review in each case to identify that the data was
> indeed initialized explicitly later in the method.
* Re: [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths
2025-06-10 12:49 ` [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Philippe Mathieu-Daudé
@ 2025-06-10 12:56 ` Daniel P. Berrangé
2025-06-10 15:00 ` Philippe Mathieu-Daudé
0 siblings, 1 reply; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 12:56 UTC (permalink / raw)
To: Philippe Mathieu-Daudé
Cc: qemu-devel, Michael S. Tsirkin, Peter Maydell, qemu-block,
Stefan Hajnoczi, Kevin Wolf
On Tue, Jun 10, 2025 at 02:49:02PM +0200, Philippe Mathieu-Daudé wrote:
> On 10/6/25 14:36, Daniel P. Berrangé wrote:
> > This series is an extension of Stefan's proposal:
> >
> > https://lists.nongnu.org/archive/html/qemu-devel/2025-06/msg00736.html
> >
> > It used '-Wframe-larger-than=4096' to identify all code locations
> > with more than 4k on the stack. Any locations in the I/O paths
> > were chosen to avoid automatic zero-init, to eliminate the performance
> > overhead of the automatic initialization.
>
> Should we eventually add it to our default CFLAGS?
You mean as a way to impose a hard limit on stack frame size?
FWIW, the VNC server is the biggest at about 100 KB, but we have a
large number of other big frames. Here's a list of frame sizes, sorted
largest first, with how often each occurs (count, then size in bytes):
1 103696
2 69664
46 49184
4 42608
23 32880
23 32864
1 32832
1 28688
36 24800
10 24784
1 22240
21 21760
1 20624
1 17536
2 17168
1 16592
1 16576
1 16544
1 16512
2 16480
2 16464
1 16448
4 16432
1 15136
1 12960
1 12336
1 12048
1 10064
5 9184
1 8832
1 8784
1 8576
1 8448
2 8400
1 8384
2 8304
1 8272
4 8256
1 8240
4 8224
7 8208
2 7120
1 6800
1 5856
1 4928
2 4560
1 4464
1 4336
2 4256
1 4240
2 4224
4 4208
9 4176
9 4160
40 4144
68 4128
9 4112
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
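For readers who want to build a tally like the one above, the sketch
below pulls the byte count out of a single compiler warning line. It
assumes the GCC-style wording "the frame size of N bytes is larger than
M bytes". The names sketch_parse_frame_size() and sketch_over_limit()
are made up for illustration and are not part of QEMU, and this is not
necessarily how the list in this mail was produced.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: extract N from a line containing
 * "frame size of N bytes" (GCC-style -Wframe-larger-than output).
 * Returns -1 when the line does not contain that text.
 */
long sketch_parse_frame_size(const char *line)
{
    static const char marker[] = "frame size of ";
    const char *p = strstr(line, marker);
    long size;

    if (p == NULL) {
        return -1;
    }
    /* Skip past the marker and read the decimal byte count. */
    if (sscanf(p + sizeof(marker) - 1, "%ld", &size) != 1) {
        return -1;
    }
    return size;
}

/*
 * Hypothetical helper: returns 1 when the line reports a frame larger
 * than 'limit' bytes, 0 otherwise (including lines that do not match).
 */
int sketch_over_limit(const char *line, long limit)
{
    return sketch_parse_frame_size(line) > limit;
}
```

Running every warning line of a build log through
sketch_parse_frame_size() and counting equal results would give a
size-versus-frequency list, and sketch_over_limit() shows how a chosen
cap could be checked against the same lines.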
* Re: [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (31 preceding siblings ...)
2025-06-10 12:49 ` [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Philippe Mathieu-Daudé
@ 2025-06-10 14:04 ` Stefan Hajnoczi
2025-06-11 19:19 ` Stefan Hajnoczi
33 siblings, 0 replies; 42+ messages in thread
From: Stefan Hajnoczi @ 2025-06-10 14:04 UTC (permalink / raw)
To: Daniel P. Berrangé
Cc: qemu-devel, Philippe Mathieu-Daudé, Michael S. Tsirkin,
Peter Maydell, qemu-block, Kevin Wolf
On Tue, Jun 10, 2025 at 01:36:38PM +0100, Daniel P. Berrangé wrote:
> This series is an extension of Stefan's proposal:
>
> https://lists.nongnu.org/archive/html/qemu-devel/2025-06/msg00736.html
>
> It used '-Wframe-larger-than=4096' to identify all code locations
> with more than 4k on the stack. Any locations in the I/O paths
> were chosen to avoid automatic zero-init, to eliminate the performance
> overhead of the automatic initialization.
>
> Note, although all the changed locations are I/O paths, this does not
> imply that the changes make a measurable performance difference in
> every case.
>
> This is because many of the emulated devices are likely limited by the
> emulation impl, rather than any implicit memory zero'ing overhead of
> stack data.
>
> Nonetheless the memory zero'ing is still a redundant CPU burn in
> all these cases, so I felt it worth setting the general precedent
> that any data over 4k on a stack in a device I/O path should be
> skipping zero-init.
>
> I did a reasonable review in each case to identify that the data was
> indeed initialized explicitly later in the method.
>
> Daniel P. Berrangé (29):
> block: skip automatic zero-init of large array in ioq_submit
> chardev/char-fd: skip automatic zero-init of large array
> chardev/char-pty: skip automatic zero-init of large array
> chardev/char-socket: skip automatic zero-init of large array
> hw/audio/ac97: skip automatic zero-init of large arrays
> hw/audio/cs4231a: skip automatic zero-init of large arrays
> hw/audio/es1370: skip automatic zero-init of large array
> hw/audio/gus: skip automatic zero-init of large array
> hw/audio/marvell_88w8618: skip automatic zero-init of large array
> hw/audio/sb16: skip automatic zero-init of large array
> hw/audio/via-ac97: skip automatic zero-init of large array
> hw/char/sclpconsole-lm: skip automatic zero-init of large array
> hw/dma/xlnx_csu_dma: skip automatic zero-init of large array
> hw/display/vmware_vga: skip automatic zero-init of large struct
> hw/hyperv/syndbg: skip automatic zero-init of large array
> hw/misc/aspeed_hace: skip automatic zero-init of large array
> hw/net/rtl8139: skip automatic zero-init of large array
> hw/net/tulip: skip automatic zero-init of large array
> hw/net/virtio-net: skip automatic zero-init of large arrays
> hw/net/xgmac: skip automatic zero-init of large array
> hw/nvme/ctrl: skip automatic zero-init of large arrays
> hw/ppc/pnv_occ: skip automatic zero-init of large struct
> hw/ppc/spapr_tpm_proxy: skip automatic zero-init of large arrays
> hw/usb/hcd-ohci: skip automatic zero-init of large array
> hw/scsi/lsi53c895a: skip automatic zero-init of large array
> hw/scsi/megasas: skip automatic zero-init of large arrays
> hw/ufs/lu: skip automatic zero-init of large array
> net/socket: skip automatic zero-init of large array
> net/stream: skip automatic zero-init of large array
>
> Stefan Hajnoczi (2):
> include/qemu/compiler: add QEMU_UNINITIALIZED attribute macro
> hw/virtio/virtio: avoid cost of -ftrivial-auto-var-init in hot path
>
> block/linux-aio.c | 2 +-
> chardev/char-fd.c | 2 +-
> chardev/char-pty.c | 2 +-
> chardev/char-socket.c | 2 +-
> hw/audio/ac97.c | 4 ++--
> hw/audio/cs4231a.c | 4 ++--
> hw/audio/es1370.c | 2 +-
> hw/audio/gus.c | 2 +-
> hw/audio/marvell_88w8618.c | 2 +-
> hw/audio/sb16.c | 2 +-
> hw/audio/via-ac97.c | 2 +-
> hw/char/sclpconsole-lm.c | 2 +-
> hw/display/vmware_vga.c | 2 +-
> hw/dma/xlnx_csu_dma.c | 2 +-
> hw/hyperv/syndbg.c | 2 +-
> hw/misc/aspeed_hace.c | 2 +-
> hw/net/rtl8139.c | 2 +-
> hw/net/tulip.c | 2 +-
> hw/net/virtio-net.c | 6 +++---
> hw/net/xgmac.c | 2 +-
> hw/nvme/ctrl.c | 6 +++---
> hw/ppc/pnv_occ.c | 2 +-
> hw/ppc/spapr_tpm_proxy.c | 4 ++--
> hw/scsi/lsi53c895a.c | 2 +-
> hw/scsi/megasas.c | 7 ++-----
> hw/ufs/lu.c | 2 +-
> hw/usb/hcd-ohci.c | 2 +-
> hw/virtio/virtio.c | 8 ++++----
> include/qemu/compiler.h | 20 ++++++++++++++++++++
> net/socket.c | 2 +-
> net/stream.c | 2 +-
> 31 files changed, 61 insertions(+), 44 deletions(-)
>
> --
> 2.49.0
>
I will merge this series but want to give others a chance to review it
first:
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
* Re: [PATCH 11/31] hw/audio/gus: skip automatic zero-init of large array
2025-06-10 12:36 ` [PATCH 11/31] " Daniel P. Berrangé
@ 2025-06-10 14:23 ` Daniel P. Berrangé
0 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 14:23 UTC (permalink / raw)
To: qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf
Mistake in $SUBJECT - should be 'hw/audio/marvell_88w8618:' prefix
On Tue, Jun 10, 2025 at 01:36:49PM +0100, Daniel P. Berrangé wrote:
> The 'mv88w8618_audio_callback' method has a 4k byte array used for
> copying data between the audio backend and device. Skip the automatic
> zero-init of this array to eliminate the performance overhead in
> the I/O hot path.
>
> The 'buf' array will be fully initialized when reading data from
> device memory.
>
> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> ---
> hw/audio/marvell_88w8618.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/hw/audio/marvell_88w8618.c b/hw/audio/marvell_88w8618.c
> index 6d3ebbb0c8..c5c79d083a 100644
> --- a/hw/audio/marvell_88w8618.c
> +++ b/hw/audio/marvell_88w8618.c
> @@ -66,7 +66,7 @@ static void mv88w8618_audio_callback(void *opaque, int free_out, int free_in)
> {
> mv88w8618_audio_state *s = opaque;
> int16_t *codec_buffer;
> - int8_t buf[4096];
> + QEMU_UNINITIALIZED int8_t buf[4096];
> int8_t *mem_buffer;
> int pos, block_size;
>
> --
> 2.49.0
>
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
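As a rough illustration of the pattern in the patch above, the sketch
below marks a 4k stack buffer so that the compiler may skip automatic
zero-init, then fills the bytes it uses before reading them. The macro
SKETCH_UNINITIALIZED is a hypothetical stand-in: it assumes that
QEMU_UNINITIALIZED expands to the compiler's 'uninitialized' variable
attribute where that attribute exists, which this thread does not
show. The function sketch_checksum() is likewise invented for the
example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Older compilers may lack __has_attribute; treat attributes as absent. */
#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

/*
 * Hypothetical stand-in for QEMU_UNINITIALIZED (an assumption, not the
 * definition from include/qemu/compiler.h). Where supported, the
 * attribute asks the compiler not to apply -ftrivial-auto-var-init
 * to the variable; elsewhere it expands to nothing.
 */
#if __has_attribute(uninitialized)
# define SKETCH_UNINITIALIZED __attribute__((uninitialized))
#else
# define SKETCH_UNINITIALIZED
#endif

#define SKETCH_BUF_SIZE 4096

/* Sums up to SKETCH_BUF_SIZE bytes of 'src' after staging them on the stack. */
uint32_t sketch_checksum(const uint8_t *src, size_t len)
{
    SKETCH_UNINITIALIZED uint8_t buf[SKETCH_BUF_SIZE];
    uint32_t sum = 0;
    size_t i;

    if (len > sizeof(buf)) {
        len = sizeof(buf);
    }
    /* Only bytes [0, len) are written, so only those may be read. */
    memcpy(buf, src, len);
    for (i = 0; i < len; i++) {
        sum += buf[i];
    }
    return sum;
}
```

The attribute removes the safety net, so the review burden moves to the
author: every byte that is read must have been written first, which is
the property the series says was checked for each changed location.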
* Re: [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths
2025-06-10 12:56 ` Daniel P. Berrangé
@ 2025-06-10 15:00 ` Philippe Mathieu-Daudé
2025-06-10 15:56 ` Daniel P. Berrangé
0 siblings, 1 reply; 42+ messages in thread
From: Philippe Mathieu-Daudé @ 2025-06-10 15:00 UTC (permalink / raw)
To: Daniel P. Berrangé
Cc: qemu-devel, Michael S. Tsirkin, Peter Maydell, qemu-block,
Stefan Hajnoczi, Kevin Wolf
On 10/6/25 14:56, Daniel P. Berrangé wrote:
> On Tue, Jun 10, 2025 at 02:49:02PM +0200, Philippe Mathieu-Daudé wrote:
>> On 10/6/25 14:36, Daniel P. Berrangé wrote:
>>> This series is an extension of Stefan's proposal:
>>>
>>> https://lists.nongnu.org/archive/html/qemu-devel/2025-06/msg00736.html
>>>
>>> It used '-Wframe-larger-than=4096' to identify all code locations
>>> with more than 4k on the stack. Any locations in the I/O paths
>>> were chosen to avoid automatic zero-init, to eliminate the performance
>>> overhead of the automatic initialization.
>>
>> Should we eventually add it to our default CFLAGS?
>
> You mean as a way to impose a hard limit on stack frame size?
Yes, we could use -Wframe-larger-than=50000 for this release, then
aim to reduce it by 5k to 10k every release until we reach some
reasonable limit.
> FWIW, the VNC server is the biggest at about 100 KB, but we have a
> large number of other big frames. Here's a list of frame sizes, sorted
> largest first, with how often each occurs (count, then size in bytes):
>
> 1 103696
> 2 69664
> 46 49184
> 4 42608
> 23 32880
> 23 32864
* Re: [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths
2025-06-10 15:00 ` Philippe Mathieu-Daudé
@ 2025-06-10 15:56 ` Daniel P. Berrangé
0 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Berrangé @ 2025-06-10 15:56 UTC (permalink / raw)
To: Philippe Mathieu-Daudé
Cc: qemu-devel, Michael S. Tsirkin, Peter Maydell, qemu-block,
Stefan Hajnoczi, Kevin Wolf
On Tue, Jun 10, 2025 at 05:00:43PM +0200, Philippe Mathieu-Daudé wrote:
> On 10/6/25 14:56, Daniel P. Berrangé wrote:
> > On Tue, Jun 10, 2025 at 02:49:02PM +0200, Philippe Mathieu-Daudé wrote:
> > > On 10/6/25 14:36, Daniel P. Berrangé wrote:
> > > > This series is an extension of Stefan's proposal:
> > > >
> > > > https://lists.nongnu.org/archive/html/qemu-devel/2025-06/msg00736.html
> > > >
> > > > It used '-Wframe-larger-than=4096' to identify all code locations
> > > > with more than 4k on the stack. Any locations in the I/O paths
> > > > were chosen to avoid automatic zero-init, to eliminate the performance
> > > > overhead of the automatic initialization.
> > >
> > > Should we eventually add it to our default CFLAGS?
> >
> > You mean as a way to impose a hard limit on stack frame size?
>
> Yes, we could use -Wframe-larger-than=50000 for this release, then
> aim to reduce it by 5k to 10k every release until we reach some
> reasonable limit.
I guess the key thing is to define approximately what our target
should be?
There are some large stack frames in QEMU device I/O paths that are
not unreasonable to have, because they avoid the need to malloc in the
I/O path. A great many devices have 4k or 8k buffers for copying
data around, and some bigger ones are fairly valid to have too.
On the flip side there's still way too much use of PATH_MAX that
is gratuitous and should be removed, but it's hard to enforce the
latter without negatively impacting the former.
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
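To make the PATH_MAX point concrete, here is a minimal sketch of
building a path in a heap string sized to fit, rather than in a
'char buf[PATH_MAX]' on the stack. It is only meant for code outside
the I/O hot path, where an allocation is acceptable. The function
sketch_join_path() is hypothetical and uses plain libc so that it is
self-contained; QEMU code would more likely reach for a GLib helper
such as g_strdup_printf().

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: joins 'dir' and 'name' with a '/' into a newly
 * allocated string that is exactly as large as needed.
 * Returns NULL on failure; the caller frees the result with free().
 */
char *sketch_join_path(const char *dir, const char *name)
{
    /* First pass: ask snprintf how many characters are required. */
    int len = snprintf(NULL, 0, "%s/%s", dir, name);
    char *path;

    if (len < 0) {
        return NULL;
    }
    path = malloc((size_t)len + 1);
    if (path == NULL) {
        return NULL;
    }
    /* Second pass: format into the right-sized buffer. */
    snprintf(path, (size_t)len + 1, "%s/%s", dir, name);
    return path;
}
```

The stack frame of such a function stays small whatever PATH_MAX is on
the host, so it would not show up under -Wframe-larger-than, while
fixed 4k or 8k I/O buffers can stay on the stack where avoiding malloc
matters.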
* Re: [PATCH 23/31] hw/nvme/ctrl: skip automatic zero-init of large arrays
2025-06-10 12:37 ` [PATCH 23/31] hw/nvme/ctrl: skip automatic zero-init of large arrays Daniel P. Berrangé
@ 2025-06-11 8:55 ` Klaus Jensen
0 siblings, 0 replies; 42+ messages in thread
From: Klaus Jensen @ 2025-06-11 8:55 UTC (permalink / raw)
To: Daniel P. Berrangé
Cc: qemu-devel, Philippe Mathieu-Daudé, Michael S. Tsirkin,
Peter Maydell, qemu-block, Stefan Hajnoczi, Kevin Wolf
On Jun 10 13:37, Daniel P. Berrangé wrote:
> The 'nvme_map_sgl' method has a 256 element array used for copying
> data from the device. Skip the automatic zero-init of this array
> to eliminate the performance overhead in the I/O hot path.
>
> The 'segment' array will be fully initialized when reading data from
> the device.
>
> The 'nvme_changed_nslist' method has a 4k byte array that is manually
> initialized with memset(). The compiler ought to be intelligent
> enough to turn the memset() into a static initialization operation,
> and thus not duplicate the automatic zero-init. Replacing memset()
> with '{}' makes it unambiguous that the array is statically initialized.
>
> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> ---
> hw/nvme/ctrl.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
> index fd935507bc..220002830d 100644
> --- a/hw/nvme/ctrl.c
> +++ b/hw/nvme/ctrl.c
> @@ -1057,7 +1057,8 @@ static uint16_t nvme_map_sgl(NvmeCtrl *n, NvmeSg *sg, NvmeSglDescriptor sgl,
> */
> #define SEG_CHUNK_SIZE 256
>
> - NvmeSglDescriptor segment[SEG_CHUNK_SIZE], *sgld, *last_sgld;
> + QEMU_UNINITIALIZED NvmeSglDescriptor segment[SEG_CHUNK_SIZE];
> + NvmeSglDescriptor *sgld, *last_sgld;
> uint64_t nsgld;
> uint32_t seg_len;
> uint16_t status;
> @@ -5128,7 +5129,7 @@ static uint16_t nvme_error_info(NvmeCtrl *n, uint8_t rae, uint32_t buf_len,
> static uint16_t nvme_changed_nslist(NvmeCtrl *n, uint8_t rae, uint32_t buf_len,
> uint64_t off, NvmeRequest *req)
> {
> - uint32_t nslist[1024];
> + uint32_t nslist[1024] = {};
> uint32_t trans_len;
> int i = 0;
> uint32_t nsid;
> @@ -5138,7 +5139,6 @@ static uint16_t nvme_changed_nslist(NvmeCtrl *n, uint8_t rae, uint32_t buf_len,
> return NVME_INVALID_FIELD | NVME_DNR;
> }
>
> - memset(nslist, 0x0, sizeof(nslist));
> trans_len = MIN(sizeof(nslist) - off, buf_len);
>
> while ((nsid = find_first_bit(n->changed_nsids, NVME_CHANGED_NSID_SIZE)) !=
> --
> 2.49.0
>
>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
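The second hunk of the patch above swaps a memset() for an initializer.
A small sketch of the two spellings, showing that both leave the array
fully zeroed: the function names are invented for illustration, and the
sketch writes '{ 0 }' rather than the patch's '{}' only because the
empty form is a GNU extension before C23, although GCC and Clang accept
both.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_NSLIST_LEN 1024

/* Returns 1 if all 'n' elements of 'list' are zero, 0 otherwise. */
static int sketch_all_zero(const uint32_t *list, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (list[i] != 0) {
            return 0;
        }
    }
    return 1;
}

/* "Before" shape: plain declaration, then an explicit memset(). */
int sketch_zero_with_memset(void)
{
    uint32_t nslist[SKETCH_NSLIST_LEN];

    memset(nslist, 0x0, sizeof(nslist));
    return sketch_all_zero(nslist, SKETCH_NSLIST_LEN);
}

/* "After" shape: the initializer zeroes every element at declaration. */
int sketch_zero_with_initializer(void)
{
    uint32_t nslist[SKETCH_NSLIST_LEN] = { 0 };

    return sketch_all_zero(nslist, SKETCH_NSLIST_LEN);
}
```

With the initializer form there is a single, visible point where the
array becomes zero, so a reader does not have to reason about whether
an automatic zero-init and a later memset() both run.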
* Re: [PATCH 24/31] hw/ppc/pnv_occ: skip automatic zero-init of large struct
2025-06-10 12:37 ` [PATCH 24/31] hw/ppc/pnv_occ: skip automatic zero-init of large struct Daniel P. Berrangé
@ 2025-06-11 9:09 ` Harsh Prateek Bora
0 siblings, 0 replies; 42+ messages in thread
From: Harsh Prateek Bora @ 2025-06-11 9:09 UTC (permalink / raw)
To: Daniel P. Berrangé, qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf
On 6/10/25 18:07, Daniel P. Berrangé wrote:
> The 'occ_model_tick' method has a 12k struct used for copying
> data between guest and host. Skip the automatic zero-init of this
> struct to eliminate the performance overhead in the I/O hot path.
>
> The 'dynamic_data' buffer will be fully initialized when reading
> data from the guest.
>
> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> ---
> hw/ppc/pnv_occ.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/hw/ppc/pnv_occ.c b/hw/ppc/pnv_occ.c
> index fa6f31cb8d..24b789c191 100644
> --- a/hw/ppc/pnv_occ.c
> +++ b/hw/ppc/pnv_occ.c
> @@ -789,7 +789,7 @@ static bool occ_opal_process_command(PnvOCC *occ,
>
> static bool occ_model_tick(PnvOCC *occ)
> {
> - struct occ_dynamic_data dynamic_data;
> + QEMU_UNINITIALIZED struct occ_dynamic_data dynamic_data;
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
>
> if (!occ_read_dynamic_data(occ, &dynamic_data, NULL)) {
> /* Can't move OCC state field to safe because we can't map it! */
* Re: [PATCH 25/31] hw/ppc/spapr_tpm_proxy: skip automatic zero-init of large arrays
2025-06-10 12:37 ` [PATCH 25/31] hw/ppc/spapr_tpm_proxy: skip automatic zero-init of large arrays Daniel P. Berrangé
@ 2025-06-11 9:20 ` Harsh Prateek Bora
0 siblings, 0 replies; 42+ messages in thread
From: Harsh Prateek Bora @ 2025-06-11 9:20 UTC (permalink / raw)
To: Daniel P. Berrangé, qemu-devel
Cc: Philippe Mathieu-Daudé, Michael S. Tsirkin, Peter Maydell,
qemu-block, Stefan Hajnoczi, Kevin Wolf
On 6/10/25 18:07, Daniel P. Berrangé wrote:
> The 'tpm_execute' method has a pair of 4k arrays used for copying
> data between guest and host. Skip the automatic zero-init of these
> arrays to eliminate the performance overhead in the I/O hot path.
>
> The two arrays will be fully initialized when reading data from
> guest memory or reading data from the proxy FD.
>
> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> ---
> hw/ppc/spapr_tpm_proxy.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/hw/ppc/spapr_tpm_proxy.c b/hw/ppc/spapr_tpm_proxy.c
> index 862eeaa50a..1297b3ad56 100644
> --- a/hw/ppc/spapr_tpm_proxy.c
> +++ b/hw/ppc/spapr_tpm_proxy.c
> @@ -41,8 +41,8 @@ static ssize_t tpm_execute(SpaprTpmProxy *tpm_proxy, target_ulong *args)
> target_ulong data_in_size = args[2];
> uint64_t data_out = ppc64_phys_to_real(args[3]);
> target_ulong data_out_size = args[4];
> - uint8_t buf_in[TPM_SPAPR_BUFSIZE];
> - uint8_t buf_out[TPM_SPAPR_BUFSIZE];
> + QEMU_UNINITIALIZED uint8_t buf_in[TPM_SPAPR_BUFSIZE];
> + QEMU_UNINITIALIZED uint8_t buf_out[TPM_SPAPR_BUFSIZE];
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
> ssize_t ret;
>
> trace_spapr_tpm_execute(data_in, data_in_size, data_out, data_out_size);
* Re: [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths
2025-06-10 12:36 [PATCH 00/31] Skip automatic zero-init of large arrays / structs in I/O paths Daniel P. Berrangé
` (32 preceding siblings ...)
2025-06-10 14:04 ` Stefan Hajnoczi
@ 2025-06-11 19:19 ` Stefan Hajnoczi
33 siblings, 0 replies; 42+ messages in thread
From: Stefan Hajnoczi @ 2025-06-11 19:19 UTC (permalink / raw)
To: Daniel P. Berrangé
Cc: qemu-devel, Philippe Mathieu-Daudé, Michael S. Tsirkin,
Peter Maydell, qemu-block, Kevin Wolf
On Tue, Jun 10, 2025 at 01:36:38PM +0100, Daniel P. Berrangé wrote:
> This series is an extension of Stefan's proposal:
>
> https://lists.nongnu.org/archive/html/qemu-devel/2025-06/msg00736.html
>
> It used '-Wframe-larger-than=4096' to identify all code locations
> with more than 4k on the stack. Any locations in the I/O paths
> were chosen to avoid automatic zero-init, to eliminate the performance
> overhead of the automatic initialization.
>
> Note, although all the changed locations are I/O paths, this does not
> imply that the changes make a measurable performance difference in
> every case.
>
> This is because many of the emulated devices are likely limited by the
> emulation impl, rather than any implicit memory zero'ing overhead of
> stack data.
>
> Nonetheless the memory zero'ing is still a redundant CPU burn in
> all these cases, so I felt it worth setting the general precedent
> that any data over 4k on a stack in a device I/O path should be
> skipping zero-init.
>
> I did a reasonable review in each case to identify that the data was
> indeed initialized explicitly later in the method.
>
> Daniel P. Berrangé (29):
> block: skip automatic zero-init of large array in ioq_submit
> chardev/char-fd: skip automatic zero-init of large array
> chardev/char-pty: skip automatic zero-init of large array
> chardev/char-socket: skip automatic zero-init of large array
> hw/audio/ac97: skip automatic zero-init of large arrays
> hw/audio/cs4231a: skip automatic zero-init of large arrays
> hw/audio/es1370: skip automatic zero-init of large array
> hw/audio/gus: skip automatic zero-init of large array
> hw/audio/marvell_88w8618: skip automatic zero-init of large array
> hw/audio/sb16: skip automatic zero-init of large array
> hw/audio/via-ac97: skip automatic zero-init of large array
> hw/char/sclpconsole-lm: skip automatic zero-init of large array
> hw/dma/xlnx_csu_dma: skip automatic zero-init of large array
> hw/display/vmware_vga: skip automatic zero-init of large struct
> hw/hyperv/syndbg: skip automatic zero-init of large array
> hw/misc/aspeed_hace: skip automatic zero-init of large array
> hw/net/rtl8139: skip automatic zero-init of large array
> hw/net/tulip: skip automatic zero-init of large array
> hw/net/virtio-net: skip automatic zero-init of large arrays
> hw/net/xgmac: skip automatic zero-init of large array
> hw/nvme/ctrl: skip automatic zero-init of large arrays
> hw/ppc/pnv_occ: skip automatic zero-init of large struct
> hw/ppc/spapr_tpm_proxy: skip automatic zero-init of large arrays
> hw/usb/hcd-ohci: skip automatic zero-init of large array
> hw/scsi/lsi53c895a: skip automatic zero-init of large array
> hw/scsi/megasas: skip automatic zero-init of large arrays
> hw/ufs/lu: skip automatic zero-init of large array
> net/socket: skip automatic zero-init of large array
> net/stream: skip automatic zero-init of large array
>
> Stefan Hajnoczi (2):
> include/qemu/compiler: add QEMU_UNINITIALIZED attribute macro
> hw/virtio/virtio: avoid cost of -ftrivial-auto-var-init in hot path
>
> block/linux-aio.c | 2 +-
> chardev/char-fd.c | 2 +-
> chardev/char-pty.c | 2 +-
> chardev/char-socket.c | 2 +-
> hw/audio/ac97.c | 4 ++--
> hw/audio/cs4231a.c | 4 ++--
> hw/audio/es1370.c | 2 +-
> hw/audio/gus.c | 2 +-
> hw/audio/marvell_88w8618.c | 2 +-
> hw/audio/sb16.c | 2 +-
> hw/audio/via-ac97.c | 2 +-
> hw/char/sclpconsole-lm.c | 2 +-
> hw/display/vmware_vga.c | 2 +-
> hw/dma/xlnx_csu_dma.c | 2 +-
> hw/hyperv/syndbg.c | 2 +-
> hw/misc/aspeed_hace.c | 2 +-
> hw/net/rtl8139.c | 2 +-
> hw/net/tulip.c | 2 +-
> hw/net/virtio-net.c | 6 +++---
> hw/net/xgmac.c | 2 +-
> hw/nvme/ctrl.c | 6 +++---
> hw/ppc/pnv_occ.c | 2 +-
> hw/ppc/spapr_tpm_proxy.c | 4 ++--
> hw/scsi/lsi53c895a.c | 2 +-
> hw/scsi/megasas.c | 7 ++-----
> hw/ufs/lu.c | 2 +-
> hw/usb/hcd-ohci.c | 2 +-
> hw/virtio/virtio.c | 8 ++++----
> include/qemu/compiler.h | 20 ++++++++++++++++++++
> net/socket.c | 2 +-
> net/stream.c | 2 +-
> 31 files changed, 61 insertions(+), 44 deletions(-)
>
> --
> 2.49.0
>
Thanks, applied to my block tree:
https://gitlab.com/stefanha/qemu/commits/block
Stefan