[PATCH v5 0/3] improve aio-polling efficiency

* [PATCH v5 0/3] improve aio-polling efficiency
@ 2026-04-23 19:59 Jaehoon Kim
  2026-04-23 19:59 ` [PATCH v5 1/3] aio-poll: avoid unnecessary polling time computation Jaehoon Kim
                   ` (3 more replies)
  0 siblings, 4 replies; 6+ messages in thread
From: Jaehoon Kim @ 2026-04-23 19:59 UTC (permalink / raw)
  To: qemu-devel, qemu-block
  Cc: pbonzini, stefanha, fam, armbru, eblake, berrange, eduardo, dave,
	sw, mjrosato, farman, Jaehoon Kim

Dear all,

This is v5 of the patch series that refines the adaptive polling logic
in aio_poll() for better CPU efficiency.

v1: https://lore.kernel.org/qemu-devel/20260113174824.464720-1-jhkim@linux.ibm.com/
v2: https://lore.kernel.org/qemu-devel/20260323135451.579655-1-jhkim@linux.ibm.com/
v3: https://lore.kernel.org/qemu-devel/20260405200735.3075407-1-jhkim@linux.ibm.com/
v4: https://lore.kernel.org/qemu-devel/20260412215011.326196-1-jhkim@linux.ibm.com/

Changes in v5:
- Patch 3/3: Fixed QAPI documentation based on review feedback:
  * qapi/misc.json: Removed the sentence about returning poll-weight=0
    since query-iothreads never returns 0 for this field.
  * qemu-options.hx: Enhanced poll-weight parameter documentation to
    match the detail level in qom.json, including information about
    default value and typical value examples.

Changes in v4:
- Patch 2/3: Added detailed validation tables showing poll.ns statistics
  across different poll_weight values (1-5) for SSD randread/randwrite
  workloads to demonstrate algorithm behavior and justify poll_weight=3
  as the optimal default.

- Patch 3/3: Fixed commit message to correctly reference
  adjust_polling_time() instead of the removed grow_polling_time()
  and shrink_polling_time() functions from v2.

Changes in v3:
- Patch 1/3: Removed timeout check in aio_poll() as suggested by
  Stefan Hajnoczi.

- Patch 2/3: Major refactoring based on review feedback:
  * Removed has_event from the AioHandler structure and renamed
    poll_idle_timeout to last_dispatch_timestamp, which is now used to
    identify active handlers.
  * Merged grow_polling_time() and shrink_polling_time() into a single
    adjust_polling_time() function to simplify code review, with no
    functional changes.
  * Renamed adjust_block_ns() to update_handler_poll_times()
  * Modified remove_idle_poll_handlers() to use last_dispatch_timestamp
    directly instead of checking poll_idle_timeout
  * Updated commit message

- Patch 3/3: Enhanced parameter handling:
  * Moved IOTHREAD_POLL_*_DEFAULT constants to iothread.h header
  * Added validation for poll-weight range [0, 63] in iothread.c
  * Added divide-by-zero protection in aio_context_set_poll_params()
  * Updated QAPI version from 10.2 to 11.1
  * Enhanced qom.json documentation for poll-weight values

This series reduces CPU usage in aio_poll adaptive polling by ~10%
at a throughput cost of about 2%. Tested on s390x with various
workloads.

Testing details:

Initial testing (Fedora 42, 16 virtio-blk devices, FCP multipath):
 - Throughput: -3% to -8% (1 iothread), -2% to -5% (2 iothreads)
 - CPU usage: -10% to -25% (1 iothread), -7% to -12% (2 iothreads)

Additional validation (RHEL 10.1 + QEMU 10.0.0, FCP/FICON, 1-8 iothreads):
 - Throughput: -2.2% (weight=3), -2.4% (weight=2)
 - CPU usage: -9.4% (weight=3), -10.9% (weight=2)

Weight=3 selected for slightly better throughput while maintaining
substantial CPU savings.

Best regards,
Jaehoon Kim

Jaehoon Kim (3):
  aio-poll: avoid unnecessary polling time computation
  aio-poll: refine iothread polling using weighted handler intervals
  qapi/iothread: introduce poll-weight parameter for aio-poll

 include/qemu/aio.h                |   7 +-
 include/system/iothread.h         |  18 ++++
 iothread.c                        |  47 +++++++---
 monitor/hmp-cmds.c                |   1 +
 qapi/misc.json                    |   6 ++
 qapi/qom.json                     |  10 +-
 qemu-options.hx                   |   8 +-
 tests/unit/test-nested-aio-poll.c |   2 +-
 util/aio-posix.c                  | 148 ++++++++++++++++++------------
 util/aio-posix.h                  |   2 +-
 util/aio-win32.c                  |   3 +-
 util/async.c                      |   2 +
 12 files changed, 176 insertions(+), 78 deletions(-)

-- 
2.50.1




* [PATCH v5 1/3] aio-poll: avoid unnecessary polling time computation
  2026-04-23 19:59 [PATCH v5 0/3] improve aio-polling efficiency Jaehoon Kim
@ 2026-04-23 19:59 ` Jaehoon Kim
  2026-04-23 19:59 ` [PATCH v5 2/3] aio-poll: refine iothread polling using weighted handler intervals Jaehoon Kim
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 6+ messages in thread
From: Jaehoon Kim @ 2026-04-23 19:59 UTC (permalink / raw)
  To: qemu-devel, qemu-block
  Cc: pbonzini, stefanha, fam, armbru, eblake, berrange, eduardo, dave,
	sw, mjrosato, farman, Jaehoon Kim

Nodes are no longer added to poll_aio_handlers when adaptive polling is
disabled (ctx->poll_max_ns == 0), and aio_poll() skips try_poll_mode()
in that case. This avoids iterating over all nodes to compute max_ns
when polling is disabled.

Signed-off-by: Jaehoon Kim <jhkim@linux.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 util/aio-posix.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/util/aio-posix.c b/util/aio-posix.c
index 488d964611..351847c6fb 100644
--- a/util/aio-posix.c
+++ b/util/aio-posix.c
@@ -307,9 +307,8 @@ static bool aio_dispatch_handler(AioContext *ctx, AioHandler *node)
      * fdmon_supports_polling(), but only until the fd fires for the first
      * time.
      */
-    if (!QLIST_IS_INSERTED(node, node_deleted) &&
-        !QLIST_IS_INSERTED(node, node_poll) &&
-        node->io_poll) {
+    if (ctx->poll_max_ns && !QLIST_IS_INSERTED(node, node_deleted) &&
+        !QLIST_IS_INSERTED(node, node_poll) && node->io_poll) {
         trace_poll_add(ctx, node, node->pfd.fd, revents);
         if (ctx->poll_started && node->io_poll_begin) {
             node->io_poll_begin(node->opaque);
@@ -631,7 +630,7 @@ static void adjust_polling_time(AioContext *ctx, AioPolledEvent *poll,
 bool aio_poll(AioContext *ctx, bool blocking)
 {
     AioHandlerList ready_list = QLIST_HEAD_INITIALIZER(ready_list);
-    bool progress;
+    bool progress = false;
     bool use_notify_me;
     int64_t timeout;
     int64_t start = 0;
@@ -656,7 +655,9 @@ bool aio_poll(AioContext *ctx, bool blocking)
     }
 
     timeout = blocking ? aio_compute_timeout(ctx) : 0;
-    progress = try_poll_mode(ctx, &ready_list, &timeout);
+    if (ctx->poll_max_ns != 0) {
+        progress = try_poll_mode(ctx, &ready_list, &timeout);
+    }
     assert(!(timeout && progress));
 
     /*
-- 
2.50.1




* [PATCH v5 2/3] aio-poll: refine iothread polling using weighted handler intervals
  2026-04-23 19:59 [PATCH v5 0/3] improve aio-polling efficiency Jaehoon Kim
  2026-04-23 19:59 ` [PATCH v5 1/3] aio-poll: avoid unnecessary polling time computation Jaehoon Kim
@ 2026-04-23 19:59 ` Jaehoon Kim
  2026-04-23 19:59 ` [PATCH v5 3/3] qapi/iothread: introduce poll-weight parameter for aio-poll Jaehoon Kim
  2026-04-29 18:20 ` [PATCH v5 0/3] improve aio-polling efficiency Stefan Hajnoczi
  3 siblings, 0 replies; 6+ messages in thread
From: Jaehoon Kim @ 2026-04-23 19:59 UTC (permalink / raw)
  To: qemu-devel, qemu-block
  Cc: pbonzini, stefanha, fam, armbru, eblake, berrange, eduardo, dave,
	sw, mjrosato, farman, Jaehoon Kim

Improve adaptive polling by updating each AioHandler's poll.ns
every loop iteration using weighted averages. This reduces CPU
consumption while minimizing performance impact.

Background:
QEMU 10.0 introduced a per-handler poll.ns to mitigate the excessive
fluctuations in IOThread polling times observed in earlier versions
(QEMU 9.x). However, the current design has limitations:

1. poll.ns is updated only when an event occurs, making it
   difficult to treat block_ns as a reliable event interval.
2. The IOThread's next polling time is determined by the maximum
   poll.ns among all AioHandlers, meaning idle AioHandlers with
   high poll.ns can have an outsized impact on polling duration.
3. For io_uring, idle AioHandlers are cleared after
   POLL_IDLE_INTERVAL_NS (7s), but for ppoll/epoll there is no
   such mechanism, leading to increased CPU consumption from idle
   nodes.

Implementation:
This patch treats block_ns as an event interval and updates each
AioHandler's poll.ns in every loop iteration:

- Active handlers (with events): poll.ns is updated using a
  weighted average of the current block_ns and previous poll.ns,
  smoothing out adjustments and preventing excessive fluctuations.
- Inactive handlers (no events): poll.ns accumulates block_ns
  without weighting, allowing rapid isolation of idle nodes. When
  poll.ns exceeds poll_max_ns, it resets to 0, preventing
  sporadically active handlers from unnecessarily prolonging
  iothread polling.
- The iothread polling duration is set based on the largest poll.ns
  among active handlers. The shrink divider defaults to 2, matching
  the grow rate, to reduce frequent poll_ns resets for slow devices.
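
As a rough illustration of the two per-handler rules above (not the
patch code itself), the following standalone sketch mirrors the
arithmetic. The names update_active(), update_inactive() and
WEIGHT_SHIFT are made up for this example; only the shift-based average
and the reset-to-0 rule come from the description.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for POLL_WEIGHT_SHIFT (3 means 1/8 weight). */
#define WEIGHT_SHIFT 3

/* Active handler: blend the new interval into the running estimate. */
static int64_t update_active(int64_t poll_ns, int64_t block_ns,
                             int64_t max_ns)
{
    assert(poll_ns >= 0 && block_ns >= 0);

    if (poll_ns == 0) {
        /* No history yet: start from the observed interval. */
        poll_ns = block_ns;
    } else {
        poll_ns = poll_ns - (poll_ns >> WEIGHT_SHIFT)
                  + (block_ns >> WEIGHT_SHIFT);
    }
    /* An estimate above the cap is reset rather than clamped. */
    return poll_ns > max_ns ? 0 : poll_ns;
}

/* Inactive handler: accumulate unweighted so idle nodes drop out fast. */
static int64_t update_inactive(int64_t poll_ns, int64_t block_ns,
                               int64_t max_ns)
{
    assert(poll_ns >= 0 && block_ns >= 0);

    if (poll_ns == 0) {
        return 0; /* already reset, stays 0 until the next event */
    }
    poll_ns += block_ns;
    return poll_ns > max_ns ? 0 : poll_ns;
}
```

For example, with a shift of 3 an active handler at 8000 ns that sees a
16000 ns interval moves to 9000 ns, while an inactive handler at
30000 ns that waits another 5000 ns exceeds a 32768 ns cap and resets
to 0.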

The implementation renames poll_idle_timeout to last_dispatch_timestamp
for use as an active handler identifier.
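
A simplified sketch of how the timestamp can serve as the active-handler
marker and feed the grow/shrink step is shown below. It assumes a flat
array instead of the QLIST used in QEMU; the Handler struct and both
function names are hypothetical, and grow/shrink are assumed to be
non-zero (2 by default).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down handler record for illustration only. */
typedef struct {
    int64_t last_dispatch; /* timestamp of the last dispatch */
    int64_t poll_ns;       /* per-handler interval estimate */
} Handler;

/*
 * A handler counts as active when its timestamp equals the dispatch
 * time of the current iteration. Returns -1 if none was active.
 */
static int64_t max_active_poll_ns(const Handler *h, size_t n,
                                  int64_t dispatch_time)
{
    int64_t max_ns = -1;

    for (size_t i = 0; i < n; i++) {
        if (h[i].last_dispatch == dispatch_time && h[i].poll_ns > max_ns) {
            max_ns = h[i].poll_ns;
        }
    }
    return max_ns;
}

/* Move the context-wide polling time toward the target estimate. */
static int64_t adjust_poll_ns(int64_t poll_ns, int64_t target_ns,
                              int64_t max_ns, int64_t grow, int64_t shrink)
{
    assert(grow > 0 && shrink > 0);

    if (target_ns < poll_ns) {
        /* Shrink only when the target is well below the current time. */
        if (target_ns < poll_ns / shrink) {
            poll_ns /= shrink;
        }
    } else if (target_ns > poll_ns) {
        /* Jump straight to the target if one grow step is not enough. */
        if (target_ns > poll_ns * grow) {
            poll_ns = target_ns;
        } else {
            poll_ns *= grow;
        }
        if (poll_ns > max_ns) {
            poll_ns = max_ns;
        }
    }
    return poll_ns;
}
```

In this sketch the caller would skip adjust_poll_ns() when
max_active_poll_ns() returns -1, so an iteration without any active
handler leaves the polling time unchanged.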

Testing:
POLL_WEIGHT_SHIFT=3 (12.5% weight) was selected after comparing the
baseline against weight=2 and weight=3 across various workloads.

Performance results relative to baseline (RHEL 10.1 + QEMU 10.0.0,
FCP/FICON, 1-8 iothreads, numjobs 1/4/8 averaged):

                    | poll-weight=2      | poll-weight=3
--------------------|--------------------|-------------------
Throughput avg      | -2.4% (all tests)  | -2.2% (all tests)
CPU consumption avg | -10.9% (all tests) | -9.4% (all tests)

Both configurations achieve ~10% CPU reduction with minimal throughput
impact (~2%). Weight=3 is chosen as default for slightly better
throughput while maintaining substantial CPU savings.
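
For reference, the weight percentages follow from the shift-based
update applied to active handlers. Writing w for the shift value (a
symbol introduced here only for this note), the update and its
approximate real-valued form are:

```latex
\[
\begin{aligned}
\mathit{ns}_{\text{new}}
  &= \mathit{ns}_{\text{old}}
     - \left\lfloor \frac{\mathit{ns}_{\text{old}}}{2^{w}} \right\rfloor
     + \left\lfloor \frac{\mathit{block\_ns}}{2^{w}} \right\rfloor \\
  &\approx \left(1 - 2^{-w}\right)\mathit{ns}_{\text{old}}
     + 2^{-w}\,\mathit{block\_ns}
\end{aligned}
\]
\[
w = 2 \;\Rightarrow\; 2^{-2} = 25\%, \qquad
w = 3 \;\Rightarrow\; 2^{-3} = 12.5\%
\]
```

So weight=2 gives the most recent interval a 25% share of the new
estimate and weight=3 gives it 12.5%.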

Additional validation testing on s390x SSD with fio (bs=8k, iodepth=8,
numjobs=1) shows how poll_weight affects polling time (poll.ns)
behavior:

RandRead workload:
+-------------+-----------+-----------+-------------+-------------+
| poll_weight | #samples  | Mean (ns) | 50th % (ns) | 90th % (ns) |
+-------------+-----------+-----------+-------------+-------------+
| 1           | 4.79M     |  8,034    |  5,116      | 20,509      |
| 2           | 5.01M     | 12,584    | 11,078      | 24,693      |
| 3           | 5.01M     | 15,647    | 14,863      | 28,695      |
| 4           | 5.12M     | 16,430    | 15,556      | 30,848      |
| 5           | 5.14M     | 16,461    | 15,306      | 32,123      |
+-------------+-----------+-----------+-------------+-------------+

RandWrite workload:
+-------------+-----------+-----------+-------------+-------------+
| poll_weight | #samples  | Mean (ns) | 50th % (ns) | 90th % (ns) |
+-------------+-----------+-----------+-------------+-------------+
| 1           | 6.37M     |  2,049    |  1,262      |  4,301      |
| 2           | 7.46M     |  4,118    |  3,226      |  7,476      |
| 3           | 7.97M     |  7,034    |  5,984      | 11,645      |
| 4           | 7.96M     | 12,789    | 11,362      | 20,040      |
| 5           | 7.82M     | 22,992    | 20,644      | 32,768      |
+-------------+-----------+-----------+-------------+-------------+

Signed-off-by: Jaehoon Kim <jhkim@linux.ibm.com>
---
 include/qemu/aio.h |   3 +-
 util/aio-posix.c   | 130 ++++++++++++++++++++++++++++++---------------
 util/aio-posix.h   |   2 +-
 util/async.c       |   1 +
 4 files changed, 90 insertions(+), 46 deletions(-)

diff --git a/include/qemu/aio.h b/include/qemu/aio.h
index 8cca2360d1..6c22064a28 100644
--- a/include/qemu/aio.h
+++ b/include/qemu/aio.h
@@ -195,7 +195,7 @@ struct BHListSlice {
 typedef QSLIST_HEAD(, AioHandler) AioHandlerSList;
 
 typedef struct AioPolledEvent {
-    int64_t ns;        /* current polling time in nanoseconds */
+    int64_t ns;     /* estimated block time in nanoseconds */
 } AioPolledEvent;
 
 struct AioContext {
@@ -306,6 +306,7 @@ struct AioContext {
     int poll_disable_cnt;
 
     /* Polling mode parameters */
+    int64_t poll_ns;        /* current polling time in nanoseconds */
     int64_t poll_max_ns;    /* maximum polling time in nanoseconds */
     int64_t poll_grow;      /* polling time growth factor */
     int64_t poll_shrink;    /* polling time shrink factor */
diff --git a/util/aio-posix.c b/util/aio-posix.c
index 351847c6fb..8e9e9e5d8f 100644
--- a/util/aio-posix.c
+++ b/util/aio-posix.c
@@ -29,9 +29,11 @@
 
 /* Stop userspace polling on a handler if it isn't active for some time */
 #define POLL_IDLE_INTERVAL_NS (7 * NANOSECONDS_PER_SECOND)
+#define POLL_WEIGHT_SHIFT   (3)
 
-static void adjust_polling_time(AioContext *ctx, AioPolledEvent *poll,
-                                int64_t block_ns);
+static void update_handler_poll_times(AioContext *ctx, int64_t block_ns,
+                                      int64_t dispatch_time);
+static void adjust_polling_time(AioContext *ctx, int64_t block_ns);
 
 bool aio_poll_disabled(AioContext *ctx)
 {
@@ -359,7 +361,7 @@ static bool aio_dispatch_handler(AioContext *ctx, AioHandler *node)
 
 static bool aio_dispatch_ready_handlers(AioContext *ctx,
                                         AioHandlerList *ready_list,
-                                        int64_t block_ns)
+                                        int64_t dispatch_time)
 {
     bool progress = false;
     AioHandler *node;
@@ -369,11 +371,11 @@ static bool aio_dispatch_ready_handlers(AioContext *ctx,
         progress = aio_dispatch_handler(ctx, node) || progress;
 
         /*
-         * Adjust polling time only after aio_dispatch_handler(), which can
-         * add the handler to ctx->poll_aio_handlers.
+         * Update last_dispatch_timestamp to mark this as an active
+         * handler for polling time adjustment and prevent idle removal.
          */
         if (ctx->poll_max_ns && QLIST_IS_INSERTED(node, node_poll)) {
-            adjust_polling_time(ctx, &node->poll, block_ns);
+            node->last_dispatch_timestamp = dispatch_time;
         }
     }
 
@@ -394,7 +396,7 @@ void aio_dispatch(AioContext *ctx)
         ctx->fdmon_ops->dispatch(ctx);
     }
 
-    /* block_ns is 0 because polling is disabled in the glib event loop */
+    /* dispatch_time is 0 because polling is disabled in the glib event loop */
     aio_dispatch_ready_handlers(ctx, &ready_list, 0);
 
     aio_free_deleted_handlers(ctx);
@@ -415,9 +417,6 @@ static bool run_poll_handlers_once(AioContext *ctx,
     QLIST_FOREACH_SAFE(node, &ctx->poll_aio_handlers, node_poll, tmp) {
         if (node->io_poll(node->opaque)) {
             aio_add_poll_ready_handler(ready_list, node);
-
-            node->poll_idle_timeout = now + POLL_IDLE_INTERVAL_NS;
-
             /*
              * Polling was successful, exit try_poll_mode immediately
              * to adjust the next polling time.
@@ -458,11 +457,10 @@ static bool remove_idle_poll_handlers(AioContext *ctx,
     }
 
     QLIST_FOREACH_SAFE(node, &ctx->poll_aio_handlers, node_poll, tmp) {
-        if (node->poll_idle_timeout == 0LL) {
-            node->poll_idle_timeout = now + POLL_IDLE_INTERVAL_NS;
-        } else if (now >= node->poll_idle_timeout) {
+        if (node->poll_ready == false &&
+            now >= node->last_dispatch_timestamp + POLL_IDLE_INTERVAL_NS) {
             trace_poll_remove(ctx, node, node->pfd.fd);
-            node->poll_idle_timeout = 0LL;
+            node->last_dispatch_timestamp = 0LL;
             QLIST_SAFE_REMOVE(node, node_poll);
             if (ctx->poll_started && node->io_poll_end) {
                 node->io_poll_end(node->opaque);
@@ -560,18 +558,13 @@ static bool run_poll_handlers(AioContext *ctx, AioHandlerList *ready_list,
 static bool try_poll_mode(AioContext *ctx, AioHandlerList *ready_list,
                           int64_t *timeout)
 {
-    AioHandler *node;
     int64_t max_ns;
 
     if (QLIST_EMPTY_RCU(&ctx->poll_aio_handlers)) {
         return false;
     }
 
-    max_ns = 0;
-    QLIST_FOREACH(node, &ctx->poll_aio_handlers, node_poll) {
-        max_ns = MAX(max_ns, node->poll.ns);
-    }
-    max_ns = qemu_soonest_timeout(*timeout, max_ns);
+    max_ns = qemu_soonest_timeout(*timeout, ctx->poll_ns);
 
     if (max_ns && !ctx->fdmon_ops->need_wait(ctx)) {
         /*
@@ -587,43 +580,85 @@ static bool try_poll_mode(AioContext *ctx, AioHandlerList *ready_list,
     return false;
 }
 
-static void adjust_polling_time(AioContext *ctx, AioPolledEvent *poll,
-                                int64_t block_ns)
+static void adjust_polling_time(AioContext *ctx, int64_t block_ns)
 {
-    if (block_ns <= poll->ns) {
-        /* This is the sweet spot, no adjustment needed */
-    } else if (block_ns > ctx->poll_max_ns) {
-        /* We'd have to poll for too long, poll less */
-        int64_t old = poll->ns;
-
-        if (ctx->poll_shrink) {
-            poll->ns /= ctx->poll_shrink;
-        } else {
-            poll->ns = 0;
+    if (block_ns < ctx->poll_ns) {
+        int64_t old = ctx->poll_ns;
+        int64_t shrink = ctx->poll_shrink;
+
+        if (shrink == 0) {
+            shrink = 2;
+        }
+
+        if (block_ns < (ctx->poll_ns / shrink)) {
+            ctx->poll_ns /= shrink;
         }
 
-        trace_poll_shrink(ctx, old, poll->ns);
-    } else if (poll->ns < ctx->poll_max_ns &&
-               block_ns < ctx->poll_max_ns) {
+        trace_poll_shrink(ctx, old, ctx->poll_ns);
+    } else if (block_ns > ctx->poll_ns) {
         /* There is room to grow, poll longer */
-        int64_t old = poll->ns;
+        int64_t old = ctx->poll_ns;
         int64_t grow = ctx->poll_grow;
 
         if (grow == 0) {
             grow = 2;
         }
 
-        if (poll->ns) {
-            poll->ns *= grow;
+        if (block_ns > ctx->poll_ns * grow) {
+            ctx->poll_ns = block_ns;
         } else {
-            poll->ns = 4000; /* start polling at 4 microseconds */
+            ctx->poll_ns *= grow;
         }
 
-        if (poll->ns > ctx->poll_max_ns) {
-            poll->ns = ctx->poll_max_ns;
+        if (ctx->poll_ns > ctx->poll_max_ns) {
+            ctx->poll_ns = ctx->poll_max_ns;
         }
 
-        trace_poll_grow(ctx, old, poll->ns);
+        trace_poll_grow(ctx, old, ctx->poll_ns);
+    }
+}
+
+static void update_handler_poll_times(AioContext *ctx, int64_t block_ns,
+                                      int64_t dispatch_time)
+{
+    AioHandler *node;
+    int64_t max_poll_ns = -1;
+
+    QLIST_FOREACH(node, &ctx->poll_aio_handlers, node_poll) {
+        if (node->last_dispatch_timestamp == dispatch_time) {
+            /*
+             * Active handler: had an event in this aio_poll() call.
+             * Update poll.ns using a weighted average of the current
+             * block_ns and previous poll.ns to smooth adjustments.
+             */
+            node->poll.ns = node->poll.ns
+                ? (node->poll.ns - (node->poll.ns >> POLL_WEIGHT_SHIFT))
+                + (block_ns >> POLL_WEIGHT_SHIFT) : block_ns;
+
+            if (node->poll.ns > ctx->poll_max_ns) {
+                node->poll.ns = 0;
+            }
+            /*
+             * Track the maximum poll.ns among active handlers to
+             * calculate the next polling time.
+             */
+            max_poll_ns = MAX(max_poll_ns, node->poll.ns);
+        } else {
+            /*
+             * Inactive handler: no event in this aio_poll() call but
+             * was active before. Increase poll.ns by block_ns. If it
+             * exceeds poll_max_ns, reset to 0 until next event.
+             */
+            if (node->poll.ns != 0) {
+                node->poll.ns += block_ns;
+                if (node->poll.ns > ctx->poll_max_ns) {
+                    node->poll.ns = 0;
+                }
+            }
+        }
+    }
+    if (max_poll_ns >= 0) {
+        adjust_polling_time(ctx, max_poll_ns);
     }
 }
 
@@ -635,6 +670,7 @@ bool aio_poll(AioContext *ctx, bool blocking)
     int64_t timeout;
     int64_t start = 0;
     int64_t block_ns = 0;
+    int64_t dispatch_ns = 0;
 
     /*
      * There cannot be two concurrent aio_poll calls for the same AioContext (or
@@ -711,7 +747,8 @@ bool aio_poll(AioContext *ctx, bool blocking)
 
     /* Calculate blocked time for adaptive polling */
     if (ctx->poll_max_ns) {
-        block_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME) - start;
+        dispatch_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
+        block_ns = dispatch_ns - start;
     }
 
     if (ctx->fdmon_ops->dispatch) {
@@ -719,10 +756,14 @@ bool aio_poll(AioContext *ctx, bool blocking)
     }
 
     progress |= aio_bh_poll(ctx);
-    progress |= aio_dispatch_ready_handlers(ctx, &ready_list, block_ns);
+    progress |= aio_dispatch_ready_handlers(ctx, &ready_list, dispatch_ns);
 
     aio_free_deleted_handlers(ctx);
 
+    if (ctx->poll_max_ns) {
+        update_handler_poll_times(ctx, block_ns, dispatch_ns);
+    }
+
     qemu_lockcnt_dec(&ctx->list_lock);
 
     progress |= timerlistgroup_run_timers(&ctx->tlg);
@@ -794,6 +835,7 @@ void aio_context_set_poll_params(AioContext *ctx, int64_t max_ns,
     ctx->poll_max_ns = max_ns;
     ctx->poll_grow = grow;
     ctx->poll_shrink = shrink;
+    ctx->poll_ns = 0;
 
     aio_notify(ctx);
 }
diff --git a/util/aio-posix.h b/util/aio-posix.h
index ab894a3c0f..cd459bbbae 100644
--- a/util/aio-posix.h
+++ b/util/aio-posix.h
@@ -38,7 +38,7 @@ struct AioHandler {
     unsigned flags; /* see fdmon-io_uring.c */
     CqeHandler internal_cqe_handler; /* used for POLL_ADD/POLL_REMOVE */
 #endif
-    int64_t poll_idle_timeout; /* when to stop userspace polling */
+    int64_t last_dispatch_timestamp; /* when handler was last dispatched */
     bool poll_ready; /* has polling detected an event? */
     AioPolledEvent poll;
 };
diff --git a/util/async.c b/util/async.c
index 80d6b01a8a..9d3627566f 100644
--- a/util/async.c
+++ b/util/async.c
@@ -606,6 +606,7 @@ AioContext *aio_context_new(Error **errp)
     timerlistgroup_init(&ctx->tlg, aio_timerlist_notify, ctx);
 
     ctx->poll_max_ns = 0;
+    ctx->poll_ns = 0;
     ctx->poll_grow = 0;
     ctx->poll_shrink = 0;
 
-- 
2.50.1




* [PATCH v5 3/3] qapi/iothread: introduce poll-weight parameter for aio-poll
  2026-04-23 19:59 [PATCH v5 0/3] improve aio-polling efficiency Jaehoon Kim
  2026-04-23 19:59 ` [PATCH v5 1/3] aio-poll: avoid unnecessary polling time computation Jaehoon Kim
  2026-04-23 19:59 ` [PATCH v5 2/3] aio-poll: refine iothread polling using weighted handler intervals Jaehoon Kim
@ 2026-04-23 19:59 ` Jaehoon Kim
  2026-04-24  6:26   ` Markus Armbruster
  2026-04-29 18:20 ` [PATCH v5 0/3] improve aio-polling efficiency Stefan Hajnoczi
  3 siblings, 1 reply; 6+ messages in thread
From: Jaehoon Kim @ 2026-04-23 19:59 UTC (permalink / raw)
  To: qemu-devel, qemu-block
  Cc: pbonzini, stefanha, fam, armbru, eblake, berrange, eduardo, dave,
	sw, mjrosato, farman, Jaehoon Kim

Introduce a configurable poll-weight parameter for adaptive polling
in IOThread. This parameter replaces the hardcoded POLL_WEIGHT_SHIFT
constant, allowing runtime control over how much the most recent
event interval affects the next polling duration calculation.

The poll-weight parameter uses a shift value where larger values
decrease the weight of the current interval, enabling more gradual
adjustments. When set to 0, a default value of 3 is used (meaning
the current interval contributes approximately 1/8 to the weighted
average).

This patch also removes the hardcoded default value checks from
adjust_polling_time(). Instead, poll-grow, poll-shrink, and
poll-weight now use default values initialized in iothread.c
during IOThread creation.

Signed-off-by: Jaehoon Kim <jhkim@linux.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 include/qemu/aio.h                |  4 ++-
 include/system/iothread.h         | 18 ++++++++++++
 iothread.c                        | 47 ++++++++++++++++++++++---------
 monitor/hmp-cmds.c                |  1 +
 qapi/misc.json                    |  6 ++++
 qapi/qom.json                     | 10 ++++++-
 qemu-options.hx                   |  8 +++++-
 tests/unit/test-nested-aio-poll.c |  2 +-
 util/aio-posix.c                  | 37 +++++++++---------------
 util/aio-win32.c                  |  3 +-
 util/async.c                      |  1 +
 11 files changed, 95 insertions(+), 42 deletions(-)

diff --git a/include/qemu/aio.h b/include/qemu/aio.h
index 6c22064a28..e65e90093a 100644
--- a/include/qemu/aio.h
+++ b/include/qemu/aio.h
@@ -310,6 +310,7 @@ struct AioContext {
     int64_t poll_max_ns;    /* maximum polling time in nanoseconds */
     int64_t poll_grow;      /* polling time growth factor */
     int64_t poll_shrink;    /* polling time shrink factor */
+    int64_t poll_weight;    /* weight of current interval in calculation */
 
     /* AIO engine parameters */
     int64_t aio_max_batch;  /* maximum number of requests in a batch */
@@ -791,12 +792,13 @@ void aio_context_destroy(AioContext *ctx);
  * @max_ns: how long to busy poll for, in nanoseconds
  * @grow: polling time growth factor
  * @shrink: polling time shrink factor
+ * @weight: weight factor applied to the current polling interval
  *
  * Poll mode can be disabled by setting poll_max_ns to 0.
  */
 void aio_context_set_poll_params(AioContext *ctx, int64_t max_ns,
                                  int64_t grow, int64_t shrink,
-                                 Error **errp);
+                                 int64_t weight, Error **errp);
 
 /**
  * aio_context_set_aio_params:
diff --git a/include/system/iothread.h b/include/system/iothread.h
index e26d13c6c7..a1ef7696cb 100644
--- a/include/system/iothread.h
+++ b/include/system/iothread.h
@@ -21,6 +21,23 @@
 
 #define TYPE_IOTHREAD "iothread"
 
+#ifdef CONFIG_POSIX
+/*
+ * Benchmark results from 2016 on NVMe SSD drives show max polling times around
+ * 16-32 microseconds yield IOPS improvements for both iodepth=1 and iodepth=32
+ * workloads.
+ */
+#define IOTHREAD_POLL_MAX_NS_DEFAULT 32768ULL
+#define IOTHREAD_POLL_GROW_DEFAULT 2ULL
+#define IOTHREAD_POLL_SHRINK_DEFAULT 2ULL
+#define IOTHREAD_POLL_WEIGHT_DEFAULT 3ULL
+#else
+#define IOTHREAD_POLL_MAX_NS_DEFAULT 0ULL
+#define IOTHREAD_POLL_GROW_DEFAULT 0ULL
+#define IOTHREAD_POLL_SHRINK_DEFAULT 0ULL
+#define IOTHREAD_POLL_WEIGHT_DEFAULT 0ULL
+#endif
+
 struct IOThread {
     EventLoopBase parent_obj;
 
@@ -38,6 +55,7 @@ struct IOThread {
     int64_t poll_max_ns;
     int64_t poll_grow;
     int64_t poll_shrink;
+    int64_t poll_weight;
 };
 typedef struct IOThread IOThread;
 
diff --git a/iothread.c b/iothread.c
index caf68e0764..3558535b40 100644
--- a/iothread.c
+++ b/iothread.c
@@ -25,17 +25,6 @@
 #include "qemu/rcu.h"
 #include "qemu/main-loop.h"
 
-
-#ifdef CONFIG_POSIX
-/* Benchmark results from 2016 on NVMe SSD drives show max polling times around
- * 16-32 microseconds yield IOPS improvements for both iodepth=1 and iodepth=32
- * workloads.
- */
-#define IOTHREAD_POLL_MAX_NS_DEFAULT 32768ULL
-#else
-#define IOTHREAD_POLL_MAX_NS_DEFAULT 0ULL
-#endif
-
 static void *iothread_run(void *opaque)
 {
     IOThread *iothread = opaque;
@@ -103,6 +92,10 @@ static void iothread_instance_init(Object *obj)
     IOThread *iothread = IOTHREAD(obj);
 
     iothread->poll_max_ns = IOTHREAD_POLL_MAX_NS_DEFAULT;
+    iothread->poll_grow = IOTHREAD_POLL_GROW_DEFAULT;
+    iothread->poll_shrink = IOTHREAD_POLL_SHRINK_DEFAULT;
+    iothread->poll_weight = IOTHREAD_POLL_WEIGHT_DEFAULT;
+
     iothread->thread_id = -1;
     qemu_sem_init(&iothread->init_done_sem, 0);
     /* By default, we don't run gcontext */
@@ -164,6 +157,7 @@ static void iothread_set_aio_context_params(EventLoopBase *base, Error **errp)
                                 iothread->poll_max_ns,
                                 iothread->poll_grow,
                                 iothread->poll_shrink,
+                                iothread->poll_weight,
                                 errp);
     if (*errp) {
         return;
@@ -233,6 +227,9 @@ static IOThreadParamInfo poll_grow_info = {
 static IOThreadParamInfo poll_shrink_info = {
     "poll-shrink", offsetof(IOThread, poll_shrink),
 };
+static IOThreadParamInfo poll_weight_info = {
+    "poll-weight", offsetof(IOThread, poll_weight),
+};
 
 static void iothread_get_param(Object *obj, Visitor *v,
         const char *name, IOThreadParamInfo *info, Error **errp)
@@ -254,13 +251,31 @@ static bool iothread_set_param(Object *obj, Visitor *v,
         return false;
     }
 
-    if (value < 0) {
+    if (info->offset == offsetof(IOThread, poll_weight)) {
+        if (value < 0 || value > 63) {
+            error_setg(errp, "%s value must be in range [0, 63]",
+                       info->name);
+            return false;
+        }
+    } else if (value < 0) {
         error_setg(errp, "%s value must be in range [0, %" PRId64 "]",
                    info->name, INT64_MAX);
         return false;
     }
 
-    *field = value;
+    if (value == 0) {
+        if (info->offset == offsetof(IOThread, poll_grow)) {
+            *field = IOTHREAD_POLL_GROW_DEFAULT;
+        } else if (info->offset == offsetof(IOThread, poll_shrink)) {
+            *field = IOTHREAD_POLL_SHRINK_DEFAULT;
+        } else if (info->offset == offsetof(IOThread, poll_weight)) {
+            *field = IOTHREAD_POLL_WEIGHT_DEFAULT;
+        } else {
+            *field = value;
+        }
+    } else {
+        *field = value;
+    }
 
     return true;
 }
@@ -288,6 +303,7 @@ static void iothread_set_poll_param(Object *obj, Visitor *v,
                                     iothread->poll_max_ns,
                                     iothread->poll_grow,
                                     iothread->poll_shrink,
+                                    iothread->poll_weight,
                                     errp);
     }
 }
@@ -311,6 +327,10 @@ static void iothread_class_init(ObjectClass *klass, const void *class_data)
                               iothread_get_poll_param,
                               iothread_set_poll_param,
                               NULL, &poll_shrink_info);
+    object_class_property_add(klass, "poll-weight", "int",
+                              iothread_get_poll_param,
+                              iothread_set_poll_param,
+                              NULL, &poll_weight_info);
 }
 
 static const TypeInfo iothread_info = {
@@ -356,6 +376,7 @@ static int query_one_iothread(Object *object, void *opaque)
     info->poll_max_ns = iothread->poll_max_ns;
     info->poll_grow = iothread->poll_grow;
     info->poll_shrink = iothread->poll_shrink;
+    info->poll_weight = iothread->poll_weight;
     info->aio_max_batch = iothread->parent_obj.aio_max_batch;
 
     QAPI_LIST_APPEND(*tail, info);
diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index bc26b39d70..afa7b709a6 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -206,6 +206,7 @@ void hmp_info_iothreads(Monitor *mon, const QDict *qdict)
         monitor_printf(mon, "  poll-max-ns=%" PRId64 "\n", value->poll_max_ns);
         monitor_printf(mon, "  poll-grow=%" PRId64 "\n", value->poll_grow);
         monitor_printf(mon, "  poll-shrink=%" PRId64 "\n", value->poll_shrink);
+        monitor_printf(mon, "  poll-weight=%" PRId64 "\n", value->poll_weight);
         monitor_printf(mon, "  aio-max-batch=%" PRId64 "\n",
                        value->aio_max_batch);
     }
diff --git a/qapi/misc.json b/qapi/misc.json
index 28c641fe2f..22b7afed9f 100644
--- a/qapi/misc.json
+++ b/qapi/misc.json
@@ -85,6 +85,11 @@
 # @poll-shrink: how many ns will be removed from polling time, 0 means
 #     that it's not configured (since 2.9)
 #
+# @poll-weight: the weight factor for adaptive polling.
+#     Determines how much the current event interval contributes to
+#     the next polling time calculation.  Valid values are 1 or
+#     greater (since 11.1)
+#
 # @aio-max-batch: maximum number of requests in a batch for the AIO
 #     engine, 0 means that the engine will use its default (since 6.1)
 #
@@ -96,6 +101,7 @@
            'poll-max-ns': 'int',
            'poll-grow': 'int',
            'poll-shrink': 'int',
+           'poll-weight': 'int',
            'aio-max-batch': 'int' } }
 
 ##
diff --git a/qapi/qom.json b/qapi/qom.json
index c653248f85..dd45ac1087 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -606,6 +606,13 @@
 #     algorithm detects it is spending too long polling without
 #     encountering events.  0 selects a default behaviour (default: 0)
 #
+# @poll-weight: the weight factor for adaptive polling.  Determines
+#     how much the most recent event interval affects the next
+#     polling duration calculation.  If set to 0, the system default
+#     value of 3 is used.  Typical values: 1 (high weight on recent
+#     interval), 2-4 (moderate weight on recent interval).
+#     (default: 0) (since 11.1)
+#
 # The @aio-max-batch option is available since 6.1.
 #
 # Since: 2.0
@@ -614,7 +621,8 @@
   'base': 'EventLoopBaseProperties',
   'data': { '*poll-max-ns': 'int',
             '*poll-grow': 'int',
-            '*poll-shrink': 'int' } }
+            '*poll-shrink': 'int',
+            '*poll-weight': 'int' } }
 
 ##
 # @MainLoopProperties:
diff --git a/qemu-options.hx b/qemu-options.hx
index 21972f8326..29c09415c1 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -6443,7 +6443,7 @@ SRST
 
             CN=laptop.example.com,O=Example Home,L=London,ST=London,C=GB
 
-    ``-object iothread,id=id,poll-max-ns=poll-max-ns,poll-grow=poll-grow,poll-shrink=poll-shrink,aio-max-batch=aio-max-batch``
+    ``-object iothread,id=id,poll-max-ns=poll-max-ns,poll-grow=poll-grow,poll-shrink=poll-shrink,poll-weight=poll-weight,aio-max-batch=aio-max-batch``
         Creates a dedicated event loop thread that devices can be
         assigned to. This is known as an IOThread. By default device
         emulation happens in vCPU threads or the main event loop thread.
@@ -6479,6 +6479,12 @@ SRST
         the polling time when the algorithm detects it is spending too
         long polling without encountering events.
 
+        The ``poll-weight`` parameter is the weight factor for adaptive
+        polling. It determines how much the most recent event interval
+        affects the next polling duration calculation. If set to 0, the
+        system default value of 3 is used. Typical values: 1 (high weight
+        on recent interval), 2-4 (moderate weight on recent interval).
+
         The ``aio-max-batch`` parameter is the maximum number of requests
         in a batch for the AIO engine, 0 means that the engine will use
         its default.
diff --git a/tests/unit/test-nested-aio-poll.c b/tests/unit/test-nested-aio-poll.c
index 9ab1ad08a7..4c38f36fd4 100644
--- a/tests/unit/test-nested-aio-poll.c
+++ b/tests/unit/test-nested-aio-poll.c
@@ -81,7 +81,7 @@ static void test(void)
     qemu_set_current_aio_context(td.ctx);
 
     /* Enable polling */
-    aio_context_set_poll_params(td.ctx, 1000000, 2, 2, &error_abort);
+    aio_context_set_poll_params(td.ctx, 1000000, 2, 2, 3, &error_abort);
 
     /* Make the event notifier active (set) right away */
     event_notifier_init(&td.poll_notifier, 1);
diff --git a/util/aio-posix.c b/util/aio-posix.c
index 8e9e9e5d8f..df1c213ce5 100644
--- a/util/aio-posix.c
+++ b/util/aio-posix.c
@@ -29,7 +29,6 @@
 
 /* Stop userspace polling on a handler if it isn't active for some time */
 #define POLL_IDLE_INTERVAL_NS (7 * NANOSECONDS_PER_SECOND)
-#define POLL_WEIGHT_SHIFT   (3)
 
 static void update_handler_poll_times(AioContext *ctx, int64_t block_ns,
                                       int64_t dispatch_time);
@@ -582,28 +581,11 @@ static bool try_poll_mode(AioContext *ctx, AioHandlerList *ready_list,
 
 static void adjust_polling_time(AioContext *ctx, int64_t block_ns)
 {
-    if (block_ns < ctx->poll_ns) {
-        int64_t old = ctx->poll_ns;
-        int64_t shrink = ctx->poll_shrink;
-
-        if (shrink == 0) {
-            shrink = 2;
-        }
-
-        if (block_ns < (ctx->poll_ns / shrink)) {
-            ctx->poll_ns /= shrink;
-        }
-
-        trace_poll_shrink(ctx, old, ctx->poll_ns);
-    } else if (block_ns > ctx->poll_ns) {
+    if (block_ns > ctx->poll_ns) {
         /* There is room to grow, poll longer */
         int64_t old = ctx->poll_ns;
         int64_t grow = ctx->poll_grow;
 
-        if (grow == 0) {
-            grow = 2;
-        }
-
         if (block_ns > ctx->poll_ns * grow) {
             ctx->poll_ns = block_ns;
         } else {
@@ -615,6 +597,11 @@ static void adjust_polling_time(AioContext *ctx, int64_t block_ns)
         }
 
         trace_poll_grow(ctx, old, ctx->poll_ns);
+    } else if (block_ns < (ctx->poll_ns / ctx->poll_shrink)) {
+        int64_t old = ctx->poll_ns;
+        ctx->poll_ns /= ctx->poll_shrink;
+
+        trace_poll_shrink(ctx, old, ctx->poll_ns);
     }
 }
 
@@ -632,8 +619,8 @@ static void update_handler_poll_times(AioContext *ctx, int64_t block_ns,
              * block_ns and previous poll.ns to smooth adjustments.
              */
             node->poll.ns = node->poll.ns
-                ? (node->poll.ns - (node->poll.ns >> POLL_WEIGHT_SHIFT))
-                + (block_ns >> POLL_WEIGHT_SHIFT) : block_ns;
+                ? (node->poll.ns - (node->poll.ns >> ctx->poll_weight))
+                + (block_ns >> ctx->poll_weight) : block_ns;
 
             if (node->poll.ns > ctx->poll_max_ns) {
                 node->poll.ns = 0;
@@ -819,7 +806,8 @@ void aio_context_destroy(AioContext *ctx)
 }
 
 void aio_context_set_poll_params(AioContext *ctx, int64_t max_ns,
-                                 int64_t grow, int64_t shrink, Error **errp)
+                                 int64_t grow, int64_t shrink,
+                                 int64_t weight, Error **errp)
 {
     AioHandler *node;
 
@@ -833,8 +821,9 @@ void aio_context_set_poll_params(AioContext *ctx, int64_t max_ns,
      * is used once.
      */
     ctx->poll_max_ns = max_ns;
-    ctx->poll_grow = grow;
-    ctx->poll_shrink = shrink;
+    ctx->poll_grow = (grow ? grow : IOTHREAD_POLL_GROW_DEFAULT);
+    ctx->poll_shrink = (shrink ? shrink : IOTHREAD_POLL_SHRINK_DEFAULT);
+    ctx->poll_weight = (weight ? weight : IOTHREAD_POLL_WEIGHT_DEFAULT);
     ctx->poll_ns = 0;
 
     aio_notify(ctx);
diff --git a/util/aio-win32.c b/util/aio-win32.c
index 6e6f699e4b..1985843233 100644
--- a/util/aio-win32.c
+++ b/util/aio-win32.c
@@ -429,7 +429,8 @@ void aio_context_destroy(AioContext *ctx)
 }
 
 void aio_context_set_poll_params(AioContext *ctx, int64_t max_ns,
-                                 int64_t grow, int64_t shrink, Error **errp)
+                                 int64_t grow, int64_t shrink,
+                                 int64_t weight, Error **errp)
 {
     if (max_ns) {
         error_setg(errp, "AioContext polling is not implemented on Windows");
diff --git a/util/async.c b/util/async.c
index 9d3627566f..741fcfd6a7 100644
--- a/util/async.c
+++ b/util/async.c
@@ -609,6 +609,7 @@ AioContext *aio_context_new(Error **errp)
     ctx->poll_ns = 0;
     ctx->poll_grow = 0;
     ctx->poll_shrink = 0;
+    ctx->poll_weight = 0;
 
     ctx->aio_max_batch = 0;
 
-- 
2.50.1




* Re: [PATCH v5 3/3] qapi/iothread: introduce poll-weight parameter for aio-poll
  2026-04-23 19:59 ` [PATCH v5 3/3] qapi/iothread: introduce poll-weight parameter for aio-poll Jaehoon Kim
@ 2026-04-24  6:26   ` Markus Armbruster
  0 siblings, 0 replies; 6+ messages in thread
From: Markus Armbruster @ 2026-04-24  6:26 UTC (permalink / raw)
  To: Jaehoon Kim
  Cc: qemu-devel, qemu-block, pbonzini, stefanha, fam, eblake, berrange,
	eduardo, dave, sw, mjrosato, farman

Jaehoon Kim <jhkim@linux.ibm.com> writes:

> Introduce a configurable poll-weight parameter for adaptive polling
> in IOThread. This parameter replaces the hardcoded POLL_WEIGHT_SHIFT
> constant, allowing runtime control over how much the most recent
> event interval affects the next polling duration calculation.
>
> The poll-weight parameter uses a shift value where larger values
> decrease the weight of the current interval, enabling more gradual
> adjustments. When set to 0, a default value of 3 is used (meaning
> the current interval contributes approximately 1/8 to the weighted
> average).
>
> This patch also removes the hardcoded default value checks from
> adjust_polling_time(). Instead, poll-grow, poll-shrink, and
> poll-weight now use default values initialized in iothread.c
> during IOThread creation.
>
> Signed-off-by: Jaehoon Kim <jhkim@linux.ibm.com>
> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

QAPI schema
Acked-by: Markus Armbruster <armbru@redhat.com>
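
For readers following the commit message quoted above, here is a minimal
sketch of the shift-based weighted average that poll-weight controls. It
mirrors the expression in the update_handler_poll_times() hunk of this
patch; the helper name sketch_weighted_poll_ns() and its standalone
signature are hypothetical and are not part of QEMU. The assert reflects
the [0, 63] range mentioned in the cover letter, since shifting a 64-bit
value by 64 or more is undefined in C.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a QEMU function): blend the previous
 * per-handler estimate with the newest interval, giving the new
 * sample a share of roughly 1 / 2^weight.
 */
static int64_t sketch_weighted_poll_ns(int64_t prev_ns, int64_t block_ns,
                                       unsigned weight)
{
    /* Shifts of 64 or more on a 64-bit type are undefined behaviour. */
    assert(weight < 64);

    if (prev_ns == 0) {
        /* No history yet: adopt the first sample as-is. */
        return block_ns;
    }

    /* Keep (1 - 1/2^weight) of the old value, add 1/2^weight of the new. */
    return (prev_ns - (prev_ns >> weight)) + (block_ns >> weight);
}
```

With a previous estimate of 8000 ns and a new interval of 16000 ns,
weight=1 moves the estimate to 12000 ns, while weight=3 moves it only to
9000 ns, which is the "approximately 1/8" contribution described above.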




* Re: [PATCH v5 0/3] improve aio-polling efficiency
  2026-04-23 19:59 [PATCH v5 0/3] improve aio-polling efficiency Jaehoon Kim
                   ` (2 preceding siblings ...)
  2026-04-23 19:59 ` [PATCH v5 3/3] qapi/iothread: introduce poll-weight parameter for aio-poll Jaehoon Kim
@ 2026-04-29 18:20 ` Stefan Hajnoczi
  3 siblings, 0 replies; 6+ messages in thread
From: Stefan Hajnoczi @ 2026-04-29 18:20 UTC (permalink / raw)
  To: Jaehoon Kim
  Cc: qemu-devel, qemu-block, pbonzini, fam, armbru, eblake, berrange,
	eduardo, dave, sw, mjrosato, farman


On Thu, Apr 23, 2026 at 02:59:15PM -0500, Jaehoon Kim wrote:
> Dear all,
> 
> This is v5 of the patch series to refine aio_poll adaptive polling
> logic for better CPU efficiency.
> 
> v1: https://lore.kernel.org/qemu-devel/20260113174824.464720-1-jhkim@linux.ibm.com/
> v2: https://lore.kernel.org/qemu-devel/20260323135451.579655-1-jhkim@linux.ibm.com/
> v3: https://lore.kernel.org/qemu-devel/20260405200735.3075407-1-jhkim@linux.ibm.com/
> v4: https://lore.kernel.org/qemu-devel/20260412215011.326196-1-jhkim@linux.ibm.com/
> 
> Changes in v5:
> - Patch 3/3: Fixed QAPI documentation based on review feedback:
>   * qapi/misc.json: Removed the sentence about returning poll-weight=0
>     since query-iothreads never returns 0 for this field.
>   * qemu-options.hx: Enhanced poll-weight parameter documentation to
>     match the detail level in qom.json, including information about
>     default value and typical value examples.
> 
> Changes in v4:
> - Patch 2/3: Added detailed validation tables showing poll.ns statistics
>   across different poll_weight values (1-5) for SSD randread/randwrite
>   workloads to demonstrate algorithm behavior and justify poll_weight=3
>   as the optimal default.
> 
> - Patch 3/3: Fixed commit message to correctly reference
>   adjust_polling_time() instead of the removed grow_polling_time()
>   and shrink_polling_time() functions from v2.
> 
> Changes in v3:
> - Patch 1/3: Removed timeout check in aio_poll() as suggested by
>   Stefan Hajnoczi.
> 
> - Patch 2/3: Major refactoring based on review feedback:
>   * Removed has_event from the AioHandler structure and renamed
>     poll_idle_timeout to last_dispatch_timestamp to identify
>     active handlers.
>   * Merged grow_polling_time() and shrink_polling_time() into single
>     adjust_polling_time() function to simplify code review, with no
>     functional changes.
>   * Renamed adjust_block_ns() to update_handler_poll_times()
>   * Modified remove_idle_poll_handlers() to use last_dispatch_timestamp
>     directly instead of checking poll_idle_timeout
>   * Updated commit message
> 
> - Patch 3/3: Enhanced parameter handling:
>   * Moved IOTHREAD_POLL_*_DEFAULT constants to iothread.h header
>   * Added validation for poll-weight range [0, 63] in iothread.c
>   * Added the divide-by-0 protection in aio_context_set_poll_params()
>   * Updated QAPI version from 10.2 to 11.1
>   * Enhanced qom.json documentation for poll-weight values
> 
> This series reduces CPU usage in aio_poll adaptive polling by ~10%
> with minimal throughput impact (~2%). Tested on s390x with various
> workloads.
> 
> Testing details:
> 
> Initial testing (Fedora 42, 16 virtio-blk devices, FCP multipath):
>  - Throughput: -3% to -8% (1 iothread), -2% to -5% (2 iothreads)
>  - CPU usage: -10% to -25% (1 iothread), -7% to -12% (2 iothreads)
> 
> Additional validation (RHEL 10.1 + QEMU 10.0.0, FCP/FICON, 1-8 iothreads):
>  - Throughput: -2.2% (weight=3), -2.4% (weight=2)
>  - CPU usage: -9.4% (weight=3), -10.9% (weight=2)
> 
> Weight=3 selected for slightly better throughput while maintaining
> substantial CPU savings.
> 
> Best regards,
> Jaehoon Kim
> 
> Jaehoon Kim (3):
>   aio-poll: avoid unnecessary polling time computation
>   aio-poll: refine iothread polling using weighted handler intervals
>   qapi/iothread: introduce poll-weight parameter for aio-poll
> 
>  include/qemu/aio.h                |   7 +-
>  include/system/iothread.h         |  18 ++++
>  iothread.c                        |  47 +++++++---
>  monitor/hmp-cmds.c                |   1 +
>  qapi/misc.json                    |   6 ++
>  qapi/qom.json                     |  10 +-
>  qemu-options.hx                   |   8 +-
>  tests/unit/test-nested-aio-poll.c |   2 +-
>  util/aio-posix.c                  | 148 ++++++++++++++++++------------
>  util/aio-posix.h                  |   2 +-
>  util/aio-win32.c                  |   3 +-
>  util/async.c                      |   2 +
>  12 files changed, 176 insertions(+), 78 deletions(-)
> 
> -- 
> 2.50.1
> 

Thanks, applied to my block tree:
https://gitlab.com/stefanha/qemu/commits/block

Stefan
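
The parameter handling listed under patch 3/3 in the quoted v3 changelog
(range check on poll-weight, and 0 meaning "use the default" so that the
later division by poll_shrink cannot divide by zero) could be sketched
roughly as below. All names with a sketch_ or SKETCH_ prefix are
hypothetical. The weight default of 3 comes from the documentation in this
patch; the value 2 for grow and shrink is an assumption taken from the
fallbacks the diff removes from adjust_polling_time(), since the real
IOTHREAD_POLL_*_DEFAULT definitions are not shown in this part of the
thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values, see the note above; not copied from iothread.h. */
#define SKETCH_POLL_GROW_DEFAULT   2
#define SKETCH_POLL_SHRINK_DEFAULT 2
#define SKETCH_POLL_WEIGHT_DEFAULT 3
#define SKETCH_POLL_WEIGHT_MAX     63

/* Hypothetical stand-in for the polling fields of AioContext. */
struct sketch_poll_params {
    int64_t grow;
    int64_t shrink;
    int64_t weight;
};

/*
 * Returns false and leaves *p untouched if weight is outside [0, 63].
 * Otherwise stores the values, replacing each 0 with its default so
 * later code can divide by shrink and shift by weight unconditionally.
 */
static bool sketch_set_poll_params(struct sketch_poll_params *p,
                                   int64_t grow, int64_t shrink,
                                   int64_t weight)
{
    if (weight < 0 || weight > SKETCH_POLL_WEIGHT_MAX) {
        return false;
    }

    p->grow = grow ? grow : SKETCH_POLL_GROW_DEFAULT;
    p->shrink = shrink ? shrink : SKETCH_POLL_SHRINK_DEFAULT;
    p->weight = weight ? weight : SKETCH_POLL_WEIGHT_DEFAULT;
    return true;
}
```

Resolving the defaults once at configuration time is what lets the hot
path in adjust_polling_time() drop its per-call "if (x == 0)" checks.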

