* [PATCH] migration: expose per-device state save times via query-migrate
From: Trieu Huynh @ 2026-04-16 17:56 UTC
To: qemu-devel
Cc: Trieu Huynh, Peter Xu, Fabiano Rosas, Eric Blake,
Markus Armbruster
From: Trieu Huynh <vikingtc4@gmail.com>
The stop-and-copy phase pauses the VM and saves all non-iterable device
states. qemu_savevm_state_non_iterable() already measures per-device
elapsed time for tracing (trace_vmstate_downtime_save, added in
commit 3c80f14272), but this information is never stored or exposed
anywhere.
Expose the result through a new 'device-state-times' list in
MigrationInfo, filled by qemu_savevm_get_device_state_times() helper
and returned by query-migrate when status is completed.
A new QAPI type is introduced:
DeviceSaveStateTime { 'name': 'str', 'instance-id': 'uint32', 'save-time': 'uint64' }
where 'save-time' is the elapsed time in microseconds.
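
As a rough illustration of the recording step, here is a minimal
standalone sketch in plain C. It is not the code in this patch: the
SaveTimeEntry type and the save_time_append()/save_time_free() helpers
are hypothetical stand-ins for the QAPI-generated list type and
QAPI_LIST_APPEND(), and it only models the "skip zero-length
measurements" rule, not the se->vmsd check.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the QAPI-generated DeviceSaveStateTime. */
typedef struct SaveTimeEntry {
    char *name;                 /* device name (se->idstr in the patch) */
    uint32_t instance_id;
    uint64_t save_time_us;      /* elapsed time in microseconds */
    struct SaveTimeEntry *next;
} SaveTimeEntry;

/*
 * Append one measurement at *tail and return the new tail position.
 * Measurements with no elapsed time are skipped, so the list only
 * carries devices that actually cost something.
 */
static SaveTimeEntry **save_time_append(SaveTimeEntry **tail,
                                        const char *name,
                                        uint32_t instance_id,
                                        int64_t start_us, int64_t end_us)
{
    assert(tail != NULL && name != NULL);

    int64_t elapsed = end_us - start_us;
    if (elapsed <= 0) {
        return tail;            /* nothing recorded, tail unchanged */
    }

    SaveTimeEntry *e = calloc(1, sizeof(*e));
    size_t len = strlen(name) + 1;
    if (!e || !(e->name = malloc(len))) {
        abort();                /* out of memory: give up in this sketch */
    }
    memcpy(e->name, name, len);
    e->instance_id = instance_id;
    e->save_time_us = (uint64_t)elapsed;

    *tail = e;                  /* link the new entry at the end */
    return &e->next;            /* next append goes after this entry */
}

/* Free the whole list, e.g. before a new migration run starts. */
static void save_time_free(SaveTimeEntry *head)
{
    while (head) {
        SaveTimeEntry *next = head->next;
        free(head->name);
        free(head);
        head = next;
    }
}
```

Keeping a pointer to the tail makes each append O(1) and preserves the
order in which devices were saved.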
This allows operators and tooling to identify which device(s) are
contributing most to migration downtime.
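
To make the intended use concrete, the following sketch shows how a
consumer might rank the reported entries once it has parsed them. The
DeviceTime type and rank_by_save_time() are hypothetical names for
illustration only; JSON parsing is out of scope here, so the entries
are assumed to already be in an array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical parsed form of one 'device-state-times' entry. */
typedef struct {
    const char *name;
    uint32_t instance_id;
    uint64_t save_time_us;      /* 'save-time', in microseconds */
} DeviceTime;

/* qsort() comparator: largest save time first. */
static int cmp_save_time_desc(const void *a, const void *b)
{
    const DeviceTime *x = a;
    const DeviceTime *y = b;

    if (x->save_time_us < y->save_time_us) {
        return 1;
    }
    if (x->save_time_us > y->save_time_us) {
        return -1;
    }
    return 0;
}

/*
 * Sort entries in place so the slowest device comes first, and return
 * the sum of all save times so callers can compute each device's share.
 */
static uint64_t rank_by_save_time(DeviceTime *v, size_t n)
{
    assert(v != NULL || n == 0);

    uint64_t total_us = 0;
    for (size_t i = 0; i < n; i++) {
        total_us += v[i].save_time_us;
    }
    if (n > 1) {
        qsort(v, n, sizeof(*v), cmp_save_time_desc);
    }
    return total_us;
}
```

After the call, v[0] is the device with the largest save time, and
v[i].save_time_us divided by the returned total gives that device's
share of the recorded device-state save time.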
Related prior work: Joao Martins posted a phase-level downtime
breakdown series in 2023 [1] that stalled during review. This patch
takes a complementary approach with per-device rather than per-phase
granularity, building on the tracepoint infrastructure that was merged
from that discussion.
* As-is:
{"execute":"query-migrate"}
{
"return": {
"expected-downtime": 300,
"status": "active",
"setup-time": 1,
"total-time": 25501,
"ram": {
"total": 8607571968,
"postcopy-requests": 0,
"dirty-sync-count": 1,
"multifd-bytes": 0,
"pages-per-second": 32960,
"downtime-bytes": 0,
"page-size": 4096,
"remaining": 2027167744,
"postcopy-bytes": 0,
"mbps": 1082.1441599999998,
"transferred": 3409746788,
"dirty-sync-missed-zero-copy": 0,
"precopy-bytes": 3409871158,
"duplicate": 777383,
"dirty-pages-rate": 0,
"normal-bytes": 3396239360,
"normal": 829160
}
}
}
* To-be:
{"execute":"query-migrate"}
{
"return": {
"device-state-times": [
{
"name": "apic",
"instance-id": 0,
"save-time": 35
},
{
"name": "cpu_common",
"instance-id": 0,
"save-time": 4
},
{
"name": "cpu",
"instance-id": 0,
"save-time": 145
},
...
{
"name": "kvm-tpr-opt",
"instance-id": 0,
"save-time": 23
},
{
"name": "0000:00:00.0/I440FX",
"instance-id": 0,
"save-time": 9
},
{
"name": "PCIHost",
"instance-id": 0,
"save-time": 3
},
{
"name": "PCIBUS",
"instance-id": 0,
"save-time": 4
}
],
"status": "completed",
"setup-time": 1,
"downtime": 42,
"total-time": 36049,
"ram": {
"total": 8607571968,
"postcopy-requests": 0,
"dirty-sync-count": 5,
"multifd-bytes": 0,
"pages-per-second": 32980,
"downtime-bytes": 39224358,
"page-size": 4096,
"remaining": 0,
"postcopy-bytes": 0,
"mbps": 1080.48687394585,
"transferred": 4868673854,
"dirty-sync-missed-zero-copy": 0,
"precopy-bytes": 4829122945,
"duplicate": 1032714,
"dirty-pages-rate": 0,
"normal-bytes": 4849577984,
"normal": 1183979
}
}
}
[1] https://lore.kernel.org/qemu-devel/20230926161841.98464-1-joao.m.martins@oracle.com/
Related-to: https://wiki.qemu.org/ToDo/LiveMigration#Device_state_downtime_analysis_and_accountings
Signed-off-by: Trieu Huynh <vikingtc4@gmail.com>
---
migration/migration.c | 1 +
migration/savevm.c | 29 +++++++++++++++++++++++++++++
migration/savevm.h | 3 +++
qapi/migration.json | 26 +++++++++++++++++++++++++-
4 files changed, 58 insertions(+), 1 deletion(-)
diff --git a/migration/migration.c b/migration/migration.c
index 5c9aaa6e58..61bd98dc84 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1161,6 +1161,7 @@ static void fill_source_migration_info(MigrationInfo *info)
populate_time_info(info, s);
populate_ram_info(info, s);
migration_populate_vfio_info(info);
+ qemu_savevm_get_device_state_times(info);
break;
case MIGRATION_STATUS_FAILED:
info->has_status = true;
diff --git a/migration/savevm.c b/migration/savevm.c
index dd58f2a705..3b1c4ff4a7 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -46,6 +46,7 @@
#include "qapi/qapi-commands-migration.h"
#include "qapi/clone-visitor.h"
#include "qapi/qapi-builtin-visit.h"
+#include "qapi/qapi-visit-migration.h"
#include "qemu/error-report.h"
#include "system/cpus.h"
#include "system/memory.h"
@@ -263,6 +264,9 @@ typedef struct SaveState {
QemuUUID uuid;
} SaveState;
+/* Per-device save times recorded */
+static DeviceSaveStateTimeList *savevm_device_state_times;
+
static SaveState savevm_state = {
.handlers = QTAILQ_HEAD_INITIALIZER(savevm_state.handlers),
.handler_pri_head = { [0 ... MIG_PRI_MAX] = NULL },
@@ -1710,11 +1714,18 @@ int qemu_savevm_state_non_iterable(QEMUFile *f, Error **errp)
int64_t start_ts_each, end_ts_each;
JSONWriter *vmdesc = ms->vmdesc;
SaveStateEntry *se;
+ DeviceSaveStateTimeList **tail;
+ int64_t save_time;
int ret;
/* Making sure cpu states are synchronized before saving non-iterable */
cpu_synchronize_all_states();
+ /* Reset list from any previous migration run */
+ qapi_free_DeviceSaveStateTimeList(savevm_device_state_times);
+ savevm_device_state_times = NULL;
+ tail = &savevm_device_state_times;
+
QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
if (se->vmsd && se->vmsd->early_setup) {
/* Already saved during qemu_savevm_state_setup(). */
@@ -1731,6 +1742,14 @@ int qemu_savevm_state_non_iterable(QEMUFile *f, Error **errp)
end_ts_each = qemu_clock_get_us(QEMU_CLOCK_REALTIME);
trace_vmstate_downtime_save("non-iterable", se->idstr, se->instance_id,
end_ts_each - start_ts_each);
+ save_time = end_ts_each - start_ts_each;
+ if (se->vmsd && save_time > 0) {
+ DeviceSaveStateTime *dsst = g_new0(DeviceSaveStateTime, 1);
+ dsst->name = g_strdup(se->idstr);
+ dsst->instance_id = se->instance_id;
+ dsst->save_time = save_time;
+ QAPI_LIST_APPEND(tail, dsst);
+ }
}
trace_vmstate_downtime_checkpoint("src-non-iterable-saved");
@@ -3177,6 +3196,16 @@ bool qemu_loadvm_load_state_buffer(const char *idstr, uint32_t instance_id,
return se->ops->load_state_buffer(se->opaque, buf, len, errp);
}
+void qemu_savevm_get_device_state_times(MigrationInfo *info)
+{
+ if (!savevm_device_state_times) {
+ return;
+ }
+ info->device_state_times = QAPI_CLONE(DeviceSaveStateTimeList,
+ savevm_device_state_times);
+ info->has_device_state_times = true;
+}
+
bool save_snapshot(const char *name, bool overwrite, const char *vmstate,
bool has_devices, strList *devices, Error **errp)
{
diff --git a/migration/savevm.h b/migration/savevm.h
index b3d1e8a13c..a18af7eaed 100644
--- a/migration/savevm.h
+++ b/migration/savevm.h
@@ -14,6 +14,8 @@
#ifndef MIGRATION_SAVEVM_H
#define MIGRATION_SAVEVM_H
+#include "qapi/qapi-types-migration.h"
+
#define QEMU_VM_FILE_MAGIC 0x5145564d
#define QEMU_VM_FILE_VERSION_COMPAT 0x00000002
#define QEMU_VM_FILE_VERSION 0x00000003
@@ -78,5 +80,6 @@ int qemu_savevm_state_non_iterable_early(QEMUFile *f,
Error **errp);
bool qemu_loadvm_load_state_buffer(const char *idstr, uint32_t instance_id,
char *buf, size_t len, Error **errp);
+void qemu_savevm_get_device_state_times(MigrationInfo *info);
#endif
diff --git a/qapi/migration.json b/qapi/migration.json
index 7134d4ce47..17c7876392 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -200,6 +200,23 @@
{ 'struct': 'VfioStats',
'data': {'transferred': 'int' } }
+##
+# @DeviceSaveStateTime:
+#
+# Save time information for a single device's state during migration.
+#
+# @name: device name as registered in the migration stream
+#
+# @instance-id: device instance ID
+#
+# @save-time: time in microseconds spent saving this device's state
+# during the stop-and-copy phase
+#
+# Since: 10.2
+##
+{ 'struct': 'DeviceSaveStateTime',
+ 'data': { 'name': 'str', 'instance-id': 'uint32', 'save-time': 'uint64' } }
+
##
# @MigrationInfo:
#
@@ -300,6 +317,12 @@
# average memory load of the virtual CPU indirectly. Note that
# zero means guest doesn't dirty memory. (Since 8.1)
#
+# @device-state-times: list of per-device state save times recorded
+# during the stop-and-copy phase. Only present after a completed
+# migration. Entries with zero save time are omitted. Useful for
+# diagnosing which devices contribute most to migration downtime.
+# (Since 10.2)
+#
# Features:
#
# @unstable: Members @postcopy-latency, @postcopy-vcpu-latency,
@@ -331,7 +354,8 @@
'type': 'uint64', 'features': [ 'unstable' ] },
'*socket-address': ['SocketAddress'],
'*dirty-limit-throttle-time-per-round': 'uint64',
- '*dirty-limit-ring-full-time': 'uint64'} }
+ '*dirty-limit-ring-full-time': 'uint64',
+ '*device-state-times': ['DeviceSaveStateTime']} }
##
# @query-migrate:
--
2.43.0
* Re: [PATCH] migration: expose per-device state save times via query-migrate
From: Peter Xu @ 2026-04-16 19:21 UTC
To: Trieu Huynh; +Cc: qemu-devel, Fabiano Rosas, Eric Blake, Markus Armbruster
On Fri, Apr 17, 2026 at 12:56:59AM +0700, Trieu Huynh wrote:
> From: Trieu Huynh <vikingtc4@gmail.com>
>
> The stop-and-copy phase pauses the VM and saves all non-iterable device
> states. qemu_savevm_state_non_iterable() already measures per-device
> elapsed time for tracing (trace_vmstate_downtime_save, added in
> commit 3c80f14272), but this information is never stored or exposed
> anywhere.
>
> Expose the result through a new 'device-state-times' list in
> MigrationInfo, filled by qemu_savevm_get_device_state_times() helper
> and returned by query-migrate when status is completed.
>
> A new QAPI type is introduced:
> DeviceSaveStateTime { 'name': 'str', 'instance-id': 'uint32', 'save-time': 'uint64' }
> where 'save-time' is the elapsed time in microseconds.
Hi, Trieu,
Thanks again for your patch, especially as you worked on it in your spare
time.
That said, this is another case where I want to point out that QMP is an
API that QEMU relies on heavily, and we are careful about what it exposes.
We need to justify any new information we expose.
So if we start to report something in QAPI, we'd better be very certain
that at least someone will consume it. From the first day this API is
merged, we need to stick with it, possibly forever; we can deprecate
things, but we need to evaluate the risk. Before that risk analysis, we
should evaluate why the API is needed in the first place.
What is much less controversial is looking at how to improve any of the
numbers reported: if we can shrink some device save/load time, that will
always be a performance improvement.
Also, Joao's effort was not abandoned; it was done with tracepoints
rather than QMP queries (until we're more confident that we can make use
of new data in query-migrate):
https://lore.kernel.org/all/20231030163346.765724-6-peterx@redhat.com/
Thanks,
--
Peter Xu
* Re: [PATCH] migration: expose per-device state save times via query-migrate
From: Trieu Huynh @ 2026-04-17 9:34 UTC
To: Peter Xu; +Cc: qemu-devel, Fabiano Rosas, Eric Blake, Markus Armbruster
On Thu, Apr 16, 2026 at 03:21:33PM -0400, Peter Xu wrote:
> On Fri, Apr 17, 2026 at 12:56:59AM +0700, Trieu Huynh wrote:
> > From: Trieu Huynh <vikingtc4@gmail.com>
> >
> > The stop-and-copy phase pauses the VM and saves all non-iterable device
> > states. qemu_savevm_state_non_iterable() already measures per-device
> > elapsed time for tracing (trace_vmstate_downtime_save, added in
> > commit 3c80f14272), but this information is never stored or exposed
> > anywhere.
> >
> > Expose the result through a new 'device-state-times' list in
> > MigrationInfo, filled by qemu_savevm_get_device_state_times() helper
> > and returned by query-migrate when status is completed.
> >
> > A new QAPI type is introduced:
> > DeviceSaveStateTime { 'name': 'str', 'instance-id': 'uint32', 'save-time': 'uint64' }
> > where 'save-time' is the elapsed time in microseconds.
>
> Hi, Trieu,
>
> Thanks again for your patch, especially as you worked on it in your spare
> time.
No worries, thanks.
>
> That said, this is another case where I want to point out that QMP is an
> API that QEMU relies on heavily, and we are careful about what it exposes.
> We need to justify any new information we expose.
>
> So if we start to report something in QAPI, we'd better be very certain
> that at least someone will consume it. From the first day this API is
> merged, we need to stick with it, possibly forever; we can deprecate
> things, but we need to evaluate the risk. Before that risk analysis, we
> should evaluate why the API is needed in the first place.
>
> What is much less controversial is looking at how to improve any of the
> numbers reported: if we can shrink some device save/load time, that will
> always be a performance improvement.
>
> Also, Joao's effort was not abandoned; it was done with tracepoints
> rather than QMP queries (until we're more confident that we can make use
> of new data in query-migrate):
That makes a lot of sense to me. Thanks for your kind review.
So I'd rather go that way and focus on performance tuning and
optimization of device save/load time (which is also mentioned in the
todo list [1]) than expose it via a QMP command that may not be needed.
>
> https://lore.kernel.org/all/20231030163346.765724-6-peterx@redhat.com/
Thanks for the context, I'll study this as well.
>
> Thanks,
>
> --
> Peter Xu
>
[1] https://wiki.qemu.org/ToDo/LiveMigration#Optimize_memory_updates_for_non-iterative_vmstates
BRs,
Trieu Huynh