* [PATCH 1/4] vfio/migration: Add save_{iterate, complete_precopy}_started trace events
From: Maciej S. Szmigiero @ 2024-10-29 14:58 UTC (permalink / raw)
To: Peter Xu, Fabiano Rosas
Cc: Alex Williamson, Cédric Le Goater, Eric Blake,
Markus Armbruster, Daniel P . Berrangé, Avihai Horon,
Joao Martins, qemu-devel
From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>
This way both the start and end points of migrating a particular VFIO
device are known.
Also add a vfio_save_iterate_empty_hit trace event so it is known when
there's no more data to send for that device.
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
---
hw/vfio/migration.c | 13 +++++++++++++
hw/vfio/trace-events | 3 +++
include/hw/vfio/vfio-common.h | 3 +++
3 files changed, 19 insertions(+)
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 992dc3b10257..1b1ddf527d69 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -472,6 +472,9 @@ static int vfio_save_setup(QEMUFile *f, void *opaque, Error **errp)
return -ENOMEM;
}
+ migration->save_iterate_started = false;
+ migration->save_iterate_empty_hit = false;
+
if (vfio_precopy_supported(vbasedev)) {
switch (migration->device_state) {
case VFIO_DEVICE_STATE_RUNNING:
@@ -602,9 +605,17 @@ static int vfio_save_iterate(QEMUFile *f, void *opaque)
VFIOMigration *migration = vbasedev->migration;
ssize_t data_size;
+ if (!migration->save_iterate_started) {
+ trace_vfio_save_iterate_started(vbasedev->name);
+ migration->save_iterate_started = true;
+ }
+
data_size = vfio_save_block(f, migration);
if (data_size < 0) {
return data_size;
+ } else if (data_size == 0 && !migration->save_iterate_empty_hit) {
+ trace_vfio_save_iterate_empty_hit(vbasedev->name);
+ migration->save_iterate_empty_hit = true;
}
vfio_update_estimated_pending_data(migration, data_size);
@@ -630,6 +641,8 @@ static int vfio_save_complete_precopy(QEMUFile *f, void *opaque)
int ret;
Error *local_err = NULL;
+ trace_vfio_save_complete_precopy_started(vbasedev->name);
+
/* We reach here with device state STOP or STOP_COPY only */
ret = vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_STOP_COPY,
VFIO_DEVICE_STATE_STOP, &local_err);
diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
index 29789e8d276d..e58deab232ed 100644
--- a/hw/vfio/trace-events
+++ b/hw/vfio/trace-events
@@ -159,8 +159,11 @@ vfio_migration_state_notifier(const char *name, int state) " (%s) state %d"
vfio_save_block(const char *name, int data_size) " (%s) data_size %d"
vfio_save_cleanup(const char *name) " (%s)"
vfio_save_complete_precopy(const char *name, int ret) " (%s) ret %d"
+vfio_save_complete_precopy_started(const char *name) " (%s)"
vfio_save_device_config_state(const char *name) " (%s)"
vfio_save_iterate(const char *name, uint64_t precopy_init_size, uint64_t precopy_dirty_size) " (%s) precopy initial size %"PRIu64" precopy dirty size %"PRIu64
+vfio_save_iterate_empty_hit(const char *name) " (%s)"
+vfio_save_iterate_started(const char *name) " (%s)"
vfio_save_setup(const char *name, uint64_t data_buffer_size) " (%s) data buffer size %"PRIu64
vfio_state_pending_estimate(const char *name, uint64_t precopy, uint64_t postcopy, uint64_t precopy_init_size, uint64_t precopy_dirty_size) " (%s) precopy %"PRIu64" postcopy %"PRIu64" precopy initial size %"PRIu64" precopy dirty size %"PRIu64
vfio_state_pending_exact(const char *name, uint64_t precopy, uint64_t postcopy, uint64_t stopcopy_size, uint64_t precopy_init_size, uint64_t precopy_dirty_size) " (%s) precopy %"PRIu64" postcopy %"PRIu64" stopcopy size %"PRIu64" precopy initial size %"PRIu64" precopy dirty size %"PRIu64
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index fed499b199f0..997ee5af2d5b 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -73,6 +73,9 @@ typedef struct VFIOMigration {
uint64_t precopy_init_size;
uint64_t precopy_dirty_size;
bool initial_data_sent;
+
+ bool save_iterate_started;
+ bool save_iterate_empty_hit;
} VFIOMigration;
struct VFIOGroup;
* Re: [PATCH 1/4] vfio/migration: Add save_{iterate,complete_precopy}_started trace events
From: Avihai Horon @ 2024-10-31 14:21 UTC (permalink / raw)
To: Maciej S. Szmigiero, Peter Xu, Fabiano Rosas
Cc: Alex Williamson, Cédric Le Goater, Eric Blake,
Markus Armbruster, Daniel P . Berrangé, Joao Martins,
qemu-devel
Hi Maciej,
On 29/10/2024 16:58, Maciej S. Szmigiero wrote:
> From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>
>
> This way both the start and end points of migrating a particular VFIO
> device are known.
>
> Add also a vfio_save_iterate_empty_hit trace event so it is known when
> there's no more data to send for that device.
>
> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
> ---
> hw/vfio/migration.c | 13 +++++++++++++
> hw/vfio/trace-events | 3 +++
> include/hw/vfio/vfio-common.h | 3 +++
> 3 files changed, 19 insertions(+)
>
> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
> index 992dc3b10257..1b1ddf527d69 100644
> --- a/hw/vfio/migration.c
> +++ b/hw/vfio/migration.c
> @@ -472,6 +472,9 @@ static int vfio_save_setup(QEMUFile *f, void *opaque, Error **errp)
> return -ENOMEM;
> }
>
> + migration->save_iterate_started = false;
> + migration->save_iterate_empty_hit = false;
> +
> if (vfio_precopy_supported(vbasedev)) {
> switch (migration->device_state) {
> case VFIO_DEVICE_STATE_RUNNING:
> @@ -602,9 +605,17 @@ static int vfio_save_iterate(QEMUFile *f, void *opaque)
> VFIOMigration *migration = vbasedev->migration;
> ssize_t data_size;
>
> + if (!migration->save_iterate_started) {
> + trace_vfio_save_iterate_started(vbasedev->name);
> + migration->save_iterate_started = true;
> + }
> +
> data_size = vfio_save_block(f, migration);
> if (data_size < 0) {
> return data_size;
> + } else if (data_size == 0 && !migration->save_iterate_empty_hit) {
> + trace_vfio_save_iterate_empty_hit(vbasedev->name);
> + migration->save_iterate_empty_hit = true;
> }
Can we instead use trace_vfio_save_iterate to understand if the device
reached 0?
In any case, I think the above could fit better in vfio_save_block(),
where ENOMSG indicates that the device has no more data to send during
pre-copy phase:
...
if (data_size < 0) {
/*
* Pre-copy emptied all the device state for now. For more information,
* please refer to the Linux kernel VFIO uAPI.
*/
if (errno == ENOMSG) {
trace_vfio_save_iterate_empty_hit(vbasedev->name)
<--------------- move it here
return 0;
}
return -errno;
}
...
If you move the trace there, maybe renaming it to
trace_vfio_precopy_empty_hit() will be more accurate?
And trying to avoid adding the extra
VFIOMigration->save_iterate_empty_hit flag, can we simply trace it every
time?
>
> vfio_update_estimated_pending_data(migration, data_size);
> @@ -630,6 +641,8 @@ static int vfio_save_complete_precopy(QEMUFile *f, void *opaque)
> int ret;
> Error *local_err = NULL;
>
> + trace_vfio_save_complete_precopy_started(vbasedev->name);
I assume this trace is used to measure how long it takes for
vfio_save_complete_precopy() to run? If so, can we use
trace_vmstate_downtime_save to achieve the same goal?
Thanks.
> +
> /* We reach here with device state STOP or STOP_COPY only */
> ret = vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_STOP_COPY,
> VFIO_DEVICE_STATE_STOP, &local_err);
> diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
> index 29789e8d276d..e58deab232ed 100644
> --- a/hw/vfio/trace-events
> +++ b/hw/vfio/trace-events
> @@ -159,8 +159,11 @@ vfio_migration_state_notifier(const char *name, int state) " (%s) state %d"
> vfio_save_block(const char *name, int data_size) " (%s) data_size %d"
> vfio_save_cleanup(const char *name) " (%s)"
> vfio_save_complete_precopy(const char *name, int ret) " (%s) ret %d"
> +vfio_save_complete_precopy_started(const char *name) " (%s)"
> vfio_save_device_config_state(const char *name) " (%s)"
> vfio_save_iterate(const char *name, uint64_t precopy_init_size, uint64_t precopy_dirty_size) " (%s) precopy initial size %"PRIu64" precopy dirty size %"PRIu64
> +vfio_save_iterate_empty_hit(const char *name) " (%s)"
> +vfio_save_iterate_started(const char *name) " (%s)"
> vfio_save_setup(const char *name, uint64_t data_buffer_size) " (%s) data buffer size %"PRIu64
> vfio_state_pending_estimate(const char *name, uint64_t precopy, uint64_t postcopy, uint64_t precopy_init_size, uint64_t precopy_dirty_size) " (%s) precopy %"PRIu64" postcopy %"PRIu64" precopy initial size %"PRIu64" precopy dirty size %"PRIu64
> vfio_state_pending_exact(const char *name, uint64_t precopy, uint64_t postcopy, uint64_t stopcopy_size, uint64_t precopy_init_size, uint64_t precopy_dirty_size) " (%s) precopy %"PRIu64" postcopy %"PRIu64" stopcopy size %"PRIu64" precopy initial size %"PRIu64" precopy dirty size %"PRIu64
> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
> index fed499b199f0..997ee5af2d5b 100644
> --- a/include/hw/vfio/vfio-common.h
> +++ b/include/hw/vfio/vfio-common.h
> @@ -73,6 +73,9 @@ typedef struct VFIOMigration {
> uint64_t precopy_init_size;
> uint64_t precopy_dirty_size;
> bool initial_data_sent;
> +
> + bool save_iterate_started;
> + bool save_iterate_empty_hit;
> } VFIOMigration;
>
> struct VFIOGroup;
* Re: [PATCH 1/4] vfio/migration: Add save_{iterate,complete_precopy}_started trace events
From: Maciej S. Szmigiero @ 2024-10-31 22:17 UTC (permalink / raw)
To: Avihai Horon
Cc: Alex Williamson, Fabiano Rosas, Peter Xu, Cédric Le Goater,
Eric Blake, Markus Armbruster, Daniel P . Berrangé,
Joao Martins, qemu-devel
Hi Avihai,
On 31.10.2024 15:21, Avihai Horon wrote:
> Hi Maciej,
>
> On 29/10/2024 16:58, Maciej S. Szmigiero wrote:
>> From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>
>>
>> This way both the start and end points of migrating a particular VFIO
>> device are known.
>>
>> Add also a vfio_save_iterate_empty_hit trace event so it is known when
>> there's no more data to send for that device.
>>
>> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
>> ---
>> hw/vfio/migration.c | 13 +++++++++++++
>> hw/vfio/trace-events | 3 +++
>> include/hw/vfio/vfio-common.h | 3 +++
>> 3 files changed, 19 insertions(+)
>>
>> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
>> index 992dc3b10257..1b1ddf527d69 100644
>> --- a/hw/vfio/migration.c
>> +++ b/hw/vfio/migration.c
>> @@ -472,6 +472,9 @@ static int vfio_save_setup(QEMUFile *f, void *opaque, Error **errp)
>> return -ENOMEM;
>> }
>>
>> + migration->save_iterate_started = false;
>> + migration->save_iterate_empty_hit = false;
>> +
>> if (vfio_precopy_supported(vbasedev)) {
>> switch (migration->device_state) {
>> case VFIO_DEVICE_STATE_RUNNING:
>> @@ -602,9 +605,17 @@ static int vfio_save_iterate(QEMUFile *f, void *opaque)
>> VFIOMigration *migration = vbasedev->migration;
>> ssize_t data_size;
>>
>> + if (!migration->save_iterate_started) {
>> + trace_vfio_save_iterate_started(vbasedev->name);
>> + migration->save_iterate_started = true;
>> + }
>> +
>> data_size = vfio_save_block(f, migration);
>> if (data_size < 0) {
>> return data_size;
>> + } else if (data_size == 0 && !migration->save_iterate_empty_hit) {
>> + trace_vfio_save_iterate_empty_hit(vbasedev->name);
>> + migration->save_iterate_empty_hit = true;
>> }
>
> Can we instead use trace_vfio_save_iterate to understand if the device reached 0?
AFAIK there's no way to filter trace events by their parameters,
like only logging the vfio_save_iterate trace event if both parameters
are zero.
It means that vfio_save_iterate has to be enabled unconditionally to
serve as a replacement for vfio_save_iterate_empty_hit, which could
result in it being logged/emitted many extra times (with non-zero
parameters).
Because of that I think having a dedicated trace event for such an
occasion makes sense (it is also easily grep-able).
> In any case, I think the above could fit better in vfio_save_block(), where ENOMSG indicates that the device has no more data to send during pre-copy phase:
>
> ...
> if (data_size < 0) {
> /*
> * Pre-copy emptied all the device state for now. For more information,
> * please refer to the Linux kernel VFIO uAPI.
> */
> if (errno == ENOMSG) {
> trace_vfio_save_iterate_empty_hit(vbasedev->name) <--------------- move it here
> return 0;
> }
>
> return -errno;
> }
> ...
>
> If you move the trace there, maybe renaming it to trace_vfio_precopy_empty_hit() will be more accurate?
This move and rename seems sensible to me.
> And trying to avoid adding the extra VFIOMigration->save_iterate_empty_hit flag, can we simply trace it every time?
I will have to do some tests to be sure, but if there's a possibility that
we get ENOMSG many times then obviously we don't want to flood the logs with
this trace event in that case - we want to log only the
"data present" -> "data not present" edge/change.
>>
>> vfio_update_estimated_pending_data(migration, data_size);
>> @@ -630,6 +641,8 @@ static int vfio_save_complete_precopy(QEMUFile *f, void *opaque)
>> int ret;
>> Error *local_err = NULL;
>>
>> + trace_vfio_save_complete_precopy_started(vbasedev->name);
>
> I assume this trace is used to measure how long it takes for vfio_save_complete_precopy() to run? If so, can we use trace_vmstate_downtime_save to achieve the same goal?
With appropriate filtering I guess it could be used to
extract the same data, although an explicit VFIO trace point
makes it easier to just look at these traces directly
(less noise from other devices).
But at the same time trace_vfio_save_complete_precopy
already exists and by this metric it would also be
unnecessary.
> Thanks.
Thanks,
Maciej
* Re: [PATCH 1/4] vfio/migration: Add save_{iterate,complete_precopy}_started trace events
From: Maciej S. Szmigiero @ 2024-11-01 16:48 UTC (permalink / raw)
To: Avihai Horon
Cc: Alex Williamson, Fabiano Rosas, Peter Xu, Cédric Le Goater,
Eric Blake, Markus Armbruster, Daniel P . Berrangé,
Joao Martins, qemu-devel
On 31.10.2024 23:17, Maciej S. Szmigiero wrote:
> Hi Avihai,
>
> On 31.10.2024 15:21, Avihai Horon wrote:
>> Hi Maciej,
>>
>> On 29/10/2024 16:58, Maciej S. Szmigiero wrote:
>>> From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>
>>>
>>> This way both the start and end points of migrating a particular VFIO
>>> device are known.
>>>
>>> Add also a vfio_save_iterate_empty_hit trace event so it is known when
>>> there's no more data to send for that device.
>>>
>>> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
>>> ---
>>> hw/vfio/migration.c | 13 +++++++++++++
>>> hw/vfio/trace-events | 3 +++
>>> include/hw/vfio/vfio-common.h | 3 +++
>>> 3 files changed, 19 insertions(+)
>>>
>>> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
>>> index 992dc3b10257..1b1ddf527d69 100644
>>> --- a/hw/vfio/migration.c
>>> +++ b/hw/vfio/migration.c
(..)
>> In any case, I think the above could fit better in vfio_save_block(), where ENOMSG indicates that the device has no more data to send during pre-copy phase:
>>
>> ...
>> if (data_size < 0) {
>> /*
>> * Pre-copy emptied all the device state for now. For more information,
>> * please refer to the Linux kernel VFIO uAPI.
>> */
>> if (errno == ENOMSG) {
>> trace_vfio_save_iterate_empty_hit(vbasedev->name) <--------------- move it here
>> return 0;
>> }
>>
>> return -errno;
>> }
>> ...
>>
>> If you move the trace there, maybe renaming it to trace_vfio_precopy_empty_hit() will be more accurate?
>
> This move and rename seems sensible to me.
>
>> And trying to avoid adding the extra VFIOMigration->save_iterate_empty_hit flag, can we simply trace it every time?
>
> Will have to do some tests to be sure but if there's possibility that
> we get ENOMSG many times then obviously we don't want to flood logs with
> this trace event in this case - we want to only log the
> "data present" -> "data not present" edge/change.
>
Indeed - it needs to be triggered only on the "non-empty"/"empty" change,
otherwise there are 3 repeats of this trace event per VFIO device in
rapid succession.
So some kind of flag here is necessary: with the trace event being
"re-armed" by a non-empty read from the VFIO device I get just one such
event per device.
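For illustration, a rough sketch of that approach (with the event renamed
to vfio_precopy_empty_hit as suggested above; the precopy_empty_hit flag
name below is just a placeholder, not necessarily its final form) could
look like this in vfio_save_block():
    data_size = read(migration->data_fd, migration->data_buffer,
                     migration->data_buffer_size);
    if (data_size < 0) {
        /*
         * Pre-copy emptied all the device state for now. For more
         * information, please refer to the Linux kernel VFIO uAPI.
         */
        if (errno == ENOMSG) {
            /* only trace the "non-empty" -> "empty" transition */
            if (!migration->precopy_empty_hit) {
                trace_vfio_precopy_empty_hit(migration->vbasedev->name);
                migration->precopy_empty_hit = true;
            }
            return 0;
        }
        return -errno;
    }
    /* a non-empty read re-arms the trace event */
    migration->precopy_empty_hit = false;
    ...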
Thanks,
Maciej
* Re: [PATCH 1/4] vfio/migration: Add save_{iterate,complete_precopy}_started trace events
From: Avihai Horon @ 2024-11-04 8:08 UTC (permalink / raw)
To: Maciej S. Szmigiero
Cc: Alex Williamson, Fabiano Rosas, Peter Xu, Cédric Le Goater,
Eric Blake, Markus Armbruster, Daniel P . Berrangé,
Joao Martins, qemu-devel
On 01/11/2024 0:17, Maciej S. Szmigiero wrote:
> Hi Avihai,
>
> On 31.10.2024 15:21, Avihai Horon wrote:
>> Hi Maciej,
>>
>> On 29/10/2024 16:58, Maciej S. Szmigiero wrote:
>>> From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>
>>>
>>> This way both the start and end points of migrating a particular VFIO
>>> device are known.
>>>
>>> Add also a vfio_save_iterate_empty_hit trace event so it is known when
>>> there's no more data to send for that device.
>>>
>>> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
>>> ---
>>> hw/vfio/migration.c | 13 +++++++++++++
>>> hw/vfio/trace-events | 3 +++
>>> include/hw/vfio/vfio-common.h | 3 +++
>>> 3 files changed, 19 insertions(+)
>>>
>>> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
>>> index 992dc3b10257..1b1ddf527d69 100644
>>> --- a/hw/vfio/migration.c
>>> +++ b/hw/vfio/migration.c
>>> @@ -472,6 +472,9 @@ static int vfio_save_setup(QEMUFile *f, void
>>> *opaque, Error **errp)
>>> return -ENOMEM;
>>> }
>>>
>>> + migration->save_iterate_started = false;
>>> + migration->save_iterate_empty_hit = false;
>>> +
>>> if (vfio_precopy_supported(vbasedev)) {
>>> switch (migration->device_state) {
>>> case VFIO_DEVICE_STATE_RUNNING:
>>> @@ -602,9 +605,17 @@ static int vfio_save_iterate(QEMUFile *f, void
>>> *opaque)
>>> VFIOMigration *migration = vbasedev->migration;
>>> ssize_t data_size;
>>>
>>> + if (!migration->save_iterate_started) {
>>> + trace_vfio_save_iterate_started(vbasedev->name);
>>> + migration->save_iterate_started = true;
>>> + }
>>> +
>>> data_size = vfio_save_block(f, migration);
>>> if (data_size < 0) {
>>> return data_size;
>>> + } else if (data_size == 0 && !migration->save_iterate_empty_hit) {
>>> + trace_vfio_save_iterate_empty_hit(vbasedev->name);
>>> + migration->save_iterate_empty_hit = true;
>>> }
>>
>> Can we instead use trace_vfio_save_iterate to understand if the
>> device reached 0?
>
> AFAIK there's not way to filter trace events by their parameters,
> like only logging vfio_save_iterate trace event if both parameters
> are zero.
>
> It means that vfio_save_iterate has to be enabled unconditionally to
> serve as a replacement for vfio_save_iterate_empty_hit, which could
> result in it being logged/emitted many extra times (with non-zero
> parameters).
>
> Because of that I think having a dedicated trace event for such
> occasion makes sense (it is also easily grep-able).
Ahh, I understand.
>
>> In any case, I think the above could fit better in vfio_save_block(),
>> where ENOMSG indicates that the device has no more data to send
>> during pre-copy phase:
>>
>> ...
>> if (data_size < 0) {
>> /*
>> * Pre-copy emptied all the device state for now. For more
>> information,
>> * please refer to the Linux kernel VFIO uAPI.
>> */
>> if (errno == ENOMSG) {
>> trace_vfio_save_iterate_empty_hit(vbasedev->name) <---------------
>> move it here
>> return 0;
>> }
>>
>> return -errno;
>> }
>> ...
>>
>> If you move the trace there, maybe renaming it to
>> trace_vfio_precopy_empty_hit() will be more accurate?
>
> This move and rename seems sensible to me.
>
>> And trying to avoid adding the extra
>> VFIOMigration->save_iterate_empty_hit flag, can we simply trace it
>> every time?
>
> Will have to do some tests to be sure but if there's possibility that
> we get ENOMSG many times then obviously we don't want to flood logs with
> this trace event in this case - we want to only log the
> "data present" -> "data not present" edge/change.
OK, so I guess a flag is really needed.
BTW, there is also trace_vfio_state_pending_exact, maybe it could do the
job? It might get called multiple times but not as many as
vfio_save_iterate.
>
>>>
>>> vfio_update_estimated_pending_data(migration, data_size);
>>> @@ -630,6 +641,8 @@ static int vfio_save_complete_precopy(QEMUFile
>>> *f, void *opaque)
>>> int ret;
>>> Error *local_err = NULL;
>>>
>>> + trace_vfio_save_complete_precopy_started(vbasedev->name);
>>
>> I assume this trace is used to measure how long it takes for
>> vfio_save_complete_precopy() to run? If so, can we use
>> trace_vmstate_downtime_save to achieve the same goal?
>
> With an appropriate filtering I guess it could be used to
> extract the same data, although explicit VFIO trace point
> makes it easier to just look at these traces directly
> (less noise from other devices).
I see. Well, I don't have a strong opinion if it makes life easier.
Thanks.
>
> But at the same time trace_vfio_save_complete_precopy
> already exists and by this metric it would also be
> unnecessary.
>
>> Thanks.
>
> Thanks,
> Maciej
>
* Re: [PATCH 1/4] vfio/migration: Add save_{iterate,complete_precopy}_started trace events
From: Maciej S. Szmigiero @ 2024-11-04 14:00 UTC (permalink / raw)
To: Avihai Horon
Cc: Alex Williamson, Fabiano Rosas, Peter Xu, Cédric Le Goater,
Eric Blake, Markus Armbruster, Daniel P . Berrangé,
Joao Martins, qemu-devel
On 4.11.2024 09:08, Avihai Horon wrote:
>
> On 01/11/2024 0:17, Maciej S. Szmigiero wrote:
>> Hi Avihai,
>>
>> On 31.10.2024 15:21, Avihai Horon wrote:
>>> Hi Maciej,
>>>
>>> On 29/10/2024 16:58, Maciej S. Szmigiero wrote:
>>>> From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>
>>>>
>>>> This way both the start and end points of migrating a particular VFIO
>>>> device are known.
>>>>
>>>> Add also a vfio_save_iterate_empty_hit trace event so it is known when
>>>> there's no more data to send for that device.
>>>>
>>>> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
>>>> ---
>>>> hw/vfio/migration.c | 13 +++++++++++++
>>>> hw/vfio/trace-events | 3 +++
>>>> include/hw/vfio/vfio-common.h | 3 +++
>>>> 3 files changed, 19 insertions(+)
>>>>
>>>> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
>>>> index 992dc3b10257..1b1ddf527d69 100644
>>>> --- a/hw/vfio/migration.c
>>>> +++ b/hw/vfio/migration.c
>>>> @@ -472,6 +472,9 @@ static int vfio_save_setup(QEMUFile *f, void *opaque, Error **errp)
>>>> return -ENOMEM;
>>>> }
>>>>
>>>> + migration->save_iterate_started = false;
>>>> + migration->save_iterate_empty_hit = false;
>>>> +
>>>> if (vfio_precopy_supported(vbasedev)) {
>>>> switch (migration->device_state) {
>>>> case VFIO_DEVICE_STATE_RUNNING:
>>>> @@ -602,9 +605,17 @@ static int vfio_save_iterate(QEMUFile *f, void *opaque)
>>>> VFIOMigration *migration = vbasedev->migration;
>>>> ssize_t data_size;
>>>>
>>>> + if (!migration->save_iterate_started) {
>>>> + trace_vfio_save_iterate_started(vbasedev->name);
>>>> + migration->save_iterate_started = true;
>>>> + }
>>>> +
>>>> data_size = vfio_save_block(f, migration);
>>>> if (data_size < 0) {
>>>> return data_size;
>>>> + } else if (data_size == 0 && !migration->save_iterate_empty_hit) {
>>>> + trace_vfio_save_iterate_empty_hit(vbasedev->name);
>>>> + migration->save_iterate_empty_hit = true;
>>>> }
>>>
>>> Can we instead use trace_vfio_save_iterate to understand if the device reached 0?
>>
>> AFAIK there's not way to filter trace events by their parameters,
>> like only logging vfio_save_iterate trace event if both parameters
>> are zero.
>>
>> It means that vfio_save_iterate has to be enabled unconditionally to
>> serve as a replacement for vfio_save_iterate_empty_hit, which could
>> result in it being logged/emitted many extra times (with non-zero
>> parameters).
>>
>> Because of that I think having a dedicated trace event for such
>> occasion makes sense (it is also easily grep-able).
>
> Ahh, I understand.
>
>>
>>> In any case, I think the above could fit better in vfio_save_block(), where ENOMSG indicates that the device has no more data to send during pre-copy phase:
>>>
>>> ...
>>> if (data_size < 0) {
>>> /*
>>> * Pre-copy emptied all the device state for now. For more information,
>>> * please refer to the Linux kernel VFIO uAPI.
>>> */
>>> if (errno == ENOMSG) {
>>> trace_vfio_save_iterate_empty_hit(vbasedev->name) <--------------- move it here
>>> return 0;
>>> }
>>>
>>> return -errno;
>>> }
>>> ...
>>>
>>> If you move the trace there, maybe renaming it to trace_vfio_precopy_empty_hit() will be more accurate?
>>
>> This move and rename seems sensible to me.
>>
>>> And trying to avoid adding the extra VFIOMigration->save_iterate_empty_hit flag, can we simply trace it every time?
>>
>> Will have to do some tests to be sure but if there's possibility that
>> we get ENOMSG many times then obviously we don't want to flood logs with
>> this trace event in this case - we want to only log the
>> "data present" -> "data not present" edge/change.
>
> OK, so I guess a flag is really needed.
> BTW, there is also trace_vfio_state_pending_exact, maybe it could do the job? It might get called multiple times but not as many as vfio_save_iterate.
In a quick test run it was still called/logged 5 times for each VFIO device,
so quite a bit more often than the empty_hit one (which was logged just once per dev).
>>
>>>>
>>>> vfio_update_estimated_pending_data(migration, data_size);
>>>> @@ -630,6 +641,8 @@ static int vfio_save_complete_precopy(QEMUFile *f, void *opaque)
>>>> int ret;
>>>> Error *local_err = NULL;
>>>>
>>>> + trace_vfio_save_complete_precopy_started(vbasedev->name);
>>>
>>> I assume this trace is used to measure how long it takes for vfio_save_complete_precopy() to run? If so, can we use trace_vmstate_downtime_save to achieve the same goal?
>>
>> With an appropriate filtering I guess it could be used to
>> extract the same data, although explicit VFIO trace point
>> makes it easier to just look at these traces directly
>> (less noise from other devices).
>
> I see. Well, I don't have a strong opinion if it makes life easier.
>
> Thanks.
>
>>
>> But at the same time trace_vfio_save_complete_precopy
>> already exists and by this metric it would also be
>> unnecessary.
>>
>>> Thanks.
>>
>> Thanks,
>> Maciej
>>
Thanks,
Maciej
* Re: [PATCH 1/4] vfio/migration: Add save_{iterate,complete_precopy}_started trace events
From: Avihai Horon @ 2024-11-04 14:10 UTC (permalink / raw)
To: Maciej S. Szmigiero
Cc: Alex Williamson, Fabiano Rosas, Peter Xu, Cédric Le Goater,
Eric Blake, Markus Armbruster, Daniel P . Berrangé,
Joao Martins, qemu-devel
On 04/11/2024 16:00, Maciej S. Szmigiero wrote:
> On 4.11.2024 09:08, Avihai Horon wrote:
>>
>> On 01/11/2024 0:17, Maciej S. Szmigiero wrote:
>>> Hi Avihai,
>>>
>>> On 31.10.2024 15:21, Avihai Horon wrote:
>>>> Hi Maciej,
>>>>
>>>> On 29/10/2024 16:58, Maciej S. Szmigiero wrote:
>>>>> From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>
>>>>>
>>>>> This way both the start and end points of migrating a particular VFIO
>>>>> device are known.
>>>>>
>>>>> Add also a vfio_save_iterate_empty_hit trace event so it is known
>>>>> when
>>>>> there's no more data to send for that device.
>>>>>
>>>>> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
>>>>> ---
>>>>> hw/vfio/migration.c | 13 +++++++++++++
>>>>> hw/vfio/trace-events | 3 +++
>>>>> include/hw/vfio/vfio-common.h | 3 +++
>>>>> 3 files changed, 19 insertions(+)
>>>>>
>>>>> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
>>>>> index 992dc3b10257..1b1ddf527d69 100644
>>>>> --- a/hw/vfio/migration.c
>>>>> +++ b/hw/vfio/migration.c
>>>>> @@ -472,6 +472,9 @@ static int vfio_save_setup(QEMUFile *f, void
>>>>> *opaque, Error **errp)
>>>>> return -ENOMEM;
>>>>> }
>>>>>
>>>>> + migration->save_iterate_started = false;
>>>>> + migration->save_iterate_empty_hit = false;
>>>>> +
>>>>> if (vfio_precopy_supported(vbasedev)) {
>>>>> switch (migration->device_state) {
>>>>> case VFIO_DEVICE_STATE_RUNNING:
>>>>> @@ -602,9 +605,17 @@ static int vfio_save_iterate(QEMUFile *f,
>>>>> void *opaque)
>>>>> VFIOMigration *migration = vbasedev->migration;
>>>>> ssize_t data_size;
>>>>>
>>>>> + if (!migration->save_iterate_started) {
>>>>> + trace_vfio_save_iterate_started(vbasedev->name);
>>>>> + migration->save_iterate_started = true;
>>>>> + }
>>>>> +
>>>>> data_size = vfio_save_block(f, migration);
>>>>> if (data_size < 0) {
>>>>> return data_size;
>>>>> + } else if (data_size == 0 &&
>>>>> !migration->save_iterate_empty_hit) {
>>>>> + trace_vfio_save_iterate_empty_hit(vbasedev->name);
>>>>> + migration->save_iterate_empty_hit = true;
>>>>> }
>>>>
>>>> Can we instead use trace_vfio_save_iterate to understand if the
>>>> device reached 0?
>>>
>>> AFAIK there's not way to filter trace events by their parameters,
>>> like only logging vfio_save_iterate trace event if both parameters
>>> are zero.
>>>
>>> It means that vfio_save_iterate has to be enabled unconditionally to
>>> serve as a replacement for vfio_save_iterate_empty_hit, which could
>>> result in it being logged/emitted many extra times (with non-zero
>>> parameters).
>>>
>>> Because of that I think having a dedicated trace event for such
>>> occasion makes sense (it is also easily grep-able).
>>
>> Ahh, I understand.
>>
>>>
>>>> In any case, I think the above could fit better in
>>>> vfio_save_block(), where ENOMSG indicates that the device has no
>>>> more data to send during pre-copy phase:
>>>>
>>>> ...
>>>> if (data_size < 0) {
>>>> /*
>>>> * Pre-copy emptied all the device state for now. For more
>>>> information,
>>>> * please refer to the Linux kernel VFIO uAPI.
>>>> */
>>>> if (errno == ENOMSG) {
>>>> trace_vfio_save_iterate_empty_hit(vbasedev->name) <---------------
>>>> move it here
>>>> return 0;
>>>> }
>>>>
>>>> return -errno;
>>>> }
>>>> ...
>>>>
>>>> If you move the trace there, maybe renaming it to
>>>> trace_vfio_precopy_empty_hit() will be more accurate?
>>>
>>> This move and rename seems sensible to me.
>>>
>>>> And trying to avoid adding the extra
>>>> VFIOMigration->save_iterate_empty_hit flag, can we simply trace it
>>>> every time?
>>>
>>> Will have to do some tests to be sure but if there's possibility that
>>> we get ENOMSG many times then obviously we don't want to flood logs
>>> with
>>> this trace event in this case - we want to only log the
>>> "data present" -> "data not present" edge/change.
>>
>> OK, so I guess a flag is really needed.
>> BTW, there is also trace_vfio_state_pending_exact, maybe it could do
>> the job? It might get called multiple times but not as many as
>> vfio_save_iterate.
>
> In a quick test run it was still called/logged 5 times for each VFIO
> device
> so quite more often than the empty_hit one (which was logged just once
> per dev).
Yes, that is expected.
If that's too noisy for you then the empty_hit trace seems fine, IMHO.
Thanks.
* [PATCH 2/4] migration/ram: Add load start trace event
From: Maciej S. Szmigiero @ 2024-10-29 14:58 UTC (permalink / raw)
To: Peter Xu, Fabiano Rosas
Cc: Alex Williamson, Cédric Le Goater, Eric Blake,
Markus Armbruster, Daniel P . Berrangé, Avihai Horon,
Joao Martins, qemu-devel
From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>
There's a RAM load complete trace event but there was no equivalent trace event
for the start of the RAM load.
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
---
migration/ram.c | 1 +
migration/trace-events | 1 +
2 files changed, 2 insertions(+)
diff --git a/migration/ram.c b/migration/ram.c
index 326ce7eb791f..72cd1dbb22c4 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -4294,6 +4294,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
* it will be necessary to reduce the granularity of this
* critical section.
*/
+ trace_ram_load_start();
WITH_RCU_READ_LOCK_GUARD() {
if (postcopy_running) {
/*
diff --git a/migration/trace-events b/migration/trace-events
index c65902f042bd..2a99a7baaea6 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -115,6 +115,7 @@ colo_flush_ram_cache_end(void) ""
save_xbzrle_page_skipping(void) ""
save_xbzrle_page_overflow(void) ""
ram_save_iterate_big_wait(uint64_t milliconds, int iterations) "big wait: %" PRIu64 " milliseconds, %d iterations"
+ram_load_start(void) ""
ram_load_complete(int ret, uint64_t seq_iter) "exit_code %d seq iteration %" PRIu64
ram_write_tracking_ramblock_start(const char *block_id, size_t page_size, void *addr, size_t length) "%s: page_size: %zu addr: %p length: %zu"
ram_write_tracking_ramblock_stop(const char *block_id, size_t page_size, void *addr, size_t length) "%s: page_size: %zu addr: %p length: %zu"
* [PATCH 3/4] migration/multifd: Zero p->flags before starting filling a packet
From: Maciej S. Szmigiero @ 2024-10-29 14:58 UTC (permalink / raw)
To: Peter Xu, Fabiano Rosas
Cc: Alex Williamson, Cédric Le Goater, Eric Blake,
Markus Armbruster, Daniel P . Berrangé, Avihai Horon,
Joao Martins, qemu-devel
From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>
This way no stale flags are left there.
p->flags can't contain SYNC to be sent at the next RAM packet since syncs
are now handled separately in multifd_send_thread.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
---
migration/multifd.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/migration/multifd.c b/migration/multifd.c
index 9b200f4ad912..c9b24caa64bc 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -600,6 +600,7 @@ static void *multifd_send_thread(void *opaque)
* qatomic_store_release() in multifd_send().
*/
if (qatomic_load_acquire(&p->pending_job)) {
+ p->flags = 0;
p->iovs_num = 0;
assert(!multifd_payload_empty(p->data));
@@ -651,7 +652,6 @@ static void *multifd_send_thread(void *opaque)
}
/* p->next_packet_size will always be zero for a SYNC packet */
stat64_add(&mig_stats.multifd_bytes, p->packet_len);
- p->flags = 0;
}
qatomic_set(&p->pending_sync, false);
* [PATCH 4/4] migration: Document the BQL behavior of load SaveVMHandlers
From: Maciej S. Szmigiero @ 2024-10-29 14:58 UTC (permalink / raw)
To: Peter Xu, Fabiano Rosas
Cc: Alex Williamson, Cédric Le Goater, Eric Blake,
Markus Armbruster, Daniel P . Berrangé, Avihai Horon,
Joao Martins, qemu-devel
From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>
Some of these SaveVMHandlers were missing the BQL behavior annotation,
making people wonder what exactly it is.
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
---
include/migration/register.h | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/include/migration/register.h b/include/migration/register.h
index f60e797894e5..c411d84876ec 100644
--- a/include/migration/register.h
+++ b/include/migration/register.h
@@ -210,6 +210,8 @@ typedef struct SaveVMHandlers {
void (*state_pending_exact)(void *opaque, uint64_t *must_precopy,
uint64_t *can_postcopy);
+ /* This runs inside the BQL. */
+
/**
* @load_state
*
@@ -227,6 +229,8 @@ typedef struct SaveVMHandlers {
*/
int (*load_state)(QEMUFile *f, void *opaque, int version_id);
+ /* The following handlers run inside the BQL. */
+
/**
* @load_setup
*
* Re: [PATCH 4/4] migration: Document the BQL behavior of load SaveVMHandlers
From: Fabiano Rosas @ 2024-10-29 19:26 UTC (permalink / raw)
To: Maciej S. Szmigiero, Peter Xu
Cc: Alex Williamson, Cédric Le Goater, Eric Blake,
Markus Armbruster, Daniel P . Berrangé, Avihai Horon,
Joao Martins, qemu-devel
"Maciej S. Szmigiero" <mail@maciej.szmigiero.name> writes:
> From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>
>
> Some of these SaveVMHandlers were missing the BQL behavior annotation,
> making people wonder what it exactly is.
>
> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
* Re: [PATCH 4/4] migration: Document the BQL behavior of load SaveVMHandlers
From: Peter Xu @ 2024-10-29 20:35 UTC (permalink / raw)
To: Maciej S. Szmigiero
Cc: Fabiano Rosas, Alex Williamson, Cédric Le Goater, Eric Blake,
Markus Armbruster, Daniel P . Berrangé, Avihai Horon,
Joao Martins, qemu-devel
On Tue, Oct 29, 2024 at 03:58:16PM +0100, Maciej S. Szmigiero wrote:
> From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>
>
> Some of these SaveVMHandlers were missing the BQL behavior annotation,
> making people wonder what it exactly is.
>
> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
> ---
> include/migration/register.h | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/include/migration/register.h b/include/migration/register.h
> index f60e797894e5..c411d84876ec 100644
> --- a/include/migration/register.h
> +++ b/include/migration/register.h
> @@ -210,6 +210,8 @@ typedef struct SaveVMHandlers {
> void (*state_pending_exact)(void *opaque, uint64_t *must_precopy,
> uint64_t *can_postcopy);
>
> + /* This runs inside the BQL. */
> +
> /**
> * @load_state
> *
> @@ -227,6 +229,8 @@ typedef struct SaveVMHandlers {
> */
> int (*load_state)(QEMUFile *f, void *opaque, int version_id);
>
> + /* The following handlers run inside the BQL. */
If the above also requires BQL, why not move this line up?
OTOH, I think resume_prepare() doesn't require BQL..
> +
> /**
> * @load_setup
> *
>
--
Peter Xu
* Re: [PATCH 4/4] migration: Document the BQL behavior of load SaveVMHandlers
From: Maciej S. Szmigiero @ 2024-10-29 20:46 UTC (permalink / raw)
To: Peter Xu
Cc: Fabiano Rosas, Alex Williamson, Cédric Le Goater, Eric Blake,
Markus Armbruster, Daniel P. Berrangé, Avihai Horon,
Joao Martins, qemu-devel
On 29.10.2024 21:35, Peter Xu wrote:
> On Tue, Oct 29, 2024 at 03:58:16PM +0100, Maciej S. Szmigiero wrote:
>> From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>
>>
>> Some of these SaveVMHandlers were missing the BQL behavior annotation,
>> making people wonder what it exactly is.
>>
>> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
>> ---
>> include/migration/register.h | 4 ++++
>> 1 file changed, 4 insertions(+)
>>
>> diff --git a/include/migration/register.h b/include/migration/register.h
>> index f60e797894e5..c411d84876ec 100644
>> --- a/include/migration/register.h
>> +++ b/include/migration/register.h
>> @@ -210,6 +210,8 @@ typedef struct SaveVMHandlers {
>> void (*state_pending_exact)(void *opaque, uint64_t *must_precopy,
>> uint64_t *can_postcopy);
>>
>> + /* This runs inside the BQL. */
>> +
>> /**
>> * @load_state
>> *
>> @@ -227,6 +229,8 @@ typedef struct SaveVMHandlers {
>> */
>> int (*load_state)(QEMUFile *f, void *opaque, int version_id);
>>
>> + /* The following handlers run inside the BQL. */
>
> If above also requires BQL, why not move this line upper?
The reason for this is that my main patch set also adds a
"load_state_buffer" handler, which runs without the BQL.
That handler is ordered right after "load_state" and I tried
to avoid further comment churn here.
But if you prefer to change these comments in the patch
introducing the "load_state_buffer" handler instead then that's fine.
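Roughly, the comment grouping I have in mind once that handler lands would
be the following (the load_state_buffer prototype below is only a
placeholder for illustration, not its final form):
    /* This runs inside the BQL. */
    /**
     * @load_state
     */
    int (*load_state)(QEMUFile *f, void *opaque, int version_id);
    /* This runs outside the BQL. */
    /* placeholder prototype - the real one comes with the main patch set */
    int (*load_state_buffer)(void *opaque, char *data, size_t data_size,
                             Error **errp);
    /* The following handlers run inside the BQL. */
    /**
     * @load_setup
     */
    ...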
> OTOH, I think resume_prepare() doesn't require BQL..
Yes, it seems like resume_prepare() is only called outside BQL.
Will update the patch.
Thanks,
Maciej
* Re: [PATCH 4/4] migration: Document the BQL behavior of load SaveVMHandlers
From: Peter Xu @ 2024-10-29 21:04 UTC (permalink / raw)
To: Maciej S. Szmigiero
Cc: Fabiano Rosas, Alex Williamson, Cédric Le Goater, Eric Blake,
Markus Armbruster, Daniel P. Berrangé, Avihai Horon,
Joao Martins, qemu-devel
On Tue, Oct 29, 2024 at 09:46:01PM +0100, Maciej S. Szmigiero wrote:
> On 29.10.2024 21:35, Peter Xu wrote:
> > On Tue, Oct 29, 2024 at 03:58:16PM +0100, Maciej S. Szmigiero wrote:
> > > From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>
> > >
> > > Some of these SaveVMHandlers were missing the BQL behavior annotation,
> > > making people wonder what it exactly is.
> > >
> > > Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
> > > ---
> > > include/migration/register.h | 4 ++++
> > > 1 file changed, 4 insertions(+)
> > >
> > > diff --git a/include/migration/register.h b/include/migration/register.h
> > > index f60e797894e5..c411d84876ec 100644
> > > --- a/include/migration/register.h
> > > +++ b/include/migration/register.h
> > > @@ -210,6 +210,8 @@ typedef struct SaveVMHandlers {
> > > void (*state_pending_exact)(void *opaque, uint64_t *must_precopy,
> > > uint64_t *can_postcopy);
> > > + /* This runs inside the BQL. */
> > > +
> > > /**
> > > * @load_state
> > > *
> > > @@ -227,6 +229,8 @@ typedef struct SaveVMHandlers {
> > > */
> > > int (*load_state)(QEMUFile *f, void *opaque, int version_id);
> > > + /* The following handlers run inside the BQL. */
> >
> > If above also requires BQL, why not move this line upper?
>
> The reason for this is that my main patch set also adds
> "load_state_buffer" handler, which runs without BQL.
>
> That handler is ordered next after "load_state" and I tried
> to avoid further comment churn here.
>
> But if you prefer to change these comments in the patch
> introducing "load_state_buffer" handler instead then it's
> fine.
Considering how small the benefit of the current change is to readers.. I
think it's OK if we do this in one shot after the load_state_buffer() change.
> > OTOH, I think resume_prepare() doesn't require BQL..
>
> Yes, it seems like resume_prepare() is only called outside BQL.
> Will update the patch.
Thanks,
--
Peter Xu
* Re: [PATCH 0/4] Trivial patches from multifd device state transfer support patch set
From: Peter Xu @ 2024-10-29 20:40 UTC (permalink / raw)
To: Maciej S. Szmigiero
Cc: Fabiano Rosas, Alex Williamson, Cédric Le Goater, Eric Blake,
Markus Armbruster, Daniel P . Berrangé, Avihai Horon,
Joao Martins, qemu-devel
On Tue, Oct 29, 2024 at 03:58:12PM +0100, Maciej S. Szmigiero wrote:
> From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>
>
> A new version of the multifd device state transfer support with VFIO consumer
> patch set is being prepared, the previous version and the associated
> discussion is available here:
> https://lore.kernel.org/qemu-devel/cover.1724701542.git.maciej.szmigiero@oracle.com/
>
> This new version was originally targeting QEMU 9.2 but such schedule proved
> to be too optimistic due to sheer number of invasive changes/rework required,
> especially with respect to the VFIO internal threads management and their
> synchronization with the migration core.
>
> In addition to these changes, recently merged commit 3b5948f808e3
> ("vfio/migration: Report only stop-copy size in vfio_state_pending_exact()")
> seems to have uncovered a race between multifd RAM and device state transfers:
> RAM transfer sender finishes the multifd stream with a SYNC in
> ram_save_complete() but the multifd receive channels are only released
> from this SYNC after the migration is wholly complete in
> process_incoming_migration_bh().
>
> The above causes problems if the multifd channels need to still be
> running after the RAM transfer is completed, for example because
> there is still remaining device state to be transferred.
>
> Since QEMU 9.2 code freeze is coming I've separated small uncontroversial
> commits from that WiP main patch set here, some of which were already
> reviewed during previous main patch set iterations.
>
> This way at least future code conflicts can be reduced and the amount
> of patches that need to be carried in the future versions of the main
> patch set is reduced.
>
>
> Maciej S. Szmigiero (4):
> vfio/migration: Add save_{iterate,complete_precopy}_started trace
> events
> migration/ram: Add load start trace event
> migration/multifd: Zero p->flags before starting filling a packet
> migration: Document the BQL behavior of load SaveVMHandlers
I queued patches 2-3. Patch 4 is OK to be merged even after softfreeze if
it's a doc-only change, but we don't need to rush either..
Thanks,
--
Peter Xu
* Re: [PATCH 0/4] Trivial patches from multifd device state transfer support patch set
From: Maciej S. Szmigiero @ 2024-10-29 20:46 UTC (permalink / raw)
To: Peter Xu
Cc: Fabiano Rosas, Alex Williamson, Cédric Le Goater, Eric Blake,
Markus Armbruster, Daniel P. Berrangé, Avihai Horon,
Joao Martins, qemu-devel
On 29.10.2024 21:40, Peter Xu wrote:
> On Tue, Oct 29, 2024 at 03:58:12PM +0100, Maciej S. Szmigiero wrote:
>> From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>
>>
>> A new version of the multifd device state transfer support with VFIO consumer
>> patch set is being prepared, the previous version and the associated
>> discussion is available here:
>> https://lore.kernel.org/qemu-devel/cover.1724701542.git.maciej.szmigiero@oracle.com/
>>
>> This new version was originally targeting QEMU 9.2 but such schedule proved
>> to be too optimistic due to sheer number of invasive changes/rework required,
>> especially with respect to the VFIO internal threads management and their
>> synchronization with the migration core.
>>
>> In addition to these changes, recently merged commit 3b5948f808e3
>> ("vfio/migration: Report only stop-copy size in vfio_state_pending_exact()")
>> seems to have uncovered a race between multifd RAM and device state transfers:
>> RAM transfer sender finishes the multifd stream with a SYNC in
>> ram_save_complete() but the multifd receive channels are only released
>> from this SYNC after the migration is wholly complete in
>> process_incoming_migration_bh().
>>
>> The above causes problems if the multifd channels need to still be
>> running after the RAM transfer is completed, for example because
>> there is still remaining device state to be transferred.
>>
>> Since QEMU 9.2 code freeze is coming I've separated small uncontroversial
>> commits from that WiP main patch set here, some of which were already
>> reviewed during previous main patch set iterations.
>>
>> This way at least future code conflicts can be reduced and the amount
>> of patches that need to be carried in the future versions of the main
>> patch set is reduced.
>>
>>
>> Maciej S. Szmigiero (4):
>> vfio/migration: Add save_{iterate,complete_precopy}_started trace
>> events
>> migration/ram: Add load start trace event
>> migration/multifd: Zero p->flags before starting filling a packet
>> migration: Document the BQL behavior of load SaveVMHandlers
>
> I queued patch 2-3. Patch 4 is ok to be merged even after softfreeze if
> it's a doc only change, but we don't need to rush either..
>
> Thanks,
>
Thanks!
Maciej