* Re: [PATCH v3] vfio/migration: Send VFIO_MIGRATION event before PRE_COPY_P2P transition
2026-02-02 17:34 [PATCH v3] vfio/migration: Send VFIO_MIGRATION event before PRE_COPY_P2P transition Avihai Horon
@ 2026-02-03 16:27 ` Cédric Le Goater
2026-02-03 16:48 ` Cédric Le Goater
2026-02-04 13:22 ` Markus Armbruster
2 siblings, 0 replies; 18+ messages in thread
From: Cédric Le Goater @ 2026-02-03 16:27 UTC (permalink / raw)
To: Avihai Horon, qemu-devel
Cc: Alex Williamson, Eric Blake, Markus Armbruster, Peter Xu,
Fabiano Rosas, Maor Gottlieb
On 2/2/26 18:34, Avihai Horon wrote:
> The VFIO_MIGRATION event notifies users when a VFIO device transitions
> to a new state.
>
> One use case for this event is to prevent timeouts for RDMA connections
> to the migrated device. In this case, an external management application
> (not libvirt) consumes the events and disables the RDMA timeout
> mechanism when receiving the event for PRE_COPY_P2P state, which
> indicates that the device is non-responsive.
>
> This is essential because RDMA connections typically have very low
> timeouts (tens of milliseconds), which can be far below migration
> downtime.
>
> However, under heavy resource utilization, the device transition to
> PRE_COPY_P2P can take hundreds of milliseconds to complete. Since the
> VFIO_MIGRATION event is currently sent only after the transition
> completes, it arrives too late, after RDMA connections have already
> timed out.
>
> To address this, send an additional "prepare" event immediately before
> initiating the PRE_COPY_P2P transition. This guarantees timely event
> delivery regardless of how long the actual state transition takes.
>
> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
> ---
> Changes from v2 (https://lore.kernel.org/qemu-devel/20260201122348.28478-1-avihaih@nvidia.com/):
> * Renamed prepare-pre-copy-p2p to pre-copy-p2p-prepare
> * Renamed prep parameter to prepare in mig_state_to_qapi_state() and
> vfio_migration_send_event()
> * Added short explanatory comment before sending the prepare event in
> vfio_migration_set_state()
> * Explicitly used VFIO_DEVICE_STATE_PRE_COPY_P2P as parameter for
> vfio_migration_send_event()
>
> Changes from v1 (https://lore.kernel.org/qemu-devel/20260128105159.10282-1-avihaih@nvidia.com/):
> * Removed VFIO_MIGRATION_PREPARE event and instead added a new
> PREPARE_PRE_COPY_P2P state which is sent before PRE_COPY_P2P
> transition
> * Added details to commit message
> ---
> qapi/vfio.json | 13 +++++++++++--
> hw/vfio/migration.c | 26 +++++++++++++++++++-------
> 2 files changed, 30 insertions(+), 9 deletions(-)
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Thanks,
C.
> diff --git a/qapi/vfio.json b/qapi/vfio.json
> index a1a9c5b673..17b6046871 100644
> --- a/qapi/vfio.json
> +++ b/qapi/vfio.json
> @@ -11,7 +11,13 @@
> ##
> # @QapiVfioMigrationState:
> #
> -# An enumeration of the VFIO device migration states.
> +# An enumeration of the VFIO device migration states. In addition to
> +# the regular states, there are prepare states (with 'prepare' suffix)
> +# which indicate that the device is just about to transition to the
> +# corresponding state. Note that seeing a prepare state for state X
> +# doesn't guarantee that the next state will be X, as the state
> +# transition can fail and the device may transition to a different
> +# state instead.
> #
> # @stop: The device is stopped.
> #
> @@ -32,11 +38,14 @@
> # tracking its internal state and its internal state is available
> # for reading.
> #
> +# @pre-copy-p2p-prepare: The device is just about to move to
> +# pre-copy-p2p state. (since 11.0)
> +#
> # Since: 9.1
> ##
> { 'enum': 'QapiVfioMigrationState',
> 'data': [ 'stop', 'running', 'stop-copy', 'resuming', 'running-p2p',
> - 'pre-copy', 'pre-copy-p2p' ] }
> + 'pre-copy', 'pre-copy-p2p', 'pre-copy-p2p-prepare' ] }
>
> ##
> # @VFIO_MIGRATION:
> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
> index b4695030c7..4bd8e24699 100644
> --- a/hw/vfio/migration.c
> +++ b/hw/vfio/migration.c
> @@ -68,7 +68,7 @@ static const char *mig_state_to_str(enum vfio_device_mig_state state)
> }
>
> static QapiVfioMigrationState
> -mig_state_to_qapi_state(enum vfio_device_mig_state state)
> +mig_state_to_qapi_state(enum vfio_device_mig_state state, bool prepare)
> {
> switch (state) {
> case VFIO_DEVICE_STATE_STOP:
> @@ -84,15 +84,17 @@ mig_state_to_qapi_state(enum vfio_device_mig_state state)
> case VFIO_DEVICE_STATE_PRE_COPY:
> return QAPI_VFIO_MIGRATION_STATE_PRE_COPY;
> case VFIO_DEVICE_STATE_PRE_COPY_P2P:
> - return QAPI_VFIO_MIGRATION_STATE_PRE_COPY_P2P;
> + return prepare ? QAPI_VFIO_MIGRATION_STATE_PRE_COPY_P2P_PREPARE :
> + QAPI_VFIO_MIGRATION_STATE_PRE_COPY_P2P;
> default:
> g_assert_not_reached();
> }
> }
>
> -static void vfio_migration_send_event(VFIODevice *vbasedev)
> +static void vfio_migration_send_event(VFIODevice *vbasedev,
> + enum vfio_device_mig_state state,
> + bool prepare)
> {
> - VFIOMigration *migration = vbasedev->migration;
> DeviceState *dev = vbasedev->dev;
> g_autofree char *qom_path = NULL;
> Object *obj;
> @@ -106,8 +108,8 @@ static void vfio_migration_send_event(VFIODevice *vbasedev)
> g_assert(obj);
> qom_path = object_get_canonical_path(obj);
>
> - qapi_event_send_vfio_migration(
> - dev->id, qom_path, mig_state_to_qapi_state(migration->device_state));
> + qapi_event_send_vfio_migration(dev->id, qom_path,
> + mig_state_to_qapi_state(state, prepare));
> }
>
> static void vfio_migration_set_device_state(VFIODevice *vbasedev,
> @@ -119,7 +121,7 @@ static void vfio_migration_set_device_state(VFIODevice *vbasedev,
> mig_state_to_str(state));
>
> migration->device_state = state;
> - vfio_migration_send_event(vbasedev);
> + vfio_migration_send_event(vbasedev, state, false);
> }
>
> int vfio_migration_set_state(VFIODevice *vbasedev,
> @@ -146,6 +148,16 @@ int vfio_migration_set_state(VFIODevice *vbasedev,
> return 0;
> }
>
> + /*
> + * Send a prepare event before initiating the PRE_COPY_P2P transition to
> + * ensure timely event delivery regardless of how long the state transition
> + * takes.
> + */
> + if (new_state == VFIO_DEVICE_STATE_PRE_COPY_P2P) {
> + vfio_migration_send_event(vbasedev, VFIO_DEVICE_STATE_PRE_COPY_P2P,
> + true);
> + }
> +
> feature->argsz = sizeof(buf);
> feature->flags =
> VFIO_DEVICE_FEATURE_SET | VFIO_DEVICE_FEATURE_MIG_DEVICE_STATE;
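For illustration only, the event ordering this patch introduces can be
sketched as a small stand-alone C unit. It is not QEMU code: the enum,
record_event(), slow_transition() and set_state() are hypothetical
stand-ins, and the fallback state reported on failure is an assumption
made for the sketch. It shows that the prepare event is emitted before the
potentially slow transition, and that a prepare event may be followed by a
state other than pre-copy-p2p when the transition fails.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the QAPI states; not the QEMU definitions. */
enum sketch_state {
    SKETCH_RUNNING,
    SKETCH_PRE_COPY,
    SKETCH_PRE_COPY_P2P,
    SKETCH_PRE_COPY_P2P_PREPARE,
};

#define SKETCH_MAX_EVENTS 8

/* Events are recorded in order instead of being sent over QMP. */
static enum sketch_state sketch_events[SKETCH_MAX_EVENTS];
static size_t sketch_nr_events;

static void record_event(enum sketch_state state)
{
    assert(sketch_nr_events < SKETCH_MAX_EVENTS);
    sketch_events[sketch_nr_events++] = state;
}

/* Stands in for the device state change, which may be slow and may fail. */
static bool slow_transition(bool succeed)
{
    return succeed;
}

/*
 * Emit the prepare event first, then attempt the transition. On failure
 * the sketch reports the fallback state, so a consumer cannot assume that
 * a prepare event is always followed by pre-copy-p2p.
 */
static int set_state(enum sketch_state new_state,
                     enum sketch_state fallback, bool succeed)
{
    if (new_state == SKETCH_PRE_COPY_P2P) {
        record_event(SKETCH_PRE_COPY_P2P_PREPARE);
    }

    if (!slow_transition(succeed)) {
        record_event(fallback);
        return -1;
    }

    record_event(new_state);
    return 0;
}
```

Calling set_state(SKETCH_PRE_COPY_P2P, SKETCH_RUNNING, true) records the
prepare event followed by pre-copy-p2p; passing false for the last argument
records the prepare event followed by the fallback state instead.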
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3] vfio/migration: Send VFIO_MIGRATION event before PRE_COPY_P2P transition
2026-02-02 17:34 [PATCH v3] vfio/migration: Send VFIO_MIGRATION event before PRE_COPY_P2P transition Avihai Horon
2026-02-03 16:27 ` Cédric Le Goater
@ 2026-02-03 16:48 ` Cédric Le Goater
2026-02-03 17:02 ` Peter Xu
2026-02-04 13:22 ` Markus Armbruster
2 siblings, 1 reply; 18+ messages in thread
From: Cédric Le Goater @ 2026-02-03 16:48 UTC (permalink / raw)
To: Avihai Horon, qemu-devel
Cc: Alex Williamson, Eric Blake, Markus Armbruster, Peter Xu,
Fabiano Rosas, Maor Gottlieb
On 2/2/26 18:34, Avihai Horon wrote:
> The VFIO_MIGRATION event notifies users when a VFIO device transitions
> to a new state.
>
> One use case for this event is to prevent timeouts for RDMA connections
> to the migrated device. In this case, an external management application
> (not libvirt) consumes the events and disables the RDMA timeout
> mechanism when receiving the event for PRE_COPY_P2P state, which
> indicates that the device is non-responsive.
>
> This is essential because RDMA connections typically have very low
> timeouts (tens of milliseconds), which can be far below migration
> downtime.
>
> However, under heavy resource utilization, the device transition to
> PRE_COPY_P2P can take hundreds of milliseconds to complete. Since the
> VFIO_MIGRATION event is currently sent only after the transition
> completes, it arrives too late, after RDMA connections have already
> timed out.
>
> To address this, send an additional "prepare" event immediately before
> initiating the PRE_COPY_P2P transition. This guarantees timely event
> delivery regardless of how long the actual state transition takes.
>
> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
> ---
> Changes from v2 (https://lore.kernel.org/qemu-devel/20260201122348.28478-1-avihaih@nvidia.com/):
> * Renamed prepare-pre-copy-p2p to pre-copy-p2p-prepare
> * Renamed prep parameter to prepare in mig_state_to_qapi_state() and
> vfio_migration_send_event()
> * Added short explanatory comment before sending the prepare event in
> vfio_migration_set_state()
> * Explicitly used VFIO_DEVICE_STATE_PRE_COPY_P2P as parameter for
> vfio_migration_send_event()
>
> Changes from v1 (https://lore.kernel.org/qemu-devel/20260128105159.10282-1-avihaih@nvidia.com/):
> * Removed VFIO_MIGRATION_PREPARE event and instead added a new
> PREPARE_PRE_COPY_P2P state which is sent before PRE_COPY_P2P
> transition
> * Added details to commit message
> ---
> qapi/vfio.json | 13 +++++++++++--
> hw/vfio/migration.c | 26 +++++++++++++++++++-------
> 2 files changed, 30 insertions(+), 9 deletions(-)
>
> diff --git a/qapi/vfio.json b/qapi/vfio.json
> index a1a9c5b673..17b6046871 100644
> --- a/qapi/vfio.json
> +++ b/qapi/vfio.json
> @@ -11,7 +11,13 @@
> ##
> # @QapiVfioMigrationState:
(I had forgotten about the vfio-pci "migration-events" property)
Peter, Fabiano,
Do you think it would be interesting to send VFIO migration events
by default?
Thanks,
C.
> #
> -# An enumeration of the VFIO device migration states.
> +# An enumeration of the VFIO device migration states. In addition to
> +# the regular states, there are prepare states (with 'prepare' suffix)
> +# which indicate that the device is just about to transition to the
> +# corresponding state. Note that seeing a prepare state for state X
> +# doesn't guarantee that the next state will be X, as the state
> +# transition can fail and the device may transition to a different
> +# state instead.
> #
> # @stop: The device is stopped.
> #
> @@ -32,11 +38,14 @@
> # tracking its internal state and its internal state is available
> # for reading.
> #
> +# @pre-copy-p2p-prepare: The device is just about to move to
> +# pre-copy-p2p state. (since 11.0)
> +#
> # Since: 9.1
> ##
> { 'enum': 'QapiVfioMigrationState',
> 'data': [ 'stop', 'running', 'stop-copy', 'resuming', 'running-p2p',
> - 'pre-copy', 'pre-copy-p2p' ] }
> + 'pre-copy', 'pre-copy-p2p', 'pre-copy-p2p-prepare' ] }
>
> ##
> # @VFIO_MIGRATION:
> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
> index b4695030c7..4bd8e24699 100644
> --- a/hw/vfio/migration.c
> +++ b/hw/vfio/migration.c
> @@ -68,7 +68,7 @@ static const char *mig_state_to_str(enum vfio_device_mig_state state)
> }
>
> static QapiVfioMigrationState
> -mig_state_to_qapi_state(enum vfio_device_mig_state state)
> +mig_state_to_qapi_state(enum vfio_device_mig_state state, bool prepare)
> {
> switch (state) {
> case VFIO_DEVICE_STATE_STOP:
> @@ -84,15 +84,17 @@ mig_state_to_qapi_state(enum vfio_device_mig_state state)
> case VFIO_DEVICE_STATE_PRE_COPY:
> return QAPI_VFIO_MIGRATION_STATE_PRE_COPY;
> case VFIO_DEVICE_STATE_PRE_COPY_P2P:
> - return QAPI_VFIO_MIGRATION_STATE_PRE_COPY_P2P;
> + return prepare ? QAPI_VFIO_MIGRATION_STATE_PRE_COPY_P2P_PREPARE :
> + QAPI_VFIO_MIGRATION_STATE_PRE_COPY_P2P;
> default:
> g_assert_not_reached();
> }
> }
>
> -static void vfio_migration_send_event(VFIODevice *vbasedev)
> +static void vfio_migration_send_event(VFIODevice *vbasedev,
> + enum vfio_device_mig_state state,
> + bool prepare)
> {
> - VFIOMigration *migration = vbasedev->migration;
> DeviceState *dev = vbasedev->dev;
> g_autofree char *qom_path = NULL;
> Object *obj;
> @@ -106,8 +108,8 @@ static void vfio_migration_send_event(VFIODevice *vbasedev)
> g_assert(obj);
> qom_path = object_get_canonical_path(obj);
>
> - qapi_event_send_vfio_migration(
> - dev->id, qom_path, mig_state_to_qapi_state(migration->device_state));
> + qapi_event_send_vfio_migration(dev->id, qom_path,
> + mig_state_to_qapi_state(state, prepare));
> }
>
> static void vfio_migration_set_device_state(VFIODevice *vbasedev,
> @@ -119,7 +121,7 @@ static void vfio_migration_set_device_state(VFIODevice *vbasedev,
> mig_state_to_str(state));
>
> migration->device_state = state;
> - vfio_migration_send_event(vbasedev);
> + vfio_migration_send_event(vbasedev, state, false);
> }
>
> int vfio_migration_set_state(VFIODevice *vbasedev,
> @@ -146,6 +148,16 @@ int vfio_migration_set_state(VFIODevice *vbasedev,
> return 0;
> }
>
> + /*
> + * Send a prepare event before initiating the PRE_COPY_P2P transition to
> + * ensure timely event delivery regardless of how long the state transition
> + * takes.
> + */
> + if (new_state == VFIO_DEVICE_STATE_PRE_COPY_P2P) {
> + vfio_migration_send_event(vbasedev, VFIO_DEVICE_STATE_PRE_COPY_P2P,
> + true);
> + }
> +
> feature->argsz = sizeof(buf);
> feature->flags =
> VFIO_DEVICE_FEATURE_SET | VFIO_DEVICE_FEATURE_MIG_DEVICE_STATE;
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3] vfio/migration: Send VFIO_MIGRATION event before PRE_COPY_P2P transition
2026-02-03 16:48 ` Cédric Le Goater
@ 2026-02-03 17:02 ` Peter Xu
2026-02-03 17:25 ` Avihai Horon
0 siblings, 1 reply; 18+ messages in thread
From: Peter Xu @ 2026-02-03 17:02 UTC (permalink / raw)
To: Cédric Le Goater
Cc: Avihai Horon, qemu-devel, Alex Williamson, Eric Blake,
Markus Armbruster, Fabiano Rosas, Maor Gottlieb
On Tue, Feb 03, 2026 at 05:48:11PM +0100, Cédric Le Goater wrote:
> Peter, Fabiano,
>
> Do you think it would be interesting to send VFIO migration events
> by default ?
No objection here.
IIUC, the question is why it was introduced with default off in the middle
of 2024. If it was about compatibility with any old mgmt which may be
surprised by these events, then we want to know whether they're ready, and
then whether we need a compat field for it in older machine types (or
enable it even with old machines).
Thanks,
--
Peter Xu
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3] vfio/migration: Send VFIO_MIGRATION event before PRE_COPY_P2P transition
2026-02-03 17:02 ` Peter Xu
@ 2026-02-03 17:25 ` Avihai Horon
2026-02-11 13:01 ` Avihai Horon
0 siblings, 1 reply; 18+ messages in thread
From: Avihai Horon @ 2026-02-03 17:25 UTC (permalink / raw)
To: Peter Xu, Cédric Le Goater
Cc: qemu-devel, Alex Williamson, Eric Blake, Markus Armbruster,
Fabiano Rosas, Maor Gottlieb
On 2/3/2026 7:02 PM, Peter Xu wrote:
> On Tue, Feb 03, 2026 at 05:48:11PM +0100, Cédric Le Goater wrote:
>> Peter, Fabiano,
>>
>> Do you think it would be interesting to send VFIO migration events
>> by default ?
> No objection here.
>
> IIUC it's a matter of why it got introduced with default off in the middle
> of 2024? If it's about compatibility of any old mgmt which may be
> surprised by these events, then we want to know if they're ready, and then
> if we need a compat field for it in older machine types (or enable it even
> with old machines).
I disabled it by default back then because it was only needed for the
use case I mentioned here (specifically, it wasn't needed by libvirt).
I guess you can either leave it as is or enable it by default; it won't
matter.
Thanks.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3] vfio/migration: Send VFIO_MIGRATION event before PRE_COPY_P2P transition
2026-02-03 17:25 ` Avihai Horon
@ 2026-02-11 13:01 ` Avihai Horon
2026-02-11 13:16 ` Cédric Le Goater
0 siblings, 1 reply; 18+ messages in thread
From: Avihai Horon @ 2026-02-11 13:01 UTC (permalink / raw)
To: Peter Xu, Cédric Le Goater
Cc: qemu-devel, Alex Williamson, Eric Blake, Markus Armbruster,
Fabiano Rosas, Maor Gottlieb
On 2/3/2026 7:25 PM, Avihai Horon wrote:
>
> On 2/3/2026 7:02 PM, Peter Xu wrote:
>> On Tue, Feb 03, 2026 at 05:48:11PM +0100, Cédric Le Goater wrote:
>>> Peter, Fabiano,
>>>
>>> Do you think it would be interesting to send VFIO migration events
>>> by default ?
>> No objection here.
>>
>> IIUC it's a matter of why it got introduced with default off in the
>> middle
>> of 2024? If it's about compatibility of any old mgmt which may be
>> surprised by these events, then we want to know if they're ready, and
>> then
>> if we need a compat field for it in older machine types (or enable it
>> even
>> with old machines).
>
> I disabled it by default back then because it was only needed for the
> use case I mentioned here (specifically, it wasn't needed by libvirt).
>
> I guess you can either leave it as is or enable it by default, it
> won't matter.
>
> Thanks.
>
Hi Cedric,
Are we good here?
Thanks.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3] vfio/migration: Send VFIO_MIGRATION event before PRE_COPY_P2P transition
2026-02-11 13:01 ` Avihai Horon
@ 2026-02-11 13:16 ` Cédric Le Goater
2026-02-11 17:45 ` Avihai Horon
0 siblings, 1 reply; 18+ messages in thread
From: Cédric Le Goater @ 2026-02-11 13:16 UTC (permalink / raw)
To: Avihai Horon, Peter Xu
Cc: qemu-devel, Alex Williamson, Eric Blake, Markus Armbruster,
Fabiano Rosas, Maor Gottlieb
Hello,
On 2/11/26 14:01, Avihai Horon wrote:
>
> On 2/3/2026 7:25 PM, Avihai Horon wrote:
>>
>> On 2/3/2026 7:02 PM, Peter Xu wrote:
>>> On Tue, Feb 03, 2026 at 05:48:11PM +0100, Cédric Le Goater wrote:
>>>> Peter, Fabiano,
>>>>
>>>> Do you think it would be interesting to send VFIO migration events
>>>> by default ?
>>> No objection here.
>>>
>>> IIUC it's a matter of why it got introduced with default off in the middle
>>> of 2024? If it's about compatibility of any old mgmt which may be
>>> surprised by these events, then we want to know if they're ready, and then
>>> if we need a compat field for it in older machine types (or enable it even
>>> with old machines).
>>
>> I disabled it by default back then because it was only needed for the use case I mentioned here (specifically, it wasn't needed by libvirt).
>>
>> I guess you can either leave it as is or enable it by default, it won't matter.
>>
>> Thanks.
>>
> Hi Cedric,
>
> Are we good here?
Yes. Sorry, I should have said so. It's queued. I don't have much
for a PR yet, so this will wait a bit.
As for enabling the VFIO migration events by default, I guess we
all agree it could be done. We simply need a stakeholder to change
the default behavior.
Thanks,
C.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3] vfio/migration: Send VFIO_MIGRATION event before PRE_COPY_P2P transition
2026-02-11 13:16 ` Cédric Le Goater
@ 2026-02-11 17:45 ` Avihai Horon
2026-02-16 12:22 ` Cédric Le Goater
0 siblings, 1 reply; 18+ messages in thread
From: Avihai Horon @ 2026-02-11 17:45 UTC (permalink / raw)
To: Cédric Le Goater, Peter Xu
Cc: qemu-devel, Alex Williamson, Eric Blake, Markus Armbruster,
Fabiano Rosas, Maor Gottlieb
On 2/11/2026 3:16 PM, Cédric Le Goater wrote:
> Hello,
>
> On 2/11/26 14:01, Avihai Horon wrote:
>>
>> On 2/3/2026 7:25 PM, Avihai Horon wrote:
>>>
>>> On 2/3/2026 7:02 PM, Peter Xu wrote:
>>>> On Tue, Feb 03, 2026 at 05:48:11PM +0100, Cédric Le Goater wrote:
>>>>> Peter, Fabiano,
>>>>>
>>>>> Do you think it would be interesting to send VFIO migration events
>>>>> by default ?
>>>> No objection here.
>>>>
>>>> IIUC it's a matter of why it got introduced with default off in the
>>>> middle
>>>> of 2024? If it's about compatibility of any old mgmt which may be
>>>> surprised by these events, then we want to know if they're ready,
>>>> and then
>>>> if we need a compat field for it in older machine types (or enable
>>>> it even
>>>> with old machines).
>>>
>>> I disabled it by default back then because it was only needed for
>>> the use case I mentioned here (specifically, it wasn't needed by
>>> libvirt).
>>>
>>> I guess you can either leave it as is or enable it by default, it
>>> won't matter.
>>>
>>> Thanks.
>>>
>> Hi Cedric,
>>
>> Are we good here?
> Yes. Sorry I should have said so. It's queued. I don't have much
> for a PR, so this will wait a bit.
Great, thanks!
>
>
> As for enabling the VFIO migration events by default, I guess we
> all agree it could be done. We simply need a stakeholder to change
> the default behavior.
Ack.
Thanks.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3] vfio/migration: Send VFIO_MIGRATION event before PRE_COPY_P2P transition
2026-02-11 17:45 ` Avihai Horon
@ 2026-02-16 12:22 ` Cédric Le Goater
2026-02-16 13:25 ` Avihai Horon
0 siblings, 1 reply; 18+ messages in thread
From: Cédric Le Goater @ 2026-02-16 12:22 UTC (permalink / raw)
To: Avihai Horon, Peter Xu
Cc: qemu-devel, Alex Williamson, Eric Blake, Markus Armbruster,
Fabiano Rosas, Maor Gottlieb
Peter, Avihai,
>> As for enabling the VFIO migration events by default, I guess we
>> all agree it could be done. We simply need a stakeholder to change
>> the default behavior.
>
> Ack.
There could be value in the sysmgmt layers receiving a VFIO
migration event that reports progress. For example, an event could
be sent from vfio_save_iterate() including relevant size metrics
to indicate progress (or the lack of it).
Any opinions?
Thanks,
C.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3] vfio/migration: Send VFIO_MIGRATION event before PRE_COPY_P2P transition
2026-02-16 12:22 ` Cédric Le Goater
@ 2026-02-16 13:25 ` Avihai Horon
2026-02-16 17:40 ` Cédric Le Goater
0 siblings, 1 reply; 18+ messages in thread
From: Avihai Horon @ 2026-02-16 13:25 UTC (permalink / raw)
To: Cédric Le Goater, Peter Xu
Cc: qemu-devel, Alex Williamson, Eric Blake, Markus Armbruster,
Fabiano Rosas, Maor Gottlieb
On 2/16/2026 2:22 PM, Cédric Le Goater wrote:
> Peter, Avihai,
>
>>> As for enabling the VFIO migration events by default, I guess we
>>> all agree it could be done. We simply need a stakeholder to change
>>> the default behavior.
>>
>> Ack.
>
>
> There could be value for the sysmgmt layers to receive a VFIO
> migration event reporting progress. For example, an event could
> be sent from vfio_save_iterate() including relevant size metrics
> to indicate progress (or not).
>
> Any opinions ?
Hmm, I am not sure an event is the right way to report VFIO stats, as
it's not a change that mgmt needs to be notified about promptly.
Mgmt layer can query migration stats when needed using the QMP
'query-migrate' command, which returns, among other info, a VfioStats
struct with the total amount of VFIO device data transferred. If
needed, this can be extended with more info about remaining data size, etc.
Does that make sense?
Thanks.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3] vfio/migration: Send VFIO_MIGRATION event before PRE_COPY_P2P transition
2026-02-16 13:25 ` Avihai Horon
@ 2026-02-16 17:40 ` Cédric Le Goater
2026-02-17 13:05 ` Avihai Horon
0 siblings, 1 reply; 18+ messages in thread
From: Cédric Le Goater @ 2026-02-16 17:40 UTC (permalink / raw)
To: Avihai Horon, Peter Xu
Cc: qemu-devel, Alex Williamson, Eric Blake, Markus Armbruster,
Fabiano Rosas, Maor Gottlieb
On 2/16/26 14:25, Avihai Horon wrote:
>
> On 2/16/2026 2:22 PM, Cédric Le Goater wrote:
>> Peter, Avihai,
>>
>>>> As for enabling the VFIO migration events by default, I guess we
>>>> all agree it could be done. We simply need a stakeholder to change
>>>> the default behavior.
>>>
>>> Ack.
>>
>>
>> There could be value for the sysmgmt layers to receive a VFIO
>> migration event reporting progress. For example, an event could
>> be sent from vfio_save_iterate() including relevant size metrics
>> to indicate progress (or not).
>>
>> Any opinions ?
>
> Amm, I am not sure an event is the right way to report VFIO stats, as it's not some change that mgmt needs to be notified about promptly.
>
> Mgmt layer can query migration stats when needed using the QMP 'query-migrate' command that returns, among other info, a VfioStats struct with the total amount of VFIO devices data transferred. If needed, this can be extended with more info about remaining data size, etc.
>
> Makes sense?
So you would opt to extend migration_populate_vfio_info() with more
info on the VFIO devices. Fine. Mgmt should poll then. My knowledge
of these layers is limited.
The extra info would be "precopy initial" and "precopy dirty" sizes,
I suppose. Per device (more complex) or overall?
Thanks,
C.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3] vfio/migration: Send VFIO_MIGRATION event before PRE_COPY_P2P transition
2026-02-16 17:40 ` Cédric Le Goater
@ 2026-02-17 13:05 ` Avihai Horon
2026-02-17 14:04 ` Cédric Le Goater
0 siblings, 1 reply; 18+ messages in thread
From: Avihai Horon @ 2026-02-17 13:05 UTC (permalink / raw)
To: Cédric Le Goater, Peter Xu
Cc: qemu-devel, Alex Williamson, Eric Blake, Markus Armbruster,
Fabiano Rosas, Maor Gottlieb
On 2/16/2026 7:40 PM, Cédric Le Goater wrote:
> On 2/16/26 14:25, Avihai Horon wrote:
>>
>> On 2/16/2026 2:22 PM, Cédric Le Goater wrote:
>>> Peter, Avihai,
>>>
>>>>> As for enabling the VFIO migration events by default, I guess we
>>>>> all agree it could be done. We simply need a stakeholder to change
>>>>> the default behavior.
>>>>
>>>> Ack.
>>>
>>>
>>> There could be value for the sysmgmt layers to receive a VFIO
>>> migration event reporting progress. For example, an event could
>>> be sent from vfio_save_iterate() including relevant size metrics
>>> to indicate progress (or not).
>>>
>>> Any opinions ?
>>
>> Amm, I am not sure an event is the right way to report VFIO stats, as
>> it's not some change that mgmt needs to be notified about promptly.
>>
>> Mgmt layer can query migration stats when needed using the QMP
>> 'query-migrate' command that returns, among other info, a VfioStats
>> struct with the total amount of VFIO devices data transferred. If
>> needed, this can be extended with more info about remaining data
>> size, etc.
>>
>> Makes sense?
> so you would opt to extend migration_populate_vfio_info() with more
> info on the VFIO devices. Fine. mgmt should poll then. My knowledge
> on these layers is limited.
>
> The extra info would be "precopy initial" and "precopy dirty" sizes
> I suppose. Per device (more complex) or overall ?
I think per device would be more helpful.
Extending migration_populate_vfio_info() could be a nice addition, but I
am not sure how it is related to the patch I sent.
Or are we discussing this because you initially thought that this could
be a reason to enable vfio events by default?
Thanks.
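As a rough illustration of the per-device versus overall question, the
following stand-alone C sketch keeps hypothetical "precopy initial" and
"precopy dirty" sizes per device and derives the overall figures by summing
them. None of these names exist in QEMU's VfioStats; the structs, field
names and helper are assumptions made purely for the sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-device precopy sizes; not an existing QEMU type. */
struct sketch_vfio_dev_stats {
    const char *id;
    uint64_t precopy_initial_bytes;
    uint64_t precopy_dirty_bytes;
};

/* Hypothetical overall figures derived from the per-device entries. */
struct sketch_vfio_totals {
    uint64_t precopy_initial_bytes;
    uint64_t precopy_dirty_bytes;
};

/* Sum the per-device sizes; the overall view loses no information this way. */
static struct sketch_vfio_totals
sketch_sum_stats(const struct sketch_vfio_dev_stats *devs, size_t n)
{
    struct sketch_vfio_totals totals = { 0, 0 };

    for (size_t i = 0; i < n; i++) {
        totals.precopy_initial_bytes += devs[i].precopy_initial_bytes;
        totals.precopy_dirty_bytes += devs[i].precopy_dirty_bytes;
    }
    return totals;
}
```

Reporting per device and summing on the consumer side keeps both views
available, whereas reporting only the overall figures would not allow the
per-device values to be recovered.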
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3] vfio/migration: Send VFIO_MIGRATION event before PRE_COPY_P2P transition
2026-02-17 13:05 ` Avihai Horon
@ 2026-02-17 14:04 ` Cédric Le Goater
2026-02-17 14:39 ` Avihai Horon
0 siblings, 1 reply; 18+ messages in thread
From: Cédric Le Goater @ 2026-02-17 14:04 UTC (permalink / raw)
To: Avihai Horon, Peter Xu
Cc: qemu-devel, Alex Williamson, Eric Blake, Markus Armbruster,
Fabiano Rosas, Maor Gottlieb
On 2/17/26 14:05, Avihai Horon wrote:
>
> On 2/16/2026 7:40 PM, Cédric Le Goater wrote:
>> On 2/16/26 14:25, Avihai Horon wrote:
>>>
>>> On 2/16/2026 2:22 PM, Cédric Le Goater wrote:
>>>> Peter, Avihai,
>>>>
>>>>>> As for enabling the VFIO migration events by default, I guess we
>>>>>> all agree it could be done. We simply need a stakeholder to change
>>>>>> the default behavior.
>>>>>
>>>>> Ack.
>>>>
>>>>
>>>> There could be value for the sysmgmt layers to receive a VFIO
>>>> migration event reporting progress. For example, an event could
>>>> be sent from vfio_save_iterate() including relevant size metrics
>>>> to indicate progress (or not).
>>>>
>>>> Any opinions ?
>>>
>>> Amm, I am not sure an event is the right way to report VFIO stats, as it's not some change that mgmt needs to be notified about promptly.
>>>
>>> Mgmt layer can query migration stats when needed using the QMP 'query-migrate' command that returns, among other info, a VfioStats struct with the total amount of VFIO devices data transferred. If needed, this can be extended with more info about remaining data size, etc.
>>>
>>> Makes sense?
>> so you would opt to extend migration_populate_vfio_info() with more
>> info on the VFIO devices. Fine. mgmt should poll then. My knowledge
>> on these layers is limited.
>>
>> The extra info would be "precopy initial" and "precopy dirty" sizes
>> I suppose. Per device (more complex) or overall ?
>
> I think per device would be more helpful.
Yeah, I think so too.
> Extending migration_populate_vfio_info() could be a nice add, but
> I am not sure how is this related to this patch I sent?
I was just extending the discussion, thinking aloud. Sorry for the
confusion.
This patch is in vfio-next. I am waiting for Ankit to respin their
series and I will send a PR.
> Or we're discussing this because you initially thought that this
> could be a reason to enable vfio events by default?
It could be a reason but it really depends on how the sysmgmt layer
wants to consume the info.
Users have been requesting more fine-grained VFIO information to
improve the migration flow. Leveraging VfioStats seems to be the
safest approach for that today.
Thanks,
C.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3] vfio/migration: Send VFIO_MIGRATION event before PRE_COPY_P2P transition
2026-02-17 14:04 ` Cédric Le Goater
@ 2026-02-17 14:39 ` Avihai Horon
0 siblings, 0 replies; 18+ messages in thread
From: Avihai Horon @ 2026-02-17 14:39 UTC (permalink / raw)
To: Cédric Le Goater, Peter Xu
Cc: qemu-devel, Alex Williamson, Eric Blake, Markus Armbruster,
Fabiano Rosas, Maor Gottlieb
On 2/17/2026 4:04 PM, Cédric Le Goater wrote:
> On 2/17/26 14:05, Avihai Horon wrote:
>>
>> On 2/16/2026 7:40 PM, Cédric Le Goater wrote:
>>> On 2/16/26 14:25, Avihai Horon wrote:
>>>>
>>>> On 2/16/2026 2:22 PM, Cédric Le Goater wrote:
>>>>> Peter, Avihai,
>>>>>
>>>>>>> As for enabling the VFIO migration events by default, I guess we
>>>>>>> all agree it could be done. We simply need a stakeholder to change
>>>>>>> the default behavior.
>>>>>>
>>>>>> Ack.
>>>>>
>>>>>
>>>>> There could be value for the sysmgmt layers to receive a VFIO
>>>>> migration event reporting progress. For example, an event could
>>>>> be sent from vfio_save_iterate() including relevant size metrics
>>>>> to indicate progress (or not).
>>>>>
>>>>> Any opinions ?
>>>>
>>>> Amm, I am not sure an event is the right way to report VFIO stats,
>>>> as it's not some change that mgmt needs to be notified about promptly.
>>>>
>>>> Mgmt layer can query migration stats when needed using the QMP
>>>> 'query-migrate' command that returns, among other info, a VfioStats
>>>> struct with the total amount of VFIO devices data transferred. If
>>>> needed, this can be extended with more info about remaining data
>>>> size, etc.
>>>>
>>>> Makes sense?
>>> so you would opt to extend migration_populate_vfio_info() with more
>>> info on the VFIO devices. Fine. mgmt should poll then. My knowledge
>>> on these layers is limited.
>>>
>>> The extra info would be "precopy initial" and "precopy dirty" sizes
>>> I suppose. Per device (more complex) or overall ?
>>
>> I think per device would be more helpful.
>
> Yeah I think too.
>
>> Extending migration_populate_vfio_info() could be a nice addition, but
>> I am not sure how this is related to the patch I sent?
>
> I was just extending the discussion, thinking aloud. Sorry for the
> confusion.
Ah, I see.
>
> This patch is in vfio-next. I am waiting for Ankit to respin its
> series and I will send a PR.
Ack.
>
>> Or we're discussing this because you initially thought that this
>> could be a reason to enable vfio events by default?
>
> It could be a reason but it really depends on how the sysmgmt layer
> wants to consume the info.
>
> Users have been requesting more fine-grained VFIO information to
> improve the migration flow. Leveraging VfioStats seems to be the
> safest approach for that today.
I see.
Agree, VfioStats seems like the right place.
Thanks.
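
For reference, a minimal sketch of the polling approach discussed above,
assuming a management application talks to a QMP UNIX socket directly.
It issues 'query-migrate' and reads the 'transferred' field of the
VfioStats struct from the 'vfio' member of the reply. The helper names
(vfio_transferred, qmp_command, query_vfio_transferred) and the socket
path are hypothetical, not part of QEMU; any per-device fields would
only exist after the extension discussed in this thread.

```python
import json
import socket
from typing import Optional


def vfio_transferred(reply: dict) -> Optional[int]:
    """Return VfioStats 'transferred' from a query-migrate reply, or None.

    The 'vfio' member is optional, so it is absent when no VFIO device
    takes part in the migration.
    """
    vfio = reply.get("return", {}).get("vfio")
    return None if vfio is None else vfio.get("transferred")


def qmp_command(chan, name: str) -> dict:
    """Send one QMP command and return its reply, skipping async events."""
    chan.write(json.dumps({"execute": name}) + "\n")
    chan.flush()
    while True:
        line = chan.readline()
        if not line:
            raise ConnectionError("QMP connection closed")
        msg = json.loads(line)
        # Events carry an "event" key; replies carry "return" or "error".
        if "return" in msg or "error" in msg:
            return msg


def query_vfio_transferred(sock_path: str) -> Optional[int]:
    """Hypothetical helper: connect, negotiate, and poll query-migrate once."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(sock_path)
        with sock.makefile("rw", encoding="utf-8") as chan:
            json.loads(chan.readline())            # QMP greeting
            qmp_command(chan, "qmp_capabilities")  # leave negotiation mode
            return vfio_transferred(qmp_command(chan, "query-migrate"))
```

A caller would invoke query_vfio_transferred() on whatever interval
suits it, which is the polling model as opposed to an event per
vfio_save_iterate() call.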
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3] vfio/migration: Send VFIO_MIGRATION event before PRE_COPY_P2P transition
2026-02-02 17:34 [PATCH v3] vfio/migration: Send VFIO_MIGRATION event before PRE_COPY_P2P transition Avihai Horon
2026-02-03 16:27 ` Cédric Le Goater
2026-02-03 16:48 ` Cédric Le Goater
@ 2026-02-04 13:22 ` Markus Armbruster
2026-02-04 13:24 ` Markus Armbruster
2 siblings, 1 reply; 18+ messages in thread
From: Markus Armbruster @ 2026-02-04 13:22 UTC (permalink / raw)
To: Avihai Horon
Cc: qemu-devel, Alex Williamson, Cédric Le Goater, Eric Blake,
Markus Armbruster, Peter Xu, Fabiano Rosas, Maor Gottlieb
Avihai Horon <avihaih@nvidia.com> writes:
> The VFIO_MIGRATION event notifies users when a VFIO device transitions
> to a new state.
>
> One use case for this event is to prevent timeouts for RDMA connections
> to the migrated device. In this case, an external management application
> (not libvirt) consumes the events and disables the RDMA timeout
> mechanism when receiving the event for PRE_COPY_P2P state, which
> indicates that the device is non-responsive.
>
> This is essential because RDMA connections typically have very low
> timeouts (tens of milliseconds), which can be far below migration
> downtime.
>
> However, under heavy resource utilization, the device transition to
> PRE_COPY_P2P can take hundreds of milliseconds to complete. Since the
> VFIO_MIGRATION event is currently sent only after the transition
> completes, it arrives too late, after RDMA connections have already
> timed out.
>
> To address this, send an additional "prepare" event immediately before
> initiating the PRE_COPY_P2P transition. This guarantees timely event
> delivery regardless of how long the actual state transition takes.
>
> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
[...]
> diff --git a/qapi/vfio.json b/qapi/vfio.json
> index a1a9c5b673..17b6046871 100644
> --- a/qapi/vfio.json
> +++ b/qapi/vfio.json
> @@ -11,7 +11,13 @@
> ##
> # @QapiVfioMigrationState:
> #
> -# An enumeration of the VFIO device migration states.
> +# An enumeration of the VFIO device migration states. In addition to
> +# the regular states, there are prepare states (with 'prepare' suffix)
> +# which indicate that the device is just about to transition to the
> +# corresponding state. Note that seeing a prepare state for state X
> +# doesn't guarantee that the next state will be X, as the state
> +# transition can fail and the device may transition to a different
> +# state instead.
> #
> # @stop: The device is stopped.
> #
> @@ -32,11 +38,14 @@
> # tracking its internal state and its internal state is available
> # for reading.
> #
> +# @pre-copy-p2p-prepare: The device is just about to move to
> +# pre-copy-p2p state. (since 11.0)
> +#
> # Since: 9.1
> ##
> { 'enum': 'QapiVfioMigrationState',
> 'data': [ 'stop', 'running', 'stop-copy', 'resuming', 'running-p2p',
> - 'pre-copy', 'pre-copy-p2p' ] }
> + 'pre-copy', 'pre-copy-p2p', 'pre-copy-p2p-prepare' ] }
>
> ##
> # @VFIO_MIGRATION:
Acked-by: Markus Armbruster <armbru@redhat.com>
[...]
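
As an illustration of how a consumer might use the new enum value, here
is a minimal sketch of an event handler in a management application. It
disables the RDMA timeout as soon as it sees either
'pre-copy-p2p-prepare' or 'pre-copy-p2p', and it does not assume that a
prepare event is always followed by the corresponding state. The class
name, the set_timeout_enabled callback, and the policy of re-enabling
the timeout on 'running' or 'pre-copy' are assumptions of this sketch,
not something the patch defines.

```python
from typing import Callable, Dict

# States in which this sketch treats the device as non-responsive.
NON_RESPONSIVE = {"pre-copy-p2p-prepare", "pre-copy-p2p"}
# States in which this sketch re-enables the timeout (assumed policy).
RESPONSIVE = {"running", "pre-copy"}


class RdmaTimeoutGuard:
    """Track, per device, whether the RDMA timeout is currently disabled."""

    def __init__(self, set_timeout_enabled: Callable[[str, bool], None]):
        # Hypothetical callback: (device_id, enabled) -> None.
        self._set = set_timeout_enabled
        self._disabled: Dict[str, bool] = {}

    def handle(self, msg: dict) -> None:
        """Process one decoded QMP message; ignore anything else."""
        if msg.get("event") != "VFIO_MIGRATION":
            return
        dev = msg["data"]["device-id"]
        state = msg["data"]["device-state"]
        disabled = self._disabled.get(dev, False)
        if state in NON_RESPONSIVE and not disabled:
            self._set(dev, False)
            self._disabled[dev] = True
        elif state in RESPONSIVE and disabled:
            # Covers a failed transition after a prepare event.
            self._set(dev, True)
            self._disabled[dev] = False
```

Because the handler is idempotent per device, the 'pre-copy-p2p' event
that follows a successful transition does not trigger a second call.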
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3] vfio/migration: Send VFIO_MIGRATION event before PRE_COPY_P2P transition
2026-02-04 13:22 ` Markus Armbruster
@ 2026-02-04 13:24 ` Markus Armbruster
2026-02-04 13:39 ` Avihai Horon
0 siblings, 1 reply; 18+ messages in thread
From: Markus Armbruster @ 2026-02-04 13:24 UTC (permalink / raw)
To: Avihai Horon
Cc: qemu-devel, Alex Williamson, Cédric Le Goater, Eric Blake,
Peter Xu, Fabiano Rosas, Maor Gottlieb
Markus Armbruster <armbru@redhat.com> writes:
> Avihai Horon <avihaih@nvidia.com> writes:
>
>> The VFIO_MIGRATION event notifies users when a VFIO device transitions
>> to a new state.
>>
>> One use case for this event is to prevent timeouts for RDMA connections
>> to the migrated device. In this case, an external management application
>> (not libvirt) consumes the events and disables the RDMA timeout
>> mechanism when receiving the event for PRE_COPY_P2P state, which
>> indicates that the device is non-responsive.
>>
>> This is essential because RDMA connections typically have very low
>> timeouts (tens of milliseconds), which can be far below migration
>> downtime.
>>
>> However, under heavy resource utilization, the device transition to
>> PRE_COPY_P2P can take hundreds of milliseconds to complete. Since the
>> VFIO_MIGRATION event is currently sent only after the transition
>> completes, it arrives too late, after RDMA connections have already
>> timed out.
>>
>> To address this, send an additional "prepare" event immediately before
>> initiating the PRE_COPY_P2P transition. This guarantees timely event
>> delivery regardless of how long the actual state transition takes.
>>
>> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
>
> [...]
>
>> diff --git a/qapi/vfio.json b/qapi/vfio.json
>> index a1a9c5b673..17b6046871 100644
>> --- a/qapi/vfio.json
>> +++ b/qapi/vfio.json
>> @@ -11,7 +11,13 @@
>> ##
>> # @QapiVfioMigrationState:
>> #
>> -# An enumeration of the VFIO device migration states.
>> +# An enumeration of the VFIO device migration states. In addition to
>> +# the regular states, there are prepare states (with 'prepare' suffix)
>> +# which indicate that the device is just about to transition to the
>> +# corresponding state. Note that seeing a prepare state for state X
>> +# doesn't guarantee that the next state will be X, as the state
>> +# transition can fail and the device may transition to a different
>> +# state instead.
>> #
>> # @stop: The device is stopped.
>> #
>> @@ -32,11 +38,14 @@
>> # tracking its internal state and its internal state is available
>> # for reading.
>> #
>> +# @pre-copy-p2p-prepare: The device is just about to move to
>> +# pre-copy-p2p state. (since 11.0)
>> +#
>> # Since: 9.1
>> ##
>> { 'enum': 'QapiVfioMigrationState',
>> 'data': [ 'stop', 'running', 'stop-copy', 'resuming', 'running-p2p',
>> - 'pre-copy', 'pre-copy-p2p' ] }
>> + 'pre-copy', 'pre-copy-p2p', 'pre-copy-p2p-prepare' ] }
>>
>> ##
>> # @VFIO_MIGRATION:
>
> Acked-by: Markus Armbruster <armbru@redhat.com>
Except for the subject line: "vfio/migration: Send VFIO_MIGRATION event
before PRE_COPY_P2P transition" became misleading in v2.
>
> [...]
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3] vfio/migration: Send VFIO_MIGRATION event before PRE_COPY_P2P transition
2026-02-04 13:24 ` Markus Armbruster
@ 2026-02-04 13:39 ` Avihai Horon
2026-02-04 14:29 ` Markus Armbruster
0 siblings, 1 reply; 18+ messages in thread
From: Avihai Horon @ 2026-02-04 13:39 UTC (permalink / raw)
To: Markus Armbruster
Cc: qemu-devel, Alex Williamson, Cédric Le Goater, Eric Blake,
Peter Xu, Fabiano Rosas, Maor Gottlieb
On 2/4/2026 3:24 PM, Markus Armbruster wrote:
> Markus Armbruster <armbru@redhat.com> writes:
>
>> Avihai Horon <avihaih@nvidia.com> writes:
>>
>>> The VFIO_MIGRATION event notifies users when a VFIO device transitions
>>> to a new state.
>>>
>>> One use case for this event is to prevent timeouts for RDMA connections
>>> to the migrated device. In this case, an external management application
>>> (not libvirt) consumes the events and disables the RDMA timeout
>>> mechanism when receiving the event for PRE_COPY_P2P state, which
>>> indicates that the device is non-responsive.
>>>
>>> This is essential because RDMA connections typically have very low
>>> timeouts (tens of milliseconds), which can be far below migration
>>> downtime.
>>>
>>> However, under heavy resource utilization, the device transition to
>>> PRE_COPY_P2P can take hundreds of milliseconds to complete. Since the
>>> VFIO_MIGRATION event is currently sent only after the transition
>>> completes, it arrives too late, after RDMA connections have already
>>> timed out.
>>>
>>> To address this, send an additional "prepare" event immediately before
>>> initiating the PRE_COPY_P2P transition. This guarantees timely event
>>> delivery regardless of how long the actual state transition takes.
>>>
>>> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
>> [...]
>>
>>> diff --git a/qapi/vfio.json b/qapi/vfio.json
>>> index a1a9c5b673..17b6046871 100644
>>> --- a/qapi/vfio.json
>>> +++ b/qapi/vfio.json
>>> @@ -11,7 +11,13 @@
>>> ##
>>> # @QapiVfioMigrationState:
>>> #
>>> -# An enumeration of the VFIO device migration states.
>>> +# An enumeration of the VFIO device migration states. In addition to
>>> +# the regular states, there are prepare states (with 'prepare' suffix)
>>> +# which indicate that the device is just about to transition to the
>>> +# corresponding state. Note that seeing a prepare state for state X
>>> +# doesn't guarantee that the next state will be X, as the state
>>> +# transition can fail and the device may transition to a different
>>> +# state instead.
>>> #
>>> # @stop: The device is stopped.
>>> #
>>> @@ -32,11 +38,14 @@
>>> # tracking its internal state and its internal state is available
>>> # for reading.
>>> #
>>> +# @pre-copy-p2p-prepare: The device is just about to move to
>>> +# pre-copy-p2p state. (since 11.0)
>>> +#
>>> # Since: 9.1
>>> ##
>>> { 'enum': 'QapiVfioMigrationState',
>>> 'data': [ 'stop', 'running', 'stop-copy', 'resuming', 'running-p2p',
>>> - 'pre-copy', 'pre-copy-p2p' ] }
>>> + 'pre-copy', 'pre-copy-p2p', 'pre-copy-p2p-prepare' ] }
>>>
>>> ##
>>> # @VFIO_MIGRATION:
>> Acked-by: Markus Armbruster <armbru@redhat.com>
> Except for the subject line: "vfio/migration: Send VFIO_MIGRATION event
> before PRE_COPY_P2P transition" became misleading in v2.
Can you explain why misleading?
Prior to this patch VFIO_MIGRATION event was sent only after
PRE_COPY_P2P transition.
Now with this patch VFIO_MIGRATION event is sent also before
PRE_COPY_P2P transition.
Thanks.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3] vfio/migration: Send VFIO_MIGRATION event before PRE_COPY_P2P transition
2026-02-04 13:39 ` Avihai Horon
@ 2026-02-04 14:29 ` Markus Armbruster
0 siblings, 0 replies; 18+ messages in thread
From: Markus Armbruster @ 2026-02-04 14:29 UTC (permalink / raw)
To: Avihai Horon
Cc: qemu-devel, Alex Williamson, Cédric Le Goater, Eric Blake,
Peter Xu, Fabiano Rosas, Maor Gottlieb
Avihai Horon <avihaih@nvidia.com> writes:
> On 2/4/2026 3:24 PM, Markus Armbruster wrote:
[...]
>> Except for the subject line: "vfio/migration: Send VFIO_MIGRATION event
>> before PRE_COPY_P2P transition" became misleading in v2.
>
> Can you explain why misleading?
>
> Prior to this patch VFIO_MIGRATION event was sent only after PRE_COPY_P2P transition.
> Now with this patch VFIO_MIGRATION event is sent also before PRE_COPY_P2P transition.
Never mind, I got confused :)
^ permalink raw reply [flat|nested] 18+ messages in thread