* [PATCH] migration: ensure APIC is loaded prior to VFIO PCI devices
@ 2025-08-18 13:11 Yanfei Xu
2025-08-18 14:56 ` Peter Xu
2025-09-29 15:51 ` Peter Xu
0 siblings, 2 replies; 5+ messages in thread
From: Yanfei Xu @ 2025-08-18 13:11 UTC (permalink / raw)
To: mst, pbonzini, peterx, farosas; +Cc: qemu-devel, yanfei.xu, shenyicong.1023
The load procedure of VFIO PCI devices involves setting up IRT
for each VFIO PCI devices. This requires determining whether an
interrupt is single-destination interrupt to decide between
Posted Interrupt(PI) or remapping mode for the IRTE. However,
determining this may require accessing the VM's APIC registers.
For example:
ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irqs)
...
kvm_arch_irq_bypass_add_producer
kvm_x86_call(pi_update_irte)
vmx_pi_update_irte
kvm_intr_is_single_vcpu
If the LAPIC has not been loaded yet, interrupts will use remapping
mode. To prevent the fallback of interrupt mode, keep APIC is always
loaded prior to VFIO PCI devices.
Signed-off-by: Yicong Shen <shenyicong.1023@bytedance.com>
Signed-off-by: Yanfei Xu <yanfei.xu@bytedance.com>
---
hw/intc/apic_common.c | 1 +
include/migration/vmstate.h | 1 +
2 files changed, 2 insertions(+)
diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c
index 37a7a7019d..394fe02013 100644
--- a/hw/intc/apic_common.c
+++ b/hw/intc/apic_common.c
@@ -379,6 +379,7 @@ static const VMStateDescription vmstate_apic_common = {
.pre_load = apic_pre_load,
.pre_save = apic_dispatch_pre_save,
.post_load = apic_dispatch_post_load,
+ .priority = MIG_PRI_APIC,
.fields = (const VMStateField[]) {
VMSTATE_UINT32(apicbase, APICCommonState),
VMSTATE_UINT8(id, APICCommonState),
diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
index 1ff7bd9ac4..22e988c5db 100644
--- a/include/migration/vmstate.h
+++ b/include/migration/vmstate.h
@@ -163,6 +163,7 @@ typedef enum {
MIG_PRI_IOMMU, /* Must happen before PCI devices */
MIG_PRI_PCI_BUS, /* Must happen before IOMMU */
MIG_PRI_VIRTIO_MEM, /* Must happen before IOMMU */
+ MIG_PRI_APIC, /* Must happen before PCI devices */
MIG_PRI_GICV3_ITS, /* Must happen before PCI devices */
MIG_PRI_GICV3, /* Must happen before the ITS */
MIG_PRI_MAX,
--
2.20.1
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH] migration: ensure APIC is loaded prior to VFIO PCI devices
2025-08-18 13:11 [PATCH] migration: ensure APIC is loaded prior to VFIO PCI devices Yanfei Xu
@ 2025-08-18 14:56 ` Peter Xu
2025-09-29 15:51 ` Peter Xu
1 sibling, 0 replies; 5+ messages in thread
From: Peter Xu @ 2025-08-18 14:56 UTC (permalink / raw)
To: Yanfei Xu; +Cc: mst, pbonzini, farosas, qemu-devel, shenyicong.1023
On Mon, Aug 18, 2025 at 09:11:27PM +0800, Yanfei Xu wrote:
> The load procedure of VFIO PCI devices involves setting up IRT
> for each VFIO PCI devices. This requires determining whether an
> interrupt is single-destination interrupt to decide between
> Posted Interrupt(PI) or remapping mode for the IRTE. However,
> determining this may require accessing the VM's APIC registers.
>
> For example:
> ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irqs)
> ...
> kvm_arch_irq_bypass_add_producer
> kvm_x86_call(pi_update_irte)
> vmx_pi_update_irte
> kvm_intr_is_single_vcpu
>
> If the LAPIC has not been loaded yet, interrupts will use remapping
> mode. To prevent the fallback of interrupt mode, keep APIC is always
> loaded prior to VFIO PCI devices.
>
> Signed-off-by: Yicong Shen <shenyicong.1023@bytedance.com>
> Signed-off-by: Yanfei Xu <yanfei.xu@bytedance.com>
> ---
> hw/intc/apic_common.c | 1 +
> include/migration/vmstate.h | 1 +
> 2 files changed, 2 insertions(+)
>
> diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c
> index 37a7a7019d..394fe02013 100644
> --- a/hw/intc/apic_common.c
> +++ b/hw/intc/apic_common.c
> @@ -379,6 +379,7 @@ static const VMStateDescription vmstate_apic_common = {
> .pre_load = apic_pre_load,
> .pre_save = apic_dispatch_pre_save,
> .post_load = apic_dispatch_post_load,
> + .priority = MIG_PRI_APIC,
> .fields = (const VMStateField[]) {
> VMSTATE_UINT32(apicbase, APICCommonState),
> VMSTATE_UINT8(id, APICCommonState),
> diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
> index 1ff7bd9ac4..22e988c5db 100644
> --- a/include/migration/vmstate.h
> +++ b/include/migration/vmstate.h
> @@ -163,6 +163,7 @@ typedef enum {
> MIG_PRI_IOMMU, /* Must happen before PCI devices */
> MIG_PRI_PCI_BUS, /* Must happen before IOMMU */
> MIG_PRI_VIRTIO_MEM, /* Must happen before IOMMU */
> + MIG_PRI_APIC, /* Must happen before PCI devices */
As the list grows, maybe it's a good idea we start to document exactly the
reasons inline into the enum. Can sure be done on top. This patch alone
looks like stable material..
Reviewed-by: Peter Xu <peterx@redhat.com>
> MIG_PRI_GICV3_ITS, /* Must happen before PCI devices */
> MIG_PRI_GICV3, /* Must happen before the ITS */
> MIG_PRI_MAX,
> --
> 2.20.1
>
--
Peter Xu
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] migration: ensure APIC is loaded prior to VFIO PCI devices
2025-08-18 13:11 [PATCH] migration: ensure APIC is loaded prior to VFIO PCI devices Yanfei Xu
2025-08-18 14:56 ` Peter Xu
@ 2025-09-29 15:51 ` Peter Xu
2025-09-30 8:27 ` Cédric Le Goater
1 sibling, 1 reply; 5+ messages in thread
From: Peter Xu @ 2025-09-29 15:51 UTC (permalink / raw)
To: Yanfei Xu
Cc: mst, pbonzini, farosas, qemu-devel, shenyicong.1023,
Cédric Le Goater, Alex Williamson
On Mon, Aug 18, 2025 at 09:11:27PM +0800, Yanfei Xu wrote:
> The load procedure of VFIO PCI devices involves setting up IRT
> for each VFIO PCI devices. This requires determining whether an
> interrupt is single-destination interrupt to decide between
> Posted Interrupt(PI) or remapping mode for the IRTE. However,
> determining this may require accessing the VM's APIC registers.
>
> For example:
> ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irqs)
> ...
> kvm_arch_irq_bypass_add_producer
> kvm_x86_call(pi_update_irte)
> vmx_pi_update_irte
> kvm_intr_is_single_vcpu
>
> If the LAPIC has not been loaded yet, interrupts will use remapping
> mode. To prevent the fallback of interrupt mode, keep APIC is always
> loaded prior to VFIO PCI devices.
>
> Signed-off-by: Yicong Shen <shenyicong.1023@bytedance.com>
> Signed-off-by: Yanfei Xu <yanfei.xu@bytedance.com>
> ---
> hw/intc/apic_common.c | 1 +
> include/migration/vmstate.h | 1 +
> 2 files changed, 2 insertions(+)
>
> diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c
> index 37a7a7019d..394fe02013 100644
> --- a/hw/intc/apic_common.c
> +++ b/hw/intc/apic_common.c
> @@ -379,6 +379,7 @@ static const VMStateDescription vmstate_apic_common = {
> .pre_load = apic_pre_load,
> .pre_save = apic_dispatch_pre_save,
> .post_load = apic_dispatch_post_load,
> + .priority = MIG_PRI_APIC,
> .fields = (const VMStateField[]) {
> VMSTATE_UINT32(apicbase, APICCommonState),
> VMSTATE_UINT8(id, APICCommonState),
> diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
> index 1ff7bd9ac4..22e988c5db 100644
> --- a/include/migration/vmstate.h
> +++ b/include/migration/vmstate.h
> @@ -163,6 +163,7 @@ typedef enum {
> MIG_PRI_IOMMU, /* Must happen before PCI devices */
> MIG_PRI_PCI_BUS, /* Must happen before IOMMU */
> MIG_PRI_VIRTIO_MEM, /* Must happen before IOMMU */
> + MIG_PRI_APIC, /* Must happen before PCI devices */
> MIG_PRI_GICV3_ITS, /* Must happen before PCI devices */
> MIG_PRI_GICV3, /* Must happen before the ITS */
> MIG_PRI_MAX,
> --
> 2.20.1
>
+Cedric, +Alex
queued.
--
Peter Xu
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] migration: ensure APIC is loaded prior to VFIO PCI devices
2025-09-29 15:51 ` Peter Xu
@ 2025-09-30 8:27 ` Cédric Le Goater
2025-09-30 16:04 ` Peter Xu
0 siblings, 1 reply; 5+ messages in thread
From: Cédric Le Goater @ 2025-09-30 8:27 UTC (permalink / raw)
To: Peter Xu, Yanfei Xu
Cc: mst, pbonzini, farosas, qemu-devel, shenyicong.1023,
Alex Williamson
On 9/29/25 17:51, Peter Xu wrote:
> On Mon, Aug 18, 2025 at 09:11:27PM +0800, Yanfei Xu wrote:
>> The load procedure of VFIO PCI devices involves setting up IRT
>> for each VFIO PCI devices. This requires determining whether an
>> interrupt is single-destination interrupt to decide between
>> Posted Interrupt(PI) or remapping mode for the IRTE. However,
>> determining this may require accessing the VM's APIC registers.
>>
>> For example:
>> ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irqs)
>> ...
>> kvm_arch_irq_bypass_add_producer
>> kvm_x86_call(pi_update_irte)
>> vmx_pi_update_irte
>> kvm_intr_is_single_vcpu
>>
>> If the LAPIC has not been loaded yet, interrupts will use remapping
>> mode. To prevent the fallback of interrupt mode, keep APIC is always
>> loaded prior to VFIO PCI devices.
>>
>> Signed-off-by: Yicong Shen <shenyicong.1023@bytedance.com>
>> Signed-off-by: Yanfei Xu <yanfei.xu@bytedance.com>
>> ---
>> hw/intc/apic_common.c | 1 +
>> include/migration/vmstate.h | 1 +
>> 2 files changed, 2 insertions(+)
>>
>> diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c
>> index 37a7a7019d..394fe02013 100644
>> --- a/hw/intc/apic_common.c
>> +++ b/hw/intc/apic_common.c
>> @@ -379,6 +379,7 @@ static const VMStateDescription vmstate_apic_common = {
>> .pre_load = apic_pre_load,
>> .pre_save = apic_dispatch_pre_save,
>> .post_load = apic_dispatch_post_load,
>> + .priority = MIG_PRI_APIC,
>> .fields = (const VMStateField[]) {
>> VMSTATE_UINT32(apicbase, APICCommonState),
>> VMSTATE_UINT8(id, APICCommonState),
>> diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
>> index 1ff7bd9ac4..22e988c5db 100644
>> --- a/include/migration/vmstate.h
>> +++ b/include/migration/vmstate.h
>> @@ -163,6 +163,7 @@ typedef enum {
>> MIG_PRI_IOMMU, /* Must happen before PCI devices */
>> MIG_PRI_PCI_BUS, /* Must happen before IOMMU */
>> MIG_PRI_VIRTIO_MEM, /* Must happen before IOMMU */
>> + MIG_PRI_APIC, /* Must happen before PCI devices */
>> MIG_PRI_GICV3_ITS, /* Must happen before PCI devices */
>> MIG_PRI_GICV3, /* Must happen before the ITS */
>> MIG_PRI_MAX,
>> --
>> 2.20.1
>>
>
> +Cedric, +Alex
>
> queued.
>
Perhaps we could group the interrupt controller priorities
under a common MIG_PRI_INTC ? PPC is very much the same,
although we managed to order restore from the machine load
handler IIRC.
Anyhow, LGTM.
Thanks,
C.
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] migration: ensure APIC is loaded prior to VFIO PCI devices
2025-09-30 8:27 ` Cédric Le Goater
@ 2025-09-30 16:04 ` Peter Xu
0 siblings, 0 replies; 5+ messages in thread
From: Peter Xu @ 2025-09-30 16:04 UTC (permalink / raw)
To: Cédric Le Goater
Cc: Yanfei Xu, mst, pbonzini, farosas, qemu-devel, shenyicong.1023,
Alex Williamson
On Tue, Sep 30, 2025 at 10:27:34AM +0200, Cédric Le Goater wrote:
> On 9/29/25 17:51, Peter Xu wrote:
> > On Mon, Aug 18, 2025 at 09:11:27PM +0800, Yanfei Xu wrote:
> > > The load procedure of VFIO PCI devices involves setting up IRT
> > > for each VFIO PCI devices. This requires determining whether an
> > > interrupt is single-destination interrupt to decide between
> > > Posted Interrupt(PI) or remapping mode for the IRTE. However,
> > > determining this may require accessing the VM's APIC registers.
> > >
> > > For example:
> > > ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irqs)
> > > ...
> > > kvm_arch_irq_bypass_add_producer
> > > kvm_x86_call(pi_update_irte)
> > > vmx_pi_update_irte
> > > kvm_intr_is_single_vcpu
> > >
> > > If the LAPIC has not been loaded yet, interrupts will use remapping
> > > mode. To prevent the fallback of interrupt mode, keep APIC is always
> > > loaded prior to VFIO PCI devices.
> > >
> > > Signed-off-by: Yicong Shen <shenyicong.1023@bytedance.com>
> > > Signed-off-by: Yanfei Xu <yanfei.xu@bytedance.com>
> > > ---
> > > hw/intc/apic_common.c | 1 +
> > > include/migration/vmstate.h | 1 +
> > > 2 files changed, 2 insertions(+)
> > >
> > > diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c
> > > index 37a7a7019d..394fe02013 100644
> > > --- a/hw/intc/apic_common.c
> > > +++ b/hw/intc/apic_common.c
> > > @@ -379,6 +379,7 @@ static const VMStateDescription vmstate_apic_common = {
> > > .pre_load = apic_pre_load,
> > > .pre_save = apic_dispatch_pre_save,
> > > .post_load = apic_dispatch_post_load,
> > > + .priority = MIG_PRI_APIC,
> > > .fields = (const VMStateField[]) {
> > > VMSTATE_UINT32(apicbase, APICCommonState),
> > > VMSTATE_UINT8(id, APICCommonState),
> > > diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
> > > index 1ff7bd9ac4..22e988c5db 100644
> > > --- a/include/migration/vmstate.h
> > > +++ b/include/migration/vmstate.h
> > > @@ -163,6 +163,7 @@ typedef enum {
> > > MIG_PRI_IOMMU, /* Must happen before PCI devices */
> > > MIG_PRI_PCI_BUS, /* Must happen before IOMMU */
> > > MIG_PRI_VIRTIO_MEM, /* Must happen before IOMMU */
> > > + MIG_PRI_APIC, /* Must happen before PCI devices */
> > > MIG_PRI_GICV3_ITS, /* Must happen before PCI devices */
> > > MIG_PRI_GICV3, /* Must happen before the ITS */
> > > MIG_PRI_MAX,
> > > --
> > > 2.20.1
> > >
> >
> > +Cedric, +Alex
> >
> > queued.
> >
>
> Perhaps we could group the interrupt controller priorities
> under a common MIG_PRI_INTC ? PPC is very much the same,
> although we managed to order restore from the machine load
> handler IIRC.
>
> Anyhow, LGTM.
Agreed. That, when introduced, can also take ARM's into account (that may
need a MIG_PRI_INTC_HIGH for ITS).
Thanks,
--
Peter Xu
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2025-09-30 16:06 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-18 13:11 [PATCH] migration: ensure APIC is loaded prior to VFIO PCI devices Yanfei Xu
2025-08-18 14:56 ` Peter Xu
2025-09-29 15:51 ` Peter Xu
2025-09-30 8:27 ` Cédric Le Goater
2025-09-30 16:04 ` Peter Xu
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).