* [PATCH] migration: ensure APIC is loaded prior to VFIO PCI devices
@ 2025-08-18 13:11 Yanfei Xu
2025-08-18 14:56 ` Peter Xu
2025-09-29 15:51 ` Peter Xu
0 siblings, 2 replies; 5+ messages in thread
From: Yanfei Xu @ 2025-08-18 13:11 UTC (permalink / raw)
To: mst, pbonzini, peterx, farosas; +Cc: qemu-devel, yanfei.xu, shenyicong.1023
The load procedure of VFIO PCI devices involves setting up IRT
for each VFIO PCI devices. This requires determining whether an
interrupt is single-destination interrupt to decide between
Posted Interrupt(PI) or remapping mode for the IRTE. However,
determining this may require accessing the VM's APIC registers.
For example:
ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irqs)
...
kvm_arch_irq_bypass_add_producer
kvm_x86_call(pi_update_irte)
vmx_pi_update_irte
kvm_intr_is_single_vcpu
If the LAPIC has not been loaded yet, interrupts will use remapping
mode. To prevent the fallback of interrupt mode, keep APIC is always
loaded prior to VFIO PCI devices.
Signed-off-by: Yicong Shen <shenyicong.1023@bytedance.com>
Signed-off-by: Yanfei Xu <yanfei.xu@bytedance.com>
---
hw/intc/apic_common.c | 1 +
include/migration/vmstate.h | 1 +
2 files changed, 2 insertions(+)
diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c
index 37a7a7019d..394fe02013 100644
--- a/hw/intc/apic_common.c
+++ b/hw/intc/apic_common.c
@@ -379,6 +379,7 @@ static const VMStateDescription vmstate_apic_common = {
.pre_load = apic_pre_load,
.pre_save = apic_dispatch_pre_save,
.post_load = apic_dispatch_post_load,
+ .priority = MIG_PRI_APIC,
.fields = (const VMStateField[]) {
VMSTATE_UINT32(apicbase, APICCommonState),
VMSTATE_UINT8(id, APICCommonState),
diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
index 1ff7bd9ac4..22e988c5db 100644
--- a/include/migration/vmstate.h
+++ b/include/migration/vmstate.h
@@ -163,6 +163,7 @@ typedef enum {
MIG_PRI_IOMMU, /* Must happen before PCI devices */
MIG_PRI_PCI_BUS, /* Must happen before IOMMU */
MIG_PRI_VIRTIO_MEM, /* Must happen before IOMMU */
+ MIG_PRI_APIC, /* Must happen before PCI devices */
MIG_PRI_GICV3_ITS, /* Must happen before PCI devices */
MIG_PRI_GICV3, /* Must happen before the ITS */
MIG_PRI_MAX,
--
2.20.1
^ permalink raw reply related [flat|nested] 5+ messages in thread* Re: [PATCH] migration: ensure APIC is loaded prior to VFIO PCI devices 2025-08-18 13:11 [PATCH] migration: ensure APIC is loaded prior to VFIO PCI devices Yanfei Xu @ 2025-08-18 14:56 ` Peter Xu 2025-09-29 15:51 ` Peter Xu 1 sibling, 0 replies; 5+ messages in thread From: Peter Xu @ 2025-08-18 14:56 UTC (permalink / raw) To: Yanfei Xu; +Cc: mst, pbonzini, farosas, qemu-devel, shenyicong.1023 On Mon, Aug 18, 2025 at 09:11:27PM +0800, Yanfei Xu wrote: > The load procedure of VFIO PCI devices involves setting up IRT > for each VFIO PCI devices. This requires determining whether an > interrupt is single-destination interrupt to decide between > Posted Interrupt(PI) or remapping mode for the IRTE. However, > determining this may require accessing the VM's APIC registers. > > For example: > ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irqs) > ... > kvm_arch_irq_bypass_add_producer > kvm_x86_call(pi_update_irte) > vmx_pi_update_irte > kvm_intr_is_single_vcpu > > If the LAPIC has not been loaded yet, interrupts will use remapping > mode. To prevent the fallback of interrupt mode, keep APIC is always > loaded prior to VFIO PCI devices. > > Signed-off-by: Yicong Shen <shenyicong.1023@bytedance.com> > Signed-off-by: Yanfei Xu <yanfei.xu@bytedance.com> > --- > hw/intc/apic_common.c | 1 + > include/migration/vmstate.h | 1 + > 2 files changed, 2 insertions(+) > > diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c > index 37a7a7019d..394fe02013 100644 > --- a/hw/intc/apic_common.c > +++ b/hw/intc/apic_common.c > @@ -379,6 +379,7 @@ static const VMStateDescription vmstate_apic_common = { > .pre_load = apic_pre_load, > .pre_save = apic_dispatch_pre_save, > .post_load = apic_dispatch_post_load, > + .priority = MIG_PRI_APIC, > .fields = (const VMStateField[]) { > VMSTATE_UINT32(apicbase, APICCommonState), > VMSTATE_UINT8(id, APICCommonState), > diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h > index 1ff7bd9ac4..22e988c5db 100644 > --- a/include/migration/vmstate.h > +++ b/include/migration/vmstate.h > @@ -163,6 +163,7 @@ typedef enum { > MIG_PRI_IOMMU, /* Must happen before PCI devices */ > MIG_PRI_PCI_BUS, /* Must happen before IOMMU */ > MIG_PRI_VIRTIO_MEM, /* Must happen before IOMMU */ > + MIG_PRI_APIC, /* Must happen before PCI devices */ As the list grows, maybe it's a good idea we start to document exactly the reasons inline into the enum. Can sure be done on top. This patch alone looks like stable material.. Reviewed-by: Peter Xu <peterx@redhat.com> > MIG_PRI_GICV3_ITS, /* Must happen before PCI devices */ > MIG_PRI_GICV3, /* Must happen before the ITS */ > MIG_PRI_MAX, > -- > 2.20.1 > -- Peter Xu ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] migration: ensure APIC is loaded prior to VFIO PCI devices 2025-08-18 13:11 [PATCH] migration: ensure APIC is loaded prior to VFIO PCI devices Yanfei Xu 2025-08-18 14:56 ` Peter Xu @ 2025-09-29 15:51 ` Peter Xu 2025-09-30 8:27 ` Cédric Le Goater 1 sibling, 1 reply; 5+ messages in thread From: Peter Xu @ 2025-09-29 15:51 UTC (permalink / raw) To: Yanfei Xu Cc: mst, pbonzini, farosas, qemu-devel, shenyicong.1023, Cédric Le Goater, Alex Williamson On Mon, Aug 18, 2025 at 09:11:27PM +0800, Yanfei Xu wrote: > The load procedure of VFIO PCI devices involves setting up IRT > for each VFIO PCI devices. This requires determining whether an > interrupt is single-destination interrupt to decide between > Posted Interrupt(PI) or remapping mode for the IRTE. However, > determining this may require accessing the VM's APIC registers. > > For example: > ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irqs) > ... > kvm_arch_irq_bypass_add_producer > kvm_x86_call(pi_update_irte) > vmx_pi_update_irte > kvm_intr_is_single_vcpu > > If the LAPIC has not been loaded yet, interrupts will use remapping > mode. To prevent the fallback of interrupt mode, keep APIC is always > loaded prior to VFIO PCI devices. > > Signed-off-by: Yicong Shen <shenyicong.1023@bytedance.com> > Signed-off-by: Yanfei Xu <yanfei.xu@bytedance.com> > --- > hw/intc/apic_common.c | 1 + > include/migration/vmstate.h | 1 + > 2 files changed, 2 insertions(+) > > diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c > index 37a7a7019d..394fe02013 100644 > --- a/hw/intc/apic_common.c > +++ b/hw/intc/apic_common.c > @@ -379,6 +379,7 @@ static const VMStateDescription vmstate_apic_common = { > .pre_load = apic_pre_load, > .pre_save = apic_dispatch_pre_save, > .post_load = apic_dispatch_post_load, > + .priority = MIG_PRI_APIC, > .fields = (const VMStateField[]) { > VMSTATE_UINT32(apicbase, APICCommonState), > VMSTATE_UINT8(id, APICCommonState), > diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h > index 1ff7bd9ac4..22e988c5db 100644 > --- a/include/migration/vmstate.h > +++ b/include/migration/vmstate.h > @@ -163,6 +163,7 @@ typedef enum { > MIG_PRI_IOMMU, /* Must happen before PCI devices */ > MIG_PRI_PCI_BUS, /* Must happen before IOMMU */ > MIG_PRI_VIRTIO_MEM, /* Must happen before IOMMU */ > + MIG_PRI_APIC, /* Must happen before PCI devices */ > MIG_PRI_GICV3_ITS, /* Must happen before PCI devices */ > MIG_PRI_GICV3, /* Must happen before the ITS */ > MIG_PRI_MAX, > -- > 2.20.1 > +Cedric, +Alex queued. -- Peter Xu ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] migration: ensure APIC is loaded prior to VFIO PCI devices 2025-09-29 15:51 ` Peter Xu @ 2025-09-30 8:27 ` Cédric Le Goater 2025-09-30 16:04 ` Peter Xu 0 siblings, 1 reply; 5+ messages in thread From: Cédric Le Goater @ 2025-09-30 8:27 UTC (permalink / raw) To: Peter Xu, Yanfei Xu Cc: mst, pbonzini, farosas, qemu-devel, shenyicong.1023, Alex Williamson On 9/29/25 17:51, Peter Xu wrote: > On Mon, Aug 18, 2025 at 09:11:27PM +0800, Yanfei Xu wrote: >> The load procedure of VFIO PCI devices involves setting up IRT >> for each VFIO PCI devices. This requires determining whether an >> interrupt is single-destination interrupt to decide between >> Posted Interrupt(PI) or remapping mode for the IRTE. However, >> determining this may require accessing the VM's APIC registers. >> >> For example: >> ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irqs) >> ... >> kvm_arch_irq_bypass_add_producer >> kvm_x86_call(pi_update_irte) >> vmx_pi_update_irte >> kvm_intr_is_single_vcpu >> >> If the LAPIC has not been loaded yet, interrupts will use remapping >> mode. To prevent the fallback of interrupt mode, keep APIC is always >> loaded prior to VFIO PCI devices. >> >> Signed-off-by: Yicong Shen <shenyicong.1023@bytedance.com> >> Signed-off-by: Yanfei Xu <yanfei.xu@bytedance.com> >> --- >> hw/intc/apic_common.c | 1 + >> include/migration/vmstate.h | 1 + >> 2 files changed, 2 insertions(+) >> >> diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c >> index 37a7a7019d..394fe02013 100644 >> --- a/hw/intc/apic_common.c >> +++ b/hw/intc/apic_common.c >> @@ -379,6 +379,7 @@ static const VMStateDescription vmstate_apic_common = { >> .pre_load = apic_pre_load, >> .pre_save = apic_dispatch_pre_save, >> .post_load = apic_dispatch_post_load, >> + .priority = MIG_PRI_APIC, >> .fields = (const VMStateField[]) { >> VMSTATE_UINT32(apicbase, APICCommonState), >> VMSTATE_UINT8(id, APICCommonState), >> diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h >> index 1ff7bd9ac4..22e988c5db 100644 >> --- a/include/migration/vmstate.h >> +++ b/include/migration/vmstate.h >> @@ -163,6 +163,7 @@ typedef enum { >> MIG_PRI_IOMMU, /* Must happen before PCI devices */ >> MIG_PRI_PCI_BUS, /* Must happen before IOMMU */ >> MIG_PRI_VIRTIO_MEM, /* Must happen before IOMMU */ >> + MIG_PRI_APIC, /* Must happen before PCI devices */ >> MIG_PRI_GICV3_ITS, /* Must happen before PCI devices */ >> MIG_PRI_GICV3, /* Must happen before the ITS */ >> MIG_PRI_MAX, >> -- >> 2.20.1 >> > > +Cedric, +Alex > > queued. > Perhaps we could group the interrupt controller priorities under a common MIG_PRI_INTC ? PPC is very much the same, although we managed to order restore from the machine load handler IIRC. Anyhow, LGTM. Thanks, C. ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] migration: ensure APIC is loaded prior to VFIO PCI devices 2025-09-30 8:27 ` Cédric Le Goater @ 2025-09-30 16:04 ` Peter Xu 0 siblings, 0 replies; 5+ messages in thread From: Peter Xu @ 2025-09-30 16:04 UTC (permalink / raw) To: Cédric Le Goater Cc: Yanfei Xu, mst, pbonzini, farosas, qemu-devel, shenyicong.1023, Alex Williamson On Tue, Sep 30, 2025 at 10:27:34AM +0200, Cédric Le Goater wrote: > On 9/29/25 17:51, Peter Xu wrote: > > On Mon, Aug 18, 2025 at 09:11:27PM +0800, Yanfei Xu wrote: > > > The load procedure of VFIO PCI devices involves setting up IRT > > > for each VFIO PCI devices. This requires determining whether an > > > interrupt is single-destination interrupt to decide between > > > Posted Interrupt(PI) or remapping mode for the IRTE. However, > > > determining this may require accessing the VM's APIC registers. > > > > > > For example: > > > ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irqs) > > > ... > > > kvm_arch_irq_bypass_add_producer > > > kvm_x86_call(pi_update_irte) > > > vmx_pi_update_irte > > > kvm_intr_is_single_vcpu > > > > > > If the LAPIC has not been loaded yet, interrupts will use remapping > > > mode. To prevent the fallback of interrupt mode, keep APIC is always > > > loaded prior to VFIO PCI devices. > > > > > > Signed-off-by: Yicong Shen <shenyicong.1023@bytedance.com> > > > Signed-off-by: Yanfei Xu <yanfei.xu@bytedance.com> > > > --- > > > hw/intc/apic_common.c | 1 + > > > include/migration/vmstate.h | 1 + > > > 2 files changed, 2 insertions(+) > > > > > > diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c > > > index 37a7a7019d..394fe02013 100644 > > > --- a/hw/intc/apic_common.c > > > +++ b/hw/intc/apic_common.c > > > @@ -379,6 +379,7 @@ static const VMStateDescription vmstate_apic_common = { > > > .pre_load = apic_pre_load, > > > .pre_save = apic_dispatch_pre_save, > > > .post_load = apic_dispatch_post_load, > > > + .priority = MIG_PRI_APIC, > > > .fields = (const VMStateField[]) { > > > VMSTATE_UINT32(apicbase, APICCommonState), > > > VMSTATE_UINT8(id, APICCommonState), > > > diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h > > > index 1ff7bd9ac4..22e988c5db 100644 > > > --- a/include/migration/vmstate.h > > > +++ b/include/migration/vmstate.h > > > @@ -163,6 +163,7 @@ typedef enum { > > > MIG_PRI_IOMMU, /* Must happen before PCI devices */ > > > MIG_PRI_PCI_BUS, /* Must happen before IOMMU */ > > > MIG_PRI_VIRTIO_MEM, /* Must happen before IOMMU */ > > > + MIG_PRI_APIC, /* Must happen before PCI devices */ > > > MIG_PRI_GICV3_ITS, /* Must happen before PCI devices */ > > > MIG_PRI_GICV3, /* Must happen before the ITS */ > > > MIG_PRI_MAX, > > > -- > > > 2.20.1 > > > > > > > +Cedric, +Alex > > > > queued. > > > > Perhaps we could group the interrupt controller priorities > under a common MIG_PRI_INTC ? PPC is very much the same, > although we managed to order restore from the machine load > handler IIRC. > > Anyhow, LGTM. Agreed. That, when introduced, can also take ARM's into account (that may need a MIG_PRI_INTC_HIGH for ITS). Thanks, -- Peter Xu ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2025-09-30 16:06 UTC | newest] Thread overview: 5+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-08-18 13:11 [PATCH] migration: ensure APIC is loaded prior to VFIO PCI devices Yanfei Xu 2025-08-18 14:56 ` Peter Xu 2025-09-29 15:51 ` Peter Xu 2025-09-30 8:27 ` Cédric Le Goater 2025-09-30 16:04 ` Peter Xu
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).