qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] migration: ensure APIC is loaded prior to VFIO PCI devices
@ 2025-08-18 13:11 Yanfei Xu
  2025-08-18 14:56 ` Peter Xu
  2025-09-29 15:51 ` Peter Xu
  0 siblings, 2 replies; 5+ messages in thread
From: Yanfei Xu @ 2025-08-18 13:11 UTC (permalink / raw)
  To: mst, pbonzini, peterx, farosas; +Cc: qemu-devel, yanfei.xu, shenyicong.1023

The load procedure of VFIO PCI devices involves setting up IRT
for each VFIO PCI devices. This requires determining whether an
interrupt is single-destination interrupt to decide between
Posted Interrupt(PI) or remapping mode for the IRTE. However,
determining this may require accessing the VM's APIC registers.

For example:
ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irqs)
  ...
    kvm_arch_irq_bypass_add_producer
      kvm_x86_call(pi_update_irte)
        vmx_pi_update_irte
          kvm_intr_is_single_vcpu

If the LAPIC has not been loaded yet, interrupts will use remapping
mode. To prevent the fallback of interrupt mode, keep APIC is always
loaded prior to VFIO PCI devices.

Signed-off-by: Yicong Shen <shenyicong.1023@bytedance.com>
Signed-off-by: Yanfei Xu <yanfei.xu@bytedance.com>
---
 hw/intc/apic_common.c       | 1 +
 include/migration/vmstate.h | 1 +
 2 files changed, 2 insertions(+)

diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c
index 37a7a7019d..394fe02013 100644
--- a/hw/intc/apic_common.c
+++ b/hw/intc/apic_common.c
@@ -379,6 +379,7 @@ static const VMStateDescription vmstate_apic_common = {
     .pre_load = apic_pre_load,
     .pre_save = apic_dispatch_pre_save,
     .post_load = apic_dispatch_post_load,
+    .priority = MIG_PRI_APIC,
     .fields = (const VMStateField[]) {
         VMSTATE_UINT32(apicbase, APICCommonState),
         VMSTATE_UINT8(id, APICCommonState),
diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
index 1ff7bd9ac4..22e988c5db 100644
--- a/include/migration/vmstate.h
+++ b/include/migration/vmstate.h
@@ -163,6 +163,7 @@ typedef enum {
     MIG_PRI_IOMMU,              /* Must happen before PCI devices */
     MIG_PRI_PCI_BUS,            /* Must happen before IOMMU */
     MIG_PRI_VIRTIO_MEM,         /* Must happen before IOMMU */
+    MIG_PRI_APIC,               /* Must happen before PCI devices */
     MIG_PRI_GICV3_ITS,          /* Must happen before PCI devices */
     MIG_PRI_GICV3,              /* Must happen before the ITS */
     MIG_PRI_MAX,
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] migration: ensure APIC is loaded prior to VFIO PCI devices
  2025-08-18 13:11 [PATCH] migration: ensure APIC is loaded prior to VFIO PCI devices Yanfei Xu
@ 2025-08-18 14:56 ` Peter Xu
  2025-09-29 15:51 ` Peter Xu
  1 sibling, 0 replies; 5+ messages in thread
From: Peter Xu @ 2025-08-18 14:56 UTC (permalink / raw)
  To: Yanfei Xu; +Cc: mst, pbonzini, farosas, qemu-devel, shenyicong.1023

On Mon, Aug 18, 2025 at 09:11:27PM +0800, Yanfei Xu wrote:
> The load procedure of VFIO PCI devices involves setting up IRT
> for each VFIO PCI devices. This requires determining whether an
> interrupt is single-destination interrupt to decide between
> Posted Interrupt(PI) or remapping mode for the IRTE. However,
> determining this may require accessing the VM's APIC registers.
> 
> For example:
> ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irqs)
>   ...
>     kvm_arch_irq_bypass_add_producer
>       kvm_x86_call(pi_update_irte)
>         vmx_pi_update_irte
>           kvm_intr_is_single_vcpu
> 
> If the LAPIC has not been loaded yet, interrupts will use remapping
> mode. To prevent the fallback of interrupt mode, keep APIC is always
> loaded prior to VFIO PCI devices.
> 
> Signed-off-by: Yicong Shen <shenyicong.1023@bytedance.com>
> Signed-off-by: Yanfei Xu <yanfei.xu@bytedance.com>
> ---
>  hw/intc/apic_common.c       | 1 +
>  include/migration/vmstate.h | 1 +
>  2 files changed, 2 insertions(+)
> 
> diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c
> index 37a7a7019d..394fe02013 100644
> --- a/hw/intc/apic_common.c
> +++ b/hw/intc/apic_common.c
> @@ -379,6 +379,7 @@ static const VMStateDescription vmstate_apic_common = {
>      .pre_load = apic_pre_load,
>      .pre_save = apic_dispatch_pre_save,
>      .post_load = apic_dispatch_post_load,
> +    .priority = MIG_PRI_APIC,
>      .fields = (const VMStateField[]) {
>          VMSTATE_UINT32(apicbase, APICCommonState),
>          VMSTATE_UINT8(id, APICCommonState),
> diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
> index 1ff7bd9ac4..22e988c5db 100644
> --- a/include/migration/vmstate.h
> +++ b/include/migration/vmstate.h
> @@ -163,6 +163,7 @@ typedef enum {
>      MIG_PRI_IOMMU,              /* Must happen before PCI devices */
>      MIG_PRI_PCI_BUS,            /* Must happen before IOMMU */
>      MIG_PRI_VIRTIO_MEM,         /* Must happen before IOMMU */
> +    MIG_PRI_APIC,               /* Must happen before PCI devices */

As the list grows, maybe it's a good idea we start to document exactly the
reasons inline into the enum.  Can sure be done on top.  This patch alone
looks like stable material..

Reviewed-by: Peter Xu <peterx@redhat.com>

>      MIG_PRI_GICV3_ITS,          /* Must happen before PCI devices */
>      MIG_PRI_GICV3,              /* Must happen before the ITS */
>      MIG_PRI_MAX,
> -- 
> 2.20.1
> 

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] migration: ensure APIC is loaded prior to VFIO PCI devices
  2025-08-18 13:11 [PATCH] migration: ensure APIC is loaded prior to VFIO PCI devices Yanfei Xu
  2025-08-18 14:56 ` Peter Xu
@ 2025-09-29 15:51 ` Peter Xu
  2025-09-30  8:27   ` Cédric Le Goater
  1 sibling, 1 reply; 5+ messages in thread
From: Peter Xu @ 2025-09-29 15:51 UTC (permalink / raw)
  To: Yanfei Xu
  Cc: mst, pbonzini, farosas, qemu-devel, shenyicong.1023,
	Cédric Le Goater, Alex Williamson

On Mon, Aug 18, 2025 at 09:11:27PM +0800, Yanfei Xu wrote:
> The load procedure of VFIO PCI devices involves setting up IRT
> for each VFIO PCI devices. This requires determining whether an
> interrupt is single-destination interrupt to decide between
> Posted Interrupt(PI) or remapping mode for the IRTE. However,
> determining this may require accessing the VM's APIC registers.
> 
> For example:
> ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irqs)
>   ...
>     kvm_arch_irq_bypass_add_producer
>       kvm_x86_call(pi_update_irte)
>         vmx_pi_update_irte
>           kvm_intr_is_single_vcpu
> 
> If the LAPIC has not been loaded yet, interrupts will use remapping
> mode. To prevent the fallback of interrupt mode, keep APIC is always
> loaded prior to VFIO PCI devices.
> 
> Signed-off-by: Yicong Shen <shenyicong.1023@bytedance.com>
> Signed-off-by: Yanfei Xu <yanfei.xu@bytedance.com>
> ---
>  hw/intc/apic_common.c       | 1 +
>  include/migration/vmstate.h | 1 +
>  2 files changed, 2 insertions(+)
> 
> diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c
> index 37a7a7019d..394fe02013 100644
> --- a/hw/intc/apic_common.c
> +++ b/hw/intc/apic_common.c
> @@ -379,6 +379,7 @@ static const VMStateDescription vmstate_apic_common = {
>      .pre_load = apic_pre_load,
>      .pre_save = apic_dispatch_pre_save,
>      .post_load = apic_dispatch_post_load,
> +    .priority = MIG_PRI_APIC,
>      .fields = (const VMStateField[]) {
>          VMSTATE_UINT32(apicbase, APICCommonState),
>          VMSTATE_UINT8(id, APICCommonState),
> diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
> index 1ff7bd9ac4..22e988c5db 100644
> --- a/include/migration/vmstate.h
> +++ b/include/migration/vmstate.h
> @@ -163,6 +163,7 @@ typedef enum {
>      MIG_PRI_IOMMU,              /* Must happen before PCI devices */
>      MIG_PRI_PCI_BUS,            /* Must happen before IOMMU */
>      MIG_PRI_VIRTIO_MEM,         /* Must happen before IOMMU */
> +    MIG_PRI_APIC,               /* Must happen before PCI devices */
>      MIG_PRI_GICV3_ITS,          /* Must happen before PCI devices */
>      MIG_PRI_GICV3,              /* Must happen before the ITS */
>      MIG_PRI_MAX,
> -- 
> 2.20.1
> 

+Cedric, +Alex

queued.

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] migration: ensure APIC is loaded prior to VFIO PCI devices
  2025-09-29 15:51 ` Peter Xu
@ 2025-09-30  8:27   ` Cédric Le Goater
  2025-09-30 16:04     ` Peter Xu
  0 siblings, 1 reply; 5+ messages in thread
From: Cédric Le Goater @ 2025-09-30  8:27 UTC (permalink / raw)
  To: Peter Xu, Yanfei Xu
  Cc: mst, pbonzini, farosas, qemu-devel, shenyicong.1023,
	Alex Williamson

On 9/29/25 17:51, Peter Xu wrote:
> On Mon, Aug 18, 2025 at 09:11:27PM +0800, Yanfei Xu wrote:
>> The load procedure of VFIO PCI devices involves setting up IRT
>> for each VFIO PCI devices. This requires determining whether an
>> interrupt is single-destination interrupt to decide between
>> Posted Interrupt(PI) or remapping mode for the IRTE. However,
>> determining this may require accessing the VM's APIC registers.
>>
>> For example:
>> ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irqs)
>>    ...
>>      kvm_arch_irq_bypass_add_producer
>>        kvm_x86_call(pi_update_irte)
>>          vmx_pi_update_irte
>>            kvm_intr_is_single_vcpu
>>
>> If the LAPIC has not been loaded yet, interrupts will use remapping
>> mode. To prevent the fallback of interrupt mode, keep APIC is always
>> loaded prior to VFIO PCI devices.
>>
>> Signed-off-by: Yicong Shen <shenyicong.1023@bytedance.com>
>> Signed-off-by: Yanfei Xu <yanfei.xu@bytedance.com>
>> ---
>>   hw/intc/apic_common.c       | 1 +
>>   include/migration/vmstate.h | 1 +
>>   2 files changed, 2 insertions(+)
>>
>> diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c
>> index 37a7a7019d..394fe02013 100644
>> --- a/hw/intc/apic_common.c
>> +++ b/hw/intc/apic_common.c
>> @@ -379,6 +379,7 @@ static const VMStateDescription vmstate_apic_common = {
>>       .pre_load = apic_pre_load,
>>       .pre_save = apic_dispatch_pre_save,
>>       .post_load = apic_dispatch_post_load,
>> +    .priority = MIG_PRI_APIC,
>>       .fields = (const VMStateField[]) {
>>           VMSTATE_UINT32(apicbase, APICCommonState),
>>           VMSTATE_UINT8(id, APICCommonState),
>> diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
>> index 1ff7bd9ac4..22e988c5db 100644
>> --- a/include/migration/vmstate.h
>> +++ b/include/migration/vmstate.h
>> @@ -163,6 +163,7 @@ typedef enum {
>>       MIG_PRI_IOMMU,              /* Must happen before PCI devices */
>>       MIG_PRI_PCI_BUS,            /* Must happen before IOMMU */
>>       MIG_PRI_VIRTIO_MEM,         /* Must happen before IOMMU */
>> +    MIG_PRI_APIC,               /* Must happen before PCI devices */
>>       MIG_PRI_GICV3_ITS,          /* Must happen before PCI devices */
>>       MIG_PRI_GICV3,              /* Must happen before the ITS */
>>       MIG_PRI_MAX,
>> -- 
>> 2.20.1
>>
> 
> +Cedric, +Alex
> 
> queued.
> 

Perhaps we could group the interrupt controller priorities
under a common MIG_PRI_INTC ? PPC is very much the same,
although we managed to order restore from the machine load
handler IIRC.

Anyhow, LGTM.

Thanks,

C.



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] migration: ensure APIC is loaded prior to VFIO PCI devices
  2025-09-30  8:27   ` Cédric Le Goater
@ 2025-09-30 16:04     ` Peter Xu
  0 siblings, 0 replies; 5+ messages in thread
From: Peter Xu @ 2025-09-30 16:04 UTC (permalink / raw)
  To: Cédric Le Goater
  Cc: Yanfei Xu, mst, pbonzini, farosas, qemu-devel, shenyicong.1023,
	Alex Williamson

On Tue, Sep 30, 2025 at 10:27:34AM +0200, Cédric Le Goater wrote:
> On 9/29/25 17:51, Peter Xu wrote:
> > On Mon, Aug 18, 2025 at 09:11:27PM +0800, Yanfei Xu wrote:
> > > The load procedure of VFIO PCI devices involves setting up IRT
> > > for each VFIO PCI devices. This requires determining whether an
> > > interrupt is single-destination interrupt to decide between
> > > Posted Interrupt(PI) or remapping mode for the IRTE. However,
> > > determining this may require accessing the VM's APIC registers.
> > > 
> > > For example:
> > > ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irqs)
> > >    ...
> > >      kvm_arch_irq_bypass_add_producer
> > >        kvm_x86_call(pi_update_irte)
> > >          vmx_pi_update_irte
> > >            kvm_intr_is_single_vcpu
> > > 
> > > If the LAPIC has not been loaded yet, interrupts will use remapping
> > > mode. To prevent the fallback of interrupt mode, keep APIC is always
> > > loaded prior to VFIO PCI devices.
> > > 
> > > Signed-off-by: Yicong Shen <shenyicong.1023@bytedance.com>
> > > Signed-off-by: Yanfei Xu <yanfei.xu@bytedance.com>
> > > ---
> > >   hw/intc/apic_common.c       | 1 +
> > >   include/migration/vmstate.h | 1 +
> > >   2 files changed, 2 insertions(+)
> > > 
> > > diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c
> > > index 37a7a7019d..394fe02013 100644
> > > --- a/hw/intc/apic_common.c
> > > +++ b/hw/intc/apic_common.c
> > > @@ -379,6 +379,7 @@ static const VMStateDescription vmstate_apic_common = {
> > >       .pre_load = apic_pre_load,
> > >       .pre_save = apic_dispatch_pre_save,
> > >       .post_load = apic_dispatch_post_load,
> > > +    .priority = MIG_PRI_APIC,
> > >       .fields = (const VMStateField[]) {
> > >           VMSTATE_UINT32(apicbase, APICCommonState),
> > >           VMSTATE_UINT8(id, APICCommonState),
> > > diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
> > > index 1ff7bd9ac4..22e988c5db 100644
> > > --- a/include/migration/vmstate.h
> > > +++ b/include/migration/vmstate.h
> > > @@ -163,6 +163,7 @@ typedef enum {
> > >       MIG_PRI_IOMMU,              /* Must happen before PCI devices */
> > >       MIG_PRI_PCI_BUS,            /* Must happen before IOMMU */
> > >       MIG_PRI_VIRTIO_MEM,         /* Must happen before IOMMU */
> > > +    MIG_PRI_APIC,               /* Must happen before PCI devices */
> > >       MIG_PRI_GICV3_ITS,          /* Must happen before PCI devices */
> > >       MIG_PRI_GICV3,              /* Must happen before the ITS */
> > >       MIG_PRI_MAX,
> > > -- 
> > > 2.20.1
> > > 
> > 
> > +Cedric, +Alex
> > 
> > queued.
> > 
> 
> Perhaps we could group the interrupt controller priorities
> under a common MIG_PRI_INTC ? PPC is very much the same,
> although we managed to order restore from the machine load
> handler IIRC.
> 
> Anyhow, LGTM.

Agreed. That, when introduced, can also take ARM's into account (that may
need a MIG_PRI_INTC_HIGH for ITS).

Thanks,

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2025-09-30 16:06 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-18 13:11 [PATCH] migration: ensure APIC is loaded prior to VFIO PCI devices Yanfei Xu
2025-08-18 14:56 ` Peter Xu
2025-09-29 15:51 ` Peter Xu
2025-09-30  8:27   ` Cédric Le Goater
2025-09-30 16:04     ` Peter Xu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).