qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH V2 1/1] hw/intc/arm_gicv3_kvm: preserve pending interrupts during cpr
@ 2025-07-16 18:07 Steve Sistare
  2025-07-21 10:24 ` Peter Maydell
  0 siblings, 1 reply; 7+ messages in thread
From: Steve Sistare @ 2025-07-16 18:07 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: Peter Maydell, Cedric Le Goater, Zhenzhong Duan, Alex Williamson,
	Steve Sistare

Close a race condition that causes cpr-transfer to lose VFIO
interrupts on ARM.

CPR stops VCPUs but does not disable VFIO interrupts, which may continue
to arrive throughout the transition to new QEMU.

CPR calls kvm_irqchip_remove_irqfd_notifier_gsi in old QEMU to force
future interrupts to the producer eventfd, where they are preserved.
Old QEMU then destroys the old KVM instance.  However, interrupts may
already be pending in KVM state.  To preserve them, call ioctl
KVM_DEV_ARM_VGIC_SAVE_PENDING_TABLES to flush them to guest RAM, where
they will be picked up when the new KVM+VCPU instance is created.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
 hw/intc/arm_gicv3_kvm.c            | 18 +++++++++++++++++-
 include/hw/intc/arm_gicv3_common.h |  3 +++
 2 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/hw/intc/arm_gicv3_kvm.c b/hw/intc/arm_gicv3_kvm.c
index 3be3bf6c28..3d12c846ce 100644
--- a/hw/intc/arm_gicv3_kvm.c
+++ b/hw/intc/arm_gicv3_kvm.c
@@ -30,6 +30,7 @@
 #include "gicv3_internal.h"
 #include "vgic_common.h"
 #include "migration/blocker.h"
+#include "migration/misc.h"
 #include "qom/object.h"
 #include "target/arm/cpregs.h"
 
@@ -777,6 +778,17 @@ static void vm_change_state_handler(void *opaque, bool running,
     }
 }
 
+static int kvm_arm_gicv3_notifier(NotifierWithReturn *notifier,
+                                  MigrationEvent *e, Error **errp)
+{
+    if (e->type == MIG_EVENT_PRECOPY_DONE) {
+        GICv3State *s = container_of(notifier, GICv3State, cpr_notifier);
+        return kvm_device_access(s->dev_fd, KVM_DEV_ARM_VGIC_GRP_CTRL,
+                                 KVM_DEV_ARM_VGIC_SAVE_PENDING_TABLES,
+                                 NULL, true, errp);
+    }
+    return 0;
+}
 
 static void kvm_arm_gicv3_realize(DeviceState *dev, Error **errp)
 {
@@ -883,13 +895,17 @@ static void kvm_arm_gicv3_realize(DeviceState *dev, Error **errp)
                                GICD_CTLR)) {
         error_setg(&s->migration_blocker, "This operating system kernel does "
                                           "not support vGICv3 migration");
-        if (migrate_add_blocker(&s->migration_blocker, errp) < 0) {
+        if (migrate_add_blocker_modes(&s->migration_blocker, MIG_MODE_NORMAL,
+                                      MIG_MODE_CPR_TRANSFER, errp) < 0) {
             return;
         }
     }
     if (kvm_device_check_attr(s->dev_fd, KVM_DEV_ARM_VGIC_GRP_CTRL,
                               KVM_DEV_ARM_VGIC_SAVE_PENDING_TABLES)) {
         qemu_add_vm_change_state_handler(vm_change_state_handler, s);
+        migration_add_notifier_mode(&s->cpr_notifier,
+                                    kvm_arm_gicv3_notifier,
+                                    MIG_MODE_CPR_TRANSFER);
     }
 }
 
diff --git a/include/hw/intc/arm_gicv3_common.h b/include/hw/intc/arm_gicv3_common.h
index a3d6a0e507..75bbc403c7 100644
--- a/include/hw/intc/arm_gicv3_common.h
+++ b/include/hw/intc/arm_gicv3_common.h
@@ -27,6 +27,7 @@
 #include "hw/sysbus.h"
 #include "hw/intc/arm_gic_common.h"
 #include "qom/object.h"
+#include "qemu/notify.h"
 
 /*
  * Maximum number of possible interrupts, determined by the GIC architecture.
@@ -270,6 +271,8 @@ struct GICv3State {
     GICv3CPUState *cpu;
     /* List of all ITSes connected to this GIC */
     GPtrArray *itslist;
+
+    NotifierWithReturn cpr_notifier;
 };
 
 #define GICV3_BITMAP_ACCESSORS(BMP)                                     \
-- 
2.39.3



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH V2 1/1] hw/intc/arm_gicv3_kvm: preserve pending interrupts during cpr
  2025-07-16 18:07 [PATCH V2 1/1] hw/intc/arm_gicv3_kvm: preserve pending interrupts during cpr Steve Sistare
@ 2025-07-21 10:24 ` Peter Maydell
  2025-08-04 14:23   ` Steven Sistare
  0 siblings, 1 reply; 7+ messages in thread
From: Peter Maydell @ 2025-07-21 10:24 UTC (permalink / raw)
  To: Steve Sistare
  Cc: qemu-devel, qemu-arm, Cedric Le Goater, Zhenzhong Duan,
	Alex Williamson

On Wed, 16 Jul 2025 at 19:07, Steve Sistare <steven.sistare@oracle.com> wrote:
>
> Close a race condition that causes cpr-transfer to lose VFIO
> interrupts on ARM.
>
> CPR stops VCPUs but does not disable VFIO interrupts, which may continue
> to arrive throughout the transition to new QEMU.
>
> CPR calls kvm_irqchip_remove_irqfd_notifier_gsi in old QEMU to force
> future interrupts to the producer eventfd, where they are preserved.
> Old QEMU then destroys the old KVM instance.  However, interrupts may
> already be pending in KVM state.  To preserve them, call ioctl
> KVM_DEV_ARM_VGIC_SAVE_PENDING_TABLES to flush them to guest RAM, where
> they will be picked up when the new KVM+VCPU instance is created.
>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>

> +static int kvm_arm_gicv3_notifier(NotifierWithReturn *notifier,
> +                                  MigrationEvent *e, Error **errp)
> +{
> +    if (e->type == MIG_EVENT_PRECOPY_DONE) {
> +        GICv3State *s = container_of(notifier, GICv3State, cpr_notifier);
> +        return kvm_device_access(s->dev_fd, KVM_DEV_ARM_VGIC_GRP_CTRL,
> +                                 KVM_DEV_ARM_VGIC_SAVE_PENDING_TABLES,
> +                                 NULL, true, errp);
> +    }
> +    return 0;
> +}
>
>  static void kvm_arm_gicv3_realize(DeviceState *dev, Error **errp)
>  {
> @@ -883,13 +895,17 @@ static void kvm_arm_gicv3_realize(DeviceState *dev, Error **errp)
>                                 GICD_CTLR)) {
>          error_setg(&s->migration_blocker, "This operating system kernel does "
>                                            "not support vGICv3 migration");
> -        if (migrate_add_blocker(&s->migration_blocker, errp) < 0) {
> +        if (migrate_add_blocker_modes(&s->migration_blocker, MIG_MODE_NORMAL,
> +                                      MIG_MODE_CPR_TRANSFER, errp) < 0) {

Why did you change this? It's the general "if no support, can't
migrate at all" check, which seems unrelated to cpr-transfer.

>              return;
>          }
>      }
>      if (kvm_device_check_attr(s->dev_fd, KVM_DEV_ARM_VGIC_GRP_CTRL,
>                                KVM_DEV_ARM_VGIC_SAVE_PENDING_TABLES)) {
>          qemu_add_vm_change_state_handler(vm_change_state_handler, s);
> +        migration_add_notifier_mode(&s->cpr_notifier,
> +                                    kvm_arm_gicv3_notifier,
> +                                    MIG_MODE_CPR_TRANSFER);
>      }
>  }

Otherwise the patch looks OK in general shape, but I know
nothing about cpr-transfer so a review from somebody on the
migration side would be helpful.

thanks
-- PMM


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH V2 1/1] hw/intc/arm_gicv3_kvm: preserve pending interrupts during cpr
  2025-07-21 10:24 ` Peter Maydell
@ 2025-08-04 14:23   ` Steven Sistare
  2025-08-04 14:47     ` Peter Maydell
  0 siblings, 1 reply; 7+ messages in thread
From: Steven Sistare @ 2025-08-04 14:23 UTC (permalink / raw)
  To: Peter Maydell, Fabiano Rosas
  Cc: qemu-devel, qemu-arm, Cedric Le Goater, Zhenzhong Duan,
	Alex Williamson

Fabiano, could you sanity check this patch? Thanks!

Peter, more below:

On 7/21/2025 6:24 AM, Peter Maydell wrote:
> On Wed, 16 Jul 2025 at 19:07, Steve Sistare <steven.sistare@oracle.com> wrote:
>>
>> Close a race condition that causes cpr-transfer to lose VFIO
>> interrupts on ARM.
>>
>> CPR stops VCPUs but does not disable VFIO interrupts, which may continue
>> to arrive throughout the transition to new QEMU.
>>
>> CPR calls kvm_irqchip_remove_irqfd_notifier_gsi in old QEMU to force
>> future interrupts to the producer eventfd, where they are preserved.
>> Old QEMU then destroys the old KVM instance.  However, interrupts may
>> already be pending in KVM state.  To preserve them, call ioctl
>> KVM_DEV_ARM_VGIC_SAVE_PENDING_TABLES to flush them to guest RAM, where
>> they will be picked up when the new KVM+VCPU instance is created.
>>
>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> 
>> +static int kvm_arm_gicv3_notifier(NotifierWithReturn *notifier,
>> +                                  MigrationEvent *e, Error **errp)
>> +{
>> +    if (e->type == MIG_EVENT_PRECOPY_DONE) {
>> +        GICv3State *s = container_of(notifier, GICv3State, cpr_notifier);
>> +        return kvm_device_access(s->dev_fd, KVM_DEV_ARM_VGIC_GRP_CTRL,
>> +                                 KVM_DEV_ARM_VGIC_SAVE_PENDING_TABLES,
>> +                                 NULL, true, errp);
>> +    }
>> +    return 0;
>> +}
>>
>>   static void kvm_arm_gicv3_realize(DeviceState *dev, Error **errp)
>>   {
>> @@ -883,13 +895,17 @@ static void kvm_arm_gicv3_realize(DeviceState *dev, Error **errp)
>>                                  GICD_CTLR)) {
>>           error_setg(&s->migration_blocker, "This operating system kernel does "
>>                                             "not support vGICv3 migration");
>> -        if (migrate_add_blocker(&s->migration_blocker, errp) < 0) {
>> +        if (migrate_add_blocker_modes(&s->migration_blocker, MIG_MODE_NORMAL,
>> +                                      MIG_MODE_CPR_TRANSFER, errp) < 0) {
> 
> Why did you change this? It's the general "if no support, can't
> migrate at all" check, which seems unrelated to cpr-transfer.

"If no support", then cpr-transfer should also be blocked.
It should have been added earlier.

- Steve

>>               return;
>>           }
>>       }
>>       if (kvm_device_check_attr(s->dev_fd, KVM_DEV_ARM_VGIC_GRP_CTRL,
>>                                 KVM_DEV_ARM_VGIC_SAVE_PENDING_TABLES)) {
>>           qemu_add_vm_change_state_handler(vm_change_state_handler, s);
>> +        migration_add_notifier_mode(&s->cpr_notifier,
>> +                                    kvm_arm_gicv3_notifier,
>> +                                    MIG_MODE_CPR_TRANSFER);
>>       }
>>   }
> 
> Otherwise the patch looks OK in general shape, but I know
> nothing about cpr-transfer so a review from somebody on the
> migration side would be helpful.
> 
> thanks
> -- PMM



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH V2 1/1] hw/intc/arm_gicv3_kvm: preserve pending interrupts during cpr
  2025-08-04 14:23   ` Steven Sistare
@ 2025-08-04 14:47     ` Peter Maydell
  2025-08-04 15:09       ` Steven Sistare
  0 siblings, 1 reply; 7+ messages in thread
From: Peter Maydell @ 2025-08-04 14:47 UTC (permalink / raw)
  To: Steven Sistare
  Cc: Fabiano Rosas, qemu-devel, qemu-arm, Cedric Le Goater,
	Zhenzhong Duan, Alex Williamson

On Mon, 4 Aug 2025 at 15:23, Steven Sistare <steven.sistare@oracle.com> wrote:
>
> Fabiano, could you sanity check this patch? Thanks!
>
> Peter, more below:
>
> On 7/21/2025 6:24 AM, Peter Maydell wrote:
> >> @@ -883,13 +895,17 @@ static void kvm_arm_gicv3_realize(DeviceState *dev, Error **errp)
> >>                                  GICD_CTLR)) {
> >>           error_setg(&s->migration_blocker, "This operating system kernel does "
> >>                                             "not support vGICv3 migration");
> >> -        if (migrate_add_blocker(&s->migration_blocker, errp) < 0) {
> >> +        if (migrate_add_blocker_modes(&s->migration_blocker, MIG_MODE_NORMAL,
> >> +                                      MIG_MODE_CPR_TRANSFER, errp) < 0) {
> >
> > Why did you change this? It's the general "if no support, can't
> > migrate at all" check, which seems unrelated to cpr-transfer.
>
> "If no support", then cpr-transfer should also be blocked.

But migrate_add_blocker() is a wrapper for
migrate_add_blocker_modes(..., MIG_MODE_ALL). So doesn't
this change go from "block migration for normal, and cpr-transfer,
and everything else" to "block migration for normal and
cpr-transfer but let the rest through"?

That doesn't seem like the right thing; if it *is* the right
thing then it should be a separate patch with a commit message
that explains why we do it; and we would probably want to
audit all the other uses of plain migrate_add_blocker() and/or
change that function's name or API...

thanks
-- PMM


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH V2 1/1] hw/intc/arm_gicv3_kvm: preserve pending interrupts during cpr
  2025-08-04 14:47     ` Peter Maydell
@ 2025-08-04 15:09       ` Steven Sistare
  2025-08-04 18:52         ` Fabiano Rosas
  0 siblings, 1 reply; 7+ messages in thread
From: Steven Sistare @ 2025-08-04 15:09 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Fabiano Rosas, qemu-devel, qemu-arm, Cedric Le Goater,
	Zhenzhong Duan, Alex Williamson

On 8/4/2025 10:47 AM, Peter Maydell wrote:
> On Mon, 4 Aug 2025 at 15:23, Steven Sistare <steven.sistare@oracle.com> wrote:
>>
>> Fabiano, could you sanity check this patch? Thanks!
>>
>> Peter, more below:
>>
>> On 7/21/2025 6:24 AM, Peter Maydell wrote:
>>>> @@ -883,13 +895,17 @@ static void kvm_arm_gicv3_realize(DeviceState *dev, Error **errp)
>>>>                                   GICD_CTLR)) {
>>>>            error_setg(&s->migration_blocker, "This operating system kernel does "
>>>>                                              "not support vGICv3 migration");
>>>> -        if (migrate_add_blocker(&s->migration_blocker, errp) < 0) {
>>>> +        if (migrate_add_blocker_modes(&s->migration_blocker, MIG_MODE_NORMAL,
>>>> +                                      MIG_MODE_CPR_TRANSFER, errp) < 0) {
>>>
>>> Why did you change this? It's the general "if no support, can't
>>> migrate at all" check, which seems unrelated to cpr-transfer.
>>
>> "If no support", then cpr-transfer should also be blocked.
> 
> But migrate_add_blocker() is a wrapper for
> migrate_add_blocker_modes(..., MIG_MODE_ALL). So doesn't
> this change go from "block migration for normal, and cpr-transfer,
> and everything else" to "block migration for normal and
> cpr-transfer but let the rest through"?
> 
> That doesn't seem like the right thing; if it *is* the right
> thing then it should be a separate patch with a commit message
> that explains why we do it; and we would probably want to
> audit all the other uses of plain migrate_add_blocker() and/or
> change that function's name or API...

Yup, my bad. I will revert this hunk.

- Steve


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH V2 1/1] hw/intc/arm_gicv3_kvm: preserve pending interrupts during cpr
  2025-08-04 15:09       ` Steven Sistare
@ 2025-08-04 18:52         ` Fabiano Rosas
  2025-08-11 18:20           ` Steven Sistare
  0 siblings, 1 reply; 7+ messages in thread
From: Fabiano Rosas @ 2025-08-04 18:52 UTC (permalink / raw)
  To: Steven Sistare, Peter Maydell
  Cc: qemu-devel, qemu-arm, Cedric Le Goater, Zhenzhong Duan,
	Alex Williamson

Steven Sistare <steven.sistare@oracle.com> writes:

> On 8/4/2025 10:47 AM, Peter Maydell wrote:
>> On Mon, 4 Aug 2025 at 15:23, Steven Sistare <steven.sistare@oracle.com> wrote:
>>>
>>> Fabiano, could you sanity check this patch? Thanks!
>>>
>>> Peter, more below:
>>>
>>> On 7/21/2025 6:24 AM, Peter Maydell wrote:
>>>>> @@ -883,13 +895,17 @@ static void kvm_arm_gicv3_realize(DeviceState *dev, Error **errp)
>>>>>                                   GICD_CTLR)) {
>>>>>            error_setg(&s->migration_blocker, "This operating system kernel does "
>>>>>                                              "not support vGICv3 migration");
>>>>> -        if (migrate_add_blocker(&s->migration_blocker, errp) < 0) {
>>>>> +        if (migrate_add_blocker_modes(&s->migration_blocker, MIG_MODE_NORMAL,
>>>>> +                                      MIG_MODE_CPR_TRANSFER, errp) < 0) {
>>>>
>>>> Why did you change this? It's the general "if no support, can't
>>>> migrate at all" check, which seems unrelated to cpr-transfer.
>>>
>>> "If no support", then cpr-transfer should also be blocked.
>> 
>> But migrate_add_blocker() is a wrapper for
>> migrate_add_blocker_modes(..., MIG_MODE_ALL). So doesn't
>> this change go from "block migration for normal, and cpr-transfer,
>> and everything else" to "block migration for normal and
>> cpr-transfer but let the rest through"?
>> 
>> That doesn't seem like the right thing; if it *is* the right
>> thing then it should be a separate patch with a commit message
>> that explains why we do it; and we would probably want to
>> audit all the other uses of plain migrate_add_blocker() and/or
>> change that function's name or API...
>
> Yup, my bad. I will revert this hunk.
>

With that,

Reviewed-by: Fabiano Rosas <farosas@suse.de>


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH V2 1/1] hw/intc/arm_gicv3_kvm: preserve pending interrupts during cpr
  2025-08-04 18:52         ` Fabiano Rosas
@ 2025-08-11 18:20           ` Steven Sistare
  0 siblings, 0 replies; 7+ messages in thread
From: Steven Sistare @ 2025-08-11 18:20 UTC (permalink / raw)
  To: Fabiano Rosas, Peter Maydell
  Cc: qemu-devel, qemu-arm, Cedric Le Goater, Zhenzhong Duan,
	Alex Williamson

On 8/4/2025 2:52 PM, Fabiano Rosas wrote:
> Steven Sistare <steven.sistare@oracle.com> writes:
> 
>> On 8/4/2025 10:47 AM, Peter Maydell wrote:
>>> On Mon, 4 Aug 2025 at 15:23, Steven Sistare <steven.sistare@oracle.com> wrote:
>>>>
>>>> Fabiano, could you sanity check this patch? Thanks!
>>>>
>>>> Peter, more below:
>>>>
>>>> On 7/21/2025 6:24 AM, Peter Maydell wrote:
>>>>>> @@ -883,13 +895,17 @@ static void kvm_arm_gicv3_realize(DeviceState *dev, Error **errp)
>>>>>>                                    GICD_CTLR)) {
>>>>>>             error_setg(&s->migration_blocker, "This operating system kernel does "
>>>>>>                                               "not support vGICv3 migration");
>>>>>> -        if (migrate_add_blocker(&s->migration_blocker, errp) < 0) {
>>>>>> +        if (migrate_add_blocker_modes(&s->migration_blocker, MIG_MODE_NORMAL,
>>>>>> +                                      MIG_MODE_CPR_TRANSFER, errp) < 0) {
>>>>>
>>>>> Why did you change this? It's the general "if no support, can't
>>>>> migrate at all" check, which seems unrelated to cpr-transfer.
>>>>
>>>> "If no support", then cpr-transfer should also be blocked.
>>>
>>> But migrate_add_blocker() is a wrapper for
>>> migrate_add_blocker_modes(..., MIG_MODE_ALL). So doesn't
>>> this change go from "block migration for normal, and cpr-transfer,
>>> and everything else" to "block migration for normal and
>>> cpr-transfer but let the rest through"?
>>>
>>> That doesn't seem like the right thing; if it *is* the right
>>> thing then it should be a separate patch with a commit message
>>> that explains why we do it; and we would probably want to
>>> audit all the other uses of plain migrate_add_blocker() and/or
>>> change that function's name or API...
>>
>> Yup, my bad. I will revert this hunk.
> 
> With that,
> 
> Reviewed-by: Fabiano Rosas <farosas@suse.de>

I just sent V3.  The only change is the revert.

- Steve




^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2025-08-11 18:22 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-07-16 18:07 [PATCH V2 1/1] hw/intc/arm_gicv3_kvm: preserve pending interrupts during cpr Steve Sistare
2025-07-21 10:24 ` Peter Maydell
2025-08-04 14:23   ` Steven Sistare
2025-08-04 14:47     ` Peter Maydell
2025-08-04 15:09       ` Steven Sistare
2025-08-04 18:52         ` Fabiano Rosas
2025-08-11 18:20           ` Steven Sistare

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).