kvm.vger.kernel.org archive mirror
* [RFC PATCH 1/2] KVM: Add arch hooks for KVM syscore ops
@ 2025-06-23 13:27 David Woodhouse
  2025-06-23 13:27 ` [RFC PATCH 2/2] KVM: arm64: vgic-its: Unmap all vPEs on shutdown David Woodhouse
  0 siblings, 1 reply; 6+ messages in thread
From: David Woodhouse @ 2025-06-23 13:27 UTC (permalink / raw)
  To: Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
	Zenghui Yu, Catalin Marinas, Will Deacon, Paolo Bonzini,
	Sebastian Ott, Andre Przywara, Thorsten Blum, Shameer Kolothum,
	David Woodhouse, linux-arm-kernel, kvmarm, linux-kernel, kvm

From: David Woodhouse <dwmw@amazon.co.uk>

Allow the architecture to hook kvm_shutdown(), kvm_suspend() and kvm_resume().

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 include/linux/kvm_host.h |  3 +++
 virt/kvm/kvm_main.c      | 17 +++++++++++++++++
 2 files changed, 20 insertions(+)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 3bde4fb5c6aa..8e16b6c0d2ba 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1622,6 +1622,9 @@ void kvm_arch_disable_virtualization(void);
  */
 int kvm_arch_enable_virtualization_cpu(void);
 void kvm_arch_disable_virtualization_cpu(void);
+void kvm_arch_shutdown(void);
+void kvm_arch_suspend(void);
+void kvm_arch_resume(void);
 #endif
 bool kvm_vcpu_has_events(struct kvm_vcpu *vcpu);
 int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index eec82775c5bf..4af1d9943d39 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -5609,6 +5609,20 @@ static int kvm_offline_cpu(unsigned int cpu)
 	return 0;
 }
 
+__weak void kvm_arch_shutdown(void)
+{
+
+}
+
+__weak void kvm_arch_suspend(void)
+{
+
+}
+__weak void kvm_arch_resume(void)
+{
+
+}
+
 static void kvm_shutdown(void)
 {
 	/*
@@ -5625,6 +5639,7 @@ static void kvm_shutdown(void)
 	pr_info("kvm: exiting hardware virtualization\n");
 	kvm_rebooting = true;
 	on_each_cpu(kvm_disable_virtualization_cpu, NULL, 1);
+	kvm_arch_shutdown();
 }
 
 static int kvm_suspend(void)
@@ -5641,6 +5656,7 @@ static int kvm_suspend(void)
 	lockdep_assert_irqs_disabled();
 
 	kvm_disable_virtualization_cpu(NULL);
+	kvm_arch_suspend();
 	return 0;
 }
 
@@ -5649,6 +5665,7 @@ static void kvm_resume(void)
 	lockdep_assert_not_held(&kvm_usage_lock);
 	lockdep_assert_irqs_disabled();
 
+	kvm_arch_resume();
 	WARN_ON_ONCE(kvm_enable_virtualization_cpu());
 }
 
-- 
2.49.0
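
For readers unfamiliar with the weak-hook pattern the patch relies on, here is
a minimal userspace sketch, assuming GCC or Clang on an ELF target. All names
(arch_shutdown, generic_shutdown, the trace buffer) are hypothetical stand-ins
and not kernel symbols; the default stub records a trace entry only so the call
order can be observed, whereas the stubs in the patch are empty.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical trace buffer so the call order can be inspected. */
static char trace[64];

static void trace_add(const char *step)
{
	/* Refuse to overflow the small demo buffer. */
	assert(strlen(trace) + strlen(step) < sizeof(trace));
	strcat(trace, step);
}

/*
 * Weak default: the linker picks this definition unless another object
 * file provides a strong arch_shutdown().
 */
__attribute__((weak)) void arch_shutdown(void)
{
	trace_add("arch-default;");
}

/* Stand-in for the generic per-CPU teardown. */
static void disable_virtualization(void)
{
	trace_add("disable;");
}

/* Generic code calls the hook unconditionally, after its own teardown. */
void generic_shutdown(void)
{
	disable_virtualization();
	arch_shutdown();
}

/* Accessor for the recorded call order. */
const char *shutdown_trace(void)
{
	return trace;
}
```

Calling generic_shutdown() once and then reading shutdown_trace() shows the
generic step followed by the default hook. Linking in a second file that
defines a non-weak arch_shutdown() would replace the default without any
change to the generic caller.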



* [RFC PATCH 2/2] KVM: arm64: vgic-its: Unmap all vPEs on shutdown
  2025-06-23 13:27 [RFC PATCH 1/2] KVM: Add arch hooks for KVM syscore ops David Woodhouse
@ 2025-06-23 13:27 ` David Woodhouse
  2025-06-23 16:38   ` David Woodhouse
  2025-07-22 22:46   ` Oliver Upton
  0 siblings, 2 replies; 6+ messages in thread
From: David Woodhouse @ 2025-06-23 13:27 UTC (permalink / raw)
  To: Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
	Zenghui Yu, Catalin Marinas, Will Deacon, Paolo Bonzini,
	Sebastian Ott, Andre Przywara, Thorsten Blum, Shameer Kolothum,
	David Woodhouse, linux-arm-kernel, kvmarm, linux-kernel, kvm

From: David Woodhouse <dwmw@amazon.co.uk>

We observed systems going dark on kexec, due to corruption of the new
kernel's text (and sometimes the initrd). This was eventually determined
to be caused by the vLPI pending tables used by the GIC in the previous
kernel, which were not being quiesced properly.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 arch/arm64/kvm/arm.c          |  5 +++++
 arch/arm64/kvm/vgic/vgic-v3.c | 14 ++++++++++++++
 include/kvm/arm_vgic.h        |  2 ++
 3 files changed, 21 insertions(+)

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 38a91bb5d4c7..2b76f506bc2d 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -2164,6 +2164,11 @@ void kvm_arch_disable_virtualization_cpu(void)
 		cpu_hyp_uninit(NULL);
 }
 
+void kvm_arch_shutdown(void)
+{
+	kvm_vgic_v3_shutdown();
+}
+
 #ifdef CONFIG_CPU_PM
 static int hyp_init_cpu_pm_notifier(struct notifier_block *self,
 				    unsigned long cmd,
diff --git a/arch/arm64/kvm/vgic/vgic-v3.c b/arch/arm64/kvm/vgic/vgic-v3.c
index b9ad7c42c5b0..6591e8d84855 100644
--- a/arch/arm64/kvm/vgic/vgic-v3.c
+++ b/arch/arm64/kvm/vgic/vgic-v3.c
@@ -382,6 +382,20 @@ static void map_all_vpes(struct kvm *kvm)
 						dist->its_vm.vpes[i]->irq));
 }
 
+void kvm_vgic_v3_shutdown(void)
+{
+	struct kvm *kvm;
+
+	if (!kvm_vgic_global_state.has_gicv4_1)
+		return;
+
+	mutex_lock(&kvm_lock);
+	list_for_each_entry(kvm, &vm_list, vm_list) {
+		unmap_all_vpes(kvm);
+	}
+	mutex_unlock(&kvm_lock);
+}
+
 /*
  * vgic_v3_save_pending_tables - Save the pending tables into guest RAM
  * kvm lock and all vcpu lock must be held
diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 4a34f7f0a864..e850ee860238 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -442,6 +442,8 @@ int vgic_v4_put(struct kvm_vcpu *vcpu);
 
 bool vgic_state_is_nested(struct kvm_vcpu *vcpu);
 
+void kvm_vgic_v3_shutdown(void);
+
 /* CPU HP callbacks */
 void kvm_vgic_cpu_up(void);
 void kvm_vgic_cpu_down(void);
-- 
2.49.0



* Re: [RFC PATCH 2/2] KVM: arm64: vgic-its: Unmap all vPEs on shutdown
  2025-06-23 13:27 ` [RFC PATCH 2/2] KVM: arm64: vgic-its: Unmap all vPEs on shutdown David Woodhouse
@ 2025-06-23 16:38   ` David Woodhouse
  2025-07-22 10:35     ` David Woodhouse
  2025-07-22 22:46   ` Oliver Upton
  1 sibling, 1 reply; 6+ messages in thread
From: David Woodhouse @ 2025-06-23 16:38 UTC (permalink / raw)
  To: Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
	Zenghui Yu, Catalin Marinas, Will Deacon, Paolo Bonzini,
	Sebastian Ott, Andre Przywara, Thorsten Blum, Shameer Kolothum,
	linux-arm-kernel, kvmarm, linux-kernel, kvm

On Mon, 2025-06-23 at 14:27 +0100, David Woodhouse wrote:
> From: David Woodhouse <dwmw@amazon.co.uk>
> 
> We observed systems going dark on kexec, due to corruption of the new
> kernel's text (and sometimes the initrd). This was eventually determined
> to be caused by the vLPI pending tables used by the GIC in the previous
> kernel, which were not being quiesced properly.

FWIW this is a previous hack we attempted which *didn't* work. (For
illustration only; ignore the syscore .kexec hook. We addressed that
differently in the end with
https://lore.kernel.org/kexec/20231213064004.2419447-1-jgowans@amazon.com/ )

At the point where the its_kexec() hook in this patch has completed, we
poisoned the (ex-) vLPI pending tables and then scanned for corruption
in them. We saw the same characteristic pattern of corruption which had
been breaking the next kernel after kexec: 32 bytes copied from offset
0 to offset 32 in a page, followed by bytes 0, 1, 32, 33, 34, 35 being
zeroed.
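
A poison-and-scan check of this kind can be sketched in userspace C as follows.
This is only an illustration under stated assumptions: the 4096-byte page size,
the offset-dependent poison value and every function name are hypothetical, and
the actual debugging code is not shown in this message. An offset-dependent
poison is used because a uniform fill would hide a copy from offset 0 to
offset 32.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_BYTES 4096u	/* assumed page size */

/* Offset-dependent poison, so intra-page copies are detectable. */
static uint8_t poison_byte(size_t off)
{
	return (uint8_t)(0xA5u ^ off);
}

void poison_page(uint8_t *page)
{
	assert(page != NULL);
	for (size_t off = 0; off < PAGE_BYTES; off++)
		page[off] = poison_byte(off);
}

/*
 * Count bytes that no longer hold their poison value. If any are found
 * and first_bad is non-NULL, the lowest bad offset is stored there.
 */
size_t scan_page(const uint8_t *page, size_t *first_bad)
{
	size_t bad = 0;

	for (size_t off = 0; off < PAGE_BYTES; off++) {
		if (page[off] != poison_byte(off)) {
			if (bad == 0 && first_bad)
				*first_bad = off;
			bad++;
		}
	}
	return bad;
}

/* Replays the write pattern described above, to exercise the scanner. */
void simulate_reported_pattern(uint8_t *page)
{
	static const size_t zeroed[] = { 0, 1, 32, 33, 34, 35 };

	memcpy(page + 32, page, 32);
	for (size_t i = 0; i < sizeof(zeroed) / sizeof(zeroed[0]); i++)
		page[zeroed[i]] = 0;
}
```

With this poison, a freshly poisoned page scans clean, and replaying the
pattern makes the scanner flag bytes 0 and 1 plus the whole of bytes 32 to 63.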

Adding a few milliseconds of sleep before the poisoning was enough to
make the problem go away, as was the patch which calls unmap_all_vpes()
for every VM.

Of course, if the GIC were behind an IOMMU as all DMA-capable devices
should be, this might never have happened...

diff --git a/drivers/irqchip/irq-gic-common.h b/drivers/irqchip/irq-gic-common.h
index f407cce9ecaa..a4fde376d214 100644
--- a/drivers/irqchip/irq-gic-common.h
+++ b/drivers/irqchip/irq-gic-common.h
@@ -19,6 +19,12 @@ struct gic_quirk {
 	u32 mask;
 };
 
+struct redist_region {
+	void __iomem		*redist_base;
+	phys_addr_t		phys_base;
+	bool			single_redist;
+};
+
 int gic_configure_irq(unsigned int irq, unsigned int type,
                        void __iomem *base, void (*sync_access)(void));
 void gic_dist_config(void __iomem *base, int gic_irqs,
@@ -33,4 +39,6 @@ void gic_enable_of_quirks(const struct device_node *np,
 #define RDIST_FLAGS_RD_TABLES_PREALLOCATED     (1 << 1)
 #define RDIST_FLAGS_FORCE_NON_SHAREABLE        (1 << 2)
 
+int gic_iterate_rdists(int (*fn)(struct redist_region *, void __iomem *));
+
 #endif /* _IRQ_GIC_COMMON_H */
diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 638f7eb033ad..d106b6ccca8b 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -4902,6 +4902,51 @@ static void its_enable_quirks(struct its_node *its)
 				     its_quirks, its);
 }
 
+static int disable_vpes(struct redist_region *region, void __iomem *ptr)
+{
+	u64 typer;
+	u64 val;
+
+	typer = gic_read_typer(ptr + GICR_TYPER);
+
+	if (!((typer & GICR_TYPER_VLPIS) && (typer & GICR_TYPER_RVPEID)))
+		return 1;
+
+	/* Deactivate any present vPE */
+	its_clear_vpend_valid(ptr + SZ_128K, 0, GICR_VPENDBASER_PendingLast);
+
+	/* Mark the VPE table as invalid */
+	val = gicr_read_vpropbaser(ptr + SZ_128K + GICR_VPROPBASER);
+	val &= ~GICR_VPROPBASER_4_1_VALID;
+	gicr_write_vpropbaser(val, ptr + SZ_128K + GICR_VPROPBASER);
+
+	/* Disable next redistributor */
+	return 1;
+}
+
+static int its_kexec(void)
+{
+	int err = 0, err_return = 0;
+	struct its_node *its;
+
+	raw_spin_lock(&its_lock);
+
+	list_for_each_entry(its, &its_nodes, entry) {
+		err = its_force_quiescent(its->base);
+		if (err) {
+			pr_err("ITS@%pa: failed to quiesce: %d\n",
+			       &its->phys_base, err);
+			err_return = -EBUSY;
+		}
+	}
+
+	gic_iterate_rdists(disable_vpes);
+
+	raw_spin_unlock(&its_lock);
+
+	return err_return;
+}
+
 static int its_save_disable(void)
 {
 	struct its_node *its;
@@ -5001,6 +5046,7 @@ static void its_restore_enable(void)
 static struct syscore_ops its_syscore_ops = {
 	.suspend = its_save_disable,
 	.resume = its_restore_enable,
+	.kexec = its_kexec,
 };
 
 static void __init __iomem *its_map_one(struct resource *res, int *err)
diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index 50143de1791d..2014c5a75a6e 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -46,12 +46,6 @@
 
 #define GIC_IRQ_TYPE_PARTITION	(GIC_IRQ_TYPE_LPI + 1)
 
-struct redist_region {
-	void __iomem		*redist_base;
-	phys_addr_t		phys_base;
-	bool			single_redist;
-};
-
 struct gic_chip_data {
 	struct fwnode_handle	*fwnode;
 	phys_addr_t		dist_phys_base;
@@ -968,7 +962,7 @@ static void __init gic_dist_init(void)
 		gic_write_irouter(affinity, base + GICD_IROUTERnE + i * 8);
 }
 
-static int gic_iterate_rdists(int (*fn)(struct redist_region *, void __iomem *))
+int gic_iterate_rdists(int (*fn)(struct redist_region *, void __iomem *))
 {
 	int ret = -ENODEV;
 	int i;



* Re: [RFC PATCH 2/2] KVM: arm64: vgic-its: Unmap all vPEs on shutdown
  2025-06-23 16:38   ` David Woodhouse
@ 2025-07-22 10:35     ` David Woodhouse
  0 siblings, 0 replies; 6+ messages in thread
From: David Woodhouse @ 2025-07-22 10:35 UTC (permalink / raw)
  To: Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
	Zenghui Yu, Catalin Marinas, Will Deacon, Paolo Bonzini,
	Sebastian Ott, Andre Przywara, Thorsten Blum, Shameer Kolothum,
	linux-arm-kernel, kvmarm, linux-kernel, kvm
  Cc: Saidi, Ali

On Mon, 2025-06-23 at 18:38 +0200, David Woodhouse wrote:
> On Mon, 2025-06-23 at 14:27 +0100, David Woodhouse wrote:
> > From: David Woodhouse <dwmw@amazon.co.uk>
> > 
> > We observed systems going dark on kexec, due to corruption of the new
> > kernel's text (and sometimes the initrd). This was eventually determined
> > to be caused by the vLPI pending tables used by the GIC in the previous
> > kernel, which were not being quiesced properly.
> 
> FWIW this is a previous hack we attempted which *didn't* work. (For
> illustration only; ignore the syscore .kexec hook. We addressed that
> differently in the end with
> https://lore.kernel.org/kexec/20231213064004.2419447-1-jgowans@amazon.com/ )
> 
> At the point where the its_kexec() hook in this patch has completed, we
> poisoned the (ex-) vLPI pending tables and then scanned for corruption
> in them. We saw the same characteristic pattern of corruption which had
> been breaking the next kernel after kexec: 32 bytes copied from offset
> 0 to offset 32 in a page, followed by bytes 0, 1, 32, 33, 34, 35 being
> zeroed.
> 
> Adding a few milliseconds of sleep before the poisoning was enough to
> make the problem go away, as was the patch which calls unmap_all_vpes()
> for every VM.
> 
> Of course, if the GIC were behind an IOMMU as all DMA-capable devices
> should be, this might never have happened...

Any thoughts on this? I'd really appreciate some guidance from Arm on
precisely what is *expected* of the operating system here, to quiesce
the GIC correctly. 


* Re: [RFC PATCH 2/2] KVM: arm64: vgic-its: Unmap all vPEs on shutdown
  2025-06-23 13:27 ` [RFC PATCH 2/2] KVM: arm64: vgic-its: Unmap all vPEs on shutdown David Woodhouse
  2025-06-23 16:38   ` David Woodhouse
@ 2025-07-22 22:46   ` Oliver Upton
  2025-07-23  9:42     ` David Woodhouse
  1 sibling, 1 reply; 6+ messages in thread
From: Oliver Upton @ 2025-07-22 22:46 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Marc Zyngier, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
	Catalin Marinas, Will Deacon, Paolo Bonzini, Sebastian Ott,
	Andre Przywara, Thorsten Blum, Shameer Kolothum, David Woodhouse,
	linux-arm-kernel, kvmarm, linux-kernel, kvm

On Mon, Jun 23, 2025 at 02:27:14PM +0100, David Woodhouse wrote:
> From: David Woodhouse <dwmw@amazon.co.uk>
> 
> We observed systems going dark on kexec, due to corruption of the new
> kernel's text (and sometimes the initrd). This was eventually determined
> to be caused by the vLPI pending tables used by the GIC in the previous
> kernel, which were not being quiesced properly.
> 
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> ---
>  arch/arm64/kvm/arm.c          |  5 +++++
>  arch/arm64/kvm/vgic/vgic-v3.c | 14 ++++++++++++++
>  include/kvm/arm_vgic.h        |  2 ++
>  3 files changed, 21 insertions(+)
> 
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 38a91bb5d4c7..2b76f506bc2d 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -2164,6 +2164,11 @@ void kvm_arch_disable_virtualization_cpu(void)
>  		cpu_hyp_uninit(NULL);
>  }
>  
> +void kvm_arch_shutdown(void)
> +{
> +	kvm_vgic_v3_shutdown();
> +}
> +
>  #ifdef CONFIG_CPU_PM
>  static int hyp_init_cpu_pm_notifier(struct notifier_block *self,
>  				    unsigned long cmd,
> diff --git a/arch/arm64/kvm/vgic/vgic-v3.c b/arch/arm64/kvm/vgic/vgic-v3.c
> index b9ad7c42c5b0..6591e8d84855 100644
> --- a/arch/arm64/kvm/vgic/vgic-v3.c
> +++ b/arch/arm64/kvm/vgic/vgic-v3.c
> @@ -382,6 +382,20 @@ static void map_all_vpes(struct kvm *kvm)
>  						dist->its_vm.vpes[i]->irq));
>  }
>  
> +void kvm_vgic_v3_shutdown(void)
> +{
> +	struct kvm *kvm;
> +
> +	if (!kvm_vgic_global_state.has_gicv4_1)
> +		return;
> +
> +	mutex_lock(&kvm_lock);
> +	list_for_each_entry(kvm, &vm_list, vm_list) {
> +		unmap_all_vpes(kvm);
> +	}
> +	mutex_unlock(&kvm_lock);
> +}
> +

This presumes the vCPUs have already been quiesced which I'm guessing
is the case for you. The vPEs need to be made nonresident from the
redistributors prior to unmapping from the ITS to avoid consuming
unknown vPE state (IHI0069H.b 8.6.2).

So we'd probably need to deschedule the vPE in
kvm_arch_disable_virtualization_cpu() along with some awareness of
'kvm_rebooting'.
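
The ordering constraint can be illustrated with a toy model in plain C. This is
purely a sketch: struct toy_vpe and the toy_* helpers are hypothetical, touch no
hardware, and only encode the rule that a vPE must be made nonresident before
it is unmapped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a vPE; not the kernel's struct its_vpe. */
struct toy_vpe {
	bool resident;	/* currently scheduled on a redistributor */
	bool mapped;	/* currently mapped in the ITS */
};

void toy_vpe_make_nonresident(struct toy_vpe *vpe)
{
	vpe->resident = false;
}

/* Refuses to unmap a vPE that is still resident. */
int toy_vpe_unmap(struct toy_vpe *vpe)
{
	if (vpe->resident)
		return -1;
	vpe->mapped = false;
	return 0;
}

/* Phase 1: deschedule every vPE. Phase 2: unmap them. */
int toy_shutdown_all(struct toy_vpe *vpes, size_t n)
{
	assert(vpes != NULL || n == 0);

	for (size_t i = 0; i < n; i++)
		toy_vpe_make_nonresident(&vpes[i]);

	for (size_t i = 0; i < n; i++) {
		if (toy_vpe_unmap(&vpes[i]))
			return -1;
	}
	return 0;
}
```

In this model, unmapping a resident vPE fails and leaves it mapped, while the
two-phase shutdown succeeds for any initial state.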

Thanks,
Oliver

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [RFC PATCH 2/2] KVM: arm64: vgic-its: Unmap all vPEs on shutdown
  2025-07-22 22:46   ` Oliver Upton
@ 2025-07-23  9:42     ` David Woodhouse
  0 siblings, 0 replies; 6+ messages in thread
From: David Woodhouse @ 2025-07-23  9:42 UTC (permalink / raw)
  To: Oliver Upton
  Cc: Marc Zyngier, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
	Catalin Marinas, Will Deacon, Paolo Bonzini, Sebastian Ott,
	Andre Przywara, Thorsten Blum, Shameer Kolothum, linux-arm-kernel,
	kvmarm, linux-kernel, kvm, Saidi, Ali

On Tue, 2025-07-22 at 15:46 -0700, Oliver Upton wrote:
> On Mon, Jun 23, 2025 at 02:27:14PM +0100, David Woodhouse wrote:
> > From: David Woodhouse <dwmw@amazon.co.uk>
> > 
> > We observed systems going dark on kexec, due to corruption of the new
> > kernel's text (and sometimes the initrd). This was eventually determined
> > to be caused by the vLPI pending tables used by the GIC in the previous
> > kernel, which were not being quiesced properly.
> > 
> > Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> > ---
> >  arch/arm64/kvm/arm.c          |  5 +++++
> >  arch/arm64/kvm/vgic/vgic-v3.c | 14 ++++++++++++++
> >  include/kvm/arm_vgic.h        |  2 ++
> >  3 files changed, 21 insertions(+)
> > 
> > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > index 38a91bb5d4c7..2b76f506bc2d 100644
> > --- a/arch/arm64/kvm/arm.c
> > +++ b/arch/arm64/kvm/arm.c
> > @@ -2164,6 +2164,11 @@ void kvm_arch_disable_virtualization_cpu(void)
> >  		cpu_hyp_uninit(NULL);
> >  }
> >  
> > +void kvm_arch_shutdown(void)
> > +{
> > +	kvm_vgic_v3_shutdown();
> > +}
> > +
> >  #ifdef CONFIG_CPU_PM
> >  static int hyp_init_cpu_pm_notifier(struct notifier_block *self,
> >  				    unsigned long cmd,
> > diff --git a/arch/arm64/kvm/vgic/vgic-v3.c b/arch/arm64/kvm/vgic/vgic-v3.c
> > index b9ad7c42c5b0..6591e8d84855 100644
> > --- a/arch/arm64/kvm/vgic/vgic-v3.c
> > +++ b/arch/arm64/kvm/vgic/vgic-v3.c
> > @@ -382,6 +382,20 @@ static void map_all_vpes(struct kvm *kvm)
> >  						dist->its_vm.vpes[i]->irq));
> >  }
> >  
> > +void kvm_vgic_v3_shutdown(void)
> > +{
> > +	struct kvm *kvm;
> > +
> > +	if (!kvm_vgic_global_state.has_gicv4_1)
> > +		return;
> > +
> > +	mutex_lock(&kvm_lock);
> > +	list_for_each_entry(kvm, &vm_list, vm_list) {
> > +		unmap_all_vpes(kvm);
> > +	}
> > +	mutex_unlock(&kvm_lock);
> > +}
> > +
> 
> This presumes the vCPUs have already been quiesced which I'm guessing
> is the case for you.

Yeah. With KHO we aspire to be able to do a kexec with some pCPUs
actually still *running* guest vCPUs instead of pointlessly taking them
offline just for *one* pCPU to do the kexec work. But that's a way off
yet, and in that case all these tables will need to be in memory which
persists across the kexec so we won't need to quiesce anything. But
those fantasies are a way off for now...

> The vPEs need to be made nonresident from the
> redistributors prior to unmapping from the ITS to avoid consuming
> unknown vPE state (IHI0069H.b 8.6.2).

Right, I think that's what's being done in the second patch I sent,
saying, "FWIW this is a previous hack we attempted which *didn't* work".
To be clear, we do still *have* that hack, in addition to the explicit
unmap_all_vpes() call.

I would love a definitive answer about what the hypervisor is
*expected* to do here. It's very suboptimal that the GIC doesn't
actually stop accessing memory when it is quiesced, and that the GIC
doesn't live behind an IOMMU which would at least allow stray DMA to be
prevented.

> So we'd probably need to deschedule the vPE in
> kvm_arch_disable_virtualization_cpu() along with some awareness of
> 'kvm_rebooting'.

Yeah, I also pondered doing it *all* from there, but it looked like it
would have required some kind of counting to work out when the *last*
CPU was taken down as there's only a per-CPU arch hook. So I didn't
bother with that for the early RFC.
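
The kind of counting meant here could look roughly like the following C11
sketch. It is an assumption-laden illustration, not proposed kernel code: the
counter and both function names are hypothetical, and the kernel would use its
own atomic_t helpers rather than <stdatomic.h>.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Number of CPUs that currently have virtualization enabled. */
static atomic_int cpus_virt_enabled;

/* Called from the (hypothetical) per-CPU enable path. */
void cpu_enable_virt(void)
{
	atomic_fetch_add(&cpus_virt_enabled, 1);
}

/*
 * Called from the (hypothetical) per-CPU disable path. Returns true only
 * for the one caller that takes the count from 1 to 0, which is where
 * the system-wide teardown would go.
 */
bool cpu_disable_virt_is_last(void)
{
	int prev = atomic_fetch_sub(&cpus_virt_enabled, 1);

	assert(prev > 0);
	return prev == 1;
}
```

Because atomic_fetch_sub() returns the previous value, exactly one caller
observes 1 even if several CPUs go down concurrently, so the final teardown
would run once.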

Note that this issue with the GIC's scattershot DMA doesn't only affect
KVM hosts and the vLPI pending tables. We *also* have similar issues on
the guest side with hibernate. The boot kernel sends a MAPD command to
set up an ITT, then transfers control back to the resumed kernel which
had previously set up that ITT at a *different* address, and nobody
ever tells the (v)GIC. Which means that if the host subsequently
serializes that guest for LU/LM, it corrupts memory that the running
kernel didn't expect it to. I guess this would happen for hibernate on
real hardware too? And maybe even kexec but that one just hasn't bitten
us yet?



