linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [bug report] GICv4.1: vSGI remains pending across the guest reset
@ 2023-12-14 12:13 Kunkun Jiang
  2023-12-17 11:26 ` Marc Zyngier
  0 siblings, 1 reply; 9+ messages in thread
From: Kunkun Jiang @ 2023-12-14 12:13 UTC (permalink / raw)
  To: Marc Zyngier, Oliver Upton, James Morse, Suzuki K Poulose,
	Zenghui Yu, Catalin Marinas, Will Deacon, Jean-Philippe Brucker
  Cc: moderated list:ARM SMMU DRIVERS, kvmarm, open list,
	wanghaibin.wang@huawei.com

Hi list,

We have observed on GICv4.1 systems that, after a guest reset, the
secondary VCPU would receive an IPI_CPU_STOP accidently and failed to
come online eventually.

| Guest User-space
|
| reset (with a vSGI pending in the
| hardware) [0]
|
| disable the distributor (write 0
| into GICD_CTLR) [1]
|
| clear pending status (write 0 into
| GICR_ISPENDR0 for each RD) [2]
|
| disable the distributor (write 0
| into GICD_CTLR) [3]
|
| enable the distributor with ARE,
| Group1 and nASSGIreq [4]

The problem is that even if user-space tries to disable the distributor
and clear pending bits for all SGIs, we don't actually propogate it into
HW (we only record it via vgic_dist->{enabled, nassgireq} and
vgic_irq->pending_latch) and the vSGI remains pending across the guest
reset.

And when we're at [4], all vSGI's vgic_irq->hw are *true* and
vgic_v4_enable_vsgis() becomes a nop.. That's not good.

The following solution can solve the problem. Not sure if this is a good
solution.Sent out early for suggestions or solutions.

diff --git a/arch/arm64/kvm/vgic/vgic-mmio-v3.c 
b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
index 89117ba2528a..3678ab33f5b9 100644
--- a/arch/arm64/kvm/vgic/vgic-mmio-v3.c
+++ b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
@@ -374,6 +374,10 @@ static int vgic_v3_uaccess_write_pending(struct 
kvm_vcpu *vcpu,
              irq->pending_latch = true;
              vgic_queue_irq_unlock(vcpu->kvm, irq, flags);
          } else {
+             if (irq->hw && vgic_irq_is_sgi(irq->intid))
+                 irq_set_irqchip_state(irq->host_irq,
+                              IRQCHIP_STATE_PENDING,
+                              false);
              irq->pending_latch = false;
              raw_spin_unlock_irqrestore(&irq->irq_lock, flags);
          }


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [bug report] GICv4.1: vSGI remains pending across the guest reset
  2023-12-14 12:13 [bug report] GICv4.1: vSGI remains pending across the guest reset Kunkun Jiang
@ 2023-12-17 11:26 ` Marc Zyngier
  2023-12-17 17:33   ` Oliver Upton
  2023-12-19  7:24   ` Kunkun Jiang
  0 siblings, 2 replies; 9+ messages in thread
From: Marc Zyngier @ 2023-12-17 11:26 UTC (permalink / raw)
  To: Kunkun Jiang
  Cc: Oliver Upton, James Morse, Suzuki K Poulose, Zenghui Yu,
	Catalin Marinas, Will Deacon, Jean-Philippe Brucker,
	moderated list:ARM SMMU DRIVERS, kvmarm, open list,
	wanghaibin.wang@huawei.com

On Thu, 14 Dec 2023 12:13:57 +0000,
Kunkun Jiang <jiangkunkun@huawei.com> wrote:
> 
> Hi list,
> 
> We have observed on GICv4.1 systems that, after a guest reset, the
> secondary VCPU would receive an IPI_CPU_STOP accidently and failed to
> come online eventually.
> 
> | Guest User-space

Guest userspace????

> |
> | reset (with a vSGI pending in the
> | hardware) [0]
> |
> | disable the distributor (write 0
> | into GICD_CTLR) [1]
> |
> | clear pending status (write 0 into
> | GICR_ISPENDR0 for each RD) [2]

This cannot clear the pending bits. Writing 0 to any of the
ISPEND/ICPEND registers is effectively a NOP.

If you want to remove the pending SGIs, you need to write a bunch of
1s to ICPENDR0.

> |
> | disable the distributor (write 0
> | into GICD_CTLR) [3]
> |
> | enable the distributor with ARE,
> | Group1 and nASSGIreq [4]
> 
> The problem is that even if user-space tries to disable the distributor

I don't understand what userspace does here. Why is it significant
that it is an EL0 access? I don't know of any SW doing that, but even
if it existed, this should make no difference.

> and clear pending bits for all SGIs, we don't actually propogate it into
> HW (we only record it via vgic_dist->{enabled, nassgireq} and
> vgic_irq->pending_latch) and the vSGI remains pending across the guest
> reset.
>
> And when we're at [4], all vSGI's vgic_irq->hw are *true* and
> vgic_v4_enable_vsgis() becomes a nop.. That's not good.
> 
> The following solution can solve the problem. Not sure if this is a good
> solution.Sent out early for suggestions or solutions.
> 
> diff --git a/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> index 89117ba2528a..3678ab33f5b9 100644
> --- a/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> +++ b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> @@ -374,6 +374,10 @@ static int vgic_v3_uaccess_write_pending(struct
> kvm_vcpu *vcpu,
>              irq->pending_latch = true;
>              vgic_queue_irq_unlock(vcpu->kvm, irq, flags);
>          } else {
> +             if (irq->hw && vgic_irq_is_sgi(irq->intid))
> +                 irq_set_irqchip_state(irq->host_irq,
> +                              IRQCHIP_STATE_PENDING,
> +                              false);
>              irq->pending_latch = false;
>              raw_spin_unlock_irqrestore(&irq->irq_lock, flags);
>          }
> 

But this has *nothing* to do with the guest. This is the *host*
userspace performing a write to the redistributor view, which has
different semantics. Which is why your earlier description made no
sense to me.

I think the problem is slightly larger than what you describe. A write
to ISPENDR0 should be propagated to the ITS for any values of the
latch, just like this happens on enabling HW-backed SGIs.

Can you please give this a go?

Thanks,

	M.

From 9932d74392d969057e84bc3c18bd15f1769b6211 Mon Sep 17 00:00:00 2001
From: Marc Zyngier <maz@kernel.org>
Date: Sun, 17 Dec 2023 11:15:09 +0000
Subject: [PATCH] KVM: arm64: vgic-v4: Restore pending state on host userspace
 write

When the VMM writes to ISPENDR0 to set the state pending state of
an SGI, we fail to convey this to the HW if this SGI is already
backed by a GICv4.1 vSGI.

This is a bit of a corner case, as this would only occur if the
vgic state is changed on an already running VM, but this can
apparently happen across a guest reset driven by the VMM.

Fix this by always writing out the pending_latch value to the
HW, and reseting it to false.

Reported-by: Kunkun Jiang <jiangkunkun@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/7e7f2c0c-448b-10a9-8929-4b8f4f6e2a32@huawei.com
---
 arch/arm64/kvm/vgic/vgic-mmio-v3.c | 27 +++++++++++++++++----------
 1 file changed, 17 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/kvm/vgic/vgic-mmio-v3.c b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
index a764b0ab8bf9..2533f264b954 100644
--- a/arch/arm64/kvm/vgic/vgic-mmio-v3.c
+++ b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
@@ -365,19 +365,26 @@ static int vgic_v3_uaccess_write_pending(struct kvm_vcpu *vcpu,
 		struct vgic_irq *irq = vgic_get_irq(vcpu->kvm, vcpu, intid + i);
 
 		raw_spin_lock_irqsave(&irq->irq_lock, flags);
-		if (test_bit(i, &val)) {
-			/*
-			 * pending_latch is set irrespective of irq type
-			 * (level or edge) to avoid dependency that VM should
-			 * restore irq config before pending info.
-			 */
-			irq->pending_latch = true;
-			vgic_queue_irq_unlock(vcpu->kvm, irq, flags);
-		} else {
+
+		/*
+		 * pending_latch is set irrespective of irq type
+		 * (level or edge) to avoid dependency that VM should
+		 * restore irq config before pending info.
+		 */
+		irq->pending_latch = test_bit(i, &val);
+
+		if (irq->hw && vgic_irq_is_sgi(irq->intid)) {
+			irq_set_irqchip_state(irq->host_irq,
+					      IRQCHIP_STATE_PENDING,
+					      irq->pending_latch);
 			irq->pending_latch = false;
-			raw_spin_unlock_irqrestore(&irq->irq_lock, flags);
 		}
 
+		if (irq->pending_latch)
+			vgic_queue_irq_unlock(vcpu->kvm, irq, flags);
+		else
+			raw_spin_unlock_irqrestore(&irq->irq_lock, flags);
+
 		vgic_put_irq(vcpu->kvm, irq);
 	}
 
-- 
2.39.2


-- 
Without deviation from the norm, progress is not possible.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [bug report] GICv4.1: vSGI remains pending across the guest reset
  2023-12-17 11:26 ` Marc Zyngier
@ 2023-12-17 17:33   ` Oliver Upton
  2023-12-17 17:34     ` Oliver Upton
  2023-12-19  7:24   ` Kunkun Jiang
  1 sibling, 1 reply; 9+ messages in thread
From: Oliver Upton @ 2023-12-17 17:33 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Kunkun Jiang, James Morse, Suzuki K Poulose, Zenghui Yu,
	Catalin Marinas, Will Deacon, Jean-Philippe Brucker,
	moderated list:ARM SMMU DRIVERS, kvmarm, open list,
	wanghaibin.wang@huawei.com

On Sun, Dec 17, 2023 at 11:26:15AM +0000, Marc Zyngier wrote:

[...]

> But this has *nothing* to do with the guest. This is the *host*
> userspace performing a write to the redistributor view, which has
> different semantics. Which is why your earlier description made no
> sense to me.
> 
> I think the problem is slightly larger than what you describe. A write
> to ISPENDR0 should be propagated to the ITS for any values of the
> latch, just like this happens on enabling HW-backed SGIs.
> 
> Can you please give this a go?

What do you think about using this as an opportunity for a bit of
cleanup? It'd be nice unify the various MMIO and uaccess handlers for
SPENDING + CPENDING while being careful about the arch_timer interrupt.

	clear = ~val;
	vgic_uaccess_write_spending(val);
	vgic_uaccess_write_cpending(clear);

Happy to take your fix too, especially in case I missed something
obvious :)

[*] https://git.kernel.org/pub/scm/linux/kernel/git/oupton/linux.git/log/?h=kvm-arm64/vsgi-spending-fixes

-- 
Thanks,
Oliver

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [bug report] GICv4.1: vSGI remains pending across the guest reset
  2023-12-17 17:33   ` Oliver Upton
@ 2023-12-17 17:34     ` Oliver Upton
  2023-12-17 18:52       ` Marc Zyngier
  0 siblings, 1 reply; 9+ messages in thread
From: Oliver Upton @ 2023-12-17 17:34 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Kunkun Jiang, James Morse, Suzuki K Poulose, Zenghui Yu,
	Catalin Marinas, Will Deacon, Jean-Philippe Brucker,
	moderated list:ARM SMMU DRIVERS, kvmarm, open list,
	wanghaibin.wang@huawei.com

On Sun, Dec 17, 2023 at 05:33:16PM +0000, Oliver Upton wrote:
> On Sun, Dec 17, 2023 at 11:26:15AM +0000, Marc Zyngier wrote:
> 
> [...]
> 
> > But this has *nothing* to do with the guest. This is the *host*
> > userspace performing a write to the redistributor view, which has
> > different semantics. Which is why your earlier description made no
> > sense to me.
> > 
> > I think the problem is slightly larger than what you describe. A write
> > to ISPENDR0 should be propagated to the ITS for any values of the
> > latch, just like this happens on enabling HW-backed SGIs.
> > 
> > Can you please give this a go?
> 
> What do you think about using this as an opportunity for a bit of
> cleanup? It'd be nice unify the various MMIO and uaccess handlers for
> SPENDING + CPENDING while being careful about the arch_timer interrupt.

Cut myself off... Meant to say that user writes to SPENDING for GICv3
can then be treated as:

> 	clear = ~val;
> 	vgic_uaccess_write_spending(val);
> 	vgic_uaccess_write_cpending(clear);

-- 
Thanks,
Oliver

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [bug report] GICv4.1: vSGI remains pending across the guest reset
  2023-12-17 17:34     ` Oliver Upton
@ 2023-12-17 18:52       ` Marc Zyngier
  2023-12-18 17:20         ` Oliver Upton
  0 siblings, 1 reply; 9+ messages in thread
From: Marc Zyngier @ 2023-12-17 18:52 UTC (permalink / raw)
  To: Oliver Upton
  Cc: Kunkun Jiang, James Morse, Suzuki K Poulose, Zenghui Yu,
	Catalin Marinas, Will Deacon, Jean-Philippe Brucker,
	moderated list:ARM SMMU DRIVERS, kvmarm, open list,
	wanghaibin.wang@huawei.com

On Sun, 17 Dec 2023 17:34:38 +0000,
Oliver Upton <oliver.upton@linux.dev> wrote:
> 
> On Sun, Dec 17, 2023 at 05:33:16PM +0000, Oliver Upton wrote:
> > On Sun, Dec 17, 2023 at 11:26:15AM +0000, Marc Zyngier wrote:
> > 
> > [...]
> > 
> > > But this has *nothing* to do with the guest. This is the *host*
> > > userspace performing a write to the redistributor view, which has
> > > different semantics. Which is why your earlier description made no
> > > sense to me.
> > > 
> > > I think the problem is slightly larger than what you describe. A write
> > > to ISPENDR0 should be propagated to the ITS for any values of the
> > > latch, just like this happens on enabling HW-backed SGIs.
> > > 
> > > Can you please give this a go?
> > 
> > What do you think about using this as an opportunity for a bit of
> > cleanup? It'd be nice unify the various MMIO and uaccess handlers for
> > SPENDING + CPENDING while being careful about the arch_timer interrupt.

What is special about the timer interrupt?

> Cut myself off... Meant to say that user writes to SPENDING for GICv3
> can then be treated as:
> 
> > 	clear = ~val;
> > 	vgic_uaccess_write_spending(val);
> > 	vgic_uaccess_write_cpending(clear);

Could be. But I'd rather have separate fixes from more invasive
reworks.  Specially given that we have had multiple ugly bugs around
this code in the past, which is why we ended up splitting userspace
from guest accessors.

	M.

-- 
Without deviation from the norm, progress is not possible.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [bug report] GICv4.1: vSGI remains pending across the guest reset
  2023-12-17 18:52       ` Marc Zyngier
@ 2023-12-18 17:20         ` Oliver Upton
  2023-12-18 17:29           ` Marc Zyngier
  2023-12-19  9:32           ` Zenghui Yu
  0 siblings, 2 replies; 9+ messages in thread
From: Oliver Upton @ 2023-12-18 17:20 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Kunkun Jiang, James Morse, Suzuki K Poulose, Zenghui Yu,
	Catalin Marinas, Will Deacon, Jean-Philippe Brucker,
	moderated list:ARM SMMU DRIVERS, kvmarm, open list,
	wanghaibin.wang@huawei.com

On Sun, Dec 17, 2023 at 06:52:55PM +0000, Marc Zyngier wrote:
> On Sun, 17 Dec 2023 17:34:38 +0000,
> Oliver Upton <oliver.upton@linux.dev> wrote:
> > 
> > On Sun, Dec 17, 2023 at 05:33:16PM +0000, Oliver Upton wrote:
> > > On Sun, Dec 17, 2023 at 11:26:15AM +0000, Marc Zyngier wrote:
> > > 
> > > [...]
> > > 
> > > > But this has *nothing* to do with the guest. This is the *host*
> > > > userspace performing a write to the redistributor view, which has
> > > > different semantics. Which is why your earlier description made no
> > > > sense to me.
> > > > 
> > > > I think the problem is slightly larger than what you describe. A write
> > > > to ISPENDR0 should be propagated to the ITS for any values of the
> > > > latch, just like this happens on enabling HW-backed SGIs.
> > > > 
> > > > Can you please give this a go?
> > > 
> > > What do you think about using this as an opportunity for a bit of
> > > cleanup? It'd be nice unify the various MMIO and uaccess handlers for
> > > SPENDING + CPENDING while being careful about the arch_timer interrupt.
> 
> What is special about the timer interrupt?

Isn't that the case where we have a physical IRQ mapped and wind up
forwarding state to the physical GIC?

> Could be. But I'd rather have separate fixes from more invasive
> reworks.  Specially given that we have had multiple ugly bugs around
> this code in the past, which is why we ended up splitting userspace
> from guest accessors.

Fine by me. I had felt like a common helper w/ the user v. guest
exclusions is a bit easier to understand than diffing two very similar
functions, but it isn't a big deal.

Anyway, I'm happy with your fix. I'd like Kunkun to give it a go but
either way I can pick it up for 6.7.

-- 
Thanks,
Oliver

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [bug report] GICv4.1: vSGI remains pending across the guest reset
  2023-12-18 17:20         ` Oliver Upton
@ 2023-12-18 17:29           ` Marc Zyngier
  2023-12-19  9:32           ` Zenghui Yu
  1 sibling, 0 replies; 9+ messages in thread
From: Marc Zyngier @ 2023-12-18 17:29 UTC (permalink / raw)
  To: Oliver Upton
  Cc: Kunkun Jiang, James Morse, Suzuki K Poulose, Zenghui Yu,
	Catalin Marinas, Will Deacon, Jean-Philippe Brucker,
	moderated list:ARM SMMU DRIVERS, kvmarm, open list,
	wanghaibin.wang@huawei.com

On Mon, 18 Dec 2023 17:20:04 +0000,
Oliver Upton <oliver.upton@linux.dev> wrote:
> 
> On Sun, Dec 17, 2023 at 06:52:55PM +0000, Marc Zyngier wrote:
> > On Sun, 17 Dec 2023 17:34:38 +0000,
> > Oliver Upton <oliver.upton@linux.dev> wrote:
> > > 
> > > On Sun, Dec 17, 2023 at 05:33:16PM +0000, Oliver Upton wrote:
> > > > On Sun, Dec 17, 2023 at 11:26:15AM +0000, Marc Zyngier wrote:
> > > > 
> > > > [...]
> > > > 
> > > > > But this has *nothing* to do with the guest. This is the *host*
> > > > > userspace performing a write to the redistributor view, which has
> > > > > different semantics. Which is why your earlier description made no
> > > > > sense to me.
> > > > > 
> > > > > I think the problem is slightly larger than what you describe. A write
> > > > > to ISPENDR0 should be propagated to the ITS for any values of the
> > > > > latch, just like this happens on enabling HW-backed SGIs.
> > > > > 
> > > > > Can you please give this a go?
> > > > 
> > > > What do you think about using this as an opportunity for a bit of
> > > > cleanup? It'd be nice unify the various MMIO and uaccess handlers for
> > > > SPENDING + CPENDING while being careful about the arch_timer interrupt.
> > 
> > What is special about the timer interrupt?
> 
> Isn't that the case where we have a physical IRQ mapped and wind up
> forwarding state to the physical GIC?

Indeed. But that's not specific to the timer. That's a general
infrastructure that NV also uses it for the GIC maintenance interrupt.

> 
> > Could be. But I'd rather have separate fixes from more invasive
> > reworks.  Specially given that we have had multiple ugly bugs around
> > this code in the past, which is why we ended up splitting userspace
> > from guest accessors.
> 
> Fine by me. I had felt like a common helper w/ the user v. guest
> exclusions is a bit easier to understand than diffing two very similar
> functions, but it isn't a big deal.

To be clear, I'm not objecting to that rework. It's just that we
should carefully review these patches on the list, as they would have
a wider ranging impact than a pure GICv4.1 change.

> Anyway, I'm happy with your fix. I'd like Kunkun to give it a go but
> either way I can pick it up for 6.7.

Yup, same here. But we also shouldn't hold the 6.7 fixes for too long.
If we don't get feedback from Kunkun shortly, I'd suggest to move this
patch to 6.8, and at this point your approach makes complete sense.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [bug report] GICv4.1: vSGI remains pending across the guest reset
  2023-12-17 11:26 ` Marc Zyngier
  2023-12-17 17:33   ` Oliver Upton
@ 2023-12-19  7:24   ` Kunkun Jiang
  1 sibling, 0 replies; 9+ messages in thread
From: Kunkun Jiang @ 2023-12-19  7:24 UTC (permalink / raw)
  To: Marc Zyngier, Oliver Upton
  Cc: James Morse, Suzuki K Poulose, Zenghui Yu, Catalin Marinas,
	Will Deacon, Jean-Philippe Brucker,
	moderated list:ARM SMMU DRIVERS, kvmarm, open list,
	wanghaibin.wang@huawei.com

Hi Marc and Oliver,

On 2023/12/17 19:26, Marc Zyngier wrote:
> On Thu, 14 Dec 2023 12:13:57 +0000,
> Kunkun Jiang <jiangkunkun@huawei.com> wrote:
>> Hi list,
>>
>> We have observed on GICv4.1 systems that, after a guest reset, the
>> secondary VCPU would receive an IPI_CPU_STOP accidently and failed to
>> come online eventually.
>>
>> | Guest User-space
> Guest userspace????
Sorry. The formatting is a bit messy.
|            Guest                               User-space
|
|  reset (with a vSGI pending in the
|  hardware) [0]
|
|                                     disable the distributor (write 0
|                                     into GICD_CTLR) [1]
|
|                                     clear pending status (write 0 into
|                                     GICR_ISPENDR0 for each RD) [2]
|
|  disable the distributor (write 0
|  into GICD_CTLR) [3]
|
|  enable the distributor with ARE,
|  Group1 and nASSGIreq [4]
>> |
>> | reset (with a vSGI pending in the
>> | hardware) [0]
>> |
>> | disable the distributor (write 0
>> | into GICD_CTLR) [1]
>> |
>> | clear pending status (write 0 into
>> | GICR_ISPENDR0 for each RD) [2]
> This cannot clear the pending bits. Writing 0 to any of the
> ISPEND/ICPEND registers is effectively a NOP.
>
> If you want to remove the pending SGIs, you need to write a bunch of
> 1s to ICPENDR0.
>
>> |
>> | disable the distributor (write 0
>> | into GICD_CTLR) [3]
>> |
>> | enable the distributor with ARE,
>> | Group1 and nASSGIreq [4]
>>
>> The problem is that even if user-space tries to disable the distributor
> I don't understand what userspace does here. Why is it significant
> that it is an EL0 access? I don't know of any SW doing that, but even
> if it existed, this should make no difference.
>
>> and clear pending bits for all SGIs, we don't actually propogate it into
>> HW (we only record it via vgic_dist->{enabled, nassgireq} and
>> vgic_irq->pending_latch) and the vSGI remains pending across the guest
>> reset.
>>
>> And when we're at [4], all vSGI's vgic_irq->hw are *true* and
>> vgic_v4_enable_vsgis() becomes a nop.. That's not good.
>>
>> The following solution can solve the problem. Not sure if this is a good
>> solution.Sent out early for suggestions or solutions.
>>
>> diff --git a/arch/arm64/kvm/vgic/vgic-mmio-v3.c
>> b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
>> index 89117ba2528a..3678ab33f5b9 100644
>> --- a/arch/arm64/kvm/vgic/vgic-mmio-v3.c
>> +++ b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
>> @@ -374,6 +374,10 @@ static int vgic_v3_uaccess_write_pending(struct
>> kvm_vcpu *vcpu,
>>               irq->pending_latch = true;
>>               vgic_queue_irq_unlock(vcpu->kvm, irq, flags);
>>           } else {
>> +             if (irq->hw && vgic_irq_is_sgi(irq->intid))
>> +                 irq_set_irqchip_state(irq->host_irq,
>> +                              IRQCHIP_STATE_PENDING,
>> +                              false);
>>               irq->pending_latch = false;
>>               raw_spin_unlock_irqrestore(&irq->irq_lock, flags);
>>           }
>>
> But this has *nothing* to do with the guest. This is the *host*
> userspace performing a write to the redistributor view, which has
> different semantics. Which is why your earlier description made no
> sense to me.
>
> I think the problem is slightly larger than what you describe. A write
> to ISPENDR0 should be propagated to the ITS for any values of the
> latch, just like this happens on enabling HW-backed SGIs.
>
> Can you please give this a go?
I have given it a go. I tested reseting the guest 100 times in succession,
and the problem did not occur again.

Thanks,
Kunkun Jiang
> Thanks,
>
> 	M.
>
> >From 9932d74392d969057e84bc3c18bd15f1769b6211 Mon Sep 17 00:00:00 2001
> From: Marc Zyngier <maz@kernel.org>
> Date: Sun, 17 Dec 2023 11:15:09 +0000
> Subject: [PATCH] KVM: arm64: vgic-v4: Restore pending state on host userspace
>   write
>
> When the VMM writes to ISPENDR0 to set the state pending state of
> an SGI, we fail to convey this to the HW if this SGI is already
> backed by a GICv4.1 vSGI.
>
> This is a bit of a corner case, as this would only occur if the
> vgic state is changed on an already running VM, but this can
> apparently happen across a guest reset driven by the VMM.
>
> Fix this by always writing out the pending_latch value to the
> HW, and reseting it to false.
>
> Reported-by: Kunkun Jiang <jiangkunkun@huawei.com>
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> Link: https://lore.kernel.org/r/7e7f2c0c-448b-10a9-8929-4b8f4f6e2a32@huawei.com
> ---
>   arch/arm64/kvm/vgic/vgic-mmio-v3.c | 27 +++++++++++++++++----------
>   1 file changed, 17 insertions(+), 10 deletions(-)
>
> diff --git a/arch/arm64/kvm/vgic/vgic-mmio-v3.c b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> index a764b0ab8bf9..2533f264b954 100644
> --- a/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> +++ b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> @@ -365,19 +365,26 @@ static int vgic_v3_uaccess_write_pending(struct kvm_vcpu *vcpu,
>   		struct vgic_irq *irq = vgic_get_irq(vcpu->kvm, vcpu, intid + i);
>   
>   		raw_spin_lock_irqsave(&irq->irq_lock, flags);
> -		if (test_bit(i, &val)) {
> -			/*
> -			 * pending_latch is set irrespective of irq type
> -			 * (level or edge) to avoid dependency that VM should
> -			 * restore irq config before pending info.
> -			 */
> -			irq->pending_latch = true;
> -			vgic_queue_irq_unlock(vcpu->kvm, irq, flags);
> -		} else {
> +
> +		/*
> +		 * pending_latch is set irrespective of irq type
> +		 * (level or edge) to avoid dependency that VM should
> +		 * restore irq config before pending info.
> +		 */
> +		irq->pending_latch = test_bit(i, &val);
> +
> +		if (irq->hw && vgic_irq_is_sgi(irq->intid)) {
> +			irq_set_irqchip_state(irq->host_irq,
> +					      IRQCHIP_STATE_PENDING,
> +					      irq->pending_latch);
>   			irq->pending_latch = false;
> -			raw_spin_unlock_irqrestore(&irq->irq_lock, flags);
>   		}
>   
> +		if (irq->pending_latch)
> +			vgic_queue_irq_unlock(vcpu->kvm, irq, flags);
> +		else
> +			raw_spin_unlock_irqrestore(&irq->irq_lock, flags);
> +
>   		vgic_put_irq(vcpu->kvm, irq);
>   	}
>   

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [bug report] GICv4.1: vSGI remains pending across the guest reset
  2023-12-18 17:20         ` Oliver Upton
  2023-12-18 17:29           ` Marc Zyngier
@ 2023-12-19  9:32           ` Zenghui Yu
  1 sibling, 0 replies; 9+ messages in thread
From: Zenghui Yu @ 2023-12-19  9:32 UTC (permalink / raw)
  To: Oliver Upton
  Cc: Marc Zyngier, Kunkun Jiang, James Morse, Suzuki K Poulose,
	Catalin Marinas, Will Deacon, Jean-Philippe Brucker,
	moderated list:ARM SMMU DRIVERS, kvmarm, open list,
	wanghaibin.wang@huawei.com

On 2023/12/19 1:20, Oliver Upton wrote:

> Anyway, I'm happy with your fix. I'd like Kunkun to give it a go but
> either way I can pick it up for 6.7.

+1. When applying, feel free to add my

Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>

and perhaps

Cc: stable@vger.kernel.org # 5.10+

Thanks,
Zenghui

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2023-12-19  9:33 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-12-14 12:13 [bug report] GICv4.1: vSGI remains pending across the guest reset Kunkun Jiang
2023-12-17 11:26 ` Marc Zyngier
2023-12-17 17:33   ` Oliver Upton
2023-12-17 17:34     ` Oliver Upton
2023-12-17 18:52       ` Marc Zyngier
2023-12-18 17:20         ` Oliver Upton
2023-12-18 17:29           ` Marc Zyngier
2023-12-19  9:32           ` Zenghui Yu
2023-12-19  7:24   ` Kunkun Jiang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).