linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH] irqchip/gic-v4.1: Fix GICv4.1 doorbell affinity
@ 2024-01-26  9:17 Kunkun Jiang
  2024-01-26 10:19 ` Marc Zyngier
  0 siblings, 1 reply; 2+ messages in thread
From: Kunkun Jiang @ 2024-01-26  9:17 UTC (permalink / raw)
  To: Marc Zyngier, Oliver Upton, James Morse, Suzuki K Poulose,
	Zenghui Yu, Thomas Gleixner
  Cc: linux-arm-kernel, kvmarm, wanghaibin.wang, Kunkun Jiang

Commit dd3f050a216e ("irqchip/gic-v4.1: Implement the v4.1 flavour
of VMOVP") introduced an optimization: VMOVP can be skipped when moving
a VPE to a CPU whose RD shares its VPE table with the current one.
But when VMOVP is skipped, the affinity recorded in irq_data is still
updated. This leaves the doorbell affinity recorded in irq_data
inconsistent with the actual doorbell affinity.

In a corner case, this may result in lost interrupts:
0. Each CPU die shares a VPE table and contains 32 CPUs:
   die0(CPU0-31) die1(CPU32-63)...
1. The VPE resides on CPU32; the doorbell is affine to CPU32.
2. Move the VPE to CPU33. VMOVP is skipped, so the doorbell is still
   affine to CPU32, but the affinity recorded in irq_data is CPU33.
3. Manually offline CPU32 on the host side:
   'echo 0 > /sys/devices/system/cpu/cpu32/online'
4. Core code does not move the doorbell away from CPU32, since
   irq_data records its affinity as CPU33.
5. Subsequent doorbell interrupts will be lost.

So the affinity recorded in irq_data should not be updated when
skipping VMOVP.
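
The sequence above can be sketched as a small standalone model. This is
not kernel code: struct db_model, model_set_affinity() and the other
helpers are hypothetical names invented for illustration, the
"same die shares a VPE table" rule is taken from step 0, and the
hotplug check is a simplification of what core code does.

```c
#include <assert.h>
#include <stdbool.h>

#define CPUS_PER_DIE 32	/* from the example: 32 CPUs per die */

/* Hypothetical model of one vPE doorbell, not a kernel structure. */
struct db_model {
	int vpe_cpu;		/* CPU the VPE currently resides on */
	int hw_cpu;		/* CPU the doorbell is really delivered to */
	int recorded_cpu;	/* affinity recorded in irq_data */
};

/* Assumption: CPUs on the same die share a VPE table. */
static bool shares_vpe_table(int a, int b)
{
	return a / CPUS_PER_DIE == b / CPUS_PER_DIE;
}

/* Pre-patch behaviour: the record is updated even if VMOVP is skipped. */
static void model_set_affinity(struct db_model *db, int cpu)
{
	int from = db->vpe_cpu;

	assert(cpu >= 0);
	if (from != cpu) {
		db->vpe_cpu = cpu;
		if (!shares_vpe_table(from, cpu))
			db->hw_cpu = cpu;	/* VMOVP issued, doorbell moves */
	}
	db->recorded_cpu = cpu;		/* updated unconditionally */
}

/* Simplification: core code only migrates interrupts whose recorded
 * affinity is the CPU going offline. */
static bool core_would_migrate(const struct db_model *db, int offline_cpu)
{
	return db->recorded_cpu == offline_cpu;
}

/* True if the doorbell targets the dying CPU and nobody will move it. */
static bool doorbell_stranded(const struct db_model *db, int offline_cpu)
{
	return db->hw_cpu == offline_cpu &&
	       !core_would_migrate(db, offline_cpu);
}
```

Starting from { .vpe_cpu = 32, .hw_cpu = 32, .recorded_cpu = 32 },
calling model_set_affinity(&db, 33) leaves hw_cpu at 32 while
recorded_cpu becomes 33, so doorbell_stranded(&db, 32) reports the
situation described in steps 4 and 5.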

Fixes: dd3f050a216e ("irqchip/gic-v4.1: Implement the v4.1 flavour of VMOVP")
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
---
 drivers/irqchip/irq-gic-v3-its.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index d097001c1e3e..4b1dbb697959 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -3850,8 +3850,9 @@ static int its_vpe_set_affinity(struct irq_data *d,
 	its_send_vmovp(vpe);
 	its_vpe_db_proxy_move(vpe, from, cpu);
 
-out:
 	irq_data_update_effective_affinity(d, cpumask_of(cpu));
+
+out:
 	vpe_to_cpuid_unlock(vpe, flags);
 
 	return IRQ_SET_MASK_OK_DONE;
-- 
2.33.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

* Re: [PATCH] irqchip/gic-v4.1: Fix GICv4.1 doorbell affinity
  2024-01-26  9:17 [PATCH] irqchip/gic-v4.1: Fix GICv4.1 doorbell affinity Kunkun Jiang
@ 2024-01-26 10:19 ` Marc Zyngier
  0 siblings, 0 replies; 2+ messages in thread
From: Marc Zyngier @ 2024-01-26 10:19 UTC (permalink / raw)
  To: Kunkun Jiang
  Cc: Oliver Upton, James Morse, Suzuki K Poulose, Zenghui Yu,
	Thomas Gleixner, linux-arm-kernel, kvmarm, wanghaibin.wang

On Fri, 26 Jan 2024 09:17:52 +0000,
Kunkun Jiang <jiangkunkun@huawei.com> wrote:
> 
> Commit dd3f050a216e ("irqchip/gic-v4.1: Implement the v4.1 flavour
> of VMOVP") introduced an optimization: VMOVP can be skipped when moving
> a VPE to a CPU whose RD shares its VPE table with the current one.
> But when VMOVP is skipped, the affinity recorded in irq_data is still
> updated. This leaves the doorbell affinity recorded in irq_data
> inconsistent with the actual doorbell affinity.
> 
> In a corner case, this may result in lost interrupts:
> 0. Each CPU die shares a VPE table and contains 32 CPUs:
>    die0(CPU0-31) die1(CPU32-63)...
> 1. The VPE resides on CPU32; the doorbell is affine to CPU32.
> 2. Move the VPE to CPU33. VMOVP is skipped, so the doorbell is still
>    affine to CPU32, but the affinity recorded in irq_data is CPU33.
> 3. Manually offline CPU32 on the host side:
>    'echo 0 > /sys/devices/system/cpu/cpu32/online'
> 4. Core code does not move the doorbell away from CPU32, since
>    irq_data records its affinity as CPU33.
> 5. Subsequent doorbell interrupts will be lost.
> 
> So the affinity recorded in irq_data should not be updated when
> skipping VMOVP.
> 
> Fixes: dd3f050a216e ("irqchip/gic-v4.1: Implement the v4.1 flavour of VMOVP")
> Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
> ---
>  drivers/irqchip/irq-gic-v3-its.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index d097001c1e3e..4b1dbb697959 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -3850,8 +3850,9 @@ static int its_vpe_set_affinity(struct irq_data *d,
>  	its_send_vmovp(vpe);
>  	its_vpe_db_proxy_move(vpe, from, cpu);
>  
> -out:
>  	irq_data_update_effective_affinity(d, cpumask_of(cpu));
> +
> +out:

That looks wrong. You are lying to the core code by saying that it's
all OK, and yet haven't done *anything*. This stuff is obviously
buggy, but I don't think this is right.

In your example, you don't even solve the problem: if CPUs 32 and 33
are part of the same ITS affinity group, you won't issue a VMOVP
either, so this doesn't fix anything.
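
One possible reading of this objection, as a standalone sketch rather
than kernel code: with the proposed hunk applied, a move within the
same group changes neither the hardware target nor the record, so a
later attempt to migrate the doorbell to another CPU of that group is
skipped as well. struct patched_db, same_vpe_table() and
patched_set_affinity() are hypothetical names, the 32-CPU grouping
comes from the example, and tracking the VPE's CPU before the skip is
an assumption of this model.

```c
#include <assert.h>
#include <stdbool.h>

#define CPUS_PER_GROUP 32	/* from the example: 32 CPUs share a table */

/* Hypothetical model of one vPE doorbell, not a kernel structure. */
struct patched_db {
	int vpe_cpu;		/* CPU the VPE currently resides on */
	int hw_cpu;		/* CPU the doorbell is really delivered to */
	int recorded_cpu;	/* affinity recorded in irq_data */
};

/* Assumption: CPUs in the same group share a VPE table. */
static bool same_vpe_table(int a, int b)
{
	return a / CPUS_PER_GROUP == b / CPUS_PER_GROUP;
}

/* Behaviour with the proposed hunk: skipping VMOVP also skips the
 * update of the recorded affinity. */
static void patched_set_affinity(struct patched_db *db, int cpu)
{
	int from = db->vpe_cpu;

	assert(cpu >= 0);
	if (from == cpu)
		return;			/* nothing to do, record untouched */

	db->vpe_cpu = cpu;		/* assumption: tracked before the skip */
	if (same_vpe_table(from, cpu))
		return;			/* no VMOVP, record untouched */

	db->hw_cpu = cpu;		/* VMOVP issued, doorbell moves */
	db->recorded_cpu = cpu;
}
```

Starting from { .vpe_cpu = 32, .hw_cpu = 32, .recorded_cpu = 32 },
patched_set_affinity(&db, 33) followed by patched_set_affinity(&db, 34)
leaves hw_cpu at 32 in this model: the record is now consistent, but
the doorbell still never leaves CPU32 unless the target is in another
group.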

At this stage, I think the VMOVP optimisation is wrong and that we
should drop it.

	M.

-- 
Without deviation from the norm, progress is not possible.