All of lore.kernel.org
 help / color / mirror / Atom feed
From: Marc Zyngier <maz@kernel.org>
To: Kunkun Jiang <jiangkunkun@huawei.com>
Cc: Oliver Upton <oliver.upton@linux.dev>,
	James Morse <james.morse@arm.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	Zenghui Yu <yuzenghui@huawei.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	<linux-arm-kernel@lists.infradead.org>, <kvmarm@lists.linux.dev>,
	<wanghaibin.wang@huawei.com>
Subject: Re: [PATCH] irqchip/gic-v4.1: Fix GICv4.1 doorbell affinity
Date: Fri, 26 Jan 2024 10:19:30 +0000	[thread overview]
Message-ID: <86v87g75xp.wl-maz@kernel.org> (raw)
In-Reply-To: <20240126091752.1102-1-jiangkunkun@huawei.com>

On Fri, 26 Jan 2024 09:17:52 +0000,
Kunkun Jiang <jiangkunkun@huawei.com> wrote:
> 
> commit dd3f050a216e ("irqchip/gic-v4.1: Implement the v4.1 flavour
> of VMOVP") make an optimization, VMOVP can be skipped if moving
> VPE to a cpu whose RD is sharing its VPE table with the current one.
> But when skipping VMOVP, the affinity recorded in irq_data is still
> updated. This causes the doorbell affinity recorfed in the irq_data
> to be inconsistent with the actual.
> 
> In corner case, this may result in lost interrupts:
> 0. Each cpu die shares a VPE table and contains 32 CPUs
>    die0(CPU0-31) die1(CPU32-63)...
> 1. VPE resides on CPU32, doorbell affinity to CPU32.
> 2. Move VPE to CPU33, skip VMOVP, doorbell still affinity to CPU32.
>    The affinity recorded in irq_data is CPU33.
> 3. Manually offline CPU32 on the host side:
>    'echo 0 > /sys/devices/system/cpu/cpu32/online'
> 4. Core code cannot move the doorbell affinity to CPU32, since the
>    record in irq_data is CPU33.
> 5. Subsequent doorbell interrupts will be lost.
> 
> So affinity recoreded in irq_data should not be updated when skipping
> VMOVP.
> 
> Fixes: dd3f050a216e ("irqchip/gic-v4.1: Implement the v4.1 flavour of VMOVP")
> Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
> ---
>  drivers/irqchip/irq-gic-v3-its.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index d097001c1e3e..4b1dbb697959 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -3850,8 +3850,9 @@ static int its_vpe_set_affinity(struct irq_data *d,
>  	its_send_vmovp(vpe);
>  	its_vpe_db_proxy_move(vpe, from, cpu);
>  
> -out:
>  	irq_data_update_effective_affinity(d, cpumask_of(cpu));
> +
> +out:

That looks wrong. You are lying to the core code by saying that it's
all OK, and yet haven't done *anything*. This stuff is obviously
buggy, but I don't think this is right.

In your example, you don't even solve the problem: if CPUs 32 and 33
are part of the same ITS affinity group, you won't issue a VMOVP
either, so this doesn't fix anything.

At this stage, I think the VMOVP optimisation is wrong and that we
should drop it.

	M.

-- 
Without deviation from the norm, progress is not possible.

WARNING: multiple messages have this Message-ID (diff)
From: Marc Zyngier <maz@kernel.org>
To: Kunkun Jiang <jiangkunkun@huawei.com>
Cc: Oliver Upton <oliver.upton@linux.dev>,
	James Morse <james.morse@arm.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	Zenghui Yu <yuzenghui@huawei.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	<linux-arm-kernel@lists.infradead.org>, <kvmarm@lists.linux.dev>,
	<wanghaibin.wang@huawei.com>
Subject: Re: [PATCH] irqchip/gic-v4.1: Fix GICv4.1 doorbell affinity
Date: Fri, 26 Jan 2024 10:19:30 +0000	[thread overview]
Message-ID: <86v87g75xp.wl-maz@kernel.org> (raw)
In-Reply-To: <20240126091752.1102-1-jiangkunkun@huawei.com>

On Fri, 26 Jan 2024 09:17:52 +0000,
Kunkun Jiang <jiangkunkun@huawei.com> wrote:
> 
> commit dd3f050a216e ("irqchip/gic-v4.1: Implement the v4.1 flavour
> of VMOVP") make an optimization, VMOVP can be skipped if moving
> VPE to a cpu whose RD is sharing its VPE table with the current one.
> But when skipping VMOVP, the affinity recorded in irq_data is still
> updated. This causes the doorbell affinity recorfed in the irq_data
> to be inconsistent with the actual.
> 
> In corner case, this may result in lost interrupts:
> 0. Each cpu die shares a VPE table and contains 32 CPUs
>    die0(CPU0-31) die1(CPU32-63)...
> 1. VPE resides on CPU32, doorbell affinity to CPU32.
> 2. Move VPE to CPU33, skip VMOVP, doorbell still affinity to CPU32.
>    The affinity recorded in irq_data is CPU33.
> 3. Manually offline CPU32 on the host side:
>    'echo 0 > /sys/devices/system/cpu/cpu32/online'
> 4. Core code cannot move the doorbell affinity to CPU32, since the
>    record in irq_data is CPU33.
> 5. Subsequent doorbell interrupts will be lost.
> 
> So affinity recoreded in irq_data should not be updated when skipping
> VMOVP.
> 
> Fixes: dd3f050a216e ("irqchip/gic-v4.1: Implement the v4.1 flavour of VMOVP")
> Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
> ---
>  drivers/irqchip/irq-gic-v3-its.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index d097001c1e3e..4b1dbb697959 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -3850,8 +3850,9 @@ static int its_vpe_set_affinity(struct irq_data *d,
>  	its_send_vmovp(vpe);
>  	its_vpe_db_proxy_move(vpe, from, cpu);
>  
> -out:
>  	irq_data_update_effective_affinity(d, cpumask_of(cpu));
> +
> +out:

That looks wrong. You are lying to the core code by saying that it's
all OK, and yet haven't done *anything*. This stuff is obviously
buggy, but I don't think this is right.

In your example, you don't even solve the problem: if CPUs 32 and 33
are part of the same ITS affinity group, you won't issue a VMOVP
either, so this doesn't fix anything.

At this stage, I think the VMOVP optimisation is wrong and that we
should drop it.

	M.

-- 
Without deviation from the norm, progress is not possible.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2024-01-26 10:19 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-26  9:17 [PATCH] irqchip/gic-v4.1: Fix GICv4.1 doorbell affinity Kunkun Jiang
2024-01-26  9:17 ` Kunkun Jiang
2024-01-26 10:19 ` Marc Zyngier [this message]
2024-01-26 10:19   ` Marc Zyngier

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=86v87g75xp.wl-maz@kernel.org \
    --to=maz@kernel.org \
    --cc=james.morse@arm.com \
    --cc=jiangkunkun@huawei.com \
    --cc=kvmarm@lists.linux.dev \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=oliver.upton@linux.dev \
    --cc=suzuki.poulose@arm.com \
    --cc=tglx@linutronix.de \
    --cc=wanghaibin.wang@huawei.com \
    --cc=yuzenghui@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.