From: Marc Zyngier <maz@kernel.org>
To: Kunkun Jiang <jiangkunkun@huawei.com>
Cc: Oliver Upton <oliver.upton@linux.dev>,
James Morse <james.morse@arm.com>,
Suzuki K Poulose <suzuki.poulose@arm.com>,
Zenghui Yu <yuzenghui@huawei.com>,
Thomas Gleixner <tglx@linutronix.de>,
<linux-arm-kernel@lists.infradead.org>, <kvmarm@lists.linux.dev>,
<wanghaibin.wang@huawei.com>
Subject: Re: [PATCH v2 1/2] irqchip/gic-v4.1: Fix GICv4.1 doorbell affinity
Date: Sun, 28 Jan 2024 13:28:16 +0000 [thread overview]
Message-ID: <86r0i17fkf.wl-maz@kernel.org> (raw)
In-Reply-To: <44051a41-f82e-57d9-001a-d4d195d59f2e@huawei.com>
On Sat, 27 Jan 2024 02:41:55 +0000,
Kunkun Jiang <jiangkunkun@huawei.com> wrote:
>
> > At this stage, I think the VMOVP optimisation is wrong and that we
> > should drop it.
> This is the last resort. Maybe we can think of another way.
I think the only other way is to never elide the VMOVP from
set_affinity, but to only call into set_affinity when strictly
necessary:
- when making the VPE resident on a CPU that isn't part of the same
affinity group (VPE migration outside of the affinity group)
- when making the VPE non-resident on a CPU that isn't the doorbell
delivery CPU *AND* that there is a need for a doorbell (VPE
migration inside the affinity group)
Which means it is the responsibility of the upper layers of the stack
to make these decisions, and that we can keep the low-level driver as
simple as possible.
In the meantime, here's the patch I'm proposing to merge ASAP. Please
let me know if that works for you.
M.
From 00c338e91d3b6600f402f56cdb64c9d370f911c8 Mon Sep 17 00:00:00 2001
From: Marc Zyngier <maz@kernel.org>
Date: Sun, 28 Jan 2024 13:05:24 +0000
Subject: [PATCH] irqchip/gic-v3-its: Fix GICv4.1 VPE affinity update
When updating the affinity of a VPE, we currently skip the VMOVP
command if the two CPUs are part of the same VPE affinity.
But this is wrong, as the doorbell corresponding to this VPE
is still delivered on the 'old' CPU, which screws up the balancing.
Furthermore, offlining that 'old' CPU results in doorbell interrupts
generated for this VPE being discarded.
The harsh reality is that we cannot easily elide VMOVP when
a set_affinity request occurs. It needs to be obeyed, and if
an optimisation is to be made, it is at the point where the affinity
change request is made (such as in KVM).
Drop the VMOVP elision altogether, and only use the vpe_table_mask
to try and stay within the same ITS affinity group if at all possible.
Fixes: dd3f050a216e (irqchip/gic-v4.1: Implement the v4.1 flavour of VMOVP)
Reported-by: Kunkun Jiang <jiangkunkun@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
---
drivers/irqchip/irq-gic-v3-its.c | 22 +++++++++++++---------
1 file changed, 13 insertions(+), 9 deletions(-)
diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 250b4562f308..78aece62bff5 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -3826,8 +3826,9 @@ static int its_vpe_set_affinity(struct irq_data *d,
bool force)
{
struct its_vpe *vpe = irq_data_get_irq_chip_data(d);
- int from, cpu = cpumask_first(mask_val);
+ struct cpumask common;
unsigned long flags;
+ int from, cpu;
/*
* Changing affinity is mega expensive, so let's be as lazy as
@@ -3843,19 +3844,22 @@ static int its_vpe_set_affinity(struct irq_data *d,
* taken on any vLPI handling path that evaluates vpe->col_idx.
*/
from = vpe_to_cpuid_lock(vpe, &flags);
- if (from == cpu)
- goto out;
-
- vpe->col_idx = cpu;
/*
- * GICv4.1 allows us to skip VMOVP if moving to a cpu whose RD
- * is sharing its VPE table with the current one.
+ * If we are offered another CPU in the same GICv4.1 ITS
+ * affinity, pick this one. Otherwise, any CPU will do.
*/
- if (gic_data_rdist_cpu(cpu)->vpe_table_mask &&
- cpumask_test_cpu(from, gic_data_rdist_cpu(cpu)->vpe_table_mask))
+ if (gic_data_rdist_cpu(from)->vpe_table_mask &&
+ cpumask_and(&common, mask_val, gic_data_rdist_cpu(from)->vpe_table_mask))
+ cpu = cpumask_first(&common);
+ else
+ cpu = cpumask_first(mask_val);
+
+ if (from == cpu)
goto out;
+ vpe->col_idx = cpu;
+
its_send_vmovp(vpe);
its_vpe_db_proxy_move(vpe, from, cpu);
--
2.39.2
--
Without deviation from the norm, progress is not possible.
WARNING: multiple messages have this Message-ID (diff)
From: Marc Zyngier <maz@kernel.org>
To: Kunkun Jiang <jiangkunkun@huawei.com>
Cc: Oliver Upton <oliver.upton@linux.dev>,
James Morse <james.morse@arm.com>,
Suzuki K Poulose <suzuki.poulose@arm.com>,
Zenghui Yu <yuzenghui@huawei.com>,
Thomas Gleixner <tglx@linutronix.de>,
<linux-arm-kernel@lists.infradead.org>, <kvmarm@lists.linux.dev>,
<wanghaibin.wang@huawei.com>
Subject: Re: [PATCH v2 1/2] irqchip/gic-v4.1: Fix GICv4.1 doorbell affinity
Date: Sun, 28 Jan 2024 13:28:16 +0000 [thread overview]
Message-ID: <86r0i17fkf.wl-maz@kernel.org> (raw)
In-Reply-To: <44051a41-f82e-57d9-001a-d4d195d59f2e@huawei.com>
On Sat, 27 Jan 2024 02:41:55 +0000,
Kunkun Jiang <jiangkunkun@huawei.com> wrote:
>
> > At this stage, I think the VMOVP optimisation is wrong and that we
> > should drop it.
> This is the last resort. Maybe we can think of another way.
I think the only other way is to never elide the VMOVP from
set_affinity, but to only call into set_affinity when strictly
necessary:
- when making the VPE resident on a CPU that isn't part of the same
affinity group (VPE migration outside of the affinity group)
- when making the VPE non-resident on a CPU that isn't the doorbell
delivery CPU *AND* that there is a need for a doorbell (VPE
migration inside the affinity group)
Which means it is the responsibility of the upper layers of the stack
to make these decisions, and that we can keep the low-level driver as
simple as possible.
In the meantime, here's the patch I'm proposing to merge ASAP. Please
let me know if that works for you.
M.
From 00c338e91d3b6600f402f56cdb64c9d370f911c8 Mon Sep 17 00:00:00 2001
From: Marc Zyngier <maz@kernel.org>
Date: Sun, 28 Jan 2024 13:05:24 +0000
Subject: [PATCH] irqchip/gic-v3-its: Fix GICv4.1 VPE affinity update
When updating the affinity of a VPE, we currently skip the VMOVP
command if the two CPUs are part of the same VPE affinity.
But this is wrong, as the doorbell corresponding to this VPE
is still delivered on the 'old' CPU, which screws up the balancing.
Furthermore, offlining that 'old' CPU results in doorbell interrupts
generated for this VPE being discarded.
The harsh reality is that we cannot easily elide VMOVP when
a set_affinity request occurs. It needs to be obeyed, and if
an optimisation is to be made, it is at the point where the affinity
change request is made (such as in KVM).
Drop the VMOVP elision altogether, and only use the vpe_table_mask
to try and stay within the same ITS affinity group if at all possible.
Fixes: dd3f050a216e (irqchip/gic-v4.1: Implement the v4.1 flavour of VMOVP)
Reported-by: Kunkun Jiang <jiangkunkun@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
---
drivers/irqchip/irq-gic-v3-its.c | 22 +++++++++++++---------
1 file changed, 13 insertions(+), 9 deletions(-)
diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 250b4562f308..78aece62bff5 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -3826,8 +3826,9 @@ static int its_vpe_set_affinity(struct irq_data *d,
bool force)
{
struct its_vpe *vpe = irq_data_get_irq_chip_data(d);
- int from, cpu = cpumask_first(mask_val);
+ struct cpumask common;
unsigned long flags;
+ int from, cpu;
/*
* Changing affinity is mega expensive, so let's be as lazy as
@@ -3843,19 +3844,22 @@ static int its_vpe_set_affinity(struct irq_data *d,
* taken on any vLPI handling path that evaluates vpe->col_idx.
*/
from = vpe_to_cpuid_lock(vpe, &flags);
- if (from == cpu)
- goto out;
-
- vpe->col_idx = cpu;
/*
- * GICv4.1 allows us to skip VMOVP if moving to a cpu whose RD
- * is sharing its VPE table with the current one.
+ * If we are offered another CPU in the same GICv4.1 ITS
+ * affinity, pick this one. Otherwise, any CPU will do.
*/
- if (gic_data_rdist_cpu(cpu)->vpe_table_mask &&
- cpumask_test_cpu(from, gic_data_rdist_cpu(cpu)->vpe_table_mask))
+ if (gic_data_rdist_cpu(from)->vpe_table_mask &&
+ cpumask_and(&common, mask_val, gic_data_rdist_cpu(from)->vpe_table_mask))
+ cpu = cpumask_first(&common);
+ else
+ cpu = cpumask_first(mask_val);
+
+ if (from == cpu)
goto out;
+ vpe->col_idx = cpu;
+
its_send_vmovp(vpe);
its_vpe_db_proxy_move(vpe, from, cpu);
--
2.39.2
--
Without deviation from the norm, progress is not possible.
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
next prev parent reply other threads:[~2024-01-28 13:28 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-01-26 10:30 [PATCH v2 0/2] irqchip/gic-v4.1: Two bugfixes about doorbell affinity Kunkun Jiang
2024-01-26 10:30 ` Kunkun Jiang
2024-01-26 10:30 ` [PATCH v2 1/2] irqchip/gic-v4.1: Fix GICv4.1 " Kunkun Jiang
2024-01-26 10:30 ` Kunkun Jiang
2024-01-26 10:52 ` Marc Zyngier
2024-01-26 10:52 ` Marc Zyngier
2024-01-27 2:41 ` Kunkun Jiang
2024-01-27 2:41 ` Kunkun Jiang
2024-01-28 13:28 ` Marc Zyngier [this message]
2024-01-28 13:28 ` Marc Zyngier
2024-01-30 13:32 ` Kunkun Jiang
2024-01-30 13:32 ` Kunkun Jiang
2024-01-30 13:55 ` Marc Zyngier
2024-01-30 13:55 ` Marc Zyngier
2024-01-31 1:29 ` Kunkun Jiang
2024-01-31 1:29 ` Kunkun Jiang
2024-01-31 14:05 ` Marc Zyngier
2024-01-31 14:05 ` Marc Zyngier
2024-01-26 10:30 ` [PATCH v2 2/2] irqchip/gic-v4.1: Fix skipping VMOVP in cpu offline scenario Kunkun Jiang
2024-01-26 10:30 ` Kunkun Jiang
2024-01-26 10:52 ` Marc Zyngier
2024-01-26 10:52 ` Marc Zyngier
2024-01-27 2:59 ` Kunkun Jiang
2024-01-27 2:59 ` Kunkun Jiang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=86r0i17fkf.wl-maz@kernel.org \
--to=maz@kernel.org \
--cc=james.morse@arm.com \
--cc=jiangkunkun@huawei.com \
--cc=kvmarm@lists.linux.dev \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=oliver.upton@linux.dev \
--cc=suzuki.poulose@arm.com \
--cc=tglx@linutronix.de \
--cc=wanghaibin.wang@huawei.com \
--cc=yuzenghui@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.