* [PATCH v2 0/2] irqchip/gic-v4.1: Two bugfixes about doorbell affinity
@ 2024-01-26 10:30 Kunkun Jiang
2024-01-26 10:30 ` [PATCH v2 1/2] irqchip/gic-v4.1: Fix GICv4.1 " Kunkun Jiang
2024-01-26 10:30 ` [PATCH v2 2/2] irqchip/gic-v4.1: Fix skipping VMOVP in cpu offline scenario Kunkun Jiang
0 siblings, 2 replies; 12+ messages in thread
From: Kunkun Jiang @ 2024-01-26 10:30 UTC (permalink / raw)
To: Marc Zyngier, Oliver Upton, James Morse, Suzuki K Poulose,
Zenghui Yu, Thomas Gleixner
Cc: linux-arm-kernel, kvmarm, wanghaibin.wang, Kunkun Jiang
There are two bugs in current doorbell affinity settings. May cause lost
interrupts. See the patches commit message for details.
Kunkun Jiang (2):
irqchip/gic-v4.1: Fix GICv4.1 doorbell affinity
irqchip/gic-v4.1: Fix skipping VMOVP in cpu offline scenario
drivers/irqchip/irq-gic-v3-its.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
--
2.33.0
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH v2 1/2] irqchip/gic-v4.1: Fix GICv4.1 doorbell affinity
2024-01-26 10:30 [PATCH v2 0/2] irqchip/gic-v4.1: Two bugfixes about doorbell affinity Kunkun Jiang
@ 2024-01-26 10:30 ` Kunkun Jiang
2024-01-26 10:52 ` Marc Zyngier
2024-01-26 10:30 ` [PATCH v2 2/2] irqchip/gic-v4.1: Fix skipping VMOVP in cpu offline scenario Kunkun Jiang
1 sibling, 1 reply; 12+ messages in thread
From: Kunkun Jiang @ 2024-01-26 10:30 UTC (permalink / raw)
To: Marc Zyngier, Oliver Upton, James Morse, Suzuki K Poulose,
Zenghui Yu, Thomas Gleixner
Cc: linux-arm-kernel, kvmarm, wanghaibin.wang, Kunkun Jiang
dd3f050a216e make an optimization, VMOVP can be skipped if moving
VPE to a cpu whose RD is sharing its VPE table with the current one.
But when skipping VMOVP, the affinity recorded in irq_data is still
updated. This causes the doorbell affinity recorfed in the irq_data
to be inconsistent with the actual.
In corner case, this may result in lost interrupts:
0. Each cpu die shares a VPE table and contains 32 CPUs
die0(CPU0-31) die1(CPU32-63)...
1. VPE resides on CPU32, doorbell affinity to CPU32.
2. Move VPE to CPU33, skip VMOVP, doorbell still affinity to CPU32.
The affinity recorded in irq_data is CPU33.
3. Manually offline CPU32 on the host side:
'echo 0 > /sys/devices/system/cpu/cpu32/online'
4. Core code cannot move the doorbell affinity to CPU32, since the
record in irq_data is CPU33.
5. Subsequent doorbell interrupts will be lost.
So affinity recoreded in irq_data should not be updated when skipping
VMOVP.
Fixes: dd3f050a216e (irqchip/gic-v4.1: Implement the v4.1 flavour of VMOVP)
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
---
drivers/irqchip/irq-gic-v3-its.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index d097001c1e3e..4b1dbb697959 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -3850,8 +3850,9 @@ static int its_vpe_set_affinity(struct irq_data *d,
its_send_vmovp(vpe);
its_vpe_db_proxy_move(vpe, from, cpu);
-out:
irq_data_update_effective_affinity(d, cpumask_of(cpu));
+
+out:
vpe_to_cpuid_unlock(vpe, flags);
return IRQ_SET_MASK_OK_DONE;
--
2.33.0
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v2 2/2] irqchip/gic-v4.1: Fix skipping VMOVP in cpu offline scenario
2024-01-26 10:30 [PATCH v2 0/2] irqchip/gic-v4.1: Two bugfixes about doorbell affinity Kunkun Jiang
2024-01-26 10:30 ` [PATCH v2 1/2] irqchip/gic-v4.1: Fix GICv4.1 " Kunkun Jiang
@ 2024-01-26 10:30 ` Kunkun Jiang
2024-01-26 10:52 ` Marc Zyngier
1 sibling, 1 reply; 12+ messages in thread
From: Kunkun Jiang @ 2024-01-26 10:30 UTC (permalink / raw)
To: Marc Zyngier, Oliver Upton, James Morse, Suzuki K Poulose,
Zenghui Yu, Thomas Gleixner
Cc: linux-arm-kernel, kvmarm, wanghaibin.wang, Kunkun Jiang
commit dd3f050a216e ("irqchip/gic-v4.1: Implement the v4.1 flavour
of VMOVP") make an optimization, VMOVP can be skipped if moving
VPE to a cpu whose RD is sharing its VPE table with the current one.
In the cpu offline scenario, the mask_val is the entire cpu range,
except for the offline cpu. Therefore, the first cpu is CPU0. In
corner case, this may result in lost interrupts:
0. Each cpu die shares a VPE table and contains 32 CPUs
die0(CPU0-31) die1(CPU32-63)...
1. VPE resides on CPU32, doorbell affinity to CPU32.
2. Move VPE to CPU16, doorbell affinity to CPU16.
3. Manually offline CPU16 on the host side:
'echo 0 > /sys/devices/system/cpu/cpu16/online'
4. VMOVP will be skipped.
5. Subsequent doorbell interrupts will be lost.
So VMOVP cannot be skipped when the affinity CPU is not in mask_val.
Fixes: dd3f050a216e ("irqchip/gic-v4.1: Implement the v4.1 flavour of VMOVP")
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
---
drivers/irqchip/irq-gic-v3-its.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 4b1dbb697959..bfb922f16bc6 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -3817,7 +3817,7 @@ static int its_vpe_set_affinity(struct irq_data *d,
bool force)
{
struct its_vpe *vpe = irq_data_get_irq_chip_data(d);
- int from, cpu = cpumask_first(mask_val);
+ int cur, from, cpu = cpumask_first(mask_val);
unsigned long flags;
/*
@@ -3839,11 +3839,13 @@ static int its_vpe_set_affinity(struct irq_data *d,
vpe->col_idx = cpu;
+ cur = cpumask_first(irq_data_get_effective_affinity_mask(d));
/*
* GICv4.1 allows us to skip VMOVP if moving to a cpu whose RD
* is sharing its VPE table with the current one.
*/
if (gic_data_rdist_cpu(cpu)->vpe_table_mask &&
+ cpumask_test_cpu(cur, mask_val) &&
cpumask_test_cpu(from, gic_data_rdist_cpu(cpu)->vpe_table_mask))
goto out;
--
2.33.0
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/2] irqchip/gic-v4.1: Fix GICv4.1 doorbell affinity
2024-01-26 10:30 ` [PATCH v2 1/2] irqchip/gic-v4.1: Fix GICv4.1 " Kunkun Jiang
@ 2024-01-26 10:52 ` Marc Zyngier
2024-01-27 2:41 ` Kunkun Jiang
0 siblings, 1 reply; 12+ messages in thread
From: Marc Zyngier @ 2024-01-26 10:52 UTC (permalink / raw)
To: Kunkun Jiang
Cc: Oliver Upton, James Morse, Suzuki K Poulose, Zenghui Yu,
Thomas Gleixner, linux-arm-kernel, kvmarm, wanghaibin.wang
On Fri, 26 Jan 2024 10:30:11 +0000,
Kunkun Jiang <jiangkunkun@huawei.com> wrote:
>
> dd3f050a216e make an optimization, VMOVP can be skipped if moving
> VPE to a cpu whose RD is sharing its VPE table with the current one.
> But when skipping VMOVP, the affinity recorded in irq_data is still
> updated. This causes the doorbell affinity recorfed in the irq_data
> to be inconsistent with the actual.
>
> In corner case, this may result in lost interrupts:
> 0. Each cpu die shares a VPE table and contains 32 CPUs
> die0(CPU0-31) die1(CPU32-63)...
> 1. VPE resides on CPU32, doorbell affinity to CPU32.
> 2. Move VPE to CPU33, skip VMOVP, doorbell still affinity to CPU32.
> The affinity recorded in irq_data is CPU33.
> 3. Manually offline CPU32 on the host side:
> 'echo 0 > /sys/devices/system/cpu/cpu32/online'
> 4. Core code cannot move the doorbell affinity to CPU32, since the
> record in irq_data is CPU33.
> 5. Subsequent doorbell interrupts will be lost.
>
> So affinity recoreded in irq_data should not be updated when skipping
> VMOVP.
>
> Fixes: dd3f050a216e (irqchip/gic-v4.1: Implement the v4.1 flavour of VMOVP)
> Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
> ---
> drivers/irqchip/irq-gic-v3-its.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index d097001c1e3e..4b1dbb697959 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -3850,8 +3850,9 @@ static int its_vpe_set_affinity(struct irq_data *d,
> its_send_vmovp(vpe);
> its_vpe_db_proxy_move(vpe, from, cpu);
>
> -out:
> irq_data_update_effective_affinity(d, cpumask_of(cpu));
> +
> +out:
> vpe_to_cpuid_unlock(vpe, flags);
>
> return IRQ_SET_MASK_OK_DONE;
This is becoming *very* annoying. I've already told you this patch was
wrong, yet you keep sending it (twice in just over an hour).
M.
--
Without deviation from the norm, progress is not possible.
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 2/2] irqchip/gic-v4.1: Fix skipping VMOVP in cpu offline scenario
2024-01-26 10:30 ` [PATCH v2 2/2] irqchip/gic-v4.1: Fix skipping VMOVP in cpu offline scenario Kunkun Jiang
@ 2024-01-26 10:52 ` Marc Zyngier
2024-01-27 2:59 ` Kunkun Jiang
0 siblings, 1 reply; 12+ messages in thread
From: Marc Zyngier @ 2024-01-26 10:52 UTC (permalink / raw)
To: Kunkun Jiang
Cc: Oliver Upton, James Morse, Suzuki K Poulose, Zenghui Yu,
Thomas Gleixner, linux-arm-kernel, kvmarm, wanghaibin.wang
On Fri, 26 Jan 2024 10:30:12 +0000,
Kunkun Jiang <jiangkunkun@huawei.com> wrote:
>
> commit dd3f050a216e ("irqchip/gic-v4.1: Implement the v4.1 flavour
> of VMOVP") make an optimization, VMOVP can be skipped if moving
> VPE to a cpu whose RD is sharing its VPE table with the current one.
>
> In the cpu offline scenario, the mask_val is the entire cpu range,
> except for the offline cpu. Therefore, the first cpu is CPU0. In
> corner case, this may result in lost interrupts:
> 0. Each cpu die shares a VPE table and contains 32 CPUs
> die0(CPU0-31) die1(CPU32-63)...
> 1. VPE resides on CPU32, doorbell affinity to CPU32.
> 2. Move VPE to CPU16, doorbell affinity to CPU16.
> 3. Manually offline CPU16 on the host side:
> 'echo 0 > /sys/devices/system/cpu/cpu16/online'
> 4. VMOVP will be skipped.
> 5. Subsequent doorbell interrupts will be lost.
>
> So VMOVP cannot be skipped when the affinity CPU is not in mask_val.
>
> Fixes: dd3f050a216e ("irqchip/gic-v4.1: Implement the v4.1 flavour of VMOVP")
> Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
> ---
> drivers/irqchip/irq-gic-v3-its.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index 4b1dbb697959..bfb922f16bc6 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -3817,7 +3817,7 @@ static int its_vpe_set_affinity(struct irq_data *d,
> bool force)
> {
> struct its_vpe *vpe = irq_data_get_irq_chip_data(d);
> - int from, cpu = cpumask_first(mask_val);
> + int cur, from, cpu = cpumask_first(mask_val);
> unsigned long flags;
>
> /*
> @@ -3839,11 +3839,13 @@ static int its_vpe_set_affinity(struct irq_data *d,
>
> vpe->col_idx = cpu;
>
> + cur = cpumask_first(irq_data_get_effective_affinity_mask(d));
When is this not equal to 'from'?
> /*
> * GICv4.1 allows us to skip VMOVP if moving to a cpu whose RD
> * is sharing its VPE table with the current one.
> */
> if (gic_data_rdist_cpu(cpu)->vpe_table_mask &&
> + cpumask_test_cpu(cur, mask_val) &&
> cpumask_test_cpu(from, gic_data_rdist_cpu(cpu)->vpe_table_mask))
> goto out;
This looks utterly pointless to me.
M.
--
Without deviation from the norm, progress is not possible.
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/2] irqchip/gic-v4.1: Fix GICv4.1 doorbell affinity
2024-01-26 10:52 ` Marc Zyngier
@ 2024-01-27 2:41 ` Kunkun Jiang
2024-01-28 13:28 ` Marc Zyngier
0 siblings, 1 reply; 12+ messages in thread
From: Kunkun Jiang @ 2024-01-27 2:41 UTC (permalink / raw)
To: Marc Zyngier
Cc: Oliver Upton, James Morse, Suzuki K Poulose, Zenghui Yu,
Thomas Gleixner, linux-arm-kernel, kvmarm, wanghaibin.wang
Hi Marc,
On 2024/1/26 18:52, Marc Zyngier wrote:
> On Fri, 26 Jan 2024 10:30:11 +0000,
> Kunkun Jiang <jiangkunkun@huawei.com> wrote:
>> dd3f050a216e make an optimization, VMOVP can be skipped if moving
>> VPE to a cpu whose RD is sharing its VPE table with the current one.
>> But when skipping VMOVP, the affinity recorded in irq_data is still
>> updated. This causes the doorbell affinity recorfed in the irq_data
>> to be inconsistent with the actual.
>>
>> In corner case, this may result in lost interrupts:
>> 0. Each cpu die shares a VPE table and contains 32 CPUs
>> die0(CPU0-31) die1(CPU32-63)...
>> 1. VPE resides on CPU32, doorbell affinity to CPU32.
>> 2. Move VPE to CPU33, skip VMOVP, doorbell still affinity to CPU32.
>> The affinity recorded in irq_data is CPU33.
>> 3. Manually offline CPU32 on the host side:
>> 'echo 0 > /sys/devices/system/cpu/cpu32/online'
>> 4. Core code cannot move the doorbell affinity to CPU32, since the
>> record in irq_data is CPU33.
>> 5. Subsequent doorbell interrupts will be lost.
>>
>> So affinity recoreded in irq_data should not be updated when skipping
>> VMOVP.
>>
>> Fixes: dd3f050a216e (irqchip/gic-v4.1: Implement the v4.1 flavour of VMOVP)
>> Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
>> ---
>> drivers/irqchip/irq-gic-v3-its.c | 3 ++-
>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
>> index d097001c1e3e..4b1dbb697959 100644
>> --- a/drivers/irqchip/irq-gic-v3-its.c
>> +++ b/drivers/irqchip/irq-gic-v3-its.c
>> @@ -3850,8 +3850,9 @@ static int its_vpe_set_affinity(struct irq_data *d,
>> its_send_vmovp(vpe);
>> its_vpe_db_proxy_move(vpe, from, cpu);
>>
>> -out:
>> irq_data_update_effective_affinity(d, cpumask_of(cpu));
>> +
>> +out:
>> vpe_to_cpuid_unlock(vpe, flags);
>>
>> return IRQ_SET_MASK_OK_DONE;
> This is becoming *very* annoying. I've already told you this patch was
> wrong, yet you keep sending it (twice in just over an hour).
Sorry, I didn't see that you had replied to me, when I sent it for
the second time.I have no intention of doing this.Sincerely apologize
for you.
Please allow me to put your previous reply here for discussion:
> That looks wrong. You are lying to the core code by saying that it's
> all OK, and yet haven't done*anything*. This stuff is obviously
> buggy, but I don't think this is right.
Make scene. I understand that the root cause of this problem is that
there is a difference between the doorbell affinity recorded in the
kernel and the actual hardware. So I'm just thinking about keeping
both consistent.
> In your example, you don't even solve the problem: if CPUs 32 and 33
> are part of the same ITS affinity group, you won't issue a VMOVP
> either, so this doesn't fix anything.
No VMOVP is sent. But the affinity recorded in irq_data is correct.
So when offline CPU32, core code can move doorbell to CPU0.The
mask_val is the entire cpu range, except for the offline cpu.
> At this stage, I think the VMOVP optimisation is wrong and that we
> should drop it.
This is the last resort. Maybe we can think of another way.
Kunkun Jiang
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 2/2] irqchip/gic-v4.1: Fix skipping VMOVP in cpu offline scenario
2024-01-26 10:52 ` Marc Zyngier
@ 2024-01-27 2:59 ` Kunkun Jiang
0 siblings, 0 replies; 12+ messages in thread
From: Kunkun Jiang @ 2024-01-27 2:59 UTC (permalink / raw)
To: Marc Zyngier
Cc: Oliver Upton, James Morse, Suzuki K Poulose, Zenghui Yu,
Thomas Gleixner, linux-arm-kernel, kvmarm, wanghaibin.wang
Hi Marc,
On 2024/1/26 18:52, Marc Zyngier wrote:
> On Fri, 26 Jan 2024 10:30:12 +0000,
> Kunkun Jiang <jiangkunkun@huawei.com> wrote:
>> commit dd3f050a216e ("irqchip/gic-v4.1: Implement the v4.1 flavour
>> of VMOVP") make an optimization, VMOVP can be skipped if moving
>> VPE to a cpu whose RD is sharing its VPE table with the current one.
>>
>> In the cpu offline scenario, the mask_val is the entire cpu range,
>> except for the offline cpu. Therefore, the first cpu is CPU0. In
>> corner case, this may result in lost interrupts:
>> 0. Each cpu die shares a VPE table and contains 32 CPUs
>> die0(CPU0-31) die1(CPU32-63)...
>> 1. VPE resides on CPU32, doorbell affinity to CPU32.
>> 2. Move VPE to CPU16, doorbell affinity to CPU16.
>> 3. Manually offline CPU16 on the host side:
>> 'echo 0 > /sys/devices/system/cpu/cpu16/online'
>> 4. VMOVP will be skipped.
>> 5. Subsequent doorbell interrupts will be lost.
>>
>> So VMOVP cannot be skipped when the affinity CPU is not in mask_val.
>>
>> Fixes: dd3f050a216e ("irqchip/gic-v4.1: Implement the v4.1 flavour of VMOVP")
>> Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
>> ---
>> drivers/irqchip/irq-gic-v3-its.c | 4 +++-
>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
>> index 4b1dbb697959..bfb922f16bc6 100644
>> --- a/drivers/irqchip/irq-gic-v3-its.c
>> +++ b/drivers/irqchip/irq-gic-v3-its.c
>> @@ -3817,7 +3817,7 @@ static int its_vpe_set_affinity(struct irq_data *d,
>> bool force)
>> {
>> struct its_vpe *vpe = irq_data_get_irq_chip_data(d);
>> - int from, cpu = cpumask_first(mask_val);
>> + int cur, from, cpu = cpumask_first(mask_val);
>> unsigned long flags;
>>
>> /*
>> @@ -3839,11 +3839,13 @@ static int its_vpe_set_affinity(struct irq_data *d,
>>
>> vpe->col_idx = cpu;
>>
>> + cur = cpumask_first(irq_data_get_effective_affinity_mask(d));
> When is this not equal to 'from'?
It's based on my last patch. When skipping VMOVP, it is not equal to 'from'.
>> /*
>> * GICv4.1 allows us to skip VMOVP if moving to a cpu whose RD
>> * is sharing its VPE table with the current one.
>> */
>> if (gic_data_rdist_cpu(cpu)->vpe_table_mask &&
>> + cpumask_test_cpu(cur, mask_val) &&
>> cpumask_test_cpu(from, gic_data_rdist_cpu(cpu)->vpe_table_mask))
>> goto out;
> This looks utterly pointless to me.
Yes, there are some problems here. I'll think about it more carefully.
Kunkun Jiang
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/2] irqchip/gic-v4.1: Fix GICv4.1 doorbell affinity
2024-01-27 2:41 ` Kunkun Jiang
@ 2024-01-28 13:28 ` Marc Zyngier
2024-01-30 13:32 ` Kunkun Jiang
0 siblings, 1 reply; 12+ messages in thread
From: Marc Zyngier @ 2024-01-28 13:28 UTC (permalink / raw)
To: Kunkun Jiang
Cc: Oliver Upton, James Morse, Suzuki K Poulose, Zenghui Yu,
Thomas Gleixner, linux-arm-kernel, kvmarm, wanghaibin.wang
On Sat, 27 Jan 2024 02:41:55 +0000,
Kunkun Jiang <jiangkunkun@huawei.com> wrote:
>
> > At this stage, I think the VMOVP optimisation is wrong and that we
> > should drop it.
> This is the last resort. Maybe we can think of another way.
I think the only other way is to never elide the VMOVP from
set_affinity, but to only call into set_affinity when strictly
necessary:
- when making the VPE resident on a CPU that isn't part of the same
affinity group (VPE migration outside of the affinity group)
- when making the VPE non-resident on a CPU that isn't the doorbell
delivery CPU *AND* that there is a need for a doorbell (VPE
migration inside the affinity group)
Which means it is the responsibility of the upper layers of the stack
to make these decisions, and that we can keep the low-level driver as
simple as possible.
In the meantime, here's the patch I'm proposing to merge ASAP. Please
let me know if that works for you.
M.
From 00c338e91d3b6600f402f56cdb64c9d370f911c8 Mon Sep 17 00:00:00 2001
From: Marc Zyngier <maz@kernel.org>
Date: Sun, 28 Jan 2024 13:05:24 +0000
Subject: [PATCH] irqchip/gic-v3-its: Fix GICv4.1 VPE affinity update
When updating the affinity of a VPE, we currently skip the VMOVP
command if the two CPUs are part of the same VPE affinity.
But this is wrong, as the doorbell corresponding to this VPE
is still delivered on the 'old' CPU, which screws up the balancing.
Furthermore, offlining that 'old' CPU results in doorbell interrupts
generated for this VPE being discarded.
The harsh reality is that we cannot easily elide VMOVP when
a set_affinity request occurs. It needs to be obeyed, and if
an optimisation is to be made, it is at the point where the affinity
change request is made (such as in KVM).
Drop the VMOVP elision altogether, and only use the vpe_table_mask
to try and stay within the same ITS affinity group if at all possible.
Fixes: dd3f050a216e (irqchip/gic-v4.1: Implement the v4.1 flavour of VMOVP)
Reported-by: Kunkun Jiang <jiangkunkun@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
---
drivers/irqchip/irq-gic-v3-its.c | 22 +++++++++++++---------
1 file changed, 13 insertions(+), 9 deletions(-)
diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 250b4562f308..78aece62bff5 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -3826,8 +3826,9 @@ static int its_vpe_set_affinity(struct irq_data *d,
bool force)
{
struct its_vpe *vpe = irq_data_get_irq_chip_data(d);
- int from, cpu = cpumask_first(mask_val);
+ struct cpumask common;
unsigned long flags;
+ int from, cpu;
/*
* Changing affinity is mega expensive, so let's be as lazy as
@@ -3843,19 +3844,22 @@ static int its_vpe_set_affinity(struct irq_data *d,
* taken on any vLPI handling path that evaluates vpe->col_idx.
*/
from = vpe_to_cpuid_lock(vpe, &flags);
- if (from == cpu)
- goto out;
-
- vpe->col_idx = cpu;
/*
- * GICv4.1 allows us to skip VMOVP if moving to a cpu whose RD
- * is sharing its VPE table with the current one.
+ * If we are offered another CPU in the same GICv4.1 ITS
+ * affinity, pick this one. Otherwise, any CPU will do.
*/
- if (gic_data_rdist_cpu(cpu)->vpe_table_mask &&
- cpumask_test_cpu(from, gic_data_rdist_cpu(cpu)->vpe_table_mask))
+ if (gic_data_rdist_cpu(from)->vpe_table_mask &&
+ cpumask_and(&common, mask_val, gic_data_rdist_cpu(from)->vpe_table_mask))
+ cpu = cpumask_first(&common);
+ else
+ cpu = cpumask_first(mask_val);
+
+ if (from == cpu)
goto out;
+ vpe->col_idx = cpu;
+
its_send_vmovp(vpe);
its_vpe_db_proxy_move(vpe, from, cpu);
--
2.39.2
--
Without deviation from the norm, progress is not possible.
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/2] irqchip/gic-v4.1: Fix GICv4.1 doorbell affinity
2024-01-28 13:28 ` Marc Zyngier
@ 2024-01-30 13:32 ` Kunkun Jiang
2024-01-30 13:55 ` Marc Zyngier
0 siblings, 1 reply; 12+ messages in thread
From: Kunkun Jiang @ 2024-01-30 13:32 UTC (permalink / raw)
To: Marc Zyngier
Cc: Oliver Upton, James Morse, Suzuki K Poulose, Zenghui Yu,
Thomas Gleixner, linux-arm-kernel, kvmarm, wanghaibin.wang
Hi Marc,
On 2024/1/28 21:28, Marc Zyngier wrote:
> On Sat, 27 Jan 2024 02:41:55 +0000,
> Kunkun Jiang <jiangkunkun@huawei.com> wrote:
>>> At this stage, I think the VMOVP optimisation is wrong and that we
>>> should drop it.
>> This is the last resort. Maybe we can think of another way.
> I think the only other way is to never elide the VMOVP from
> set_affinity, but to only call into set_affinity when strictly
> necessary:
>
> - when making the VPE resident on a CPU that isn't part of the same
> affinity group (VPE migration outside of the affinity group)
>
> - when making the VPE non-resident on a CPU that isn't the doorbell
> delivery CPU *AND* that there is a need for a doorbell (VPE
> migration inside the affinity group)
>
> Which means it is the responsibility of the upper layers of the stack
> to make these decisions, and that we can keep the low-level driver as
> simple as possible.
>
> In the meantime, here's the patch I'm proposing to merge ASAP. Please
> let me know if that works for you.
This patch works. But I still have some questions, which I will
write down below.
>
> M.
>
> >From 00c338e91d3b6600f402f56cdb64c9d370f911c8 Mon Sep 17 00:00:00 2001
> From: Marc Zyngier <maz@kernel.org>
> Date: Sun, 28 Jan 2024 13:05:24 +0000
> Subject: [PATCH] irqchip/gic-v3-its: Fix GICv4.1 VPE affinity update
>
> When updating the affinity of a VPE, we currently skip the VMOVP
> command if the two CPUs are part of the same VPE affinity.
>
> But this is wrong, as the doorbell corresponding to this VPE
> is still delivered on the 'old' CPU, which screws up the balancing.
> Furthermore, offlining that 'old' CPU results in doorbell interrupts
> generated for this VPE being discarded.
>
> The harsh reality is that we cannot easily elide VMOVP when
> a set_affinity request occurs. It needs to be obeyed, and if
> an optimisation is to be made, it is at the point where the affinity
> change request is made (such as in KVM).
>
> Drop the VMOVP elision altogether, and only use the vpe_table_mask
> to try and stay within the same ITS affinity group if at all possible.
>
> Fixes: dd3f050a216e (irqchip/gic-v4.1: Implement the v4.1 flavour of VMOVP)
> Reported-by: Kunkun Jiang <jiangkunkun@huawei.com>
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> Cc: stable@vger.kernel.org
> ---
> drivers/irqchip/irq-gic-v3-its.c | 22 +++++++++++++---------
> 1 file changed, 13 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index 250b4562f308..78aece62bff5 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -3826,8 +3826,9 @@ static int its_vpe_set_affinity(struct irq_data *d,
> bool force)
> {
> struct its_vpe *vpe = irq_data_get_irq_chip_data(d);
> - int from, cpu = cpumask_first(mask_val);
> + struct cpumask common;
> unsigned long flags;
> + int from, cpu;
>
> /*
> * Changing affinity is mega expensive, so let's be as lazy as
> @@ -3843,19 +3844,22 @@ static int its_vpe_set_affinity(struct irq_data *d,
> * taken on any vLPI handling path that evaluates vpe->col_idx.
> */
> from = vpe_to_cpuid_lock(vpe, &flags);
> - if (from == cpu)
> - goto out;
> -
> - vpe->col_idx = cpu;
>
> /*
> - * GICv4.1 allows us to skip VMOVP if moving to a cpu whose RD
> - * is sharing its VPE table with the current one.
> + * If we are offered another CPU in the same GICv4.1 ITS
> + * affinity, pick this one. Otherwise, any CPU will do.
> */
> - if (gic_data_rdist_cpu(cpu)->vpe_table_mask &&
> - cpumask_test_cpu(from, gic_data_rdist_cpu(cpu)->vpe_table_mask))
> + if (gic_data_rdist_cpu(from)->vpe_table_mask &&
> + cpumask_and(&common, mask_val, gic_data_rdist_cpu(from)->vpe_table_mask))
> + cpu = cpumask_first(&common);
> + else
> + cpu = cpumask_first(mask_val);
> +
> + if (from == cpu)
> goto out;
The meaning here should be that VMOVP will be skipped
only if from and target are the same. Then why do we
still to judge whether they are in the same GICv4.1
ITS affinity?
Kunkun Jiang
>
> + vpe->col_idx = cpu;
> +
> its_send_vmovp(vpe);
> its_vpe_db_proxy_move(vpe, from, cpu);
>
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/2] irqchip/gic-v4.1: Fix GICv4.1 doorbell affinity
2024-01-30 13:32 ` Kunkun Jiang
@ 2024-01-30 13:55 ` Marc Zyngier
2024-01-31 1:29 ` Kunkun Jiang
0 siblings, 1 reply; 12+ messages in thread
From: Marc Zyngier @ 2024-01-30 13:55 UTC (permalink / raw)
To: Kunkun Jiang
Cc: Oliver Upton, James Morse, Suzuki K Poulose, Zenghui Yu,
Thomas Gleixner, linux-arm-kernel, kvmarm, wanghaibin.wang
On Tue, 30 Jan 2024 13:32:24 +0000,
Kunkun Jiang <jiangkunkun@huawei.com> wrote:
>
> Hi Marc,
>
> On 2024/1/28 21:28, Marc Zyngier wrote:
> > On Sat, 27 Jan 2024 02:41:55 +0000,
> > Kunkun Jiang <jiangkunkun@huawei.com> wrote:
> >>> At this stage, I think the VMOVP optimisation is wrong and that we
> >>> should drop it.
> >> This is the last resort. Maybe we can think of another way.
> > I think the only other way is to never elide the VMOVP from
> > set_affinity, but to only call into set_affinity when strictly
> > necessary:
> >
> > - when making the VPE resident on a CPU that isn't part of the same
> > affinity group (VPE migration outside of the affinity group)
> >
> > - when making the VPE non-resident on a CPU that isn't the doorbell
> > delivery CPU *AND* that there is a need for a doorbell (VPE
> > migration inside the affinity group)
> >
> > Which means it is the responsibility of the upper layers of the stack
> > to make these decisions, and that we can keep the low-level driver as
> > simple as possible.
> >
> > In the meantime, here's the patch I'm proposing to merge ASAP. Please
> > let me know if that works for you.
> This patch works. But I still have some questions, which I will
> write down below.
> >
> > M.
> >
> > >From 00c338e91d3b6600f402f56cdb64c9d370f911c8 Mon Sep 17 00:00:00 2001
> > From: Marc Zyngier <maz@kernel.org>
> > Date: Sun, 28 Jan 2024 13:05:24 +0000
> > Subject: [PATCH] irqchip/gic-v3-its: Fix GICv4.1 VPE affinity update
> >
> > When updating the affinity of a VPE, we currently skip the VMOVP
> > command if the two CPUs are part of the same VPE affinity.
> >
> > But this is wrong, as the doorbell corresponding to this VPE
> > is still delivered on the 'old' CPU, which screws up the balancing.
> > Furthermore, offlining that 'old' CPU results in doorbell interrupts
> > generated for this VPE being discarded.
> >
> > The harsh reality is that we cannot easily elide VMOVP when
> > a set_affinity request occurs. It needs to be obeyed, and if
> > an optimisation is to be made, it is at the point where the affinity
> > change request is made (such as in KVM).
> >
> > Drop the VMOVP elision altogether, and only use the vpe_table_mask
> > to try and stay within the same ITS affinity group if at all possible.
> >
> > Fixes: dd3f050a216e (irqchip/gic-v4.1: Implement the v4.1 flavour of VMOVP)
> > Reported-by: Kunkun Jiang <jiangkunkun@huawei.com>
> > Signed-off-by: Marc Zyngier <maz@kernel.org>
> > Cc: stable@vger.kernel.org
> > ---
> > drivers/irqchip/irq-gic-v3-its.c | 22 +++++++++++++---------
> > 1 file changed, 13 insertions(+), 9 deletions(-)
> >
> > diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> > index 250b4562f308..78aece62bff5 100644
> > --- a/drivers/irqchip/irq-gic-v3-its.c
> > +++ b/drivers/irqchip/irq-gic-v3-its.c
> > @@ -3826,8 +3826,9 @@ static int its_vpe_set_affinity(struct irq_data *d,
> > bool force)
> > {
> > struct its_vpe *vpe = irq_data_get_irq_chip_data(d);
> > - int from, cpu = cpumask_first(mask_val);
> > + struct cpumask common;
> > unsigned long flags;
> > + int from, cpu;
> > /*
> > * Changing affinity is mega expensive, so let's be as lazy as
> > @@ -3843,19 +3844,22 @@ static int its_vpe_set_affinity(struct irq_data *d,
> > * taken on any vLPI handling path that evaluates vpe->col_idx.
> > */
> > from = vpe_to_cpuid_lock(vpe, &flags);
> > - if (from == cpu)
> > - goto out;
> > -
> > - vpe->col_idx = cpu;
> > /*
> > - * GICv4.1 allows us to skip VMOVP if moving to a cpu whose RD
> > - * is sharing its VPE table with the current one.
> > + * If we are offered another CPU in the same GICv4.1 ITS
> > + * affinity, pick this one. Otherwise, any CPU will do.
> > */
> > - if (gic_data_rdist_cpu(cpu)->vpe_table_mask &&
> > - cpumask_test_cpu(from, gic_data_rdist_cpu(cpu)->vpe_table_mask))
> > + if (gic_data_rdist_cpu(from)->vpe_table_mask &&
> > + cpumask_and(&common, mask_val, gic_data_rdist_cpu(from)->vpe_table_mask))
> > + cpu = cpumask_first(&common);
> > + else
> > + cpu = cpumask_first(mask_val);
> > +
> > + if (from == cpu)
> > goto out;
> The meaning here should be that VMOVP will be skipped
> only if from and target are the same. Then why do we
> still to judge whether they are in the same GICv4.1
> ITS affinity?
Having the same ITS affinity is a performance optimisation, as this
minimises traffic between ITSs and keep cache locality. If the
proposed affinity contains CPUs that are in the same affinity as
current one, then it makes sense to stay there if possible in order to
keep the benefit of the warm translation caches.
M.
--
Without deviation from the norm, progress is not possible.
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/2] irqchip/gic-v4.1: Fix GICv4.1 doorbell affinity
2024-01-30 13:55 ` Marc Zyngier
@ 2024-01-31 1:29 ` Kunkun Jiang
2024-01-31 14:05 ` Marc Zyngier
0 siblings, 1 reply; 12+ messages in thread
From: Kunkun Jiang @ 2024-01-31 1:29 UTC (permalink / raw)
To: Marc Zyngier
Cc: Oliver Upton, James Morse, Suzuki K Poulose, Zenghui Yu,
Thomas Gleixner, linux-arm-kernel, kvmarm, wanghaibin.wang
Hi Marc,
On 2024/1/30 21:55, Marc Zyngier wrote:
> On Tue, 30 Jan 2024 13:32:24 +0000,
> Kunkun Jiang <jiangkunkun@huawei.com> wrote:
>> Hi Marc,
>>
>> On 2024/1/28 21:28, Marc Zyngier wrote:
>>> On Sat, 27 Jan 2024 02:41:55 +0000,
>>> Kunkun Jiang <jiangkunkun@huawei.com> wrote:
>>>>> At this stage, I think the VMOVP optimisation is wrong and that we
>>>>> should drop it.
>>>> This is the last resort. Maybe we can think of another way.
>>> I think the only other way is to never elide the VMOVP from
>>> set_affinity, but to only call into set_affinity when strictly
>>> necessary:
>>>
>>> - when making the VPE resident on a CPU that isn't part of the same
>>> affinity group (VPE migration outside of the affinity group)
>>>
>>> - when making the VPE non-resident on a CPU that isn't the doorbell
>>> delivery CPU *AND* that there is a need for a doorbell (VPE
>>> migration inside the affinity group)
>>>
>>> Which means it is the responsibility of the upper layers of the stack
>>> to make these decisions, and that we can keep the low-level driver as
>>> simple as possible.
>>>
>>> In the meantime, here's the patch I'm proposing to merge ASAP. Please
>>> let me know if that works for you.
>> This patch works. But I still have some questions, which I will
>> write down below.
>>> M.
>>>
>>> >From 00c338e91d3b6600f402f56cdb64c9d370f911c8 Mon Sep 17 00:00:00 2001
>>> From: Marc Zyngier <maz@kernel.org>
>>> Date: Sun, 28 Jan 2024 13:05:24 +0000
>>> Subject: [PATCH] irqchip/gic-v3-its: Fix GICv4.1 VPE affinity update
>>>
>>> When updating the affinity of a VPE, we currently skip the VMOVP
>>> command if the two CPUs are part of the same VPE affinity.
>>>
>>> But this is wrong, as the doorbell corresponding to this VPE
>>> is still delivered on the 'old' CPU, which screws up the balancing.
>>> Furthermore, offlining that 'old' CPU results in doorbell interrupts
>>> generated for this VPE being discarded.
>>>
>>> The harsh reality is that we cannot easily elide VMOVP when
>>> a set_affinity request occurs. It needs to be obeyed, and if
>>> an optimisation is to be made, it is at the point where the affinity
>>> change request is made (such as in KVM).
>>>
>>> Drop the VMOVP elision altogether, and only use the vpe_table_mask
>>> to try and stay within the same ITS affinity group if at all possible.
>>>
>>> Fixes: dd3f050a216e (irqchip/gic-v4.1: Implement the v4.1 flavour of VMOVP)
>>> Reported-by: Kunkun Jiang <jiangkunkun@huawei.com>
>>> Signed-off-by: Marc Zyngier <maz@kernel.org>
>>> Cc: stable@vger.kernel.org
>>> ---
>>> drivers/irqchip/irq-gic-v3-its.c | 22 +++++++++++++---------
>>> 1 file changed, 13 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
>>> index 250b4562f308..78aece62bff5 100644
>>> --- a/drivers/irqchip/irq-gic-v3-its.c
>>> +++ b/drivers/irqchip/irq-gic-v3-its.c
>>> @@ -3826,8 +3826,9 @@ static int its_vpe_set_affinity(struct irq_data *d,
>>> bool force)
>>> {
>>> struct its_vpe *vpe = irq_data_get_irq_chip_data(d);
>>> - int from, cpu = cpumask_first(mask_val);
>>> + struct cpumask common;
>>> unsigned long flags;
>>> + int from, cpu;
>>> /*
>>> * Changing affinity is mega expensive, so let's be as lazy as
>>> @@ -3843,19 +3844,22 @@ static int its_vpe_set_affinity(struct irq_data *d,
>>> * taken on any vLPI handling path that evaluates vpe->col_idx.
>>> */
>>> from = vpe_to_cpuid_lock(vpe, &flags);
>>> - if (from == cpu)
>>> - goto out;
>>> -
>>> - vpe->col_idx = cpu;
>>> /*
>>> - * GICv4.1 allows us to skip VMOVP if moving to a cpu whose RD
>>> - * is sharing its VPE table with the current one.
>>> + * If we are offered another CPU in the same GICv4.1 ITS
>>> + * affinity, pick this one. Otherwise, any CPU will do.
>>> */
>>> - if (gic_data_rdist_cpu(cpu)->vpe_table_mask &&
>>> - cpumask_test_cpu(from, gic_data_rdist_cpu(cpu)->vpe_table_mask))
>>> + if (gic_data_rdist_cpu(from)->vpe_table_mask &&
>>> + cpumask_and(&common, mask_val, gic_data_rdist_cpu(from)->vpe_table_mask))
>>> + cpu = cpumask_first(&common);
>>> + else
>>> + cpu = cpumask_first(mask_val);
>>> +
>>> + if (from == cpu)
>>> goto out;
>> The meaning here should be that VMOVP will be skipped
>> only if from and target are the same. Then why do we
>> still to judge whether they are in the same GICv4.1
>> ITS affinity?
> Having the same ITS affinity is a performance optimisation, as this
> minimises traffic between ITSs and keep cache locality. If the
> proposed affinity contains CPUs that are in the same affinity as
> current one, then it makes sense to stay there if possible in order to
> keep the benefit of the warm translation caches.
Ok, I see. In another case, is it possible to skip VMOVP?
The from belongs to common, but it is not the first one.
Kunkun Jiang
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/2] irqchip/gic-v4.1: Fix GICv4.1 doorbell affinity
2024-01-31 1:29 ` Kunkun Jiang
@ 2024-01-31 14:05 ` Marc Zyngier
0 siblings, 0 replies; 12+ messages in thread
From: Marc Zyngier @ 2024-01-31 14:05 UTC (permalink / raw)
To: Kunkun Jiang
Cc: Oliver Upton, James Morse, Suzuki K Poulose, Zenghui Yu,
Thomas Gleixner, linux-arm-kernel, kvmarm, wanghaibin.wang
On Wed, 31 Jan 2024 01:29:40 +0000,
Kunkun Jiang <jiangkunkun@huawei.com> wrote:
>
> Hi Marc,
>
> On 2024/1/30 21:55, Marc Zyngier wrote:
> > On Tue, 30 Jan 2024 13:32:24 +0000,
> > Kunkun Jiang <jiangkunkun@huawei.com> wrote:
> >> Hi Marc,
> >>
> >> On 2024/1/28 21:28, Marc Zyngier wrote:
> >>> On Sat, 27 Jan 2024 02:41:55 +0000,
> >>> Kunkun Jiang <jiangkunkun@huawei.com> wrote:
> >>>>> At this stage, I think the VMOVP optimisation is wrong and that we
> >>>>> should drop it.
> >>>> This is the last resort. Maybe we can think of another way.
> >>> I think the only other way is to never elide the VMOVP from
> >>> set_affinity, but to only call into set_affinity when strictly
> >>> necessary:
> >>>
> >>> - when making the VPE resident on a CPU that isn't part of the same
> >>> affinity group (VPE migration outside of the affinity group)
> >>>
> >>> - when making the VPE non-resident on a CPU that isn't the doorbell
> >>> delivery CPU *AND* that there is a need for a doorbell (VPE
> >>> migration inside the affinity group)
> >>>
> >>> Which means it is the responsibility of the upper layers of the stack
> >>> to make these decisions, and that we can keep the low-level driver as
> >>> simple as possible.
> >>>
> >>> In the meantime, here's the patch I'm proposing to merge ASAP. Please
> >>> let me know if that works for you.
> >> This patch works. But I still have some questions, which I will
> >> write down below.
> >>> M.
> >>>
> >>> >From 00c338e91d3b6600f402f56cdb64c9d370f911c8 Mon Sep 17 00:00:00 2001
> >>> From: Marc Zyngier <maz@kernel.org>
> >>> Date: Sun, 28 Jan 2024 13:05:24 +0000
> >>> Subject: [PATCH] irqchip/gic-v3-its: Fix GICv4.1 VPE affinity update
> >>>
> >>> When updating the affinity of a VPE, we currently skip the VMOVP
> >>> command if the two CPUs are part of the same VPE affinity.
> >>>
> >>> But this is wrong, as the doorbell corresponding to this VPE
> >>> is still delivered on the 'old' CPU, which screws up the balancing.
> >>> Furthermore, offlining that 'old' CPU results in doorbell interrupts
> >>> generated for this VPE being discarded.
> >>>
> >>> The harsh reality is that we cannot easily elide VMOVP when
> >>> a set_affinity request occurs. It needs to be obeyed, and if
> >>> an optimisation is to be made, it is at the point where the affinity
> >>> change request is made (such as in KVM).
> >>>
> >>> Drop the VMOVP elision altogether, and only use the vpe_table_mask
> >>> to try and stay within the same ITS affinity group if at all possible.
> >>>
> >>> Fixes: dd3f050a216e (irqchip/gic-v4.1: Implement the v4.1 flavour of VMOVP)
> >>> Reported-by: Kunkun Jiang <jiangkunkun@huawei.com>
> >>> Signed-off-by: Marc Zyngier <maz@kernel.org>
> >>> Cc: stable@vger.kernel.org
> >>> ---
> >>> drivers/irqchip/irq-gic-v3-its.c | 22 +++++++++++++---------
> >>> 1 file changed, 13 insertions(+), 9 deletions(-)
> >>>
> >>> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> >>> index 250b4562f308..78aece62bff5 100644
> >>> --- a/drivers/irqchip/irq-gic-v3-its.c
> >>> +++ b/drivers/irqchip/irq-gic-v3-its.c
> >>> @@ -3826,8 +3826,9 @@ static int its_vpe_set_affinity(struct irq_data *d,
> >>> bool force)
> >>> {
> >>> struct its_vpe *vpe = irq_data_get_irq_chip_data(d);
> >>> - int from, cpu = cpumask_first(mask_val);
> >>> + struct cpumask common;
> >>> unsigned long flags;
> >>> + int from, cpu;
> >>> /*
> >>> * Changing affinity is mega expensive, so let's be as lazy as
> >>> @@ -3843,19 +3844,22 @@ static int its_vpe_set_affinity(struct irq_data *d,
> >>> * taken on any vLPI handling path that evaluates vpe->col_idx.
> >>> */
> >>> from = vpe_to_cpuid_lock(vpe, &flags);
> >>> - if (from == cpu)
> >>> - goto out;
> >>> -
> >>> - vpe->col_idx = cpu;
> >>> /*
> >>> - * GICv4.1 allows us to skip VMOVP if moving to a cpu whose RD
> >>> - * is sharing its VPE table with the current one.
> >>> + * If we are offered another CPU in the same GICv4.1 ITS
> >>> + * affinity, pick this one. Otherwise, any CPU will do.
> >>> */
> >>> - if (gic_data_rdist_cpu(cpu)->vpe_table_mask &&
> >>> - cpumask_test_cpu(from, gic_data_rdist_cpu(cpu)->vpe_table_mask))
> >>> + if (gic_data_rdist_cpu(from)->vpe_table_mask &&
> >>> + cpumask_and(&common, mask_val, gic_data_rdist_cpu(from)->vpe_table_mask))
> >>> + cpu = cpumask_first(&common);
> >>> + else
> >>> + cpu = cpumask_first(mask_val);
> >>> +
> >>> + if (from == cpu)
> >>> goto out;
> >> The meaning here should be that VMOVP will be skipped
> >> only if from and target are the same. Then why do we
> >> still to judge whether they are in the same GICv4.1
> >> ITS affinity?
> > Having the same ITS affinity is a performance optimisation, as this
> > minimises traffic between ITSs and keep cache locality. If the
> > proposed affinity contains CPUs that are in the same affinity as
> > current one, then it makes sense to stay there if possible in order to
> > keep the benefit of the warm translation caches.
> Ok, I see. In another case, is it possible to skip VMOVP?
> The from belongs to common, but it is not the first one.
You could have something like
if (cpumask_test_cpu(from, &common))
cpu = from;
else
cpu = cpumask_first(&common);
M.
--
Without deviation from the norm, progress is not possible.
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2024-01-31 14:05 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-01-26 10:30 [PATCH v2 0/2] irqchip/gic-v4.1: Two bugfixes about doorbell affinity Kunkun Jiang
2024-01-26 10:30 ` [PATCH v2 1/2] irqchip/gic-v4.1: Fix GICv4.1 " Kunkun Jiang
2024-01-26 10:52 ` Marc Zyngier
2024-01-27 2:41 ` Kunkun Jiang
2024-01-28 13:28 ` Marc Zyngier
2024-01-30 13:32 ` Kunkun Jiang
2024-01-30 13:55 ` Marc Zyngier
2024-01-31 1:29 ` Kunkun Jiang
2024-01-31 14:05 ` Marc Zyngier
2024-01-26 10:30 ` [PATCH v2 2/2] irqchip/gic-v4.1: Fix skipping VMOVP in cpu offline scenario Kunkun Jiang
2024-01-26 10:52 ` Marc Zyngier
2024-01-27 2:59 ` Kunkun Jiang
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).