From: Tangnianyao <tangnianyao@huawei.com>
To: Marc Zyngier <maz@kernel.org>
Cc: <tglx@linutronix.de>, <linux-arm-kernel@lists.infradead.org>,
<linux-kernel@vger.kernel.org>, <guoyang2@huawei.com>,
<wangwudi@hisilicon.com>
Subject: Re: [RESEND PATCH] irqchip/gic-v4.1: Use the ITS of the NUMA node where current cpu is located
Date: Wed, 26 Jun 2024 10:22:52 +0800
Message-ID: <60de5bd6-51db-e327-5808-280407a6285d@huawei.com>
In-Reply-To: <86wmmdihkf.wl-maz@kernel.org>
On 6/25/2024 15:53, Marc Zyngier wrote:
> On Tue, 25 Jun 2024 02:40:19 +0100,
> Nianyao Tang <tangnianyao@huawei.com> wrote:
>> When GICv4.1 enabled, guest sending IPI use the last ITS reported.
>> On multi-NUMA environment with more than one ITS, it makes IPI performance
>> various from VM to VM, depending on which NUMA the VM is deployed on.
>> We can use closer ITS instead of the last ITS reported.
> Closer to *what*? the SGI sender? or the receiver? Something else?
Closer to the VSGI sender.
The VSGI sender uses the original find_4_1_its() to inject a VSGI, and that function
always finds the last reported 4_1 ITS, regardless of which NUMA node the sending
CPU is located on.
>
>> Modify find_4_1_its to find the ITS of the NUMA node where current
>> cpu is located and save it with per cpu variable.
> But find_4_1_its() isn't only used for SGIs. Is it valid to do this
> trick for all use cases?
To cover that case, I kept the behaviour of the original find_4_1_its(): it still
finds and returns a 4_1 ITS in the system, even when the NUMA node does not match.
Would that be enough to stay compatible with the other callers?
The new find_4_1_its() first selects a 4_1 ITS on the same NUMA node as the current
CPU (the VSGI sender) and, if it fails to find one, returns a 4_1 ITS on another
NUMA node.
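A minimal userspace sketch of that selection order, not the driver code itself. The struct its_sketch type and the pick_4_1_its() helper are hypothetical stand-ins for struct its_node and the list walk over its_nodes; only the "local node first, any 4_1 ITS otherwise" logic is taken from the patch.

```c
#include <stddef.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct its_node; only the fields used here. */
struct its_sketch {
	bool is_v4_1;	/* would be is_v4_1(its) in the driver */
	int numa_node;	/* would be its->numa_node */
};

/*
 * Prefer a 4_1 ITS on @node. If none is found, fall back to the last
 * 4_1 ITS seen in table order. Returns NULL if no entry is 4_1 capable.
 */
static const struct its_sketch *
pick_4_1_its(const struct its_sketch *tbl, size_t n, int node)
{
	const struct its_sketch *fallback = NULL;

	for (size_t i = 0; i < n; i++) {
		if (!tbl[i].is_v4_1)
			continue;
		if (tbl[i].numa_node == node)
			return &tbl[i];	/* local ITS: best choice */
		fallback = &tbl[i];	/* remote 4_1 ITS: keep as fallback */
	}
	return fallback;
}
```

Callers that do not care about locality still get a usable 4_1 ITS whenever one exists, which is the compatibility property asked about above.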
>
>> (There's format issues with the previous patch, resend it)
> In the future, please move this sort of comment to a note after the
> --- delimiter.
OK, got it.
>
>> Signed-off-by: Nianyao Tang <tangnianyao@huawei.com>
>> ---
>> drivers/irqchip/irq-gic-v3-its.c | 27 ++++++++++++++++++---------
>> 1 file changed, 18 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
>> index 3c755d5dad6e..d35b42f3b2af 100644
>> --- a/drivers/irqchip/irq-gic-v3-its.c
>> +++ b/drivers/irqchip/irq-gic-v3-its.c
>> @@ -193,6 +193,8 @@ static DEFINE_RAW_SPINLOCK(vmovp_lock);
>>
>> static DEFINE_IDA(its_vpeid_ida);
>>
>> +static DEFINE_PER_CPU(struct its_node *, its_on_cpu);
> I don't really get the "its_on_cpu" name. "local_its" would at least
> indicate a notion being "close".
I meant the ITS on the current CPU's NUMA node.
Yes, "local_its" is better.
>
>> +
>> #define gic_data_rdist() (raw_cpu_ptr(gic_rdists->rdist))
>> #define gic_data_rdist_cpu(cpu) (per_cpu_ptr(gic_rdists->rdist, cpu))
>> #define gic_data_rdist_rd_base() (gic_data_rdist()->rd_base)
>> @@ -4058,19 +4060,25 @@ static struct irq_chip its_vpe_irq_chip = {
>>
>> static struct its_node *find_4_1_its(void)
>> {
>> - static struct its_node *its = NULL;
>> + struct its_node *its = NULL;
>> + struct its_node *its_non_cpu_node = NULL;
>> + int cpu = smp_processor_id();
>>
>> - if (!its) {
>> - list_for_each_entry(its, &its_nodes, entry) {
>> - if (is_v4_1(its))
>> - return its;
>> - }
>> + if (per_cpu(its_on_cpu, cpu))
>> + return per_cpu(its_on_cpu, cpu);
>>
>> - /* Oops? */
>> - its = NULL;
>> - }
>> + list_for_each_entry(its, &its_nodes, entry) {
>> + if (is_v4_1(its) && its->numa_node == cpu_to_node(cpu)) {
>> + per_cpu(its_on_cpu, cpu) = its;
>> + return its;
>> + } else if (is_v4_1(its))
>> + its_non_cpu_node = its;
>> + }
> Why do you consider the NUMA node instead of the ITS' own affinity?
> SVPET gives you some notion of distance with the RDs, and that'd
> probably be useful.
I assumed the BIOS reports NUMA nodes that follow the real topology, so I used the
NUMA node for simplicity.
>
>>
>> - return its;
>> + if (!per_cpu(its_on_cpu, cpu) && its_non_cpu_node)
>> + per_cpu(its_on_cpu, cpu) = its_non_cpu_node;
>> +
>> + return its_non_cpu_node;
>> }
> Urgh. Mixing init and runtime is awful. Why isn't this initialised
> when a CPU comes up? We already have all the infrastructure.
The original find_4_1_its() uses "static struct its_node *its" to cache the 4_1 ITS,
and initialises it inside the function itself. I followed that pattern and tried not
to change this usage.
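For comparison, a rough userspace sketch of the split suggested above, where the per-CPU slot is filled once when a CPU comes up and the lookup becomes a plain read. All names here (NR_CPUS_SKETCH, struct its_sketch, local_its_cpu_init(), find_local_its()) are hypothetical; in the driver the init step would presumably hang off the existing CPU bring-up path, which is not shown.

```c
#include <stddef.h>

#define NR_CPUS_SKETCH 4	/* hypothetical, for illustration only */

/* Hypothetical stand-in for struct its_node. */
struct its_sketch {
	int is_v4_1;
	int numa_node;
};

/* Stand-in for the per-CPU "local_its" pointer. */
static const struct its_sketch *local_its[NR_CPUS_SKETCH];

/*
 * Init path: would run once from a CPU bring-up callback.
 * Picks a 4_1 ITS on @node if there is one, else any 4_1 ITS.
 */
static void local_its_cpu_init(int cpu, int node,
			       const struct its_sketch *tbl, size_t n)
{
	const struct its_sketch *pick = NULL;

	for (size_t i = 0; i < n; i++) {
		if (!tbl[i].is_v4_1)
			continue;
		pick = &tbl[i];
		if (tbl[i].numa_node == node)
			break;	/* local match, stop searching */
	}
	local_its[cpu] = pick;
}

/* Runtime path: a plain read, no list walk and no lazy initialisation. */
static const struct its_sketch *find_local_its(int cpu)
{
	return local_its[cpu];
}
```

With this shape the runtime path no longer writes to the cache, so init and lookup are no longer mixed in one function.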
>
> But the biggest question is "what sort of performance improvement does
> this bring"? You give no numbers, no way to evaluate anything.
>
> I've asked for that times and times again: if your changes are
> claiming a performance improvement, please back it up. It's not that
> hard.
The test environment is a 2-socket system reported as 2 NUMA nodes, each socket
with one ITS and 32 CPUs, with GICv4.1 enabled.
To measure performance, I deployed a 4U8G guest with all 4 vCPUs on the same socket.
With the guest on socket0, the kvm-unit-tests ipi_hw result is 850ns. The test
measures the delay from one vCPU sending an IPI to another vCPU receiving it in
the guest.
With the guest on socket1, the result is 750ns.
The reason is that the VSGI sender always uses the last reported ITS to inject the
VSGI. An access from a CPU to the ITS on the other socket costs 100ns more than an
access to the local ITS.
>
> Thanks,
>
> M.
>