Re: [RESEND PATCH] irqchip/gic-v4.1: Use the ITS of the NUMA node where current cpu is located

From: Marc Zyngier <maz@kernel.org>
To: Nianyao Tang <tangnianyao@huawei.com>
Cc: <tglx@linutronix.de>, <linux-arm-kernel@lists.infradead.org>,
	<linux-kernel@vger.kernel.org>, <guoyang2@huawei.com>,
	<wangwudi@hisilicon.com>
Subject: Re: [RESEND PATCH] irqchip/gic-v4.1: Use the ITS of the NUMA node where current cpu is located
Date: Tue, 25 Jun 2024 08:53:52 +0100
Message-ID: <86wmmdihkf.wl-maz@kernel.org>
In-Reply-To: <20240625014019.3914240-1-tangnianyao@huawei.com>

On Tue, 25 Jun 2024 02:40:19 +0100,
Nianyao Tang <tangnianyao@huawei.com> wrote:
> 
> When GICv4.1 enabled, guest sending IPI use the last ITS reported.
> On multi-NUMA environment with more than one ITS, it makes IPI performance
> various from VM to VM, depending on which NUMA the VM is deployed on.
> We can use closer ITS instead of the last ITS reported.

Closer to *what*? The SGI sender? The receiver? Something else?

> 
> Modify find_4_1_its to find the ITS of the NUMA node where current
> cpu is located and save it with per cpu variable.

But find_4_1_its() isn't only used for SGIs. Is it valid to do this
trick for all use cases?

> (There's format issues with the previous patch, resend it)

In the future, please move this sort of comment to a note after the
--- delimiter.

> 
> Signed-off-by: Nianyao Tang <tangnianyao@huawei.com>
> ---
>  drivers/irqchip/irq-gic-v3-its.c | 27 ++++++++++++++++++---------
>  1 file changed, 18 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index 3c755d5dad6e..d35b42f3b2af 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -193,6 +193,8 @@ static DEFINE_RAW_SPINLOCK(vmovp_lock);
>  
>  static DEFINE_IDA(its_vpeid_ida);
>  
> +static DEFINE_PER_CPU(struct its_node *, its_on_cpu);

I don't really get the "its_on_cpu" name. "local_its" would at least
indicate a notion of being "close".

> +
>  #define gic_data_rdist()		(raw_cpu_ptr(gic_rdists->rdist))
>  #define gic_data_rdist_cpu(cpu)		(per_cpu_ptr(gic_rdists->rdist, cpu))
>  #define gic_data_rdist_rd_base()	(gic_data_rdist()->rd_base)
> @@ -4058,19 +4060,25 @@ static struct irq_chip its_vpe_irq_chip = {
>  
>  static struct its_node *find_4_1_its(void)
>  {
> -	static struct its_node *its = NULL;
> +	struct its_node *its = NULL;
> +	struct its_node *its_non_cpu_node = NULL;
> +	int cpu = smp_processor_id();
>  
> -	if (!its) {
> -		list_for_each_entry(its, &its_nodes, entry) {
> -			if (is_v4_1(its))
> -				return its;
> -		}
> +	if (per_cpu(its_on_cpu, cpu))
> +		return per_cpu(its_on_cpu, cpu);
>  
> -		/* Oops? */
> -		its = NULL;
> -	}
> +	list_for_each_entry(its, &its_nodes, entry) {
> +		if (is_v4_1(its) && its->numa_node == cpu_to_node(cpu)) {
> +			per_cpu(its_on_cpu, cpu) = its;
> +			return its;
> +		} else if (is_v4_1(its))
> +			its_non_cpu_node = its;
> +	}

Why do you consider the NUMA node instead of the ITS' own affinity?
SVPET gives you some notion of distance with the RDs, and that'd
probably be useful.

>  
> -	return its;
> +	if (!per_cpu(its_on_cpu, cpu) && its_non_cpu_node)
> +		per_cpu(its_on_cpu, cpu) = its_non_cpu_node;
> +
> +	return its_non_cpu_node;
>  }

Urgh. Mixing init and runtime is awful. Why isn't this initialised
when a CPU comes up? We already have all the infrastructure.
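
To illustrate the init/runtime split, here is a minimal userspace
sketch, not kernel code. Every name in it (its_model, its_table,
cpu_node, model_cpu_online, model_find_4_1_its) is hypothetical and
only stands in for the ITS list, cpu_to_node() and the CPU bring-up
path. It keeps the NUMA-node match from the patch purely for the sake
of the example; whether that is the right criterion is a separate
question. The fallback to the first v4.1 ITS is also an assumption.

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS_MODEL 4	/* hypothetical CPU count for this model */

/* Hypothetical stand-in for struct its_node */
struct its_model {
	int numa_node;
	bool v4_1;
};

/* Hypothetical stand-in for the its_nodes list */
static struct its_model its_table[] = {
	{ .numa_node = 0, .v4_1 = true  },
	{ .numa_node = 1, .v4_1 = false },
	{ .numa_node = 1, .v4_1 = true  },
};

/* Hypothetical stand-in for cpu_to_node() */
static const int cpu_node[NR_CPUS_MODEL] = { 0, 0, 1, 2 };

/* Models a per-CPU "local_its" pointer */
static struct its_model *local_its[NR_CPUS_MODEL];

/* Init path: run once for each CPU as it comes up */
static void model_cpu_online(int cpu)
{
	struct its_model *fallback = NULL;
	size_t i;

	local_its[cpu] = NULL;

	for (i = 0; i < sizeof(its_table) / sizeof(its_table[0]); i++) {
		struct its_model *its = &its_table[i];

		if (!its->v4_1)
			continue;

		if (its->numa_node == cpu_node[cpu]) {
			local_its[cpu] = its;
			return;
		}

		/* Assumption: fall back to the first v4.1 ITS seen */
		if (!fallback)
			fallback = its;
	}

	local_its[cpu] = fallback;
}

/* Runtime path: a plain lookup, no list walk and no lazy init */
static struct its_model *model_find_4_1_its(int cpu)
{
	return local_its[cpu];
}
```

With that shape, the runtime lookup never has to decide whether the
per-CPU pointer has been populated yet.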

But the biggest question is "what sort of performance improvement does
this bring"? You give no numbers, no way to evaluate anything.

I've asked for this time and time again: if your changes claim a
performance improvement, please back that claim up with numbers. It's
not that hard.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
