From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Tue, 25 Jun 2024 08:53:52 +0100
Message-ID: <86wmmdihkf.wl-maz@kernel.org>
From: Marc Zyngier <maz@kernel.org>
To: Nianyao Tang <tangnianyao@huawei.com>
Cc: tglx@linutronix.de,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org,
	guoyang2@huawei.com,
	wangwudi@hisilicon.com
Subject: Re: [RESEND PATCH] irqchip/gic-v4.1: Use the ITS of the NUMA node where current cpu is located
In-Reply-To: <20240625014019.3914240-1-tangnianyao@huawei.com>
References: <20240625014019.3914240-1-tangnianyao@huawei.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII

On Tue, 25 Jun 2024 02:40:19 +0100,
Nianyao Tang <tangnianyao@huawei.com> wrote:
> 
> When GICv4.1 enabled, guest sending IPI use the last ITS reported.
> On multi-NUMA environment with more than one ITS, it makes IPI performance
> various from VM to VM, depending on which NUMA the VM is deployed on.
> We can use closer ITS instead of the last ITS reported.

Closer to *what*? The SGI sender? Or the receiver? Something else?

> 
> Modify find_4_1_its to find the ITS of the NUMA node where current
> cpu is located and save it with per cpu variable.

But find_4_1_its() isn't only used for SGIs. Is it valid to do this
trick for all use cases?

> (There's format issues with the previous patch, resend it)

In the future, please move this sort of comment to a note after the
--- delimiter.

> 
> Signed-off-by: Nianyao Tang <tangnianyao@huawei.com>
> ---
>  drivers/irqchip/irq-gic-v3-its.c | 27 ++++++++++++++++++---------
>  1 file changed, 18 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index 3c755d5dad6e..d35b42f3b2af 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -193,6 +193,8 @@ static DEFINE_RAW_SPINLOCK(vmovp_lock);
>  
>  static DEFINE_IDA(its_vpeid_ida);
>  
> +static DEFINE_PER_CPU(struct its_node *, its_on_cpu);

I don't really get the "its_on_cpu" name. "local_its" would at least
indicate a notion of being "close".

> +
>  #define gic_data_rdist()		(raw_cpu_ptr(gic_rdists->rdist))
>  #define gic_data_rdist_cpu(cpu)		(per_cpu_ptr(gic_rdists->rdist, cpu))
>  #define gic_data_rdist_rd_base()	(gic_data_rdist()->rd_base)
> @@ -4058,19 +4060,25 @@ static struct irq_chip its_vpe_irq_chip = {
>  
>  static struct its_node *find_4_1_its(void)
>  {
> -	static struct its_node *its = NULL;
> +	struct its_node *its = NULL;
> +	struct its_node *its_non_cpu_node = NULL;
> +	int cpu = smp_processor_id();
>  
> -	if (!its) {
> -		list_for_each_entry(its, &its_nodes, entry) {
> -			if (is_v4_1(its))
> -				return its;
> -		}
> +	if (per_cpu(its_on_cpu, cpu))
> +		return per_cpu(its_on_cpu, cpu);
>  
> -		/* Oops? */
> -		its = NULL;
> -	}
> +	list_for_each_entry(its, &its_nodes, entry) {
> +		if (is_v4_1(its) && its->numa_node == cpu_to_node(cpu)) {
> +			per_cpu(its_on_cpu, cpu) = its;
> +			return its;
> +		} else if (is_v4_1(its))
> +			its_non_cpu_node = its;
> +	}

Why do you consider the NUMA node instead of the ITS' own affinity?
SVPET gives you some notion of distance with the RDs, and that'd
probably be useful.

>  
> -	return its;
> +	if (!per_cpu(its_on_cpu, cpu) && its_non_cpu_node)
> +		per_cpu(its_on_cpu, cpu) = its_non_cpu_node;
> +
> +	return its_non_cpu_node;
>  }

Urgh. Mixing init and runtime is awful. Why isn't this initialised
when a CPU comes up? We already have all the infrastructure.

But the biggest question is "what sort of performance improvement does
this bring"? You give no numbers, no way to evaluate anything. I've
asked for that time and time again: if your changes are claiming a
performance improvement, please back it up. It's not that hard.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
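
On the "mixing init and runtime" point, here is a minimal userspace
sketch of the split being asked for: choose the ITS once when a CPU
comes online, and keep the runtime lookup as a plain per-CPU read. It
is a toy model, not kernel code and not a proposed patch. The names
struct its_model, cpu_node[], model_cpu_online() and
model_find_4_1_its() are hypothetical, the array stands in for a
per-CPU variable, and selection by NUMA node only mirrors the quoted
patch (the review suggests the ITS' own affinity may be a better
criterion). Falling back to the first v4.1 ITS found on another node
is an arbitrary choice made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 4	/* hypothetical: size of the toy machine */

/* Toy stand-in for struct its_node; only the fields the sketch needs. */
struct its_model {
	int  numa_node;
	bool is_v4_1;
};

/* Hypothetical CPU -> NUMA node map, standing in for cpu_to_node(). */
static const int cpu_node[NR_CPUS] = { 0, 0, 1, 1 };

/* Stand-in for a per-CPU variable: one slot per CPU, written at bring-up. */
static const struct its_model *local_its[NR_CPUS];

/* Init path: run once when a CPU comes online. */
static void model_cpu_online(int cpu, const struct its_model *its, size_t n)
{
	const struct its_model *fallback = NULL;

	assert(cpu >= 0 && cpu < NR_CPUS);
	local_its[cpu] = NULL;

	for (size_t i = 0; i < n; i++) {
		if (!its[i].is_v4_1)
			continue;
		if (its[i].numa_node == cpu_node[cpu]) {
			/* A v4.1 ITS on this CPU's node: take it. */
			local_its[cpu] = &its[i];
			return;
		}
		if (!fallback)
			fallback = &its[i];	/* first v4.1 ITS elsewhere */
	}

	/* No local ITS: use the fallback, or NULL if there is no v4.1 ITS. */
	local_its[cpu] = fallback;
}

/* Runtime path: no list walk and no lazy initialisation, just a read. */
static const struct its_model *model_find_4_1_its(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return local_its[cpu];
}
```

Calling model_cpu_online() once per CPU from a bring-up hook and
model_find_4_1_its(cpu) afterwards keeps the list walk off the hot
path, which is the separation the review asks about.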