Date: Fri, 10 Oct 2025 10:48:15 +0100
Message-ID: <86h5w7xf34.wl-maz@kernel.org>
From: Marc Zyngier
To: Ulf Hansson
Cc: "Rafael J . Wysocki", Catalin Marinas, Will Deacon, Mark Rutland,
 Thomas Gleixner, Maulik Shah, Sudeep Holla, Daniel Lezcano,
 Vincent Guittot, linux-pm@vger.kernel.org,
 linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/3] arm64: smp: Implement cpus_has_pending_ipi()
References: <20251003150251.520624-1-ulf.hansson@linaro.org>
 <20251003150251.520624-3-ulf.hansson@linaro.org>
 <865xcsyqgs.wl-maz@kernel.org>
X-Mailing-List: linux-pm@vger.kernel.org

On Fri, 10 Oct 2025 09:30:11 +0100,
Ulf Hansson wrote:
>
> On Mon, 6 Oct 2025 at 17:55,
> Marc Zyngier wrote:
> >
> > On Fri, 03 Oct 2025 16:02:44 +0100,
> > Ulf Hansson wrote:
> > >
> > > To add support for keeping track of whether there may be a pending IPI
> > > scheduled for a CPU or a group of CPUs, let's implement
> > > cpus_has_pending_ipi() for arm64.
> > >
> > > Note, the implementation is intentionally lightweight and doesn't use any
> > > additional lock. This is good enough for cpuidle based decisions.
> > >
> > > Signed-off-by: Ulf Hansson
> > > ---
> > >  arch/arm64/kernel/smp.c | 20 ++++++++++++++++++++
> > >  1 file changed, 20 insertions(+)
> > >
> > > diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> > > index 68cea3a4a35c..dd1acfa91d44 100644
> > > --- a/arch/arm64/kernel/smp.c
> > > +++ b/arch/arm64/kernel/smp.c
> > > @@ -55,6 +55,8 @@
> > >
> > >  #include
> > >
> > > +static DEFINE_PER_CPU(bool, pending_ipi);
> > > +
> > >  /*
> > >   * as from 2.5, kernels no longer have an init_tasks structure
> > >   * so we need some other way of telling a new secondary core
> > > @@ -1012,6 +1014,8 @@ static void do_handle_IPI(int ipinr)
> > >
> > >  	if ((unsigned)ipinr < NR_IPI)
> > >  		trace_ipi_exit(ipi_types[ipinr]);
> > > +
> > > +	per_cpu(pending_ipi, cpu) = false;
> > >  }
> > >
> > >  static irqreturn_t ipi_handler(int irq, void *data)
> > > @@ -1024,10 +1028,26 @@ static irqreturn_t ipi_handler(int irq, void *data)
> > >
> > >  static void smp_cross_call(const struct cpumask *target, unsigned int ipinr)
> > >  {
> > > +	unsigned int cpu;
> > > +
> > > +	for_each_cpu(cpu, target)
> > > +		per_cpu(pending_ipi, cpu) = true;
> > > +
> >
> > Why isn't all of this part of the core IRQ management? We already
> > track things like timers, I assume for similar reasons. If IPIs have
> > to be singled out, I'd rather this is done in common code, and not on
> > a per architecture basis.
>
> The idea was to start simple, avoid running code for architectures
> that don't seem to need it, by using this opt-in and lightweight
> approach.

If this stuff is remotely useful, then it is useful to everyone, and I
don't see the point in littering the arch code with it. We have plenty
of buy-in features that can be selected by an architecture and ignored
by others if they see fit.

>
> I guess we could do this in generic IRQ code too. Perhaps making it
> conditional behind a Kconfig, if required.
>
> >
> > >  	trace_ipi_raise(target, ipi_types[ipinr]);
> > >  	arm64_send_ipi(target, ipinr);
> > >  }
> > >
> > > +bool cpus_has_pending_ipi(const struct cpumask *mask)
> > > +{
> > > +	unsigned int cpu;
> > > +
> > > +	for_each_cpu(cpu, mask) {
> > > +		if (per_cpu(pending_ipi, cpu))
> > > +			return true;
> > > +	}
> > > +	return false;
> > > +}
> > > +
> >
> > The lack of memory barriers makes me wonder how reliable this is.
> > Maybe this is relying on the IPIs themselves acting as such, but
> > that's extremely racy no matter how you look at it.
>
> It's deliberately lightweight. I am worried about introducing
> locking/barriers, as those could be costly and introduce latencies in
> these paths.

"I've made this car 10% faster by removing the brakes. It's great! Try
it!"

> Still this is good enough to significantly improve cpuidle based
> decisions in this regard. Please have a look at the commit message of
> patch3.

If I can't see how this thing is *correct*, I really don't care how
fast it is. You might as well remove most locks and barriers from the
kernel -- it will be even faster!

	M.

-- 
Without deviation from the norm, progress is not possible.
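
To make the "extremely racy" concern concrete, here is a minimal
user-space sketch of one interleaving the quoted patch appears to allow:
a second IPI is raised while the first is still being handled, and the
handler's final store clears the flag although an IPI is still
outstanding. All `model_` names and the in-flight counter are
hypothetical, invented for this illustration; they are not kernel APIs.
The sketch replays the interleaving sequentially instead of running real
CPUs, and it covers only the lost update, not the separate
memory-ordering question.

```c
#include <stdbool.h>

#define MODEL_NR_CPUS 4

/* Stand-in for the patch's per-CPU pending_ipi flag (hypothetical model). */
static bool model_pending_ipi[MODEL_NR_CPUS];

/* Ground truth the flag approximates: IPIs raised but not yet handled. */
static int model_in_flight[MODEL_NR_CPUS];

/* Mirrors smp_cross_call() in the patch: set the flag, then raise the IPI. */
static void model_cross_call(int cpu)
{
	model_pending_ipi[cpu] = true;
	model_in_flight[cpu]++;
}

/* Mirrors do_handle_IPI() in the patch: handle one IPI, then clear the flag. */
static void model_handle_ipi(int cpu)
{
	model_in_flight[cpu]--;
	model_pending_ipi[cpu] = false;
}

/* Mirrors cpus_has_pending_ipi() for a single CPU. */
static bool model_has_pending_ipi(int cpu)
{
	return model_pending_ipi[cpu];
}

/*
 * Replays: sender A raises an IPI, sender B raises another before the
 * first handler has finished, then the first handler completes.
 * Returns true if the flag reports "nothing pending" while an IPI is
 * still in flight.
 */
static bool model_replay_lost_update(int cpu)
{
	model_pending_ipi[cpu] = false;
	model_in_flight[cpu] = 0;

	model_cross_call(cpu);	/* sender A */
	model_cross_call(cpu);	/* sender B, flag is already true */
	model_handle_ipi(cpu);	/* first IPI handled, flag cleared */

	return model_in_flight[cpu] > 0 && !model_has_pending_ipi(cpu);
}
```

In this model, model_replay_lost_update(0) returns true: the flag reads
false while one IPI is still outstanding. Whether that window is
acceptable for a cpuidle heuristic is exactly what the thread is
arguing about; the sketch only shows that the flag can under-report.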