From: Peter Zijlstra
Subject: Re: [PATCH 00/36] cpuidle,rcu: Cleanup the mess
Date: Tue, 14 Jun 2022 18:58:30 +0200
References: <20220608142723.103523089@infradead.org>
To: Mark Rutland
Cc: juri.lelli@redhat.com, rafael@kernel.org, benh@kernel.crashing.org,
    linus.walleij@linaro.org, bsegall@google.com, guoren@kernel.org,
    pavel@ucw.cz, agordeev@linux.ibm.com, linux-arch@vger.kernel.org,
    vincent.guittot@linaro.org, mpe@ellerman.id.au, chenhuacai@kernel.org,
    linux-acpi@vger.kernel.org, agross@kernel.org, geert@linux-m68k.org,
    linux-imx@nxp.com, catalin.marinas@arm.com,
    xen-devel@lists.xenproject.org, mattst88@gmail.com,
    mturquette@baylibre.com, sammy@sammy.net, pmladek@suse.com,
    linux-pm@vger.kernel.org, jiangshanlai@gmail.com, Sascha Hauer ,
    linux-um@lists.infradead.org, acme@kernel.org, tglx@linutronix.de,
    linux-omap@vger.kernel.org, dietmar.eggemann@arm.com, rth@twiddle.net,
    gregkh@linuxfoundation.org, linux-kernel@vger.kernel.org,
    linux-perf-users@vger.kernel.org, senozhatsky@chromium.org,
    svens@linux.ibm.com, jolsa@kernel.org, paul

On Tue, Jun 14, 2022 at 12:19:29PM +0100, Mark Rutland wrote:
> On Wed, Jun 08, 2022 at 04:27:23PM +0200, Peter Zijlstra wrote:
> > Hi All! (omg so many)
>
> Hi Peter,
>
> Sorry for the delay; my plate has also been rather full recently. I'm beginning
> to page this in now.

No worries; we all have too much to do ;-)

> > These here few patches mostly clear out the utter mess that is cpuidle vs rcuidle.
> >
> > At the end of the ride there's only 2 real RCU_NONIDLE() users left
> >
> >   arch/arm64/kernel/suspend.c: RCU_NONIDLE(__cpu_suspend_exit());
> >   drivers/perf/arm_pmu.c: RCU_NONIDLE(armpmu_start(event, PERF_EF_RELOAD));
>
> The latter of these is necessary because apparently PM notifiers are called
> with RCU not watching. Is that still the case today (or at the end of this
> series)? If so, that feels like fertile land for more issues (yaey...). If not,
> we should be able to drop this.

That should be fixed; fingers crossed :-)

> >   kernel/cfi.c: RCU_NONIDLE({
> >
> > (the CFI one is likely dead in the kCFI rewrite) and there's only a hand full
> > of trace_.*_rcuidle() left:
> >
> >   kernel/trace/trace_preemptirq.c: trace_irq_enable_rcuidle(CALLER_ADDR0, CALLER_ADDR1);
> >   kernel/trace/trace_preemptirq.c: trace_irq_disable_rcuidle(CALLER_ADDR0, CALLER_ADDR1);
> >   kernel/trace/trace_preemptirq.c: trace_irq_enable_rcuidle(CALLER_ADDR0, caller_addr);
> >   kernel/trace/trace_preemptirq.c: trace_irq_disable_rcuidle(CALLER_ADDR0, caller_addr);
> >   kernel/trace/trace_preemptirq.c: trace_preempt_enable_rcuidle(a0, a1);
> >   kernel/trace/trace_preemptirq.c: trace_preempt_disable_rcuidle(a0, a1);
> >
> > All of them are in 'deprecated' code that is unused for GENERIC_ENTRY.
>
> I think those are also unused on arm64 too?
>
> If not, I can go attack that.

My grep spots:

  arch/arm64/kernel/entry-common.c: trace_hardirqs_on();
  arch/arm64/include/asm/daifflags.h: trace_hardirqs_off();
  arch/arm64/include/asm/daifflags.h: trace_hardirqs_off();

The _on thing should be replaced with something like:

	trace_hardirqs_on_prepare();
	lockdep_hardirqs_on_prepare();
	instrumentation_end();
	rcu_irq_exit();
	lockdep_hardirqs_on(CALLER_ADDR0);

(as I think you know, since you have some of that already). And
something similar for the _off thing, but with _off_finish().

> > I've touched a _lot_ of code that I can't test and likely broken some of it :/
> > In particular, the whole ARM cpuidle stuff was quite involved with OMAP being
> > the absolute 'winner'.
> >
> > I'm hoping Mark can help me sort the remaining ARM64 bits as he moves that to
> > GENERIC_ENTRY.
>
> Moving to GENERIC_ENTRY as a whole is going to take a tonne of work
> (refactoring both arm64 and the generic portion to be more amenable to each
> other), but we can certainly move closer to that for the bits that matter here.

I know ... been there etc.. :-)

> Maybe we want a STRICT_ENTRY option to get rid of all the deprecated stuff that
> we can select regardless of GENERIC_ENTRY to make that easier.

Possible yeah.

> > I've also got a note that says ARM64 can probably do a WFE based
> > idle state and employ TIF_POLLING_NRFLAG to avoid some IPIs.
>
> Possibly; I'm not sure how much of a win that'll be given that by default we'll
> have a ~10KHz WFE wakeup from the timer, but we could take a peek.

Ohh.. I didn't know it woke up *that* often. I just know Will made use
of it in things like smp_cond_load_relaxed() which would be somewhat
similar to a very shallow idle state that looks at the TIF word.
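
For reference, the "_off thing" mentioned above is only described in words. A rough sketch of what the mirrored sequence might look like follows; the ordering is an assumption modelled on reversing the _on sequence, not something spelled out in this mail. The kernel functions are replaced by userspace stubs that merely record the call order, and sketch_irqs_off(), record() and call_log are hypothetical names introduced for illustration only.

```c
#include <assert.h>
#include <string.h>

/* Userspace stand-ins: each stub only appends its own name to call_log. */
static char call_log[256];

static void record(const char *name)
{
	/* Guard against overflowing the fixed-size log buffer. */
	assert(strlen(call_log) + strlen(name) + 2 <= sizeof(call_log));
	strcat(call_log, name);
	strcat(call_log, ";");
}

static void lockdep_hardirqs_off(unsigned long ip) { (void)ip; record("lockdep_hardirqs_off"); }
static void rcu_irq_enter(void)             { record("rcu_irq_enter"); }
static void instrumentation_begin(void)     { record("instrumentation_begin"); }
static void trace_hardirqs_off_finish(void) { record("trace_hardirqs_off_finish"); }
static void instrumentation_end(void)       { record("instrumentation_end"); }

/*
 * Hypothetical helper, assumed ordering: tell lockdep first, make RCU
 * watch, and only then run the instrumentable tracing part.
 */
static void sketch_irqs_off(unsigned long caller)
{
	lockdep_hardirqs_off(caller);
	rcu_irq_enter();
	instrumentation_begin();
	trace_hardirqs_off_finish();
	instrumentation_end();
}
```

The point of the ordering is that the tracepoint only runs once RCU is watching, which is what removes the need for the _rcuidle() variants; in the real kernel the caller argument would be CALLER_ADDR0.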