From mboxrd@z Thu Jan 1 00:00:00 1970
From: Thomas Gleixner
To: Mark Rutland
Cc: Peter Zijlstra, linux-arm-kernel@lists.infradead.org,
 ada.coupriediaz@arm.com, catalin.marinas@arm.com,
 linux-kernel@vger.kernel.org, luto@kernel.org, ruanjinjie@huawei.com,
 vladimir.murzin@arm.com, will@kernel.org
Subject: Re: [PATCH 1/2] arm64/entry: Fix involuntary preemption exception masking
In-Reply-To:
References: <20260320113026.3219620-1-mark.rutland@arm.com>
 <20260320113026.3219620-2-mark.rutland@arm.com>
 <20260320130433.GV3738786@noisy.programming.kicks-ass.net>
 <87h5qak2uv.ffs@tglx>
Date: Fri, 20 Mar 2026 16:50:03 +0100
Message-ID: <875x6qjyac.ffs@tglx>
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain

On Fri, Mar 20 2026 at 14:57, Mark Rutland wrote:
> On Fri, Mar 20, 2026 at 03:11:20PM +0100, Thomas Gleixner wrote:
>> Yes. It's not an optimization. It's a correctness issue.
>>
>> If the interrupted context is RCU idle then you have to carefully go
>> back to that context. So that the context can tell RCU it is done with
>> the idle state and RCU has to pay attention again. Otherwise all of this
>> becomes imbalanced.
>>
>> This is about context-level nesting:
>>
>>      ...
>> L1.A ct_cpuidle_enter();
>>
>>      -> interrupt
>> L2.A ct_irq_enter();
>>      ... // Set NEED_RESCHED
>> L2.B ct_irq_exit();
>>
>>      ...
>> L1.B ct_cpuidle_exit();
>>
>> Scheduling between #L2.B and #L1.B makes RCU rightfully upset.
>
> I suspect I'm missing something obvious here:
>
> * Regardless of nesting, I see that scheduling between L2.B and L1.B is
>   broken because RCU isn't watching.
>
> * I'm not sure whether there's a problem with scheduling between L2.A
>   and L2.B, which is what arm64 used to do, and what arm64 would do
>   after this patch.

The only reason why it "works" is that the idle task has preemption
permanently disabled, so it won't really schedule even if need_resched()
is set. So it "works" by chance and not by design.

Apply the patch below and watch the show.

> Thanks for all of this. Even if I'm confused right now, it's very
> helpful!

RCU induced confusion is perfectly normal. Everyone suffers from that at
some point. Welcome to the club.

Thanks,

        tglx
---
--- a/kernel/entry/common.c
+++ b/kernel/entry/common.c
@@ -187,9 +187,10 @@ static inline bool arch_irqentry_exit_ne
 
 void raw_irqentry_exit_cond_resched(void)
 {
+	rcu_irq_exit_check_preempt();
+
 	if (!preempt_count()) {
 		/* Sanity check RCU and thread stack */
-		rcu_irq_exit_check_preempt();
 		if (IS_ENABLED(CONFIG_DEBUG_ENTRY))
 			WARN_ON_ONCE(!on_thread_stack());
 		if (need_resched() && arch_irqentry_exit_need_resched())
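
The "by chance" versus "by design" distinction above can be made concrete
with a small userspace toy model. This is only a sketch under stated
assumptions: every toy_* name is hypothetical, a single boolean stands in
for "RCU is watching", and none of it is the kernel's actual context
tracking or entry code. It walks the L1.A .. L1.B sequence from the mail
and compares a resched check that relies only on the preempt count with
one that also remembers whether the interrupted context was RCU idle.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for "is RCU watching this CPU?" (not the kernel's state). */
static bool toy_rcu_watching = true;

/* Hypothetical record of whether interrupt entry had to wake RCU up. */
struct toy_irq_state {
	bool exit_rcu;
};

/* L1.A / L1.B: the idle loop tells RCU to stop and to resume watching. */
static void toy_cpuidle_enter(void) { toy_rcu_watching = false; }
static void toy_cpuidle_exit(void)  { toy_rcu_watching = true; }

/* L2.A: make RCU watch and note whether the interrupted context was idle. */
static struct toy_irq_state toy_irq_enter(void)
{
	struct toy_irq_state st = { .exit_rcu = !toy_rcu_watching };

	toy_rcu_watching = true;
	return st;
}

/* L2.B: hand the interrupted context back in the state it was found in. */
static void toy_irq_exit(struct toy_irq_state st)
{
	if (st.exit_rcu)
		toy_rcu_watching = false;
}

/* "By design": never reschedule on top of an RCU-idle context. */
static bool toy_resched_by_design(struct toy_irq_state st, int preempt_count,
				  bool need_resched)
{
	return !st.exit_rcu && preempt_count == 0 && need_resched;
}

/* "By chance": only the preempt count stands in the way. */
static bool toy_resched_by_chance(int preempt_count, bool need_resched)
{
	return preempt_count == 0 && need_resched;
}

struct toy_result {
	bool watching_in_irq;     /* between L2.A and L2.B */
	bool watching_after_irq;  /* between L2.B and L1.B */
	bool watching_after_idle; /* after L1.B */
	bool resched_by_design;
	bool resched_by_chance;
};

/* Run the sequence from the mail with the given idle-task preempt count. */
static struct toy_result toy_idle_irq_scenario(int preempt_count)
{
	struct toy_result r;
	struct toy_irq_state st;

	assert(preempt_count >= 0);

	toy_rcu_watching = true;
	toy_cpuidle_enter();                  /* L1.A */
	st = toy_irq_enter();                 /* L2.A */
	r.watching_in_irq = toy_rcu_watching;
	/* The interrupt handler sets NEED_RESCHED at this point. */
	r.resched_by_design = toy_resched_by_design(st, preempt_count, true);
	r.resched_by_chance = toy_resched_by_chance(preempt_count, true);
	toy_irq_exit(st);                     /* L2.B */
	r.watching_after_irq = toy_rcu_watching;
	toy_cpuidle_exit();                   /* L1.B */
	r.watching_after_idle = toy_rcu_watching;
	return r;
}
```

Calling toy_idle_irq_scenario(1) models the idle task with preemption
disabled: both checks refuse to reschedule. Calling it with 0 models the
hypothetical case where that protection is absent: the check that only
looks at the preempt count would now reschedule on top of an RCU-idle
context, while the check that remembers the interrupted context still
refuses.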