From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0A36AC25B0D for ; Thu, 11 Aug 2022 16:08:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237394AbiHKQIZ (ORCPT ); Thu, 11 Aug 2022 12:08:25 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38116 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237008AbiHKQIJ (ORCPT ); Thu, 11 Aug 2022 12:08:09 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BBF029A9BC; Thu, 11 Aug 2022 08:54:07 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 3E0A7601BF; Thu, 11 Aug 2022 15:54:07 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id F0AE4C433C1; Thu, 11 Aug 2022 15:54:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1660233246; bh=Z4ZBobhS6bDzJddUB9WrUzwDFCYOiM1tShexFh4oqY0=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=sBPLrN4MC5FlPhVHafmllYIAvVi1cy9YXHfaagd6FzqNEq9OwDfgJI1CCpURwpJnh mTbJLCy4HSuCLj8jW1Kx9WELW6LAopV2Ag6/9iS3UILgZY8LX2AYPrezYGymjz/qNt faH14hSGvAUa8N96JPjHbHmrAzk59xwMVVcW17aXQzsnaD6UGYY5CSgdrul4zfOVIm xtWcTN+eoMMMN5nO6TiOuzZ5E8B+VIgElIIZHwYrXxcu4LpfFU6/hqGL5HW+TpKgAQ FXIbCUxiGGMMkw4EWWZ3kFE61CQp2IuHveA2YDXTa3PdCtEhUUU9A86pqY7okbSb61 A5WrClc9xzZzg== From: Sasha Levin To: linux-kernel@vger.kernel.org, stable@vger.kernel.org Cc: Patrick Wang , Steven Rostedt , "Paul E . McKenney" , Neeraj Upadhyay , Sasha Levin , frederic@kernel.org, josh@joshtriplett.org, rcu@vger.kernel.org Subject: [PATCH AUTOSEL 5.18 70/93] rcu: Avoid tracing a few functions executed in stop machine Date: Thu, 11 Aug 2022 11:42:04 -0400 Message-Id: <20220811154237.1531313-70-sashal@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220811154237.1531313-1-sashal@kernel.org> References: <20220811154237.1531313-1-sashal@kernel.org> MIME-Version: 1.0 X-stable: review X-Patchwork-Hint: Ignore Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Patrick Wang [ Upstream commit 48f8070f5dd8e13148ae4647780a452d53c457a2 ] Stop-machine recently started calling additional functions while waiting: ---------------------------------------------------------------- Former stop machine wait loop: do { cpu_relax(); => macro ... } while (curstate != STOPMACHINE_EXIT); ----------------------------------------------------------------- Current stop machine wait loop: do { stop_machine_yield(cpumask); => function (notraced) ... touch_nmi_watchdog(); => function (notraced, inside calls also notraced) ... rcu_momentary_dyntick_idle(); => function (notraced, inside calls traced) } while (curstate != MULTI_STOP_EXIT); ------------------------------------------------------------------ These functions (and the functions that they call) must be marked notrace to prevent them from being updated while they are executing. The consequences of failing to mark these functions can be severe: rcu: INFO: rcu_preempt detected stalls on CPUs/tasks: rcu: 1-...!: (0 ticks this GP) idle=14f/1/0x4000000000000000 softirq=3397/3397 fqs=0 rcu: 3-...!: (0 ticks this GP) idle=ee9/1/0x4000000000000000 softirq=5168/5168 fqs=0 (detected by 0, t=8137 jiffies, g=5889, q=2 ncpus=4) Task dump for CPU 1: task:migration/1 state:R running task stack: 0 pid: 19 ppid: 2 flags:0x00000000 Stopper: multi_cpu_stop+0x0/0x18c <- stop_machine_cpuslocked+0x128/0x174 Call Trace: Task dump for CPU 3: task:migration/3 state:R running task stack: 0 pid: 29 ppid: 2 flags:0x00000000 Stopper: multi_cpu_stop+0x0/0x18c <- stop_machine_cpuslocked+0x128/0x174 Call Trace: rcu: rcu_preempt kthread timer wakeup didn't happen for 8136 jiffies! g5889 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 rcu: Possible timer handling issue on cpu=2 timer-softirq=594 rcu: rcu_preempt kthread starved for 8137 jiffies! g5889 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=2 rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior. rcu: RCU grace-period kthread stack dump: task:rcu_preempt state:I stack: 0 pid: 14 ppid: 2 flags:0x00000000 Call Trace: schedule+0x56/0xc2 schedule_timeout+0x82/0x184 rcu_gp_fqs_loop+0x19a/0x318 rcu_gp_kthread+0x11a/0x140 kthread+0xee/0x118 ret_from_exception+0x0/0x14 rcu: Stack dump where RCU GP kthread last ran: Task dump for CPU 2: task:migration/2 state:R running task stack: 0 pid: 24 ppid: 2 flags:0x00000000 Stopper: multi_cpu_stop+0x0/0x18c <- stop_machine_cpuslocked+0x128/0x174 Call Trace: This commit therefore marks these functions notrace: rcu_preempt_deferred_qs() rcu_preempt_need_deferred_qs() rcu_preempt_deferred_qs_irqrestore() [ paulmck: Apply feedback from Neeraj Upadhyay. ] Signed-off-by: Patrick Wang Acked-by: Steven Rostedt (Google) Signed-off-by: Paul E. McKenney Reviewed-by: Neeraj Upadhyay Signed-off-by: Sasha Levin --- kernel/rcu/tree_plugin.h | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 8360d86db1c0..bfcff6c97e59 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -460,7 +460,7 @@ static bool rcu_preempt_has_tasks(struct rcu_node *rnp) * be quite short, for example, in the case of the call from * rcu_read_unlock_special(). */ -static void +static notrace void rcu_preempt_deferred_qs_irqrestore(struct task_struct *t, unsigned long flags) { bool empty_exp; @@ -580,7 +580,7 @@ rcu_preempt_deferred_qs_irqrestore(struct task_struct *t, unsigned long flags) * is disabled. This function cannot be expected to understand these * nuances, so the caller must handle them. */ -static bool rcu_preempt_need_deferred_qs(struct task_struct *t) +static notrace bool rcu_preempt_need_deferred_qs(struct task_struct *t) { return (__this_cpu_read(rcu_data.cpu_no_qs.b.exp) || READ_ONCE(t->rcu_read_unlock_special.s)) && @@ -594,7 +594,7 @@ static bool rcu_preempt_need_deferred_qs(struct task_struct *t) * evaluate safety in terms of interrupt, softirq, and preemption * disabling. */ -static void rcu_preempt_deferred_qs(struct task_struct *t) +static notrace void rcu_preempt_deferred_qs(struct task_struct *t) { unsigned long flags; @@ -919,7 +919,7 @@ static bool rcu_preempt_has_tasks(struct rcu_node *rnp) * Because there is no preemptible RCU, there can be no deferred quiescent * states. */ -static bool rcu_preempt_need_deferred_qs(struct task_struct *t) +static notrace bool rcu_preempt_need_deferred_qs(struct task_struct *t) { return false; } @@ -928,7 +928,7 @@ static bool rcu_preempt_need_deferred_qs(struct task_struct *t) // period for a quiescent state from this CPU. Note that requests from // tasks are handled when removing the task from the blocked-tasks list // below. -static void rcu_preempt_deferred_qs(struct task_struct *t) +static notrace void rcu_preempt_deferred_qs(struct task_struct *t) { struct rcu_data *rdp = this_cpu_ptr(&rcu_data); -- 2.35.1