From: Sasha Levin
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: "Paul E. McKenney", Chen Zhongjin, Yang Jihong, Frederic Weisbecker, Boqun Feng, Sasha Levin, mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, quic_neeraju@quicinc.com, joel@joelfernandes.org, josh@joshtriplett.org, rcu@vger.kernel.org
Subject: [PATCH AUTOSEL 6.1 3/7] rcu-tasks: Add data to eliminate RCU-tasks/do_exit() deadlocks
Date: Sun, 24 Mar 2024 13:07:03 -0400
Message-ID: <20240324170709.546465-3-sashal@kernel.org>
In-Reply-To: <20240324170709.546465-1-sashal@kernel.org>
References: <20240324170709.546465-1-sashal@kernel.org>
X-Mailing-List: stable@vger.kernel.org
X-stable: review
X-stable-base: Linux 6.1.82

From: "Paul E. McKenney"

[ Upstream commit bfe93930ea1ea3c6c115a7d44af6e4fea609067e ]

Holding a mutex across synchronize_rcu_tasks() and acquiring
that same mutex in code called from do_exit() after its call to
exit_tasks_rcu_start() but before its call to exit_tasks_rcu_stop()
results in deadlock. This is by design, because tasks that are far
enough into do_exit() are no longer present on the tasks list, making
it a bit difficult for RCU Tasks to find them, let alone wait on them
to do a voluntary context switch. However, such deadlocks are becoming
more frequent. In addition, lockdep currently does not detect such
deadlocks and they can be difficult to reproduce.

In addition, if a task voluntarily context switches during that time
(for example, if it blocks acquiring a mutex), then this task is in an
RCU Tasks quiescent state. And with some adjustments, RCU Tasks could
just as well take advantage of that fact.

This commit therefore adds the data structures that will be needed
to rely on these quiescent states and to eliminate these deadlocks.

Link: https://lore.kernel.org/all/20240118021842.290665-1-chenzhongjin@huawei.com/
Reported-by: Chen Zhongjin
Reported-by: Yang Jihong
Signed-off-by: Paul E. McKenney
Tested-by: Yang Jihong
Tested-by: Chen Zhongjin
Reviewed-by: Frederic Weisbecker
Signed-off-by: Boqun Feng
Signed-off-by: Sasha Levin
---
 include/linux/sched.h | 2 ++
 kernel/rcu/tasks.h    | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 0cac69902ec58..ffcd100de169c 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -848,6 +848,8 @@ struct task_struct {
 	u8 rcu_tasks_idx;
 	int rcu_tasks_idle_cpu;
 	struct list_head rcu_tasks_holdout_list;
+	int rcu_tasks_exit_cpu;
+	struct list_head rcu_tasks_exit_list;
 #endif /* #ifdef CONFIG_TASKS_RCU */
 
 #ifdef CONFIG_TASKS_TRACE_RCU
diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index b5d5b6cf093a7..919c22698569e 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -30,6 +30,7 @@ typedef void (*postgp_func_t)(struct rcu_tasks *rtp);
  * @rtp_irq_work: IRQ work queue for deferred wakeups.
  * @barrier_q_head: RCU callback for barrier operation.
  * @rtp_blkd_tasks: List of tasks blocked as readers.
+ * @rtp_exit_list: List of tasks in the latter portion of do_exit().
  * @cpu: CPU number corresponding to this entry.
  * @rtpp: Pointer to the rcu_tasks structure.
  */
@@ -42,6 +43,7 @@ struct rcu_tasks_percpu {
 	struct irq_work rtp_irq_work;
 	struct rcu_head barrier_q_head;
 	struct list_head rtp_blkd_tasks;
+	struct list_head rtp_exit_list;
 	int cpu;
 	struct rcu_tasks *rtpp;
 };
-- 
2.43.0
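
This patch only adds the fields; it does not yet use them. As a rough
illustration of how a per-CPU exit list and the two new task fields
could fit together, here is a user-space sketch. Everything in it other
than the field names rcu_tasks_exit_cpu, rcu_tasks_exit_list and
rtp_exit_list is hypothetical: the sketch_*() helpers, NR_CPUS, the
simplified structs, and the stand-in list helpers are assumptions made
for this example and are not the kernel's implementation. Locking,
which real per-CPU list manipulation would need, is omitted.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4 /* hypothetical CPU count for this sketch */

/* Minimal user-space stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static inline void INIT_LIST_HEAD(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static inline void list_add(struct list_head *new, struct list_head *head)
{
	new->next = head->next;
	new->prev = head;
	head->next->prev = new;
	head->next = new;
}

static inline void list_del_init(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	INIT_LIST_HEAD(e);
}

/* Simplified task: only the two fields added to task_struct. */
struct task {
	int rcu_tasks_exit_cpu;
	struct list_head rcu_tasks_exit_list;
};

/* Simplified per-CPU data: only the field added to rcu_tasks_percpu. */
struct percpu {
	struct list_head rtp_exit_list;
	int cpu;
};

struct percpu percpu_data[NR_CPUS];

/* Hypothetical: initialize every per-CPU exit list to empty. */
void sketch_init(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		percpu_data[cpu].cpu = cpu;
		INIT_LIST_HEAD(&percpu_data[cpu].rtp_exit_list);
	}
}

/* Hypothetical: record the CPU and queue the exiting task on its list. */
void sketch_exit_start(struct task *t, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	t->rcu_tasks_exit_cpu = cpu;
	list_add(&t->rcu_tasks_exit_list, &percpu_data[cpu].rtp_exit_list);
}

/* Hypothetical: remove the task from whichever exit list it is on. */
void sketch_exit_stop(struct task *t)
{
	list_del_init(&t->rcu_tasks_exit_list);
}

/* Hypothetical: count the exiting tasks a grace period could still find. */
int sketch_count_exiting(int cpu)
{
	struct list_head *head = &percpu_data[cpu].rtp_exit_list;
	int n = 0;

	for (struct list_head *p = head->next; p != head; p = p->next)
		n++;
	return n;
}
```

The point of the sketch is that a task which has already left the
global tasks list remains reachable through a per-CPU list, and that
the recorded CPU number tells a later removal which list to look at.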