From: "Paul E. McKenney" <paulmck@kernel.org>
To: rcu@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com,
rostedt@goodmis.org, "Paul E. McKenney" <paulmck@kernel.org>
Subject: [PATCH rcu 02/11] rcuscale: Dump stacks of stalled rcu_scale_writer() instances
Date: Thu, 1 Aug 2024 17:42:59 -0700 [thread overview]
Message-ID: <20240802004308.4134731-2-paulmck@kernel.org> (raw)
In-Reply-To: <917e8cc8-8688-428a-9122-25544c5cc101@paulmck-laptop>
This commit improves debuggability by dumping the stacks of
rcu_scale_writer() instances that have not completed in a reasonable
timeframe. These stacks are dumped remotely, but they will be accurate
in the thus-far common case where the stalled rcu_scale_writer() instances
are blocked.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
kernel/rcu/rcuscale.c | 21 +++++++++++++++++++--
1 file changed, 19 insertions(+), 2 deletions(-)
diff --git a/kernel/rcu/rcuscale.c b/kernel/rcu/rcuscale.c
index 3269dd9c639f7..c34a8e64edc30 100644
--- a/kernel/rcu/rcuscale.c
+++ b/kernel/rcu/rcuscale.c
@@ -39,6 +39,7 @@
#include <linux/torture.h>
#include <linux/vmalloc.h>
#include <linux/rcupdate_trace.h>
+#include <linux/sched/debug.h>
#include "rcu.h"
@@ -111,6 +112,7 @@ static struct task_struct **reader_tasks;
static struct task_struct *shutdown_task;
static u64 **writer_durations;
+static bool *writer_done;
static int *writer_n_durations;
static atomic_t n_rcu_scale_reader_started;
static atomic_t n_rcu_scale_writer_started;
@@ -524,6 +526,7 @@ rcu_scale_writer(void *arg)
started = true;
if (!done && i >= MIN_MEAS && time_after(jiffies, jdone)) {
done = true;
+ WRITE_ONCE(writer_done[me], true);
sched_set_normal(current, 0);
pr_alert("%s%s rcu_scale_writer %ld has %d measurements\n",
scale_type, SCALE_FLAG, me, MIN_MEAS);
@@ -549,6 +552,19 @@ rcu_scale_writer(void *arg)
if (done && !alldone &&
atomic_read(&n_rcu_scale_writer_finished) >= nrealwriters)
alldone = true;
+ if (done && !alldone && time_after(jiffies, jdone + HZ * 60)) {
+ static atomic_t dumped;
+ int i;
+
+ if (!atomic_xchg(&dumped, 1)) {
+ for (i = 0; i < nrealwriters; i++) {
+ if (writer_done[i])
+ continue;
+ pr_info("%s: Task %ld flags writer %d:\n", __func__, me, i);
+ sched_show_task(writer_tasks[i]);
+ }
+ }
+ }
if (started && !alldone && i < MAX_MEAS - 1)
i++;
rcu_scale_wait_shutdown();
@@ -1015,10 +1031,11 @@ rcu_scale_init(void)
}
while (atomic_read(&n_rcu_scale_reader_started) < nrealreaders)
schedule_timeout_uninterruptible(1);
- writer_tasks = kcalloc(nrealwriters, sizeof(reader_tasks[0]), GFP_KERNEL);
+ writer_tasks = kcalloc(nrealwriters, sizeof(writer_tasks[0]), GFP_KERNEL);
writer_durations = kcalloc(nrealwriters, sizeof(*writer_durations), GFP_KERNEL);
writer_n_durations = kcalloc(nrealwriters, sizeof(*writer_n_durations), GFP_KERNEL);
- if (!writer_tasks || !writer_durations || !writer_n_durations) {
+ writer_done = kcalloc(nrealwriters, sizeof(writer_done[0]), GFP_KERNEL);
+ if (!writer_tasks || !writer_durations || !writer_n_durations || !writer_done) {
SCALEOUT_ERRSTRING("out of memory");
firsterr = -ENOMEM;
goto unwind;
--
2.40.1
next prev parent reply other threads:[~2024-08-02 0:43 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-08-02 0:43 [PATCH rcu 0/11] RCU update-side scalability update test Paul E. McKenney
2024-08-02 0:42 ` Paul E. McKenney [this message]
2024-08-02 0:43 ` [PATCH rcu 03/11] rcuscale: Dump grace-period statistics when rcu_scale_writer() stalls Paul E. McKenney
2024-08-02 0:43 ` [PATCH rcu 04/11] rcu: Mark callbacks not currently participating in barrier operation Paul E. McKenney
2024-08-02 0:43 ` [PATCH rcu 05/11] rcuscale: Print detailed grace-period and barrier diagnostics Paul E. McKenney
2024-08-02 0:43 ` [PATCH rcu 06/11] rcuscale: Provide clear error when async specified without primitives Paul E. McKenney
2024-08-14 12:49 ` Neeraj Upadhyay
2024-08-14 15:09 ` Paul E. McKenney
2024-08-02 0:43 ` [PATCH rcu 07/11] rcuscale: Make all writer tasks report upon hang Paul E. McKenney
2024-08-02 0:43 ` [PATCH rcu 08/11] rcuscale: Make rcu_scale_writer() tolerate repeated GFP_KERNEL failure Paul E. McKenney
2024-08-02 0:43 ` [PATCH rcu 09/11] rcuscale: Use special allocator for rcu_scale_writer() Paul E. McKenney
2024-08-02 0:43 ` [PATCH rcu 10/11] rcuscale: NULL out top-level pointers to heap memory Paul E. McKenney
2024-08-02 0:43 ` [PATCH rcu 11/11] rcuscale: Count outstanding callbacks per-task rather than per-CPU Paul E. McKenney
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240802004308.4134731-2-paulmck@kernel.org \
--to=paulmck@kernel.org \
--cc=kernel-team@meta.com \
--cc=linux-kernel@vger.kernel.org \
--cc=rcu@vger.kernel.org \
--cc=rostedt@goodmis.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox