public inbox for linux-kernel@vger.kernel.org
* [PATCH rcu 00/14] RCU scaling tests updates for v6.12
@ 2024-08-16  7:02 Neeraj Upadhyay
  2024-08-16  7:02 ` [PATCH rcu 01/14] refscale: Add TINY scenario neeraj.upadhyay
                   ` (13 more replies)
  0 siblings, 14 replies; 15+ messages in thread
From: Neeraj Upadhyay @ 2024-08-16  7:02 UTC (permalink / raw)
  To: rcu
  Cc: paulmck, joel, frederic, boqun.feng, urezki, linux-kernel,
	kernel-team, rostedt, christophe.jaillet

Hello,

Following are the rcuscale and refscale updates for v6.12:

1.      Add TINY scenario, courtesy of Paul E. McKenney.

2.      Optimize process_durations(), courtesy of Christophe JAILLET.

3.      Save a few lines with whitespace-only change, courtesy of
        Paul E. McKenney.

4.      Dump stacks of stalled rcu_scale_writer() instances,
        courtesy of Paul E. McKenney.

5.      Dump grace-period statistics when rcu_scale_writer() stalls,
        courtesy of Paul E. McKenney.

6.      Mark callbacks not currently participating in barrier operation,
        courtesy of Paul E. McKenney.

7.      Print detailed grace-period and barrier diagnostics, courtesy of
        Paul E. McKenney.

8.      Provide clear error when async specified without primitives,
        courtesy of Paul E. McKenney.

9.      Make all writer tasks report upon hang, courtesy of Paul E. McKenney.

10.     Make rcu_scale_writer() tolerate repeated GFP_KERNEL failure,
        courtesy of Paul E. McKenney.

11.     Use special allocator for rcu_scale_writer(), courtesy of
        Paul E. McKenney.

12.     NULL out top-level pointers to heap memory, courtesy of
        Paul E. McKenney.

13.     Count outstanding callbacks per-task rather than per-CPU,
        courtesy of Paul E. McKenney.

14.     Constify struct ref_scale_ops, courtesy of Christophe JAILLET.


Git tree: https://git.kernel.org/pub/scm/linux/kernel/git/neeraj.upadhyay/linux-rcu.git/log/?h=rcu_scaling_tests.15.08.24a
(rebased on top of rcu.tasks.14.08.24a due to a commit dependency)



- Neeraj

------------------------------------------------------------------------

 kernel/rcu/rcuscale.c                         | 212 +++++++++++++++---
 kernel/rcu/refscale.c                         |  67 +++---
 kernel/rcu/tree.c                             |   3 +
 .../rcutorture/configs/refscale/TINY          |  20 ++
 4 files changed, 245 insertions(+), 57 deletions(-)
 create mode 100644 tools/testing/selftests/rcutorture/configs/refscale/TINY

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH rcu 01/14] refscale: Add TINY scenario
  2024-08-16  7:02 [PATCH rcu 00/14] RCU scaling tests updates for v6.12 Neeraj Upadhyay
@ 2024-08-16  7:02 ` neeraj.upadhyay
  2024-08-16  7:02 ` [PATCH rcu 02/14] refscale: Optimize process_durations() neeraj.upadhyay
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: neeraj.upadhyay @ 2024-08-16  7:02 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, rostedt, paulmck, neeraj.upadhyay,
	neeraj.upadhyay, boqun.feng, joel, urezki, frederic

From: "Paul E. McKenney" <paulmck@kernel.org>

This commit adds a TINY scenario in order to support tests of Tiny RCU
and Tiny SRCU.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
---
 .../rcutorture/configs/refscale/TINY          | 20 +++++++++++++++++++
 1 file changed, 20 insertions(+)
 create mode 100644 tools/testing/selftests/rcutorture/configs/refscale/TINY

diff --git a/tools/testing/selftests/rcutorture/configs/refscale/TINY b/tools/testing/selftests/rcutorture/configs/refscale/TINY
new file mode 100644
index 000000000000..759343980b80
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/refscale/TINY
@@ -0,0 +1,20 @@
+CONFIG_SMP=n
+CONFIG_PREEMPT_NONE=y
+CONFIG_PREEMPT_VOLUNTARY=n
+CONFIG_PREEMPT=n
+CONFIG_PREEMPT_DYNAMIC=n
+#CHECK#CONFIG_PREEMPT_RCU=n
+CONFIG_HZ_PERIODIC=n
+CONFIG_NO_HZ_IDLE=y
+CONFIG_NO_HZ_FULL=n
+CONFIG_HOTPLUG_CPU=n
+CONFIG_SUSPEND=n
+CONFIG_HIBERNATION=n
+CONFIG_RCU_NOCB_CPU=n
+CONFIG_DEBUG_LOCK_ALLOC=n
+CONFIG_PROVE_LOCKING=n
+CONFIG_RCU_BOOST=n
+CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
+CONFIG_RCU_EXPERT=y
+CONFIG_KPROBES=n
+CONFIG_FTRACE=n
-- 
2.40.1



* [PATCH rcu 02/14] refscale: Optimize process_durations()
  2024-08-16  7:02 [PATCH rcu 00/14] RCU scaling tests updates for v6.12 Neeraj Upadhyay
  2024-08-16  7:02 ` [PATCH rcu 01/14] refscale: Add TINY scenario neeraj.upadhyay
@ 2024-08-16  7:02 ` neeraj.upadhyay
  2024-08-16  7:02 ` [PATCH rcu 03/14] rcuscale: Save a few lines with whitespace-only change neeraj.upadhyay
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: neeraj.upadhyay @ 2024-08-16  7:02 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, rostedt, paulmck, neeraj.upadhyay,
	neeraj.upadhyay, boqun.feng, joel, urezki, frederic,
	Christophe JAILLET, Davidlohr Bueso

From: Christophe JAILLET <christophe.jaillet@wanadoo.fr>

process_durations() is not a hot path, but there is no good reason to
repeatedly re-scan the data already accumulated in 'buf'.

Using a seq_buf avoids some useless strcat() calls and the need for a
temporary buffer.  Data is written directly at the correct offset.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Tested-by: "Paul E. McKenney" <paulmck@kernel.org>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
---
 kernel/rcu/refscale.c | 25 ++++++++++++++-----------
 1 file changed, 14 insertions(+), 11 deletions(-)

diff --git a/kernel/rcu/refscale.c b/kernel/rcu/refscale.c
index f4ea5b1ec068..cfec0648e141 100644
--- a/kernel/rcu/refscale.c
+++ b/kernel/rcu/refscale.c
@@ -28,6 +28,7 @@
 #include <linux/rcupdate_trace.h>
 #include <linux/reboot.h>
 #include <linux/sched.h>
+#include <linux/seq_buf.h>
 #include <linux/spinlock.h>
 #include <linux/smp.h>
 #include <linux/stat.h>
@@ -891,32 +892,34 @@ static u64 process_durations(int n)
 {
 	int i;
 	struct reader_task *rt;
-	char buf1[64];
+	struct seq_buf s;
 	char *buf;
 	u64 sum = 0;
 
 	buf = kmalloc(800 + 64, GFP_KERNEL);
 	if (!buf)
 		return 0;
-	buf[0] = 0;
-	sprintf(buf, "Experiment #%d (Format: <THREAD-NUM>:<Total loop time in ns>)",
-		exp_idx);
+	seq_buf_init(&s, buf, 800 + 64);
+
+	seq_buf_printf(&s, "Experiment #%d (Format: <THREAD-NUM>:<Total loop time in ns>)",
+		       exp_idx);
 
 	for (i = 0; i < n && !torture_must_stop(); i++) {
 		rt = &(reader_tasks[i]);
-		sprintf(buf1, "%d: %llu\t", i, rt->last_duration_ns);
 
 		if (i % 5 == 0)
-			strcat(buf, "\n");
-		if (strlen(buf) >= 800) {
-			pr_alert("%s", buf);
-			buf[0] = 0;
+			seq_buf_putc(&s, '\n');
+
+		if (seq_buf_used(&s) >= 800) {
+			pr_alert("%s", seq_buf_str(&s));
+			seq_buf_clear(&s);
 		}
-		strcat(buf, buf1);
+
+		seq_buf_printf(&s, "%d: %llu\t", i, rt->last_duration_ns);
 
 		sum += rt->last_duration_ns;
 	}
-	pr_alert("%s\n", buf);
+	pr_alert("%s\n", seq_buf_str(&s));
 
 	kfree(buf);
 	return sum;
-- 
2.40.1


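The conversion above swaps open-coded strcat()/strlen() bookkeeping for a descriptor that tracks its own write position. A minimal userspace sketch of that pattern (illustrative only, not the kernel's seq_buf implementation):

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Minimal stand-in for the kernel's seq_buf: a buffer plus a write cursor. */
struct sketch_buf {
	char *buffer;
	size_t size;
	size_t len;	/* bytes used so far */
};

static void sbuf_init(struct sketch_buf *s, char *buf, size_t size)
{
	s->buffer = buf;
	s->size = size;
	s->len = 0;
	buf[0] = '\0';
}

static size_t sbuf_used(const struct sketch_buf *s)
{
	return s->len;
}

static void sbuf_clear(struct sketch_buf *s)
{
	s->len = 0;
	s->buffer[0] = '\0';
}

/* Append at the cursor: no strlen() rescan of the data already written. */
static void sbuf_printf(struct sketch_buf *s, const char *fmt, ...)
{
	va_list ap;
	int n;
	size_t room = s->size - s->len;

	va_start(ap, fmt);
	n = vsnprintf(s->buffer + s->len, room, fmt, ap);
	va_end(ap);
	if (n > 0)
		s->len += ((size_t)n < room) ? (size_t)n : room - 1;
}
```

Because the cursor is maintained on every append, checking the fill level is O(1), which is what the `seq_buf_used(&s) >= 800` test in the patch relies on instead of a strlen() rescan.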

* [PATCH rcu 03/14] rcuscale: Save a few lines with whitespace-only change
  2024-08-16  7:02 [PATCH rcu 00/14] RCU scaling tests updates for v6.12 Neeraj Upadhyay
  2024-08-16  7:02 ` [PATCH rcu 01/14] refscale: Add TINY scenario neeraj.upadhyay
  2024-08-16  7:02 ` [PATCH rcu 02/14] refscale: Optimize process_durations() neeraj.upadhyay
@ 2024-08-16  7:02 ` neeraj.upadhyay
  2024-08-16  7:02 ` [PATCH rcu 04/14] rcuscale: Dump stacks of stalled rcu_scale_writer() instances neeraj.upadhyay
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: neeraj.upadhyay @ 2024-08-16  7:02 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, rostedt, paulmck, neeraj.upadhyay,
	neeraj.upadhyay, boqun.feng, joel, urezki, frederic

From: "Paul E. McKenney" <paulmck@kernel.org>

This whitespace-only commit fuses a few lines of code, taking advantage
of the newish 100-character-per-line limit to save a few lines of code.

Signed-off-by: "Paul E. McKenney" <paulmck@kernel.org>
Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
---
 kernel/rcu/rcuscale.c | 10 +++-------
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/kernel/rcu/rcuscale.c b/kernel/rcu/rcuscale.c
index d534d4ec2314..3269dd9c639f 100644
--- a/kernel/rcu/rcuscale.c
+++ b/kernel/rcu/rcuscale.c
@@ -1015,13 +1015,9 @@ rcu_scale_init(void)
 	}
 	while (atomic_read(&n_rcu_scale_reader_started) < nrealreaders)
 		schedule_timeout_uninterruptible(1);
-	writer_tasks = kcalloc(nrealwriters, sizeof(reader_tasks[0]),
-			       GFP_KERNEL);
-	writer_durations = kcalloc(nrealwriters, sizeof(*writer_durations),
-				   GFP_KERNEL);
-	writer_n_durations =
-		kcalloc(nrealwriters, sizeof(*writer_n_durations),
-			GFP_KERNEL);
+	writer_tasks = kcalloc(nrealwriters, sizeof(reader_tasks[0]), GFP_KERNEL);
+	writer_durations = kcalloc(nrealwriters, sizeof(*writer_durations), GFP_KERNEL);
+	writer_n_durations = kcalloc(nrealwriters, sizeof(*writer_n_durations), GFP_KERNEL);
 	if (!writer_tasks || !writer_durations || !writer_n_durations) {
 		SCALEOUT_ERRSTRING("out of memory");
 		firsterr = -ENOMEM;
-- 
2.40.1



* [PATCH rcu 04/14] rcuscale: Dump stacks of stalled rcu_scale_writer() instances
  2024-08-16  7:02 [PATCH rcu 00/14] RCU scaling tests updates for v6.12 Neeraj Upadhyay
                   ` (2 preceding siblings ...)
  2024-08-16  7:02 ` [PATCH rcu 03/14] rcuscale: Save a few lines with whitespace-only change neeraj.upadhyay
@ 2024-08-16  7:02 ` neeraj.upadhyay
  2024-08-16  7:02 ` [PATCH rcu 05/14] rcuscale: Dump grace-period statistics when rcu_scale_writer() stalls neeraj.upadhyay
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: neeraj.upadhyay @ 2024-08-16  7:02 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, rostedt, paulmck, neeraj.upadhyay,
	neeraj.upadhyay, boqun.feng, joel, urezki, frederic

From: "Paul E. McKenney" <paulmck@kernel.org>

This commit improves debuggability by dumping the stacks of
rcu_scale_writer() instances that have not completed in a reasonable
timeframe.  These stacks are dumped remotely, but they will be accurate
in the thus-far common case where the stalled rcu_scale_writer() instances
are blocked.

[ paulmck: Apply kernel test robot feedback. ]

Signed-off-by: "Paul E. McKenney" <paulmck@kernel.org>
Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
---
 kernel/rcu/rcuscale.c | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/kernel/rcu/rcuscale.c b/kernel/rcu/rcuscale.c
index 3269dd9c639f..5087ca7062d9 100644
--- a/kernel/rcu/rcuscale.c
+++ b/kernel/rcu/rcuscale.c
@@ -39,6 +39,7 @@
 #include <linux/torture.h>
 #include <linux/vmalloc.h>
 #include <linux/rcupdate_trace.h>
+#include <linux/sched/debug.h>
 
 #include "rcu.h"
 
@@ -111,6 +112,7 @@ static struct task_struct **reader_tasks;
 static struct task_struct *shutdown_task;
 
 static u64 **writer_durations;
+static bool *writer_done;
 static int *writer_n_durations;
 static atomic_t n_rcu_scale_reader_started;
 static atomic_t n_rcu_scale_writer_started;
@@ -524,6 +526,7 @@ rcu_scale_writer(void *arg)
 			started = true;
 		if (!done && i >= MIN_MEAS && time_after(jiffies, jdone)) {
 			done = true;
+			WRITE_ONCE(writer_done[me], true);
 			sched_set_normal(current, 0);
 			pr_alert("%s%s rcu_scale_writer %ld has %d measurements\n",
 				 scale_type, SCALE_FLAG, me, MIN_MEAS);
@@ -549,6 +552,19 @@ rcu_scale_writer(void *arg)
 		if (done && !alldone &&
 		    atomic_read(&n_rcu_scale_writer_finished) >= nrealwriters)
 			alldone = true;
+		if (done && !alldone && time_after(jiffies, jdone + HZ * 60)) {
+			static atomic_t dumped;
+			int i;
+
+			if (!atomic_xchg(&dumped, 1)) {
+				for (i = 0; i < nrealwriters; i++) {
+					if (writer_done[i])
+						continue;
+					pr_info("%s: Task %ld flags writer %d:\n", __func__, me, i);
+					sched_show_task(writer_tasks[i]);
+				}
+			}
+		}
 		if (started && !alldone && i < MAX_MEAS - 1)
 			i++;
 		rcu_scale_wait_shutdown();
@@ -921,6 +937,8 @@ rcu_scale_cleanup(void)
 		kfree(writer_tasks);
 		kfree(writer_durations);
 		kfree(writer_n_durations);
+		kfree(writer_done);
+		writer_done = NULL;
 	}
 
 	/* Do torture-type-specific cleanup operations.  */
@@ -1015,10 +1033,11 @@ rcu_scale_init(void)
 	}
 	while (atomic_read(&n_rcu_scale_reader_started) < nrealreaders)
 		schedule_timeout_uninterruptible(1);
-	writer_tasks = kcalloc(nrealwriters, sizeof(reader_tasks[0]), GFP_KERNEL);
+	writer_tasks = kcalloc(nrealwriters, sizeof(writer_tasks[0]), GFP_KERNEL);
 	writer_durations = kcalloc(nrealwriters, sizeof(*writer_durations), GFP_KERNEL);
 	writer_n_durations = kcalloc(nrealwriters, sizeof(*writer_n_durations), GFP_KERNEL);
-	if (!writer_tasks || !writer_durations || !writer_n_durations) {
+	writer_done = kcalloc(nrealwriters, sizeof(writer_done[0]), GFP_KERNEL);
+	if (!writer_tasks || !writer_durations || !writer_n_durations || !writer_done) {
 		SCALEOUT_ERRSTRING("out of memory");
 		firsterr = -ENOMEM;
 		goto unwind;
-- 
2.40.1


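The `if (!atomic_xchg(&dumped, 1))` guard above ensures that even if several writers notice the stall concurrently, exactly one of them performs the expensive stack dumps. A standalone C11 sketch of that one-shot pattern (names here are illustrative):

```c
#include <stdatomic.h>
#include <stdbool.h>

/* One-shot guard: many tasks may notice the stall, but only the first to
 * flip 'dumped' from 0 to 1 performs the (expensive) diagnostics. */
static atomic_int dumped;
static int dump_count;	/* how many times diagnostics actually ran */

static bool report_stall_once(void)
{
	/* atomic_exchange() returns the previous value: 0 only for the winner. */
	if (!atomic_exchange(&dumped, 1)) {
		dump_count++;	/* stand-in for the sched_show_task() loop */
		return true;
	}
	return false;
}
```

The exchange is a single atomic read-modify-write, so there is no window where two callers can both observe the flag as clear.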

* [PATCH rcu 05/14] rcuscale: Dump grace-period statistics when rcu_scale_writer() stalls
  2024-08-16  7:02 [PATCH rcu 00/14] RCU scaling tests updates for v6.12 Neeraj Upadhyay
                   ` (3 preceding siblings ...)
  2024-08-16  7:02 ` [PATCH rcu 04/14] rcuscale: Dump stacks of stalled rcu_scale_writer() instances neeraj.upadhyay
@ 2024-08-16  7:02 ` neeraj.upadhyay
  2024-08-16  7:02 ` [PATCH rcu 06/14] rcu: Mark callbacks not currently participating in barrier operation neeraj.upadhyay
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: neeraj.upadhyay @ 2024-08-16  7:02 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, rostedt, paulmck, neeraj.upadhyay,
	neeraj.upadhyay, boqun.feng, joel, urezki, frederic

From: "Paul E. McKenney" <paulmck@kernel.org>

This commit adds a .stats function pointer to the rcu_scale_ops structure,
and if this is non-NULL, it is invoked after stack traces are dumped in
response to a rcu_scale_writer() stall.

Signed-off-by: "Paul E. McKenney" <paulmck@kernel.org>
Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
---
 kernel/rcu/rcuscale.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/kernel/rcu/rcuscale.c b/kernel/rcu/rcuscale.c
index 5087ca7062d9..1b9a43653d7e 100644
--- a/kernel/rcu/rcuscale.c
+++ b/kernel/rcu/rcuscale.c
@@ -145,6 +145,7 @@ struct rcu_scale_ops {
 	void (*sync)(void);
 	void (*exp_sync)(void);
 	struct task_struct *(*rso_gp_kthread)(void);
+	void (*stats)(void);
 	const char *name;
 };
 
@@ -226,6 +227,11 @@ static void srcu_scale_synchronize(void)
 	synchronize_srcu(srcu_ctlp);
 }
 
+static void srcu_scale_stats(void)
+{
+	srcu_torture_stats_print(srcu_ctlp, scale_type, SCALE_FLAG);
+}
+
 static void srcu_scale_synchronize_expedited(void)
 {
 	synchronize_srcu_expedited(srcu_ctlp);
@@ -243,6 +249,7 @@ static struct rcu_scale_ops srcu_ops = {
 	.gp_barrier	= srcu_rcu_barrier,
 	.sync		= srcu_scale_synchronize,
 	.exp_sync	= srcu_scale_synchronize_expedited,
+	.stats		= srcu_scale_stats,
 	.name		= "srcu"
 };
 
@@ -272,6 +279,7 @@ static struct rcu_scale_ops srcud_ops = {
 	.gp_barrier	= srcu_rcu_barrier,
 	.sync		= srcu_scale_synchronize,
 	.exp_sync	= srcu_scale_synchronize_expedited,
+	.stats		= srcu_scale_stats,
 	.name		= "srcud"
 };
 
@@ -563,6 +571,8 @@ rcu_scale_writer(void *arg)
 					pr_info("%s: Task %ld flags writer %d:\n", __func__, me, i);
 					sched_show_task(writer_tasks[i]);
 				}
+				if (cur_ops->stats)
+					cur_ops->stats();
 			}
 		}
 		if (started && !alldone && i < MAX_MEAS - 1)
-- 
2.40.1


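The NULL check before invoking cur_ops->stats() above is the usual optional-hook pattern for ops tables: flavors without diagnostics simply leave the member unset. A standalone sketch of that shape (illustrative names, not the kernel structures):

```c
#include <stddef.h>

/* Ops table with an optional hook: callers NULL-check before invoking,
 * mirroring the new rcu_scale_ops .stats member. */
struct scale_ops {
	void (*sync)(void);
	void (*stats)(void);	/* may be NULL for flavors without diagnostics */
};

static int stats_calls;

static void demo_sync(void)  { }
static void demo_stats(void) { stats_calls++; }

static const struct scale_ops with_stats = {
	.sync	= demo_sync,
	.stats	= demo_stats,
};
static const struct scale_ops without_stats = {
	.sync	= demo_sync,	/* .stats left NULL by the initializer */
};

static void on_stall(const struct scale_ops *ops)
{
	if (ops->stats)		/* same pattern as: if (cur_ops->stats) */
		ops->stats();
}
```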

* [PATCH rcu 06/14] rcu: Mark callbacks not currently participating in barrier operation
  2024-08-16  7:02 [PATCH rcu 00/14] RCU scaling tests updates for v6.12 Neeraj Upadhyay
                   ` (4 preceding siblings ...)
  2024-08-16  7:02 ` [PATCH rcu 05/14] rcuscale: Dump grace-period statistics when rcu_scale_writer() stalls neeraj.upadhyay
@ 2024-08-16  7:02 ` neeraj.upadhyay
  2024-08-16  7:02 ` [PATCH rcu 07/14] rcuscale: Print detailed grace-period and barrier diagnostics neeraj.upadhyay
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: neeraj.upadhyay @ 2024-08-16  7:02 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, rostedt, paulmck, neeraj.upadhyay,
	neeraj.upadhyay, boqun.feng, joel, urezki, frederic

From: "Paul E. McKenney" <paulmck@kernel.org>

RCU keeps a count of the number of callbacks that the current
rcu_barrier() is waiting on, but there is currently no easy way to
work out which callback is stuck.  One way to do this is to mark idle
RCU-barrier callbacks by making the ->next pointer point to the callback
itself, and this commit does just that.

Later commits will use this for debug output.

Signed-off-by: "Paul E. McKenney" <paulmck@kernel.org>
Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
---
 kernel/rcu/tree.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index e641cc681901..f931171daecd 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -4403,6 +4403,7 @@ static void rcu_barrier_callback(struct rcu_head *rhp)
 {
 	unsigned long __maybe_unused s = rcu_state.barrier_sequence;
 
+	rhp->next = rhp; // Mark the callback as having been invoked.
 	if (atomic_dec_and_test(&rcu_state.barrier_cpu_count)) {
 		rcu_barrier_trace(TPS("LastCB"), -1, s);
 		complete(&rcu_state.barrier_completion);
@@ -5424,6 +5425,8 @@ static void __init rcu_init_one(void)
 		while (i > rnp->grphi)
 			rnp++;
 		per_cpu_ptr(&rcu_data, i)->mynode = rnp;
+		per_cpu_ptr(&rcu_data, i)->barrier_head.next =
+			&per_cpu_ptr(&rcu_data, i)->barrier_head;
 		rcu_boot_init_percpu_data(i);
 	}
 }
-- 
2.40.1


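Pointing a list element's ->next field at itself works as an idle marker because an entry enqueued on a singly linked callback list never points to itself. A minimal sketch of that sentinel idiom (illustrative, not the kernel's rcu_head handling):

```c
#include <stdbool.h>
#include <stddef.h>

/* A callback is "idle" when its ->next points to itself -- a value that
 * cannot occur for an entry linked into a singly linked callback list. */
struct cb_head {
	struct cb_head *next;
};

static void cb_mark_idle(struct cb_head *cb)
{
	cb->next = cb;
}

static bool cb_is_idle(const struct cb_head *cb)
{
	return cb->next == cb;
}
```

This costs no extra storage, which is why the patch can mark every CPU's barrier_head at boot and again at invocation time.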

* [PATCH rcu 07/14] rcuscale: Print detailed grace-period and barrier diagnostics
  2024-08-16  7:02 [PATCH rcu 00/14] RCU scaling tests updates for v6.12 Neeraj Upadhyay
                   ` (5 preceding siblings ...)
  2024-08-16  7:02 ` [PATCH rcu 06/14] rcu: Mark callbacks not currently participating in barrier operation neeraj.upadhyay
@ 2024-08-16  7:02 ` neeraj.upadhyay
  2024-08-16  7:02 ` [PATCH rcu 08/14] rcuscale: Provide clear error when async specified without primitives neeraj.upadhyay
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: neeraj.upadhyay @ 2024-08-16  7:02 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, rostedt, paulmck, neeraj.upadhyay,
	neeraj.upadhyay, boqun.feng, joel, urezki, frederic

From: "Paul E. McKenney" <paulmck@kernel.org>

This commit uses the new rcu_tasks_torture_stats_print(),
rcu_tasks_trace_torture_stats_print(), and
rcu_tasks_rude_torture_stats_print() functions in order to provide
detailed diagnostics on grace-period, callback, and barrier state when
rcu_scale_writer() hangs.

[ paulmck: Apply kernel test robot feedback. ]

Signed-off-by: "Paul E. McKenney" <paulmck@kernel.org>
Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
---
 kernel/rcu/rcuscale.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/kernel/rcu/rcuscale.c b/kernel/rcu/rcuscale.c
index 1b9a43653d7e..c507750e94d8 100644
--- a/kernel/rcu/rcuscale.c
+++ b/kernel/rcu/rcuscale.c
@@ -298,6 +298,11 @@ static void tasks_scale_read_unlock(int idx)
 {
 }
 
+static void rcu_tasks_scale_stats(void)
+{
+	rcu_tasks_torture_stats_print(scale_type, SCALE_FLAG);
+}
+
 static struct rcu_scale_ops tasks_ops = {
 	.ptype		= RCU_TASKS_FLAVOR,
 	.init		= rcu_sync_scale_init,
@@ -310,6 +315,7 @@ static struct rcu_scale_ops tasks_ops = {
 	.sync		= synchronize_rcu_tasks,
 	.exp_sync	= synchronize_rcu_tasks,
 	.rso_gp_kthread	= get_rcu_tasks_gp_kthread,
+	.stats		= IS_ENABLED(CONFIG_TINY_RCU) ? NULL : rcu_tasks_scale_stats,
 	.name		= "tasks"
 };
 
@@ -336,6 +342,11 @@ static void tasks_rude_scale_read_unlock(int idx)
 {
 }
 
+static void rcu_tasks_rude_scale_stats(void)
+{
+	rcu_tasks_rude_torture_stats_print(scale_type, SCALE_FLAG);
+}
+
 static struct rcu_scale_ops tasks_rude_ops = {
 	.ptype		= RCU_TASKS_RUDE_FLAVOR,
 	.init		= rcu_sync_scale_init,
@@ -346,6 +357,7 @@ static struct rcu_scale_ops tasks_rude_ops = {
 	.sync		= synchronize_rcu_tasks_rude,
 	.exp_sync	= synchronize_rcu_tasks_rude,
 	.rso_gp_kthread	= get_rcu_tasks_rude_gp_kthread,
+	.stats		= IS_ENABLED(CONFIG_TINY_RCU) ? NULL : rcu_tasks_rude_scale_stats,
 	.name		= "tasks-rude"
 };
 
@@ -374,6 +386,11 @@ static void tasks_trace_scale_read_unlock(int idx)
 	rcu_read_unlock_trace();
 }
 
+static void rcu_tasks_trace_scale_stats(void)
+{
+	rcu_tasks_trace_torture_stats_print(scale_type, SCALE_FLAG);
+}
+
 static struct rcu_scale_ops tasks_tracing_ops = {
 	.ptype		= RCU_TASKS_FLAVOR,
 	.init		= rcu_sync_scale_init,
@@ -386,6 +403,7 @@ static struct rcu_scale_ops tasks_tracing_ops = {
 	.sync		= synchronize_rcu_tasks_trace,
 	.exp_sync	= synchronize_rcu_tasks_trace,
 	.rso_gp_kthread	= get_rcu_tasks_trace_gp_kthread,
+	.stats		= IS_ENABLED(CONFIG_TINY_RCU) ? NULL : rcu_tasks_trace_scale_stats,
 	.name		= "tasks-tracing"
 };
 
-- 
2.40.1



* [PATCH rcu 08/14] rcuscale: Provide clear error when async specified without primitives
  2024-08-16  7:02 [PATCH rcu 00/14] RCU scaling tests updates for v6.12 Neeraj Upadhyay
                   ` (6 preceding siblings ...)
  2024-08-16  7:02 ` [PATCH rcu 07/14] rcuscale: Print detailed grace-period and barrier diagnostics neeraj.upadhyay
@ 2024-08-16  7:02 ` neeraj.upadhyay
  2024-08-16  7:02 ` [PATCH rcu 09/14] rcuscale: Make all writer tasks report upon hang neeraj.upadhyay
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: neeraj.upadhyay @ 2024-08-16  7:02 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, rostedt, paulmck, neeraj.upadhyay,
	neeraj.upadhyay, boqun.feng, joel, urezki, frederic

From: "Paul E. McKenney" <paulmck@kernel.org>

Currently, if the rcuscale module's async module parameter is specified
for RCU implementations that lack async primitives, such as RCU Tasks
Rude (which no longer has a call_rcu_tasks_rude() function), there will
be a series of splats due to calls through a NULL pointer.  This commit
therefore emits a clear warning in this situation and falls back to
non-async testing.

Signed-off-by: "Paul E. McKenney" <paulmck@kernel.org>
Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
---
 kernel/rcu/rcuscale.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/rcu/rcuscale.c b/kernel/rcu/rcuscale.c
index c507750e94d8..79e1c32d5c0f 100644
--- a/kernel/rcu/rcuscale.c
+++ b/kernel/rcu/rcuscale.c
@@ -525,7 +525,7 @@ rcu_scale_writer(void *arg)
 			schedule_timeout_idle(torture_random(&tr) % writer_holdoff_jiffies + 1);
 		wdp = &wdpp[i];
 		*wdp = ktime_get_mono_fast_ns();
-		if (gp_async) {
+		if (gp_async && !WARN_ON_ONCE(!cur_ops->async)) {
 retry:
 			if (!rhp)
 				rhp = kmalloc(sizeof(*rhp), GFP_KERNEL);
@@ -597,7 +597,7 @@ rcu_scale_writer(void *arg)
 			i++;
 		rcu_scale_wait_shutdown();
 	} while (!torture_must_stop());
-	if (gp_async) {
+	if (gp_async && cur_ops->async) {
 		cur_ops->gp_barrier();
 	}
 	writer_n_durations[me] = i_max + 1;
-- 
2.40.1


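The two hunks above gate the async path on cur_ops->async being non-NULL, warning once and falling back to synchronous grace periods. A standalone sketch of that warn-once-and-fall-back shape (the counter and ops table are illustrative stand-ins for WARN_ON_ONCE() and rcu_scale_ops):

```c
#include <stdbool.h>
#include <stddef.h>

struct gp_ops {
	void (*async_cb)(void);	/* NULL when the flavor has no async primitive */
	void (*sync)(void);
};

static int warn_count, async_calls, sync_calls;

static void demo_async(void) { async_calls++; }
static void demo_sync(void)  { sync_calls++; }

/* Returns true when the async hook is missing, warning only once. */
static bool warn_once_if_no_async(const struct gp_ops *ops)
{
	static bool warned;

	if (ops->async_cb)
		return false;
	if (!warned) {
		warned = true;
		warn_count++;	/* stand-in for the WARN_ON_ONCE() splat */
	}
	return true;
}

/* Mirrors: if (gp_async && !WARN_ON_ONCE(!cur_ops->async)) ... else sync. */
static void one_writer_pass(const struct gp_ops *ops, bool gp_async)
{
	if (gp_async && !warn_once_if_no_async(ops))
		ops->async_cb();
	else
		ops->sync();
}
```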

* [PATCH rcu 09/14] rcuscale: Make all writer tasks report upon hang
  2024-08-16  7:02 [PATCH rcu 00/14] RCU scaling tests updates for v6.12 Neeraj Upadhyay
                   ` (7 preceding siblings ...)
  2024-08-16  7:02 ` [PATCH rcu 08/14] rcuscale: Provide clear error when async specified without primitives neeraj.upadhyay
@ 2024-08-16  7:02 ` neeraj.upadhyay
  2024-08-16  7:02 ` [PATCH rcu 10/14] rcuscale: Make rcu_scale_writer() tolerate repeated GFP_KERNEL failure neeraj.upadhyay
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: neeraj.upadhyay @ 2024-08-16  7:02 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, rostedt, paulmck, neeraj.upadhyay,
	neeraj.upadhyay, boqun.feng, joel, urezki, frederic

From: "Paul E. McKenney" <paulmck@kernel.org>

This commit causes all writer tasks to provide a brief report after a
hang has been reported, with the reports spaced at one-second intervals.

Signed-off-by: "Paul E. McKenney" <paulmck@kernel.org>
Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
---
 kernel/rcu/rcuscale.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/kernel/rcu/rcuscale.c b/kernel/rcu/rcuscale.c
index 79e1c32d5c0f..dfe8e0faa4d8 100644
--- a/kernel/rcu/rcuscale.c
+++ b/kernel/rcu/rcuscale.c
@@ -483,6 +483,7 @@ rcu_scale_writer(void *arg)
 	unsigned long jdone;
 	long me = (long)arg;
 	struct rcu_head *rhp = NULL;
+	bool selfreport = false;
 	bool started = false, done = false, alldone = false;
 	u64 t;
 	DEFINE_TORTURE_RANDOM(tr);
@@ -593,6 +594,11 @@ rcu_scale_writer(void *arg)
 					cur_ops->stats();
 			}
 		}
+		if (!selfreport && time_after(jiffies, jdone + HZ * (70 + me))) {
+			pr_info("%s: Writer %ld self-report: started %d done %d/%d->%d i %d jdone %lu.\n",
+				__func__, me, started, done, writer_done[me], atomic_read(&n_rcu_scale_writer_finished), i, jiffies - jdone);
+			selfreport = true;
+		}
 		if (started && !alldone && i < MAX_MEAS - 1)
 			i++;
 		rcu_scale_wait_shutdown();
-- 
2.40.1



* [PATCH rcu 10/14] rcuscale: Make rcu_scale_writer() tolerate repeated GFP_KERNEL failure
  2024-08-16  7:02 [PATCH rcu 00/14] RCU scaling tests updates for v6.12 Neeraj Upadhyay
                   ` (8 preceding siblings ...)
  2024-08-16  7:02 ` [PATCH rcu 09/14] rcuscale: Make all writer tasks report upon hang neeraj.upadhyay
@ 2024-08-16  7:02 ` neeraj.upadhyay
  2024-08-16  7:02 ` [PATCH rcu 11/14] rcuscale: Use special allocator for rcu_scale_writer() neeraj.upadhyay
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: neeraj.upadhyay @ 2024-08-16  7:02 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, rostedt, paulmck, neeraj.upadhyay,
	neeraj.upadhyay, boqun.feng, joel, urezki, frederic

From: "Paul E. McKenney" <paulmck@kernel.org>

Under some conditions, kmalloc(GFP_KERNEL) allocations have been
observed to repeatedly fail.  This situation has been observed to
cause one of the rcu_scale_writer() instances to loop indefinitely
retrying memory allocation for an asynchronous grace-period primitive.
The problem is that if memory is short, all the other instances will
allocate all available memory before the looping task is awakened from
its rcu_barrier*() call.  This in turn results in hangs, so that rcuscale
fails to complete.

This commit therefore removes the tight retry loop, so that when this
condition occurs, the affected task still passes through the full outer
loop, including its full set of termination checks.  This spreads the
risk of indefinite memory-allocation retries across all of the
rcu_scale_writer() tasks, which in turn prevents the hangs.

Signed-off-by: "Paul E. McKenney" <paulmck@kernel.org>
Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
---
 kernel/rcu/rcuscale.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/kernel/rcu/rcuscale.c b/kernel/rcu/rcuscale.c
index dfe8e0faa4d8..80518662273b 100644
--- a/kernel/rcu/rcuscale.c
+++ b/kernel/rcu/rcuscale.c
@@ -520,6 +520,8 @@ rcu_scale_writer(void *arg)
 
 	jdone = jiffies + minruntime * HZ;
 	do {
+		bool gp_succeeded = false;
+
 		if (writer_holdoff)
 			udelay(writer_holdoff);
 		if (writer_holdoff_jiffies)
@@ -527,23 +529,24 @@ rcu_scale_writer(void *arg)
 		wdp = &wdpp[i];
 		*wdp = ktime_get_mono_fast_ns();
 		if (gp_async && !WARN_ON_ONCE(!cur_ops->async)) {
-retry:
 			if (!rhp)
 				rhp = kmalloc(sizeof(*rhp), GFP_KERNEL);
 			if (rhp && atomic_read(this_cpu_ptr(&n_async_inflight)) < gp_async_max) {
 				atomic_inc(this_cpu_ptr(&n_async_inflight));
 				cur_ops->async(rhp, rcu_scale_async_cb);
 				rhp = NULL;
+				gp_succeeded = true;
 			} else if (!kthread_should_stop()) {
 				cur_ops->gp_barrier();
-				goto retry;
 			} else {
 				kfree(rhp); /* Because we are stopping. */
 			}
 		} else if (gp_exp) {
 			cur_ops->exp_sync();
+			gp_succeeded = true;
 		} else {
 			cur_ops->sync();
+			gp_succeeded = true;
 		}
 		t = ktime_get_mono_fast_ns();
 		*wdp = t - *wdp;
@@ -599,7 +602,7 @@ rcu_scale_writer(void *arg)
 				__func__, me, started, done, writer_done[me], atomic_read(&n_rcu_scale_writer_finished), i, jiffies - jdone);
 			selfreport = true;
 		}
-		if (started && !alldone && i < MAX_MEAS - 1)
+		if (gp_succeeded && started && !alldone && i < MAX_MEAS - 1)
 			i++;
 		rcu_scale_wait_shutdown();
 	} while (!torture_must_stop());
-- 
2.40.1


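The change above replaces the `goto retry` with a per-pass gp_succeeded flag, so a failed allocation costs one trip around the outer loop (where the termination checks live) rather than spinning in place. A simplified single-threaded sketch of that control-flow change (the simulated allocator is purely illustrative):

```c
#include <stdbool.h>

/* Simulated allocator: fails 'attempts_left' times, then succeeds,
 * standing in for repeated kmalloc(GFP_KERNEL) failures. */
static int attempts_left;

static bool try_alloc(void)
{
	if (attempts_left > 0) {
		attempts_left--;
		return false;
	}
	return true;
}

/* Each pass attempts once; the measurement index advances only on
 * success, mirroring: if (gp_succeeded && started && ...) i++; */
static int run_writer(int max_meas)
{
	int i = 0, iterations = 0;

	while (i < max_meas) {
		bool gp_succeeded = try_alloc();

		iterations++;	/* termination checks would run here */
		if (gp_succeeded)
			i++;
	}
	return iterations;
}
```

Every failed allocation still completes a full loop iteration, so hang detection and shutdown checks keep running even under sustained memory pressure.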

* [PATCH rcu 11/14] rcuscale: Use special allocator for rcu_scale_writer()
  2024-08-16  7:02 [PATCH rcu 00/14] RCU scaling tests updates for v6.12 Neeraj Upadhyay
                   ` (9 preceding siblings ...)
  2024-08-16  7:02 ` [PATCH rcu 10/14] rcuscale: Make rcu_scale_writer() tolerate repeated GFP_KERNEL failure neeraj.upadhyay
@ 2024-08-16  7:02 ` neeraj.upadhyay
  2024-08-16  7:02 ` [PATCH rcu 12/14] rcuscale: NULL out top-level pointers to heap memory neeraj.upadhyay
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: neeraj.upadhyay @ 2024-08-16  7:02 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, rostedt, paulmck, neeraj.upadhyay,
	neeraj.upadhyay, boqun.feng, joel, urezki, frederic

From: "Paul E. McKenney" <paulmck@kernel.org>

The rcu_scale_writer() function needs only a fixed number of rcu_head
structures per kthread, which means that a trivial allocator suffices.
This commit therefore uses an llist-based allocator using a fixed array of
structures per kthread.  This allows aggressive testing of RCU performance
without stressing the slab allocators.

Signed-off-by: "Paul E. McKenney" <paulmck@kernel.org>
Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
---
 kernel/rcu/rcuscale.c | 123 ++++++++++++++++++++++++++++++++++++++----
 1 file changed, 113 insertions(+), 10 deletions(-)

diff --git a/kernel/rcu/rcuscale.c b/kernel/rcu/rcuscale.c
index 80518662273b..bc7cca979c06 100644
--- a/kernel/rcu/rcuscale.c
+++ b/kernel/rcu/rcuscale.c
@@ -105,6 +105,19 @@ static char *scale_type = "rcu";
 module_param(scale_type, charp, 0444);
 MODULE_PARM_DESC(scale_type, "Type of RCU to scalability-test (rcu, srcu, ...)");
 
+// Structure definitions for custom fixed-per-task allocator.
+struct writer_mblock {
+	struct rcu_head wmb_rh;
+	struct llist_node wmb_node;
+	struct writer_freelist *wmb_wfl;
+};
+
+struct writer_freelist {
+	struct llist_head ws_lhg;
+	struct llist_head ____cacheline_internodealigned_in_smp ws_lhp;
+	struct writer_mblock *ws_mblocks;
+};
+
 static int nrealreaders;
 static int nrealwriters;
 static struct task_struct **writer_tasks;
@@ -113,6 +126,7 @@ static struct task_struct *shutdown_task;
 
 static u64 **writer_durations;
 static bool *writer_done;
+static struct writer_freelist *writer_freelists;
 static int *writer_n_durations;
 static atomic_t n_rcu_scale_reader_started;
 static atomic_t n_rcu_scale_writer_started;
@@ -463,13 +477,52 @@ rcu_scale_reader(void *arg)
 	return 0;
 }
 
+/*
+ * Allocate a writer_mblock structure for the specified rcu_scale_writer
+ * task.
+ */
+static struct writer_mblock *rcu_scale_alloc(long me)
+{
+	struct llist_node *llnp;
+	struct writer_freelist *wflp;
+	struct writer_mblock *wmbp;
+
+	if (WARN_ON_ONCE(!writer_freelists))
+		return NULL;
+	wflp = &writer_freelists[me];
+	if (llist_empty(&wflp->ws_lhp)) {
+		// ->ws_lhp is private to its rcu_scale_writer task.
+		wmbp = container_of(llist_del_all(&wflp->ws_lhg), struct writer_mblock, wmb_node);
+		wflp->ws_lhp.first = &wmbp->wmb_node;
+	}
+	llnp = llist_del_first(&wflp->ws_lhp);
+	if (!llnp)
+		return NULL;
+	return container_of(llnp, struct writer_mblock, wmb_node);
+}
+
+/*
+ * Free a writer_mblock structure to its rcu_scale_writer task.
+ */
+static void rcu_scale_free(struct writer_mblock *wmbp)
+{
+	struct writer_freelist *wflp;
+
+	if (!wmbp)
+		return;
+	wflp = wmbp->wmb_wfl;
+	llist_add(&wmbp->wmb_node, &wflp->ws_lhg);
+}
+
 /*
  * Callback function for asynchronous grace periods from rcu_scale_writer().
  */
 static void rcu_scale_async_cb(struct rcu_head *rhp)
 {
+	struct writer_mblock *wmbp = container_of(rhp, struct writer_mblock, wmb_rh);
+
 	atomic_dec(this_cpu_ptr(&n_async_inflight));
-	kfree(rhp);
+	rcu_scale_free(wmbp);
 }
 
 /*
@@ -482,13 +535,13 @@ rcu_scale_writer(void *arg)
 	int i_max;
 	unsigned long jdone;
 	long me = (long)arg;
-	struct rcu_head *rhp = NULL;
 	bool selfreport = false;
 	bool started = false, done = false, alldone = false;
 	u64 t;
 	DEFINE_TORTURE_RANDOM(tr);
 	u64 *wdp;
 	u64 *wdpp = writer_durations[me];
+	struct writer_mblock *wmbp = NULL;
 
 	VERBOSE_SCALEOUT_STRING("rcu_scale_writer task started");
 	WARN_ON(!wdpp);
@@ -529,17 +582,18 @@ rcu_scale_writer(void *arg)
 		wdp = &wdpp[i];
 		*wdp = ktime_get_mono_fast_ns();
 		if (gp_async && !WARN_ON_ONCE(!cur_ops->async)) {
-			if (!rhp)
-				rhp = kmalloc(sizeof(*rhp), GFP_KERNEL);
-			if (rhp && atomic_read(this_cpu_ptr(&n_async_inflight)) < gp_async_max) {
+			if (!wmbp)
+				wmbp = rcu_scale_alloc(me);
+			if (wmbp && atomic_read(this_cpu_ptr(&n_async_inflight)) < gp_async_max) {
 				atomic_inc(this_cpu_ptr(&n_async_inflight));
-				cur_ops->async(rhp, rcu_scale_async_cb);
-				rhp = NULL;
+				cur_ops->async(&wmbp->wmb_rh, rcu_scale_async_cb);
+				wmbp = NULL;
 				gp_succeeded = true;
 			} else if (!kthread_should_stop()) {
 				cur_ops->gp_barrier();
 			} else {
-				kfree(rhp); /* Because we are stopping. */
+				rcu_scale_free(wmbp); /* Because we are stopping. */
+				wmbp = NULL;
 			}
 		} else if (gp_exp) {
 			cur_ops->exp_sync();
@@ -607,6 +661,7 @@ rcu_scale_writer(void *arg)
 		rcu_scale_wait_shutdown();
 	} while (!torture_must_stop());
 	if (gp_async && cur_ops->async) {
+		rcu_scale_free(wmbp);
 		cur_ops->gp_barrier();
 	}
 	writer_n_durations[me] = i_max + 1;
@@ -970,12 +1025,30 @@ rcu_scale_cleanup(void)
 					schedule_timeout_uninterruptible(1);
 			}
 			kfree(writer_durations[i]);
+			if (writer_freelists) {
+				int ctr = 0;
+				struct llist_node *llnp;
+				struct writer_freelist *wflp = &writer_freelists[i];
+
+				if (wflp->ws_mblocks) {
+					llist_for_each(llnp, wflp->ws_lhg.first)
+						ctr++;
+					llist_for_each(llnp, wflp->ws_lhp.first)
+						ctr++;
+					WARN_ONCE(ctr != gp_async_max,
+						  "%s: ctr = %d gp_async_max = %d\n",
+						  __func__, ctr, gp_async_max);
+					kfree(wflp->ws_mblocks);
+				}
+			}
 		}
 		kfree(writer_tasks);
 		kfree(writer_durations);
 		kfree(writer_n_durations);
 		kfree(writer_done);
 		writer_done = NULL;
+		kfree(writer_freelists);
+		writer_freelists = NULL;
 	}
 
 	/* Do torture-type-specific cleanup operations.  */
@@ -1002,8 +1075,9 @@ rcu_scale_shutdown(void *arg)
 static int __init
 rcu_scale_init(void)
 {
-	long i;
 	int firsterr = 0;
+	long i;
+	long j;
 	static struct rcu_scale_ops *scale_ops[] = {
 		&rcu_ops, &srcu_ops, &srcud_ops, TASKS_OPS TASKS_RUDE_OPS TASKS_TRACING_OPS
 	};
@@ -1074,7 +1148,18 @@ rcu_scale_init(void)
 	writer_durations = kcalloc(nrealwriters, sizeof(*writer_durations), GFP_KERNEL);
 	writer_n_durations = kcalloc(nrealwriters, sizeof(*writer_n_durations), GFP_KERNEL);
 	writer_done = kcalloc(nrealwriters, sizeof(writer_done[0]), GFP_KERNEL);
-	if (!writer_tasks || !writer_durations || !writer_n_durations || !writer_done) {
+	if (gp_async) {
+		if (gp_async_max <= 0) {
+			pr_warn("%s: gp_async_max = %d must be greater than zero.\n",
+				__func__, gp_async_max);
+			WARN_ON_ONCE(IS_BUILTIN(CONFIG_RCU_TORTURE_TEST));
+			firsterr = -EINVAL;
+			goto unwind;
+		}
+		writer_freelists = kcalloc(nrealwriters, sizeof(writer_freelists[0]), GFP_KERNEL);
+	}
+	if (!writer_tasks || !writer_durations || !writer_n_durations || !writer_done ||
+	    (gp_async && !writer_freelists)) {
 		SCALEOUT_ERRSTRING("out of memory");
 		firsterr = -ENOMEM;
 		goto unwind;
@@ -1087,6 +1172,24 @@ rcu_scale_init(void)
 			firsterr = -ENOMEM;
 			goto unwind;
 		}
+		if (writer_freelists) {
+			struct writer_freelist *wflp = &writer_freelists[i];
+
+			init_llist_head(&wflp->ws_lhg);
+			init_llist_head(&wflp->ws_lhp);
+			wflp->ws_mblocks = kcalloc(gp_async_max, sizeof(wflp->ws_mblocks[0]),
+						   GFP_KERNEL);
+			if (!wflp->ws_mblocks) {
+				firsterr = -ENOMEM;
+				goto unwind;
+			}
+			for (j = 0; j < gp_async_max; j++) {
+				struct writer_mblock *wmbp = &wflp->ws_mblocks[j];
+
+				wmbp->wmb_wfl = wflp;
+				llist_add(&wmbp->wmb_node, &wflp->ws_lhp);
+			}
+		}
 		firsterr = torture_create_kthread(rcu_scale_writer, (void *)i,
 						  writer_tasks[i]);
 		if (torture_init_error(firsterr))
-- 
2.40.1



* [PATCH rcu 12/14] rcuscale: NULL out top-level pointers to heap memory
  2024-08-16  7:02 [PATCH rcu 00/14] RCU scaling tests updates for v6.12 Neeraj Upadhyay
                   ` (10 preceding siblings ...)
  2024-08-16  7:02 ` [PATCH rcu 11/14] rcuscale: Use special allocator for rcu_scale_writer() neeraj.upadhyay
@ 2024-08-16  7:02 ` neeraj.upadhyay
  2024-08-16  7:02 ` [PATCH rcu 13/14] rcuscale: Count outstanding callbacks per-task rather than per-CPU neeraj.upadhyay
  2024-08-16  7:02 ` [PATCH rcu 14/14] refscale: Constify struct ref_scale_ops neeraj.upadhyay
  13 siblings, 0 replies; 15+ messages in thread
From: neeraj.upadhyay @ 2024-08-16  7:02 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, rostedt, paulmck, neeraj.upadhyay,
	neeraj.upadhyay, boqun.feng, joel, urezki, frederic

From: "Paul E. McKenney" <paulmck@kernel.org>

Currently, if someone modprobes and rmmods rcuscale successfully, but
the next run errors out during the modprobe, non-NULL pointers to freed
memory will remain.  If the run after that also errors out during the
modprobe, there will be double-free bugs.

This commit therefore NULLs out top-level pointers to memory that has
just been freed.

Signed-off-by: "Paul E. McKenney" <paulmck@kernel.org>
Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
---
 kernel/rcu/rcuscale.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/kernel/rcu/rcuscale.c b/kernel/rcu/rcuscale.c
index bc7cca979c06..61a178914256 100644
--- a/kernel/rcu/rcuscale.c
+++ b/kernel/rcu/rcuscale.c
@@ -819,6 +819,7 @@ kfree_scale_cleanup(void)
 			torture_stop_kthread(kfree_scale_thread,
 					     kfree_reader_tasks[i]);
 		kfree(kfree_reader_tasks);
+		kfree_reader_tasks = NULL;
 	}
 
 	torture_cleanup_end();
@@ -987,6 +988,7 @@ rcu_scale_cleanup(void)
 			torture_stop_kthread(rcu_scale_reader,
 					     reader_tasks[i]);
 		kfree(reader_tasks);
+		reader_tasks = NULL;
 	}
 
 	if (writer_tasks) {
@@ -1043,8 +1045,11 @@ rcu_scale_cleanup(void)
 			}
 		}
 		kfree(writer_tasks);
+		writer_tasks = NULL;
 		kfree(writer_durations);
+		writer_durations = NULL;
 		kfree(writer_n_durations);
+		writer_n_durations = NULL;
 		kfree(writer_done);
 		writer_done = NULL;
 		kfree(writer_freelists);
-- 
2.40.1



* [PATCH rcu 13/14] rcuscale: Count outstanding callbacks per-task rather than per-CPU
  2024-08-16  7:02 [PATCH rcu 00/14] RCU scaling tests updates for v6.12 Neeraj Upadhyay
                   ` (11 preceding siblings ...)
  2024-08-16  7:02 ` [PATCH rcu 12/14] rcuscale: NULL out top-level pointers to heap memory neeraj.upadhyay
@ 2024-08-16  7:02 ` neeraj.upadhyay
  2024-08-16  7:02 ` [PATCH rcu 14/14] refscale: Constify struct ref_scale_ops neeraj.upadhyay
  13 siblings, 0 replies; 15+ messages in thread
From: neeraj.upadhyay @ 2024-08-16  7:02 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, rostedt, paulmck, neeraj.upadhyay,
	neeraj.upadhyay, boqun.feng, joel, urezki, frederic,
	Vlastimil Babka

From: "Paul E. McKenney" <paulmck@kernel.org>

The current rcu_scale_writer() asynchronous grace-period testing uses a
per-CPU counter to track the number of outstanding callbacks.  This is
subject to CPU-imbalance errors when tasks migrate from one CPU to another
between the time that the counter is incremented and the callback is
queued, and additionally in kernels configured such that callbacks can
be invoked on some CPU other than the one that queued them.

This commit therefore arranges for per-task callback counts, thus avoiding
any issues with migration of either tasks or callbacks.

Reported-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: "Paul E. McKenney" <paulmck@kernel.org>
Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
---
 kernel/rcu/rcuscale.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/kernel/rcu/rcuscale.c b/kernel/rcu/rcuscale.c
index 61a178914256..6d37596deb1f 100644
--- a/kernel/rcu/rcuscale.c
+++ b/kernel/rcu/rcuscale.c
@@ -114,6 +114,7 @@ struct writer_mblock {
 
 struct writer_freelist {
 	struct llist_head ws_lhg;
+	atomic_t ws_inflight;
 	struct llist_head ____cacheline_internodealigned_in_smp ws_lhp;
 	struct writer_mblock *ws_mblocks;
 };
@@ -136,7 +137,6 @@ static u64 t_rcu_scale_writer_started;
 static u64 t_rcu_scale_writer_finished;
 static unsigned long b_rcu_gp_test_started;
 static unsigned long b_rcu_gp_test_finished;
-static DEFINE_PER_CPU(atomic_t, n_async_inflight);
 
 #define MAX_MEAS 10000
 #define MIN_MEAS 100
@@ -520,8 +520,9 @@ static void rcu_scale_free(struct writer_mblock *wmbp)
 static void rcu_scale_async_cb(struct rcu_head *rhp)
 {
 	struct writer_mblock *wmbp = container_of(rhp, struct writer_mblock, wmb_rh);
+	struct writer_freelist *wflp = wmbp->wmb_wfl;
 
-	atomic_dec(this_cpu_ptr(&n_async_inflight));
+	atomic_dec(&wflp->ws_inflight);
 	rcu_scale_free(wmbp);
 }
 
@@ -541,6 +542,7 @@ rcu_scale_writer(void *arg)
 	DEFINE_TORTURE_RANDOM(tr);
 	u64 *wdp;
 	u64 *wdpp = writer_durations[me];
+	struct writer_freelist *wflp = &writer_freelists[me];
 	struct writer_mblock *wmbp = NULL;
 
 	VERBOSE_SCALEOUT_STRING("rcu_scale_writer task started");
@@ -584,8 +586,8 @@ rcu_scale_writer(void *arg)
 		if (gp_async && !WARN_ON_ONCE(!cur_ops->async)) {
 			if (!wmbp)
 				wmbp = rcu_scale_alloc(me);
-			if (wmbp && atomic_read(this_cpu_ptr(&n_async_inflight)) < gp_async_max) {
-				atomic_inc(this_cpu_ptr(&n_async_inflight));
+			if (wmbp && atomic_read(&wflp->ws_inflight) < gp_async_max) {
+				atomic_inc(&wflp->ws_inflight);
 				cur_ops->async(&wmbp->wmb_rh, rcu_scale_async_cb);
 				wmbp = NULL;
 				gp_succeeded = true;
-- 
2.40.1



* [PATCH rcu 14/14] refscale: Constify struct ref_scale_ops
  2024-08-16  7:02 [PATCH rcu 00/14] RCU scaling tests updates for v6.12 Neeraj Upadhyay
                   ` (12 preceding siblings ...)
  2024-08-16  7:02 ` [PATCH rcu 13/14] rcuscale: Count outstanding callbacks per-task rather than per-CPU neeraj.upadhyay
@ 2024-08-16  7:02 ` neeraj.upadhyay
  13 siblings, 0 replies; 15+ messages in thread
From: neeraj.upadhyay @ 2024-08-16  7:02 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, rostedt, paulmck, neeraj.upadhyay,
	neeraj.upadhyay, boqun.feng, joel, urezki, frederic,
	Christophe JAILLET

From: Christophe JAILLET <christophe.jaillet@wanadoo.fr>

The 'struct ref_scale_ops' instances are not modified in this code.

Constifying this structure moves some data to a read-only section,
thereby increasing overall security.

On x86_64, with allmodconfig:
Before:
======
   text	   data	    bss	    dec	    hex	filename
  34231	   4167	    736	  39134	   98de	kernel/rcu/refscale.o

After:
======
   text	   data	    bss	    dec	    hex	filename
  35175	   3239	    736	  39150	   98ee	kernel/rcu/refscale.o

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Tested-by: "Paul E. McKenney" <paulmck@kernel.org>
Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
---
 kernel/rcu/refscale.c | 42 +++++++++++++++++++++---------------------
 1 file changed, 21 insertions(+), 21 deletions(-)

diff --git a/kernel/rcu/refscale.c b/kernel/rcu/refscale.c
index cfec0648e141..0db9db73f57f 100644
--- a/kernel/rcu/refscale.c
+++ b/kernel/rcu/refscale.c
@@ -135,7 +135,7 @@ struct ref_scale_ops {
 	const char *name;
 };
 
-static struct ref_scale_ops *cur_ops;
+static const struct ref_scale_ops *cur_ops;
 
 static void un_delay(const int udl, const int ndl)
 {
@@ -171,7 +171,7 @@ static bool rcu_sync_scale_init(void)
 	return true;
 }
 
-static struct ref_scale_ops rcu_ops = {
+static const struct ref_scale_ops rcu_ops = {
 	.init		= rcu_sync_scale_init,
 	.readsection	= ref_rcu_read_section,
 	.delaysection	= ref_rcu_delay_section,
@@ -205,7 +205,7 @@ static void srcu_ref_scale_delay_section(const int nloops, const int udl, const
 	}
 }
 
-static struct ref_scale_ops srcu_ops = {
+static const struct ref_scale_ops srcu_ops = {
 	.init		= rcu_sync_scale_init,
 	.readsection	= srcu_ref_scale_read_section,
 	.delaysection	= srcu_ref_scale_delay_section,
@@ -232,7 +232,7 @@ static void rcu_tasks_ref_scale_delay_section(const int nloops, const int udl, c
 		un_delay(udl, ndl);
 }
 
-static struct ref_scale_ops rcu_tasks_ops = {
+static const struct ref_scale_ops rcu_tasks_ops = {
 	.init		= rcu_sync_scale_init,
 	.readsection	= rcu_tasks_ref_scale_read_section,
 	.delaysection	= rcu_tasks_ref_scale_delay_section,
@@ -271,7 +271,7 @@ static void rcu_trace_ref_scale_delay_section(const int nloops, const int udl, c
 	}
 }
 
-static struct ref_scale_ops rcu_trace_ops = {
+static const struct ref_scale_ops rcu_trace_ops = {
 	.init		= rcu_sync_scale_init,
 	.readsection	= rcu_trace_ref_scale_read_section,
 	.delaysection	= rcu_trace_ref_scale_delay_section,
@@ -310,7 +310,7 @@ static void ref_refcnt_delay_section(const int nloops, const int udl, const int
 	}
 }
 
-static struct ref_scale_ops refcnt_ops = {
+static const struct ref_scale_ops refcnt_ops = {
 	.init		= rcu_sync_scale_init,
 	.readsection	= ref_refcnt_section,
 	.delaysection	= ref_refcnt_delay_section,
@@ -347,7 +347,7 @@ static void ref_rwlock_delay_section(const int nloops, const int udl, const int
 	}
 }
 
-static struct ref_scale_ops rwlock_ops = {
+static const struct ref_scale_ops rwlock_ops = {
 	.init		= ref_rwlock_init,
 	.readsection	= ref_rwlock_section,
 	.delaysection	= ref_rwlock_delay_section,
@@ -384,7 +384,7 @@ static void ref_rwsem_delay_section(const int nloops, const int udl, const int n
 	}
 }
 
-static struct ref_scale_ops rwsem_ops = {
+static const struct ref_scale_ops rwsem_ops = {
 	.init		= ref_rwsem_init,
 	.readsection	= ref_rwsem_section,
 	.delaysection	= ref_rwsem_delay_section,
@@ -419,7 +419,7 @@ static void ref_lock_delay_section(const int nloops, const int udl, const int nd
 	preempt_enable();
 }
 
-static struct ref_scale_ops lock_ops = {
+static const struct ref_scale_ops lock_ops = {
 	.readsection	= ref_lock_section,
 	.delaysection	= ref_lock_delay_section,
 	.name		= "lock"
@@ -454,7 +454,7 @@ static void ref_lock_irq_delay_section(const int nloops, const int udl, const in
 	preempt_enable();
 }
 
-static struct ref_scale_ops lock_irq_ops = {
+static const struct ref_scale_ops lock_irq_ops = {
 	.readsection	= ref_lock_irq_section,
 	.delaysection	= ref_lock_irq_delay_section,
 	.name		= "lock-irq"
@@ -490,7 +490,7 @@ static void ref_acqrel_delay_section(const int nloops, const int udl, const int
 	preempt_enable();
 }
 
-static struct ref_scale_ops acqrel_ops = {
+static const struct ref_scale_ops acqrel_ops = {
 	.readsection	= ref_acqrel_section,
 	.delaysection	= ref_acqrel_delay_section,
 	.name		= "acqrel"
@@ -524,7 +524,7 @@ static void ref_clock_delay_section(const int nloops, const int udl, const int n
 	stopopts = x;
 }
 
-static struct ref_scale_ops clock_ops = {
+static const struct ref_scale_ops clock_ops = {
 	.readsection	= ref_clock_section,
 	.delaysection	= ref_clock_delay_section,
 	.name		= "clock"
@@ -556,7 +556,7 @@ static void ref_jiffies_delay_section(const int nloops, const int udl, const int
 	stopopts = x;
 }
 
-static struct ref_scale_ops jiffies_ops = {
+static const struct ref_scale_ops jiffies_ops = {
 	.readsection	= ref_jiffies_section,
 	.delaysection	= ref_jiffies_delay_section,
 	.name		= "jiffies"
@@ -706,9 +706,9 @@ static void refscale_typesafe_ctor(void *rtsp_in)
 	preempt_enable();
 }
 
-static struct ref_scale_ops typesafe_ref_ops;
-static struct ref_scale_ops typesafe_lock_ops;
-static struct ref_scale_ops typesafe_seqlock_ops;
+static const struct ref_scale_ops typesafe_ref_ops;
+static const struct ref_scale_ops typesafe_lock_ops;
+static const struct ref_scale_ops typesafe_seqlock_ops;
 
 // Initialize for a typesafe test.
 static bool typesafe_init(void)
@@ -769,7 +769,7 @@ static void typesafe_cleanup(void)
 }
 
 // The typesafe_init() function distinguishes these structures by address.
-static struct ref_scale_ops typesafe_ref_ops = {
+static const struct ref_scale_ops typesafe_ref_ops = {
 	.init		= typesafe_init,
 	.cleanup	= typesafe_cleanup,
 	.readsection	= typesafe_read_section,
@@ -777,7 +777,7 @@ static struct ref_scale_ops typesafe_ref_ops = {
 	.name		= "typesafe_ref"
 };
 
-static struct ref_scale_ops typesafe_lock_ops = {
+static const struct ref_scale_ops typesafe_lock_ops = {
 	.init		= typesafe_init,
 	.cleanup	= typesafe_cleanup,
 	.readsection	= typesafe_read_section,
@@ -785,7 +785,7 @@ static struct ref_scale_ops typesafe_lock_ops = {
 	.name		= "typesafe_lock"
 };
 
-static struct ref_scale_ops typesafe_seqlock_ops = {
+static const struct ref_scale_ops typesafe_seqlock_ops = {
 	.init		= typesafe_init,
 	.cleanup	= typesafe_cleanup,
 	.readsection	= typesafe_read_section,
@@ -1026,7 +1026,7 @@ static int main_func(void *arg)
 }
 
 static void
-ref_scale_print_module_parms(struct ref_scale_ops *cur_ops, const char *tag)
+ref_scale_print_module_parms(const struct ref_scale_ops *cur_ops, const char *tag)
 {
 	pr_alert("%s" SCALE_FLAG
 		 "--- %s:  verbose=%d verbose_batched=%d shutdown=%d holdoff=%d lookup_instances=%ld loops=%ld nreaders=%d nruns=%d readdelay=%d\n", scale_type, tag,
@@ -1081,7 +1081,7 @@ ref_scale_init(void)
 {
 	long i;
 	int firsterr = 0;
-	static struct ref_scale_ops *scale_ops[] = {
+	static const struct ref_scale_ops *scale_ops[] = {
 		&rcu_ops, &srcu_ops, RCU_TRACE_OPS RCU_TASKS_OPS &refcnt_ops, &rwlock_ops,
 		&rwsem_ops, &lock_ops, &lock_irq_ops, &acqrel_ops, &clock_ops, &jiffies_ops,
 		&typesafe_ref_ops, &typesafe_lock_ops, &typesafe_seqlock_ops,
-- 
2.40.1

