public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] srcu-3: add RCU variant that permits read-side blocking
@ 2006-07-06 17:14 Paul E. McKenney
  2006-07-06 17:20 ` [PATCH 1/2] srcu-3: RCU variant permitting " Paul E. McKenney
  2006-07-06 17:25 ` [PATCH 2/2] srcu-3: add SRCU operations to rcutorture Paul E. McKenney
  0 siblings, 2 replies; 5+ messages in thread
From: Paul E. McKenney @ 2006-07-06 17:14 UTC (permalink / raw)
  To: linux-kernel
  Cc: akpm, matthltc, dipankar, stern, mingo, tytso, dvhltc, oleg, jes

Version 3 of the SRCU patchset.

This patch incorporates a number of improvements, many of which came
up in off-list discussions with Alan Stern.  Neither of us are sure
why these discussions ended up off-list, so I have summarized them
below.

o	Fixes some "zombie code" -- excess curly braces and the like.

o	Gets rid of the double-flip in favor of an additional
	synchronize_sched().  This turned out to be safe, despite
	my saying otherwise at http://lkml.org/lkml/2006/6/27/486.
	The trick that I was missing is that synchronize_sched()
	forces all CPUs to execute at least one memory barrier during
	the synchronize_sched()'s execution, which forces all CPUs
	to see synchronize_srcu()'s counter increment as happening
	after any memory manipulations prior to the synchronize_srcu().

	Upgraded comments to indicate what the synchronize_sched()
	calls are needed for.

o	Added a barrier() to both srcu_read_lock() and srcu_read_unlock()
	to prevent the compiler from performing optimizations that
	would cause the critical section to move outside of the
	enclosing srcu_read_lock() and srcu_read_unlock().

	However, these barrier()s in srcu_read_lock() and srcu_read_unlock()
	are needed only in non-CONFIG_PREEMPT kernels, so they compile
	to nothing in CONFIG_PREEMPT kernels (where the preempt_disable()
	and preempt_enable() calls supply the needed barrier() call).

o	Added a check to synchronize_srcu() that permits this primitive
	to take advantage of grace periods induced by concurrent executions
	in other tasks.  This can be useful in cases where you are
	using a single srcu_struct to handle all the individually-locked
	chains of a hash table, for example.

o	cleanup_srcu_struct() now contains error checks to catch cases
	where readers are still using the srcu_struct in question.
	It does a WARN_ON() and leaks the srcu_struct's per-CPU data
	in that case.

o	There is an srcu_readers_active() that returns the number of
	readers (approximate!) currently using the specified srcu_struct.
	This can be useful when terminating use of an srcu_struct, e.g.,
	at module-unload time.

o	Improved the RCU torture tests, increasing the skew on reader
	times and providing implementation-specific delay functions.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 1/2] srcu-3: RCU variant permitting read-side blocking
  2006-07-06 17:14 [PATCH 0/2] srcu-3: add RCU variant that permits read-side blocking Paul E. McKenney
@ 2006-07-06 17:20 ` Paul E. McKenney
       [not found]   ` <20060709235029.GA194@oleg>
  2006-07-06 17:25 ` [PATCH 2/2] srcu-3: add SRCU operations to rcutorture Paul E. McKenney
  1 sibling, 1 reply; 5+ messages in thread
From: Paul E. McKenney @ 2006-07-06 17:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: akpm, matthltc, dipankar, stern, mingo, tytso, dvhltc, oleg, jes

Updated patch adding a variant of RCU that permits sleeping in read-side
critical sections.  SRCU is as follows:

o	Each use of SRCU creates its own srcu_struct, and each
	srcu_struct has its own set of grace periods.  This is
	critical, as it prevents one subsystem with a blocking
	reader from holding up SRCU grace periods for other
	subsystems.

o	The SRCU primitives (srcu_read_lock(), srcu_read_unlock(),
	and synchronize_srcu()) all take a pointer to a srcu_struct.

o	The SRCU primitives must be called from process context.

o	srcu_read_lock() returns an int that must be passed to
	the matching srcu_read_unlock().  Realtime RCU avoids the
	need for this by storing the state in the task struct,
	but SRCU needs to allow a given code path to pass through
	multiple SRCU domains -- storing state in the task struct
	would therefore require either arbitrary space in the
	task struct or arbitrary limits on SRCU nesting.  So I
	kicked the state-storage problem up to the caller.

	Of course, it is not permitted to call synchronize_srcu()
	while in an SRCU read-side critical section.

o	There is no call_srcu().  It would not be hard to implement
	one, but it seems like too easy a way to OOM the system.
	(Hey, we have enough trouble with call_rcu(), which does
	-not- permit readers to sleep!!!)  So, if you want it,
	please tell me why...

Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com>
---

 Documentation/RCU/checklist.txt |   38 ++++++
 Documentation/RCU/rcu.txt       |    3 
 Documentation/RCU/whatisRCU.txt |    3 
 include/linux/srcu.h            |   49 ++++++++
 kernel/Makefile                 |    2 
 kernel/srcu.c                   |  238 ++++++++++++++++++++++++++++++++++++++++
 6 files changed, 331 insertions(+), 2 deletions(-)

diff -urpNa -X dontdiff linux-2.6.17-torturercu_bh/Documentation/RCU/checklist.txt linux-2.6.17-srcu/Documentation/RCU/checklist.txt
--- linux-2.6.17-torturercu_bh/Documentation/RCU/checklist.txt	2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-srcu/Documentation/RCU/checklist.txt	2006-06-30 17:07:02.000000000 -0700
@@ -183,3 +183,41 @@ over a rather long period of time, but i
 	disable irq on a given acquisition of that lock will result in
 	deadlock as soon as the RCU callback happens to interrupt that
 	acquisition's critical section.
+
+13.	SRCU (srcu_read_lock(), srcu_read_unlock(), and synchronize_srcu())
+	may only be invoked from process context.  Unlike other forms of
+	RCU, it -is- permissible to block in an SRCU read-side critical
+	section (demarked by srcu_read_lock() and srcu_read_unlock()),
+	hence the "SRCU": "sleepable RCU".  Please note that if you
+	don't need to sleep in read-side critical sections, you should
+	be using RCU rather than SRCU, because RCU is almost always
+	faster and easier to use than is SRCU.
+
+	Also unlike other forms of RCU, explicit initialization
+	and cleanup is required via init_srcu_struct() and
+	cleanup_srcu_struct().	These are passed a "struct srcu_struct"
+	that defines the scope of a given SRCU domain.	Once initialized,
+	the srcu_struct is passed to srcu_read_lock(), srcu_read_unlock()
+	and synchronize_srcu().  A given synchronize_srcu() waits only
+	for SRCU read-side critical sections governed by srcu_read_lock()
+	and srcu_read_unlock() calls that have been passd the same
+	srcu_struct.  This property is what makes sleeping read-side
+	critical sections tolerable -- a given subsystem delays only
+	its own updates, not those of other subsystems using SRCU.
+	Therefore, SRCU is less prone to OOM the system than RCU would
+	be if RCU's read-side critical sections were permitted to
+	sleep.
+
+	The ability to sleep in read-side critical sections does not
+	come for free.	First, corresponding srcu_read_lock() and
+	srcu_read_unlock() calls must be passed the same srcu_struct.
+	Second, grace-period-detection overhead is amortized only
+	over those updates sharing a given srcu_struct, rather than
+	being globally amortized as they are for other forms of RCU.
+	Therefore, SRCU should be used in preference to rw_semaphore
+	only in extremely read-intensive situations, or in situations
+	requiring SRCU's read-side deadlock immunity or low read-side
+	realtime latency.
+
+	Note that, rcu_assign_pointer() and rcu_dereference() relate to
+	SRCU just as they do to other forms of RCU.
diff -urpNa -X dontdiff linux-2.6.17-torturercu_bh/Documentation/RCU/rcu.txt linux-2.6.17-srcu/Documentation/RCU/rcu.txt
--- linux-2.6.17-torturercu_bh/Documentation/RCU/rcu.txt	2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-srcu/Documentation/RCU/rcu.txt	2006-06-24 08:04:07.000000000 -0700
@@ -45,7 +45,8 @@ o	How can I see where RCU is currently u
 
 	Search for "rcu_read_lock", "rcu_read_unlock", "call_rcu",
 	"rcu_read_lock_bh", "rcu_read_unlock_bh", "call_rcu_bh",
-	"synchronize_rcu", and "synchronize_net".
+	"srcu_read_lock", "srcu_read_unlock", "synchronize_rcu",
+	"synchronize_net", and "synchronize_srcu".
 
 o	What guidelines should I follow when writing code that uses RCU?
 
diff -urpNa -X dontdiff linux-2.6.17-torturercu_bh/Documentation/RCU/whatisRCU.txt linux-2.6.17-srcu/Documentation/RCU/whatisRCU.txt
--- linux-2.6.17-torturercu_bh/Documentation/RCU/whatisRCU.txt	2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-srcu/Documentation/RCU/whatisRCU.txt	2006-06-24 08:04:07.000000000 -0700
@@ -767,6 +767,8 @@ Markers for RCU read-side critical secti
 	rcu_read_unlock
 	rcu_read_lock_bh
 	rcu_read_unlock_bh
+	srcu_read_lock
+	srcu_read_unlock
 
 RCU pointer/list traversal:
 
@@ -794,6 +796,7 @@ RCU grace period:
 	synchronize_net
 	synchronize_sched
 	synchronize_rcu
+	synchronize_srcu
 	call_rcu
 	call_rcu_bh
 
diff -urpNa -X dontdiff linux-2.6.17-torturercu_bh/include/linux/srcu.h linux-2.6.17-srcu/include/linux/srcu.h
--- linux-2.6.17-torturercu_bh/include/linux/srcu.h	1969-12-31 16:00:00.000000000 -0800
+++ linux-2.6.17-srcu/include/linux/srcu.h	2006-07-02 07:27:32.000000000 -0700
@@ -0,0 +1,49 @@
+/*
+ * Sleepable Read-Copy Update mechanism for mutual exclusion
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) IBM Corporation, 2006
+ *
+ * Author: Paul McKenney <paulmck@us.ibm.com>
+ *
+ * For detailed explanation of Read-Copy Update mechanism see -
+ * 		Documentation/RCU/ *.txt
+ *
+ */
+
+struct srcu_struct_array {
+	int c[2];
+};
+
+struct srcu_struct {
+	int completed;
+	struct srcu_struct_array *per_cpu_ref;
+	struct mutex mutex;
+};
+
+#ifndef CONFIG_PREEMPT
+#define srcu_barrier() barrier()
+#else /* #ifndef CONFIG_PREEMPT */
+#define srcu_barrier()
+#endif /* #else #ifndef CONFIG_PREEMPT */
+
+void init_srcu_struct(struct srcu_struct *sp);
+void cleanup_srcu_struct(struct srcu_struct *sp);
+int srcu_read_lock(struct srcu_struct *sp);
+void srcu_read_unlock(struct srcu_struct *sp, int idx);
+void synchronize_srcu(struct srcu_struct *sp);
+long srcu_batches_completed(struct srcu_struct *sp);
+void cleanup_srcu_struct(struct srcu_struct *sp);
diff -urpNa -X dontdiff linux-2.6.17-torturercu_bh/kernel/Makefile linux-2.6.17-srcu/kernel/Makefile
--- linux-2.6.17-torturercu_bh/kernel/Makefile	2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-srcu/kernel/Makefile	2006-06-24 08:04:07.000000000 -0700
@@ -8,7 +8,7 @@ obj-y     = sched.o fork.o exec_domain.o
 	    signal.o sys.o kmod.o workqueue.o pid.o \
 	    rcupdate.o extable.o params.o posix-timers.o \
 	    kthread.o wait.o kfifo.o sys_ni.o posix-cpu-timers.o mutex.o \
-	    hrtimer.o
+	    hrtimer.o srcu.o
 
 obj-$(CONFIG_DEBUG_MUTEXES) += mutex-debug.o
 obj-$(CONFIG_FUTEX) += futex.o
diff -urpNa -X dontdiff linux-2.6.17-torturercu_bh/kernel/srcu.c linux-2.6.17-srcu/kernel/srcu.c
--- linux-2.6.17-torturercu_bh/kernel/srcu.c	1969-12-31 16:00:00.000000000 -0800
+++ linux-2.6.17-srcu/kernel/srcu.c	2006-07-04 18:49:13.000000000 -0700
@@ -0,0 +1,238 @@
+/*
+ * Sleepable Read-Copy Update mechanism for mutual exclusion.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) IBM Corporation, 2006
+ *
+ * Author: Paul McKenney <paulmck@us.ibm.com>
+ *
+ * For detailed explanation of Read-Copy Update mechanism see -
+ * 		Documentation/RCU/ *.txt
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/percpu.h>
+#include <linux/preempt.h>
+#include <linux/rcupdate.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include <linux/smp.h>
+#include <linux/srcu.h>
+
+/**
+ * init_srcu_struct - initialize a sleep-RCU structure
+ * @sp: structure to initialize.
+ *
+ * Must invoke this on a given srcu_struct before passing that srcu_struct
+ * to any other function.  Each srcu_struct represents a separate domain
+ * of SRCU protection.
+ */
+void init_srcu_struct(struct srcu_struct *sp)
+{
+	sp->completed = 0;
+	sp->per_cpu_ref = alloc_percpu(struct srcu_struct_array);
+	mutex_init(&sp->mutex);
+}
+
+/*
+ * srcu_readers_active_idx -- returns approximate number of readers
+ *	active on the specified rank of per-CPU counters.
+ */
+
+static int srcu_readers_active_idx(struct srcu_struct *sp, int idx)
+{
+	int cpu;
+	int sum;
+
+	sum = 0;
+	for_each_possible_cpu(cpu)
+		sum += per_cpu_ptr(sp->per_cpu_ref, cpu)->c[idx];
+	return (sum);
+}
+
+/**
+ * srcu_readers_active - returns approximate number of readers.
+ * @sp: which srcu_struct to count active readers (holding srcu_read_lock).
+ *
+ * Note that this is not an atomic primitive, and can therefore suffer
+ * severe errors when invoked on an active srcu_struct.  That said, it
+ * can be useful as an error check at cleanup time.
+ */
+int srcu_readers_active(struct srcu_struct *sp)
+{
+	return srcu_readers_active_idx(sp, 0) + srcu_readers_active_idx(sp, 1);
+}
+
+/**
+ * cleanup_srcu_struct - deconstruct a sleep-RCU structure
+ * @sp: structure to clean up.
+ *
+ * Must invoke this after you are finished using a given srcu_struct that
+ * was initialized via init_srcu_struct(), else you leak memory.
+ */
+void cleanup_srcu_struct(struct srcu_struct *sp)
+{
+	int sum;
+
+	sum = srcu_readers_active(sp);
+	WARN_ON(sum);  /* Leakage unless caller handles error. */
+	if (sum != 0)
+		return;
+	free_percpu(sp->per_cpu_ref);
+	sp->per_cpu_ref = NULL;
+}
+
+/**
+ * srcu_read_lock - register a new reader for an SRCU-protected structure.
+ * @sp: srcu_struct in which to register the new reader.
+ *
+ * Counts the new reader in the appropriate per-CPU element of the
+ * srcu_struct.  Must be called from process context.
+ * Returns an index that must be passed to the matching srcu_read_unlock().
+ */
+int srcu_read_lock(struct srcu_struct *sp)
+{
+	int idx;
+
+	preempt_disable();
+	idx = sp->completed & 0x1;
+	barrier();  /* ensure compiler looks -once- at sp->completed. */
+	per_cpu_ptr(sp->per_cpu_ref, smp_processor_id())->c[idx]++;
+	srcu_barrier();  /* ensure compiler won't misorder critical section. */
+	preempt_enable();
+	return idx;
+}
+
+/**
+ * srcu_read_unlock - unregister a old reader from an SRCU-protected structure.
+ * @sp: srcu_struct in which to unregister the old reader.
+ * @idx: return value from corresponding srcu_read_lock().
+ *
+ * Removes the count for the old reader from the appropriate per-CPU
+ * element of the srcu_struct.  Note that this may well be a different
+ * CPU than that which was incremented by the corresponding srcu_read_lock().
+ * Must be called from process context.
+ */
+void srcu_read_unlock(struct srcu_struct *sp, int idx)
+{
+	preempt_disable();
+	srcu_barrier();  /* ensure compiler won't misorder critical section. */
+	per_cpu_ptr(sp->per_cpu_ref, smp_processor_id())->c[idx]--;
+	preempt_enable();
+}
+
+/**
+ * synchronize_srcu - wait for prior SRCU read-side critical-section completion
+ * @sp: srcu_struct with which to synchronize.
+ *
+ * Flip the completed counter, and wait for the old count to drain to zero.
+ * As with classic RCU, the updater must use some separate means of
+ * synchronizing concurrent updates.  Can block; must be called from
+ * process context.
+ */
+void synchronize_srcu(struct srcu_struct *sp)
+{
+	int idx;
+	int sum;
+
+	idx = sp->completed;
+	mutex_lock(&sp->mutex);
+
+	/*
+	 * Check to see if someone else did the work for us while we were
+	 * waiting to acquire the lock.  We need -two- advances of
+	 * the counter, not just one.  If there was but one, we might have
+	 * shown up -after- our helper's first synchronize_sched(), thus
+	 * having failed to prevent CPU-reordering races with concurrent
+	 * srcu_read_unlock()s on other CPUs (see comment below).  So we
+	 * either (1) wait for two or (2) supply the second ourselves.
+	 */
+
+	if ((sp->completed - idx) >= 2) {
+		mutex_unlock(&sp->mutex);
+		return;
+	}
+
+	synchronize_sched();  /* Force memory barrier on all CPUs. */
+
+	/*
+	 * The preceding synchronize_sched() ensures that any CPU that
+	 * sees the new value of sp->completed will also see any preceding
+	 * changes to data structures made by this CPU.  This prevents
+	 * some other CPU from reordering the accesses in its SRCU
+	 * read-side critical section to precede the corresponding
+	 * srcu_read_lock() -- ensuring that such references will in
+	 * fact be protected.
+	 *
+	 * So it is now safe to do the flip.
+	 */
+
+	idx = sp->completed & 0x1;
+	sp->completed++;
+
+	synchronize_sched();  /* Force memory barrier on all CPUs. */
+
+	/*
+	 * At this point, because of the preceding synchronize_sched(),
+	 * all srcu_read_lock() calls using the old counters have completed.
+	 * Their corresponding critical sections might well be still
+	 * executing, but the srcu_read_lock() primitives themselves
+	 * will have finished executing.
+	 */
+
+	for (;;) {
+		sum = srcu_readers_active_idx(sp, idx);
+		if (sum == 0)
+			break;
+		schedule_timeout_interruptible(1);
+	}
+
+	synchronize_sched();  /* Force memory barrier on all CPUs. */
+
+	/*
+	 * The preceding synchronize_sched() forces all srcu_read_unlock()
+	 * primitives that were executing concurrently with the preceding
+	 * for_each_possible_cpu() loop to have completed by this point.
+	 * More importantly, it also forces the corresponding SRCU read-side
+	 * critical sections to have also completed, and the corresponding
+	 * references to SRCU-protected data items to be dropped.
+	 */
+
+	mutex_unlock(&sp->mutex);
+}
+
+/**
+ * srcu_batches_completed - return batches completed.
+ * @sp: srcu_struct on which to report batch completion.
+ *
+ * Report the number of batches, correlated with, but not necessarily
+ * precisely the same as, the number of grace periods that have elapsed.
+ */
+
+long srcu_batches_completed(struct srcu_struct *sp)
+{
+	return sp->completed;
+}
+
+EXPORT_SYMBOL_GPL(init_srcu_struct);
+EXPORT_SYMBOL_GPL(cleanup_srcu_struct);
+EXPORT_SYMBOL_GPL(srcu_read_lock);
+EXPORT_SYMBOL_GPL(srcu_read_unlock);
+EXPORT_SYMBOL_GPL(synchronize_srcu);
+EXPORT_SYMBOL_GPL(srcu_batches_completed);
+EXPORT_SYMBOL_GPL(srcu_readers_active);

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 2/2] srcu-3: add SRCU operations to rcutorture
  2006-07-06 17:14 [PATCH 0/2] srcu-3: add RCU variant that permits read-side blocking Paul E. McKenney
  2006-07-06 17:20 ` [PATCH 1/2] srcu-3: RCU variant permitting " Paul E. McKenney
@ 2006-07-06 17:25 ` Paul E. McKenney
  1 sibling, 0 replies; 5+ messages in thread
From: Paul E. McKenney @ 2006-07-06 17:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: akpm, matthltc, dipankar, stern, mingo, tytso, dvhltc, oleg, jes

Adds SRCU operations to rcutorture and updates rcutorture documentation.
Also increases the stress imposed by the rcutorture test.

Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com>
---

 Documentation/RCU/torture.txt |   15 +++++
 kernel/rcutorture.c           |  121 ++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 132 insertions(+), 4 deletions(-)

diff -urpNa -X dontdiff linux-2.6.17-srcu/Documentation/RCU/torture.txt linux-2.6.17-torturesrcu/Documentation/RCU/torture.txt
--- linux-2.6.17-srcu/Documentation/RCU/torture.txt	2006-06-24 12:13:12.000000000 -0700
+++ linux-2.6.17-torturesrcu/Documentation/RCU/torture.txt	2006-06-26 18:50:33.000000000 -0700
@@ -118,6 +118,21 @@ o	"Free-Block Circulation": Shows the nu
 	as it is only incremented if a torture structure's counter
 	somehow gets incremented farther than it should.
 
+Different implementations of RCU can provide implementation-specific
+additional information.  For example, SRCU provides the following:
+
+	srcu-torture: rtc: f8cf46a8 ver: 355 tfle: 0 rta: 356 rtaf: 0 rtf: 346 rtmbe: 0
+	srcu-torture: Reader Pipe:  559738 939 0 0 0 0 0 0 0 0 0
+	srcu-torture: Reader Batch:  560434 243 0 0 0 0 0 0 0 0
+	srcu-torture: Free-Block Circulation:  355 354 353 352 351 350 349 348 347 346 0
+	srcu-torture: per-CPU(idx=1): 0(0,1) 1(0,1) 2(0,0) 3(0,1)
+
+The first four lines are similar to those for RCU.  The last line shows
+the per-CPU counter state.  The numbers in parentheses are the values
+of the "old" and "current" counters for the corresponding CPU.  The
+"idx" value maps the "old" and "current" values to the underlying array,
+and is useful for debugging.
+
 
 USAGE
 
diff -urpNa -X dontdiff linux-2.6.17-srcu/kernel/rcutorture.c linux-2.6.17-torturesrcu/kernel/rcutorture.c
--- linux-2.6.17-srcu/kernel/rcutorture.c	2006-06-30 15:03:35.000000000 -0700
+++ linux-2.6.17-torturesrcu/kernel/rcutorture.c	2006-07-02 07:34:35.000000000 -0700
@@ -44,6 +44,7 @@
 #include <linux/delay.h>
 #include <linux/byteorder/swabb.h>
 #include <linux/stat.h>
+#include <linux/srcu.h>
 
 MODULE_LICENSE("GPL");
 
@@ -53,7 +54,7 @@ static int stat_interval;	/* Interval be
 static int verbose;		/* Print more debug info. */
 static int test_no_idle_hz;	/* Test RCU's support for tickless idle CPUs. */
 static int shuffle_interval = 5; /* Interval between shuffles (in sec)*/
-static char *torture_type = "rcu"; /* What to torture. */
+static char *torture_type = "rcu"; /* What to torture: rcu, srcu. */
 
 module_param(nreaders, int, 0);
 MODULE_PARM_DESC(nreaders, "Number of RCU reader threads");
@@ -66,7 +67,7 @@ MODULE_PARM_DESC(test_no_idle_hz, "Test 
 module_param(shuffle_interval, int, 0);
 MODULE_PARM_DESC(shuffle_interval, "Number of seconds between shuffles");
 module_param(torture_type, charp, 0);
-MODULE_PARM_DESC(torture_type, "Type of RCU to torture (rcu, rcu_bh)");
+MODULE_PARM_DESC(torture_type, "Type of RCU to torture (rcu, rcu_bh, srcu)");
 
 #define TORTURE_FLAG "-torture:"
 #define PRINTK_STRING(s) \
@@ -180,6 +181,7 @@ struct rcu_torture_ops {
 	void (*init)(void);
 	void (*cleanup)(void);
 	int (*readlock)(void);
+	void (*readdelay)(struct rcu_random_state *rrsp);
 	void (*readunlock)(int idx);
 	int (*completed)(void);
 	void (*deferredfree)(struct rcu_torture *p);
@@ -198,6 +200,18 @@ static int rcu_torture_read_lock(void)
 	return 0;
 }
 
+static void rcu_read_delay(struct rcu_random_state *rrsp)
+{
+	long delay;
+	const long longdelay = 200;
+
+	/* We want there to be long-running readers, but not all the time. */
+
+	delay = rcu_random(rrsp) % (nrealreaders * 2 * longdelay);
+	if (!delay)
+		udelay(longdelay);
+}
+
 static void rcu_torture_read_unlock(int idx)
 {
 	rcu_read_unlock();
@@ -239,6 +253,7 @@ static struct rcu_torture_ops rcu_ops = 
 	.init = NULL,
 	.cleanup = NULL,
 	.readlock = rcu_torture_read_lock,
+	.readdelay = rcu_read_delay,
 	.readunlock = rcu_torture_read_unlock,
 	.completed = rcu_torture_completed,
 	.deferredfree = rcu_torture_deferred_free,
@@ -275,6 +290,7 @@ static struct rcu_torture_ops rcu_bh_ops
 	.init = NULL,
 	.cleanup = NULL,
 	.readlock = rcu_bh_torture_read_lock,
+	.readdelay = rcu_read_delay,  /* just reuse rcu's version. */
 	.readunlock = rcu_bh_torture_read_unlock,
 	.completed = rcu_bh_torture_completed,
 	.deferredfree = rcu_bh_torture_deferred_free,
@@ -282,8 +298,105 @@ static struct rcu_torture_ops rcu_bh_ops
 	.name = "rcu_bh"
 };
 
+/*
+ * Definitions for srcu torture testing.
+ */
+
+static struct srcu_struct srcu_ctl;
+static struct list_head srcu_removed;
+
+static void srcu_torture_init(void)
+{
+	init_srcu_struct(&srcu_ctl);
+	INIT_LIST_HEAD(&srcu_removed);
+}
+
+static void srcu_torture_cleanup(void)
+{
+	synchronize_srcu(&srcu_ctl);
+	cleanup_srcu_struct(&srcu_ctl);
+}
+
+static int srcu_torture_read_lock(void)
+{
+	return srcu_read_lock(&srcu_ctl);
+}
+
+static void srcu_read_delay(struct rcu_random_state *rrsp)
+{
+	long delay;
+	const long uspertick = 1000000 / HZ;
+	const long longdelay = 10;
+
+	/* We want there to be long-running readers, but not all the time. */
+
+	delay = rcu_random(rrsp) % (nrealreaders * 2 * longdelay * uspertick);
+	if (!delay)
+		schedule_timeout_interruptible(longdelay);
+}
+
+static void srcu_torture_read_unlock(int idx)
+{
+	srcu_read_unlock(&srcu_ctl, idx);
+}
+
+static int srcu_torture_completed(void)
+{
+	return srcu_batches_completed(&srcu_ctl);
+}
+
+static void srcu_torture_deferred_free(struct rcu_torture *p)
+{
+	int i;
+	struct rcu_torture *rp;
+	struct rcu_torture *rp1;
+
+	synchronize_srcu(&srcu_ctl);
+	list_add(&p->rtort_free, &srcu_removed);
+	list_for_each_entry_safe(rp, rp1, &srcu_removed, rtort_free) {
+		i = rp->rtort_pipe_count;
+		if (i > RCU_TORTURE_PIPE_LEN)
+			i = RCU_TORTURE_PIPE_LEN;
+		atomic_inc(&rcu_torture_wcount[i]);
+		if (++rp->rtort_pipe_count >= RCU_TORTURE_PIPE_LEN) {
+			rp->rtort_mbtest = 0;
+			list_del(&rp->rtort_free);
+			rcu_torture_free(rp);
+		}
+	}
+}
+
+int srcu_torture_stats(char *page)
+{
+	int cnt = 0;
+	int cpu;
+	int idx = srcu_ctl.completed & 0x1;
+
+	cnt += sprintf(&page[cnt], "%s%s per-CPU(idx=%d):",
+		       torture_type, TORTURE_FLAG, idx);
+	for_each_possible_cpu(cpu) {
+		cnt += sprintf(&page[cnt], " %d(%d,%d)", cpu,
+			       per_cpu_ptr(srcu_ctl.per_cpu_ref, cpu)->c[!idx],
+			       per_cpu_ptr(srcu_ctl.per_cpu_ref, cpu)->c[idx]);
+	}
+	cnt += sprintf(&page[cnt], "\n");
+	return cnt;
+}
+
+static struct rcu_torture_ops srcu_ops = {
+	.init = srcu_torture_init,
+	.cleanup = srcu_torture_cleanup,
+	.readlock = srcu_torture_read_lock,
+	.readdelay = srcu_read_delay,
+	.readunlock = srcu_torture_read_unlock,
+	.completed = srcu_torture_completed,
+	.deferredfree = srcu_torture_deferred_free,
+	.stats = srcu_torture_stats,
+	.name = "srcu"
+};
+
 static struct rcu_torture_ops *torture_ops[] =
-	{ &rcu_ops, &rcu_bh_ops, NULL };
+	{ &rcu_ops, &rcu_bh_ops, &srcu_ops, NULL };
 
 /*
  * RCU torture writer kthread.  Repeatedly substitutes a new structure
@@ -359,7 +472,7 @@ rcu_torture_reader(void *arg)
 		}
 		if (p->rtort_mbtest == 0)
 			atomic_inc(&n_rcu_torture_mberror);
-		udelay(rcu_random(&rand) & 0x7f);
+		cur_ops->readdelay(&rand);
 		preempt_disable();
 		pipe_count = p->rtort_pipe_count;
 		if (pipe_count > RCU_TORTURE_PIPE_LEN) {

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 1/2] srcu-3: RCU variant permitting read-side blocking
       [not found]   ` <20060709235029.GA194@oleg>
@ 2006-07-10 16:51     ` Paul E. McKenney
       [not found]       ` <44B29212.1070301@yahoo.com.au>
  0 siblings, 1 reply; 5+ messages in thread
From: Paul E. McKenney @ 2006-07-10 16:51 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: linux-kernel, akpm, matthltc, dipankar, stern, mingo, tytso,
	dvhltc, jes, dhowells

On Mon, Jul 10, 2006 at 03:50:29AM +0400, Oleg Nesterov wrote:
> On 07/06, Paul E. McKenney wrote:
> >
> > Updated patch adding a variant of RCU that permits sleeping in read-side
> > critical sections.
> 
> I do not see any problems with this patch, but I have a couple of
> questions, so your help is needed again.

Thank you for looking it over!

> > +void synchronize_srcu(struct srcu_struct *sp)
> > +{
> > +	[... snip ...]
> > +
> > +	synchronize_sched();  /* Force memory barrier on all CPUs. */
> > +
> > +	/*
> > +	 * The preceding synchronize_sched() forces all srcu_read_unlock()
> > +	 * primitives that were executing concurrently with the preceding
> > +	 * for_each_possible_cpu() loop to have completed by this point.
> > +	 * More importantly, it also forces the corresponding SRCU read-side
> > +	 * critical sections to have also completed, and the corresponding
> > +	 * references to SRCU-protected data items to be dropped.
> > +	 */
> > +
> > +	mutex_unlock(&sp->mutex);
> > +}
> 
> Isn't it possible to unlock ->mutex earlier, before the last
> synchronize_sched()?

It seems possible, but I would like to think carefully about this one
first, and, if it still seems plausible, test it heavily.  If I understand
your line of reasoning, the thought is that the first synchronize_sched()
at the beginning of synchronize_srcu() ensures that all of the counter
updates pertaining to the last instance of synchronize_srcu() have
been committed.  The same reasoning might well cover the sp->completed
fastpath as well.

In any case, this is a performance boost off the fastpath.  A good boost,
if it works, but I will be much more excited if you find a way of speeding
up srcu_read_lock() or srcu_read_unlock().  ;-)

> Another question: what is the semantics of synchronize_sched() ?
> 
> I am not talking about the current implementation, it is very clear.
> The question is: what is the _definition_ of synchronize_sched()
> (which must be valid for "any" RCU implementation) ?
> 
> 1) The comment in include/linux/rcupdate.h states that "all preempt_disable
>    code sequences will have completed before this primitive returns".
> 
> 2) kernel/srcu.c claims that this primitive "forces memory barrier on all
>    CPUs". (so the comment in rcupdate.h is not complete).
> 
>    (I understand this so that each cpu does something which implies mb()
>     semantics).
> 
> As I see it, 1) + 2) is NOT enough for synchronize_srcu() to be correct
> (the 2-nd and 3-rd synchronize_sched() calls). I think synchronize_sched()
> should also guarantee the completion of mem ops on all CPUs before return,
> not just mb() (which does not have any timing guaranties).
> 
> Could you clarify this issue?
> 
> (Again, I do not see any problems with the current RCU implementation).

However, this -does- seem to be to be a problem with the comment headers
and the documentation.  Does the following patch make things better?

David, would it be worthwhile adding this global-memory-barrier effect
of synchronize_rcu(), synchronize_sched(), and synchronize_srcu() to
Documentation/memory-barriers.txt?

							Thanx, Paul

Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com>
---

 Documentation/RCU/checklist.txt |    4 ++++
 include/linux/rcupdate.h        |    3 +++
 kernel/rcupdate.c               |    3 +++
 kernel/srcu.c                   |    3 ++-
 4 files changed, 12 insertions(+), 1 deletion(-)

diff -urpNa -X dontdiff linux-2.6.17-srcu-LKML-4/Documentation/RCU/checklist.txt linux-2.6.17-srcu-LKML-5/Documentation/RCU/checklist.txt
--- linux-2.6.17-srcu-LKML-4/Documentation/RCU/checklist.txt	2006-07-06 16:45:01.000000000 -0700
+++ linux-2.6.17-srcu-LKML-5/Documentation/RCU/checklist.txt	2006-07-10 09:43:19.000000000 -0700
@@ -221,3 +221,7 @@ over a rather long period of time, but i
 
 	Note that, rcu_assign_pointer() and rcu_dereference() relate to
 	SRCU just as they do to other forms of RCU.
+
+14.	The synchronize_rcu(), synchronize_sched(), and synchronize_srcu()
+	primitives force at least one memory barrier to be executed on
+	each active CPU before they return.
diff -urpNa -X dontdiff linux-2.6.17-srcu-LKML-4/include/linux/rcupdate.h linux-2.6.17-srcu-LKML-5/include/linux/rcupdate.h
--- linux-2.6.17-srcu-LKML-4/include/linux/rcupdate.h	2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-srcu-LKML-5/include/linux/rcupdate.h	2006-07-10 09:48:51.000000000 -0700
@@ -251,6 +251,9 @@ extern int rcu_needs_cpu(int cpu);
  * guarantees that rcu_read_lock() sections will have completed.
  * In "classic RCU", these two guarantees happen to be one and
  * the same, but can differ in realtime RCU implementations.
+ * 
+ * In addition, this primitive guarantees that every active CPU has
+ * executed at least one memory barrier before it returns.
  */
 #define synchronize_sched() synchronize_rcu()
 
diff -urpNa -X dontdiff linux-2.6.17-srcu-LKML-4/kernel/rcupdate.c linux-2.6.17-srcu-LKML-5/kernel/rcupdate.c
--- linux-2.6.17-srcu-LKML-4/kernel/rcupdate.c	2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-srcu-LKML-5/kernel/rcupdate.c	2006-07-10 09:48:32.000000000 -0700
@@ -597,6 +597,9 @@ static void wakeme_after_rcu(struct rcu_
  * sections are delimited by rcu_read_lock() and rcu_read_unlock(),
  * and may be nested.
  *
+ * This primitive also causes each active CPU to execute at least one
+ * memory barrier before it returns.
+ *
  * If your read-side code is not protected by rcu_read_lock(), do -not-
  * use synchronize_rcu().
  */
diff -urpNa -X dontdiff linux-2.6.17-srcu-LKML-4/kernel/srcu.c linux-2.6.17-srcu-LKML-5/kernel/srcu.c
--- linux-2.6.17-srcu-LKML-4/kernel/srcu.c	2006-07-06 16:50:23.000000000 -0700
+++ linux-2.6.17-srcu-LKML-5/kernel/srcu.c	2006-07-10 09:48:09.000000000 -0700
@@ -143,7 +143,8 @@ void srcu_read_unlock(struct srcu_struct
  * Flip the completed counter, and wait for the old count to drain to zero.
  * As with classic RCU, the updater must use some separate means of
  * synchronizing concurrent updates.  Can block; must be called from
- * process context.
+ * process context.  Has the side-effect of forcing a memory barrier on
+ * each active CPU before returning.
  *
  * Note that it is illegal to call synchornize_srcu() from the corresponding
  * SRCU read-side critical section; doing so will result in deadlock.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 1/2] srcu-3: RCU variant permitting read-side blocking
       [not found]       ` <44B29212.1070301@yahoo.com.au>
@ 2006-07-11 14:19         ` Paul E. McKenney
  0 siblings, 0 replies; 5+ messages in thread
From: Paul E. McKenney @ 2006-07-11 14:19 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Oleg Nesterov, linux-kernel, akpm, matthltc, dipankar, stern,
	mingo, tytso, dvhltc, jes, dhowells

On Tue, Jul 11, 2006 at 03:44:50AM +1000, Nick Piggin wrote:
> Paul E. McKenney wrote:
> >On Mon, Jul 10, 2006 at 03:50:29AM +0400, Oleg Nesterov wrote:
> 
> >>As I see it, 1) + 2) is NOT enough for synchronize_srcu() to be correct
> >>(the 2-nd and 3-rd synchronize_sched() calls). I think synchronize_sched()
> >>should also guarantee the completion of mem ops on all CPUs before return,
> >>not just mb() (which does not have any timing guaranties).
> >>
> >>Could you clarify this issue?
> >>
> >>(Again, I do not see any problems with the current RCU implementation).
> >
> >
> >However, this -does- seem to be to be a problem with the comment headers
> >and the documentation.  Does the following patch make things better?
> >
> >David, would it be worthwhile adding this global-memory-barrier effect
> >of synchronize_rcu(), synchronize_sched(), and synchronize_srcu() to
> 
> And call_rcu? (or is that already tucked away in the documentation
> somewhere?) ie. there is a memory barrier between the call_rcu() call
> and the actual callback.
> 
> This is something I needed clarification with (as you might remember),
> which might not be clear from an RCU API user's point of view.

Good point -- since synchronize_rcu() is just a wrapper around call_rcu(),
they do have the same properties.  How about the following?

						Thanx, Paul

Signed-off-by:  Paul E. McKenney <paulmck@us.ibm.com>

 Documentation/RCU/checklist.txt |    4 +++-
 include/linux/srcu.h            |    5 +++++
 kernel/rcupdate.c               |    4 ++++
 3 files changed, 12 insertions(+), 1 deletion(-)

diff -urpNa -X dontdiff linux-2.6.17-srcu-LKML-5/Documentation/RCU/checklist.txt linux-2.6.17-srcu-LKML-6/Documentation/RCU/checklist.txt
--- linux-2.6.17-srcu-LKML-5/Documentation/RCU/checklist.txt	2006-07-10 09:43:19.000000000 -0700
+++ linux-2.6.17-srcu-LKML-6/Documentation/RCU/checklist.txt	2006-07-11 07:12:25.000000000 -0700
@@ -224,4 +224,6 @@ over a rather long period of time, but i
 
 14.	The synchronize_rcu(), synchronize_sched(), and synchronize_srcu()
 	primitives force at least one memory barrier to be executed on
-	each active CPU before they return.
+	each active CPU before they return.  Similarly, call_rcu()
+	forces at least one memory barrier to be executed on each active
+	CPU before the corresponding callback is invoked.
diff -urpNa -X dontdiff linux-2.6.17-srcu-LKML-5/kernel/rcupdate.c linux-2.6.17-srcu-LKML-6/kernel/rcupdate.c
--- linux-2.6.17-srcu-LKML-5/kernel/rcupdate.c	2006-07-10 09:48:32.000000000 -0700
+++ linux-2.6.17-srcu-LKML-6/kernel/rcupdate.c	2006-07-11 07:11:07.000000000 -0700
@@ -116,6 +116,10 @@ static inline void force_quiescent_state
  * read-side critical sections have completed.  RCU read-side critical
  * sections are delimited by rcu_read_lock() and rcu_read_unlock(),
  * and may be nested.
+ *
+ * There will be at least one memory barrier executed on each active
+ * CPU between the time call_rcu() is invoked and the time that the
+ * corresponding callback is invoked.
  */
 void fastcall call_rcu(struct rcu_head *head,
 				void (*func)(struct rcu_head *rcu))

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2006-07-11 14:18 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-07-06 17:14 [PATCH 0/2] srcu-3: add RCU variant that permits read-side blocking Paul E. McKenney
2006-07-06 17:20 ` [PATCH 1/2] srcu-3: RCU variant permitting " Paul E. McKenney
     [not found]   ` <20060709235029.GA194@oleg>
2006-07-10 16:51     ` Paul E. McKenney
     [not found]       ` <44B29212.1070301@yahoo.com.au>
2006-07-11 14:19         ` Paul E. McKenney
2006-07-06 17:25 ` [PATCH 2/2] srcu-3: add SRCU operations to rcutorture Paul E. McKenney

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox