[PATCH 0/2] srcu-3: add RCU variant that permits read-side blocking

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH 0/2] srcu-3: add RCU variant that permits read-side blocking
@ 2006-07-06 17:14 Paul E. McKenney
  2006-07-06 17:20 ` [PATCH 1/2] srcu-3: RCU variant permitting " Paul E. McKenney
  2006-07-06 17:25 ` [PATCH 2/2] srcu-3: add SRCU operations to rcutorture Paul E. McKenney
  0 siblings, 2 replies; 13+ messages in thread
From: Paul E. McKenney @ 2006-07-06 17:14 UTC (permalink / raw)
  To: linux-kernel
  Cc: akpm, matthltc, dipankar, stern, mingo, tytso, dvhltc, oleg, jes

Version 3 of the SRCU patchset.

This patch incorporates a number of improvements, many of which came
up in off-list discussions with Alan Stern.  Neither of us are sure
why these discussions ended up off-list, so I have summarized them
below.

o	Fixes some "zombie code" -- excess curly braces and the like.

o	Gets rid of the double-flip in favor of an additional
	synchronize_sched().  This turned out to be safe, despite
	my saying otherwise at http://lkml.org/lkml/2006/6/27/486.
	The trick that I was missing is that synchronize_sched()
	forces all CPUs to execute at least one memory barrier during
	the synchronize_sched()'s execution, which forces all CPUs
	to see synchronize_srcu()'s counter increment as happening
	after any memory manipulations prior to the synchronize_srcu().

	Upgraded comments to indicate what the synchronize_sched()
	calls are needed for.

o	Added a barrier() to both srcu_read_lock() and srcu_read_unlock()
	to prevent the compiler from performing optimizations that
	would cause the critical section to move outside of the
	enclosing srcu_read_lock() and srcu_read_unlock().

	However, these barrier()s in srcu_read_lock() and srcu_read_unlock()
	are needed only in non-CONFIG_PREEMPT kernels, so they compile
	to nothing in CONFIG_PREEMPT kernels (where the preempt_disable()
	and preempt_enable() calls supply the needed barrier() call).

o	Added a check to synchronize_srcu() that permits this primitive
	to take advantage of grace periods induced by concurrent executions
	in other tasks.  This can be useful in cases where you are
	using a single srcu_struct to handle all the individually-locked
	chains of a hash table, for example.

o	cleanup_srcu_struct() now contains error checks to catch cases
	where readers are still using the srcu_struct in question.
	It does a WARN_ON() and leaks the srcu_struct's per-CPU data
	in that case.

o	There is an srcu_readers_active() that returns the number of
	readers (approximate!) currently using the specified srcu_struct.
	This can be useful when terminating use of an srcu_struct, e.g.,
	at module-unload time.

o	Improved the RCU torture tests, increasing the skew on reader
	times and providing implementation-specific delay functions.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/2] srcu-3: RCU variant permitting read-side blocking
  2006-07-06 17:14 [PATCH 0/2] srcu-3: add RCU variant that permits read-side blocking Paul E. McKenney
@ 2006-07-06 17:20 ` Paul E. McKenney
       [not found]   ` <20060709235029.GA194@oleg>
  2006-07-06 17:25 ` [PATCH 2/2] srcu-3: add SRCU operations to rcutorture Paul E. McKenney
  1 sibling, 1 reply; 13+ messages in thread
From: Paul E. McKenney @ 2006-07-06 17:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: akpm, matthltc, dipankar, stern, mingo, tytso, dvhltc, oleg, jes

Updated patch adding a variant of RCU that permits sleeping in read-side
critical sections.  SRCU is as follows:

o	Each use of SRCU creates its own srcu_struct, and each
	srcu_struct has its own set of grace periods.  This is
	critical, as it prevents one subsystem with a blocking
	reader from holding up SRCU grace periods for other
	subsystems.

o	The SRCU primitives (srcu_read_lock(), srcu_read_unlock(),
	and synchronize_srcu()) all take a pointer to a srcu_struct.

o	The SRCU primitives must be called from process context.

o	srcu_read_lock() returns an int that must be passed to
	the matching srcu_read_unlock().  Realtime RCU avoids the
	need for this by storing the state in the task struct,
	but SRCU needs to allow a given code path to pass through
	multiple SRCU domains -- storing state in the task struct
	would therefore require either arbitrary space in the
	task struct or arbitrary limits on SRCU nesting.  So I
	kicked the state-storage problem up to the caller.

	Of course, it is not permitted to call synchronize_srcu()
	while in an SRCU read-side critical section.

o	There is no call_srcu().  It would not be hard to implement
	one, but it seems like too easy a way to OOM the system.
	(Hey, we have enough trouble with call_rcu(), which does
	-not- permit readers to sleep!!!)  So, if you want it,
	please tell me why...

Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com>
---

 Documentation/RCU/checklist.txt |   38 ++++++
 Documentation/RCU/rcu.txt       |    3 
 Documentation/RCU/whatisRCU.txt |    3 
 include/linux/srcu.h            |   49 ++++++++
 kernel/Makefile                 |    2 
 kernel/srcu.c                   |  238 ++++++++++++++++++++++++++++++++++++++++
 6 files changed, 331 insertions(+), 2 deletions(-)

diff -urpNa -X dontdiff linux-2.6.17-torturercu_bh/Documentation/RCU/checklist.txt linux-2.6.17-srcu/Documentation/RCU/checklist.txt
--- linux-2.6.17-torturercu_bh/Documentation/RCU/checklist.txt	2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-srcu/Documentation/RCU/checklist.txt	2006-06-30 17:07:02.000000000 -0700
@@ -183,3 +183,41 @@ over a rather long period of time, but i
 	disable irq on a given acquisition of that lock will result in
 	deadlock as soon as the RCU callback happens to interrupt that
 	acquisition's critical section.
+
+13.	SRCU (srcu_read_lock(), srcu_read_unlock(), and synchronize_srcu())
+	may only be invoked from process context.  Unlike other forms of
+	RCU, it -is- permissible to block in an SRCU read-side critical
+	section (demarked by srcu_read_lock() and srcu_read_unlock()),
+	hence the "SRCU": "sleepable RCU".  Please note that if you
+	don't need to sleep in read-side critical sections, you should
+	be using RCU rather than SRCU, because RCU is almost always
+	faster and easier to use than is SRCU.
+
+	Also unlike other forms of RCU, explicit initialization
+	and cleanup is required via init_srcu_struct() and
+	cleanup_srcu_struct().	These are passed a "struct srcu_struct"
+	that defines the scope of a given SRCU domain.	Once initialized,
+	the srcu_struct is passed to srcu_read_lock(), srcu_read_unlock()
+	and synchronize_srcu().  A given synchronize_srcu() waits only
+	for SRCU read-side critical sections governed by srcu_read_lock()
+	and srcu_read_unlock() calls that have been passd the same
+	srcu_struct.  This property is what makes sleeping read-side
+	critical sections tolerable -- a given subsystem delays only
+	its own updates, not those of other subsystems using SRCU.
+	Therefore, SRCU is less prone to OOM the system than RCU would
+	be if RCU's read-side critical sections were permitted to
+	sleep.
+
+	The ability to sleep in read-side critical sections does not
+	come for free.	First, corresponding srcu_read_lock() and
+	srcu_read_unlock() calls must be passed the same srcu_struct.
+	Second, grace-period-detection overhead is amortized only
+	over those updates sharing a given srcu_struct, rather than
+	being globally amortized as they are for other forms of RCU.
+	Therefore, SRCU should be used in preference to rw_semaphore
+	only in extremely read-intensive situations, or in situations
+	requiring SRCU's read-side deadlock immunity or low read-side
+	realtime latency.
+
+	Note that, rcu_assign_pointer() and rcu_dereference() relate to
+	SRCU just as they do to other forms of RCU.
diff -urpNa -X dontdiff linux-2.6.17-torturercu_bh/Documentation/RCU/rcu.txt linux-2.6.17-srcu/Documentation/RCU/rcu.txt
--- linux-2.6.17-torturercu_bh/Documentation/RCU/rcu.txt	2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-srcu/Documentation/RCU/rcu.txt	2006-06-24 08:04:07.000000000 -0700
@@ -45,7 +45,8 @@ o	How can I see where RCU is currently u
 
 	Search for "rcu_read_lock", "rcu_read_unlock", "call_rcu",
 	"rcu_read_lock_bh", "rcu_read_unlock_bh", "call_rcu_bh",
-	"synchronize_rcu", and "synchronize_net".
+	"srcu_read_lock", "srcu_read_unlock", "synchronize_rcu",
+	"synchronize_net", and "synchronize_srcu".
 
 o	What guidelines should I follow when writing code that uses RCU?
 
diff -urpNa -X dontdiff linux-2.6.17-torturercu_bh/Documentation/RCU/whatisRCU.txt linux-2.6.17-srcu/Documentation/RCU/whatisRCU.txt
--- linux-2.6.17-torturercu_bh/Documentation/RCU/whatisRCU.txt	2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-srcu/Documentation/RCU/whatisRCU.txt	2006-06-24 08:04:07.000000000 -0700
@@ -767,6 +767,8 @@ Markers for RCU read-side critical secti
 	rcu_read_unlock
 	rcu_read_lock_bh
 	rcu_read_unlock_bh
+	srcu_read_lock
+	srcu_read_unlock
 
 RCU pointer/list traversal:
 
@@ -794,6 +796,7 @@ RCU grace period:
 	synchronize_net
 	synchronize_sched
 	synchronize_rcu
+	synchronize_srcu
 	call_rcu
 	call_rcu_bh
 
diff -urpNa -X dontdiff linux-2.6.17-torturercu_bh/include/linux/srcu.h linux-2.6.17-srcu/include/linux/srcu.h
--- linux-2.6.17-torturercu_bh/include/linux/srcu.h	1969-12-31 16:00:00.000000000 -0800
+++ linux-2.6.17-srcu/include/linux/srcu.h	2006-07-02 07:27:32.000000000 -0700
@@ -0,0 +1,49 @@
+/*
+ * Sleepable Read-Copy Update mechanism for mutual exclusion
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) IBM Corporation, 2006
+ *
+ * Author: Paul McKenney <paulmck@us.ibm.com>
+ *
+ * For detailed explanation of Read-Copy Update mechanism see -
+ * 		Documentation/RCU/ *.txt
+ *
+ */
+
+struct srcu_struct_array {
+	int c[2];
+};
+
+struct srcu_struct {
+	int completed;
+	struct srcu_struct_array *per_cpu_ref;
+	struct mutex mutex;
+};
+
+#ifndef CONFIG_PREEMPT
+#define srcu_barrier() barrier()
+#else /* #ifndef CONFIG_PREEMPT */
+#define srcu_barrier()
+#endif /* #else #ifndef CONFIG_PREEMPT */
+
+void init_srcu_struct(struct srcu_struct *sp);
+void cleanup_srcu_struct(struct srcu_struct *sp);
+int srcu_read_lock(struct srcu_struct *sp);
+void srcu_read_unlock(struct srcu_struct *sp, int idx);
+void synchronize_srcu(struct srcu_struct *sp);
+long srcu_batches_completed(struct srcu_struct *sp);
+void cleanup_srcu_struct(struct srcu_struct *sp);
diff -urpNa -X dontdiff linux-2.6.17-torturercu_bh/kernel/Makefile linux-2.6.17-srcu/kernel/Makefile
--- linux-2.6.17-torturercu_bh/kernel/Makefile	2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-srcu/kernel/Makefile	2006-06-24 08:04:07.000000000 -0700
@@ -8,7 +8,7 @@ obj-y     = sched.o fork.o exec_domain.o
 	    signal.o sys.o kmod.o workqueue.o pid.o \
 	    rcupdate.o extable.o params.o posix-timers.o \
 	    kthread.o wait.o kfifo.o sys_ni.o posix-cpu-timers.o mutex.o \
-	    hrtimer.o
+	    hrtimer.o srcu.o
 
 obj-$(CONFIG_DEBUG_MUTEXES) += mutex-debug.o
 obj-$(CONFIG_FUTEX) += futex.o
diff -urpNa -X dontdiff linux-2.6.17-torturercu_bh/kernel/srcu.c linux-2.6.17-srcu/kernel/srcu.c
--- linux-2.6.17-torturercu_bh/kernel/srcu.c	1969-12-31 16:00:00.000000000 -0800
+++ linux-2.6.17-srcu/kernel/srcu.c	2006-07-04 18:49:13.000000000 -0700
@@ -0,0 +1,238 @@
+/*
+ * Sleepable Read-Copy Update mechanism for mutual exclusion.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) IBM Corporation, 2006
+ *
+ * Author: Paul McKenney <paulmck@us.ibm.com>
+ *
+ * For detailed explanation of Read-Copy Update mechanism see -
+ * 		Documentation/RCU/ *.txt
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/percpu.h>
+#include <linux/preempt.h>
+#include <linux/rcupdate.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include <linux/smp.h>
+#include <linux/srcu.h>
+
+/**
+ * init_srcu_struct - initialize a sleep-RCU structure
+ * @sp: structure to initialize.
+ *
+ * Must invoke this on a given srcu_struct before passing that srcu_struct
+ * to any other function.  Each srcu_struct represents a separate domain
+ * of SRCU protection.
+ */
+void init_srcu_struct(struct srcu_struct *sp)
+{
+	sp->completed = 0;
+	sp->per_cpu_ref = alloc_percpu(struct srcu_struct_array);
+	mutex_init(&sp->mutex);
+}
+
+/*
+ * srcu_readers_active_idx -- returns approximate number of readers
+ *	active on the specified rank of per-CPU counters.
+ */
+
+static int srcu_readers_active_idx(struct srcu_struct *sp, int idx)
+{
+	int cpu;
+	int sum;
+
+	sum = 0;
+	for_each_possible_cpu(cpu)
+		sum += per_cpu_ptr(sp->per_cpu_ref, cpu)->c[idx];
+	return (sum);
+}
+
+/**
+ * srcu_readers_active - returns approximate number of readers.
+ * @sp: which srcu_struct to count active readers (holding srcu_read_lock).
+ *
+ * Note that this is not an atomic primitive, and can therefore suffer
+ * severe errors when invoked on an active srcu_struct.  That said, it
+ * can be useful as an error check at cleanup time.
+ */
+int srcu_readers_active(struct srcu_struct *sp)
+{
+	return srcu_readers_active_idx(sp, 0) + srcu_readers_active_idx(sp, 1);
+}
+
+/**
+ * cleanup_srcu_struct - deconstruct a sleep-RCU structure
+ * @sp: structure to clean up.
+ *
+ * Must invoke this after you are finished using a given srcu_struct that
+ * was initialized via init_srcu_struct(), else you leak memory.
+ */
+void cleanup_srcu_struct(struct srcu_struct *sp)
+{
+	int sum;
+
+	sum = srcu_readers_active(sp);
+	WARN_ON(sum);  /* Leakage unless caller handles error. */
+	if (sum != 0)
+		return;
+	free_percpu(sp->per_cpu_ref);
+	sp->per_cpu_ref = NULL;
+}
+
+/**
+ * srcu_read_lock - register a new reader for an SRCU-protected structure.
+ * @sp: srcu_struct in which to register the new reader.
+ *
+ * Counts the new reader in the appropriate per-CPU element of the
+ * srcu_struct.  Must be called from process context.
+ * Returns an index that must be passed to the matching srcu_read_unlock().
+ */
+int srcu_read_lock(struct srcu_struct *sp)
+{
+	int idx;
+
+	preempt_disable();
+	idx = sp->completed & 0x1;
+	barrier();  /* ensure compiler looks -once- at sp->completed. */
+	per_cpu_ptr(sp->per_cpu_ref, smp_processor_id())->c[idx]++;
+	srcu_barrier();  /* ensure compiler won't misorder critical section. */
+	preempt_enable();
+	return idx;
+}
+
+/**
+ * srcu_read_unlock - unregister a old reader from an SRCU-protected structure.
+ * @sp: srcu_struct in which to unregister the old reader.
+ * @idx: return value from corresponding srcu_read_lock().
+ *
+ * Removes the count for the old reader from the appropriate per-CPU
+ * element of the srcu_struct.  Note that this may well be a different
+ * CPU than that which was incremented by the corresponding srcu_read_lock().
+ * Must be called from process context.
+ */
+void srcu_read_unlock(struct srcu_struct *sp, int idx)
+{
+	preempt_disable();
+	srcu_barrier();  /* ensure compiler won't misorder critical section. */
+	per_cpu_ptr(sp->per_cpu_ref, smp_processor_id())->c[idx]--;
+	preempt_enable();
+}
+
+/**
+ * synchronize_srcu - wait for prior SRCU read-side critical-section completion
+ * @sp: srcu_struct with which to synchronize.
+ *
+ * Flip the completed counter, and wait for the old count to drain to zero.
+ * As with classic RCU, the updater must use some separate means of
+ * synchronizing concurrent updates.  Can block; must be called from
+ * process context.
+ */
+void synchronize_srcu(struct srcu_struct *sp)
+{
+	int idx;
+	int sum;
+
+	idx = sp->completed;
+	mutex_lock(&sp->mutex);
+
+	/*
+	 * Check to see if someone else did the work for us while we were
+	 * waiting to acquire the lock.  We need -two- advances of
+	 * the counter, not just one.  If there was but one, we might have
+	 * shown up -after- our helper's first synchronize_sched(), thus
+	 * having failed to prevent CPU-reordering races with concurrent
+	 * srcu_read_unlock()s on other CPUs (see comment below).  So we
+	 * either (1) wait for two or (2) supply the second ourselves.
+	 */
+
+	if ((sp->completed - idx) >= 2) {
+		mutex_unlock(&sp->mutex);
+		return;
+	}
+
+	synchronize_sched();  /* Force memory barrier on all CPUs. */
+
+	/*
+	 * The preceding synchronize_sched() ensures that any CPU that
+	 * sees the new value of sp->completed will also see any preceding
+	 * changes to data structures made by this CPU.  This prevents
+	 * some other CPU from reordering the accesses in its SRCU
+	 * read-side critical section to precede the corresponding
+	 * srcu_read_lock() -- ensuring that such references will in
+	 * fact be protected.
+	 *
+	 * So it is now safe to do the flip.
+	 */
+
+	idx = sp->completed & 0x1;
+	sp->completed++;
+
+	synchronize_sched();  /* Force memory barrier on all CPUs. */
+
+	/*
+	 * At this point, because of the preceding synchronize_sched(),
+	 * all srcu_read_lock() calls using the old counters have completed.
+	 * Their corresponding critical sections might well be still
+	 * executing, but the srcu_read_lock() primitives themselves
+	 * will have finished executing.
+	 */
+
+	for (;;) {
+		sum = srcu_readers_active_idx(sp, idx);
+		if (sum == 0)
+			break;
+		schedule_timeout_interruptible(1);
+	}
+
+	synchronize_sched();  /* Force memory barrier on all CPUs. */
+
+	/*
+	 * The preceding synchronize_sched() forces all srcu_read_unlock()
+	 * primitives that were executing concurrently with the preceding
+	 * for_each_possible_cpu() loop to have completed by this point.
+	 * More importantly, it also forces the corresponding SRCU read-side
+	 * critical sections to have also completed, and the corresponding
+	 * references to SRCU-protected data items to be dropped.
+	 */
+
+	mutex_unlock(&sp->mutex);
+}
+
+/**
+ * srcu_batches_completed - return batches completed.
+ * @sp: srcu_struct on which to report batch completion.
+ *
+ * Report the number of batches, correlated with, but not necessarily
+ * precisely the same as, the number of grace periods that have elapsed.
+ */
+
+long srcu_batches_completed(struct srcu_struct *sp)
+{
+	return sp->completed;
+}
+
+EXPORT_SYMBOL_GPL(init_srcu_struct);
+EXPORT_SYMBOL_GPL(cleanup_srcu_struct);
+EXPORT_SYMBOL_GPL(srcu_read_lock);
+EXPORT_SYMBOL_GPL(srcu_read_unlock);
+EXPORT_SYMBOL_GPL(synchronize_srcu);
+EXPORT_SYMBOL_GPL(srcu_batches_completed);
+EXPORT_SYMBOL_GPL(srcu_readers_active);

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 2/2] srcu-3: add SRCU operations to rcutorture
  2006-07-06 17:14 [PATCH 0/2] srcu-3: add RCU variant that permits read-side blocking Paul E. McKenney
  2006-07-06 17:20 ` [PATCH 1/2] srcu-3: RCU variant permitting " Paul E. McKenney
@ 2006-07-06 17:25 ` Paul E. McKenney
  1 sibling, 0 replies; 13+ messages in thread
From: Paul E. McKenney @ 2006-07-06 17:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: akpm, matthltc, dipankar, stern, mingo, tytso, dvhltc, oleg, jes

Adds SRCU operations to rcutorture and updates rcutorture documentation.
Also increases the stress imposed by the rcutorture test.

Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com>
---

 Documentation/RCU/torture.txt |   15 +++++
 kernel/rcutorture.c           |  121 ++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 132 insertions(+), 4 deletions(-)

diff -urpNa -X dontdiff linux-2.6.17-srcu/Documentation/RCU/torture.txt linux-2.6.17-torturesrcu/Documentation/RCU/torture.txt
--- linux-2.6.17-srcu/Documentation/RCU/torture.txt	2006-06-24 12:13:12.000000000 -0700
+++ linux-2.6.17-torturesrcu/Documentation/RCU/torture.txt	2006-06-26 18:50:33.000000000 -0700
@@ -118,6 +118,21 @@ o	"Free-Block Circulation": Shows the nu
 	as it is only incremented if a torture structure's counter
 	somehow gets incremented farther than it should.
 
+Different implementations of RCU can provide implementation-specific
+additional information.  For example, SRCU provides the following:
+
+	srcu-torture: rtc: f8cf46a8 ver: 355 tfle: 0 rta: 356 rtaf: 0 rtf: 346 rtmbe: 0
+	srcu-torture: Reader Pipe:  559738 939 0 0 0 0 0 0 0 0 0
+	srcu-torture: Reader Batch:  560434 243 0 0 0 0 0 0 0 0
+	srcu-torture: Free-Block Circulation:  355 354 353 352 351 350 349 348 347 346 0
+	srcu-torture: per-CPU(idx=1): 0(0,1) 1(0,1) 2(0,0) 3(0,1)
+
+The first four lines are similar to those for RCU.  The last line shows
+the per-CPU counter state.  The numbers in parentheses are the values
+of the "old" and "current" counters for the corresponding CPU.  The
+"idx" value maps the "old" and "current" values to the underlying array,
+and is useful for debugging.
+
 
 USAGE
 
diff -urpNa -X dontdiff linux-2.6.17-srcu/kernel/rcutorture.c linux-2.6.17-torturesrcu/kernel/rcutorture.c
--- linux-2.6.17-srcu/kernel/rcutorture.c	2006-06-30 15:03:35.000000000 -0700
+++ linux-2.6.17-torturesrcu/kernel/rcutorture.c	2006-07-02 07:34:35.000000000 -0700
@@ -44,6 +44,7 @@
 #include <linux/delay.h>
 #include <linux/byteorder/swabb.h>
 #include <linux/stat.h>
+#include <linux/srcu.h>
 
 MODULE_LICENSE("GPL");
 
@@ -53,7 +54,7 @@ static int stat_interval;	/* Interval be
 static int verbose;		/* Print more debug info. */
 static int test_no_idle_hz;	/* Test RCU's support for tickless idle CPUs. */
 static int shuffle_interval = 5; /* Interval between shuffles (in sec)*/
-static char *torture_type = "rcu"; /* What to torture. */
+static char *torture_type = "rcu"; /* What to torture: rcu, srcu. */
 
 module_param(nreaders, int, 0);
 MODULE_PARM_DESC(nreaders, "Number of RCU reader threads");
@@ -66,7 +67,7 @@ MODULE_PARM_DESC(test_no_idle_hz, "Test 
 module_param(shuffle_interval, int, 0);
 MODULE_PARM_DESC(shuffle_interval, "Number of seconds between shuffles");
 module_param(torture_type, charp, 0);
-MODULE_PARM_DESC(torture_type, "Type of RCU to torture (rcu, rcu_bh)");
+MODULE_PARM_DESC(torture_type, "Type of RCU to torture (rcu, rcu_bh, srcu)");
 
 #define TORTURE_FLAG "-torture:"
 #define PRINTK_STRING(s) \
@@ -180,6 +181,7 @@ struct rcu_torture_ops {
 	void (*init)(void);
 	void (*cleanup)(void);
 	int (*readlock)(void);
+	void (*readdelay)(struct rcu_random_state *rrsp);
 	void (*readunlock)(int idx);
 	int (*completed)(void);
 	void (*deferredfree)(struct rcu_torture *p);
@@ -198,6 +200,18 @@ static int rcu_torture_read_lock(void)
 	return 0;
 }
 
+static void rcu_read_delay(struct rcu_random_state *rrsp)
+{
+	long delay;
+	const long longdelay = 200;
+
+	/* We want there to be long-running readers, but not all the time. */
+
+	delay = rcu_random(rrsp) % (nrealreaders * 2 * longdelay);
+	if (!delay)
+		udelay(longdelay);
+}
+
 static void rcu_torture_read_unlock(int idx)
 {
 	rcu_read_unlock();
@@ -239,6 +253,7 @@ static struct rcu_torture_ops rcu_ops = 
 	.init = NULL,
 	.cleanup = NULL,
 	.readlock = rcu_torture_read_lock,
+	.readdelay = rcu_read_delay,
 	.readunlock = rcu_torture_read_unlock,
 	.completed = rcu_torture_completed,
 	.deferredfree = rcu_torture_deferred_free,
@@ -275,6 +290,7 @@ static struct rcu_torture_ops rcu_bh_ops
 	.init = NULL,
 	.cleanup = NULL,
 	.readlock = rcu_bh_torture_read_lock,
+	.readdelay = rcu_read_delay,  /* just reuse rcu's version. */
 	.readunlock = rcu_bh_torture_read_unlock,
 	.completed = rcu_bh_torture_completed,
 	.deferredfree = rcu_bh_torture_deferred_free,
@@ -282,8 +298,105 @@ static struct rcu_torture_ops rcu_bh_ops
 	.name = "rcu_bh"
 };
 
+/*
+ * Definitions for srcu torture testing.
+ */
+
+static struct srcu_struct srcu_ctl;
+static struct list_head srcu_removed;
+
+static void srcu_torture_init(void)
+{
+	init_srcu_struct(&srcu_ctl);
+	INIT_LIST_HEAD(&srcu_removed);
+}
+
+static void srcu_torture_cleanup(void)
+{
+	synchronize_srcu(&srcu_ctl);
+	cleanup_srcu_struct(&srcu_ctl);
+}
+
+static int srcu_torture_read_lock(void)
+{
+	return srcu_read_lock(&srcu_ctl);
+}
+
+static void srcu_read_delay(struct rcu_random_state *rrsp)
+{
+	long delay;
+	const long uspertick = 1000000 / HZ;
+	const long longdelay = 10;
+
+	/* We want there to be long-running readers, but not all the time. */
+
+	delay = rcu_random(rrsp) % (nrealreaders * 2 * longdelay * uspertick);
+	if (!delay)
+		schedule_timeout_interruptible(longdelay);
+}
+
+static void srcu_torture_read_unlock(int idx)
+{
+	srcu_read_unlock(&srcu_ctl, idx);
+}
+
+static int srcu_torture_completed(void)
+{
+	return srcu_batches_completed(&srcu_ctl);
+}
+
+static void srcu_torture_deferred_free(struct rcu_torture *p)
+{
+	int i;
+	struct rcu_torture *rp;
+	struct rcu_torture *rp1;
+
+	synchronize_srcu(&srcu_ctl);
+	list_add(&p->rtort_free, &srcu_removed);
+	list_for_each_entry_safe(rp, rp1, &srcu_removed, rtort_free) {
+		i = rp->rtort_pipe_count;
+		if (i > RCU_TORTURE_PIPE_LEN)
+			i = RCU_TORTURE_PIPE_LEN;
+		atomic_inc(&rcu_torture_wcount[i]);
+		if (++rp->rtort_pipe_count >= RCU_TORTURE_PIPE_LEN) {
+			rp->rtort_mbtest = 0;
+			list_del(&rp->rtort_free);
+			rcu_torture_free(rp);
+		}
+	}
+}
+
+int srcu_torture_stats(char *page)
+{
+	int cnt = 0;
+	int cpu;
+	int idx = srcu_ctl.completed & 0x1;
+
+	cnt += sprintf(&page[cnt], "%s%s per-CPU(idx=%d):",
+		       torture_type, TORTURE_FLAG, idx);
+	for_each_possible_cpu(cpu) {
+		cnt += sprintf(&page[cnt], " %d(%d,%d)", cpu,
+			       per_cpu_ptr(srcu_ctl.per_cpu_ref, cpu)->c[!idx],
+			       per_cpu_ptr(srcu_ctl.per_cpu_ref, cpu)->c[idx]);
+	}
+	cnt += sprintf(&page[cnt], "\n");
+	return cnt;
+}
+
+static struct rcu_torture_ops srcu_ops = {
+	.init = srcu_torture_init,
+	.cleanup = srcu_torture_cleanup,
+	.readlock = srcu_torture_read_lock,
+	.readdelay = srcu_read_delay,
+	.readunlock = srcu_torture_read_unlock,
+	.completed = srcu_torture_completed,
+	.deferredfree = srcu_torture_deferred_free,
+	.stats = srcu_torture_stats,
+	.name = "srcu"
+};
+
 static struct rcu_torture_ops *torture_ops[] =
-	{ &rcu_ops, &rcu_bh_ops, NULL };
+	{ &rcu_ops, &rcu_bh_ops, &srcu_ops, NULL };
 
 /*
  * RCU torture writer kthread.  Repeatedly substitutes a new structure
@@ -359,7 +472,7 @@ rcu_torture_reader(void *arg)
 		}
 		if (p->rtort_mbtest == 0)
 			atomic_inc(&n_rcu_torture_mberror);
-		udelay(rcu_random(&rand) & 0x7f);
+		cur_ops->readdelay(&rand);
 		preempt_disable();
 		pipe_count = p->rtort_pipe_count;
 		if (pipe_count > RCU_TORTURE_PIPE_LEN) {

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] srcu-3: RCU variant permitting read-side blocking
       [not found] ` <1152226204.21787.2093.camel@stark>
@ 2006-07-06 23:39   ` Paul E. McKenney
       [not found]     ` <Pine.LNX.4.44L0.0607071051430.17135-100000@iolanthe.rowland.org>
  0 siblings, 1 reply; 13+ messages in thread
From: Paul E. McKenney @ 2006-07-06 23:39 UTC (permalink / raw)
  To: Matt Helsley
  Cc: Alan Stern, linux-kernel, Andrew Morton, dipankar, Ingo Molnar,
	tytso, Darren Hart, oleg, Jes Sorensen

On Thu, Jul 06, 2006 at 03:50:03PM -0700, Matt Helsley wrote:
> On Thu, 2006-07-06 at 16:28 -0400, Alan Stern wrote:
> > I've been trying to come up with a way to allow SRCU structures to be
> > initialized statically rather than dynamically.  The per-cpu data makes it
> > quite hard.  Not only do you have to use different routines to access
> > static vs. dynamic per-cpu data, there's just no good way to write a
> > static initializer.  This is because the per-cpu data requires its own
> > separate definition, and there's no way to call DEFINE_PER_CPU from within 
> > an initializer.
> > 
> > Here, in outline, is the best I've been able to come up with.  It uses a
> > function pointer member to select the appropriate sort of per-cpu data
> > access.  You would use it like this:
> > 
> > PREDEFINE_SRCU(s);
> > static DEFINE_SRCU(s);
> > ...
> > idx = srcu_read_lock(&s);
> > ... etc ...
> > 
> > Alternative possibilities involve an entire parallel implementation for
> > statically-initialized structures (which seems excessive) or using a
> > runtime test instead of a function pointer to select the dereferencing
> > mechanism.
> > 
> > Can anybody suggest anything better?
> > 
> > Alan Stern
> 
> I started to come up with something similar but did not get as far. I
> suspect the runtime test you're suggesting would look like:
> 
> #include <asm/sections.h>
> 
> ...
> if ((per_cpu_ptr >= __per_cpu_start) && (per_cpu_ptr < __per_cpu_end)) {
>     /* staticly-allocated per-cpu data */
>     ...
> } else {
>     /* dynamically-allocated per-cpu data */
>     ...
> }
> ...
> 
> I think that's easier to read and understand than following a function
> pointer.

Is this what the two of you are getting at?

#define DEFINE_SRCU_STRUCT(name) \
	DEFINE_PER_CPU(struct srcu_struct_array, name) = { 0, 0 }; \
	struct srcu_struct name = { \
		.completed = 0, \
		.per_cpu_ref = NULL, \
		.mutex = __MUTEX_INITIALIZER(name.mutex) \
	}

#define srcu_read_lock(ss) \
	({ \
		if ((ss)->per_cpu_ref != NULL) \
			srcu_read_lock_dynamic(&ss); \
		else { \
			int ret; \
			\
			preempt_disable(); \
			ret = srcu_read_lock_static(&ss, &__get_cpu_var(ss)); \
			preempt_enable(); \
			ret; \
		} \
	})

int srcu_read_lock_dynamic(struct srcu_struct *sp)
{
	int idx;

	preempt_disable();
	idx = sp->completed & 0x1;
	barrier();  /* ensure compiler looks -once- at sp->completed. */
	per_cpu_ptr(sp->per_cpu_ref, smp_processor_id())->c[idx]++;
	srcu_barrier();  /* ensure compiler won't misorder critical section. */
	preempt_enable();
	return idx;
}

int srcu_read_lock_static(struct srcu_struct *sp, srcu_struct_array *cp)
{
	int idx;

	idx = sp->completed & 0x1;
	barrier();  /* ensure compiler looks -once- at sp->completed. */
	cp->c[idx]++;
	srcu_barrier();  /* ensure compiler won't misorder critical section. */
	return idx;
}

And similarly for srcu_read_unlock()?

I sure hope that there is a better way!!!  For one thing, you cannot pass
a pointer in to srcu_read_lock(), since __get_cpu_var's name mangling would
fail in that case...

							Thanx, Paul

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] srcu-3: RCU variant permitting read-side blocking
       [not found]     ` <Pine.LNX.4.44L0.0607071051430.17135-100000@iolanthe.rowland.org>
@ 2006-07-07 16:33       ` Paul E. McKenney
       [not found]         ` <Pine.LNX.4.44L0.0607071345270.6793-100000@iolanthe.rowland.org>
  0 siblings, 1 reply; 13+ messages in thread
From: Paul E. McKenney @ 2006-07-07 16:33 UTC (permalink / raw)
  To: Alan Stern
  Cc: Matt Helsley, linux-kernel, Andrew Morton, dipankar, Ingo Molnar,
	tytso, Darren Hart, oleg, Jes Sorensen

On Fri, Jul 07, 2006 at 10:58:28AM -0400, Alan Stern wrote:
> On Thu, 6 Jul 2006, Paul E. McKenney wrote:
> 
> > Is this what the two of you are getting at?
> > 
> > #define DEFINE_SRCU_STRUCT(name) \
> > 	DEFINE_PER_CPU(struct srcu_struct_array, name) = { 0, 0 }; \
> > 	struct srcu_struct name = { \
> > 		.completed = 0, \
> > 		.per_cpu_ref = NULL, \
> > 		.mutex = __MUTEX_INITIALIZER(name.mutex) \
> > 	}
> 
> Note that this approach won't work when you need to do something like:
> 
> 	struct xyz {
> 		struct srcu_struct s;
> 	} the_xyz = {
> 		.s = /* What goes here? */
> 	};

Yep, this the same issue leading to my complaint below about not being
able to pass a pointer to the resulting srcu_struct.

> > #define srcu_read_lock(ss) \
> > 	({ \
> > 		if ((ss)->per_cpu_ref != NULL) \
> > 			srcu_read_lock_dynamic(&ss); \
> > 		else { \
> > 			int ret; \
> > 			\
> > 			preempt_disable(); \
> > 			ret = srcu_read_lock_static(&ss, &__get_cpu_var(ss)); \
> > 			preempt_enable(); \
> > 			ret; \
> > 		} \
> > 	})
> > 
> > int srcu_read_lock_dynamic(struct srcu_struct *sp)
> > {
> > 	int idx;
> > 
> > 	preempt_disable();
> > 	idx = sp->completed & 0x1;
> > 	barrier();  /* ensure compiler looks -once- at sp->completed. */
> > 	per_cpu_ptr(sp->per_cpu_ref, smp_processor_id())->c[idx]++;
> > 	srcu_barrier();  /* ensure compiler won't misorder critical section. */
> > 	preempt_enable();
> > 	return idx;
> > }
> > 
> > int srcu_read_lock_static(struct srcu_struct *sp, srcu_struct_array *cp)
> > {
> > 	int idx;
> > 
> > 	idx = sp->completed & 0x1;
> > 	barrier();  /* ensure compiler looks -once- at sp->completed. */
> > 	cp->c[idx]++;
> > 	srcu_barrier();  /* ensure compiler won't misorder critical section. */
> > 	return idx;
> > }
> > 
> > And similarly for srcu_read_unlock()?
> > 
> > I sure hope that there is a better way!!!  For one thing, you cannot pass
> > a pointer in to srcu_read_lock(), since __get_cpu_var's name mangling would
> > fail in that case...
> 
> No, that's not what we had in mind.

Another approach I looked at was statically allocating a struct
percpu_data, but initializing it seems to be problematic.

So here are the three approaches that seem to have some chance
of working:

1.	Your approach of dynamically selecting between the
	per_cpu_ptr() and per_cpu() APIs based on a flag
	within the structure.

2.	Creating a pair of SRCU APIs, reflecting the two
	underlying per-CPU APIs (one for staticly allocated
	per-CPU variables, the other for dynamically allocated
	per-CPU variables).

3.	A compile-time translation layer, making use of
	two different structure types and a bit of gcc
	type comparison.  The idea would be to create
	a srcu_struct_static and a srcu_struct_dynamic
	structure that contained a pointer to the corresponding
	per-CPU variable and an srcu_struct, and to have
	a set of macros that did a typeof comparison, selecting
	the appropriate underlying primitive from the set
	of two.

	This is essentially #2, but with some cpp/typeof
	magic to make it look to the user of SRCU that there
	is but one API.

The goal I believe we are trying to attain with SRCU include:

a.	Minimal read-side overhead.  This goal favors 2 and 3.
	(Yes, blocking is so expensive that the extra check is
	"in the noise" if we block on the read side -- but I
	expect uses where blocking can happen but is extremely
	rare.)

b.	Minimal API expansion.  This goal favors 1 and 3.

c.	Simple and straightforward use of well-understood and
	timeworn features of gcc.  This goal favors 1 and 2.

Based on this breakdown, we have a three-way tie.  I tend to pay less
much attention to (c), which would lead me to choose #2.

Thoughts?  Other important goals?  Better yet, other approaches?

						Thanx, Paul

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] srcu-3: RCU variant permitting read-side blocking
       [not found]         ` <Pine.LNX.4.44L0.0607071345270.6793-100000@iolanthe.rowland.org>
@ 2006-07-07 18:59           ` Paul E. McKenney
  2006-07-07 19:59             ` Alan Stern
  0 siblings, 1 reply; 13+ messages in thread
From: Paul E. McKenney @ 2006-07-07 18:59 UTC (permalink / raw)
  To: Alan Stern
  Cc: Matt Helsley, Andrew Morton, dipankar, Ingo Molnar, tytso,
	Darren Hart, oleg, Jes Sorensen, linux-kernel

On Fri, Jul 07, 2006 at 02:02:27PM -0400, Alan Stern wrote:
> On Fri, 7 Jul 2006, Paul E. McKenney wrote:
> 
> > > Note that this approach won't work when you need to do something like:
> > > 
> > > 	struct xyz {
> > > 		struct srcu_struct s;
> > > 	} the_xyz = {
> > > 		.s = /* What goes here? */
> > > 	};
> > 
> > Yep, this the same issue leading to my complaint below about not being
> > able to pass a pointer to the resulting srcu_struct.
> 
> No, not really.  The problem here is that you can't use DEFINE_PER_CPU 
> inside the initializer for the_xyz.  The problem about not being able to 
> pass a pointer is easily fixed; this problem is not so easy.

Both symptoms of the same problem in my view, but I agree that other
perspectives are possible and perhaps even useful.  ;-)

We agree on the important thing, which is that the approach I was
calling out in the earlier email has some severe shortcomings, and
that we therefore need to do something different.

> > Another approach I looked at was statically allocating a struct
> > percpu_data, but initializing it seems to be problematic.
> > 
> > So here are the three approaches that seem to have some chance
> > of working:
> > 
> > 1.	Your approach of dynamically selecting between the
> > 	per_cpu_ptr() and per_cpu() APIs based on a flag
> > 	within the structure.
> 
> Or a function pointer within the structure.

Agreed, either a function pointer or a flag.

> > 2.	Creating a pair of SRCU APIs, reflecting the two
> > 	underlying per-CPU APIs (one for staticly allocated
> > 	per-CPU variables, the other for dynamically allocated
> > 	per-CPU variables).
> 
> This seems ridiculous.  It would be much better IMO to come up with a 
> least-common-multiple API that would apply to both sorts of variables.
> For example, per-cpu data could be represented by _both_ a pointer and a 
> table instead of just a pointer (static) or just a table (dynamic).

No argument here.

> > 3.	A compile-time translation layer, making use of
> > 	two different structure types and a bit of gcc
> > 	type comparison.  The idea would be to create
> > 	a srcu_struct_static and a srcu_struct_dynamic
> > 	structure that contained a pointer to the corresponding
> > 	per-CPU variable and an srcu_struct, and to have
> > 	a set of macros that did a typeof comparison, selecting
> > 	the appropriate underlying primitive from the set
> > 	of two.
> > 
> > 	This is essentially #2, but with some cpp/typeof
> > 	magic to make it look to the user of SRCU that there
> > 	is but one API.
> 
> This would add tremendous complexity, in terms of how the API is
> implemented, for no very good reason.  Programming is hard enough 
> already...

Leaving out the "tremendous", yes, there would be some machinations.
It would certainly be OK by me if this can be avoided.  ;-)

> > The goal I believe we are trying to attain with SRCU include:
> > 
> > a.	Minimal read-side overhead.  This goal favors 2 and 3.
> > 	(Yes, blocking is so expensive that the extra check is
> > 	"in the noise" if we block on the read side -- but I
> > 	expect uses where blocking can happen but is extremely
> > 	rare.)
> > 
> > b.	Minimal API expansion.  This goal favors 1 and 3.
> > 
> > c.	Simple and straightforward use of well-understood and
> > 	timeworn features of gcc.  This goal favors 1 and 2.
> > 
> > Based on this breakdown, we have a three-way tie.  I tend to pay less
> > much attention to (c), which would lead me to choose #2.
> > 
> > Thoughts?  Other important goals?  Better yet, other approaches?
> 
> I think it's foolish for us to waste a tremendous amount of time on this 
> when the real problem is the poor design of the per-cpu API.  Fix that, 
> and most of the difficulties will be gone.

If the per-CPU API was reasonably unifiable, I expect that it would
already be unified.  The problem is that the easy ways to unify it hit
some extremely hot code paths with extra cache misses -- for example, one
could add a struct percpu_data to each and every static DEFINE_PERCPU(),
but at the cost of an extra cache line touched and extra indirection
-- which I believe was deemed unacceptable -- and would introduce
initialization difficulties for the static case.

So, a fourth possibility -- can a call from start_kernel() invoke some
function in yours and Matt's code invoke init_srcu_struct() to get a
statically allocated srcu_struct initialized?  Or, if this is part of
a module, can the module initialization function do this work?

(Hey, I had to ask!)

						Thanx, Paul

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] srcu-3: RCU variant permitting read-side blocking
  2006-07-07 18:59           ` Paul E. McKenney
@ 2006-07-07 19:59             ` Alan Stern
  2006-07-07 21:11               ` Matt Helsley
  0 siblings, 1 reply; 13+ messages in thread
From: Alan Stern @ 2006-07-07 19:59 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Matt Helsley, Andrew Morton, dipankar, Ingo Molnar, tytso,
	Darren Hart, oleg, Jes Sorensen, linux-kernel

On Fri, 7 Jul 2006, Paul E. McKenney wrote:

> > I think it's foolish for us to waste a tremendous amount of time on this 
> > when the real problem is the poor design of the per-cpu API.  Fix that, 
> > and most of the difficulties will be gone.
> 
> If the per-CPU API was reasonably unifiable, I expect that it would
> already be unified.  The problem is that the easy ways to unify it hit
> some extremely hot code paths with extra cache misses -- for example, one
> could add a struct percpu_data to each and every static DEFINE_PERCPU(),
> but at the cost of an extra cache line touched and extra indirection
> -- which I believe was deemed unacceptable -- and would introduce
> initialization difficulties for the static case.

Here's a sketch of a possible approach.  Generalizing it and making it 
look pretty are left as exercises for the reader.  :-)

In srcu_struct, along with

	struct srcu_struct_array *per_cpu_ref;

add
	struct percpu_data *per_cpu_table;

Dynamic initialization does:

	sp->per_cpu_ref = NULL;
	sp->per_cpu_table = alloc_percpu(...);

Static initialization does:

	sp->per_cpu_ref = PER_CPU_ADDRESS(...);		/* My macro
		from before; gives the address of the static variable */

	sp->per_cpu_table = (struct percpu_data *)
		~(unsigned long) __per_cpu_offset;

Then the unified_per_cpu_ptr(ref, table, cpu) macro would expand to
something like this:

({
	struct percpu_data *t = (struct percpu_data *)~(unsigned long)(table);
	RELOC_HIDE(ref, t->ptrs[cpu]);
})

Making this work right would of course require knowledge of the intimate
details of both include/linux/percpu.h and include/asmXXX/percpu.h.  
There's some ambiguity about what t above points to: a structure
containing an array of pointers to void, or an array of unsigned longs.  
Fortunately I think it doesn't matter.

Doing it this way would not incur any extra cache misses, except for the 
need to store an extra member in srcu_struct.

> So, a fourth possibility -- can a call from start_kernel() invoke some
> function in yours and Matt's code invoke init_srcu_struct() to get a
> statically allocated srcu_struct initialized?  Or, if this is part of
> a module, can the module initialization function do this work?
> 
> (Hey, I had to ask!)

That is certainly a viable approach: just force everyone to use dynamic 
initialization.  Changes to existing code would be relatively few.

I'm not sure where the right place would be to add these initialization 
calls.  After kmalloc is working but before the relevant notifier chains 
get used at all.  Is there such a place?  I guess it depends on which 
notifier chains we convert.

We might want to leave some chains using the existing rw-semaphore API.  
It's more appropriate when there's a high frequency of write-locking
(i.e., things registering or unregistering on the notifier chain).  The 
SRCU approach is more appropriate when the chain is called a lot and 
needs to have low overhead, but (un)registration is uncommon.  Matt's task 
notifiers are a good example.

Alan Stern

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] srcu-3: RCU variant permitting read-side blocking
  2006-07-07 19:59             ` Alan Stern
@ 2006-07-07 21:11               ` Matt Helsley
  2006-07-07 21:47                 ` Paul E. McKenney
  0 siblings, 1 reply; 13+ messages in thread
From: Matt Helsley @ 2006-07-07 21:11 UTC (permalink / raw)
  To: Alan Stern
  Cc: Paul E. McKenney, Andrew Morton, dipankar, Ingo Molnar, tytso,
	Darren Hart, oleg, Jes Sorensen, LKML

On Fri, 2006-07-07 at 15:59 -0400, Alan Stern wrote:
> On Fri, 7 Jul 2006, Paul E. McKenney wrote:

<snip>

> > So, a fourth possibility -- can a call from start_kernel() invoke some
> > function in yours and Matt's code invoke init_srcu_struct() to get a
> > statically allocated srcu_struct initialized?  Or, if this is part of
> > a module, can the module initialization function do this work?
> > 
> > (Hey, I had to ask!)
> 
> That is certainly a viable approach: just force everyone to use dynamic 
> initialization.  Changes to existing code would be relatively few.

	Works for me. I've been working on patches for Andrew's multi-chain
proposal and I could use an init function there anyway. Should be faster
too -- dynamically-allocated per-cpu memory can take advantage of
node-local memory whereas, to my knowledge, statically-allocated cannot.

> I'm not sure where the right place would be to add these initialization 
> calls.  After kmalloc is working but before the relevant notifier chains 
> get used at all.  Is there such a place?  I guess it depends on which 
> notifier chains we convert.
> 
> We might want to leave some chains using the existing rw-semaphore API.  
> It's more appropriate when there's a high frequency of write-locking
> (i.e., things registering or unregistering on the notifier chain).  The 
> SRCU approach is more appropriate when the chain is called a lot and 
> needs to have low overhead, but (un)registration is uncommon.  Matt's task 
> notifiers are a good example.

Yes, it is an excellent example.

> Alan Stern

Cheers,
	-Matt Helsley


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] srcu-3: RCU variant permitting read-side blocking
  2006-07-07 21:11               ` Matt Helsley
@ 2006-07-07 21:47                 ` Paul E. McKenney
  0 siblings, 0 replies; 13+ messages in thread
From: Paul E. McKenney @ 2006-07-07 21:47 UTC (permalink / raw)
  To: Matt Helsley
  Cc: Alan Stern, Andrew Morton, dipankar, Ingo Molnar, tytso,
	Darren Hart, oleg, Jes Sorensen, LKML

On Fri, Jul 07, 2006 at 02:11:26PM -0700, Matt Helsley wrote:
> On Fri, 2006-07-07 at 15:59 -0400, Alan Stern wrote:
> > On Fri, 7 Jul 2006, Paul E. McKenney wrote:
> 
> <snip>
> 
> > > So, a fourth possibility -- can a call from start_kernel() invoke some
> > > function in yours and Matt's code invoke init_srcu_struct() to get a
> > > statically allocated srcu_struct initialized?  Or, if this is part of
> > > a module, can the module initialization function do this work?
> > > 
> > > (Hey, I had to ask!)
> > 
> > That is certainly a viable approach: just force everyone to use dynamic 
> > initialization.  Changes to existing code would be relatively few.
> 
> 	Works for me. I've been working on patches for Andrew's multi-chain
> proposal and I could use an init function there anyway. Should be faster
> too -- dynamically-allocated per-cpu memory can take advantage of
> node-local memory whereas, to my knowledge, statically-allocated cannot.

Sounds very good to me!  ;-)

> > I'm not sure where the right place would be to add these initialization 
> > calls.  After kmalloc is working but before the relevant notifier chains 
> > get used at all.  Is there such a place?  I guess it depends on which 
> > notifier chains we convert.
> > 
> > We might want to leave some chains using the existing rw-semaphore API.  
> > It's more appropriate when there's a high frequency of write-locking
> > (i.e., things registering or unregistering on the notifier chain).  The 
> > SRCU approach is more appropriate when the chain is called a lot and 
> > needs to have low overhead, but (un)registration is uncommon.  Matt's task 
> > notifiers are a good example.
> 
> Yes, it is an excellent example.

Good!!!  Please let me know how it goes.  I will shelve the idea of
statically allocated per-CPU data for srcu_struct for the moment.
If some other application shows up that needs it, I will revisit.

						Thanx, Paul

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] srcu-3: RCU variant permitting read-side blocking
       [not found]   ` <20060709235029.GA194@oleg>
@ 2006-07-10 16:51     ` Paul E. McKenney
       [not found]       ` <44B29212.1070301@yahoo.com.au>
  0 siblings, 1 reply; 13+ messages in thread
From: Paul E. McKenney @ 2006-07-10 16:51 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: linux-kernel, akpm, matthltc, dipankar, stern, mingo, tytso,
	dvhltc, jes, dhowells

On Mon, Jul 10, 2006 at 03:50:29AM +0400, Oleg Nesterov wrote:
> On 07/06, Paul E. McKenney wrote:
> >
> > Updated patch adding a variant of RCU that permits sleeping in read-side
> > critical sections.
> 
> I do not see any problems with this patch, but I have a couple of
> questions, so your help is needed again.

Thank you for looking it over!

> > +void synchronize_srcu(struct srcu_struct *sp)
> > +{
> > +	[... snip ...]
> > +
> > +	synchronize_sched();  /* Force memory barrier on all CPUs. */
> > +
> > +	/*
> > +	 * The preceding synchronize_sched() forces all srcu_read_unlock()
> > +	 * primitives that were executing concurrently with the preceding
> > +	 * for_each_possible_cpu() loop to have completed by this point.
> > +	 * More importantly, it also forces the corresponding SRCU read-side
> > +	 * critical sections to have also completed, and the corresponding
> > +	 * references to SRCU-protected data items to be dropped.
> > +	 */
> > +
> > +	mutex_unlock(&sp->mutex);
> > +}
> 
> Isn't it possible to unlock ->mutex earlier, before the last
> synchronize_sched()?

It seems possible, but I would like to think carefully about this one
first, and, if it still seems plausible, test it heavily.  If I understand
your line of reasoning, the thought is that the first synchronize_sched()
at the beginning of synchronize_srcu() ensures that all of the counter
updates pertaining to the last instance of synchronize_srcu() have
been committed.  The same reasoning might well cover the sp->completed
fastpath as well.

In any case, this is a performance boost off the fastpath.  A good boost,
if it works, but I will be much more excited if you find a way of speeding
up srcu_read_lock() or srcu_read_unlock().  ;-)

> Another question: what is the semantics of synchronize_sched() ?
> 
> I am not talking about the current implementation, it is very clear.
> The question is: what is the _definition_ of synchronize_sched()
> (which must be valid for "any" RCU implementation) ?
> 
> 1) The comment in include/linux/rcupdate.h states that "all preempt_disable
>    code sequences will have completed before this primitive returns".
> 
> 2) kernel/srcu.c claims that this primitive "forces memory barrier on all
>    CPUs". (so the comment in rcupdate.h is not complete).
> 
>    (I understand this so that each cpu does something which implies mb()
>     semantics).
> 
> As I see it, 1) + 2) is NOT enough for synchronize_srcu() to be correct
> (the 2-nd and 3-rd synchronize_sched() calls). I think synchronize_sched()
> should also guarantee the completion of mem ops on all CPUs before return,
> not just mb() (which does not have any timing guaranties).
> 
> Could you clarify this issue?
> 
> (Again, I do not see any problems with the current RCU implementation).

However, this -does- seem to be to be a problem with the comment headers
and the documentation.  Does the following patch make things better?

David, would it be worthwhile adding this global-memory-barrier effect
of synchronize_rcu(), synchronize_sched(), and synchronize_srcu() to
Documentation/memory-barriers.txt?

							Thanx, Paul

Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com>
---

 Documentation/RCU/checklist.txt |    4 ++++
 include/linux/rcupdate.h        |    3 +++
 kernel/rcupdate.c               |    3 +++
 kernel/srcu.c                   |    3 ++-
 4 files changed, 12 insertions(+), 1 deletion(-)

diff -urpNa -X dontdiff linux-2.6.17-srcu-LKML-4/Documentation/RCU/checklist.txt linux-2.6.17-srcu-LKML-5/Documentation/RCU/checklist.txt
--- linux-2.6.17-srcu-LKML-4/Documentation/RCU/checklist.txt	2006-07-06 16:45:01.000000000 -0700
+++ linux-2.6.17-srcu-LKML-5/Documentation/RCU/checklist.txt	2006-07-10 09:43:19.000000000 -0700
@@ -221,3 +221,7 @@ over a rather long period of time, but i
 
 	Note that, rcu_assign_pointer() and rcu_dereference() relate to
 	SRCU just as they do to other forms of RCU.
+
+14.	The synchronize_rcu(), synchronize_sched(), and synchronize_srcu()
+	primitives force at least one memory barrier to be executed on
+	each active CPU before they return.
diff -urpNa -X dontdiff linux-2.6.17-srcu-LKML-4/include/linux/rcupdate.h linux-2.6.17-srcu-LKML-5/include/linux/rcupdate.h
--- linux-2.6.17-srcu-LKML-4/include/linux/rcupdate.h	2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-srcu-LKML-5/include/linux/rcupdate.h	2006-07-10 09:48:51.000000000 -0700
@@ -251,6 +251,9 @@ extern int rcu_needs_cpu(int cpu);
  * guarantees that rcu_read_lock() sections will have completed.
  * In "classic RCU", these two guarantees happen to be one and
  * the same, but can differ in realtime RCU implementations.
+ * 
+ * In addition, this primitive guarantees that every active CPU has
+ * executed at least one memory barrier before it returns.
  */
 #define synchronize_sched() synchronize_rcu()
 
diff -urpNa -X dontdiff linux-2.6.17-srcu-LKML-4/kernel/rcupdate.c linux-2.6.17-srcu-LKML-5/kernel/rcupdate.c
--- linux-2.6.17-srcu-LKML-4/kernel/rcupdate.c	2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-srcu-LKML-5/kernel/rcupdate.c	2006-07-10 09:48:32.000000000 -0700
@@ -597,6 +597,9 @@ static void wakeme_after_rcu(struct rcu_
  * sections are delimited by rcu_read_lock() and rcu_read_unlock(),
  * and may be nested.
  *
+ * This primitive also causes each active CPU to execute at least one
+ * memory barrier before it returns.
+ *
  * If your read-side code is not protected by rcu_read_lock(), do -not-
  * use synchronize_rcu().
  */
diff -urpNa -X dontdiff linux-2.6.17-srcu-LKML-4/kernel/srcu.c linux-2.6.17-srcu-LKML-5/kernel/srcu.c
--- linux-2.6.17-srcu-LKML-4/kernel/srcu.c	2006-07-06 16:50:23.000000000 -0700
+++ linux-2.6.17-srcu-LKML-5/kernel/srcu.c	2006-07-10 09:48:09.000000000 -0700
@@ -143,7 +143,8 @@ void srcu_read_unlock(struct srcu_struct
  * Flip the completed counter, and wait for the old count to drain to zero.
  * As with classic RCU, the updater must use some separate means of
  * synchronizing concurrent updates.  Can block; must be called from
- * process context.
+ * process context.  Has the side-effect of forcing a memory barrier on
+ * each active CPU before returning.
  *
  * Note that it is illegal to call synchornize_srcu() from the corresponding
  * SRCU read-side critical section; doing so will result in deadlock.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] srcu-3: RCU variant permitting read-side blocking
       [not found]       ` <44B29212.1070301@yahoo.com.au>
@ 2006-07-11 14:19         ` Paul E. McKenney
  0 siblings, 0 replies; 13+ messages in thread
From: Paul E. McKenney @ 2006-07-11 14:19 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Oleg Nesterov, linux-kernel, akpm, matthltc, dipankar, stern,
	mingo, tytso, dvhltc, jes, dhowells

On Tue, Jul 11, 2006 at 03:44:50AM +1000, Nick Piggin wrote:
> Paul E. McKenney wrote:
> >On Mon, Jul 10, 2006 at 03:50:29AM +0400, Oleg Nesterov wrote:
> 
> >>As I see it, 1) + 2) is NOT enough for synchronize_srcu() to be correct
> >>(the 2-nd and 3-rd synchronize_sched() calls). I think synchronize_sched()
> >>should also guarantee the completion of mem ops on all CPUs before return,
> >>not just mb() (which does not have any timing guaranties).
> >>
> >>Could you clarify this issue?
> >>
> >>(Again, I do not see any problems with the current RCU implementation).
> >
> >
> >However, this -does- seem to be to be a problem with the comment headers
> >and the documentation.  Does the following patch make things better?
> >
> >David, would it be worthwhile adding this global-memory-barrier effect
> >of synchronize_rcu(), synchronize_sched(), and synchronize_srcu() to
> 
> And call_rcu? (or is that already tucked away in the documentation
> somewhere?) ie. there is a memory barrier between the call_rcu() call
> and the actual callback.
> 
> This is something I needed clarification with (as you might remember),
> which might not be clear from an RCU API user's point of view.

Good point -- since synchronize_rcu() is just a wrapper around call_rcu(),
they do have the same properties.  How about the following?

						Thanx, Paul

Signed-off-by:  Paul E. McKenney <paulmck@us.ibm.com>

 Documentation/RCU/checklist.txt |    4 +++-
 include/linux/srcu.h            |    5 +++++
 kernel/rcupdate.c               |    4 ++++
 3 files changed, 12 insertions(+), 1 deletion(-)

diff -urpNa -X dontdiff linux-2.6.17-srcu-LKML-5/Documentation/RCU/checklist.txt linux-2.6.17-srcu-LKML-6/Documentation/RCU/checklist.txt
--- linux-2.6.17-srcu-LKML-5/Documentation/RCU/checklist.txt	2006-07-10 09:43:19.000000000 -0700
+++ linux-2.6.17-srcu-LKML-6/Documentation/RCU/checklist.txt	2006-07-11 07:12:25.000000000 -0700
@@ -224,4 +224,6 @@ over a rather long period of time, but i
 
 14.	The synchronize_rcu(), synchronize_sched(), and synchronize_srcu()
 	primitives force at least one memory barrier to be executed on
-	each active CPU before they return.
+	each active CPU before they return.  Similarly, call_rcu()
+	forces at least one memory barrier to be executed on each active
+	CPU before the corresponding callback is invoked.
diff -urpNa -X dontdiff linux-2.6.17-srcu-LKML-5/kernel/rcupdate.c linux-2.6.17-srcu-LKML-6/kernel/rcupdate.c
--- linux-2.6.17-srcu-LKML-5/kernel/rcupdate.c	2006-07-10 09:48:32.000000000 -0700
+++ linux-2.6.17-srcu-LKML-6/kernel/rcupdate.c	2006-07-11 07:11:07.000000000 -0700
@@ -116,6 +116,10 @@ static inline void force_quiescent_state
  * read-side critical sections have completed.  RCU read-side critical
  * sections are delimited by rcu_read_lock() and rcu_read_unlock(),
  * and may be nested.
+ *
+ * There will be at least one memory barrier executed on each active
+ * CPU between the time call_rcu() is invoked and the time that the
+ * corresponding callback is invoked.
  */
 void fastcall call_rcu(struct rcu_head *head,
 				void (*func)(struct rcu_head *rcu))

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] srcu-3: RCU variant permitting read-side blocking
       [not found] <20060711172530.GA93@oleg>
@ 2006-07-11 14:56 ` Alan Stern
  2006-07-11 18:21   ` Paul E. McKenney
  0 siblings, 1 reply; 13+ messages in thread
From: Alan Stern @ 2006-07-11 14:56 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Paul E. McKenney, Andrew Morton, matthltc, dipankar, mingo, tytso,
	dvhltc, jes, Kernel development list

On Tue, 11 Jul 2006, Oleg Nesterov wrote:

> Let's look at the 3-rd synchronize_sched() call.
> 
> Suppose that the reader writes to the rcu-protected memory and then
> does srcu_read_unlock(). It is possible that the writer sees the result
> of srcu_read_unlock() (so that srcu_readers_active_idx() returns 0)
> but does not see other writes yet. The writer does synchronize_sched(),
> the reader (according to 2) above) does mb(). But this doesn't guarantee
> that all preceding mem ops were finished, so it is possible that
> synchronize_srcu() returns and probably frees that memory too early.
> 
> If it were possible to "insert" mb() into srcu_read_unlock() before ->c[]--,
> we are ok, but synchronize_sched() is a "shot in the dark".
> 
> In a similar manner, the 2-nd synchronize_sched() requires (I think) that
> all changes which were done by the current CPU should be visible to other
> CPUs before return.
> 
> Does it make sense?

Yes, it's a valid concern.

Paul will correct me if this is wrong...  As I understand it, the code 
which responds to a synchronize_sched() call (that is, the code which runs 
on other CPUs) does lots of stuff, involving context changes and 
who-knows-what else.  There are lots of memory barriers in there, buried 
in the middle, along with plenty of other memory reads and writes.

In outline, the reader's CPU responds to synchronize_sched() something 
like this:

	At the next safe time (context switch):
		mb();
		Tell the writer's CPU that the preempt count on
			the reader's CPU is 0

Since that "tell the writer's CPU" step won't complete until after the 
mb() has forced all preceding memory operations to complete, we are safe.

So perhaps the documentation for synchronize_sched() and synchronize_rcu()  
should say that when the call returns, all CPUs (including the caller's)
will have executed a memory barrier and so will have completed all memory
operations that started before the synchronize_xxx() call was issued.

Alan Stern

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] srcu-3: RCU variant permitting read-side blocking
  2006-07-11 14:56 ` Alan Stern
@ 2006-07-11 18:21   ` Paul E. McKenney
  0 siblings, 0 replies; 13+ messages in thread
From: Paul E. McKenney @ 2006-07-11 18:21 UTC (permalink / raw)
  To: Alan Stern
  Cc: Oleg Nesterov, Andrew Morton, matthltc, dipankar, mingo, tytso,
	dvhltc, jes, Kernel development list

On Tue, Jul 11, 2006 at 10:56:05AM -0400, Alan Stern wrote:
> On Tue, 11 Jul 2006, Oleg Nesterov wrote:
> 
> > Let's look at the 3-rd synchronize_sched() call.
> > 
> > Suppose that the reader writes to the rcu-protected memory and then
> > does srcu_read_unlock(). It is possible that the writer sees the result
> > of srcu_read_unlock() (so that srcu_readers_active_idx() returns 0)
> > but does not see other writes yet. The writer does synchronize_sched(),
> > the reader (according to 2) above) does mb(). But this doesn't guarantee
> > that all preceding mem ops were finished, so it is possible that
> > synchronize_srcu() returns and probably frees that memory too early.
> > 
> > If it were possible to "insert" mb() into srcu_read_unlock() before ->c[]--,
> > we are ok, but synchronize_sched() is a "shot in the dark".
> > 
> > In a similar manner, the 2-nd synchronize_sched() requires (I think) that
> > all changes which were done by the current CPU should be visible to other
> > CPUs before return.
> > 
> > Does it make sense?
> 
> Yes, it's a valid concern.
> 
> Paul will correct me if this is wrong...  As I understand it, the code 
> which responds to a synchronize_sched() call (that is, the code which runs 
> on other CPUs) does lots of stuff, involving context changes and 
> who-knows-what else.  There are lots of memory barriers in there, buried 
> in the middle, along with plenty of other memory reads and writes.
> 
> In outline, the reader's CPU responds to synchronize_sched() something 
> like this:
> 
> 	At the next safe time (context switch):
> 		mb();
> 		Tell the writer's CPU that the preempt count on
> 			the reader's CPU is 0
> 
> Since that "tell the writer's CPU" step won't complete until after the 
> mb() has forced all preceding memory operations to complete, we are safe.
> 
> So perhaps the documentation for synchronize_sched() and synchronize_rcu()  
> should say that when the call returns, all CPUs (including the caller's)
> will have executed a memory barrier and so will have completed all memory
> operations that started before the synchronize_xxx() call was issued.

I would instead think of this in the context of memory-barrier pairings
as documented in Documentation/memory-barriers.txt.

As Oleg notes above, synchronize_sched() is a "shot in the dark", since
there is nothing that lines up the synchronize_srcu() with any concurrent
srcu_read_lock() or srcu_read_unlock().  The trick here is that the
synchronize_sched() eliminates one of the possibilities (taking this
from the viewpoint of the last synchronize_sched() in synchronize_srcu()
on CPU 0 interacting with a concurrent srcu_read_unlock() on CPU 1):

1.	srcu_read_unlock()'s counter decrement is perceived by CPU 1
	as happening after all of the read-side critical-section
	accesses.  This is OK, because this means that the critical
	section completes before synchronize_srcu()'s checks, and thus
	before any destructive operations following the synchronize_srcu().

2.	srcu_read_unlock()'s counter decrement is perceived by CPU 1
	as happening -before- some of the read-side critical-section
	accesses.  This could be bad, but...  The synchronize_srcu()
	forces CPU 0 to execute a memory barrier on each active CPU,
	including CPU 1.  CPU 0 must subsequently execute a memory
	barrier to become aware of the grace period ending, so that
	it sees -all- CPU 1's earlier memory accesses (required, because
	CPU 0 had to have seen, perhaps indirectly, CPU 1's indication
	that it had gone through a quiescent state).

	Therefore, by the time CPU 0 is done with the synchronize_sched(),
	the memory accesses that CPU 1 executed as part of the SRCU
	read-side critical section must all be visible.

Any sequence of events that has the srcu_read_unlock() happening after
the synchronize_sched() starts is prohibited, since the synchronize_sched()
won't start until the counters all hit zero.

In case people were wondering, the ornateness of the above reasoning is
one reason I have been resisting any attempt to further complicate
the SRCU stuff.  ;-)

							Thanx, Paul

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2006-07-11 18:20 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-07-06 17:14 [PATCH 0/2] srcu-3: add RCU variant that permits read-side blocking Paul E. McKenney
2006-07-06 17:20 ` [PATCH 1/2] srcu-3: RCU variant permitting " Paul E. McKenney
     [not found]   ` <20060709235029.GA194@oleg>
2006-07-10 16:51     ` Paul E. McKenney
     [not found]       ` <44B29212.1070301@yahoo.com.au>
2006-07-11 14:19         ` Paul E. McKenney
2006-07-06 17:25 ` [PATCH 2/2] srcu-3: add SRCU operations to rcutorture Paul E. McKenney
     [not found] <Pine.LNX.4.44L0.0607061603320.5768-100000@iolanthe.rowland.org>
     [not found] ` <1152226204.21787.2093.camel@stark>
2006-07-06 23:39   ` [PATCH 1/2] srcu-3: RCU variant permitting read-side blocking Paul E. McKenney
     [not found]     ` <Pine.LNX.4.44L0.0607071051430.17135-100000@iolanthe.rowland.org>
2006-07-07 16:33       ` Paul E. McKenney
     [not found]         ` <Pine.LNX.4.44L0.0607071345270.6793-100000@iolanthe.rowland.org>
2006-07-07 18:59           ` Paul E. McKenney
2006-07-07 19:59             ` Alan Stern
2006-07-07 21:11               ` Matt Helsley
2006-07-07 21:47                 ` Paul E. McKenney
     [not found] <20060711172530.GA93@oleg>
2006-07-11 14:56 ` Alan Stern
2006-07-11 18:21   ` Paul E. McKenney

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox