* [PATCH 0/2] srcu-3: add RCU variant that permits read-side blocking
@ 2006-07-06 17:14 Paul E. McKenney
2006-07-06 17:20 ` [PATCH 1/2] srcu-3: RCU variant permitting " Paul E. McKenney
2006-07-06 17:25 ` [PATCH 2/2] srcu-3: add SRCU operations to rcutorture Paul E. McKenney
0 siblings, 2 replies; 5+ messages in thread
From: Paul E. McKenney @ 2006-07-06 17:14 UTC (permalink / raw)
To: linux-kernel
Cc: akpm, matthltc, dipankar, stern, mingo, tytso, dvhltc, oleg, jes
Version 3 of the SRCU patchset.
This patch incorporates a number of improvements, many of which came
up in off-list discussions with Alan Stern. Neither of us are sure
why these discussions ended up off-list, so I have summarized them
below.
o Fixes some "zombie code" -- excess curly braces and the like.
o Gets rid of the double-flip in favor of an additional
synchronize_sched(). This turned out to be safe, despite
my saying otherwise at http://lkml.org/lkml/2006/6/27/486.
The trick that I was missing is that synchronize_sched()
forces all CPUs to execute at least one memory barrier during
the synchronize_sched()'s execution, which forces all CPUs
to see synchronize_srcu()'s counter increment as happening
after any memory manipulations prior to the synchronize_srcu().
Upgraded comments to indicate what the synchronize_sched()
calls are needed for.
o Added a barrier() to both srcu_read_lock() and srcu_read_unlock()
to prevent the compiler from performing optimizations that
would cause the critical section to move outside of the
enclosing srcu_read_lock() and srcu_read_unlock().
However, these barrier()s in srcu_read_lock() and srcu_read_unlock()
are needed only in non-CONFIG_PREEMPT kernels, so they compile
to nothing in CONFIG_PREEMPT kernels (where the preempt_disable()
and preempt_enable() calls supply the needed barrier() call).
o Added a check to synchronize_srcu() that permits this primitive
to take advantage of grace periods induced by concurrent executions
in other tasks. This can be useful in cases where you are
using a single srcu_struct to handle all the individually-locked
chains of a hash table, for example.
o cleanup_srcu_struct() now contains error checks to catch cases
where readers are still using the srcu_struct in question.
It does a WARN_ON() and leaks the srcu_struct's per-CPU data
in that case.
o There is an srcu_readers_active() that returns the number of
readers (approximate!) currently using the specified srcu_struct.
This can be useful when terminating use of an srcu_struct, e.g.,
at module-unload time.
o Improved the RCU torture tests, increasing the skew on reader
times and providing implementation-specific delay functions.
^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH 1/2] srcu-3: RCU variant permitting read-side blocking
2006-07-06 17:14 [PATCH 0/2] srcu-3: add RCU variant that permits read-side blocking Paul E. McKenney
@ 2006-07-06 17:20 ` Paul E. McKenney
[not found] ` <20060709235029.GA194@oleg>
2006-07-06 17:25 ` [PATCH 2/2] srcu-3: add SRCU operations to rcutorture Paul E. McKenney
1 sibling, 1 reply; 5+ messages in thread
From: Paul E. McKenney @ 2006-07-06 17:20 UTC (permalink / raw)
To: linux-kernel
Cc: akpm, matthltc, dipankar, stern, mingo, tytso, dvhltc, oleg, jes
Updated patch adding a variant of RCU that permits sleeping in read-side
critical sections. SRCU is as follows:
o Each use of SRCU creates its own srcu_struct, and each
srcu_struct has its own set of grace periods. This is
critical, as it prevents one subsystem with a blocking
reader from holding up SRCU grace periods for other
subsystems.
o The SRCU primitives (srcu_read_lock(), srcu_read_unlock(),
and synchronize_srcu()) all take a pointer to a srcu_struct.
o The SRCU primitives must be called from process context.
o srcu_read_lock() returns an int that must be passed to
the matching srcu_read_unlock(). Realtime RCU avoids the
need for this by storing the state in the task struct,
but SRCU needs to allow a given code path to pass through
multiple SRCU domains -- storing state in the task struct
would therefore require either arbitrary space in the
task struct or arbitrary limits on SRCU nesting. So I
kicked the state-storage problem up to the caller.
Of course, it is not permitted to call synchronize_srcu()
while in an SRCU read-side critical section.
o There is no call_srcu(). It would not be hard to implement
one, but it seems like too easy a way to OOM the system.
(Hey, we have enough trouble with call_rcu(), which does
-not- permit readers to sleep!!!) So, if you want it,
please tell me why...
Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com>
---
Documentation/RCU/checklist.txt | 38 ++++++
Documentation/RCU/rcu.txt | 3
Documentation/RCU/whatisRCU.txt | 3
include/linux/srcu.h | 49 ++++++++
kernel/Makefile | 2
kernel/srcu.c | 238 ++++++++++++++++++++++++++++++++++++++++
6 files changed, 331 insertions(+), 2 deletions(-)
diff -urpNa -X dontdiff linux-2.6.17-torturercu_bh/Documentation/RCU/checklist.txt linux-2.6.17-srcu/Documentation/RCU/checklist.txt
--- linux-2.6.17-torturercu_bh/Documentation/RCU/checklist.txt 2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-srcu/Documentation/RCU/checklist.txt 2006-06-30 17:07:02.000000000 -0700
@@ -183,3 +183,41 @@ over a rather long period of time, but i
disable irq on a given acquisition of that lock will result in
deadlock as soon as the RCU callback happens to interrupt that
acquisition's critical section.
+
+13. SRCU (srcu_read_lock(), srcu_read_unlock(), and synchronize_srcu())
+ may only be invoked from process context. Unlike other forms of
+ RCU, it -is- permissible to block in an SRCU read-side critical
+ section (demarked by srcu_read_lock() and srcu_read_unlock()),
+ hence the "SRCU": "sleepable RCU". Please note that if you
+ don't need to sleep in read-side critical sections, you should
+ be using RCU rather than SRCU, because RCU is almost always
+ faster and easier to use than is SRCU.
+
+ Also unlike other forms of RCU, explicit initialization
+ and cleanup is required via init_srcu_struct() and
+ cleanup_srcu_struct(). These are passed a "struct srcu_struct"
+ that defines the scope of a given SRCU domain. Once initialized,
+ the srcu_struct is passed to srcu_read_lock(), srcu_read_unlock()
+ and synchronize_srcu(). A given synchronize_srcu() waits only
+ for SRCU read-side critical sections governed by srcu_read_lock()
+ and srcu_read_unlock() calls that have been passd the same
+ srcu_struct. This property is what makes sleeping read-side
+ critical sections tolerable -- a given subsystem delays only
+ its own updates, not those of other subsystems using SRCU.
+ Therefore, SRCU is less prone to OOM the system than RCU would
+ be if RCU's read-side critical sections were permitted to
+ sleep.
+
+ The ability to sleep in read-side critical sections does not
+ come for free. First, corresponding srcu_read_lock() and
+ srcu_read_unlock() calls must be passed the same srcu_struct.
+ Second, grace-period-detection overhead is amortized only
+ over those updates sharing a given srcu_struct, rather than
+ being globally amortized as they are for other forms of RCU.
+ Therefore, SRCU should be used in preference to rw_semaphore
+ only in extremely read-intensive situations, or in situations
+ requiring SRCU's read-side deadlock immunity or low read-side
+ realtime latency.
+
+ Note that, rcu_assign_pointer() and rcu_dereference() relate to
+ SRCU just as they do to other forms of RCU.
diff -urpNa -X dontdiff linux-2.6.17-torturercu_bh/Documentation/RCU/rcu.txt linux-2.6.17-srcu/Documentation/RCU/rcu.txt
--- linux-2.6.17-torturercu_bh/Documentation/RCU/rcu.txt 2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-srcu/Documentation/RCU/rcu.txt 2006-06-24 08:04:07.000000000 -0700
@@ -45,7 +45,8 @@ o How can I see where RCU is currently u
Search for "rcu_read_lock", "rcu_read_unlock", "call_rcu",
"rcu_read_lock_bh", "rcu_read_unlock_bh", "call_rcu_bh",
- "synchronize_rcu", and "synchronize_net".
+ "srcu_read_lock", "srcu_read_unlock", "synchronize_rcu",
+ "synchronize_net", and "synchronize_srcu".
o What guidelines should I follow when writing code that uses RCU?
diff -urpNa -X dontdiff linux-2.6.17-torturercu_bh/Documentation/RCU/whatisRCU.txt linux-2.6.17-srcu/Documentation/RCU/whatisRCU.txt
--- linux-2.6.17-torturercu_bh/Documentation/RCU/whatisRCU.txt 2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-srcu/Documentation/RCU/whatisRCU.txt 2006-06-24 08:04:07.000000000 -0700
@@ -767,6 +767,8 @@ Markers for RCU read-side critical secti
rcu_read_unlock
rcu_read_lock_bh
rcu_read_unlock_bh
+ srcu_read_lock
+ srcu_read_unlock
RCU pointer/list traversal:
@@ -794,6 +796,7 @@ RCU grace period:
synchronize_net
synchronize_sched
synchronize_rcu
+ synchronize_srcu
call_rcu
call_rcu_bh
diff -urpNa -X dontdiff linux-2.6.17-torturercu_bh/include/linux/srcu.h linux-2.6.17-srcu/include/linux/srcu.h
--- linux-2.6.17-torturercu_bh/include/linux/srcu.h 1969-12-31 16:00:00.000000000 -0800
+++ linux-2.6.17-srcu/include/linux/srcu.h 2006-07-02 07:27:32.000000000 -0700
@@ -0,0 +1,49 @@
+/*
+ * Sleepable Read-Copy Update mechanism for mutual exclusion
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) IBM Corporation, 2006
+ *
+ * Author: Paul McKenney <paulmck@us.ibm.com>
+ *
+ * For detailed explanation of Read-Copy Update mechanism see -
+ * Documentation/RCU/ *.txt
+ *
+ */
+
+struct srcu_struct_array {
+ int c[2];
+};
+
+struct srcu_struct {
+ int completed;
+ struct srcu_struct_array *per_cpu_ref;
+ struct mutex mutex;
+};
+
+#ifndef CONFIG_PREEMPT
+#define srcu_barrier() barrier()
+#else /* #ifndef CONFIG_PREEMPT */
+#define srcu_barrier()
+#endif /* #else #ifndef CONFIG_PREEMPT */
+
+void init_srcu_struct(struct srcu_struct *sp);
+void cleanup_srcu_struct(struct srcu_struct *sp);
+int srcu_read_lock(struct srcu_struct *sp);
+void srcu_read_unlock(struct srcu_struct *sp, int idx);
+void synchronize_srcu(struct srcu_struct *sp);
+long srcu_batches_completed(struct srcu_struct *sp);
+void cleanup_srcu_struct(struct srcu_struct *sp);
diff -urpNa -X dontdiff linux-2.6.17-torturercu_bh/kernel/Makefile linux-2.6.17-srcu/kernel/Makefile
--- linux-2.6.17-torturercu_bh/kernel/Makefile 2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-srcu/kernel/Makefile 2006-06-24 08:04:07.000000000 -0700
@@ -8,7 +8,7 @@ obj-y = sched.o fork.o exec_domain.o
signal.o sys.o kmod.o workqueue.o pid.o \
rcupdate.o extable.o params.o posix-timers.o \
kthread.o wait.o kfifo.o sys_ni.o posix-cpu-timers.o mutex.o \
- hrtimer.o
+ hrtimer.o srcu.o
obj-$(CONFIG_DEBUG_MUTEXES) += mutex-debug.o
obj-$(CONFIG_FUTEX) += futex.o
diff -urpNa -X dontdiff linux-2.6.17-torturercu_bh/kernel/srcu.c linux-2.6.17-srcu/kernel/srcu.c
--- linux-2.6.17-torturercu_bh/kernel/srcu.c 1969-12-31 16:00:00.000000000 -0800
+++ linux-2.6.17-srcu/kernel/srcu.c 2006-07-04 18:49:13.000000000 -0700
@@ -0,0 +1,238 @@
+/*
+ * Sleepable Read-Copy Update mechanism for mutual exclusion.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) IBM Corporation, 2006
+ *
+ * Author: Paul McKenney <paulmck@us.ibm.com>
+ *
+ * For detailed explanation of Read-Copy Update mechanism see -
+ * Documentation/RCU/ *.txt
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/percpu.h>
+#include <linux/preempt.h>
+#include <linux/rcupdate.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include <linux/smp.h>
+#include <linux/srcu.h>
+
+/**
+ * init_srcu_struct - initialize a sleep-RCU structure
+ * @sp: structure to initialize.
+ *
+ * Must invoke this on a given srcu_struct before passing that srcu_struct
+ * to any other function. Each srcu_struct represents a separate domain
+ * of SRCU protection.
+ */
+void init_srcu_struct(struct srcu_struct *sp)
+{
+ sp->completed = 0;
+ sp->per_cpu_ref = alloc_percpu(struct srcu_struct_array);
+ mutex_init(&sp->mutex);
+}
+
+/*
+ * srcu_readers_active_idx -- returns approximate number of readers
+ * active on the specified rank of per-CPU counters.
+ */
+
+static int srcu_readers_active_idx(struct srcu_struct *sp, int idx)
+{
+ int cpu;
+ int sum;
+
+ sum = 0;
+ for_each_possible_cpu(cpu)
+ sum += per_cpu_ptr(sp->per_cpu_ref, cpu)->c[idx];
+ return (sum);
+}
+
+/**
+ * srcu_readers_active - returns approximate number of readers.
+ * @sp: which srcu_struct to count active readers (holding srcu_read_lock).
+ *
+ * Note that this is not an atomic primitive, and can therefore suffer
+ * severe errors when invoked on an active srcu_struct. That said, it
+ * can be useful as an error check at cleanup time.
+ */
+int srcu_readers_active(struct srcu_struct *sp)
+{
+ return srcu_readers_active_idx(sp, 0) + srcu_readers_active_idx(sp, 1);
+}
+
+/**
+ * cleanup_srcu_struct - deconstruct a sleep-RCU structure
+ * @sp: structure to clean up.
+ *
+ * Must invoke this after you are finished using a given srcu_struct that
+ * was initialized via init_srcu_struct(), else you leak memory.
+ */
+void cleanup_srcu_struct(struct srcu_struct *sp)
+{
+ int sum;
+
+ sum = srcu_readers_active(sp);
+ WARN_ON(sum); /* Leakage unless caller handles error. */
+ if (sum != 0)
+ return;
+ free_percpu(sp->per_cpu_ref);
+ sp->per_cpu_ref = NULL;
+}
+
+/**
+ * srcu_read_lock - register a new reader for an SRCU-protected structure.
+ * @sp: srcu_struct in which to register the new reader.
+ *
+ * Counts the new reader in the appropriate per-CPU element of the
+ * srcu_struct. Must be called from process context.
+ * Returns an index that must be passed to the matching srcu_read_unlock().
+ */
+int srcu_read_lock(struct srcu_struct *sp)
+{
+ int idx;
+
+ preempt_disable();
+ idx = sp->completed & 0x1;
+ barrier(); /* ensure compiler looks -once- at sp->completed. */
+ per_cpu_ptr(sp->per_cpu_ref, smp_processor_id())->c[idx]++;
+ srcu_barrier(); /* ensure compiler won't misorder critical section. */
+ preempt_enable();
+ return idx;
+}
+
+/**
+ * srcu_read_unlock - unregister a old reader from an SRCU-protected structure.
+ * @sp: srcu_struct in which to unregister the old reader.
+ * @idx: return value from corresponding srcu_read_lock().
+ *
+ * Removes the count for the old reader from the appropriate per-CPU
+ * element of the srcu_struct. Note that this may well be a different
+ * CPU than that which was incremented by the corresponding srcu_read_lock().
+ * Must be called from process context.
+ */
+void srcu_read_unlock(struct srcu_struct *sp, int idx)
+{
+ preempt_disable();
+ srcu_barrier(); /* ensure compiler won't misorder critical section. */
+ per_cpu_ptr(sp->per_cpu_ref, smp_processor_id())->c[idx]--;
+ preempt_enable();
+}
+
+/**
+ * synchronize_srcu - wait for prior SRCU read-side critical-section completion
+ * @sp: srcu_struct with which to synchronize.
+ *
+ * Flip the completed counter, and wait for the old count to drain to zero.
+ * As with classic RCU, the updater must use some separate means of
+ * synchronizing concurrent updates. Can block; must be called from
+ * process context.
+ */
+void synchronize_srcu(struct srcu_struct *sp)
+{
+ int idx;
+ int sum;
+
+ idx = sp->completed;
+ mutex_lock(&sp->mutex);
+
+ /*
+ * Check to see if someone else did the work for us while we were
+ * waiting to acquire the lock. We need -two- advances of
+ * the counter, not just one. If there was but one, we might have
+ * shown up -after- our helper's first synchronize_sched(), thus
+ * having failed to prevent CPU-reordering races with concurrent
+ * srcu_read_unlock()s on other CPUs (see comment below). So we
+ * either (1) wait for two or (2) supply the second ourselves.
+ */
+
+ if ((sp->completed - idx) >= 2) {
+ mutex_unlock(&sp->mutex);
+ return;
+ }
+
+ synchronize_sched(); /* Force memory barrier on all CPUs. */
+
+ /*
+ * The preceding synchronize_sched() ensures that any CPU that
+ * sees the new value of sp->completed will also see any preceding
+ * changes to data structures made by this CPU. This prevents
+ * some other CPU from reordering the accesses in its SRCU
+ * read-side critical section to precede the corresponding
+ * srcu_read_lock() -- ensuring that such references will in
+ * fact be protected.
+ *
+ * So it is now safe to do the flip.
+ */
+
+ idx = sp->completed & 0x1;
+ sp->completed++;
+
+ synchronize_sched(); /* Force memory barrier on all CPUs. */
+
+ /*
+ * At this point, because of the preceding synchronize_sched(),
+ * all srcu_read_lock() calls using the old counters have completed.
+ * Their corresponding critical sections might well be still
+ * executing, but the srcu_read_lock() primitives themselves
+ * will have finished executing.
+ */
+
+ for (;;) {
+ sum = srcu_readers_active_idx(sp, idx);
+ if (sum == 0)
+ break;
+ schedule_timeout_interruptible(1);
+ }
+
+ synchronize_sched(); /* Force memory barrier on all CPUs. */
+
+ /*
+ * The preceding synchronize_sched() forces all srcu_read_unlock()
+ * primitives that were executing concurrently with the preceding
+ * for_each_possible_cpu() loop to have completed by this point.
+ * More importantly, it also forces the corresponding SRCU read-side
+ * critical sections to have also completed, and the corresponding
+ * references to SRCU-protected data items to be dropped.
+ */
+
+ mutex_unlock(&sp->mutex);
+}
+
+/**
+ * srcu_batches_completed - return batches completed.
+ * @sp: srcu_struct on which to report batch completion.
+ *
+ * Report the number of batches, correlated with, but not necessarily
+ * precisely the same as, the number of grace periods that have elapsed.
+ */
+
+long srcu_batches_completed(struct srcu_struct *sp)
+{
+ return sp->completed;
+}
+
+EXPORT_SYMBOL_GPL(init_srcu_struct);
+EXPORT_SYMBOL_GPL(cleanup_srcu_struct);
+EXPORT_SYMBOL_GPL(srcu_read_lock);
+EXPORT_SYMBOL_GPL(srcu_read_unlock);
+EXPORT_SYMBOL_GPL(synchronize_srcu);
+EXPORT_SYMBOL_GPL(srcu_batches_completed);
+EXPORT_SYMBOL_GPL(srcu_readers_active);
^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH 2/2] srcu-3: add SRCU operations to rcutorture
2006-07-06 17:14 [PATCH 0/2] srcu-3: add RCU variant that permits read-side blocking Paul E. McKenney
2006-07-06 17:20 ` [PATCH 1/2] srcu-3: RCU variant permitting " Paul E. McKenney
@ 2006-07-06 17:25 ` Paul E. McKenney
1 sibling, 0 replies; 5+ messages in thread
From: Paul E. McKenney @ 2006-07-06 17:25 UTC (permalink / raw)
To: linux-kernel
Cc: akpm, matthltc, dipankar, stern, mingo, tytso, dvhltc, oleg, jes
Adds SRCU operations to rcutorture and updates rcutorture documentation.
Also increases the stress imposed by the rcutorture test.
Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com>
---
Documentation/RCU/torture.txt | 15 +++++
kernel/rcutorture.c | 121 ++++++++++++++++++++++++++++++++++++++++--
2 files changed, 132 insertions(+), 4 deletions(-)
diff -urpNa -X dontdiff linux-2.6.17-srcu/Documentation/RCU/torture.txt linux-2.6.17-torturesrcu/Documentation/RCU/torture.txt
--- linux-2.6.17-srcu/Documentation/RCU/torture.txt 2006-06-24 12:13:12.000000000 -0700
+++ linux-2.6.17-torturesrcu/Documentation/RCU/torture.txt 2006-06-26 18:50:33.000000000 -0700
@@ -118,6 +118,21 @@ o "Free-Block Circulation": Shows the nu
as it is only incremented if a torture structure's counter
somehow gets incremented farther than it should.
+Different implementations of RCU can provide implementation-specific
+additional information. For example, SRCU provides the following:
+
+ srcu-torture: rtc: f8cf46a8 ver: 355 tfle: 0 rta: 356 rtaf: 0 rtf: 346 rtmbe: 0
+ srcu-torture: Reader Pipe: 559738 939 0 0 0 0 0 0 0 0 0
+ srcu-torture: Reader Batch: 560434 243 0 0 0 0 0 0 0 0
+ srcu-torture: Free-Block Circulation: 355 354 353 352 351 350 349 348 347 346 0
+ srcu-torture: per-CPU(idx=1): 0(0,1) 1(0,1) 2(0,0) 3(0,1)
+
+The first four lines are similar to those for RCU. The last line shows
+the per-CPU counter state. The numbers in parentheses are the values
+of the "old" and "current" counters for the corresponding CPU. The
+"idx" value maps the "old" and "current" values to the underlying array,
+and is useful for debugging.
+
USAGE
diff -urpNa -X dontdiff linux-2.6.17-srcu/kernel/rcutorture.c linux-2.6.17-torturesrcu/kernel/rcutorture.c
--- linux-2.6.17-srcu/kernel/rcutorture.c 2006-06-30 15:03:35.000000000 -0700
+++ linux-2.6.17-torturesrcu/kernel/rcutorture.c 2006-07-02 07:34:35.000000000 -0700
@@ -44,6 +44,7 @@
#include <linux/delay.h>
#include <linux/byteorder/swabb.h>
#include <linux/stat.h>
+#include <linux/srcu.h>
MODULE_LICENSE("GPL");
@@ -53,7 +54,7 @@ static int stat_interval; /* Interval be
static int verbose; /* Print more debug info. */
static int test_no_idle_hz; /* Test RCU's support for tickless idle CPUs. */
static int shuffle_interval = 5; /* Interval between shuffles (in sec)*/
-static char *torture_type = "rcu"; /* What to torture. */
+static char *torture_type = "rcu"; /* What to torture: rcu, srcu. */
module_param(nreaders, int, 0);
MODULE_PARM_DESC(nreaders, "Number of RCU reader threads");
@@ -66,7 +67,7 @@ MODULE_PARM_DESC(test_no_idle_hz, "Test
module_param(shuffle_interval, int, 0);
MODULE_PARM_DESC(shuffle_interval, "Number of seconds between shuffles");
module_param(torture_type, charp, 0);
-MODULE_PARM_DESC(torture_type, "Type of RCU to torture (rcu, rcu_bh)");
+MODULE_PARM_DESC(torture_type, "Type of RCU to torture (rcu, rcu_bh, srcu)");
#define TORTURE_FLAG "-torture:"
#define PRINTK_STRING(s) \
@@ -180,6 +181,7 @@ struct rcu_torture_ops {
void (*init)(void);
void (*cleanup)(void);
int (*readlock)(void);
+ void (*readdelay)(struct rcu_random_state *rrsp);
void (*readunlock)(int idx);
int (*completed)(void);
void (*deferredfree)(struct rcu_torture *p);
@@ -198,6 +200,18 @@ static int rcu_torture_read_lock(void)
return 0;
}
+static void rcu_read_delay(struct rcu_random_state *rrsp)
+{
+ long delay;
+ const long longdelay = 200;
+
+ /* We want there to be long-running readers, but not all the time. */
+
+ delay = rcu_random(rrsp) % (nrealreaders * 2 * longdelay);
+ if (!delay)
+ udelay(longdelay);
+}
+
static void rcu_torture_read_unlock(int idx)
{
rcu_read_unlock();
@@ -239,6 +253,7 @@ static struct rcu_torture_ops rcu_ops =
.init = NULL,
.cleanup = NULL,
.readlock = rcu_torture_read_lock,
+ .readdelay = rcu_read_delay,
.readunlock = rcu_torture_read_unlock,
.completed = rcu_torture_completed,
.deferredfree = rcu_torture_deferred_free,
@@ -275,6 +290,7 @@ static struct rcu_torture_ops rcu_bh_ops
.init = NULL,
.cleanup = NULL,
.readlock = rcu_bh_torture_read_lock,
+ .readdelay = rcu_read_delay, /* just reuse rcu's version. */
.readunlock = rcu_bh_torture_read_unlock,
.completed = rcu_bh_torture_completed,
.deferredfree = rcu_bh_torture_deferred_free,
@@ -282,8 +298,105 @@ static struct rcu_torture_ops rcu_bh_ops
.name = "rcu_bh"
};
+/*
+ * Definitions for srcu torture testing.
+ */
+
+static struct srcu_struct srcu_ctl;
+static struct list_head srcu_removed;
+
+static void srcu_torture_init(void)
+{
+ init_srcu_struct(&srcu_ctl);
+ INIT_LIST_HEAD(&srcu_removed);
+}
+
+static void srcu_torture_cleanup(void)
+{
+ synchronize_srcu(&srcu_ctl);
+ cleanup_srcu_struct(&srcu_ctl);
+}
+
+static int srcu_torture_read_lock(void)
+{
+ return srcu_read_lock(&srcu_ctl);
+}
+
+static void srcu_read_delay(struct rcu_random_state *rrsp)
+{
+ long delay;
+ const long uspertick = 1000000 / HZ;
+ const long longdelay = 10;
+
+ /* We want there to be long-running readers, but not all the time. */
+
+ delay = rcu_random(rrsp) % (nrealreaders * 2 * longdelay * uspertick);
+ if (!delay)
+ schedule_timeout_interruptible(longdelay);
+}
+
+static void srcu_torture_read_unlock(int idx)
+{
+ srcu_read_unlock(&srcu_ctl, idx);
+}
+
+static int srcu_torture_completed(void)
+{
+ return srcu_batches_completed(&srcu_ctl);
+}
+
+static void srcu_torture_deferred_free(struct rcu_torture *p)
+{
+ int i;
+ struct rcu_torture *rp;
+ struct rcu_torture *rp1;
+
+ synchronize_srcu(&srcu_ctl);
+ list_add(&p->rtort_free, &srcu_removed);
+ list_for_each_entry_safe(rp, rp1, &srcu_removed, rtort_free) {
+ i = rp->rtort_pipe_count;
+ if (i > RCU_TORTURE_PIPE_LEN)
+ i = RCU_TORTURE_PIPE_LEN;
+ atomic_inc(&rcu_torture_wcount[i]);
+ if (++rp->rtort_pipe_count >= RCU_TORTURE_PIPE_LEN) {
+ rp->rtort_mbtest = 0;
+ list_del(&rp->rtort_free);
+ rcu_torture_free(rp);
+ }
+ }
+}
+
+int srcu_torture_stats(char *page)
+{
+ int cnt = 0;
+ int cpu;
+ int idx = srcu_ctl.completed & 0x1;
+
+ cnt += sprintf(&page[cnt], "%s%s per-CPU(idx=%d):",
+ torture_type, TORTURE_FLAG, idx);
+ for_each_possible_cpu(cpu) {
+ cnt += sprintf(&page[cnt], " %d(%d,%d)", cpu,
+ per_cpu_ptr(srcu_ctl.per_cpu_ref, cpu)->c[!idx],
+ per_cpu_ptr(srcu_ctl.per_cpu_ref, cpu)->c[idx]);
+ }
+ cnt += sprintf(&page[cnt], "\n");
+ return cnt;
+}
+
+static struct rcu_torture_ops srcu_ops = {
+ .init = srcu_torture_init,
+ .cleanup = srcu_torture_cleanup,
+ .readlock = srcu_torture_read_lock,
+ .readdelay = srcu_read_delay,
+ .readunlock = srcu_torture_read_unlock,
+ .completed = srcu_torture_completed,
+ .deferredfree = srcu_torture_deferred_free,
+ .stats = srcu_torture_stats,
+ .name = "srcu"
+};
+
static struct rcu_torture_ops *torture_ops[] =
- { &rcu_ops, &rcu_bh_ops, NULL };
+ { &rcu_ops, &rcu_bh_ops, &srcu_ops, NULL };
/*
* RCU torture writer kthread. Repeatedly substitutes a new structure
@@ -359,7 +472,7 @@ rcu_torture_reader(void *arg)
}
if (p->rtort_mbtest == 0)
atomic_inc(&n_rcu_torture_mberror);
- udelay(rcu_random(&rand) & 0x7f);
+ cur_ops->readdelay(&rand);
preempt_disable();
pipe_count = p->rtort_pipe_count;
if (pipe_count > RCU_TORTURE_PIPE_LEN) {
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH 1/2] srcu-3: RCU variant permitting read-side blocking
[not found] ` <20060709235029.GA194@oleg>
@ 2006-07-10 16:51 ` Paul E. McKenney
[not found] ` <44B29212.1070301@yahoo.com.au>
0 siblings, 1 reply; 5+ messages in thread
From: Paul E. McKenney @ 2006-07-10 16:51 UTC (permalink / raw)
To: Oleg Nesterov
Cc: linux-kernel, akpm, matthltc, dipankar, stern, mingo, tytso,
dvhltc, jes, dhowells
On Mon, Jul 10, 2006 at 03:50:29AM +0400, Oleg Nesterov wrote:
> On 07/06, Paul E. McKenney wrote:
> >
> > Updated patch adding a variant of RCU that permits sleeping in read-side
> > critical sections.
>
> I do not see any problems with this patch, but I have a couple of
> questions, so your help is needed again.
Thank you for looking it over!
> > +void synchronize_srcu(struct srcu_struct *sp)
> > +{
> > + [... snip ...]
> > +
> > + synchronize_sched(); /* Force memory barrier on all CPUs. */
> > +
> > + /*
> > + * The preceding synchronize_sched() forces all srcu_read_unlock()
> > + * primitives that were executing concurrently with the preceding
> > + * for_each_possible_cpu() loop to have completed by this point.
> > + * More importantly, it also forces the corresponding SRCU read-side
> > + * critical sections to have also completed, and the corresponding
> > + * references to SRCU-protected data items to be dropped.
> > + */
> > +
> > + mutex_unlock(&sp->mutex);
> > +}
>
> Isn't it possible to unlock ->mutex earlier, before the last
> synchronize_sched()?
It seems possible, but I would like to think carefully about this one
first, and, if it still seems plausible, test it heavily. If I understand
your line of reasoning, the thought is that the first synchronize_sched()
at the beginning of synchronize_srcu() ensures that all of the counter
updates pertaining to the last instance of synchronize_srcu() have
been committed. The same reasoning might well cover the sp->completed
fastpath as well.
In any case, this is a performance boost off the fastpath. A good boost,
if it works, but I will be much more excited if you find a way of speeding
up srcu_read_lock() or srcu_read_unlock(). ;-)
> Another question: what is the semantics of synchronize_sched() ?
>
> I am not talking about the current implementation, it is very clear.
> The question is: what is the _definition_ of synchronize_sched()
> (which must be valid for "any" RCU implementation) ?
>
> 1) The comment in include/linux/rcupdate.h states that "all preempt_disable
> code sequences will have completed before this primitive returns".
>
> 2) kernel/srcu.c claims that this primitive "forces memory barrier on all
> CPUs". (so the comment in rcupdate.h is not complete).
>
> (I understand this so that each cpu does something which implies mb()
> semantics).
>
> As I see it, 1) + 2) is NOT enough for synchronize_srcu() to be correct
> (the 2-nd and 3-rd synchronize_sched() calls). I think synchronize_sched()
> should also guarantee the completion of mem ops on all CPUs before return,
> not just mb() (which does not have any timing guaranties).
>
> Could you clarify this issue?
>
> (Again, I do not see any problems with the current RCU implementation).
However, this -does- seem to be to be a problem with the comment headers
and the documentation. Does the following patch make things better?
David, would it be worthwhile adding this global-memory-barrier effect
of synchronize_rcu(), synchronize_sched(), and synchronize_srcu() to
Documentation/memory-barriers.txt?
Thanx, Paul
Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com>
---
Documentation/RCU/checklist.txt | 4 ++++
include/linux/rcupdate.h | 3 +++
kernel/rcupdate.c | 3 +++
kernel/srcu.c | 3 ++-
4 files changed, 12 insertions(+), 1 deletion(-)
diff -urpNa -X dontdiff linux-2.6.17-srcu-LKML-4/Documentation/RCU/checklist.txt linux-2.6.17-srcu-LKML-5/Documentation/RCU/checklist.txt
--- linux-2.6.17-srcu-LKML-4/Documentation/RCU/checklist.txt 2006-07-06 16:45:01.000000000 -0700
+++ linux-2.6.17-srcu-LKML-5/Documentation/RCU/checklist.txt 2006-07-10 09:43:19.000000000 -0700
@@ -221,3 +221,7 @@ over a rather long period of time, but i
Note that, rcu_assign_pointer() and rcu_dereference() relate to
SRCU just as they do to other forms of RCU.
+
+14. The synchronize_rcu(), synchronize_sched(), and synchronize_srcu()
+ primitives force at least one memory barrier to be executed on
+ each active CPU before they return.
diff -urpNa -X dontdiff linux-2.6.17-srcu-LKML-4/include/linux/rcupdate.h linux-2.6.17-srcu-LKML-5/include/linux/rcupdate.h
--- linux-2.6.17-srcu-LKML-4/include/linux/rcupdate.h 2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-srcu-LKML-5/include/linux/rcupdate.h 2006-07-10 09:48:51.000000000 -0700
@@ -251,6 +251,9 @@ extern int rcu_needs_cpu(int cpu);
* guarantees that rcu_read_lock() sections will have completed.
* In "classic RCU", these two guarantees happen to be one and
* the same, but can differ in realtime RCU implementations.
+ *
+ * In addition, this primitive guarantees that every active CPU has
+ * executed at least one memory barrier before it returns.
*/
#define synchronize_sched() synchronize_rcu()
diff -urpNa -X dontdiff linux-2.6.17-srcu-LKML-4/kernel/rcupdate.c linux-2.6.17-srcu-LKML-5/kernel/rcupdate.c
--- linux-2.6.17-srcu-LKML-4/kernel/rcupdate.c 2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-srcu-LKML-5/kernel/rcupdate.c 2006-07-10 09:48:32.000000000 -0700
@@ -597,6 +597,9 @@ static void wakeme_after_rcu(struct rcu_
* sections are delimited by rcu_read_lock() and rcu_read_unlock(),
* and may be nested.
*
+ * This primitive also causes each active CPU to execute at least one
+ * memory barrier before it returns.
+ *
* If your read-side code is not protected by rcu_read_lock(), do -not-
* use synchronize_rcu().
*/
diff -urpNa -X dontdiff linux-2.6.17-srcu-LKML-4/kernel/srcu.c linux-2.6.17-srcu-LKML-5/kernel/srcu.c
--- linux-2.6.17-srcu-LKML-4/kernel/srcu.c 2006-07-06 16:50:23.000000000 -0700
+++ linux-2.6.17-srcu-LKML-5/kernel/srcu.c 2006-07-10 09:48:09.000000000 -0700
@@ -143,7 +143,8 @@ void srcu_read_unlock(struct srcu_struct
* Flip the completed counter, and wait for the old count to drain to zero.
* As with classic RCU, the updater must use some separate means of
* synchronizing concurrent updates. Can block; must be called from
- * process context.
+ * process context. Has the side-effect of forcing a memory barrier on
+ * each active CPU before returning.
*
* Note that it is illegal to call synchornize_srcu() from the corresponding
* SRCU read-side critical section; doing so will result in deadlock.
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH 1/2] srcu-3: RCU variant permitting read-side blocking
[not found] ` <44B29212.1070301@yahoo.com.au>
@ 2006-07-11 14:19 ` Paul E. McKenney
0 siblings, 0 replies; 5+ messages in thread
From: Paul E. McKenney @ 2006-07-11 14:19 UTC (permalink / raw)
To: Nick Piggin
Cc: Oleg Nesterov, linux-kernel, akpm, matthltc, dipankar, stern,
mingo, tytso, dvhltc, jes, dhowells
On Tue, Jul 11, 2006 at 03:44:50AM +1000, Nick Piggin wrote:
> Paul E. McKenney wrote:
> >On Mon, Jul 10, 2006 at 03:50:29AM +0400, Oleg Nesterov wrote:
>
> >>As I see it, 1) + 2) is NOT enough for synchronize_srcu() to be correct
> >>(the 2-nd and 3-rd synchronize_sched() calls). I think synchronize_sched()
> >>should also guarantee the completion of mem ops on all CPUs before return,
> >>not just mb() (which does not have any timing guaranties).
> >>
> >>Could you clarify this issue?
> >>
> >>(Again, I do not see any problems with the current RCU implementation).
> >
> >
> >However, this -does- seem to be to be a problem with the comment headers
> >and the documentation. Does the following patch make things better?
> >
> >David, would it be worthwhile adding this global-memory-barrier effect
> >of synchronize_rcu(), synchronize_sched(), and synchronize_srcu() to
>
> And call_rcu? (or is that already tucked away in the documentation
> somewhere?) ie. there is a memory barrier between the call_rcu() call
> and the actual callback.
>
> This is something I needed clarification with (as you might remember),
> which might not be clear from an RCU API user's point of view.
Good point -- since synchronize_rcu() is just a wrapper around call_rcu(),
they do have the same properties. How about the following?
Thanx, Paul
Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com>
Documentation/RCU/checklist.txt | 4 +++-
include/linux/srcu.h | 5 +++++
kernel/rcupdate.c | 4 ++++
3 files changed, 12 insertions(+), 1 deletion(-)
diff -urpNa -X dontdiff linux-2.6.17-srcu-LKML-5/Documentation/RCU/checklist.txt linux-2.6.17-srcu-LKML-6/Documentation/RCU/checklist.txt
--- linux-2.6.17-srcu-LKML-5/Documentation/RCU/checklist.txt 2006-07-10 09:43:19.000000000 -0700
+++ linux-2.6.17-srcu-LKML-6/Documentation/RCU/checklist.txt 2006-07-11 07:12:25.000000000 -0700
@@ -224,4 +224,6 @@ over a rather long period of time, but i
14. The synchronize_rcu(), synchronize_sched(), and synchronize_srcu()
primitives force at least one memory barrier to be executed on
- each active CPU before they return.
+ each active CPU before they return. Similarly, call_rcu()
+ forces at least one memory barrier to be executed on each active
+ CPU before the corresponding callback is invoked.
diff -urpNa -X dontdiff linux-2.6.17-srcu-LKML-5/kernel/rcupdate.c linux-2.6.17-srcu-LKML-6/kernel/rcupdate.c
--- linux-2.6.17-srcu-LKML-5/kernel/rcupdate.c 2006-07-10 09:48:32.000000000 -0700
+++ linux-2.6.17-srcu-LKML-6/kernel/rcupdate.c 2006-07-11 07:11:07.000000000 -0700
@@ -116,6 +116,10 @@ static inline void force_quiescent_state
* read-side critical sections have completed. RCU read-side critical
* sections are delimited by rcu_read_lock() and rcu_read_unlock(),
* and may be nested.
+ *
+ * There will be at least one memory barrier executed on each active
+ * CPU between the time call_rcu() is invoked and the time that the
+ * corresponding callback is invoked.
*/
void fastcall call_rcu(struct rcu_head *head,
void (*func)(struct rcu_head *rcu))
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2006-07-11 14:18 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-07-06 17:14 [PATCH 0/2] srcu-3: add RCU variant that permits read-side blocking Paul E. McKenney
2006-07-06 17:20 ` [PATCH 1/2] srcu-3: RCU variant permitting " Paul E. McKenney
[not found] ` <20060709235029.GA194@oleg>
2006-07-10 16:51 ` Paul E. McKenney
[not found] ` <44B29212.1070301@yahoo.com.au>
2006-07-11 14:19 ` Paul E. McKenney
2006-07-06 17:25 ` [PATCH 2/2] srcu-3: add SRCU operations to rcutorture Paul E. McKenney
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox