* [PATCH 0/2] srcu-3: add RCU variant that permits read-side blocking
@ 2006-07-06 17:14 Paul E. McKenney
2006-07-06 17:20 ` [PATCH 1/2] srcu-3: RCU variant permitting " Paul E. McKenney
2006-07-06 17:25 ` [PATCH 2/2] srcu-3: add SRCU operations to rcutorture Paul E. McKenney
0 siblings, 2 replies; 5+ messages in thread
From: Paul E. McKenney @ 2006-07-06 17:14 UTC (permalink / raw)
To: linux-kernel
Cc: akpm, matthltc, dipankar, stern, mingo, tytso, dvhltc, oleg, jes
Version 3 of the SRCU patchset.
This patch incorporates a number of improvements, many of which came
up in off-list discussions with Alan Stern. Neither of us are sure
why these discussions ended up off-list, so I have summarized them
below.
o Fixes some "zombie code" -- excess curly braces and the like.
o Gets rid of the double-flip in favor of an additional
synchronize_sched(). This turned out to be safe, despite
my saying otherwise at http://lkml.org/lkml/2006/6/27/486.
The trick that I was missing is that synchronize_sched()
forces all CPUs to execute at least one memory barrier during
the synchronize_sched()'s execution, which forces all CPUs
to see synchronize_srcu()'s counter increment as happening
after any memory manipulations prior to the synchronize_srcu().
Upgraded comments to indicate what the synchronize_sched()
calls are needed for.
o Added a barrier() to both srcu_read_lock() and srcu_read_unlock()
to prevent the compiler from performing optimizations that
would cause the critical section to move outside of the
enclosing srcu_read_lock() and srcu_read_unlock().
However, these barrier()s in srcu_read_lock() and srcu_read_unlock()
are needed only in non-CONFIG_PREEMPT kernels, so they compile
to nothing in CONFIG_PREEMPT kernels (where the preempt_disable()
and preempt_enable() calls supply the needed barrier() call).
o Added a check to synchronize_srcu() that permits this primitive
to take advantage of grace periods induced by concurrent executions
in other tasks. This can be useful in cases where you are
using a single srcu_struct to handle all the individually-locked
chains of a hash table, for example.
o cleanup_srcu_struct() now contains error checks to catch cases
where readers are still using the srcu_struct in question.
It does a WARN_ON() and leaks the srcu_struct's per-CPU data
in that case.
o There is an srcu_readers_active() that returns the number of
readers (approximate!) currently using the specified srcu_struct.
This can be useful when terminating use of an srcu_struct, e.g.,
at module-unload time.
o Improved the RCU torture tests, increasing the skew on reader
times and providing implementation-specific delay functions.
^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH 1/2] srcu-3: RCU variant permitting read-side blocking
2006-07-06 17:14 [PATCH 0/2] srcu-3: add RCU variant that permits read-side blocking Paul E. McKenney
@ 2006-07-06 17:20 ` Paul E. McKenney
[not found] ` <20060709235029.GA194@oleg>
2006-07-06 17:25 ` [PATCH 2/2] srcu-3: add SRCU operations to rcutorture Paul E. McKenney
1 sibling, 1 reply; 5+ messages in thread
From: Paul E. McKenney @ 2006-07-06 17:20 UTC (permalink / raw)
To: linux-kernel
Cc: akpm, matthltc, dipankar, stern, mingo, tytso, dvhltc, oleg, jes
Updated patch adding a variant of RCU that permits sleeping in read-side
critical sections. SRCU is as follows:
o Each use of SRCU creates its own srcu_struct, and each
srcu_struct has its own set of grace periods. This is
critical, as it prevents one subsystem with a blocking
reader from holding up SRCU grace periods for other
subsystems.
o The SRCU primitives (srcu_read_lock(), srcu_read_unlock(),
and synchronize_srcu()) all take a pointer to a srcu_struct.
o The SRCU primitives must be called from process context.
o srcu_read_lock() returns an int that must be passed to
the matching srcu_read_unlock(). Realtime RCU avoids the
need for this by storing the state in the task struct,
but SRCU needs to allow a given code path to pass through
multiple SRCU domains -- storing state in the task struct
would therefore require either arbitrary space in the
task struct or arbitrary limits on SRCU nesting. So I
kicked the state-storage problem up to the caller.
Of course, it is not permitted to call synchronize_srcu()
while in an SRCU read-side critical section.
o There is no call_srcu(). It would not be hard to implement
one, but it seems like too easy a way to OOM the system.
(Hey, we have enough trouble with call_rcu(), which does
-not- permit readers to sleep!!!) So, if you want it,
please tell me why...
Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com>
---
Documentation/RCU/checklist.txt | 38 ++++++
Documentation/RCU/rcu.txt | 3
Documentation/RCU/whatisRCU.txt | 3
include/linux/srcu.h | 49 ++++++++
kernel/Makefile | 2
kernel/srcu.c | 238 ++++++++++++++++++++++++++++++++++++++++
6 files changed, 331 insertions(+), 2 deletions(-)
diff -urpNa -X dontdiff linux-2.6.17-torturercu_bh/Documentation/RCU/checklist.txt linux-2.6.17-srcu/Documentation/RCU/checklist.txt
--- linux-2.6.17-torturercu_bh/Documentation/RCU/checklist.txt 2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-srcu/Documentation/RCU/checklist.txt 2006-06-30 17:07:02.000000000 -0700
@@ -183,3 +183,41 @@ over a rather long period of time, but i
disable irq on a given acquisition of that lock will result in
deadlock as soon as the RCU callback happens to interrupt that
acquisition's critical section.
+
+13. SRCU (srcu_read_lock(), srcu_read_unlock(), and synchronize_srcu())
+ may only be invoked from process context. Unlike other forms of
+ RCU, it -is- permissible to block in an SRCU read-side critical
+ section (demarked by srcu_read_lock() and srcu_read_unlock()),
+ hence the "SRCU": "sleepable RCU". Please note that if you
+ don't need to sleep in read-side critical sections, you should
+ be using RCU rather than SRCU, because RCU is almost always
+ faster and easier to use than is SRCU.
+
+ Also unlike other forms of RCU, explicit initialization
+ and cleanup is required via init_srcu_struct() and
+ cleanup_srcu_struct(). These are passed a "struct srcu_struct"
+ that defines the scope of a given SRCU domain. Once initialized,
+ the srcu_struct is passed to srcu_read_lock(), srcu_read_unlock()
+ and synchronize_srcu(). A given synchronize_srcu() waits only
+ for SRCU read-side critical sections governed by srcu_read_lock()
+ and srcu_read_unlock() calls that have been passd the same
+ srcu_struct. This property is what makes sleeping read-side
+ critical sections tolerable -- a given subsystem delays only
+ its own updates, not those of other subsystems using SRCU.
+ Therefore, SRCU is less prone to OOM the system than RCU would
+ be if RCU's read-side critical sections were permitted to
+ sleep.
+
+ The ability to sleep in read-side critical sections does not
+ come for free. First, corresponding srcu_read_lock() and
+ srcu_read_unlock() calls must be passed the same srcu_struct.
+ Second, grace-period-detection overhead is amortized only
+ over those updates sharing a given srcu_struct, rather than
+ being globally amortized as they are for other forms of RCU.
+ Therefore, SRCU should be used in preference to rw_semaphore
+ only in extremely read-intensive situations, or in situations
+ requiring SRCU's read-side deadlock immunity or low read-side
+ realtime latency.
+
+ Note that, rcu_assign_pointer() and rcu_dereference() relate to
+ SRCU just as they do to other forms of RCU.
diff -urpNa -X dontdiff linux-2.6.17-torturercu_bh/Documentation/RCU/rcu.txt linux-2.6.17-srcu/Documentation/RCU/rcu.txt
--- linux-2.6.17-torturercu_bh/Documentation/RCU/rcu.txt 2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-srcu/Documentation/RCU/rcu.txt 2006-06-24 08:04:07.000000000 -0700
@@ -45,7 +45,8 @@ o How can I see where RCU is currently u
Search for "rcu_read_lock", "rcu_read_unlock", "call_rcu",
"rcu_read_lock_bh", "rcu_read_unlock_bh", "call_rcu_bh",
- "synchronize_rcu", and "synchronize_net".
+ "srcu_read_lock", "srcu_read_unlock", "synchronize_rcu",
+ "synchronize_net", and "synchronize_srcu".
o What guidelines should I follow when writing code that uses RCU?
diff -urpNa -X dontdiff linux-2.6.17-torturercu_bh/Documentation/RCU/whatisRCU.txt linux-2.6.17-srcu/Documentation/RCU/whatisRCU.txt
--- linux-2.6.17-torturercu_bh/Documentation/RCU/whatisRCU.txt 2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-srcu/Documentation/RCU/whatisRCU.txt 2006-06-24 08:04:07.000000000 -0700
@@ -767,6 +767,8 @@ Markers for RCU read-side critical secti
rcu_read_unlock
rcu_read_lock_bh
rcu_read_unlock_bh
+ srcu_read_lock
+ srcu_read_unlock
RCU pointer/list traversal:
@@ -794,6 +796,7 @@ RCU grace period:
synchronize_net
synchronize_sched
synchronize_rcu
+ synchronize_srcu
call_rcu
call_rcu_bh
diff -urpNa -X dontdiff linux-2.6.17-torturercu_bh/include/linux/srcu.h linux-2.6.17-srcu/include/linux/srcu.h
--- linux-2.6.17-torturercu_bh/include/linux/srcu.h 1969-12-31 16:00:00.000000000 -0800
+++ linux-2.6.17-srcu/include/linux/srcu.h 2006-07-02 07:27:32.000000000 -0700
@@ -0,0 +1,49 @@
+/*
+ * Sleepable Read-Copy Update mechanism for mutual exclusion
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) IBM Corporation, 2006
+ *
+ * Author: Paul McKenney <paulmck@us.ibm.com>
+ *
+ * For detailed explanation of Read-Copy Update mechanism see -
+ * Documentation/RCU/ *.txt
+ *
+ */
+
+struct srcu_struct_array {
+ int c[2];
+};
+
+struct srcu_struct {
+ int completed;
+ struct srcu_struct_array *per_cpu_ref;
+ struct mutex mutex;
+};
+
+#ifndef CONFIG_PREEMPT
+#define srcu_barrier() barrier()
+#else /* #ifndef CONFIG_PREEMPT */
+#define srcu_barrier()
+#endif /* #else #ifndef CONFIG_PREEMPT */
+
+void init_srcu_struct(struct srcu_struct *sp);
+void cleanup_srcu_struct(struct srcu_struct *sp);
+int srcu_read_lock(struct srcu_struct *sp);
+void srcu_read_unlock(struct srcu_struct *sp, int idx);
+void synchronize_srcu(struct srcu_struct *sp);
+long srcu_batches_completed(struct srcu_struct *sp);
+void cleanup_srcu_struct(struct srcu_struct *sp);
diff -urpNa -X dontdiff linux-2.6.17-torturercu_bh/kernel/Makefile linux-2.6.17-srcu/kernel/Makefile
--- linux-2.6.17-torturercu_bh/kernel/Makefile 2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-srcu/kernel/Makefile 2006-06-24 08:04:07.000000000 -0700
@@ -8,7 +8,7 @@ obj-y = sched.o fork.o exec_domain.o
signal.o sys.o kmod.o workqueue.o pid.o \
rcupdate.o extable.o params.o posix-timers.o \
kthread.o wait.o kfifo.o sys_ni.o posix-cpu-timers.o mutex.o \
- hrtimer.o
+ hrtimer.o srcu.o
obj-$(CONFIG_DEBUG_MUTEXES) += mutex-debug.o
obj-$(CONFIG_FUTEX) += futex.o
diff -urpNa -X dontdiff linux-2.6.17-torturercu_bh/kernel/srcu.c linux-2.6.17-srcu/kernel/srcu.c
--- linux-2.6.17-torturercu_bh/kernel/srcu.c 1969-12-31 16:00:00.000000000 -0800
+++ linux-2.6.17-srcu/kernel/srcu.c 2006-07-04 18:49:13.000000000 -0700
@@ -0,0 +1,238 @@
+/*
+ * Sleepable Read-Copy Update mechanism for mutual exclusion.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) IBM Corporation, 2006
+ *
+ * Author: Paul McKenney <paulmck@us.ibm.com>
+ *
+ * For detailed explanation of Read-Copy Update mechanism see -
+ * Documentation/RCU/ *.txt
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/percpu.h>
+#include <linux/preempt.h>
+#include <linux/rcupdate.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include <linux/smp.h>
+#include <linux/srcu.h>
+
+/**
+ * init_srcu_struct - initialize a sleep-RCU structure
+ * @sp: structure to initialize.
+ *
+ * Must invoke this on a given srcu_struct before passing that srcu_struct
+ * to any other function. Each srcu_struct represents a separate domain
+ * of SRCU protection.
+ */
+void init_srcu_struct(struct srcu_struct *sp)
+{
+ sp->completed = 0;
+ sp->per_cpu_ref = alloc_percpu(struct srcu_struct_array);
+ mutex_init(&sp->mutex);
+}
+
+/*
+ * srcu_readers_active_idx -- returns approximate number of readers
+ * active on the specified rank of per-CPU counters.
+ */
+
+static int srcu_readers_active_idx(struct srcu_struct *sp, int idx)
+{
+ int cpu;
+ int sum;
+
+ sum = 0;
+ for_each_possible_cpu(cpu)
+ sum += per_cpu_ptr(sp->per_cpu_ref, cpu)->c[idx];
+ return (sum);
+}
+
+/**
+ * srcu_readers_active - returns approximate number of readers.
+ * @sp: which srcu_struct to count active readers (holding srcu_read_lock).
+ *
+ * Note that this is not an atomic primitive, and can therefore suffer
+ * severe errors when invoked on an active srcu_struct. That said, it
+ * can be useful as an error check at cleanup time.
+ */
+int srcu_readers_active(struct srcu_struct *sp)
+{
+ return srcu_readers_active_idx(sp, 0) + srcu_readers_active_idx(sp, 1);
+}
+
+/**
+ * cleanup_srcu_struct - deconstruct a sleep-RCU structure
+ * @sp: structure to clean up.
+ *
+ * Must invoke this after you are finished using a given srcu_struct that
+ * was initialized via init_srcu_struct(), else you leak memory.
+ */
+void cleanup_srcu_struct(struct srcu_struct *sp)
+{
+ int sum;
+
+ sum = srcu_readers_active(sp);
+ WARN_ON(sum); /* Leakage unless caller handles error. */
+ if (sum != 0)
+ return;
+ free_percpu(sp->per_cpu_ref);
+ sp->per_cpu_ref = NULL;
+}
+
+/**
+ * srcu_read_lock - register a new reader for an SRCU-protected structure.
+ * @sp: srcu_struct in which to register the new reader.
+ *
+ * Counts the new reader in the appropriate per-CPU element of the
+ * srcu_struct. Must be called from process context.
+ * Returns an index that must be passed to the matching srcu_read_unlock().
+ */
+int srcu_read_lock(struct srcu_struct *sp)
+{
+ int idx;
+
+ preempt_disable();
+ idx = sp->completed & 0x1;
+ barrier(); /* ensure compiler looks -once- at sp->completed. */
+ per_cpu_ptr(sp->per_cpu_ref, smp_processor_id())->c[idx]++;
+ srcu_barrier(); /* ensure compiler won't misorder critical section. */
+ preempt_enable();
+ return idx;
+}
+
+/**
+ * srcu_read_unlock - unregister a old reader from an SRCU-protected structure.
+ * @sp: srcu_struct in which to unregister the old reader.
+ * @idx: return value from corresponding srcu_read_lock().
+ *
+ * Removes the count for the old reader from the appropriate per-CPU
+ * element of the srcu_struct. Note that this may well be a different
+ * CPU than that which was incremented by the corresponding srcu_read_lock().
+ * Must be called from process context.
+ */
+void srcu_read_unlock(struct srcu_struct *sp, int idx)
+{
+ preempt_disable();
+ srcu_barrier(); /* ensure compiler won't misorder critical section. */
+ per_cpu_ptr(sp->per_cpu_ref, smp_processor_id())->c[idx]--;
+ preempt_enable();
+}
+
+/**
+ * synchronize_srcu - wait for prior SRCU read-side critical-section completion
+ * @sp: srcu_struct with which to synchronize.
+ *
+ * Flip the completed counter, and wait for the old count to drain to zero.
+ * As with classic RCU, the updater must use some separate means of
+ * synchronizing concurrent updates. Can block; must be called from
+ * process context.
+ */
+void synchronize_srcu(struct srcu_struct *sp)
+{
+ int idx;
+ int sum;
+
+ idx = sp->completed;
+ mutex_lock(&sp->mutex);
+
+ /*
+ * Check to see if someone else did the work for us while we were
+ * waiting to acquire the lock. We need -two- advances of
+ * the counter, not just one. If there was but one, we might have
+ * shown up -after- our helper's first synchronize_sched(), thus
+ * having failed to prevent CPU-reordering races with concurrent
+ * srcu_read_unlock()s on other CPUs (see comment below). So we
+ * either (1) wait for two or (2) supply the second ourselves.
+ */
+
+ if ((sp->completed - idx) >= 2) {
+ mutex_unlock(&sp->mutex);
+ return;
+ }
+
+ synchronize_sched(); /* Force memory barrier on all CPUs. */
+
+ /*
+ * The preceding synchronize_sched() ensures that any CPU that
+ * sees the new value of sp->completed will also see any preceding
+ * changes to data structures made by this CPU. This prevents
+ * some other CPU from reordering the accesses in its SRCU
+ * read-side critical section to precede the corresponding
+ * srcu_read_lock() -- ensuring that such references will in
+ * fact be protected.
+ *
+ * So it is now safe to do the flip.
+ */
+
+ idx = sp->completed & 0x1;
+ sp->completed++;
+
+ synchronize_sched(); /* Force memory barrier on all CPUs. */
+
+ /*
+ * At this point, because of the preceding synchronize_sched(),
+ * all srcu_read_lock() calls using the old counters have completed.
+ * Their corresponding critical sections might well be still
+ * executing, but the srcu_read_lock() primitives themselves
+ * will have finished executing.
+ */
+
+ for (;;) {
+ sum = srcu_readers_active_idx(sp, idx);
+ if (sum == 0)
+ break;
+ schedule_timeout_interruptible(1);
+ }
+
+ synchronize_sched(); /* Force memory barrier on all CPUs. */
+
+ /*
+ * The preceding synchronize_sched() forces all srcu_read_unlock()
+ * primitives that were executing concurrently with the preceding
+ * for_each_possible_cpu() loop to have completed by this point.
+ * More importantly, it also forces the corresponding SRCU read-side
+ * critical sections to have also completed, and the corresponding
+ * references to SRCU-protected data items to be dropped.
+ */
+
+ mutex_unlock(&sp->mutex);
+}
+
+/**
+ * srcu_batches_completed - return batches completed.
+ * @sp: srcu_struct on which to report batch completion.
+ *
+ * Report the number of batches, correlated with, but not necessarily
+ * precisely the same as, the number of grace periods that have elapsed.
+ */
+
+long srcu_batches_completed(struct srcu_struct *sp)
+{
+ return sp->completed;
+}
+
+EXPORT_SYMBOL_GPL(init_srcu_struct);
+EXPORT_SYMBOL_GPL(cleanup_srcu_struct);
+EXPORT_SYMBOL_GPL(srcu_read_lock);
+EXPORT_SYMBOL_GPL(srcu_read_unlock);
+EXPORT_SYMBOL_GPL(synchronize_srcu);
+EXPORT_SYMBOL_GPL(srcu_batches_completed);
+EXPORT_SYMBOL_GPL(srcu_readers_active);
^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH 2/2] srcu-3: add SRCU operations to rcutorture
2006-07-06 17:14 [PATCH 0/2] srcu-3: add RCU variant that permits read-side blocking Paul E. McKenney
2006-07-06 17:20 ` [PATCH 1/2] srcu-3: RCU variant permitting " Paul E. McKenney
@ 2006-07-06 17:25 ` Paul E. McKenney
1 sibling, 0 replies; 5+ messages in thread
From: Paul E. McKenney @ 2006-07-06 17:25 UTC (permalink / raw)
To: linux-kernel
Cc: akpm, matthltc, dipankar, stern, mingo, tytso, dvhltc, oleg, jes
Adds SRCU operations to rcutorture and updates rcutorture documentation.
Also increases the stress imposed by the rcutorture test.
Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com>
---
Documentation/RCU/torture.txt | 15 +++++
kernel/rcutorture.c | 121 ++++++++++++++++++++++++++++++++++++++++--
2 files changed, 132 insertions(+), 4 deletions(-)
diff -urpNa -X dontdiff linux-2.6.17-srcu/Documentation/RCU/torture.txt linux-2.6.17-torturesrcu/Documentation/RCU/torture.txt
--- linux-2.6.17-srcu/Documentation/RCU/torture.txt 2006-06-24 12:13:12.000000000 -0700
+++ linux-2.6.17-torturesrcu/Documentation/RCU/torture.txt 2006-06-26 18:50:33.000000000 -0700
@@ -118,6 +118,21 @@ o "Free-Block Circulation": Shows the nu
as it is only incremented if a torture structure's counter
somehow gets incremented farther than it should.
+Different implementations of RCU can provide implementation-specific
+additional information. For example, SRCU provides the following:
+
+ srcu-torture: rtc: f8cf46a8 ver: 355 tfle: 0 rta: 356 rtaf: 0 rtf: 346 rtmbe: 0
+ srcu-torture: Reader Pipe: 559738 939 0 0 0 0 0 0 0 0 0
+ srcu-torture: Reader Batch: 560434 243 0 0 0 0 0 0 0 0
+ srcu-torture: Free-Block Circulation: 355 354 353 352 351 350 349 348 347 346 0
+ srcu-torture: per-CPU(idx=1): 0(0,1) 1(0,1) 2(0,0) 3(0,1)
+
+The first four lines are similar to those for RCU. The last line shows
+the per-CPU counter state. The numbers in parentheses are the values
+of the "old" and "current" counters for the corresponding CPU. The
+"idx" value maps the "old" and "current" values to the underlying array,
+and is useful for debugging.
+
USAGE
diff -urpNa -X dontdiff linux-2.6.17-srcu/kernel/rcutorture.c linux-2.6.17-torturesrcu/kernel/rcutorture.c
--- linux-2.6.17-srcu/kernel/rcutorture.c 2006-06-30 15:03:35.000000000 -0700
+++ linux-2.6.17-torturesrcu/kernel/rcutorture.c 2006-07-02 07:34:35.000000000 -0700
@@ -44,6 +44,7 @@
#include <linux/delay.h>
#include <linux/byteorder/swabb.h>
#include <linux/stat.h>
+#include <linux/srcu.h>
MODULE_LICENSE("GPL");
@@ -53,7 +54,7 @@ static int stat_interval; /* Interval be
static int verbose; /* Print more debug info. */
static int test_no_idle_hz; /* Test RCU's support for tickless idle CPUs. */
static int shuffle_interval = 5; /* Interval between shuffles (in sec)*/
-static char *torture_type = "rcu"; /* What to torture. */
+static char *torture_type = "rcu"; /* What to torture: rcu, srcu. */
module_param(nreaders, int, 0);
MODULE_PARM_DESC(nreaders, "Number of RCU reader threads");
@@ -66,7 +67,7 @@ MODULE_PARM_DESC(test_no_idle_hz, "Test
module_param(shuffle_interval, int, 0);
MODULE_PARM_DESC(shuffle_interval, "Number of seconds between shuffles");
module_param(torture_type, charp, 0);
-MODULE_PARM_DESC(torture_type, "Type of RCU to torture (rcu, rcu_bh)");
+MODULE_PARM_DESC(torture_type, "Type of RCU to torture (rcu, rcu_bh, srcu)");
#define TORTURE_FLAG "-torture:"
#define PRINTK_STRING(s) \
@@ -180,6 +181,7 @@ struct rcu_torture_ops {
void (*init)(void);
void (*cleanup)(void);
int (*readlock)(void);
+ void (*readdelay)(struct rcu_random_state *rrsp);
void (*readunlock)(int idx);
int (*completed)(void);
void (*deferredfree)(struct rcu_torture *p);
@@ -198,6 +200,18 @@ static int rcu_torture_read_lock(void)
return 0;
}
+static void rcu_read_delay(struct rcu_random_state *rrsp)
+{
+ long delay;
+ const long longdelay = 200;
+
+ /* We want there to be long-running readers, but not all the time. */
+
+ delay = rcu_random(rrsp) % (nrealreaders * 2 * longdelay);
+ if (!delay)
+ udelay(longdelay);
+}
+
static void rcu_torture_read_unlock(int idx)
{
rcu_read_unlock();
@@ -239,6 +253,7 @@ static struct rcu_torture_ops rcu_ops =
.init = NULL,
.cleanup = NULL,
.readlock = rcu_torture_read_lock,
+ .readdelay = rcu_read_delay,
.readunlock = rcu_torture_read_unlock,
.completed = rcu_torture_completed,
.deferredfree = rcu_torture_deferred_free,
@@ -275,6 +290,7 @@ static struct rcu_torture_ops rcu_bh_ops
.init = NULL,
.cleanup = NULL,
.readlock = rcu_bh_torture_read_lock,
+ .readdelay = rcu_read_delay, /* just reuse rcu's version. */
.readunlock = rcu_bh_torture_read_unlock,
.completed = rcu_bh_torture_completed,
.deferredfree = rcu_bh_torture_deferred_free,
@@ -282,8 +298,105 @@ static struct rcu_torture_ops rcu_bh_ops
.name = "rcu_bh"
};
+/*
+ * Definitions for srcu torture testing.
+ */
+
+static struct srcu_struct srcu_ctl;
+static struct list_head srcu_removed;
+
+static void srcu_torture_init(void)
+{
+ init_srcu_struct(&srcu_ctl);
+ INIT_LIST_HEAD(&srcu_removed);
+}
+
+static void srcu_torture_cleanup(void)
+{
+ synchronize_srcu(&srcu_ctl);
+ cleanup_srcu_struct(&srcu_ctl);
+}
+
+static int srcu_torture_read_lock(void)
+{
+ return srcu_read_lock(&srcu_ctl);
+}
+
+static void srcu_read_delay(struct rcu_random_state *rrsp)
+{
+ long delay;
+ const long uspertick = 1000000 / HZ;
+ const long longdelay = 10;
+
+ /* We want there to be long-running readers, but not all the time. */
+
+ delay = rcu_random(rrsp) % (nrealreaders * 2 * longdelay * uspertick);
+ if (!delay)
+ schedule_timeout_interruptible(longdelay);
+}
+
+static void srcu_torture_read_unlock(int idx)
+{
+ srcu_read_unlock(&srcu_ctl, idx);
+}
+
+static int srcu_torture_completed(void)
+{
+ return srcu_batches_completed(&srcu_ctl);
+}
+
+static void srcu_torture_deferred_free(struct rcu_torture *p)
+{
+ int i;
+ struct rcu_torture *rp;
+ struct rcu_torture *rp1;
+
+ synchronize_srcu(&srcu_ctl);
+ list_add(&p->rtort_free, &srcu_removed);
+ list_for_each_entry_safe(rp, rp1, &srcu_removed, rtort_free) {
+ i = rp->rtort_pipe_count;
+ if (i > RCU_TORTURE_PIPE_LEN)
+ i = RCU_TORTURE_PIPE_LEN;
+ atomic_inc(&rcu_torture_wcount[i]);
+ if (++rp->rtort_pipe_count >= RCU_TORTURE_PIPE_LEN) {
+ rp->rtort_mbtest = 0;
+ list_del(&rp->rtort_free);
+ rcu_torture_free(rp);
+ }
+ }
+}
+
+int srcu_torture_stats(char *page)
+{
+ int cnt = 0;
+ int cpu;
+ int idx = srcu_ctl.completed & 0x1;
+
+ cnt += sprintf(&page[cnt], "%s%s per-CPU(idx=%d):",
+ torture_type, TORTURE_FLAG, idx);
+ for_each_possible_cpu(cpu) {
+ cnt += sprintf(&page[cnt], " %d(%d,%d)", cpu,
+ per_cpu_ptr(srcu_ctl.per_cpu_ref, cpu)->c[!idx],
+ per_cpu_ptr(srcu_ctl.per_cpu_ref, cpu)->c[idx]);
+ }
+ cnt += sprintf(&page[cnt], "\n");
+ return cnt;
+}
+
+static struct rcu_torture_ops srcu_ops = {
+ .init = srcu_torture_init,
+ .cleanup = srcu_torture_cleanup,
+ .readlock = srcu_torture_read_lock,
+ .readdelay = srcu_read_delay,
+ .readunlock = srcu_torture_read_unlock,
+ .completed = srcu_torture_completed,
+ .deferredfree = srcu_torture_deferred_free,
+ .stats = srcu_torture_stats,
+ .name = "srcu"
+};
+
static struct rcu_torture_ops *torture_ops[] =
- { &rcu_ops, &rcu_bh_ops, NULL };
+ { &rcu_ops, &rcu_bh_ops, &srcu_ops, NULL };
/*
* RCU torture writer kthread. Repeatedly substitutes a new structure
@@ -359,7 +472,7 @@ rcu_torture_reader(void *arg)
}
if (p->rtort_mbtest == 0)
atomic_inc(&n_rcu_torture_mberror);
- udelay(rcu_random(&rand) & 0x7f);
+ cur_ops->readdelay(&rand);
preempt_disable();
pipe_count = p->rtort_pipe_count;
if (pipe_count > RCU_TORTURE_PIPE_LEN) {
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH 1/2] srcu-3: RCU variant permitting read-side blocking
[not found] ` <20060709235029.GA194@oleg>
@ 2006-07-10 16:51 ` Paul E. McKenney
[not found] ` <44B29212.1070301@yahoo.com.au>
0 siblings, 1 reply; 5+ messages in thread
From: Paul E. McKenney @ 2006-07-10 16:51 UTC (permalink / raw)
To: Oleg Nesterov
Cc: linux-kernel, akpm, matthltc, dipankar, stern, mingo, tytso,
dvhltc, jes, dhowells
On Mon, Jul 10, 2006 at 03:50:29AM +0400, Oleg Nesterov wrote:
> On 07/06, Paul E. McKenney wrote:
> >
> > Updated patch adding a variant of RCU that permits sleeping in read-side
> > critical sections.
>
> I do not see any problems with this patch, but I have a couple of
> questions, so your help is needed again.
Thank you for looking it over!
> > +void synchronize_srcu(struct srcu_struct *sp)
> > +{
> > + [... snip ...]
> > +
> > + synchronize_sched(); /* Force memory barrier on all CPUs. */
> > +
> > + /*
> > + * The preceding synchronize_sched() forces all srcu_read_unlock()
> > + * primitives that were executing concurrently with the preceding
> > + * for_each_possible_cpu() loop to have completed by this point.
> > + * More importantly, it also forces the corresponding SRCU read-side
> > + * critical sections to have also completed, and the corresponding
> > + * references to SRCU-protected data items to be dropped.
> > + */
> > +
> > + mutex_unlock(&sp->mutex);
> > +}
>
> Isn't it possible to unlock ->mutex earlier, before the last
> synchronize_sched()?
It seems possible, but I would like to think carefully about this one
first, and, if it still seems plausible, test it heavily. If I understand
your line of reasoning, the thought is that the first synchronize_sched()
at the beginning of synchronize_srcu() ensures that all of the counter
updates pertaining to the last instance of synchronize_srcu() have
been committed. The same reasoning might well cover the sp->completed
fastpath as well.
In any case, this is a performance boost off the fastpath. A good boost,
if it works, but I will be much more excited if you find a way of speeding
up srcu_read_lock() or srcu_read_unlock(). ;-)
> Another question: what is the semantics of synchronize_sched() ?
>
> I am not talking about the current implementation, it is very clear.
> The question is: what is the _definition_ of synchronize_sched()
> (which must be valid for "any" RCU implementation) ?
>
> 1) The comment in include/linux/rcupdate.h states that "all preempt_disable
> code sequences will have completed before this primitive returns".
>
> 2) kernel/srcu.c claims that this primitive "forces memory barrier on all
> CPUs". (so the comment in rcupdate.h is not complete).
>
> (I understand this so that each cpu does something which implies mb()
> semantics).
>
> As I see it, 1) + 2) is NOT enough for synchronize_srcu() to be correct
> (the 2-nd and 3-rd synchronize_sched() calls). I think synchronize_sched()
> should also guarantee the completion of mem ops on all CPUs before return,
> not just mb() (which does not have any timing guaranties).
>
> Could you clarify this issue?
>
> (Again, I do not see any problems with the current RCU implementation).
However, this -does- seem to be to be a problem with the comment headers
and the documentation. Does the following patch make things better?
David, would it be worthwhile adding this global-memory-barrier effect
of synchronize_rcu(), synchronize_sched(), and synchronize_srcu() to
Documentation/memory-barriers.txt?
Thanx, Paul
Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com>
---
Documentation/RCU/checklist.txt | 4 ++++
include/linux/rcupdate.h | 3 +++
kernel/rcupdate.c | 3 +++
kernel/srcu.c | 3 ++-
4 files changed, 12 insertions(+), 1 deletion(-)
diff -urpNa -X dontdiff linux-2.6.17-srcu-LKML-4/Documentation/RCU/checklist.txt linux-2.6.17-srcu-LKML-5/Documentation/RCU/checklist.txt
--- linux-2.6.17-srcu-LKML-4/Documentation/RCU/checklist.txt 2006-07-06 16:45:01.000000000 -0700
+++ linux-2.6.17-srcu-LKML-5/Documentation/RCU/checklist.txt 2006-07-10 09:43:19.000000000 -0700
@@ -221,3 +221,7 @@ over a rather long period of time, but i
Note that, rcu_assign_pointer() and rcu_dereference() relate to
SRCU just as they do to other forms of RCU.
+
+14. The synchronize_rcu(), synchronize_sched(), and synchronize_srcu()
+ primitives force at least one memory barrier to be executed on
+ each active CPU before they return.
diff -urpNa -X dontdiff linux-2.6.17-srcu-LKML-4/include/linux/rcupdate.h linux-2.6.17-srcu-LKML-5/include/linux/rcupdate.h
--- linux-2.6.17-srcu-LKML-4/include/linux/rcupdate.h 2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-srcu-LKML-5/include/linux/rcupdate.h 2006-07-10 09:48:51.000000000 -0700
@@ -251,6 +251,9 @@ extern int rcu_needs_cpu(int cpu);
* guarantees that rcu_read_lock() sections will have completed.
* In "classic RCU", these two guarantees happen to be one and
* the same, but can differ in realtime RCU implementations.
+ *
+ * In addition, this primitive guarantees that every active CPU has
+ * executed at least one memory barrier before it returns.
*/
#define synchronize_sched() synchronize_rcu()
diff -urpNa -X dontdiff linux-2.6.17-srcu-LKML-4/kernel/rcupdate.c linux-2.6.17-srcu-LKML-5/kernel/rcupdate.c
--- linux-2.6.17-srcu-LKML-4/kernel/rcupdate.c 2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-srcu-LKML-5/kernel/rcupdate.c 2006-07-10 09:48:32.000000000 -0700
@@ -597,6 +597,9 @@ static void wakeme_after_rcu(struct rcu_
* sections are delimited by rcu_read_lock() and rcu_read_unlock(),
* and may be nested.
*
+ * This primitive also causes each active CPU to execute at least one
+ * memory barrier before it returns.
+ *
* If your read-side code is not protected by rcu_read_lock(), do -not-
* use synchronize_rcu().
*/
diff -urpNa -X dontdiff linux-2.6.17-srcu-LKML-4/kernel/srcu.c linux-2.6.17-srcu-LKML-5/kernel/srcu.c
--- linux-2.6.17-srcu-LKML-4/kernel/srcu.c 2006-07-06 16:50:23.000000000 -0700
+++ linux-2.6.17-srcu-LKML-5/kernel/srcu.c 2006-07-10 09:48:09.000000000 -0700
@@ -143,7 +143,8 @@ void srcu_read_unlock(struct srcu_struct
* Flip the completed counter, and wait for the old count to drain to zero.
* As with classic RCU, the updater must use some separate means of
* synchronizing concurrent updates. Can block; must be called from
- * process context.
+ * process context. Has the side-effect of forcing a memory barrier on
+ * each active CPU before returning.
*
* Note that it is illegal to call synchornize_srcu() from the corresponding
* SRCU read-side critical section; doing so will result in deadlock.
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH 1/2] srcu-3: RCU variant permitting read-side blocking
[not found] ` <44B29212.1070301@yahoo.com.au>
@ 2006-07-11 14:19 ` Paul E. McKenney
0 siblings, 0 replies; 5+ messages in thread
From: Paul E. McKenney @ 2006-07-11 14:19 UTC (permalink / raw)
To: Nick Piggin
Cc: Oleg Nesterov, linux-kernel, akpm, matthltc, dipankar, stern,
mingo, tytso, dvhltc, jes, dhowells
On Tue, Jul 11, 2006 at 03:44:50AM +1000, Nick Piggin wrote:
> Paul E. McKenney wrote:
> >On Mon, Jul 10, 2006 at 03:50:29AM +0400, Oleg Nesterov wrote:
>
> >>As I see it, 1) + 2) is NOT enough for synchronize_srcu() to be correct
> >>(the 2-nd and 3-rd synchronize_sched() calls). I think synchronize_sched()
> >>should also guarantee the completion of mem ops on all CPUs before return,
> >>not just mb() (which does not have any timing guaranties).
> >>
> >>Could you clarify this issue?
> >>
> >>(Again, I do not see any problems with the current RCU implementation).
> >
> >
> >However, this -does- seem to be to be a problem with the comment headers
> >and the documentation. Does the following patch make things better?
> >
> >David, would it be worthwhile adding this global-memory-barrier effect
> >of synchronize_rcu(), synchronize_sched(), and synchronize_srcu() to
>
> And call_rcu? (or is that already tucked away in the documentation
> somewhere?) ie. there is a memory barrier between the call_rcu() call
> and the actual callback.
>
> This is something I needed clarification with (as you might remember),
> which might not be clear from an RCU API user's point of view.
Good point -- since synchronize_rcu() is just a wrapper around call_rcu(),
they do have the same properties. How about the following?
Thanx, Paul
Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com>
Documentation/RCU/checklist.txt | 4 +++-
include/linux/srcu.h | 5 +++++
kernel/rcupdate.c | 4 ++++
3 files changed, 12 insertions(+), 1 deletion(-)
diff -urpNa -X dontdiff linux-2.6.17-srcu-LKML-5/Documentation/RCU/checklist.txt linux-2.6.17-srcu-LKML-6/Documentation/RCU/checklist.txt
--- linux-2.6.17-srcu-LKML-5/Documentation/RCU/checklist.txt 2006-07-10 09:43:19.000000000 -0700
+++ linux-2.6.17-srcu-LKML-6/Documentation/RCU/checklist.txt 2006-07-11 07:12:25.000000000 -0700
@@ -224,4 +224,6 @@ over a rather long period of time, but i
14. The synchronize_rcu(), synchronize_sched(), and synchronize_srcu()
primitives force at least one memory barrier to be executed on
- each active CPU before they return.
+ each active CPU before they return. Similarly, call_rcu()
+ forces at least one memory barrier to be executed on each active
+ CPU before the corresponding callback is invoked.
diff -urpNa -X dontdiff linux-2.6.17-srcu-LKML-5/kernel/rcupdate.c linux-2.6.17-srcu-LKML-6/kernel/rcupdate.c
--- linux-2.6.17-srcu-LKML-5/kernel/rcupdate.c 2006-07-10 09:48:32.000000000 -0700
+++ linux-2.6.17-srcu-LKML-6/kernel/rcupdate.c 2006-07-11 07:11:07.000000000 -0700
@@ -116,6 +116,10 @@ static inline void force_quiescent_state
* read-side critical sections have completed. RCU read-side critical
* sections are delimited by rcu_read_lock() and rcu_read_unlock(),
* and may be nested.
+ *
+ * There will be at least one memory barrier executed on each active
+ * CPU between the time call_rcu() is invoked and the time that the
+ * corresponding callback is invoked.
*/
void fastcall call_rcu(struct rcu_head *head,
void (*func)(struct rcu_head *rcu))
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2006-07-11 14:18 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-07-06 17:14 [PATCH 0/2] srcu-3: add RCU variant that permits read-side blocking Paul E. McKenney
2006-07-06 17:20 ` [PATCH 1/2] srcu-3: RCU variant permitting " Paul E. McKenney
[not found] ` <20060709235029.GA194@oleg>
2006-07-10 16:51 ` Paul E. McKenney
[not found] ` <44B29212.1070301@yahoo.com.au>
2006-07-11 14:19 ` Paul E. McKenney
2006-07-06 17:25 ` [PATCH 2/2] srcu-3: add SRCU operations to rcutorture Paul E. McKenney
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.