* [PATCH tip/core/rcu 0/9] rcu: v3 add lockdep-based diagnostics to rcu_dereference()
@ 2010-01-15 1:01 Paul E. McKenney
2010-01-15 1:02 ` [PATCH tip/core/rcu 1/9] rcu: introduce lockdep-based checking to RCU read-side primitives Paul E. McKenney
` (8 more replies)
0 siblings, 9 replies; 12+ messages in thread
From: Paul E. McKenney @ 2010-01-15 1:01 UTC (permalink / raw)
To: linux-kernel
Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, dvhltc,
niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells
Hello!
This patch series adds lockdep-based checking to the rcu_dereference()
primitive in order to flag misuses of RCU. The first three patches put
the RCU infrastructure in place, the next five use this infrastructure in
the net, sched, vfs, radix-tree, and idr subsystems, and the last patch
documents how to use the new lockdep-checked rcu_dereference() primitives.
The new (and default-disabled) CONFIG_PROVE_RCU makes this safe to include
in tip/core/rcu -- unlike v2, your system won't complain about RCU misuse
unless you ask it to. There are still several situations that trigger
on systems I have access to (and I have email out on a couple of them),
some of the remaining will be triggered only by hardware I don't have
access to and kernel features I am unfamiliar with. So this is ready
for some more-general testing.
Changes since v2 (http://lkml.org/lkml/2010/1/4/496):
o Removed references to the check condition for the non-lockdep
version of rcu_dereference_check(), thus preventing numerous
build errors.
o Create a CONFIG_PROVE_RCU (default disabled) so that developers
who want to use lockdep only to debug locks aren't bothered
by RCU-specific lockdep complaints that will no doubt persist
for a little bit while some of the more obscure uses of RCU
are reverse-engineered.
o Fix a few more issues identified by the checking.
o Add documentation in Documentation/RCU/lockdep.txt.
Changes since v1 (http://lkml.org/lkml/2009/12/15/553):
o Made SRCU have per-srcu_struct lockdep maps to prevent lockdep
from conflating independent uses of SRCU, and updated the
rcu_dereference() checking in rcutorture to match.
o Fix a few issues identified by the checking.
o Disable checking within RCU list macros, RCU radix trees, and
IDR.
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH tip/core/rcu 1/9] rcu: introduce lockdep-based checking to RCU read-side primitives
2010-01-15 1:01 [PATCH tip/core/rcu 0/9] rcu: v3 add lockdep-based diagnostics to rcu_dereference() Paul E. McKenney
@ 2010-01-15 1:02 ` Paul E. McKenney
2010-01-15 1:02 ` [PATCH tip/core/rcu 2/9] rcu: add lockdep-enabled variants of rcu_dereference() Paul E. McKenney
` (7 subsequent siblings)
8 siblings, 0 replies; 12+ messages in thread
From: Paul E. McKenney @ 2010-01-15 1:02 UTC (permalink / raw)
To: linux-kernel
Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, dvhltc,
niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
Paul E. McKenney
Inspection is proving insufficient to catch all RCU misuses, which is
understandable given that rcu_dereference() might be protected by any
of four different flavors of RCU (RCU, RCU-bh, RCU-sched, and SRCU), and
might also/instead be protected by any of a number of locking primitives.
It is therefore time to enlist the aid of lockdep.
This set of patches is inspired by earlier work by Peter Zijlstra and
Thomas Gleixner, and takes the following approach:
o Set up separate lockdep classes for RCU, RCU-bh, and RCU-sched.
o Set up separate lockdep classes for each instance of SRCU.
o Create primitives that check for being in an RCU read-side
critical section. These return exact answers if lockdep is
fully enabled, but if unsure, report being in an RCU read-side
critical section. (We want to avoid false positives!)
The primitives are:
For RCU: rcu_read_lock_held(void)
For RCU-bh: rcu_read_lock_bh_held(void)
For RCU-sched: rcu_read_lock_sched_held(void)
For SRCU: srcu_read_lock_held(struct srcu_struct *sp)
o Add rcu_dereference_check(), which takes a second argument
in which one places a boolean expression based on the above
primitives and/or lockdep_is_held().
o A new kernel configuration parameter, CONFIG_PROVE_RCU, enables
rcu_dereference_check(). This depends on CONFIG_PROVE_LOCKING,
and should be quite helpful during the transition period while
CONFIG_PROVE_RCU-unaware patches are in flight.
The existing rcu_dereference() primitive does no checking, but upcoming
patches will change that.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
include/linux/rcupdate.h | 126 ++++++++++++++++++++++++++++++++++++++++++----
include/linux/srcu.h | 87 +++++++++++++++++++++++++++++++-
kernel/rcupdate.c | 10 ++++
kernel/rcutorture.c | 12 ++++-
kernel/srcu.c | 50 ++++++++++++------
lib/Kconfig.debug | 12 ++++
lib/debug_locks.c | 1 +
7 files changed, 267 insertions(+), 31 deletions(-)
diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 24440f4..e3d37ef 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -78,14 +78,120 @@ extern void rcu_init(void);
} while (0)
#ifdef CONFIG_DEBUG_LOCK_ALLOC
+
extern struct lockdep_map rcu_lock_map;
-# define rcu_read_acquire() \
- lock_acquire(&rcu_lock_map, 0, 0, 2, 1, NULL, _THIS_IP_)
+# define rcu_read_acquire() \
+ lock_acquire(&rcu_lock_map, 0, 0, 2, 1, NULL, _THIS_IP_)
# define rcu_read_release() lock_release(&rcu_lock_map, 1, _THIS_IP_)
-#else
-# define rcu_read_acquire() do { } while (0)
-# define rcu_read_release() do { } while (0)
-#endif
+
+extern struct lockdep_map rcu_bh_lock_map;
+# define rcu_read_acquire_bh() \
+ lock_acquire(&rcu_bh_lock_map, 0, 0, 2, 1, NULL, _THIS_IP_)
+# define rcu_read_release_bh() lock_release(&rcu_bh_lock_map, 1, _THIS_IP_)
+
+extern struct lockdep_map rcu_sched_lock_map;
+# define rcu_read_acquire_sched() \
+ lock_acquire(&rcu_sched_lock_map, 0, 0, 2, 1, NULL, _THIS_IP_)
+# define rcu_read_release_sched() \
+ lock_release(&rcu_sched_lock_map, 1, _THIS_IP_)
+
+/**
+ * rcu_read_lock_held - might we be in RCU read-side critical section?
+ *
+ * If CONFIG_PROVE_LOCKING is selected and enabled, returns nonzero iff in
+ * an RCU read-side critical section. In absence of CONFIG_PROVE_LOCKING,
+ * this assumes we are in an RCU read-side critical section unless it can
+ * prove otherwise.
+ */
+static inline int rcu_read_lock_held(void)
+{
+ if (debug_locks)
+ return lock_is_held(&rcu_lock_map);
+ return 1;
+}
+
+/**
+ * rcu_read_lock_bh_held - might we be in RCU-bh read-side critical section?
+ *
+ * If CONFIG_PROVE_LOCKING is selected and enabled, returns nonzero iff in
+ * an RCU-bh read-side critical section. In absence of CONFIG_PROVE_LOCKING,
+ * this assumes we are in an RCU-bh read-side critical section unless it can
+ * prove otherwise.
+ */
+static inline int rcu_read_lock_bh_held(void)
+{
+ if (debug_locks)
+ return lock_is_held(&rcu_bh_lock_map);
+ return 1;
+}
+
+/**
+ * rcu_read_lock_sched_held - might we be in RCU-sched read-side critical section?
+ *
+ * If CONFIG_PROVE_LOCKING is selected and enabled, returns nonzero iff in an
+ * RCU-sched read-side critical section. In absence of CONFIG_PROVE_LOCKING,
+ * this assumes we are in an RCU-sched read-side critical section unless it
+ * can prove otherwise. Note that disabling of preemption (including
+ * disabling irqs) counts as an RCU-sched read-side critical section.
+ */
+static inline int rcu_read_lock_sched_held(void)
+{
+ int lockdep_opinion = 0;
+
+ if (debug_locks)
+ lockdep_opinion = lock_is_held(&rcu_sched_lock_map);
+ return lockdep_opinion || preempt_count() != 0;
+}
+
+#else /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
+
+# define rcu_read_acquire() do { } while (0)
+# define rcu_read_release() do { } while (0)
+# define rcu_read_acquire_bh() do { } while (0)
+# define rcu_read_release_bh() do { } while (0)
+# define rcu_read_acquire_sched() do { } while (0)
+# define rcu_read_release_sched() do { } while (0)
+
+static inline int rcu_read_lock_held(void)
+{
+ return 1;
+}
+
+static inline int rcu_read_lock_bh_held(void)
+{
+ return 1;
+}
+
+static inline int rcu_read_lock_sched_held(void)
+{
+ return preempt_count() != 0;
+}
+
+#endif /* #else #ifdef CONFIG_DEBUG_LOCK_ALLOC */
+
+#ifdef CONFIG_PROVE_RCU
+
+/**
+ * rcu_dereference_check - rcu_dereference with debug checking
+ *
+ * Do an rcu_dereference(), but check that the context is correct.
+ * For example, rcu_dereference_check(gp, rcu_read_lock_held()) to
+ * ensure that the rcu_dereference_check() executes within an RCU
+ * read-side critical section. It is also possible to check for
+ * locks being held, for example, by using lockdep_is_held().
+ */
+#define rcu_dereference_check(p, c) \
+ ({ \
+ if (debug_locks) \
+ WARN_ON_ONCE(!(c)); \
+ rcu_dereference(p); \
+ })
+
+#else /* #ifdef CONFIG_PROVE_RCU */
+
+#define rcu_dereference_check(p, c) rcu_dereference(p)
+
+#endif /* #else #ifdef CONFIG_PROVE_RCU */
/**
* rcu_read_lock - mark the beginning of an RCU read-side critical section.
@@ -160,7 +266,7 @@ static inline void rcu_read_lock_bh(void)
{
__rcu_read_lock_bh();
__acquire(RCU_BH);
- rcu_read_acquire();
+ rcu_read_acquire_bh();
}
/*
@@ -170,7 +276,7 @@ static inline void rcu_read_lock_bh(void)
*/
static inline void rcu_read_unlock_bh(void)
{
- rcu_read_release();
+ rcu_read_release_bh();
__release(RCU_BH);
__rcu_read_unlock_bh();
}
@@ -188,7 +294,7 @@ static inline void rcu_read_lock_sched(void)
{
preempt_disable();
__acquire(RCU_SCHED);
- rcu_read_acquire();
+ rcu_read_acquire_sched();
}
/* Used by lockdep and tracing: cannot be traced, cannot call lockdep. */
@@ -205,7 +311,7 @@ static inline notrace void rcu_read_lock_sched_notrace(void)
*/
static inline void rcu_read_unlock_sched(void)
{
- rcu_read_release();
+ rcu_read_release_sched();
__release(RCU_SCHED);
preempt_enable();
}
diff --git a/include/linux/srcu.h b/include/linux/srcu.h
index 4765d97..adbe167 100644
--- a/include/linux/srcu.h
+++ b/include/linux/srcu.h
@@ -35,6 +35,9 @@ struct srcu_struct {
int completed;
struct srcu_struct_array *per_cpu_ref;
struct mutex mutex;
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+ struct lockdep_map dep_map;
+#endif /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
};
#ifndef CONFIG_PREEMPT
@@ -43,12 +46,92 @@ struct srcu_struct {
#define srcu_barrier()
#endif /* #else #ifndef CONFIG_PREEMPT */
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+
+int __init_srcu_struct(struct srcu_struct *sp, const char *name,
+ struct lock_class_key *key);
+
+#define init_srcu_struct(sp) \
+({ \
+ static struct lock_class_key __srcu_key; \
+ \
+ __init_srcu_struct((sp), #sp, &__srcu_key); \
+})
+
+# define srcu_read_acquire(sp) \
+ lock_acquire(&(sp)->dep_map, 0, 0, 2, 1, NULL, _THIS_IP_)
+# define srcu_read_release(sp) \
+ lock_release(&(sp)->dep_map, 1, _THIS_IP_)
+
+#else /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
+
int init_srcu_struct(struct srcu_struct *sp);
+
+# define srcu_read_acquire(sp) do { } while (0)
+# define srcu_read_release(sp) do { } while (0)
+
+#endif /* #else #ifdef CONFIG_DEBUG_LOCK_ALLOC */
+
void cleanup_srcu_struct(struct srcu_struct *sp);
-int srcu_read_lock(struct srcu_struct *sp) __acquires(sp);
-void srcu_read_unlock(struct srcu_struct *sp, int idx) __releases(sp);
+int __srcu_read_lock(struct srcu_struct *sp) __acquires(sp);
+void __srcu_read_unlock(struct srcu_struct *sp, int idx) __releases(sp);
void synchronize_srcu(struct srcu_struct *sp);
void synchronize_srcu_expedited(struct srcu_struct *sp);
long srcu_batches_completed(struct srcu_struct *sp);
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+
+/**
+ * srcu_read_lock_held - might we be in SRCU read-side critical section?
+ *
+ * If CONFIG_PROVE_LOCKING is selected and enabled, returns nonzero iff in
+ * an SRCU read-side critical section. In absence of CONFIG_PROVE_LOCKING,
+ * this assumes we are in an SRCU read-side critical section unless it can
+ * prove otherwise.
+ */
+static inline int srcu_read_lock_held(struct srcu_struct *sp)
+{
+ if (debug_locks)
+ return lock_is_held(&sp->dep_map);
+ return 1;
+}
+
+#else /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
+
+static inline int srcu_read_lock_held(struct srcu_struct *sp)
+{
+ return 1;
+}
+
+#endif /* #else #ifdef CONFIG_DEBUG_LOCK_ALLOC */
+
+/**
+ * srcu_read_lock - register a new reader for an SRCU-protected structure.
+ * @sp: srcu_struct in which to register the new reader.
+ *
+ * Enter an SRCU read-side critical section. Note that SRCU read-side
+ * critical sections may be nested.
+ */
+static inline int srcu_read_lock(struct srcu_struct *sp) __acquires(sp)
+{
+ int retval = __srcu_read_lock(sp);
+
+ srcu_read_acquire(sp);
+ return retval;
+}
+
+/**
+ * srcu_read_unlock - unregister a old reader from an SRCU-protected structure.
+ * @sp: srcu_struct in which to unregister the old reader.
+ * @idx: return value from corresponding srcu_read_lock().
+ *
+ * Exit an SRCU read-side critical section.
+ */
+static inline void srcu_read_unlock(struct srcu_struct *sp, int idx)
+ __releases(sp)
+{
+ srcu_read_release(sp);
+ __srcu_read_unlock(sp, idx);
+}
+
#endif
diff --git a/kernel/rcupdate.c b/kernel/rcupdate.c
index 9b7fd47..033cb55 100644
--- a/kernel/rcupdate.c
+++ b/kernel/rcupdate.c
@@ -50,6 +50,16 @@ static struct lock_class_key rcu_lock_key;
struct lockdep_map rcu_lock_map =
STATIC_LOCKDEP_MAP_INIT("rcu_read_lock", &rcu_lock_key);
EXPORT_SYMBOL_GPL(rcu_lock_map);
+
+static struct lock_class_key rcu_bh_lock_key;
+struct lockdep_map rcu_bh_lock_map =
+ STATIC_LOCKDEP_MAP_INIT("rcu_read_lock_bh", &rcu_bh_lock_key);
+EXPORT_SYMBOL_GPL(rcu_bh_lock_map);
+
+static struct lock_class_key rcu_sched_lock_key;
+struct lockdep_map rcu_sched_lock_map =
+ STATIC_LOCKDEP_MAP_INIT("rcu_read_lock_sched", &rcu_sched_lock_key);
+EXPORT_SYMBOL_GPL(rcu_sched_lock_map);
#endif
/*
diff --git a/kernel/rcutorture.c b/kernel/rcutorture.c
index adda92b..5f43f30 100644
--- a/kernel/rcutorture.c
+++ b/kernel/rcutorture.c
@@ -796,7 +796,11 @@ static void rcu_torture_timer(unsigned long unused)
idx = cur_ops->readlock();
completed = cur_ops->completed();
- p = rcu_dereference(rcu_torture_current);
+ p = rcu_dereference_check(rcu_torture_current,
+ rcu_read_lock_held() ||
+ rcu_read_lock_bh_held() ||
+ rcu_read_lock_sched_held() ||
+ srcu_read_lock_held(&srcu_ctl));
if (p == NULL) {
/* Leave because rcu_torture_writer is not yet underway */
cur_ops->readunlock(idx);
@@ -853,7 +857,11 @@ rcu_torture_reader(void *arg)
}
idx = cur_ops->readlock();
completed = cur_ops->completed();
- p = rcu_dereference(rcu_torture_current);
+ p = rcu_dereference_check(rcu_torture_current,
+ rcu_read_lock_held() ||
+ rcu_read_lock_bh_held() ||
+ rcu_read_lock_sched_held() ||
+ srcu_read_lock_held(&srcu_ctl));
if (p == NULL) {
/* Wait for rcu_torture_writer to get underway */
cur_ops->readunlock(idx);
diff --git a/kernel/srcu.c b/kernel/srcu.c
index 31b275b..140b7ba 100644
--- a/kernel/srcu.c
+++ b/kernel/srcu.c
@@ -34,6 +34,30 @@
#include <linux/smp.h>
#include <linux/srcu.h>
+static int init_srcu_struct_fields(struct srcu_struct *sp)
+{
+ sp->completed = 0;
+ mutex_init(&sp->mutex);
+ sp->per_cpu_ref = alloc_percpu(struct srcu_struct_array);
+ return (sp->per_cpu_ref ? 0 : -ENOMEM);
+}
+
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+
+int __init_srcu_struct(struct srcu_struct *sp, const char *name,
+ struct lock_class_key *key)
+{
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+ /* Don't re-initialize a lock while it is held. */
+ debug_check_no_locks_freed((void *)sp, sizeof(*sp));
+ lockdep_init_map(&sp->dep_map, name, key, 0);
+#endif /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
+ return (init_srcu_struct_fields(sp));
+}
+EXPORT_SYMBOL_GPL(__init_srcu_struct);
+
+#else /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
+
/**
* init_srcu_struct - initialize a sleep-RCU structure
* @sp: structure to initialize.
@@ -44,13 +68,12 @@
*/
int init_srcu_struct(struct srcu_struct *sp)
{
- sp->completed = 0;
- mutex_init(&sp->mutex);
- sp->per_cpu_ref = alloc_percpu(struct srcu_struct_array);
- return (sp->per_cpu_ref ? 0 : -ENOMEM);
+ return (init_srcu_struct_fields(sp));
}
EXPORT_SYMBOL_GPL(init_srcu_struct);
+#endif /* #else #ifdef CONFIG_DEBUG_LOCK_ALLOC */
+
/*
* srcu_readers_active_idx -- returns approximate number of readers
* active on the specified rank of per-CPU counters.
@@ -100,15 +123,12 @@ void cleanup_srcu_struct(struct srcu_struct *sp)
}
EXPORT_SYMBOL_GPL(cleanup_srcu_struct);
-/**
- * srcu_read_lock - register a new reader for an SRCU-protected structure.
- * @sp: srcu_struct in which to register the new reader.
- *
+/*
* Counts the new reader in the appropriate per-CPU element of the
* srcu_struct. Must be called from process context.
* Returns an index that must be passed to the matching srcu_read_unlock().
*/
-int srcu_read_lock(struct srcu_struct *sp)
+int __srcu_read_lock(struct srcu_struct *sp)
{
int idx;
@@ -120,26 +140,22 @@ int srcu_read_lock(struct srcu_struct *sp)
preempt_enable();
return idx;
}
-EXPORT_SYMBOL_GPL(srcu_read_lock);
+EXPORT_SYMBOL_GPL(__srcu_read_lock);
-/**
- * srcu_read_unlock - unregister a old reader from an SRCU-protected structure.
- * @sp: srcu_struct in which to unregister the old reader.
- * @idx: return value from corresponding srcu_read_lock().
- *
+/*
* Removes the count for the old reader from the appropriate per-CPU
* element of the srcu_struct. Note that this may well be a different
* CPU than that which was incremented by the corresponding srcu_read_lock().
* Must be called from process context.
*/
-void srcu_read_unlock(struct srcu_struct *sp, int idx)
+void __srcu_read_unlock(struct srcu_struct *sp, int idx)
{
preempt_disable();
srcu_barrier(); /* ensure compiler won't misorder critical section. */
per_cpu_ptr(sp->per_cpu_ref, smp_processor_id())->c[idx]--;
preempt_enable();
}
-EXPORT_SYMBOL_GPL(srcu_read_unlock);
+EXPORT_SYMBOL_GPL(__srcu_read_unlock);
/*
* Helper function for synchronize_srcu() and synchronize_srcu_expedited().
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 6bf97d1..6af20a8 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -499,6 +499,18 @@ config PROVE_LOCKING
For more details, see Documentation/lockdep-design.txt.
+config PROVE_RCU
+ bool "RCU debugging: prove RCU correctness"
+ depends on PROVE_LOCKING
+ default n
+ help
+ This feature enables lockdep extensions that check for correct
+ use of RCU APIs. This is currently under development. Say Y
+ if you want to debug RCU usage or help work on the PROVE_RCU
+ feature.
+
+ Say N if you are unsure.
+
config LOCKDEP
bool
depends on DEBUG_KERNEL && TRACE_IRQFLAGS_SUPPORT && STACKTRACE_SUPPORT && LOCKDEP_SUPPORT
diff --git a/lib/debug_locks.c b/lib/debug_locks.c
index bc3b117..5bf0020 100644
--- a/lib/debug_locks.c
+++ b/lib/debug_locks.c
@@ -23,6 +23,7 @@
* shut up after that.
*/
int debug_locks = 1;
+EXPORT_SYMBOL_GPL(debug_locks);
/*
* The locking-testsuite uses <debug_locks_silent> to get a
--
1.5.2.5
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH tip/core/rcu 2/9] rcu: add lockdep-enabled variants of rcu_dereference()
2010-01-15 1:01 [PATCH tip/core/rcu 0/9] rcu: v3 add lockdep-based diagnostics to rcu_dereference() Paul E. McKenney
2010-01-15 1:02 ` [PATCH tip/core/rcu 1/9] rcu: introduce lockdep-based checking to RCU read-side primitives Paul E. McKenney
@ 2010-01-15 1:02 ` Paul E. McKenney
2010-01-15 1:02 ` [PATCH tip/core/rcu 3/9] rcu: disable lockdep checking in RCU list-traversal primitives Paul E. McKenney
` (6 subsequent siblings)
8 siblings, 0 replies; 12+ messages in thread
From: Paul E. McKenney @ 2010-01-15 1:02 UTC (permalink / raw)
To: linux-kernel
Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, dvhltc,
niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
Paul E. McKenney
From: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Make rcu_dereference() check for being in an RCU read-side critical
section, and create rcu_dereference_bh(), rcu_dereference_sched(), and
srcu_dereference() to check for the other flavors of RCU. Also create
rcu_dereference_raw() to avoid checking, and make rcu_dereference_check()
use rcu_dereference_raw().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
include/linux/rcupdate.h | 41 ++++++++++++++++++++++++++++++++++-------
include/linux/srcu.h | 8 ++++++++
2 files changed, 42 insertions(+), 7 deletions(-)
diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index e3d37ef..839d296 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -184,12 +184,12 @@ static inline int rcu_read_lock_sched_held(void)
({ \
if (debug_locks) \
WARN_ON_ONCE(!(c)); \
- rcu_dereference(p); \
+ rcu_dereference_raw(p); \
})
#else /* #ifdef CONFIG_PROVE_RCU */
-#define rcu_dereference_check(p, c) rcu_dereference(p)
+#define rcu_dereference_check(p, c) rcu_dereference_raw(p)
#endif /* #else #ifdef CONFIG_PROVE_RCU */
@@ -325,22 +325,49 @@ static inline notrace void rcu_read_unlock_sched_notrace(void)
/**
- * rcu_dereference - fetch an RCU-protected pointer in an
- * RCU read-side critical section. This pointer may later
- * be safely dereferenced.
+ * rcu_dereference_raw - fetch an RCU-protected pointer
+ *
+ * The caller must be within some flavor of RCU read-side critical
+ * section, or must be otherwise preventing the pointer from changing,
+ * for example, by holding an appropriate lock. This pointer may later
+ * be safely dereferenced. It is the caller's responsibility to have
+ * done the right thing, as this primitive does no checking of any kind.
*
* Inserts memory barriers on architectures that require them
* (currently only the Alpha), and, more importantly, documents
* exactly which pointers are protected by RCU.
*/
-
-#define rcu_dereference(p) ({ \
+#define rcu_dereference_raw(p) ({ \
typeof(p) _________p1 = ACCESS_ONCE(p); \
smp_read_barrier_depends(); \
(_________p1); \
})
/**
+ * rcu_dereference - fetch an RCU-protected pointer, checking for RCU
+ *
+ * Makes rcu_dereference_check() do the dirty work.
+ */
+#define rcu_dereference(p) \
+ rcu_dereference_check(p, rcu_read_lock_held())
+
+/**
+ * rcu_dereference_bh - fetch an RCU-protected pointer, checking for RCU-bh
+ *
+ * Makes rcu_dereference_check() do the dirty work.
+ */
+#define rcu_dereference_bh(p) \
+ rcu_dereference_check(p, rcu_read_lock_bh_held())
+
+/**
+ * rcu_dereference_sched - fetch RCU-protected pointer, checking for RCU-sched
+ *
+ * Makes rcu_dereference_check() do the dirty work.
+ */
+#define rcu_dereference_sched(p) \
+ rcu_dereference_check(p, rcu_read_lock_sched_held())
+
+/**
* rcu_assign_pointer - assign (publicize) a pointer to a newly
* initialized structure that will be dereferenced by RCU read-side
* critical sections. Returns the value assigned.
diff --git a/include/linux/srcu.h b/include/linux/srcu.h
index adbe167..3084f80 100644
--- a/include/linux/srcu.h
+++ b/include/linux/srcu.h
@@ -106,6 +106,14 @@ static inline int srcu_read_lock_held(struct srcu_struct *sp)
#endif /* #else #ifdef CONFIG_DEBUG_LOCK_ALLOC */
/**
+ * srcu_dereference - fetch SRCU-protected pointer with checking
+ *
+ * Makes rcu_dereference_check() do the dirty work.
+ */
+#define srcu_dereference(p, sp) \
+ rcu_dereference_check(p, srcu_read_lock_held(sp))
+
+/**
* srcu_read_lock - register a new reader for an SRCU-protected structure.
* @sp: srcu_struct in which to register the new reader.
*
--
1.5.2.5
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH tip/core/rcu 3/9] rcu: disable lockdep checking in RCU list-traversal primitives
2010-01-15 1:01 [PATCH tip/core/rcu 0/9] rcu: v3 add lockdep-based diagnostics to rcu_dereference() Paul E. McKenney
2010-01-15 1:02 ` [PATCH tip/core/rcu 1/9] rcu: introduce lockdep-based checking to RCU read-side primitives Paul E. McKenney
2010-01-15 1:02 ` [PATCH tip/core/rcu 2/9] rcu: add lockdep-enabled variants of rcu_dereference() Paul E. McKenney
@ 2010-01-15 1:02 ` Paul E. McKenney
2010-01-15 1:02 ` [PATCH tip/core/rcu 4/9] net: add checking to rcu_dereference() primitives Paul E. McKenney
` (5 subsequent siblings)
8 siblings, 0 replies; 12+ messages in thread
From: Paul E. McKenney @ 2010-01-15 1:02 UTC (permalink / raw)
To: linux-kernel
Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, dvhltc,
niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
Paul E. McKenney
From: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The theory is that use of bare rcu_dereference() is more prone to error
than use of the RCU list-traversal primitives. Therefore, disable lockdep
RCU read-side critical-section checking in these primitives for the time
being. Once all of the rcu_dereference() uses have been dealt with, it
may be time to re-enable lockdep checking for the RCU list-traversal
primitives.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
include/linux/rculist.h | 14 +++++++-------
include/linux/rculist_nulls.h | 4 ++--
2 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/include/linux/rculist.h b/include/linux/rculist.h
index 1bf0f70..779d707 100644
--- a/include/linux/rculist.h
+++ b/include/linux/rculist.h
@@ -208,7 +208,7 @@ static inline void list_splice_init_rcu(struct list_head *list,
* primitives such as list_add_rcu() as long as it's guarded by rcu_read_lock().
*/
#define list_entry_rcu(ptr, type, member) \
- container_of(rcu_dereference(ptr), type, member)
+ container_of(rcu_dereference_raw(ptr), type, member)
/**
* list_first_entry_rcu - get the first element from a list
@@ -225,9 +225,9 @@ static inline void list_splice_init_rcu(struct list_head *list,
list_entry_rcu((ptr)->next, type, member)
#define __list_for_each_rcu(pos, head) \
- for (pos = rcu_dereference((head)->next); \
+ for (pos = rcu_dereference_raw((head)->next); \
pos != (head); \
- pos = rcu_dereference(pos->next))
+ pos = rcu_dereference_raw(pos->next))
/**
* list_for_each_entry_rcu - iterate over rcu list of given type
@@ -257,9 +257,9 @@ static inline void list_splice_init_rcu(struct list_head *list,
* as long as the traversal is guarded by rcu_read_lock().
*/
#define list_for_each_continue_rcu(pos, head) \
- for ((pos) = rcu_dereference((pos)->next); \
+ for ((pos) = rcu_dereference_raw((pos)->next); \
prefetch((pos)->next), (pos) != (head); \
- (pos) = rcu_dereference((pos)->next))
+ (pos) = rcu_dereference_raw((pos)->next))
/**
* list_for_each_entry_continue_rcu - continue iteration over list of given type
@@ -418,10 +418,10 @@ static inline void hlist_add_after_rcu(struct hlist_node *prev,
* as long as the traversal is guarded by rcu_read_lock().
*/
#define hlist_for_each_entry_rcu(tpos, pos, head, member) \
- for (pos = rcu_dereference((head)->first); \
+ for (pos = rcu_dereference_raw((head)->first); \
pos && ({ prefetch(pos->next); 1; }) && \
({ tpos = hlist_entry(pos, typeof(*tpos), member); 1; }); \
- pos = rcu_dereference(pos->next))
+ pos = rcu_dereference_raw(pos->next))
#endif /* __KERNEL__ */
#endif
diff --git a/include/linux/rculist_nulls.h b/include/linux/rculist_nulls.h
index 589a409..b70ffe5 100644
--- a/include/linux/rculist_nulls.h
+++ b/include/linux/rculist_nulls.h
@@ -101,10 +101,10 @@ static inline void hlist_nulls_add_head_rcu(struct hlist_nulls_node *n,
*
*/
#define hlist_nulls_for_each_entry_rcu(tpos, pos, head, member) \
- for (pos = rcu_dereference((head)->first); \
+ for (pos = rcu_dereference_raw((head)->first); \
(!is_a_nulls(pos)) && \
({ tpos = hlist_nulls_entry(pos, typeof(*tpos), member); 1; }); \
- pos = rcu_dereference(pos->next))
+ pos = rcu_dereference_raw(pos->next))
#endif
#endif
--
1.5.2.5
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH tip/core/rcu 4/9] net: add checking to rcu_dereference() primitives
2010-01-15 1:01 [PATCH tip/core/rcu 0/9] rcu: v3 add lockdep-based diagnostics to rcu_dereference() Paul E. McKenney
` (2 preceding siblings ...)
2010-01-15 1:02 ` [PATCH tip/core/rcu 3/9] rcu: disable lockdep checking in RCU list-traversal primitives Paul E. McKenney
@ 2010-01-15 1:02 ` Paul E. McKenney
2010-01-15 1:02 ` [PATCH tip/core/rcu 5/9] sched: use lockdep-based checking on rcu_dereference() Paul E. McKenney
` (4 subsequent siblings)
8 siblings, 0 replies; 12+ messages in thread
From: Paul E. McKenney @ 2010-01-15 1:02 UTC (permalink / raw)
To: linux-kernel
Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, dvhltc,
niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
Paul E. McKenney
From: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Update rcu_dereference() primitives to use new lockdep-based checking.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
net/core/dev.c | 2 +-
net/core/filter.c | 6 +++---
net/core/sock.c | 2 +-
net/decnet/dn_route.c | 14 +++++++-------
net/ipv4/route.c | 14 +++++++-------
net/packet/af_packet.c | 2 +-
6 files changed, 20 insertions(+), 20 deletions(-)
diff --git a/net/core/dev.c b/net/core/dev.c
index be9924f..0d0ff82 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2041,7 +2041,7 @@ gso:
rcu_read_lock_bh();
txq = dev_pick_tx(dev, skb);
- q = rcu_dereference(txq->qdisc);
+ q = rcu_dereference_bh(txq->qdisc);
#ifdef CONFIG_NET_CLS_ACT
skb->tc_verd = SET_TC_AT(skb->tc_verd, AT_EGRESS);
diff --git a/net/core/filter.c b/net/core/filter.c
index 08db7b9..3541aa4 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -86,7 +86,7 @@ int sk_filter(struct sock *sk, struct sk_buff *skb)
return err;
rcu_read_lock_bh();
- filter = rcu_dereference(sk->sk_filter);
+ filter = rcu_dereference_bh(sk->sk_filter);
if (filter) {
unsigned int pkt_len = sk_run_filter(skb, filter->insns,
filter->len);
@@ -521,7 +521,7 @@ int sk_attach_filter(struct sock_fprog *fprog, struct sock *sk)
}
rcu_read_lock_bh();
- old_fp = rcu_dereference(sk->sk_filter);
+ old_fp = rcu_dereference_bh(sk->sk_filter);
rcu_assign_pointer(sk->sk_filter, fp);
rcu_read_unlock_bh();
@@ -536,7 +536,7 @@ int sk_detach_filter(struct sock *sk)
struct sk_filter *filter;
rcu_read_lock_bh();
- filter = rcu_dereference(sk->sk_filter);
+ filter = rcu_dereference_bh(sk->sk_filter);
if (filter) {
rcu_assign_pointer(sk->sk_filter, NULL);
sk_filter_delayed_uncharge(sk, filter);
diff --git a/net/core/sock.c b/net/core/sock.c
index e1f6f22..9309cc9 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1073,7 +1073,7 @@ static void __sk_free(struct sock *sk)
if (sk->sk_destruct)
sk->sk_destruct(sk);
- filter = rcu_dereference(sk->sk_filter);
+ filter = rcu_dereference_bh(sk->sk_filter);
if (filter) {
sk_filter_uncharge(sk, filter);
rcu_assign_pointer(sk->sk_filter, NULL);
diff --git a/net/decnet/dn_route.c b/net/decnet/dn_route.c
index a032840..af96613 100644
--- a/net/decnet/dn_route.c
+++ b/net/decnet/dn_route.c
@@ -1155,8 +1155,8 @@ static int __dn_route_output_key(struct dst_entry **pprt, const struct flowi *fl
if (!(flags & MSG_TRYHARD)) {
rcu_read_lock_bh();
- for(rt = rcu_dereference(dn_rt_hash_table[hash].chain); rt;
- rt = rcu_dereference(rt->u.dst.dn_next)) {
+ for(rt = rcu_dereference_bh(dn_rt_hash_table[hash].chain); rt;
+ rt = rcu_dereference_bh(rt->u.dst.dn_next)) {
if ((flp->fld_dst == rt->fl.fld_dst) &&
(flp->fld_src == rt->fl.fld_src) &&
(flp->mark == rt->fl.mark) &&
@@ -1618,9 +1618,9 @@ int dn_cache_dump(struct sk_buff *skb, struct netlink_callback *cb)
if (h > s_h)
s_idx = 0;
rcu_read_lock_bh();
- for(rt = rcu_dereference(dn_rt_hash_table[h].chain), idx = 0;
+ for(rt = rcu_dereference_bh(dn_rt_hash_table[h].chain), idx = 0;
rt;
- rt = rcu_dereference(rt->u.dst.dn_next), idx++) {
+ rt = rcu_dereference_bh(rt->u.dst.dn_next), idx++) {
if (idx < s_idx)
continue;
skb_dst_set(skb, dst_clone(&rt->u.dst));
@@ -1654,12 +1654,12 @@ static struct dn_route *dn_rt_cache_get_first(struct seq_file *seq)
for(s->bucket = dn_rt_hash_mask; s->bucket >= 0; --s->bucket) {
rcu_read_lock_bh();
- rt = dn_rt_hash_table[s->bucket].chain;
+ rt = rcu_dereference_bh(dn_rt_hash_table[s->bucket].chain);
if (rt)
break;
rcu_read_unlock_bh();
}
- return rcu_dereference(rt);
+ return rt;
}
static struct dn_route *dn_rt_cache_get_next(struct seq_file *seq, struct dn_route *rt)
@@ -1674,7 +1674,7 @@ static struct dn_route *dn_rt_cache_get_next(struct seq_file *seq, struct dn_rou
rcu_read_lock_bh();
rt = dn_rt_hash_table[s->bucket].chain;
}
- return rcu_dereference(rt);
+ return rcu_dereference_bh(rt);
}
static void *dn_rt_cache_seq_start(struct seq_file *seq, loff_t *pos)
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index e446496..3476b3b 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -287,12 +287,12 @@ static struct rtable *rt_cache_get_first(struct seq_file *seq)
if (!rt_hash_table[st->bucket].chain)
continue;
rcu_read_lock_bh();
- r = rcu_dereference(rt_hash_table[st->bucket].chain);
+ r = rcu_dereference_bh(rt_hash_table[st->bucket].chain);
while (r) {
if (dev_net(r->u.dst.dev) == seq_file_net(seq) &&
r->rt_genid == st->genid)
return r;
- r = rcu_dereference(r->u.dst.rt_next);
+ r = rcu_dereference_bh(r->u.dst.rt_next);
}
rcu_read_unlock_bh();
}
@@ -314,7 +314,7 @@ static struct rtable *__rt_cache_get_next(struct seq_file *seq,
rcu_read_lock_bh();
r = rt_hash_table[st->bucket].chain;
}
- return rcu_dereference(r);
+ return rcu_dereference_bh(r);
}
static struct rtable *rt_cache_get_next(struct seq_file *seq,
@@ -2687,8 +2687,8 @@ int __ip_route_output_key(struct net *net, struct rtable **rp,
hash = rt_hash(flp->fl4_dst, flp->fl4_src, flp->oif, rt_genid(net));
rcu_read_lock_bh();
- for (rth = rcu_dereference(rt_hash_table[hash].chain); rth;
- rth = rcu_dereference(rth->u.dst.rt_next)) {
+ for (rth = rcu_dereference_bh(rt_hash_table[hash].chain); rth;
+ rth = rcu_dereference_bh(rth->u.dst.rt_next)) {
if (rth->fl.fl4_dst == flp->fl4_dst &&
rth->fl.fl4_src == flp->fl4_src &&
rth->fl.iif == 0 &&
@@ -3006,8 +3006,8 @@ int ip_rt_dump(struct sk_buff *skb, struct netlink_callback *cb)
if (!rt_hash_table[h].chain)
continue;
rcu_read_lock_bh();
- for (rt = rcu_dereference(rt_hash_table[h].chain), idx = 0; rt;
- rt = rcu_dereference(rt->u.dst.rt_next), idx++) {
+ for (rt = rcu_dereference_bh(rt_hash_table[h].chain), idx = 0; rt;
+ rt = rcu_dereference_bh(rt->u.dst.rt_next), idx++) {
if (!net_eq(dev_net(rt->u.dst.dev), net) || idx < s_idx)
continue;
if (rt_is_expired(rt))
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index f126d18..939471e 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -508,7 +508,7 @@ static inline unsigned int run_filter(struct sk_buff *skb, struct sock *sk,
struct sk_filter *filter;
rcu_read_lock_bh();
- filter = rcu_dereference(sk->sk_filter);
+ filter = rcu_dereference_bh(sk->sk_filter);
if (filter != NULL)
res = sk_run_filter(skb, filter->insns, filter->len);
rcu_read_unlock_bh();
--
1.5.2.5
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH tip/core/rcu 5/9] sched: use lockdep-based checking on rcu_dereference()
2010-01-15 1:01 [PATCH tip/core/rcu 0/9] rcu: v3 add lockdep-based diagnostics to rcu_dereference() Paul E. McKenney
` (3 preceding siblings ...)
2010-01-15 1:02 ` [PATCH tip/core/rcu 4/9] net: add checking to rcu_dereference() primitives Paul E. McKenney
@ 2010-01-15 1:02 ` Paul E. McKenney
2010-01-15 1:02 ` [PATCH tip/core/rcu 6/9] vfs: apply lockdep-based checking to rcu_dereference() uses Paul E. McKenney
` (3 subsequent siblings)
8 siblings, 0 replies; 12+ messages in thread
From: Paul E. McKenney @ 2010-01-15 1:02 UTC (permalink / raw)
To: linux-kernel
Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, dvhltc,
niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
Paul E. McKenney
From: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Update the rcu_dereference() usages to take advantage of the new
lockdep-based checking.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
include/linux/cgroup.h | 5 ++++-
include/linux/cred.h | 2 +-
init/main.c | 2 ++
kernel/cgroup.c | 12 ++++++++++++
kernel/exit.c | 14 +++++++++++---
kernel/fork.c | 1 +
kernel/notifier.c | 6 +++---
kernel/pid.c | 2 +-
kernel/sched.c | 11 ++++++++---
9 files changed, 43 insertions(+), 12 deletions(-)
diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 0008dee..c9bbcb2 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -28,6 +28,7 @@ struct css_id;
extern int cgroup_init_early(void);
extern int cgroup_init(void);
extern void cgroup_lock(void);
+extern int cgroup_lock_is_held(void);
extern bool cgroup_lock_live_group(struct cgroup *cgrp);
extern void cgroup_unlock(void);
extern void cgroup_fork(struct task_struct *p);
@@ -486,7 +487,9 @@ static inline struct cgroup_subsys_state *cgroup_subsys_state(
static inline struct cgroup_subsys_state *task_subsys_state(
struct task_struct *task, int subsys_id)
{
- return rcu_dereference(task->cgroups->subsys[subsys_id]);
+ return rcu_dereference_check(task->cgroups->subsys[subsys_id],
+ rcu_read_lock_held() ||
+ cgroup_lock_is_held());
}
static inline struct cgroup* task_cgroup(struct task_struct *task,
diff --git a/include/linux/cred.h b/include/linux/cred.h
index 4e3387a..4db09f8 100644
--- a/include/linux/cred.h
+++ b/include/linux/cred.h
@@ -280,7 +280,7 @@ static inline void put_cred(const struct cred *_cred)
* task or by holding tasklist_lock to prevent it from being unlinked.
*/
#define __task_cred(task) \
- ((const struct cred *)(rcu_dereference((task)->real_cred)))
+ ((const struct cred *)(rcu_dereference_check((task)->real_cred, rcu_read_lock_held() || lockdep_is_held(&tasklist_lock))))
/**
* get_task_cred - Get another task's objective credentials
diff --git a/init/main.c b/init/main.c
index dac44a9..ea6280c 100644
--- a/init/main.c
+++ b/init/main.c
@@ -416,7 +416,9 @@ static noinline void __init_refok rest_init(void)
kernel_thread(kernel_init, NULL, CLONE_FS | CLONE_SIGHAND);
numa_default_policy();
pid = kernel_thread(kthreadd, NULL, CLONE_FS | CLONE_FILES);
+ rcu_read_lock();
kthreadd_task = find_task_by_pid_ns(pid, &init_pid_ns);
+ rcu_read_unlock();
unlock_kernel();
/*
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 1fbcc74..1b1373c 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -166,6 +166,18 @@ static DEFINE_SPINLOCK(hierarchy_id_lock);
*/
static int need_forkexit_callback __read_mostly;
+#ifdef CONFIG_PROVE_LOCKING
+int cgroup_lock_is_held(void)
+{
+ return lockdep_is_held(&cgroup_mutex);
+}
+#else /* #ifdef CONFIG_PROVE_LOCKING */
+int cgroup_lock_is_held(void)
+{
+ return mutex_is_locked(&cgroup_mutex);
+}
+#endif /* #else #ifdef CONFIG_PROVE_LOCKING */
+
/* convenient tests for these bits */
inline int cgroup_is_removed(const struct cgroup *cgrp)
{
diff --git a/kernel/exit.c b/kernel/exit.c
index 546774a..45ed043 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -85,7 +85,9 @@ static void __exit_signal(struct task_struct *tsk)
BUG_ON(!sig);
BUG_ON(!atomic_read(&sig->count));
- sighand = rcu_dereference(tsk->sighand);
+ sighand = rcu_dereference_check(tsk->sighand,
+ rcu_read_lock_held() ||
+ lockdep_is_held(&tasklist_lock));
spin_lock(&sighand->siglock);
posix_cpu_timers_exit(tsk);
@@ -170,8 +172,10 @@ void release_task(struct task_struct * p)
repeat:
tracehook_prepare_release_task(p);
/* don't need to get the RCU readlock here - the process is dead and
- * can't be modifying its own credentials */
+ * can't be modifying its own credentials. But shut RCU-lockdep up */
+ rcu_read_lock();
atomic_dec(&__task_cred(p)->user->processes);
+ rcu_read_unlock();
proc_flush_task(p);
@@ -473,9 +477,11 @@ static void close_files(struct files_struct * files)
/*
* It is safe to dereference the fd table without RCU or
* ->file_lock because this is the last reference to the
- * files structure.
+ * files structure. But use RCU to shut RCU-lockdep up.
*/
+ rcu_read_lock();
fdt = files_fdtable(files);
+ rcu_read_unlock();
for (;;) {
unsigned long set;
i = j * __NFDBITS;
@@ -521,10 +527,12 @@ void put_files_struct(struct files_struct *files)
* at the end of the RCU grace period. Otherwise,
* you can free files immediately.
*/
+ rcu_read_lock();
fdt = files_fdtable(files);
if (fdt != &files->fdtab)
kmem_cache_free(files_cachep, files);
free_fdtable(fdt);
+ rcu_read_unlock();
}
}
diff --git a/kernel/fork.c b/kernel/fork.c
index 5b2959b..e01ec3e 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -86,6 +86,7 @@ int max_threads; /* tunable limit on nr_threads */
DEFINE_PER_CPU(unsigned long, process_counts) = 0;
__cacheline_aligned DEFINE_RWLOCK(tasklist_lock); /* outer */
+EXPORT_SYMBOL_GPL(tasklist_lock);
int nr_processes(void)
{
diff --git a/kernel/notifier.c b/kernel/notifier.c
index acd24e7..2488ba7 100644
--- a/kernel/notifier.c
+++ b/kernel/notifier.c
@@ -78,10 +78,10 @@ static int __kprobes notifier_call_chain(struct notifier_block **nl,
int ret = NOTIFY_DONE;
struct notifier_block *nb, *next_nb;
- nb = rcu_dereference(*nl);
+ nb = rcu_dereference_raw(*nl);
while (nb && nr_to_call) {
- next_nb = rcu_dereference(nb->next);
+ next_nb = rcu_dereference_raw(nb->next);
#ifdef CONFIG_DEBUG_NOTIFIERS
if (unlikely(!func_ptr_is_kernel_text(nb->notifier_call))) {
@@ -309,7 +309,7 @@ int __blocking_notifier_call_chain(struct blocking_notifier_head *nh,
* racy then it does not matter what the result of the test
* is, we re-check the list after having taken the lock anyway:
*/
- if (rcu_dereference(nh->head)) {
+ if (rcu_dereference_raw(nh->head)) {
down_read(&nh->rwsem);
ret = notifier_call_chain(&nh->head, val, v, nr_to_call,
nr_calls);
diff --git a/kernel/pid.c b/kernel/pid.c
index 2e17c9c..b08e697 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -367,7 +367,7 @@ struct task_struct *pid_task(struct pid *pid, enum pid_type type)
struct task_struct *result = NULL;
if (pid) {
struct hlist_node *first;
- first = rcu_dereference(pid->tasks[type].first);
+ first = rcu_dereference_check(pid->tasks[type].first, rcu_read_lock_held() || lockdep_is_held(&tasklist_lock));
if (first)
result = hlist_entry(first, struct task_struct, pids[(type)].node);
}
diff --git a/kernel/sched.c b/kernel/sched.c
index c535cc4..ad419d9 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -645,6 +645,11 @@ static inline int cpu_of(struct rq *rq)
#endif
}
+#define for_each_domain_rd(p) \
+ rcu_dereference_check((p), \
+ rcu_read_lock_sched_held() || \
+ lockdep_is_held(&sched_domains_mutex))
+
/*
* The domain tree (rq->sd) is protected by RCU's quiescent state transition.
* See detach_destroy_domains: synchronize_sched for details.
@@ -653,7 +658,7 @@ static inline int cpu_of(struct rq *rq)
* preempt-disabled sections.
*/
#define for_each_domain(cpu, __sd) \
- for (__sd = rcu_dereference(cpu_rq(cpu)->sd); __sd; __sd = __sd->parent)
+ for (__sd = for_each_domain_rd(cpu_rq(cpu)->sd); __sd; __sd = __sd->parent)
#define cpu_rq(cpu) (&per_cpu(runqueues, (cpu)))
#define this_rq() (&__get_cpu_var(runqueues))
@@ -1531,7 +1536,7 @@ static unsigned long target_load(int cpu, int type)
static struct sched_group *group_of(int cpu)
{
- struct sched_domain *sd = rcu_dereference(cpu_rq(cpu)->sd);
+ struct sched_domain *sd = rcu_dereference_sched(cpu_rq(cpu)->sd);
if (!sd)
return NULL;
@@ -4877,7 +4882,7 @@ static void run_rebalance_domains(struct softirq_action *h)
static inline int on_null_domain(int cpu)
{
- return !rcu_dereference(cpu_rq(cpu)->sd);
+ return !rcu_dereference_sched(cpu_rq(cpu)->sd);
}
/*
--
1.5.2.5
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH tip/core/rcu 6/9] vfs: apply lockdep-based checking to rcu_dereference() uses
2010-01-15 1:01 [PATCH tip/core/rcu 0/9] rcu: v3 add lockdep-based diagnostics to rcu_dereference() Paul E. McKenney
` (4 preceding siblings ...)
2010-01-15 1:02 ` [PATCH tip/core/rcu 5/9] sched: use lockdep-based checking on rcu_dereference() Paul E. McKenney
@ 2010-01-15 1:02 ` Paul E. McKenney
2010-01-15 1:02 ` [PATCH tip/core/rcu 7/9] radix-tree: disable RCU lockdep checking in radix tree Paul E. McKenney
` (2 subsequent siblings)
8 siblings, 0 replies; 12+ messages in thread
From: Paul E. McKenney @ 2010-01-15 1:02 UTC (permalink / raw)
To: linux-kernel
Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, dvhltc,
niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
Paul E. McKenney, Alexander Viro
From: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Add lockdep-ified RCU primitives to alloc_fd(), files_fdtable() and
fcheck_files().
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
fs/file.c | 2 +-
fs/proc/array.c | 2 ++
fs/proc/base.c | 6 +++++-
include/linux/fdtable.h | 8 ++++++--
4 files changed, 14 insertions(+), 4 deletions(-)
diff --git a/fs/file.c b/fs/file.c
index 87e1290..38039af 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -478,7 +478,7 @@ repeat:
error = fd;
#if 1
/* Sanity check */
- if (rcu_dereference(fdt->fd[fd]) != NULL) {
+ if (rcu_dereference_raw(fdt->fd[fd]) != NULL) {
printk(KERN_WARNING "alloc_fd: slot %d not NULL!\n", fd);
rcu_assign_pointer(fdt->fd[fd], NULL);
}
diff --git a/fs/proc/array.c b/fs/proc/array.c
index 13b5d07..18e20fe 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -270,7 +270,9 @@ static inline void task_sig(struct seq_file *m, struct task_struct *p)
blocked = p->blocked;
collect_sigign_sigcatch(p, &ignored, &caught);
num_threads = atomic_read(&p->signal->count);
+ rcu_read_lock(); /* FIXME: is this correct? */
qsize = atomic_read(&__task_cred(p)->user->sigpending);
+ rcu_read_unlock();
qlim = p->signal->rlim[RLIMIT_SIGPENDING].rlim_cur;
unlock_task_sighand(p, &flags);
}
diff --git a/fs/proc/base.c b/fs/proc/base.c
index 18d5cc6..db19ca8 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -1095,8 +1095,12 @@ static ssize_t proc_loginuid_write(struct file * file, const char __user * buf,
if (!capable(CAP_AUDIT_CONTROL))
return -EPERM;
- if (current != pid_task(proc_pid(inode), PIDTYPE_PID))
+ rcu_read_lock();
+ if (current != pid_task(proc_pid(inode), PIDTYPE_PID)) {
+ rcu_read_unlock();
return -EPERM;
+ }
+ rcu_read_unlock();
if (count >= PAGE_SIZE)
count = PAGE_SIZE - 1;
diff --git a/include/linux/fdtable.h b/include/linux/fdtable.h
index a2ec74b..144412f 100644
--- a/include/linux/fdtable.h
+++ b/include/linux/fdtable.h
@@ -57,7 +57,11 @@ struct files_struct {
struct file * fd_array[NR_OPEN_DEFAULT];
};
-#define files_fdtable(files) (rcu_dereference((files)->fdt))
+#define files_fdtable(files) \
+ (rcu_dereference_check((files)->fdt, \
+ rcu_read_lock_held() || \
+ lockdep_is_held(&(files)->file_lock) || \
+ atomic_read(&files->count) == 1))
struct file_operations;
struct vfsmount;
@@ -78,7 +82,7 @@ static inline struct file * fcheck_files(struct files_struct *files, unsigned in
struct fdtable *fdt = files_fdtable(files);
if (fd < fdt->max_fds)
- file = rcu_dereference(fdt->fd[fd]);
+ file = rcu_dereference_check(fdt->fd[fd], rcu_read_lock_held() || lockdep_is_held(&files->file_lock) || atomic_read(&files->count) == 1);
return file;
}
--
1.5.2.5
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH tip/core/rcu 7/9] radix-tree: disable RCU lockdep checking in radix tree
2010-01-15 1:01 [PATCH tip/core/rcu 0/9] rcu: v3 add lockdep-based diagnostics to rcu_dereference() Paul E. McKenney
` (5 preceding siblings ...)
2010-01-15 1:02 ` [PATCH tip/core/rcu 6/9] vfs: apply lockdep-based checking to rcu_dereference() uses Paul E. McKenney
@ 2010-01-15 1:02 ` Paul E. McKenney
2010-01-15 1:02 ` [PATCH tip/core/rcu 8/9] idr: apply lockdep-based diagnostics to rcu_dereference() uses Paul E. McKenney
2010-01-15 1:02 ` [PATCH tip/core/rcu 9/9] rcu: documentation update for CONFIG_PROVE_RCU Paul E. McKenney
8 siblings, 0 replies; 12+ messages in thread
From: Paul E. McKenney @ 2010-01-15 1:02 UTC (permalink / raw)
To: linux-kernel
Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, dvhltc,
niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
Paul E. McKenney
From: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Because the radix tree is used with many different locking designs, we
cannot do any effective checking without changing the radix-tree APIs.
It might make sense to do this later, but only if the RCU lockdep checking
proves itself sufficiently valuable.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
lib/radix-tree.c | 24 ++++++++++++------------
1 files changed, 12 insertions(+), 12 deletions(-)
diff --git a/lib/radix-tree.c b/lib/radix-tree.c
index 92cdd99..6b9670d 100644
--- a/lib/radix-tree.c
+++ b/lib/radix-tree.c
@@ -364,7 +364,7 @@ static void *radix_tree_lookup_element(struct radix_tree_root *root,
unsigned int height, shift;
struct radix_tree_node *node, **slot;
- node = rcu_dereference(root->rnode);
+ node = rcu_dereference_raw(root->rnode);
if (node == NULL)
return NULL;
@@ -384,7 +384,7 @@ static void *radix_tree_lookup_element(struct radix_tree_root *root,
do {
slot = (struct radix_tree_node **)
(node->slots + ((index>>shift) & RADIX_TREE_MAP_MASK));
- node = rcu_dereference(*slot);
+ node = rcu_dereference_raw(*slot);
if (node == NULL)
return NULL;
@@ -568,7 +568,7 @@ int radix_tree_tag_get(struct radix_tree_root *root,
if (!root_tag_get(root, tag))
return 0;
- node = rcu_dereference(root->rnode);
+ node = rcu_dereference_raw(root->rnode);
if (node == NULL)
return 0;
@@ -602,7 +602,7 @@ int radix_tree_tag_get(struct radix_tree_root *root,
BUG_ON(ret && saw_unset_tag);
return !!ret;
}
- node = rcu_dereference(node->slots[offset]);
+ node = rcu_dereference_raw(node->slots[offset]);
shift -= RADIX_TREE_MAP_SHIFT;
height--;
}
@@ -711,7 +711,7 @@ __lookup(struct radix_tree_node *slot, void ***results, unsigned long index,
}
shift -= RADIX_TREE_MAP_SHIFT;
- slot = rcu_dereference(slot->slots[i]);
+ slot = rcu_dereference_raw(slot->slots[i]);
if (slot == NULL)
goto out;
}
@@ -758,7 +758,7 @@ radix_tree_gang_lookup(struct radix_tree_root *root, void **results,
unsigned long cur_index = first_index;
unsigned int ret;
- node = rcu_dereference(root->rnode);
+ node = rcu_dereference_raw(root->rnode);
if (!node)
return 0;
@@ -787,7 +787,7 @@ radix_tree_gang_lookup(struct radix_tree_root *root, void **results,
slot = *(((void ***)results)[ret + i]);
if (!slot)
continue;
- results[ret + nr_found] = rcu_dereference(slot);
+ results[ret + nr_found] = rcu_dereference_raw(slot);
nr_found++;
}
ret += nr_found;
@@ -826,7 +826,7 @@ radix_tree_gang_lookup_slot(struct radix_tree_root *root, void ***results,
unsigned long cur_index = first_index;
unsigned int ret;
- node = rcu_dereference(root->rnode);
+ node = rcu_dereference_raw(root->rnode);
if (!node)
return 0;
@@ -915,7 +915,7 @@ __lookup_tag(struct radix_tree_node *slot, void ***results, unsigned long index,
}
}
shift -= RADIX_TREE_MAP_SHIFT;
- slot = rcu_dereference(slot->slots[i]);
+ slot = rcu_dereference_raw(slot->slots[i]);
if (slot == NULL)
break;
}
@@ -951,7 +951,7 @@ radix_tree_gang_lookup_tag(struct radix_tree_root *root, void **results,
if (!root_tag_get(root, tag))
return 0;
- node = rcu_dereference(root->rnode);
+ node = rcu_dereference_raw(root->rnode);
if (!node)
return 0;
@@ -980,7 +980,7 @@ radix_tree_gang_lookup_tag(struct radix_tree_root *root, void **results,
slot = *(((void ***)results)[ret + i]);
if (!slot)
continue;
- results[ret + nr_found] = rcu_dereference(slot);
+ results[ret + nr_found] = rcu_dereference_raw(slot);
nr_found++;
}
ret += nr_found;
@@ -1020,7 +1020,7 @@ radix_tree_gang_lookup_tag_slot(struct radix_tree_root *root, void ***results,
if (!root_tag_get(root, tag))
return 0;
- node = rcu_dereference(root->rnode);
+ node = rcu_dereference_raw(root->rnode);
if (!node)
return 0;
--
1.5.2.5
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH tip/core/rcu 8/9] idr: apply lockdep-based diagnostics to rcu_dereference() uses
2010-01-15 1:01 [PATCH tip/core/rcu 0/9] rcu: v3 add lockdep-based diagnostics to rcu_dereference() Paul E. McKenney
` (6 preceding siblings ...)
2010-01-15 1:02 ` [PATCH tip/core/rcu 7/9] radix-tree: disable RCU lockdep checking in radix tree Paul E. McKenney
@ 2010-01-15 1:02 ` Paul E. McKenney
2010-01-18 23:36 ` Josh Triplett
2010-01-15 1:02 ` [PATCH tip/core/rcu 9/9] rcu: documentation update for CONFIG_PROVE_RCU Paul E. McKenney
8 siblings, 1 reply; 12+ messages in thread
From: Paul E. McKenney @ 2010-01-15 1:02 UTC (permalink / raw)
To: linux-kernel
Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, dvhltc,
niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
Paul E. McKenney
From: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Because idr can be used with any of a number of locks or with any
flavor of RCU, just disable the lockdep-based diagnostics. If idr
needs diagnostics, the check expression will need to be passed into
the relevant idr primitives as an additional argument.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
lib/idr.c | 8 ++++----
1 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/lib/idr.c b/lib/idr.c
index 1cac726..4877fce 100644
--- a/lib/idr.c
+++ b/lib/idr.c
@@ -502,7 +502,7 @@ void *idr_find(struct idr *idp, int id)
int n;
struct idr_layer *p;
- p = rcu_dereference(idp->top);
+ p = rcu_dereference_raw(idp->top);
if (!p)
return NULL;
n = (p->layer+1) * IDR_BITS;
@@ -517,7 +517,7 @@ void *idr_find(struct idr *idp, int id)
while (n > 0 && p) {
n -= IDR_BITS;
BUG_ON(n != p->layer*IDR_BITS);
- p = rcu_dereference(p->ary[(id >> n) & IDR_MASK]);
+ p = rcu_dereference_raw(p->ary[(id >> n) & IDR_MASK]);
}
return((void *)p);
}
@@ -550,7 +550,7 @@ int idr_for_each(struct idr *idp,
struct idr_layer **paa = &pa[0];
n = idp->layers * IDR_BITS;
- p = rcu_dereference(idp->top);
+ p = rcu_dereference_raw(idp->top);
max = 1 << n;
id = 0;
@@ -558,7 +558,7 @@ int idr_for_each(struct idr *idp,
while (n > 0 && p) {
n -= IDR_BITS;
*paa++ = p;
- p = rcu_dereference(p->ary[(id >> n) & IDR_MASK]);
+ p = rcu_dereference_raw(p->ary[(id >> n) & IDR_MASK]);
}
if (p) {
--
1.5.2.5
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH tip/core/rcu 9/9] rcu: documentation update for CONFIG_PROVE_RCU
2010-01-15 1:01 [PATCH tip/core/rcu 0/9] rcu: v3 add lockdep-based diagnostics to rcu_dereference() Paul E. McKenney
` (7 preceding siblings ...)
2010-01-15 1:02 ` [PATCH tip/core/rcu 8/9] idr: apply lockdep-based diagnostics to rcu_dereference() uses Paul E. McKenney
@ 2010-01-15 1:02 ` Paul E. McKenney
8 siblings, 0 replies; 12+ messages in thread
From: Paul E. McKenney @ 2010-01-15 1:02 UTC (permalink / raw)
To: linux-kernel
Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, dvhltc,
niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
Paul E. McKenney
From: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Adds a lockdep.txt file and updates checklist.txt and whatisRCU.txt
to reflect the new lockdep-enabled capabilities of RCU.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
Documentation/RCU/00-INDEX | 2 +
Documentation/RCU/checklist.txt | 34 +++++++++++--------
Documentation/RCU/lockdep.txt | 67 +++++++++++++++++++++++++++++++++++++++
Documentation/RCU/whatisRCU.txt | 13 +++++---
4 files changed, 97 insertions(+), 19 deletions(-)
create mode 100644 Documentation/RCU/lockdep.txt
diff --git a/Documentation/RCU/00-INDEX b/Documentation/RCU/00-INDEX
index 0a27ea9..71b6f50 100644
--- a/Documentation/RCU/00-INDEX
+++ b/Documentation/RCU/00-INDEX
@@ -6,6 +6,8 @@ checklist.txt
- Review Checklist for RCU Patches
listRCU.txt
- Using RCU to Protect Read-Mostly Linked Lists
+lockdep.txt
+ - RCU and lockdep checking
NMI-RCU.txt
- Using RCU to Protect Dynamic NMI Handlers
rcubarrier.txt
diff --git a/Documentation/RCU/checklist.txt b/Documentation/RCU/checklist.txt
index 767cf06..cbc180f 100644
--- a/Documentation/RCU/checklist.txt
+++ b/Documentation/RCU/checklist.txt
@@ -127,10 +127,14 @@ over a rather long period of time, but improvements are always welcome!
perfectly legal (if redundant) for update-side code to
use rcu_dereference() and the "_rcu()" list-traversal
primitives. This is particularly useful in code that
- is common to readers and updaters. However, neither
- rcu_dereference() nor the "_rcu()" list-traversal
- primitives can substitute for a good concurrency design
- coordinating among multiple updaters.
+ is common to readers and updaters. However, lockdep
+ will complain if you access rcu_dereference() outside
+ of an RCU read-side critical section. See lockdep.txt
+ to learn what to do about this.
+
+ Of course, neither rcu_dereference() nor the "_rcu()"
+ list-traversal primitives can substitute for a good
+ concurrency design coordinating among multiple updaters.
b. If the list macros are being used, the list_add_tail_rcu()
and list_add_rcu() primitives must be used in order
@@ -249,7 +253,9 @@ over a rather long period of time, but improvements are always welcome!
must be protected by appropriate update-side locks. RCU
read-side critical sections are delimited by rcu_read_lock()
and rcu_read_unlock(), or by similar primitives such as
- rcu_read_lock_bh() and rcu_read_unlock_bh().
+ rcu_read_lock_bh() and rcu_read_unlock_bh(), in which case
+ the matching rcu_dereference() primitive must be used in order
+ to keep lockdep happy, in this case, rcu_dereference_bh().
The reason that it is permissible to use RCU list-traversal
primitives when the update-side lock is held is that doing so
@@ -302,15 +308,15 @@ over a rather long period of time, but improvements are always welcome!
not the case, a self-spawning RCU callback would prevent the
victim CPU from ever going offline.)
-14. SRCU (srcu_read_lock(), srcu_read_unlock(), synchronize_srcu(),
- and synchronize_srcu_expedited()) may only be invoked from
- process context. Unlike other forms of RCU, it -is- permissible
- to block in an SRCU read-side critical section (demarked by
- srcu_read_lock() and srcu_read_unlock()), hence the "SRCU":
- "sleepable RCU". Please note that if you don't need to sleep
- in read-side critical sections, you should be using RCU rather
- than SRCU, because RCU is almost always faster and easier to
- use than is SRCU.
+14. SRCU (srcu_read_lock(), srcu_read_unlock(), srcu_dereference(),
+ synchronize_srcu(), and synchronize_srcu_expedited()) may only
+ be invoked from process context. Unlike other forms of RCU, it
+ -is- permissible to block in an SRCU read-side critical section
+ (demarked by srcu_read_lock() and srcu_read_unlock()), hence the
+ "SRCU": "sleepable RCU". Please note that if you don't need
+ to sleep in read-side critical sections, you should be using
+ RCU rather than SRCU, because RCU is almost always faster and
+ easier to use than is SRCU.
Also unlike other forms of RCU, explicit initialization
and cleanup is required via init_srcu_struct() and
diff --git a/Documentation/RCU/lockdep.txt b/Documentation/RCU/lockdep.txt
new file mode 100644
index 0000000..fe24b58
--- /dev/null
+++ b/Documentation/RCU/lockdep.txt
@@ -0,0 +1,67 @@
+RCU and lockdep checking
+
+All flavors of RCU have lockdep checking available, so that lockdep is
+aware of when each task enters and leaves any flavor of RCU read-side
+critical section. Each flavor of RCU is tracked separately (but note
+that this is not the case in 2.6.32 and earlier). This allows lockdep's
+tracking to include RCU state, which can sometimes help when debugging
+deadlocks and the like.
+
+In addition, RCU provides the following primitives that check lockdep's
+state:
+
+ rcu_read_lock_held() for normal RCU.
+ rcu_read_lock_bh_held() for RCU-bh.
+ rcu_read_lock_sched_held() for RCU-sched.
+ srcu_read_lock_held() for SRCU.
+
+These functions are conservative, and will therefore return 1 if they
+aren't certain (for example, if CONFIG_DEBUG_LOCK_ALLOC is not set).
+This prevents things like WARN_ON(!rcu_read_lock_held()) from giving false
+positives when lockdep is disabled.
+
+In addition, a separate kernel config parameter CONFIG_PROVE_RCU enables
+checking of rcu_dereference() primitives:
+
+ rcu_dereference(p):
+ Check for RCU read-side critical section.
+ rcu_dereference_bh(p):
+ Check for RCU-bh read-side critical section.
+ rcu_dereference_sched(p):
+ Check for RCU-sched read-side critical section.
+ srcu_dereference(p, sp):
+ Check for SRCU read-side critical section.
+ rcu_dereference_check(p, c):
+ Use explicit check expression "c".
+ rcu_dereference_raw(p)
+ Don't check. (Use sparingly, if at all.)
+
+The rcu_dereference_check() check expression can be any boolean
+expression, but would normally include one of the rcu_read_lock_held()
+family of functions and a lockdep expression. However, any boolean
+expression can be used. For a moderately ornate example, consider
+the following:
+
+ file = rcu_dereference_check(fdt->fd[fd],
+ rcu_read_lock_held() ||
+ lockdep_is_held(&files->file_lock) ||
+ atomic_read(&files->count) == 1);
+
+This expression picks up the pointer "fdt->fd[fd]" in an RCU-safe manner,
+and, if CONFIG_PROVE_RCU is configured, verifies that this expression
+is used in:
+
+1. An RCU read-side critical section, or
+2. with files->file_lock held, or
+3. on an unshared files_struct.
+
+In case (1), the pointer is picked up in an RCU-safe manner for vanilla
+RCU read-side critical sections, in case (2) the ->file_lock prevents
+any change from taking place, and finally, in case (3) the current task
+is the only task accessing the file_struct, again preventing any change
+from taking place.
+
+There are currently only "universal" versions of the rcu_assign_pointer()
+and RCU list-/tree-traversal primitives, which do not (yet) check for
+being in an RCU read-side critical section. In the future, separate
+versions of these primitives might be created.
diff --git a/Documentation/RCU/whatisRCU.txt b/Documentation/RCU/whatisRCU.txt
index 469a58b..1dc00ee 100644
--- a/Documentation/RCU/whatisRCU.txt
+++ b/Documentation/RCU/whatisRCU.txt
@@ -323,15 +323,17 @@ used as follows:
Defer Protect
a. synchronize_rcu() rcu_read_lock() / rcu_read_unlock()
- call_rcu()
+ call_rcu() rcu_dereference()
b. call_rcu_bh() rcu_read_lock_bh() / rcu_read_unlock_bh()
+ rcu_dereference_bh()
c. synchronize_sched() rcu_read_lock_sched() / rcu_read_unlock_sched()
preempt_disable() / preempt_enable()
local_irq_save() / local_irq_restore()
hardirq enter / hardirq exit
NMI enter / NMI exit
+ rcu_dereference_sched()
These three mechanisms are used as follows:
@@ -781,9 +783,8 @@ Linux-kernel source code, but it helps to have a full list of the
APIs, since there does not appear to be a way to categorize them
in docbook. Here is the list, by category.
-RCU pointer/list traversal:
+RCU list traversal:
- rcu_dereference
list_for_each_entry_rcu
hlist_for_each_entry_rcu
hlist_nulls_for_each_entry_rcu
@@ -809,7 +810,7 @@ RCU: Critical sections Grace period Barrier
rcu_read_lock synchronize_net rcu_barrier
rcu_read_unlock synchronize_rcu
- synchronize_rcu_expedited
+ rcu_dereference synchronize_rcu_expedited
call_rcu
@@ -817,7 +818,7 @@ bh: Critical sections Grace period Barrier
rcu_read_lock_bh call_rcu_bh rcu_barrier_bh
rcu_read_unlock_bh synchronize_rcu_bh
- synchronize_rcu_bh_expedited
+ rcu_dereference_bh synchronize_rcu_bh_expedited
sched: Critical sections Grace period Barrier
@@ -826,12 +827,14 @@ sched: Critical sections Grace period Barrier
rcu_read_unlock_sched call_rcu_sched
[preempt_disable] synchronize_sched_expedited
[and friends]
+ rcu_dereference_sched
SRCU: Critical sections Grace period Barrier
srcu_read_lock synchronize_srcu N/A
srcu_read_unlock synchronize_srcu_expedited
+ srcu_dereference
SRCU: Initialization/cleanup
init_srcu_struct
--
1.5.2.5
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH tip/core/rcu 8/9] idr: apply lockdep-based diagnostics to rcu_dereference() uses
2010-01-15 1:02 ` [PATCH tip/core/rcu 8/9] idr: apply lockdep-based diagnostics to rcu_dereference() uses Paul E. McKenney
@ 2010-01-18 23:36 ` Josh Triplett
2010-01-19 0:56 ` Paul E. McKenney
0 siblings, 1 reply; 12+ messages in thread
From: Josh Triplett @ 2010-01-18 23:36 UTC (permalink / raw)
To: Paul E. McKenney
Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
dvhltc, niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells
On Thu, Jan 14, 2010 at 05:02:45PM -0800, Paul E. McKenney wrote:
> From: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
>
> Because idr can be used with any of a number of locks or with any
> flavor of RCU, just disable the lockdep-based diagnostics. If idr
> needs diagnostics, the check expression will need to be passed into
> the relevant idr primitives as an additional argument.
>
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The subject line of this patch seems wrong. Like the preceding patches
for radix tree and list operations, this *disables* the checking by
using _raw.
- Josh Triplett
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH tip/core/rcu 8/9] idr: apply lockdep-based diagnostics to rcu_dereference() uses
2010-01-18 23:36 ` Josh Triplett
@ 2010-01-19 0:56 ` Paul E. McKenney
0 siblings, 0 replies; 12+ messages in thread
From: Paul E. McKenney @ 2010-01-19 0:56 UTC (permalink / raw)
To: Josh Triplett
Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
dvhltc, niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells
On Mon, Jan 18, 2010 at 03:36:04PM -0800, Josh Triplett wrote:
> On Thu, Jan 14, 2010 at 05:02:45PM -0800, Paul E. McKenney wrote:
> > From: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> >
> > Because idr can be used with any of a number of locks or with any
> > flavor of RCU, just disable the lockdep-based diagnostics. If idr
> > needs diagnostics, the check expression will need to be passed into
> > the relevant idr primitives as an additional argument.
> >
> > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
>
> The subject line of this patch seems wrong. Like the preceding patches
> for radix tree and list operations, this *disables* the checking by
> using _raw.
It does indeed use _raw. If we start having enough issues with these
interfaces, this will need to change. At the moment, the obvious change
is to provide multiple interfaces as was done for rcu_dereference().
For example, idr_find(), idr_find_bh(), idr_find_sched(), idr_find_srcu(),
idr_find_rcu(), and idr_find_check().
Probably only a few of these are actually in use, but even so, I decided
to see how rcu_dereference() turned out before pushing up to the
higher-level APIs. My guess would be that the list APIs would be most
pressing, but we will see. ;-)
Thanx, Paul
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2010-01-19 0:56 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-01-15 1:01 [PATCH tip/core/rcu 0/9] rcu: v3 add lockdep-based diagnostics to rcu_dereference() Paul E. McKenney
2010-01-15 1:02 ` [PATCH tip/core/rcu 1/9] rcu: introduce lockdep-based checking to RCU read-side primitives Paul E. McKenney
2010-01-15 1:02 ` [PATCH tip/core/rcu 2/9] rcu: add lockdep-enabled variants of rcu_dereference() Paul E. McKenney
2010-01-15 1:02 ` [PATCH tip/core/rcu 3/9] rcu: disable lockdep checking in RCU list-traversal primitives Paul E. McKenney
2010-01-15 1:02 ` [PATCH tip/core/rcu 4/9] net: add checking to rcu_dereference() primitives Paul E. McKenney
2010-01-15 1:02 ` [PATCH tip/core/rcu 5/9] sched: use lockdep-based checking on rcu_dereference() Paul E. McKenney
2010-01-15 1:02 ` [PATCH tip/core/rcu 6/9] vfs: apply lockdep-based checking to rcu_dereference() uses Paul E. McKenney
2010-01-15 1:02 ` [PATCH tip/core/rcu 7/9] radix-tree: disable RCU lockdep checking in radix tree Paul E. McKenney
2010-01-15 1:02 ` [PATCH tip/core/rcu 8/9] idr: apply lockdep-based diagnostics to rcu_dereference() uses Paul E. McKenney
2010-01-18 23:36 ` Josh Triplett
2010-01-19 0:56 ` Paul E. McKenney
2010-01-15 1:02 ` [PATCH tip/core/rcu 9/9] rcu: documentation update for CONFIG_PROVE_RCU Paul E. McKenney
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox