* [RFC][PATCH 0/6] using lockdep to validate rcu usage
@ 2007-09-19 10:41 Peter Zijlstra
2007-09-19 10:41 ` [RFC][PATCH 1/6] lockdep: annotate rcu_read_{,un}lock{,_bh} Peter Zijlstra
` (6 more replies)
0 siblings, 7 replies; 30+ messages in thread
From: Peter Zijlstra @ 2007-09-19 10:41 UTC (permalink / raw)
To: linux-kernel
Cc: Paul E. McKenney, Ingo Molnar, Andrew Morton, Peter Zijlstra,
Nick Piggin
This patch set uses lockdep to validate rcu usage.
It annotates rcu_read_{,un}lock{,_bh}() to catch imbalances, and further uses
that information to establish a proper context for rcu_dereference().
It also separates implicit from explicit preempt_disable() usage, in order to
separate rcu_dereference() from the locking model.
A kernel (2.6.23-rc4-mm1) with these patches boots but does have some funnies;
I suspect it calls printk from places it doesn't like.
The first patch should be safe to apply, the rest is RFC.
If people want to see the very noisy bootlog this generates:
http://programming.kicks-ass.net/kernel-patches/lockdep_rcu/lockdep_rcu.log
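The imbalance checking that patch 1 adds can be illustrated with a small userspace model. This is only a sketch (the mock_* names and the depth counter are invented here, not kernel API): lockdep effectively tracks nesting on rcu_lock_map via lock_acquire()/lock_release(), so an unlock without a matching lock is caught.

```c
#include <assert.h>

/*
 * Userspace model of the read-side nesting that lockdep tracks for
 * rcu_lock_map.  The mock_* names are invented for illustration; the
 * real macros expand to lock_acquire()/lock_release().
 */
static int rcu_read_depth;

static void mock_rcu_read_acquire(void)
{
	rcu_read_depth++;
}

/* Return 0 on a balanced release, -1 on the imbalance lockdep warns about. */
static int mock_rcu_read_release(void)
{
	if (rcu_read_depth == 0)
		return -1;
	rcu_read_depth--;
	return 0;
}
```

Nested sections are fine; only a release with no outstanding acquire trips the check.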
^ permalink raw reply [flat|nested] 30+ messages in thread* [RFC][PATCH 1/6] lockdep: annotate rcu_read_{,un}lock{,_bh} 2007-09-19 10:41 [RFC][PATCH 0/6] using lockdep to validate rcu usage Peter Zijlstra @ 2007-09-19 10:41 ` Peter Zijlstra 2007-09-19 23:06 ` Paul E. McKenney 2007-09-19 10:41 ` [RFC][PATCH 2/6] lockdep: validate rcu_dereference() vs rcu_read_lock() Peter Zijlstra ` (5 subsequent siblings) 6 siblings, 1 reply; 30+ messages in thread From: Peter Zijlstra @ 2007-09-19 10:41 UTC (permalink / raw) To: linux-kernel Cc: Paul E. McKenney, Ingo Molnar, Andrew Morton, Peter Zijlstra, Nick Piggin [-- Attachment #1: rcu-lockdep.patch --] [-- Type: text/plain, Size: 3237 bytes --] lockdep annotate rcu_read_{,un}lock{,_bh} in order to catch imbalanced usage. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> --- include/linux/lockdep.h | 7 +++++++ include/linux/rcupdate.h | 12 ++++++++++++ kernel/rcupdate.c | 8 ++++++++ 3 files changed, 27 insertions(+) Index: linux-2.6/include/linux/lockdep.h =================================================================== --- linux-2.6.orig/include/linux/lockdep.h +++ linux-2.6/include/linux/lockdep.h @@ -252,6 +252,13 @@ extern void lockdep_init_map(struct lock struct lock_class_key *key, int subclass); /* + * To initialize a lockdep_map statically use this macro. + * Note that _name must not be NULL. 
+ */ +#define STATIC_LOCKDEP_MAP_INIT(_name, _key) \ + { .name = (_name), .key = (void *)(_key), } + +/* * Reinitialize a lock key - for cases where there is special locking or * special initialization of locks so that the validator gets the scope * of dependencies wrong: they are either too broad (they need a class-split) Index: linux-2.6/include/linux/rcupdate.h =================================================================== --- linux-2.6.orig/include/linux/rcupdate.h +++ linux-2.6/include/linux/rcupdate.h @@ -41,6 +41,7 @@ #include <linux/percpu.h> #include <linux/cpumask.h> #include <linux/seqlock.h> +#include <linux/lockdep.h> /** * struct rcu_head - callback structure for use with RCU @@ -133,6 +134,15 @@ static inline void rcu_bh_qsctr_inc(int extern int rcu_pending(int cpu); extern int rcu_needs_cpu(int cpu); +#ifdef CONFIG_DEBUG_LOCK_ALLOC +extern struct lockdep_map rcu_lock_map; +# define rcu_read_acquire() lock_acquire(&rcu_lock_map, 0, 0, 2, 1, _THIS_IP_) +# define rcu_read_release() lock_release(&rcu_lock_map, 1, _THIS_IP_) +#else +# define rcu_read_acquire() do { } while (0) +# define rcu_read_release() do { } while (0) +#endif + /** * rcu_read_lock - mark the beginning of an RCU read-side critical section. 
* @@ -166,6 +176,7 @@ extern int rcu_needs_cpu(int cpu); do { \ preempt_disable(); \ __acquire(RCU); \ + rcu_read_acquire(); \ } while(0) /** @@ -175,6 +186,7 @@ extern int rcu_needs_cpu(int cpu); */ #define rcu_read_unlock() \ do { \ + rcu_read_release(); \ __release(RCU); \ preempt_enable(); \ } while(0) @@ -216,6 +218,7 @@ extern struct lockdep_map rcu_lock_map; do { \ local_bh_disable(); \ __acquire(RCU_BH); \ + rcu_read_acquire(); \ } while(0) /* @@ -225,6 +228,7 @@ extern struct lockdep_map rcu_lock_map; */ #define rcu_read_unlock_bh() \ do { \ + rcu_read_release(); \ __release(RCU_BH); \ local_bh_enable(); \ } while(0) Index: linux-2.6/kernel/rcupdate.c =================================================================== --- linux-2.6.orig/kernel/rcupdate.c +++ linux-2.6/kernel/rcupdate.c @@ -48,6 +48,14 @@ #include <linux/cpu.h> #include <linux/mutex.h> +#ifdef CONFIG_DEBUG_LOCK_ALLOC +static struct lock_class_key rcu_lock_key; +struct lockdep_map rcu_lock_map = + STATIC_LOCKDEP_MAP_INIT("rcu_read_lock", &rcu_lock_key); + +EXPORT_SYMBOL_GPL(rcu_lock_map); +#endif + /* Definition for rcupdate control block. */ static struct rcu_ctrlblk rcu_ctrlblk = { .cur = -300, -- ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC][PATCH 1/6] lockdep: annotate rcu_read_{,un}lock{,_bh} 2007-09-19 10:41 ` [RFC][PATCH 1/6] lockdep: annotate rcu_read_{,un}lock{,_bh} Peter Zijlstra @ 2007-09-19 23:06 ` Paul E. McKenney 0 siblings, 0 replies; 30+ messages in thread From: Paul E. McKenney @ 2007-09-19 23:06 UTC (permalink / raw) To: Peter Zijlstra; +Cc: linux-kernel, Ingo Molnar, Andrew Morton, Nick Piggin On Wed, Sep 19, 2007 at 12:41:26PM +0200, Peter Zijlstra wrote: > lockdep annotate rcu_read_{,un}lock{,_bh} in order to catch imbalanced > usage. In my message yesterday, I forgot about srcu_read_lock() and srcu_read_unlock(). :-/ Here is a proto-patch, untested, probably does not compile. One interesting side-effect of the overall patch is that one must select CONFIG_PREEMPT in order for a CONFIG_DEBUG_LOCK_ALLOC build to compile. Might be OK, but thought I should mention it. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> diff -urpNa -X dontdiff linux-2.6.22-rcudep/kernel/srcu.c linux-2.6.22-srcudep/kernel/srcu.c --- linux-2.6.22-rcudep/kernel/srcu.c 2007-07-08 16:32:17.000000000 -0700 +++ linux-2.6.22-srcudep/kernel/srcu.c 2007-09-19 12:50:00.000000000 -0700 @@ -33,6 +33,7 @@ #include <linux/slab.h> #include <linux/smp.h> #include <linux/srcu.h> +#include <linux/rcupdate.h> /** * init_srcu_struct - initialize a sleep-RCU structure @@ -116,6 +117,7 @@ int srcu_read_lock(struct srcu_struct *s per_cpu_ptr(sp->per_cpu_ref, smp_processor_id())->c[idx]++; srcu_barrier(); /* ensure compiler won't misorder critical section. */ preempt_enable(); + rcu_read_acquire(); return idx; } @@ -131,6 +133,7 @@ int srcu_read_lock(struct srcu_struct *s */ void srcu_read_unlock(struct srcu_struct *sp, int idx) { + rcu_read_release(); preempt_disable(); srcu_barrier(); /* ensure compiler won't misorder critical section. 
*/ per_cpu_ptr(sp->per_cpu_ref, smp_processor_id())->c[idx]--; > Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> > --- > include/linux/lockdep.h | 7 +++++++ > include/linux/rcupdate.h | 12 ++++++++++++ > kernel/rcupdate.c | 8 ++++++++ > 3 files changed, 27 insertions(+) > > Index: linux-2.6/include/linux/lockdep.h > =================================================================== > --- linux-2.6.orig/include/linux/lockdep.h > +++ linux-2.6/include/linux/lockdep.h > @@ -252,6 +252,13 @@ extern void lockdep_init_map(struct lock > struct lock_class_key *key, int subclass); > > /* > + * To initialize a lockdep_map statically use this macro. > + * Note that _name must not be NULL. > + */ > +#define STATIC_LOCKDEP_MAP_INIT(_name, _key) \ > + { .name = (_name), .key = (void *)(_key), } > + > +/* > * Reinitialize a lock key - for cases where there is special locking or > * special initialization of locks so that the validator gets the scope > * of dependencies wrong: they are either too broad (they need a class-split) > Index: linux-2.6/include/linux/rcupdate.h > =================================================================== > --- linux-2.6.orig/include/linux/rcupdate.h > +++ linux-2.6/include/linux/rcupdate.h > @@ -41,6 +41,7 @@ > #include <linux/percpu.h> > #include <linux/cpumask.h> > #include <linux/seqlock.h> > +#include <linux/lockdep.h> > > /** > * struct rcu_head - callback structure for use with RCU > @@ -133,6 +134,15 @@ static inline void rcu_bh_qsctr_inc(int > extern int rcu_pending(int cpu); > extern int rcu_needs_cpu(int cpu); > > +#ifdef CONFIG_DEBUG_LOCK_ALLOC > +extern struct lockdep_map rcu_lock_map; > +# define rcu_read_acquire() lock_acquire(&rcu_lock_map, 0, 0, 2, 1, _THIS_IP_) > +# define rcu_read_release() lock_release(&rcu_lock_map, 1, _THIS_IP_) > +#else > +# define rcu_read_acquire() do { } while (0) > +# define rcu_read_release() do { } while (0) > +#endif > + > /** > * rcu_read_lock - mark the beginning of an RCU read-side 
critical section. > * > @@ -166,6 +176,7 @@ extern int rcu_needs_cpu(int cpu); > do { \ > preempt_disable(); \ > __acquire(RCU); \ > + rcu_read_acquire(); \ > } while(0) > > /** > @@ -175,6 +186,7 @@ extern int rcu_needs_cpu(int cpu); > */ > #define rcu_read_unlock() \ > do { \ > + rcu_read_release(); \ > __release(RCU); \ > preempt_enable(); \ > } while(0) > @@ -216,6 +218,7 @@ extern struct lockdep_map rcu_lock_map; > do { \ > local_bh_disable(); \ > __acquire(RCU_BH); \ > + rcu_read_acquire(); \ > } while(0) > > /* > @@ -225,6 +228,7 @@ extern struct lockdep_map rcu_lock_map; > */ > #define rcu_read_unlock_bh() \ > do { \ > + rcu_read_release(); \ > __release(RCU_BH); \ > local_bh_enable(); \ > } while(0) > Index: linux-2.6/kernel/rcupdate.c > =================================================================== > --- linux-2.6.orig/kernel/rcupdate.c > +++ linux-2.6/kernel/rcupdate.c > @@ -48,6 +48,14 @@ > #include <linux/cpu.h> > #include <linux/mutex.h> > > +#ifdef CONFIG_DEBUG_LOCK_ALLOC > +static struct lock_class_key rcu_lock_key; > +struct lockdep_map rcu_lock_map = > + STATIC_LOCKDEP_MAP_INIT("rcu_read_lock", &rcu_lock_key); > + > +EXPORT_SYMBOL_GPL(rcu_lock_map); > +#endif > + > /* Definition for rcupdate control block. */ > static struct rcu_ctrlblk rcu_ctrlblk = { > .cur = -300, > > -- > ^ permalink raw reply [flat|nested] 30+ messages in thread
* [RFC][PATCH 2/6] lockdep: validate rcu_dereference() vs rcu_read_lock() 2007-09-19 10:41 [RFC][PATCH 0/6] using lockdep to validate rcu usage Peter Zijlstra 2007-09-19 10:41 ` [RFC][PATCH 1/6] lockdep: annotate rcu_read_{,un}lock{,_bh} Peter Zijlstra @ 2007-09-19 10:41 ` Peter Zijlstra 2007-09-19 14:17 ` Dmitry Torokhov 2007-09-19 10:41 ` [RFC][PATCH 3/6] lockdep: rcu_dereference() vs preempt_disable() Peter Zijlstra ` (4 subsequent siblings) 6 siblings, 1 reply; 30+ messages in thread From: Peter Zijlstra @ 2007-09-19 10:41 UTC (permalink / raw) To: linux-kernel Cc: Paul E. McKenney, Ingo Molnar, Andrew Morton, Peter Zijlstra, Nick Piggin [-- Attachment #1: rcu-lockdep-validate.patch --] [-- Type: text/plain, Size: 4293 bytes --] Warn when rcu_dereference() is not used in combination with rcu_read_lock() Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> --- include/linux/lockdep.h | 4 +++ include/linux/rcupdate.h | 3 ++ kernel/lockdep.c | 60 +++++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 67 insertions(+) Index: linux-2.6/include/linux/lockdep.h =================================================================== --- linux-2.6.orig/include/linux/lockdep.h +++ linux-2.6/include/linux/lockdep.h @@ -241,6 +241,7 @@ extern void lockdep_free_key_range(void extern void lockdep_off(void); extern void lockdep_on(void); +extern int lockdep_enabled(void); /* * These methods are used by specific locking variants (spinlocks, @@ -303,6 +304,8 @@ extern void lock_acquire(struct lockdep_ extern void lock_release(struct lockdep_map *lock, int nested, unsigned long ip); +extern int lock_is_held(struct lockdep_map *lock); + # define INIT_LOCKDEP .lockdep_recursion = 0, #define lockdep_depth(tsk) (debug_locks ? 
(tsk)->lockdep_depth : 0) @@ -319,6 +322,7 @@ static inline void lockdep_on(void) # define lock_acquire(l, s, t, r, c, i) do { } while (0) # define lock_release(l, n, i) do { } while (0) +# define lock_is_held(l) (0) # define lockdep_init() do { } while (0) # define lockdep_info() do { } while (0) # define lockdep_init_map(lock, name, key, sub) do { (void)(key); } while (0) Index: linux-2.6/include/linux/rcupdate.h =================================================================== --- linux-2.6.orig/include/linux/rcupdate.h +++ linux-2.6/include/linux/rcupdate.h @@ -138,9 +138,11 @@ extern int rcu_needs_cpu(int cpu); extern struct lockdep_map rcu_lock_map; # define rcu_read_acquire() lock_acquire(&rcu_lock_map, 0, 0, 2, 1, _THIS_IP_) # define rcu_read_release() lock_release(&rcu_lock_map, 1, _THIS_IP_) +# define rcu_read_held() WARN_ON_ONCE(lockdep_enabled() && !lock_is_held(&rcu_lock_map)) #else # define rcu_read_acquire() do { } while (0) # define rcu_read_release() do { } while (0) +# define rcu_read_held() do { } while (0) #endif /** @@ -256,6 +258,7 @@ extern struct lockdep_map rcu_lock_map; #define rcu_dereference(p) ({ \ typeof(p) _________p1 = ACCESS_ONCE(p); \ smp_read_barrier_depends(); \ + rcu_read_held(); \ (_________p1); \ }) Index: linux-2.6/kernel/lockdep.c =================================================================== --- linux-2.6.orig/kernel/lockdep.c +++ linux-2.6/kernel/lockdep.c @@ -284,6 +284,13 @@ void lockdep_on(void) EXPORT_SYMBOL(lockdep_on); +int lockdep_enabled(void) +{ + return debug_locks && !current->lockdep_recursion; +} + +EXPORT_SYMBOL(lockdep_enabled); + /* * Debugging switches: */ @@ -2624,6 +2631,36 @@ static int lock_release_nested(struct ta return 1; } +static int __lock_is_held(struct lockdep_map *lock) +{ + struct task_struct *curr = current; + struct held_lock *hlock, *prev_hlock; + unsigned int depth; + int i; + + /* + * Check whether the lock exists in the current stack + * of held locks: + */ + depth = 
curr->lockdep_depth; + if (!depth) + return 0; + + prev_hlock = NULL; + for (i = depth-1; i >= 0; i--) { + hlock = curr->held_locks + i; + /* + * We must not cross into another context: + */ + if (prev_hlock && prev_hlock->irq_context != hlock->irq_context) + break; + if (hlock->instance == lock) + return 1; + prev_hlock = hlock; + } + return 0; +} + /* * Remove the lock to the list of currently held locks - this gets * called on mutex_unlock()/spin_unlock*() (or on a failed @@ -2727,6 +2764,29 @@ void lock_release(struct lockdep_map *lo EXPORT_SYMBOL_GPL(lock_release); +int lock_is_held(struct lockdep_map *lock) +{ + int ret = 0; + unsigned long flags; + + if (unlikely(!lock_stat && !prove_locking)) + return 0; + + if (unlikely(current->lockdep_recursion)) + return -EBUSY; + + raw_local_irq_save(flags); + check_flags(flags); + current->lockdep_recursion = 1; + ret = __lock_is_held(lock); + current->lockdep_recursion = 0; + raw_local_irq_restore(flags); + + return ret; +} + +EXPORT_SYMBOL_GPL(lock_is_held); + #ifdef CONFIG_LOCK_STAT static int print_lock_contention_bug(struct task_struct *curr, struct lockdep_map *lock, -- ^ permalink raw reply [flat|nested] 30+ messages in thread
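The held-lock walk in patch 2's __lock_is_held() can be modelled in plain C. A hedged sketch (the mock_* names are invented, and the irq_context boundary check is omitted) of scanning the per-task held-lock stack from the top down:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Userspace model of __lock_is_held(): walk the task's stack of
 * currently held locks looking for a given lockdep_map.  Names
 * loosely mirror the patch, but this is illustrative only.
 */
#define MAX_LOCK_DEPTH 48

struct mock_lockdep_map { const char *name; };

static struct mock_lockdep_map *held_locks[MAX_LOCK_DEPTH];
static unsigned int lockdep_depth;

static void mock_lock_acquire(struct mock_lockdep_map *map)
{
	held_locks[lockdep_depth++] = map;
}

static void mock_lock_release(void)
{
	lockdep_depth--;
}

static int mock_lock_is_held(struct mock_lockdep_map *map)
{
	int i;

	/* Top-down scan of the held-lock stack, as in the patch. */
	for (i = (int)lockdep_depth - 1; i >= 0; i--)
		if (held_locks[i] == map)
			return 1;
	return 0;
}
```

rcu_read_held() then reduces to asking whether rcu_lock_map is somewhere on that stack when rcu_dereference() runs.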
* Re: [RFC][PATCH 2/6] lockdep: validate rcu_dereference() vs rcu_read_lock() 2007-09-19 10:41 ` [RFC][PATCH 2/6] lockdep: validate rcu_dereference() vs rcu_read_lock() Peter Zijlstra @ 2007-09-19 14:17 ` Dmitry Torokhov 2007-09-19 14:31 ` Peter Zijlstra 0 siblings, 1 reply; 30+ messages in thread From: Dmitry Torokhov @ 2007-09-19 14:17 UTC (permalink / raw) To: Peter Zijlstra Cc: linux-kernel, Paul E. McKenney, Ingo Molnar, Andrew Morton, Nick Piggin Hi Peter, On 9/19/07, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote: > Warn when rcu_dereference() is not used in combination with rcu_read_lock() > According to Paul it is fine to use RCU primitives (when accompanied with proper comments) when the read-side critical section is guarded by spin_lock_irqsave()/spin_unlock_irqrestore() instead of rcu_read_lock()/rcu_read_unlock() and writers synchronize with synchronize_sched(), not synchronize_rcu(). Your patch will trigger a warning on such valid usages. -- Dmitry ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC][PATCH 2/6] lockdep: validate rcu_dereference() vs rcu_read_lock() 2007-09-19 14:17 ` Dmitry Torokhov @ 2007-09-19 14:31 ` Peter Zijlstra 2007-09-19 15:16 ` Dmitry Torokhov 0 siblings, 1 reply; 30+ messages in thread From: Peter Zijlstra @ 2007-09-19 14:31 UTC (permalink / raw) To: Dmitry Torokhov Cc: linux-kernel, Paul E. McKenney, Ingo Molnar, Andrew Morton, Nick Piggin On Wed, 19 Sep 2007 10:17:25 -0400 "Dmitry Torokhov" <dmitry.torokhov@gmail.com> wrote: > Hi Peter, > > On 9/19/07, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote: > > Warn when rcu_dereference() is not used in combination with rcu_read_lock() > > > > According to Paul it is fine to use RCU primitives (when accompanied > with proper comments) when the read-size critical section is guarded > by spin_lock_irqsave()/spin_lock_irqsrestore() instead of > rcu_read_lock()/rcu_read_unlock() and writers synchronize with > synchronize_sched(), not synchronize_rcu(). Your patch will trigger > warnign on such valid usages. > Sounds fragile to begin with. But you're right in that that is valid for Linux as you know it. However in -rt most/all spinlocks are converted to sleeping locks. In that case sync_sched() is not enough. So I'd rather recommend against proliferation of such schemes, as we'd have to clean them up later on. Still, I'm sure there are other false positives and we need to come up with proper annotations for those. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC][PATCH 2/6] lockdep: validate rcu_dereference() vs rcu_read_lock() 2007-09-19 14:31 ` Peter Zijlstra @ 2007-09-19 15:16 ` Dmitry Torokhov 2007-09-19 15:25 ` Peter Zijlstra 2007-09-19 15:37 ` Paul E. McKenney 0 siblings, 2 replies; 30+ messages in thread From: Dmitry Torokhov @ 2007-09-19 15:16 UTC (permalink / raw) To: Peter Zijlstra Cc: linux-kernel, Paul E. McKenney, Ingo Molnar, Andrew Morton, Nick Piggin On 9/19/07, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote: > On Wed, 19 Sep 2007 10:17:25 -0400 "Dmitry Torokhov" > <dmitry.torokhov@gmail.com> wrote: > > > Hi Peter, > > > > On 9/19/07, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote: > > > Warn when rcu_dereference() is not used in combination with rcu_read_lock() > > > > > > > According to Paul it is fine to use RCU primitives (when accompanied > > with proper comments) when the read-side critical section is guarded > > by spin_lock_irqsave()/spin_unlock_irqrestore() instead of > > rcu_read_lock()/rcu_read_unlock() and writers synchronize with > > synchronize_sched(), not synchronize_rcu(). Your patch will trigger > > a warning on such valid usages. > > > > Sounds fragile to begin with. But you're right in that that is valid > for Linux as you know it. However in -rt most/all spinlocks are > converted to sleeping locks. In that case sync_sched() is not enough. > OK, then it goes beyond RCU... We need to come up with something that can be used to synchronize with IRQ handlers (quite often in driver code one needs to be sure that the current invocation of the IRQ handler completed before doing something). And once we have it, spinlock + RCU users can just use that method. -- Dmitry ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC][PATCH 2/6] lockdep: validate rcu_dereference() vs rcu_read_lock() 2007-09-19 15:16 ` Dmitry Torokhov @ 2007-09-19 15:25 ` Peter Zijlstra 2007-09-19 15:37 ` Paul E. McKenney 1 sibling, 0 replies; 30+ messages in thread From: Peter Zijlstra @ 2007-09-19 15:25 UTC (permalink / raw) To: Dmitry Torokhov Cc: linux-kernel, Paul E. McKenney, Ingo Molnar, Andrew Morton, Nick Piggin On Wed, 19 Sep 2007 11:16:21 -0400 "Dmitry Torokhov" <dmitry.torokhov@gmail.com> wrote: > On 9/19/07, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote: > > On Wed, 19 Sep 2007 10:17:25 -0400 "Dmitry Torokhov" > > <dmitry.torokhov@gmail.com> wrote: > > > > > Hi Peter, > > > > > > On 9/19/07, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote: > > > > Warn when rcu_dereference() is not used in combination with rcu_read_lock() > > > > > > > > > > According to Paul it is fine to use RCU primitives (when accompanied > > > with proper comments) when the read-size critical section is guarded > > > by spin_lock_irqsave()/spin_lock_irqsrestore() instead of > > > rcu_read_lock()/rcu_read_unlock() and writers synchronize with > > > synchronize_sched(), not synchronize_rcu(). Your patch will trigger > > > warnign on such valid usages. > > > > > > > Sounds fragile to begin with. But you're right in that that is valid > > for Linux as you know it. However in -rt most/all spinlocks are > > converted to sleeping locks. In that case sync_sched() is not enough. > > > > OK, then it goes beyond RCU... We need to come up with something that > can be used to synchronize with IRQ handlers (quite often in driver > code one needs to be sure that current invocation of IRQ handler > completed before doing something). And once we have it splinlock + RCU > users can just use that method. Sound like you want a completion or workqueue. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC][PATCH 2/6] lockdep: validate rcu_dereference() vs rcu_read_lock() 2007-09-19 15:16 ` Dmitry Torokhov 2007-09-19 15:25 ` Peter Zijlstra @ 2007-09-19 15:37 ` Paul E. McKenney 2007-09-19 16:59 ` Dmitry Torokhov 1 sibling, 1 reply; 30+ messages in thread From: Paul E. McKenney @ 2007-09-19 15:37 UTC (permalink / raw) To: Dmitry Torokhov Cc: Peter Zijlstra, linux-kernel, Ingo Molnar, Andrew Morton, Nick Piggin On Wed, Sep 19, 2007 at 11:16:21AM -0400, Dmitry Torokhov wrote: > On 9/19/07, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote: > > On Wed, 19 Sep 2007 10:17:25 -0400 "Dmitry Torokhov" > > <dmitry.torokhov@gmail.com> wrote: > > > > > Hi Peter, > > > > > > On 9/19/07, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote: > > > > Warn when rcu_dereference() is not used in combination with rcu_read_lock() > > > > > > > > > > According to Paul it is fine to use RCU primitives (when accompanied > > > with proper comments) when the read-size critical section is guarded > > > by spin_lock_irqsave()/spin_lock_irqsrestore() instead of > > > rcu_read_lock()/rcu_read_unlock() and writers synchronize with > > > synchronize_sched(), not synchronize_rcu(). Your patch will trigger > > > warnign on such valid usages. > > > > > > > Sounds fragile to begin with. But you're right in that that is valid > > for Linux as you know it. However in -rt most/all spinlocks are > > converted to sleeping locks. In that case sync_sched() is not enough. > > OK, then it goes beyond RCU... We need to come up with something that > can be used to synchronize with IRQ handlers (quite often in driver > code one needs to be sure that current invocation of IRQ handler > completed before doing something). And once we have it splinlock + RCU > users can just use that method. But Peter's approach would not cause a problem here -- you wouldn't be doing an rcu_dereference from within the IRQ handler in this case, right? 
That said, we will need something to handle threaded interrupts, since synchronize_sched() only waits for hardirq, NMI, SMI, etc., and not threaded IRQs. Thanx, Paul ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC][PATCH 2/6] lockdep: validate rcu_dereference() vs rcu_read_lock() 2007-09-19 15:37 ` Paul E. McKenney @ 2007-09-19 16:59 ` Dmitry Torokhov 2007-09-19 17:32 ` Paul E. McKenney 0 siblings, 1 reply; 30+ messages in thread From: Dmitry Torokhov @ 2007-09-19 16:59 UTC (permalink / raw) To: paulmck Cc: Peter Zijlstra, linux-kernel, Ingo Molnar, Andrew Morton, Nick Piggin On 9/19/07, Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote: > On Wed, Sep 19, 2007 at 11:16:21AM -0400, Dmitry Torokhov wrote: > > On 9/19/07, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote: > > > On Wed, 19 Sep 2007 10:17:25 -0400 "Dmitry Torokhov" > > > <dmitry.torokhov@gmail.com> wrote: > > > > > > > Hi Peter, > > > > > > > > On 9/19/07, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote: > > > > > Warn when rcu_dereference() is not used in combination with rcu_read_lock() > > > > > > > > > > > > > According to Paul it is fine to use RCU primitives (when accompanied > > > > with proper comments) when the read-size critical section is guarded > > > > by spin_lock_irqsave()/spin_lock_irqsrestore() instead of > > > > rcu_read_lock()/rcu_read_unlock() and writers synchronize with > > > > synchronize_sched(), not synchronize_rcu(). Your patch will trigger > > > > warnign on such valid usages. > > > > > > > > > > Sounds fragile to begin with. But you're right in that that is valid > > > for Linux as you know it. However in -rt most/all spinlocks are > > > converted to sleeping locks. In that case sync_sched() is not enough. > > > > OK, then it goes beyond RCU... We need to come up with something that > > can be used to synchronize with IRQ handlers (quite often in driver > > code one needs to be sure that current invocation of IRQ handler > > completed before doing something). And once we have it splinlock + RCU > > users can just use that method. > > But Peter's approach would not cause a problem here -- you wouldn't be > doing an rcu_dereference from within the IRQ handler in this case, right? 
> Yes I do. Along with list_for_each_rcu(). > That said, we will need something to handle threaded interrupts, since > synchronize_sched() only waits for hardirq, NMI, SMI, etc., and not > threaded IRQs. > > Thanx, Paul > -- Dmitry ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC][PATCH 2/6] lockdep: validate rcu_dereference() vs rcu_read_lock() 2007-09-19 16:59 ` Dmitry Torokhov @ 2007-09-19 17:32 ` Paul E. McKenney 2007-09-19 17:48 ` Paul E. McKenney 0 siblings, 1 reply; 30+ messages in thread From: Paul E. McKenney @ 2007-09-19 17:32 UTC (permalink / raw) To: Dmitry Torokhov Cc: Peter Zijlstra, linux-kernel, Ingo Molnar, Andrew Morton, Nick Piggin On Wed, Sep 19, 2007 at 12:59:10PM -0400, Dmitry Torokhov wrote: > On 9/19/07, Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote: > > On Wed, Sep 19, 2007 at 11:16:21AM -0400, Dmitry Torokhov wrote: > > > On 9/19/07, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote: > > > > On Wed, 19 Sep 2007 10:17:25 -0400 "Dmitry Torokhov" > > > > <dmitry.torokhov@gmail.com> wrote: > > > > > > > > > Hi Peter, > > > > > > > > > > On 9/19/07, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote: > > > > > > Warn when rcu_dereference() is not used in combination with rcu_read_lock() > > > > > > > > > > > > > > > > According to Paul it is fine to use RCU primitives (when accompanied > > > > > with proper comments) when the read-size critical section is guarded > > > > > by spin_lock_irqsave()/spin_lock_irqsrestore() instead of > > > > > rcu_read_lock()/rcu_read_unlock() and writers synchronize with > > > > > synchronize_sched(), not synchronize_rcu(). Your patch will trigger > > > > > warnign on such valid usages. > > > > > > > > > > > > > Sounds fragile to begin with. But you're right in that that is valid > > > > for Linux as you know it. However in -rt most/all spinlocks are > > > > converted to sleeping locks. In that case sync_sched() is not enough. > > > > > > OK, then it goes beyond RCU... We need to come up with something that > > > can be used to synchronize with IRQ handlers (quite often in driver > > > code one needs to be sure that current invocation of IRQ handler > > > completed before doing something). And once we have it splinlock + RCU > > > users can just use that method. 
> > > > But Peter's approach would not cause a problem here -- you wouldn't be > > doing an rcu_dereference from within the IRQ handler in this case, right? > > Yes I do. Along with list_for_each_rcu(). OK, in that case it does indeed need to be handled. Thanx, Paul > > That said, we will need something to handle threaded interrupts, since > > synchronize_sched() only waits for hardirq, NMI, SMI, etc., and not > > threaded IRQs. > > > > Thanx, Paul > > > > -- > Dmitry ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC][PATCH 2/6] lockdep: validate rcu_dereference() vs rcu_read_lock() 2007-09-19 17:32 ` Paul E. McKenney @ 2007-09-19 17:48 ` Paul E. McKenney 2007-09-19 18:49 ` Dmitry Torokhov 0 siblings, 1 reply; 30+ messages in thread From: Paul E. McKenney @ 2007-09-19 17:48 UTC (permalink / raw) To: Dmitry Torokhov Cc: Peter Zijlstra, linux-kernel, Ingo Molnar, Andrew Morton, Nick Piggin On Wed, Sep 19, 2007 at 10:32:49AM -0700, Paul E. McKenney wrote: > On Wed, Sep 19, 2007 at 12:59:10PM -0400, Dmitry Torokhov wrote: > > On 9/19/07, Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote: > > > On Wed, Sep 19, 2007 at 11:16:21AM -0400, Dmitry Torokhov wrote: > > > > On 9/19/07, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote: > > > > > On Wed, 19 Sep 2007 10:17:25 -0400 "Dmitry Torokhov" > > > > > <dmitry.torokhov@gmail.com> wrote: > > > > > > > > > > > Hi Peter, > > > > > > > > > > > > On 9/19/07, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote: > > > > > > > Warn when rcu_dereference() is not used in combination with rcu_read_lock() > > > > > > > > > > > > > > > > > > > According to Paul it is fine to use RCU primitives (when accompanied > > > > > > with proper comments) when the read-size critical section is guarded > > > > > > by spin_lock_irqsave()/spin_lock_irqsrestore() instead of > > > > > > rcu_read_lock()/rcu_read_unlock() and writers synchronize with > > > > > > synchronize_sched(), not synchronize_rcu(). Your patch will trigger > > > > > > warnign on such valid usages. > > > > > > > > > > > > > > > > Sounds fragile to begin with. But you're right in that that is valid > > > > > for Linux as you know it. However in -rt most/all spinlocks are > > > > > converted to sleeping locks. In that case sync_sched() is not enough. > > > > > > > > OK, then it goes beyond RCU... 
We need to come up with something that > > > > can be used to synchronize with IRQ handlers (quite often in driver > > > > code one needs to be sure that current invocation of IRQ handler > > > > completed before doing something). And once we have it splinlock + RCU > > > > users can just use that method. > > > > > > But Peter's approach would not cause a problem here -- you wouldn't be > > > doing an rcu_dereference from within the IRQ handler in this case, right? > > > > Yes I do. Along with list_for_each_rcu(). > > OK, in that case it does indeed need to be handled. PS to previous -- any problem with inserting rcu_read_lock() and rcu_read_unlock() around the portion of the IRQ handler that has these accesses? Thanx, Paul ^ permalink raw reply [flat|nested] 30+ messages in thread
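Paul's suggestion, wrapping only the RCU-traversing portion of the IRQ handler in rcu_read_lock()/rcu_read_unlock(), might look like the following userspace sketch. Everything here is an invented stand-in (the stub lock functions, the handler pointer, the function names); real driver code would use rcu_dereference() or list_for_each_entry_rcu() against the actual kernel primitives.

```c
#include <assert.h>

/* Stubs standing in for the kernel read-side primitives. */
static int rcu_nesting;

static void stub_rcu_read_lock(void)   { rcu_nesting++; }
static void stub_rcu_read_unlock(void) { rcu_nesting--; }

static int double_it(int data) { return 2 * data; }

/* Stands in for an RCU-protected callback pointer in the driver. */
static int (*handler)(int) = double_it;

/* Only the RCU-traversing section is bracketed by the read lock. */
static int mock_irq_handler(int data)
{
	int ret;

	stub_rcu_read_lock();   /* satisfies the lockdep annotation */
	ret = handler(data);    /* the rcu_dereference()-style access */
	stub_rcu_read_unlock();

	return ret;
}
```

The point of the design is that the extra bracketing is free of contention (it is annotation, not a real lock) while making the read-side section visible to the validator.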
* Re: [RFC][PATCH 2/6] lockdep: validate rcu_dereference() vs rcu_read_lock() 2007-09-19 17:48 ` Paul E. McKenney @ 2007-09-19 18:49 ` Dmitry Torokhov 2007-09-19 19:41 ` Peter Zijlstra 0 siblings, 1 reply; 30+ messages in thread From: Dmitry Torokhov @ 2007-09-19 18:49 UTC (permalink / raw) To: paulmck Cc: Peter Zijlstra, linux-kernel, Ingo Molnar, Andrew Morton, Nick Piggin On 9/19/07, Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote: > On Wed, Sep 19, 2007 at 10:32:49AM -0700, Paul E. McKenney wrote: > > On Wed, Sep 19, 2007 at 12:59:10PM -0400, Dmitry Torokhov wrote: > > > On 9/19/07, Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote: > > > > On Wed, Sep 19, 2007 at 11:16:21AM -0400, Dmitry Torokhov wrote: > > > > > On 9/19/07, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote: > > > > > > On Wed, 19 Sep 2007 10:17:25 -0400 "Dmitry Torokhov" > > > > > > <dmitry.torokhov@gmail.com> wrote: > > > > > > > > > > > > > Hi Peter, > > > > > > > > > > > > > > On 9/19/07, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote: > > > > > > > > Warn when rcu_dereference() is not used in combination with rcu_read_lock() > > > > > > > > > > > > > > > > > > > > > > According to Paul it is fine to use RCU primitives (when accompanied > > > > > > > with proper comments) when the read-size critical section is guarded > > > > > > > by spin_lock_irqsave()/spin_lock_irqsrestore() instead of > > > > > > > rcu_read_lock()/rcu_read_unlock() and writers synchronize with > > > > > > > synchronize_sched(), not synchronize_rcu(). Your patch will trigger > > > > > > > warnign on such valid usages. > > > > > > > > > > > > > > > > > > > Sounds fragile to begin with. But you're right in that that is valid > > > > > > for Linux as you know it. However in -rt most/all spinlocks are > > > > > > converted to sleeping locks. In that case sync_sched() is not enough. > > > > > > > > > > OK, then it goes beyond RCU... 
We need to come up with something that > > > > > can be used to synchronize with IRQ handlers (quite often in driver > > > > > code one needs to be sure that current invocation of IRQ handler > > > > > completed before doing something). And once we have it spinlock + RCU > > > > > users can just use that method. > > > > > > > > But Peter's approach would not cause a problem here -- you wouldn't be > > > > doing an rcu_dereference from within the IRQ handler in this case, right? > > > > > > Yes I do. Along with list_for_each_rcu(). > > > > OK, in that case it does indeed need to be handled. > > PS to previous -- any problem with inserting rcu_read_lock() and > rcu_read_unlock() around the portion of the IRQ handler that has > these accesses? > I guess I could but it is an extra lock that needs to be managed and given the fact that it is not really needed (other than to make a newly developed tool happy) I am hesitant to do that. -- Dmitry ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC][PATCH 2/6] lockdep: validate rcu_dereference() vs rcu_read_lock() 2007-09-19 18:49 ` Dmitry Torokhov @ 2007-09-19 19:41 ` Peter Zijlstra 2007-09-19 19:49 ` Dmitry Torokhov 0 siblings, 1 reply; 30+ messages in thread From: Peter Zijlstra @ 2007-09-19 19:41 UTC (permalink / raw) To: Dmitry Torokhov Cc: paulmck, linux-kernel, Ingo Molnar, Andrew Morton, Nick Piggin On Wed, 19 Sep 2007 14:49:56 -0400 "Dmitry Torokhov" <dmitry.torokhov@gmail.com> wrote: > On 9/19/07, Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote: > > PS to previous -- any problem with inserting rcu_read_lock() and > > rcu_read_unlock() around the portion of the IRQ handler that has > > these accesses? > > > > I guess I could but it is an extra lock that needs to be managed and > given the fact that it is not really needed (other than to make a newly > developed tool happy) I am hesitant to do that. As is, these sites are a bug in -rt and we'll need to fix them anyway. As for the code you pointed me to, the i8042 driver, it seems to play way too funny tricks for a simple 'slow' driver. If you replace the spin_lock() + sync_sched() with rcu_read_lock() + call_rcu() it should work again without adding an extra lock. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC][PATCH 2/6] lockdep: validate rcu_dereference() vs rcu_read_lock() 2007-09-19 19:41 ` Peter Zijlstra @ 2007-09-19 19:49 ` Dmitry Torokhov 2007-09-19 20:13 ` Peter Zijlstra 2007-09-19 20:48 ` Paul E. McKenney 0 siblings, 2 replies; 30+ messages in thread From: Dmitry Torokhov @ 2007-09-19 19:49 UTC (permalink / raw) To: Peter Zijlstra Cc: paulmck, linux-kernel, Ingo Molnar, Andrew Morton, Nick Piggin On 9/19/07, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote: > On Wed, 19 Sep 2007 14:49:56 -0400 "Dmitry Torokhov" > <dmitry.torokhov@gmail.com> wrote: > > > On 9/19/07, Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote: > > > > PS to previous -- any problem with inserting rcu_read_lock() and > > > rcu_read_unlock() around the portion of the IRQ handler that has > > > these accesses? > > > > > > > I guess I could but it is an extra lock that needs to be managed and > > given the fact that it is not really needed (other than to make a newly > > developed tool happy) I am hesitant to do that. > > As is, these sites are a bug in -rt and we'll need to fix them anyway. > > As for the code you pointed me to, the i8042 driver, it seems to play > way too funny tricks for a simple 'slow' driver. Even a "slow" driver should try not to slow down the rest of the system if it can help it. I am sorry if the things it does do not quite fit in with the changes you are proposing but it does not make the existing code invalid. > > If you replace the spin_lock() + sync_sched() with rcu_read_lock() + > call_rcu() it should work again without adding an extra lock. > Except that I need spin_lock_irq for other reasons. I could take the same lock in write-side code and not use RCU at all but using RCU allows opening/closing input devices without slowing down interrupt handlers so why not use it? -- Dmitry ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC][PATCH 2/6] lockdep: validate rcu_dereference() vs rcu_read_lock() 2007-09-19 19:49 ` Dmitry Torokhov @ 2007-09-19 20:13 ` Peter Zijlstra 2007-09-19 20:41 ` Dmitry Torokhov 2007-09-19 20:48 ` Paul E. McKenney 1 sibling, 1 reply; 30+ messages in thread From: Peter Zijlstra @ 2007-09-19 20:13 UTC (permalink / raw) To: Dmitry Torokhov Cc: paulmck, linux-kernel, Ingo Molnar, Andrew Morton, Nick Piggin On Wed, 19 Sep 2007 15:49:24 -0400 "Dmitry Torokhov" <dmitry.torokhov@gmail.com> wrote: > On 9/19/07, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote: > > On Wed, 19 Sep 2007 14:49:56 -0400 "Dmitry Torokhov" > > <dmitry.torokhov@gmail.com> wrote: > > > > > On 9/19/07, Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote: > > > > > > PS to previous -- any problem with inserting rcu_read_lock() and > > > > rcu_read_unlock() around the portion of the IRQ handler that has > > > > these accesses? > > > > > > > > > > I guess I could but it is an extra lock that needs to be managed and > > > given the fact that it is not really needed (other to make a newly > > > developed tool happy) I am hestsant to do that. > > > > As is, these sites are a bug in -rt and we'll need to fix them anyway. > > > > As for the code you pointed me to, the i8042 driver, it seems to play > > way to funny tricks for a simple 'slow' driver. > > Even "slow" driver should try not to slow down the rest of the system > if it can help it. I am sorry if the thing it does do not quite fit in > with the changes you are proposing but it does not make the exeisting > code invalid. > > > > > If you replace the spin_lock() + sync_sched(), with rcu_read_lock() + > > rcu_call() it should work again without adding an extra lock. > > > > Except that I need spin_lock_irq for other reasons. I could take the > same lock in write-side code and not use RCU at all but using RCU > allows opening/closing input devices without slowing down interrupt > handlers so why not use it? 
If the IRQ handler does rcu_read_lock(),unlock() and the i8042_stop() function does sync_rcu() instead of _sched(), it should be good again. It will not affect anything other than the task that calls _stop(). And even there the only change is that the sleep might be a tad longer. I find it curious that a driver that is 'low performant' and does not suffer lock contention pioneers locking schemes. I agree with optimizing, but this is not the place to push the envelope. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC][PATCH 2/6] lockdep: validate rcu_dereference() vs rcu_read_lock() 2007-09-19 20:13 ` Peter Zijlstra @ 2007-09-19 20:41 ` Dmitry Torokhov 2007-09-19 21:19 ` Peter Zijlstra 0 siblings, 1 reply; 30+ messages in thread From: Dmitry Torokhov @ 2007-09-19 20:41 UTC (permalink / raw) To: Peter Zijlstra Cc: paulmck, linux-kernel, Ingo Molnar, Andrew Morton, Nick Piggin On 9/19/07, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote: > On Wed, 19 Sep 2007 15:49:24 -0400 "Dmitry Torokhov" > <dmitry.torokhov@gmail.com> wrote: > > > On 9/19/07, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote: > > > On Wed, 19 Sep 2007 14:49:56 -0400 "Dmitry Torokhov" > > > <dmitry.torokhov@gmail.com> wrote: > > > > > > > On 9/19/07, Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote: > > > > > > > > PS to previous -- any problem with inserting rcu_read_lock() and > > > > > rcu_read_unlock() around the portion of the IRQ handler that has > > > > > these accesses? > > > > > > > > > > > > > I guess I could but it is an extra lock that needs to be managed and > > > > given the fact that it is not really needed (other to make a newly > > > > developed tool happy) I am hestsant to do that. > > > > > > As is, these sites are a bug in -rt and we'll need to fix them anyway. > > > > > > As for the code you pointed me to, the i8042 driver, it seems to play > > > way to funny tricks for a simple 'slow' driver. > > > > Even "slow" driver should try not to slow down the rest of the system > > if it can help it. I am sorry if the thing it does do not quite fit in > > with the changes you are proposing but it does not make the exeisting > > code invalid. > > > > > > > > If you replace the spin_lock() + sync_sched(), with rcu_read_lock() + > > > rcu_call() it should work again without adding an extra lock. > > > > > > > Except that I need spin_lock_irq for other reasons. 
I could take the > > same lock in write-side code and not use RCU at all but using RCU > > allows opening/closing input devices without slowing down interrupt > > handlers so why not use it? > > If the IRQ handler does rcu_read_lock(),unlock() and the i8042_stop() > function does sync_rcu() instead of _sched(), it should be good again. > It will not affect anything other than the task that calls _stop(). And > even there the only change is that the sleep might be a tad longer. And the IRQ handler needs to do some extra work... Anyway, it looks like -rt breaks synchronize_sched() and needs to have it fixed: "/** * synchronize_sched - block until all CPUs have exited any non-preemptive * kernel code sequences. * * This means that all preempt_disable code sequences, including NMI and * hardware-interrupt handlers, in progress on entry will have completed * before this primitive returns." > > I find it curious that a driver that is 'low performant' and does not > suffer lock contention pioneers locking schemes. I agree with > optimizing, but this is not the place to push the envelope. Please realize that every microsecond wasted on a 'low performant' driver is taken from high performers and if we can help it why shouldn't we? -- Dmitry ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC][PATCH 2/6] lockdep: validate rcu_dereference() vs rcu_read_lock() 2007-09-19 20:41 ` Dmitry Torokhov @ 2007-09-19 21:19 ` Peter Zijlstra 2007-09-19 21:29 ` Dmitry Torokhov 0 siblings, 1 reply; 30+ messages in thread From: Peter Zijlstra @ 2007-09-19 21:19 UTC (permalink / raw) To: Dmitry Torokhov Cc: paulmck, linux-kernel, Ingo Molnar, Andrew Morton, Nick Piggin On Wed, 19 Sep 2007 16:41:04 -0400 "Dmitry Torokhov" <dmitry.torokhov@gmail.com> wrote: > > If the IRQ handler does rcu_read_lock(),unlock() and the i8042_stop() > > function does sync_rcu() instead of _sched(), it should be good again. > > It will not affect anything other than the task that calls _stop(). And > > even there the only change is that the sleep might be a tad longer. > > And the IRQ handler needs to do some extra work... Anyway, it looks like -rt > breaks synchronize_sched() and needs to have it fixed: > > "/** > * synchronize_sched - block until all CPUs have exited any non-preemptive > * kernel code sequences. > * > * This means that all preempt_disable code sequences, including NMI and > * hardware-interrupt handlers, in progress on entry will have completed > * before this primitive returns." That still does as it says in -rt. It's just that the interrupt handler will be preemptible so the guarantees it gives are useless. > > I find it curious that a driver that is 'low performant' and does not > > suffer lock contention pioneers locking schemes. I agree with > > optimizing, but this is not the place to push the envelope. > > Please realize that every microsecond wasted on a 'low performant' > driver is taken from high performers and if we can help it why > shouldn't we? sure, but the cache eviction caused by running the driver will have more impact than the added rcu_read_{,un}lock() calls. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC][PATCH 2/6] lockdep: validate rcu_dereference() vs rcu_read_lock() 2007-09-19 21:19 ` Peter Zijlstra @ 2007-09-19 21:29 ` Dmitry Torokhov 2007-09-19 21:47 ` Peter Zijlstra 0 siblings, 1 reply; 30+ messages in thread From: Dmitry Torokhov @ 2007-09-19 21:29 UTC (permalink / raw) To: Peter Zijlstra Cc: paulmck, linux-kernel, Ingo Molnar, Andrew Morton, Nick Piggin On 9/19/07, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote: > On Wed, 19 Sep 2007 16:41:04 -0400 "Dmitry Torokhov" > <dmitry.torokhov@gmail.com> wrote: > > > > If the IRQ handler does rcu_read_lock(),unlock() and the i8042_stop() > > > function does sync_rcu() instead of _sched(), it should be good again. > > > It will not affect anything else than the task that calls _stop(). And > > > even there the only change is that the sleep might be a tad longer. > > > > And the IRQ handler needs to do some extra job... Anyway, it looks -rt > > breaks synchronize_sched() and needs to have it fixed: > > > > "/** > > * synchronize_sched - block until all CPUs have exited any non-preemptive > > * kernel code sequences. > > * > > * This means that all preempt_disable code sequences, including NMI and > > * hardware-interrupt handlers, in progress on entry will have completed > > * before this primitive returns." > > That still does as it says in -rt. Its just that the interrupt handler > will be preemptible so the guarantees it gives are useless. Please note "... including NMI and hardware-interrupt handlers ..." > > > > I find it curious that a driver that is 'low performant' and does not > > > suffer lock contention pioneers locking schemes. I agree with > > > optimizing, but this is not the place to push the envelope. > > > > Please realize that evey microsecond wasted on a 'low performant' > > driver is taken from high performers and if we can help it why > > shouldn't we? > > sure, but the cache eviction caused by running the driver will have > more impact than the added rcu_read_{,un}lock() calls. 
Are you saying that adding rcu_read_{,un}lock() will help with cache eviction? How? -- Dmitry ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC][PATCH 2/6] lockdep: validate rcu_dereference() vs rcu_read_lock() 2007-09-19 21:29 ` Dmitry Torokhov @ 2007-09-19 21:47 ` Peter Zijlstra 2007-09-20 17:31 ` Dmitry Torokhov 0 siblings, 1 reply; 30+ messages in thread From: Peter Zijlstra @ 2007-09-19 21:47 UTC (permalink / raw) To: Dmitry Torokhov Cc: paulmck, linux-kernel, Ingo Molnar, Andrew Morton, Nick Piggin On Wed, 19 Sep 2007 17:29:09 -0400 "Dmitry Torokhov" <dmitry.torokhov@gmail.com> wrote: > On 9/19/07, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote: > > On Wed, 19 Sep 2007 16:41:04 -0400 "Dmitry Torokhov" > > <dmitry.torokhov@gmail.com> wrote: > > > > > > If the IRQ handler does rcu_read_lock(),unlock() and the i8042_stop() > > > > function does sync_rcu() instead of _sched(), it should be good again. > > > > It will not affect anything else than the task that calls _stop(). And > > > > even there the only change is that the sleep might be a tad longer. > > > > > > And the IRQ handler needs to do some extra job... Anyway, it looks -rt > > > breaks synchronize_sched() and needs to have it fixed: > > > > > > "/** > > > * synchronize_sched - block until all CPUs have exited any non-preemptive > > > * kernel code sequences. > > > * > > > * This means that all preempt_disable code sequences, including NMI and > > > * hardware-interrupt handlers, in progress on entry will have completed > > > * before this primitive returns." > > > > That still does as it says in -rt. Its just that the interrupt handler > > will be preemptible so the guarantees it gives are useless. > > Please note "... including NMI and hardware-interrupt handlers ..." -rt doesn't run interrupt handlers in hardware irq context anymore. > > > > > > I find it curious that a driver that is 'low performant' and does not > > > > suffer lock contention pioneers locking schemes. I agree with > > > > optimizing, but this is not the place to push the envelope. 
> > > > > > Please realize that every microsecond wasted on a 'low performant' > > > driver is taken from high performers and if we can help it why > > > shouldn't we? > > > > sure, but the cache eviction caused by running the driver will have > > more impact than the added rcu_read_{,un}lock() calls. > > Are you saying that adding rcu_read_{,un}lock() will help with cache > eviction? How? No, I'm saying that it's noise compared to the cache eviction overhead it causes for others. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC][PATCH 2/6] lockdep: validate rcu_dereference() vs rcu_read_lock() 2007-09-19 21:47 ` Peter Zijlstra @ 2007-09-20 17:31 ` Dmitry Torokhov 2007-09-21 0:01 ` Paul E. McKenney 0 siblings, 1 reply; 30+ messages in thread From: Dmitry Torokhov @ 2007-09-20 17:31 UTC (permalink / raw) To: Peter Zijlstra Cc: paulmck, linux-kernel, Ingo Molnar, Andrew Morton, Nick Piggin On 9/19/07, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote: > On Wed, 19 Sep 2007 17:29:09 -0400 "Dmitry Torokhov" > <dmitry.torokhov@gmail.com> wrote: > > > On 9/19/07, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote: > > > On Wed, 19 Sep 2007 16:41:04 -0400 "Dmitry Torokhov" > > > <dmitry.torokhov@gmail.com> wrote: > > > > > > > > If the IRQ handler does rcu_read_lock(),unlock() and the i8042_stop() > > > > > function does sync_rcu() instead of _sched(), it should be good again. > > > > > It will not affect anything else than the task that calls _stop(). And > > > > > even there the only change is that the sleep might be a tad longer. > > > > > > > > And the IRQ handler needs to do some extra job... Anyway, it looks -rt > > > > breaks synchronize_sched() and needs to have it fixed: > > > > > > > > "/** > > > > * synchronize_sched - block until all CPUs have exited any non-preemptive > > > > * kernel code sequences. > > > > * > > > > * This means that all preempt_disable code sequences, including NMI and > > > > * hardware-interrupt handlers, in progress on entry will have completed > > > > * before this primitive returns." > > > > > > That still does as it says in -rt. Its just that the interrupt handler > > > will be preemptible so the guarantees it gives are useless. > > > > Please note "... including NMI and hardware-interrupt handlers ..." > > -rt doesn't run interrupt handlers in hardware irq context anymore. OK, then what is the purpose of synchronize_sched() in -rt? You really need to provide users with a replacement. 
There are several drivers that use it and for example r8169 is not what you'd call a 'low performer'. I guess I can switch i8042 to use synchronize_irq(). That still works in -rt, doesn't it? That still leaves atkbd... > > > > > > > > > I find it curious that a driver that is 'low performant' and does not > > > > > suffer lock contention pioneers locking schemes. I agree with > > > > > optimizing, but this is not the place to push the envelope. > > > > > > > > Please realize that every microsecond wasted on a 'low performant' > > > > driver is taken from high performers and if we can help it why > > > > shouldn't we? > > > > > > sure, but the cache eviction caused by running the driver will have > > > more impact than the added rcu_read_{,un}lock() calls. > > > > Are you saying that adding rcu_read_{,un}lock() will help with cache > > eviction? How? > > No, I'm saying that it's noise compared to the cache eviction overhead > it causes for others. > What about udelay(10)? It is probably also noise but we should not go and sprinkle it through drivers, should we? ;) -- Dmitry ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC][PATCH 2/6] lockdep: validate rcu_dereference() vs rcu_read_lock() 2007-09-20 17:31 ` Dmitry Torokhov @ 2007-09-21 0:01 ` Paul E. McKenney 2007-09-21 14:15 ` Dmitry Torokhov 0 siblings, 1 reply; 30+ messages in thread From: Paul E. McKenney @ 2007-09-21 0:01 UTC (permalink / raw) To: Dmitry Torokhov Cc: Peter Zijlstra, linux-kernel, Ingo Molnar, Andrew Morton, Nick Piggin On Thu, Sep 20, 2007 at 01:31:35PM -0400, Dmitry Torokhov wrote: > On 9/19/07, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote: > > On Wed, 19 Sep 2007 17:29:09 -0400 "Dmitry Torokhov" > > <dmitry.torokhov@gmail.com> wrote: > > > > > On 9/19/07, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote: > > > > On Wed, 19 Sep 2007 16:41:04 -0400 "Dmitry Torokhov" > > > > <dmitry.torokhov@gmail.com> wrote: > > > > > > > > > > If the IRQ handler does rcu_read_lock(),unlock() and the i8042_stop() > > > > > > function does sync_rcu() instead of _sched(), it should be good again. > > > > > > It will not affect anything else than the task that calls _stop(). And > > > > > > even there the only change is that the sleep might be a tad longer. > > > > > > > > > > And the IRQ handler needs to do some extra job... Anyway, it looks -rt > > > > > breaks synchronize_sched() and needs to have it fixed: > > > > > > > > > > "/** > > > > > * synchronize_sched - block until all CPUs have exited any non-preemptive > > > > > * kernel code sequences. > > > > > * > > > > > * This means that all preempt_disable code sequences, including NMI and > > > > > * hardware-interrupt handlers, in progress on entry will have completed > > > > > * before this primitive returns." > > > > > > > > That still does as it says in -rt. Its just that the interrupt handler > > > > will be preemptible so the guarantees it gives are useless. > > > > > > Please note "... including NMI and hardware-interrupt handlers ..." > > > > -rt doesn't run interrupt handlers in hardware irq context anymore. 
> > OK, then what is the purpose of synchronize_sched() in -rt? To wait for all preempt-disable, irq-disable, hard-irq, and SMI/NMI code sequences to complete. > You really need to provide users with a replacement. There are several > drivers that use it and for example r8169 is not what you'd call a > 'low performer'. I did look at making a synchronize_all_irq() some time back, and all the approaches I came up with at the time were busted. But I just took another look, and I think I see a way to handle it. Either that, or I simply forgot the way in which this approach is broken... I will stare at it some more. > I guess I can switch i8042 to use synchronize_irq(). That still works > in -rt, doesn't it? That still leaves atkbd... Yep, looks that way to me. The only difference that I can see is that in -rt, concurrent synchronize_irq() calls on the same descriptor mean that the guy that gets there second has to wait for the next interrupt to happen. > > > > > > I find it curious that a driver that is 'low performant' and does not > > > > > > suffer lock contention pioneers locking schemes. I agree with > > > > > > optimizing, but this is not the place to push the envelope. > > > > > > > > > > Please realize that every microsecond wasted on a 'low performant' > > > > > driver is taken from high performers and if we can help it why > > > > > shouldn't we? > > > > sure, but the cache eviction caused by running the driver will have > > > > more impact than the added rcu_read_{,un}lock() calls. > > > > > > Are you saying that adding rcu_read_{,un}lock() will help with cache > > > eviction? How? > > > > No, I'm saying that it's noise compared to the cache eviction overhead > > it causes for others. > > What about udelay(10)? It is probably also noise but we should not > go and sprinkle it through drivers, should we? ;) Agreed! 
On the other hand, udelay(10) is more than two orders of magnitude slower than an rcu_read_lock() / rcu_read_unlock() round trip in -rt, and a full three orders of magnitude slower in CONFIG_PREEMPT. As for non-CONFIG_PREEMPT, well, "free is a very good price". ;-) Thanx, Paul ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC][PATCH 2/6] lockdep: validate rcu_dereference() vs rcu_read_lock() 2007-09-21 0:01 ` Paul E. McKenney @ 2007-09-21 14:15 ` Dmitry Torokhov 2007-09-21 14:30 ` Peter Zijlstra 0 siblings, 1 reply; 30+ messages in thread From: Dmitry Torokhov @ 2007-09-21 14:15 UTC (permalink / raw) To: paulmck Cc: Peter Zijlstra, linux-kernel, Ingo Molnar, Andrew Morton, Nick Piggin On 9/20/07, Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote: > On Thu, Sep 20, 2007 at 01:31:35PM -0400, Dmitry Torokhov wrote: > > > > OK, then what is the purpose of synchronize_sched() in -rt? > > To wait for all preempt-disable, irq-disable, hard-irq, and SMI/NMI code > sequences to complete. OK, so what about spin_lock_irq[save]? Does it disable IRQs in -rt or not anymore? If IRQs are disabled it appears that I can continue using synchronize_sched(). > > > You really need to provide users with a replacement. There are several > > drivers that use it and for example r8169 is not what you'd call a > > 'low performer'. > > I did look at making a synchronize_all_irq() some time back, and all > the approaches I came up with at the time were busted. > > But I just took another look, and I think I see a way to handle it. > Either that, or I simply forgot the way in which this approach is > broken... > > I will stare at it some more. > Thank you. > > I guess I can switch i8042 to use synchronize_irq(). That still works > > in -rt, doesn't it? That still leaves atkbd... > > Yep, looks that way to me. The only difference that I can see is that > in -rt, concurrent synchronize_irq() calls on the same descriptor mean > that the guy that gets there second has to wait for the next interrupt > to happen. > Does this mean that there is a possibility for a thread to hang in synchronize_irq() if that second IRQ never comes? -- Dmitry ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC][PATCH 2/6] lockdep: validate rcu_dereference() vs rcu_read_lock() 2007-09-21 14:15 ` Dmitry Torokhov @ 2007-09-21 14:30 ` Peter Zijlstra 0 siblings, 0 replies; 30+ messages in thread From: Peter Zijlstra @ 2007-09-21 14:30 UTC (permalink / raw) To: Dmitry Torokhov Cc: paulmck, linux-kernel, Ingo Molnar, Andrew Morton, Nick Piggin On Fri, 21 Sep 2007 10:15:10 -0400 "Dmitry Torokhov" <dmitry.torokhov@gmail.com> wrote: > On 9/20/07, Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote: > > On Thu, Sep 20, 2007 at 01:31:35PM -0400, Dmitry Torokhov wrote: > > > > > > OK, then what is the purpose of synchronize_sched() in -rt? > > > > To wait for all preempt-disable, irq-disable, hard-irq, and SMI/NMI code > > sequences to complete. > > OK, so what about spin_lock_irq[save]? Does it disable IRQs in -rt or not anymore? > If IRQs are disabled it appears that I can continue using synchronize_sched(). No, it will not disable IRQs anymore. > > > > > You really need to provide users with a replacement. There are several > > > drivers that use it and for example r8169 is not what you'd call a > > > 'low performer'. > > > > I did look at making a synchronize_all_irq() some time back, and all > > the approaches I came up with at the time were busted. > > > > But I just took another look, and I think I see a way to handle it. > > Either that, or I simply forgot the way in which this approach is > > broken... > > > > I will stare at it some more. > > > > Thank you. I think the patch posted by Paul earlier today should work. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC][PATCH 2/6] lockdep: validate rcu_dereference() vs rcu_read_lock() 2007-09-19 19:49 ` Dmitry Torokhov 2007-09-19 20:13 ` Peter Zijlstra @ 2007-09-19 20:48 ` Paul E. McKenney 1 sibling, 0 replies; 30+ messages in thread From: Paul E. McKenney @ 2007-09-19 20:48 UTC (permalink / raw) To: Dmitry Torokhov Cc: Peter Zijlstra, linux-kernel, Ingo Molnar, Andrew Morton, Nick Piggin On Wed, Sep 19, 2007 at 03:49:24PM -0400, Dmitry Torokhov wrote: > On 9/19/07, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote: > > On Wed, 19 Sep 2007 14:49:56 -0400 "Dmitry Torokhov" > > <dmitry.torokhov@gmail.com> wrote: > > > > > On 9/19/07, Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote: > > > > > > PS to previous -- any problem with inserting rcu_read_lock() and > > > > rcu_read_unlock() around the portion of the IRQ handler that has > > > > these accesses? > > > > > > > > > > I guess I could but it is an extra lock that needs to be managed and > > > given the fact that it is not really needed (other to make a newly > > > developed tool happy) I am hestsant to do that. > > > > As is, these sites are a bug in -rt and we'll need to fix them anyway. > > > > As for the code you pointed me to, the i8042 driver, it seems to play > > way to funny tricks for a simple 'slow' driver. > > Even "slow" driver should try not to slow down the rest of the system > if it can help it. I am sorry if the thing it does do not quite fit in > with the changes you are proposing but it does not make the exeisting > code invalid. > > > If you replace the spin_lock() + sync_sched(), with rcu_read_lock() + > > rcu_call() it should work again without adding an extra lock. > > Except that I need spin_lock_irq for other reasons. I could take the > same lock in write-side code and not use RCU at all but using RCU > allows opening/closing input devices without slowing down interrupt > handlers so why not use it? One approach would be to make rcu_read_held() check for in_interrupt() or some such. 
This would allow Dmitry's code to stay as it is, for the moment at least. Thanx, Paul ^ permalink raw reply [flat|nested] 30+ messages in thread
* [RFC][PATCH 3/6] lockdep: rcu_dereference() vs preempt_disable()
  2007-09-19 10:41 [RFC][PATCH 0/6] using lockdep to validate rcu usage Peter Zijlstra
  2007-09-19 10:41 ` [RFC][PATCH 1/6] lockdep: annotate rcu_read_{,un}lock{,_bh} Peter Zijlstra
  2007-09-19 10:41 ` [RFC][PATCH 2/6] lockdep: validate rcu_dereference() vs rcu_read_lock() Peter Zijlstra
@ 2007-09-19 10:41 ` Peter Zijlstra
  2007-09-19 10:41 ` [RFC][PATCH 4/6] implicit vs explicit preempt_disable() Peter Zijlstra
  ` (3 subsequent siblings)
  6 siblings, 0 replies; 30+ messages in thread
From: Peter Zijlstra @ 2007-09-19 10:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Paul E. McKenney, Ingo Molnar, Andrew Morton, Peter Zijlstra,
	Nick Piggin

[-- Attachment #1: rcu-lockdep-preempt.patch --]
[-- Type: text/plain, Size: 3392 bytes --]

Don't warn when preemption is disabled using preempt_disable()

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 include/linux/preempt.h |   19 +++++++++++--------
 kernel/sched.c          |   14 ++++++++++----
 lib/Kconfig.debug       |    1 +
 3 files changed, 22 insertions(+), 12 deletions(-)

Index: linux-2.6/kernel/sched.c
===================================================================
--- linux-2.6.orig/kernel/sched.c
+++ linux-2.6/kernel/sched.c
@@ -3396,7 +3396,7 @@ void scheduler_tick(void)

 #if defined(CONFIG_PREEMPT) && defined(CONFIG_DEBUG_PREEMPT)

-void fastcall add_preempt_count(int val)
+void fastcall __add_preempt_count(int val, int exp)
 {
 	/*
 	 * Underflow?
@@ -3409,10 +3409,13 @@ void fastcall add_preempt_count(int val)
 	 */
 	DEBUG_LOCKS_WARN_ON((preempt_count() & PREEMPT_MASK) >= PREEMPT_MASK - 10);
+
+	if (exp)
+		rcu_read_acquire();
 }
-EXPORT_SYMBOL(add_preempt_count);
+EXPORT_SYMBOL(__add_preempt_count);

-void fastcall sub_preempt_count(int val)
+void fastcall __sub_preempt_count(int val, int exp)
 {
 	/*
 	 * Underflow?
@@ -3427,8 +3430,11 @@ void fastcall sub_preempt_count(int val)
 		return;

 	preempt_count() -= val;
+
+	if (exp)
+		rcu_read_release();
 }
-EXPORT_SYMBOL(sub_preempt_count);
+EXPORT_SYMBOL(__sub_preempt_count);

 #endif

Index: linux-2.6/include/linux/preempt.h
===================================================================
--- linux-2.6.orig/include/linux/preempt.h
+++ linux-2.6/include/linux/preempt.h
@@ -11,15 +11,18 @@
 #include <linux/list.h>

 #ifdef CONFIG_DEBUG_PREEMPT
-  extern void fastcall add_preempt_count(int val);
-  extern void fastcall sub_preempt_count(int val);
+  extern void fastcall __add_preempt_count(int val, int exp);
+  extern void fastcall __sub_preempt_count(int val, int exp);
 #else
-# define add_preempt_count(val)	do { preempt_count() += (val); } while (0)
-# define sub_preempt_count(val)	do { preempt_count() -= (val); } while (0)
+# define __add_preempt_count(val, exp)	do { preempt_count() += (val); } while (0)
+# define __sub_preempt_count(val, exp)	do { preempt_count() -= (val); } while (0)
 #endif

-#define inc_preempt_count() add_preempt_count(1)
-#define dec_preempt_count() sub_preempt_count(1)
+#define add_preempt_count(val) __add_preempt_count(val, 0);
+#define sub_preempt_count(val) __sub_preempt_count(val, 0);
+
+#define inc_preempt_count() __add_preempt_count(1, 0)
+#define dec_preempt_count() __sub_preempt_count(1, 0)

 #define preempt_count()	(current_thread_info()->preempt_count)

@@ -29,14 +32,14 @@ asmlinkage void preempt_schedule(void);

 #define preempt_disable() \
 do { \
-	inc_preempt_count(); \
+	__add_preempt_count(1, 1); \
 	barrier(); \
 } while (0)

 #define preempt_enable_no_resched() \
 do { \
 	barrier(); \
-	dec_preempt_count(); \
+	__sub_preempt_count(1, 1); \
 } while (0)

 #define preempt_check_resched() \

Index: linux-2.6/lib/Kconfig.debug
===================================================================
--- linux-2.6.orig/lib/Kconfig.debug
+++ linux-2.6/lib/Kconfig.debug
@@ -237,6 +237,7 @@ config DEBUG_SEMAPHORE

 config DEBUG_LOCK_ALLOC
 	bool "Lock debugging: detect incorrect freeing of live locks"
 	depends on DEBUG_KERNEL && TRACE_IRQFLAGS_SUPPORT && STACKTRACE_SUPPORT && LOCKDEP_SUPPORT
+	select DEBUG_PREEMPT
 	select DEBUG_SPINLOCK
 	select DEBUG_MUTEXES
 	select LOCKDEP

--

^ permalink raw reply	[flat|nested] 30+ messages in thread
* [RFC][PATCH 4/6] implicit vs explicit preempt_disable()
  2007-09-19 10:41 [RFC][PATCH 0/6] using lockdep to validate rcu usage Peter Zijlstra
  ` (2 preceding siblings ...)
  2007-09-19 10:41 ` [RFC][PATCH 3/6] lockdep: rcu_dereference() vs preempt_disable() Peter Zijlstra
@ 2007-09-19 10:41 ` Peter Zijlstra
  2007-09-19 10:41 ` [RFC][PATCH 5/6] fixup funny preemption tricks in irq_exit Peter Zijlstra
  ` (2 subsequent siblings)
  6 siblings, 0 replies; 30+ messages in thread
From: Peter Zijlstra @ 2007-09-19 10:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Paul E. McKenney, Ingo Molnar, Andrew Morton, Peter Zijlstra,
	Nick Piggin

[-- Attachment #1: preempt-spinlock.patch --]
[-- Type: text/plain, Size: 14165 bytes --]

Decouple preempt_disable() from the locking primitives, so that we can
test for proper rcu_dereference() context regardless of the locking model.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 include/linux/preempt.h |   37 +++++++++++++++++++----
 kernel/sched.c          |   10 +++---
 kernel/spinlock.c       |   76 ++++++++++++++++++++++++------------------------
 lib/kernel_lock.c       |   16 +++++-----
 4 files changed, 82 insertions(+), 57 deletions(-)

Index: linux-2.6/include/linux/preempt.h
===================================================================
--- linux-2.6.orig/include/linux/preempt.h
+++ linux-2.6/include/linux/preempt.h
@@ -30,6 +30,33 @@
 asmlinkage void preempt_schedule(void);

+#define preempt_check_resched() \
+do { \
+	if (unlikely(test_thread_flag(TIF_NEED_RESCHED))) \
+		preempt_schedule(); \
+} while (0)
+
+
+#define _preempt_disable() \
+do { \
+	__add_preempt_count(1, 0); \
+	barrier(); \
+} while (0)
+
+#define _preempt_enable_no_resched() \
+do { \
+	barrier(); \
+	__sub_preempt_count(1, 0); \
+} while (0)
+
+#define _preempt_enable() \
+do { \
+	_preempt_enable_no_resched(); \
+	barrier(); \
+	preempt_check_resched(); \
+} while (0)
+
+
 #define preempt_disable() \
 do { \
 	__add_preempt_count(1, 1); \
@@ -42,12 +69,6 @@ do { \
 	__sub_preempt_count(1, 1); \
 } while (0)

-#define preempt_check_resched() \
-do { \
-	if (unlikely(test_thread_flag(TIF_NEED_RESCHED))) \
-		preempt_schedule(); \
-} while (0)
-
 #define preempt_enable() \
 do { \
 	preempt_enable_no_resched(); \
 	barrier(); \
 	preempt_check_resched(); \
 } while (0)
@@ -57,6 +78,10 @@ do { \

 #else

+#define _preempt_disable()		do { } while (0)
+#define _preempt_enable_no_resched()	do { } while (0)
+#define _preempt_enable()		do { } while (0)
+
 #define preempt_disable()		do { } while (0)
 #define preempt_enable_no_resched()	do { } while (0)
 #define preempt_enable()		do { } while (0)

Index: linux-2.6/kernel/spinlock.c
===================================================================
--- linux-2.6.orig/kernel/spinlock.c
+++ linux-2.6/kernel/spinlock.c
@@ -23,39 +23,39 @@

 int __lockfunc _spin_trylock(spinlock_t *lock)
 {
-	preempt_disable();
+	_preempt_disable();
 	if (_raw_spin_trylock(lock)) {
 		spin_acquire(&lock->dep_map, 0, 1, _RET_IP_);
 		return 1;
 	}

-	preempt_enable();
+	_preempt_enable();
 	return 0;
 }
 EXPORT_SYMBOL(_spin_trylock);

 int __lockfunc _read_trylock(rwlock_t *lock)
 {
-	preempt_disable();
+	_preempt_disable();
 	if (_raw_read_trylock(lock)) {
 		rwlock_acquire_read(&lock->dep_map, 0, 1, _RET_IP_);
 		return 1;
 	}

-	preempt_enable();
+	_preempt_enable();
 	return 0;
 }
 EXPORT_SYMBOL(_read_trylock);

 int __lockfunc _write_trylock(rwlock_t *lock)
 {
-	preempt_disable();
+	_preempt_disable();
 	if (_raw_write_trylock(lock)) {
 		rwlock_acquire(&lock->dep_map, 0, 1, _RET_IP_);
 		return 1;
 	}

-	preempt_enable();
+	_preempt_enable();
 	return 0;
 }
 EXPORT_SYMBOL(_write_trylock);

@@ -70,7 +70,7 @@ EXPORT_SYMBOL(_write_trylock);

 void __lockfunc _read_lock(rwlock_t *lock)
 {
-	preempt_disable();
+	_preempt_disable();
 	rwlock_acquire_read(&lock->dep_map, 0, 0, _RET_IP_);
 	LOCK_CONTENDED(lock, _raw_read_trylock, _raw_read_lock);
 }
@@ -81,7 +81,7 @@ unsigned long __lockfunc _spin_lock_irqs
 	unsigned long flags;

 	local_irq_save(flags);
-	preempt_disable();
+	_preempt_disable();
 	spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
 	/*
 	 * On lockdep we dont want the hand-coded irq-enable of
@@ -100,7 +100,7 @@ EXPORT_SYMBOL(_spin_lock_irqsave);
 void __lockfunc _spin_lock_irq(spinlock_t *lock)
 {
 	local_irq_disable();
-	preempt_disable();
+	_preempt_disable();
 	spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
 	LOCK_CONTENDED(lock, _raw_spin_trylock, _raw_spin_lock);
 }
@@ -109,7 +109,7 @@ EXPORT_SYMBOL(_spin_lock_irq);
 void __lockfunc _spin_lock_bh(spinlock_t *lock)
 {
 	local_bh_disable();
-	preempt_disable();
+	_preempt_disable();
 	spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
 	LOCK_CONTENDED(lock, _raw_spin_trylock, _raw_spin_lock);
 }
@@ -120,7 +120,7 @@ unsigned long __lockfunc _read_lock_irqs
 	unsigned long flags;

 	local_irq_save(flags);
-	preempt_disable();
+	_preempt_disable();
 	rwlock_acquire_read(&lock->dep_map, 0, 0, _RET_IP_);
 	LOCK_CONTENDED(lock, _raw_read_trylock, _raw_read_lock);
 	return flags;
@@ -130,7 +130,7 @@ EXPORT_SYMBOL(_read_lock_irqsave);
 void __lockfunc _read_lock_irq(rwlock_t *lock)
 {
 	local_irq_disable();
-	preempt_disable();
+	_preempt_disable();
 	rwlock_acquire_read(&lock->dep_map, 0, 0, _RET_IP_);
 	LOCK_CONTENDED(lock, _raw_read_trylock, _raw_read_lock);
 }
@@ -139,7 +139,7 @@ EXPORT_SYMBOL(_read_lock_irq);
 void __lockfunc _read_lock_bh(rwlock_t *lock)
 {
 	local_bh_disable();
-	preempt_disable();
+	_preempt_disable();
 	rwlock_acquire_read(&lock->dep_map, 0, 0, _RET_IP_);
 	LOCK_CONTENDED(lock, _raw_read_trylock, _raw_read_lock);
 }
@@ -150,7 +150,7 @@ unsigned long __lockfunc _write_lock_irq
 	unsigned long flags;

 	local_irq_save(flags);
-	preempt_disable();
+	_preempt_disable();
 	rwlock_acquire(&lock->dep_map, 0, 0, _RET_IP_);
 	LOCK_CONTENDED(lock, _raw_write_trylock, _raw_write_lock);
 	return flags;
@@ -160,7 +160,7 @@ EXPORT_SYMBOL(_write_lock_irqsave);
 void __lockfunc _write_lock_irq(rwlock_t *lock)
 {
 	local_irq_disable();
-	preempt_disable();
+	_preempt_disable();
 	rwlock_acquire(&lock->dep_map, 0, 0, _RET_IP_);
 	LOCK_CONTENDED(lock, _raw_write_trylock, _raw_write_lock);
 }
@@ -169,7 +169,7 @@ EXPORT_SYMBOL(_write_lock_irq);
 void __lockfunc _write_lock_bh(rwlock_t *lock)
 {
 	local_bh_disable();
-	preempt_disable();
+	_preempt_disable();
 	rwlock_acquire(&lock->dep_map, 0, 0, _RET_IP_);
 	LOCK_CONTENDED(lock, _raw_write_trylock, _raw_write_lock);
 }
@@ -177,7 +177,7 @@ EXPORT_SYMBOL(_write_lock_bh);

 void __lockfunc _spin_lock(spinlock_t *lock)
 {
-	preempt_disable();
+	_preempt_disable();
 	spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
 	LOCK_CONTENDED(lock, _raw_spin_trylock, _raw_spin_lock);
 }
@@ -186,7 +186,7 @@ EXPORT_SYMBOL(_spin_lock);

 void __lockfunc _write_lock(rwlock_t *lock)
 {
-	preempt_disable();
+	_preempt_disable();
 	rwlock_acquire(&lock->dep_map, 0, 0, _RET_IP_);
 	LOCK_CONTENDED(lock, _raw_write_trylock, _raw_write_lock);
 }
@@ -207,10 +207,10 @@ EXPORT_SYMBOL(_write_lock);
 void __lockfunc _##op##_lock(locktype##_t *lock) \
 { \
 	for (;;) { \
-		preempt_disable(); \
+		_preempt_disable(); \
 		if (likely(_raw_##op##_trylock(lock))) \
 			break; \
-		preempt_enable(); \
+		_preempt_enable(); \
 \
 		if (!(lock)->break_lock) \
 			(lock)->break_lock = 1; \
@@ -227,12 +227,12 @@ unsigned long __lockfunc _##op##_lock_ir
 	unsigned long flags; \
 \
 	for (;;) { \
-		preempt_disable(); \
+		_preempt_disable(); \
 		local_irq_save(flags); \
 		if (likely(_raw_##op##_trylock(lock))) \
 			break; \
 		local_irq_restore(flags); \
-		preempt_enable(); \
+		_preempt_enable(); \
 \
 		if (!(lock)->break_lock) \
 			(lock)->break_lock = 1; \
@@ -287,7 +287,7 @@ BUILD_LOCK_OPS(write, rwlock);

 void __lockfunc _spin_lock_nested(spinlock_t *lock, int subclass)
 {
-	preempt_disable();
+	_preempt_disable();
 	spin_acquire(&lock->dep_map, subclass, 0, _RET_IP_);
 	LOCK_CONTENDED(lock, _raw_spin_trylock, _raw_spin_lock);
 }
@@ -298,7 +298,7 @@ unsigned long __lockfunc _spin_lock_irqs
 	unsigned long flags;

 	local_irq_save(flags);
-	preempt_disable();
+	_preempt_disable();
 	spin_acquire(&lock->dep_map, subclass, 0, _RET_IP_);
 	/*
 	 * On lockdep we dont want the hand-coded irq-enable of
@@ -321,7 +321,7 @@ void __lockfunc _spin_unlock(spinlock_t
 {
 	spin_release(&lock->dep_map, 1, _RET_IP_);
 	_raw_spin_unlock(lock);
-	preempt_enable();
+	_preempt_enable();
 }
 EXPORT_SYMBOL(_spin_unlock);

@@ -329,7 +329,7 @@ void __lockfunc _write_unlock(rwlock_t *
 {
 	rwlock_release(&lock->dep_map, 1, _RET_IP_);
 	_raw_write_unlock(lock);
-	preempt_enable();
+	_preempt_enable();
 }
 EXPORT_SYMBOL(_write_unlock);

@@ -337,7 +337,7 @@ void __lockfunc _read_unlock(rwlock_t *l
 {
 	rwlock_release(&lock->dep_map, 1, _RET_IP_);
 	_raw_read_unlock(lock);
-	preempt_enable();
+	_preempt_enable();
 }
 EXPORT_SYMBOL(_read_unlock);

@@ -346,7 +346,7 @@ void __lockfunc _spin_unlock_irqrestore(
 	spin_release(&lock->dep_map, 1, _RET_IP_);
 	_raw_spin_unlock(lock);
 	local_irq_restore(flags);
-	preempt_enable();
+	_preempt_enable();
 }
 EXPORT_SYMBOL(_spin_unlock_irqrestore);

@@ -355,7 +355,7 @@ void __lockfunc _spin_unlock_irq(spinloc
 	spin_release(&lock->dep_map, 1, _RET_IP_);
 	_raw_spin_unlock(lock);
 	local_irq_enable();
-	preempt_enable();
+	_preempt_enable();
 }
 EXPORT_SYMBOL(_spin_unlock_irq);

@@ -363,7 +363,7 @@ void __lockfunc _spin_unlock_bh(spinlock
 {
 	spin_release(&lock->dep_map, 1, _RET_IP_);
 	_raw_spin_unlock(lock);
-	preempt_enable_no_resched();
+	_preempt_enable_no_resched();
 	local_bh_enable_ip((unsigned long)__builtin_return_address(0));
 }
 EXPORT_SYMBOL(_spin_unlock_bh);

@@ -373,7 +373,7 @@ void __lockfunc _read_unlock_irqrestore(
 	rwlock_release(&lock->dep_map, 1, _RET_IP_);
 	_raw_read_unlock(lock);
 	local_irq_restore(flags);
-	preempt_enable();
+	_preempt_enable();
 }
 EXPORT_SYMBOL(_read_unlock_irqrestore);

@@ -382,7 +382,7 @@ void __lockfunc _read_unlock_irq(rwlock_
 	rwlock_release(&lock->dep_map, 1, _RET_IP_);
 	_raw_read_unlock(lock);
 	local_irq_enable();
-	preempt_enable();
+	_preempt_enable();
 }
 EXPORT_SYMBOL(_read_unlock_irq);

@@ -390,7 +390,7 @@ void __lockfunc _read_unlock_bh(rwlock_t
 {
 	rwlock_release(&lock->dep_map, 1, _RET_IP_);
 	_raw_read_unlock(lock);
-	preempt_enable_no_resched();
+	_preempt_enable_no_resched();
 	local_bh_enable_ip((unsigned long)__builtin_return_address(0));
 }
 EXPORT_SYMBOL(_read_unlock_bh);

@@ -400,7 +400,7 @@ void __lockfunc _write_unlock_irqrestore
 	rwlock_release(&lock->dep_map, 1, _RET_IP_);
 	_raw_write_unlock(lock);
 	local_irq_restore(flags);
-	preempt_enable();
+	_preempt_enable();
 }
 EXPORT_SYMBOL(_write_unlock_irqrestore);

@@ -409,7 +409,7 @@ void __lockfunc _write_unlock_irq(rwlock
 	rwlock_release(&lock->dep_map, 1, _RET_IP_);
 	_raw_write_unlock(lock);
 	local_irq_enable();
-	preempt_enable();
+	_preempt_enable();
 }
 EXPORT_SYMBOL(_write_unlock_irq);

@@ -417,7 +417,7 @@ void __lockfunc _write_unlock_bh(rwlock_
 {
 	rwlock_release(&lock->dep_map, 1, _RET_IP_);
 	_raw_write_unlock(lock);
-	preempt_enable_no_resched();
+	_preempt_enable_no_resched();
 	local_bh_enable_ip((unsigned long)__builtin_return_address(0));
 }
 EXPORT_SYMBOL(_write_unlock_bh);

@@ -425,13 +425,13 @@ EXPORT_SYMBOL(_write_unlock_bh);
 int __lockfunc _spin_trylock_bh(spinlock_t *lock)
 {
 	local_bh_disable();
-	preempt_disable();
+	_preempt_disable();
 	if (_raw_spin_trylock(lock)) {
 		spin_acquire(&lock->dep_map, 0, 1, _RET_IP_);
 		return 1;
 	}

-	preempt_enable_no_resched();
+	_preempt_enable_no_resched();
 	local_bh_enable_ip((unsigned long)__builtin_return_address(0));
 	return 0;
 }

Index: linux-2.6/kernel/sched.c
===================================================================
--- linux-2.6.orig/kernel/sched.c
+++ linux-2.6/kernel/sched.c
@@ -1847,7 +1847,7 @@ asmlinkage void schedule_tail(struct tas
 	finish_task_switch(rq, prev);
 #ifdef __ARCH_WANT_UNLOCKED_CTXSW
 	/* In this case, finish_task_switch does not reenable preemption */
-	preempt_enable();
+	_preempt_enable();
 #endif
 	if (current->set_child_tid)
 		put_user(task_pid_vnr(current), current->set_child_tid);
@@ -3512,7 +3512,7 @@ asmlinkage void __sched schedule(void)
 	int cpu;

 need_resched:
-	preempt_disable();
+	_preempt_disable();
 	cpu = smp_processor_id();
 	rq = cpu_rq(cpu);
 	rcu_qsctr_inc(cpu);
@@ -3560,7 +3560,7 @@ need_resched_nonpreemptible:
 		rq = cpu_rq(cpu);
 		goto need_resched_nonpreemptible;
 	}
-	preempt_enable_no_resched();
+	_preempt_enable_no_resched();
 	if (unlikely(test_thread_flag(TIF_NEED_RESCHED)))
 		goto need_resched;
 }
@@ -4605,7 +4605,7 @@ asmlinkage long sys_sched_yield(void)
 	__release(rq->lock);
 	spin_release(&rq->lock.dep_map, 1, _THIS_IP_);
 	_raw_spin_unlock(&rq->lock);
-	preempt_enable_no_resched();
+	_preempt_enable_no_resched();

 	schedule();

@@ -4661,7 +4661,7 @@ int cond_resched_lock(spinlock_t *lock)
 	if (need_resched() && system_state == SYSTEM_RUNNING) {
 		spin_release(&lock->dep_map, 1, _THIS_IP_);
 		_raw_spin_unlock(lock);
-		preempt_enable_no_resched();
+		_preempt_enable_no_resched();
 		__cond_resched();
 		ret = 1;
 		spin_lock(lock);

Index: linux-2.6/lib/kernel_lock.c
===================================================================
--- linux-2.6.orig/lib/kernel_lock.c
+++ linux-2.6/lib/kernel_lock.c
@@ -44,11 +44,11 @@ int __lockfunc __reacquire_kernel_lock(v
 	BUG_ON(saved_lock_depth < 0);

 	task->lock_depth = -1;
-	preempt_enable_no_resched();
+	_preempt_enable_no_resched();

 	down(&kernel_sem);

-	preempt_disable();
+	_preempt_disable();
 	task->lock_depth = saved_lock_depth;

 	return 0;
@@ -121,14 +121,14 @@ int __lockfunc __reacquire_kernel_lock(v
 			return -EAGAIN;
 		cpu_relax();
 	}
-	preempt_disable();
+	_preempt_disable();
 	return 0;
 }

 void __lockfunc __release_kernel_lock(void)
 {
 	_raw_spin_unlock(&kernel_flag);
-	preempt_enable_no_resched();
+	_preempt_enable_no_resched();
 }

 /*
@@ -139,7 +139,7 @@ void __lockfunc __release_kernel_lock(vo
 #ifdef CONFIG_PREEMPT
 static inline void __lock_kernel(void)
 {
-	preempt_disable();
+	_preempt_disable();
 	if (unlikely(!_raw_spin_trylock(&kernel_flag))) {
 		/*
 		 * If preemption was disabled even before this
@@ -156,10 +156,10 @@ static inline void __lock_kernel(void)
 		 * with preemption enabled..
 		 */
 		do {
-			preempt_enable();
+			_preempt_enable();
 			while (spin_is_locked(&kernel_flag))
 				cpu_relax();
-			preempt_disable();
+			_preempt_disable();
 		} while (!_raw_spin_trylock(&kernel_flag));
 	}
 }
@@ -182,7 +182,7 @@ static inline void __unlock_kernel(void)
 	 * unlocking sequence (and thus avoid the dep-chain ops):
 	 */
 	_raw_spin_unlock(&kernel_flag);
-	preempt_enable();
+	_preempt_enable();
 }

 /*
--

^ permalink raw reply	[flat|nested] 30+ messages in thread
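The practical consequence of this decoupling can be shown with a hypothetical, non-compilable kernel-style fragment (`lock`, `head`, and `p` are invented names, not from the patch set): once the locking primitives use `_preempt_disable()` internally, merely holding a spinlock no longer satisfies the rcu_dereference() checker, and the RCU read side has to be marked explicitly.

```c
/* Previously accepted: spin_lock() disables preemption, which
 * classic RCU treated as entering a read-side critical section. */
spin_lock(&lock);
p = rcu_dereference(head);	/* under these patches: lockdep warning */
spin_unlock(&lock);

/* With implicit and explicit preempt_disable() separated, the
 * intent must be stated so lockdep can validate it: */
rcu_read_lock();
p = rcu_dereference(head);
rcu_read_unlock();
```

An explicit `preempt_disable()` still counts as a legitimate context, since it passes `exp = 1` and registers with the checker; only the implicit disables inside the lock implementations stop counting.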
* [RFC][PATCH 5/6] fixup funny preemption tricks in irq_exit
  2007-09-19 10:41 [RFC][PATCH 0/6] using lockdep to validate rcu usage Peter Zijlstra
  ` (3 preceding siblings ...)
  2007-09-19 10:41 ` [RFC][PATCH 4/6] implicit vs explicit preempt_disable() Peter Zijlstra
@ 2007-09-19 10:41 ` Peter Zijlstra
  2007-09-19 10:41 ` [RFC][PATCH 6/6] fixup early boot Peter Zijlstra
  2007-09-19 13:38 ` [RFC][PATCH 0/6] using lockdep to validate rcu usage Ingo Molnar
  6 siblings, 0 replies; 30+ messages in thread
From: Peter Zijlstra @ 2007-09-19 10:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Paul E. McKenney, Ingo Molnar, Andrew Morton, Peter Zijlstra,
	Nick Piggin

[-- Attachment #1: preempt-fixup-irq-exit.patch --]
[-- Type: text/plain, Size: 535 bytes --]

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 kernel/softirq.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6/kernel/softirq.c
===================================================================
--- linux-2.6.orig/kernel/softirq.c
+++ linux-2.6/kernel/softirq.c
@@ -312,7 +312,7 @@ void irq_exit(void)
 	if (!in_interrupt() && idle_cpu(smp_processor_id()) && !need_resched())
 		tick_nohz_stop_sched_tick();
 #endif
-	preempt_enable_no_resched();
+	_preempt_enable_no_resched();
 }

 /*
--

^ permalink raw reply	[flat|nested] 30+ messages in thread
* [RFC][PATCH 6/6] fixup early boot
  2007-09-19 10:41 [RFC][PATCH 0/6] using lockdep to validate rcu usage Peter Zijlstra
  ` (4 preceding siblings ...)
  2007-09-19 10:41 ` [RFC][PATCH 5/6] fixup funny preemption tricks in irq_exit Peter Zijlstra
@ 2007-09-19 10:41 ` Peter Zijlstra
  2007-09-19 13:38 ` [RFC][PATCH 0/6] using lockdep to validate rcu usage Ingo Molnar
  6 siblings, 0 replies; 30+ messages in thread
From: Peter Zijlstra @ 2007-09-19 10:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Paul E. McKenney, Ingo Molnar, Andrew Morton, Peter Zijlstra,
	Nick Piggin

[-- Attachment #1: preempt-fixup-rest_init.patch --]
[-- Type: text/plain, Size: 880 bytes --]

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 init/main.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

Index: linux-2.6/init/main.c
===================================================================
--- linux-2.6.orig/init/main.c
+++ linux-2.6/init/main.c
@@ -452,7 +452,8 @@ static void noinline __init_refok rest_i
 	 * at least once to get things moving:
 	 */
 	init_idle_bootup_task(current);
-	preempt_enable_no_resched();
+	_preempt_enable_no_resched();
+
 	schedule();
 	preempt_disable();

@@ -556,7 +557,7 @@ asmlinkage void __init start_kernel(void
 	 * Disable preemption - early bootup scheduling is extremely
 	 * fragile until we cpu_idle() for the first time.
 	 */
-	preempt_disable();
+	_preempt_disable();
 	build_all_zonelists();
 	page_alloc_init();
 	printk(KERN_NOTICE "Kernel command line: %s\n", boot_command_line);
--

^ permalink raw reply	[flat|nested] 30+ messages in thread
* Re: [RFC][PATCH 0/6] using lockdep to validate rcu usage
  2007-09-19 10:41 [RFC][PATCH 0/6] using lockdep to validate rcu usage Peter Zijlstra
  ` (5 preceding siblings ...)
  2007-09-19 10:41 ` [RFC][PATCH 6/6] fixup early boot Peter Zijlstra
@ 2007-09-19 13:38 ` Ingo Molnar
  6 siblings, 0 replies; 30+ messages in thread
From: Ingo Molnar @ 2007-09-19 13:38 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: linux-kernel, Paul E. McKenney, Andrew Morton, Nick Piggin

* Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:

> This patch set uses lockdep to validate rcu usage.
>
> It annotates rcu_read_{,un}lock{,_bh}() to catch imbalances. And
> further uses that information to establish a proper context for
> rcu_dereference().
>
> It also separates implicit from explicit preempt_disable() usage, in
> order to separate rcu_dereference() from the locking model.
>
> A kernel (2.6.23-rc4-mm1) with these patches boots but does have some
> funnies - I suspect it calls printf from places it doesn't like.
>
> The first patch should be safe to apply, the rest is RFC.

great work! The patches look good to me.

Signed-off-by: Ingo Molnar <mingo@elte.hu>

	Ingo

^ permalink raw reply	[flat|nested] 30+ messages in thread
end of thread, other threads:[~2007-09-21 14:30 UTC | newest]
Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-09-19 10:41 [RFC][PATCH 0/6] using lockdep to validate rcu usage Peter Zijlstra
2007-09-19 10:41 ` [RFC][PATCH 1/6] lockdep: annotate rcu_read_{,un}lock{,_bh} Peter Zijlstra
2007-09-19 23:06 ` Paul E. McKenney
2007-09-19 10:41 ` [RFC][PATCH 2/6] lockdep: validate rcu_dereference() vs rcu_read_lock() Peter Zijlstra
2007-09-19 14:17 ` Dmitry Torokhov
2007-09-19 14:31 ` Peter Zijlstra
2007-09-19 15:16 ` Dmitry Torokhov
2007-09-19 15:25 ` Peter Zijlstra
2007-09-19 15:37 ` Paul E. McKenney
2007-09-19 16:59 ` Dmitry Torokhov
2007-09-19 17:32 ` Paul E. McKenney
2007-09-19 17:48 ` Paul E. McKenney
2007-09-19 18:49 ` Dmitry Torokhov
2007-09-19 19:41 ` Peter Zijlstra
2007-09-19 19:49 ` Dmitry Torokhov
2007-09-19 20:13 ` Peter Zijlstra
2007-09-19 20:41 ` Dmitry Torokhov
2007-09-19 21:19 ` Peter Zijlstra
2007-09-19 21:29 ` Dmitry Torokhov
2007-09-19 21:47 ` Peter Zijlstra
2007-09-20 17:31 ` Dmitry Torokhov
2007-09-21 0:01 ` Paul E. McKenney
2007-09-21 14:15 ` Dmitry Torokhov
2007-09-21 14:30 ` Peter Zijlstra
2007-09-19 20:48 ` Paul E. McKenney
2007-09-19 10:41 ` [RFC][PATCH 3/6] lockdep: rcu_dereference() vs preempt_disable() Peter Zijlstra
2007-09-19 10:41 ` [RFC][PATCH 4/6] implicit vs explicit preempt_disable() Peter Zijlstra
2007-09-19 10:41 ` [RFC][PATCH 5/6] fixup funny preemption tricks in irq_exit Peter Zijlstra
2007-09-19 10:41 ` [RFC][PATCH 6/6] fixup early boot Peter Zijlstra
2007-09-19 13:38 ` [RFC][PATCH 0/6] using lockdep to validate rcu usage Ingo Molnar