* [RFC/PATCH 0/6] lockdep: Maintain read/write states for lock_acquires
@ 2009-05-11 12:39 Gautham R Shenoy
2009-05-11 12:39 ` [RFC/PATCH PATCH 1/6] lockdep: Remove redundant read checks Gautham R Shenoy
` (6 more replies)
0 siblings, 7 replies; 8+ messages in thread
From: Gautham R Shenoy @ 2009-05-11 12:39 UTC (permalink / raw)
To: Ingo Molnar, Peter Zijlstra, Paul E McKenney, Oleg Nesterov; +Cc: linux-kernel
Hi,
Lockdep, while maintaining its dependency chains, currently does not
differentiate between the read and write acquires of a lock.
Thus a dependency
rlock(A) ---> wlock(B)
is treated the same as
wlock(A) ---> rlock(B).
If the implementation of the reader-writer lock is such that, upon the arrival
of a writer, all future readers wait until the writer completes, this
non-distinction between the read and the write acquires doesn't matter. This
is because reads in such implementations aren't recursive: a recursive read
can deadlock in the presence of a writer.
i.e. in:
Thread1: rlock(A) ------------> rlock(A)
Thread2: wlock(A)
Thread 1 will end up blocking on the second rlock(A), while Thread 2 is blocked
on wlock(A).
The RW-semaphores in the Linux kernel are implemented in this fashion.
However, if the implementation of the reader-writer lock allows
"reader preference", i.e. it allows a writer to grab the lock only if there are
no readers in the system, the distinction between a read and a write acquire
may be necessary, since the implementation permits read-side recursion as a
side-effect. rwlocks in the Linux kernel follow such an implementation.
Lockdep, however, refrains from making this read/write-acquire distinction for
such locks. It tries to solve the problem by not maintaining dependencies for
the reads, and maintains dependencies only for the writes.
As a result, ABBA deadlocks of the following type go unnoticed by lockdep:
Thread 1
=====================
read_lock(A);
spin_lock(B);
/* A-->B dependency is not maintained for reason mentioned */
Thread 2
=====================
spin_lock(B);
write_lock(A);
/* B-->A dependency maintained. But no cycle in graph.*/
Peter Zijlstra had a patch to solve this by considering only the
first recursive-read instance of a lock in the current context while
maintaining the dependencies: http://lkml.org/lkml/2008/4/29/212
However, this had a nasty side effect: dependencies of the type:
read_lock(A) --> spin_lock(B)
spin_lock(B) --> read_lock(A)
and its variants would also be reported as a deadlock.
This patchset solves the problem by recording the read/write states of the
locks acquired in a dependency.
i.e. for a dependency of the type
spin_lock(B) --> read_lock(A),
in B's "locks_after" entry for this dependency, we not only store the class of A,
but also store the "rw-state" of both A and B. Ditto for A's
"locks_before" entry.
We define three rw-states, namely:
STATE_WRITE
STATE_READ
STATE_RECURSIVE_READ.
For each of these states, we define a set of conflicting states which can cause
a deadlock when involved in a dependency with the former, i.e.:
conflict_states(STATE_WRITE) =
STATE_WRITE | STATE_READ | STATE_RECURSIVE_READ
conflict_state(STATE_READ) = STATE_WRITE | STATE_READ
conflict_state(STATE_RECURSIVE_READ) = STATE_WRITE
So, when we're walking the dependency graph to see if there is a cycle, we
follow only those entries in the "locks_after" list of a given lock where the
lock was acquired in an rw-state that conflicts with its current rw-state.
i.e, if say lock A's current rw-state is STATE_RECURSIVE_READ,
and the locks_after list for A looks something like this:
this_class: A
A->locks_after
|
|
V
---------------------------------------------------
| dep_class: B |
| dep_rw_state: STATE_WRITE |
| this_rw_state: STATE_WRITE |
---------------------------------------------------
|
|
V
---------------------------------------------------
| dep_class: C |
| dep_rw_state: STATE_RECURSIVE_READ |
| this_rw_state: STATE_RECURSIVE_READ |
---------------------------------------------------
|
|
V
---------------------------------------------------
| dep_class: D |
| dep_rw_state: STATE_RECURSIVE_READ |
| this_rw_state: STATE_WRITE |
---------------------------------------------------
Here, we follow only the entries for B and D, since these are the only entries
whose this_rw_state (STATE_WRITE) conflicts with A's current rw_state
(STATE_RECURSIVE_READ). However, if A's current rw_state had been STATE_WRITE,
we would have followed all three entries.
The patch-set has been tested against 2.6.30-rc3. It passes the boot-up
self-test. However, more test cases need to be added to the locking self-test,
which will be done in future iterations of the patchset.
Feedback is very much appreciated.
---
Gautham R Shenoy (6):
lockdep: Consider the rw_state of lock while validating the chain.
lockdep: Maintain rw_state entries in locklist
lockdep: Separate lock ids for read/write acquires.
lockdep: Annotate Read/write states
lockdep: Remove strict read checks.
lockdep: Remove redundant read checks.
include/linux/lockdep.h | 153 +++++++++++++++++++++++++++++++------
kernel/lockdep.c | 193 +++++++++++++++++++++++++++++------------------
kernel/lockdep_proc.c | 4 -
3 files changed, 247 insertions(+), 103 deletions(-)
--
Thanks and Regards
gautham.
^ permalink raw reply [flat|nested] 8+ messages in thread
* [RFC/PATCH PATCH 1/6] lockdep: Remove redundant read checks.
2009-05-11 12:39 [RFC/PATCH 0/6] lockdep: Maintain read/write states for lock_acquires Gautham R Shenoy
@ 2009-05-11 12:39 ` Gautham R Shenoy
2009-05-11 12:39 ` [RFC/PATCH PATCH 2/6] lockdep: Remove strict " Gautham R Shenoy
` (5 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: Gautham R Shenoy @ 2009-05-11 12:39 UTC (permalink / raw)
To: Ingo Molnar, Peter Zijlstra, Paul E McKenney, Oleg Nesterov; +Cc: linux-kernel
1) In kernel/lockdep.c::validate_chain():
ret = check_deadlock(curr, hlock, lock, hlock->read);
ret == 2 only if hlock->read == 2.
Hence,
if (ret == 2)
hlock->read = 2;
is redundant and can be removed.
2) In kernel/lockdep.c::check_prevs_add(curr, next):
if (hlock->read != 2)
check_prev_add(curr, hlock, next, distance);
Thus, check_prev_add() is called only when hlock->read != 2.
From the conclusion of 2),
kernel/lockdep.c::check_prev_add(curr, prev, next, distance) gets called
only when prev->read != 2.
Hence, in kernel/lockdep.c::check_prev_add(curr, prev, next, distance),
if (prev->read == 2)
return 1;
is redundant and can be removed.
Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
---
kernel/lockdep.c | 9 +--------
1 files changed, 1 insertions(+), 8 deletions(-)
diff --git a/kernel/lockdep.c b/kernel/lockdep.c
index accb40c..8d3c2b5 100644
--- a/kernel/lockdep.c
+++ b/kernel/lockdep.c
@@ -1496,7 +1496,7 @@ check_prev_add(struct task_struct *curr, struct held_lock *prev,
* write-lock never takes any other locks, then the reads are
* equivalent to a NOP.
*/
- if (next->read == 2 || prev->read == 2)
+ if (next->read == 2)
return 1;
/*
* Is the <prev> -> <next> dependency already present?
@@ -1754,13 +1754,6 @@ static int validate_chain(struct task_struct *curr, struct lockdep_map *lock,
if (!ret)
return 0;
/*
- * Mark recursive read, as we jump over it when
- * building dependencies (just like we jump over
- * trylock entries):
- */
- if (ret == 2)
- hlock->read = 2;
- /*
* Add dependency only if this lock is not the head
* of the chain, and if it's not a secondary read-lock:
*/
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [RFC/PATCH PATCH 2/6] lockdep: Remove strict read checks.
2009-05-11 12:39 [RFC/PATCH 0/6] lockdep: Maintain read/write states for lock_acquires Gautham R Shenoy
2009-05-11 12:39 ` [RFC/PATCH PATCH 1/6] lockdep: Remove redundant read checks Gautham R Shenoy
@ 2009-05-11 12:39 ` Gautham R Shenoy
2009-05-11 12:39 ` [RFC/PATCH PATCH 3/6] lockdep: Annotate Read/write states Gautham R Shenoy
` (4 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: Gautham R Shenoy @ 2009-05-11 12:39 UTC (permalink / raw)
To: Ingo Molnar, Peter Zijlstra, Paul E McKenney, Oleg Nesterov; +Cc: linux-kernel
Read locks can cause IRQ-inversion problems only when they are taken in an
IRQ context. So skip the strict read checks in every other case.
Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
---
kernel/lockdep.c | 13 +++++++------
1 files changed, 7 insertions(+), 6 deletions(-)
diff --git a/kernel/lockdep.c b/kernel/lockdep.c
index 8d3c2b5..597479a 100644
--- a/kernel/lockdep.c
+++ b/kernel/lockdep.c
@@ -1986,8 +1986,6 @@ static int RECLAIM_FS_verbose(struct lock_class *class)
return 0;
}
-#define STRICT_READ_CHECKS 1
-
static int (*state_verbose_f[])(struct lock_class *class) = {
#define LOCKDEP_STATE(__STATE) \
__STATE##_verbose,
@@ -2032,9 +2030,13 @@ mark_lock_irq(struct task_struct *curr, struct held_lock *this,
/*
* Validate that the lock dependencies don't have conflicting usage
* states.
+ *
+ * Skip the part if it's a read_lock with LOCK_ENABLED_XXX.
+ * We'll do that check when a write variant of the same lock is taken,
+ * or we take the read variant of lock in the LOCK_USED_IN_XXX state.
*/
- if ((!read || !dir || STRICT_READ_CHECKS) &&
- !usage(curr, this, excl_bit, state_name(new_bit & ~1)))
+ if (!(read && dir) && !usage(curr, this, excl_bit,
+ state_name(new_bit & ~1)))
return 0;
/*
@@ -2044,8 +2046,7 @@ mark_lock_irq(struct task_struct *curr, struct held_lock *this,
if (!valid_state(curr, this, new_bit, excl_bit + 1))
return 0;
- if (STRICT_READ_CHECKS &&
- !usage(curr, this, excl_bit + 1,
+ if (!usage(curr, this, excl_bit + 1,
state_name(new_bit + 1)))
return 0;
}
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [RFC/PATCH PATCH 3/6] lockdep: Annotate Read/write states
2009-05-11 12:39 [RFC/PATCH 0/6] lockdep: Maintain read/write states for lock_acquires Gautham R Shenoy
2009-05-11 12:39 ` [RFC/PATCH PATCH 1/6] lockdep: Remove redundant read checks Gautham R Shenoy
2009-05-11 12:39 ` [RFC/PATCH PATCH 2/6] lockdep: Remove strict " Gautham R Shenoy
@ 2009-05-11 12:39 ` Gautham R Shenoy
2009-05-11 12:39 ` [RFC/PATCH PATCH 4/6] lockdep: Separate lock ids for read/write acquires Gautham R Shenoy
` (3 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: Gautham R Shenoy @ 2009-05-11 12:39 UTC (permalink / raw)
To: Ingo Molnar, Peter Zijlstra, Paul E McKenney, Oleg Nesterov; +Cc: linux-kernel
Currently we do not save recursive-read dependencies in the dependency
chain. As a result, a deadlock caused by the following chains is not spotted,
since chain 1 is never present in our dependency list:
1: Rlock(A) --> lock(B)
2: lock(B) --> Wlock(A), where A is a recursive read lock.
Before adding the recursive-read locks to the dependency chains, we need to
distinguish them from normal read locks, since the conflicting states for
the two are quite different.
Currently the read/write status of a lock while it's acquired is denoted by a
monotonically increasing variable where
0 - WRITE
1 - READ
2 - RECURSIVE READ.
In this patch, we propose to modify this distinction from a monotonically
increasing variable to a bit mask where
0x1 - WRITE
0x2 - READ
0x4 - RECURSIVE READ.
This helps us define the conflicting states for each lock with ease.
The conflicting states for a given state are defined as follows:
Conflicting_states(WRITE): RECURSIVE_READ | READ | WRITE
Conflicting_states(READ): READ | WRITE
Conflicting_states(RECURSIVE_READ): WRITE
Also, we use one more bit in the bitmask to distinguish the first recursive
read in the current chain from the others, since it is sufficient to add only
this dependency to the dependency list.
Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
---
include/linux/lockdep.h | 103 ++++++++++++++++++++++++++++++++++++-----------
kernel/lockdep.c | 52 +++++++++++++-----------
2 files changed, 107 insertions(+), 48 deletions(-)
diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index da5a5a1..161deee 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -209,7 +209,7 @@ struct held_lock {
*/
unsigned int irq_context:2; /* bit 0 - soft, bit 1 - hard */
unsigned int trylock:1;
- unsigned int read:2; /* see lock_acquire() comment */
+ unsigned int rw_state:4; /* see lock states comment */
unsigned int check:2; /* see lock_acquire() comment */
unsigned int hardirqs_off:1;
};
@@ -259,14 +259,49 @@ extern void lockdep_init_map(struct lockdep_map *lock, const char *name,
lockdep_init_map(&(lock)->dep_map, #lock, \
(lock)->dep_map.key, sub)
+/* lock_state bits */
+#define _W_ 0
+#define _R_ (_W_ + 1)
+#define _RR_ (_W_ + 2)
+#define _FIRST_RR_ (_W_ + 3)
+
+#define get_lock_state(lock_bit) (1 << lock_bit)
+
/*
- * Acquire a lock.
+ * lock states:
*
- * Values for "read":
+ * A given lock can have one of the following states:
+ * - STATE_WRITE: Exclusive Write.
+ * - STATE_READ: Non-recursive read.
+ * - STATE_RECURSIVE_READ: Recursive Read.
+ *
+ * These states are represented as bitmasks
+ */
+#define STATE_WRITE get_lock_state(_W_)
+#define STATE_READ get_lock_state(_R_)
+#define STATE_RECURSIVE_READ get_lock_state(_RR_)
+
+/* Helpers to check the current lock's state */
+#define is_simple_read(state) (state & STATE_READ)
+#define is_rec_read(state) (state & STATE_RECURSIVE_READ)
+#define is_read(state) (is_simple_read(state) | is_rec_read(state))
+#define is_write(state) (state & STATE_WRITE)
+
+/*
+ * When maintaining dependency chains for locks acquired with
+ * STATE_RECURSIVE_READ state, we add a particular instance of the lock to the
+ * dependency chain only if it's the first recursive read instance in the
+ * current dependency chain. We don't add recurring instances of the lock
+ * to the dependency chains.
+ */
+#define STATE_FIRST_RECURSIVE_READ get_lock_state(_FIRST_RR_)
+#define is_first_rec_read(state) (state & STATE_FIRST_RECURSIVE_READ)
+
+/*
+ * Acquire a lock.
*
- * 0: exclusive (write) acquire
- * 1: read-acquire (no recursion allowed)
- * 2: read-acquire with same-instance recursion allowed
+ * Values for "rw_state":
+ * See comment for lock states:
*
* Values for check:
*
@@ -275,7 +310,7 @@ extern void lockdep_init_map(struct lockdep_map *lock, const char *name,
* 2: full validation
*/
extern void lock_acquire(struct lockdep_map *lock, unsigned int subclass,
- int trylock, int read, int check,
+ int trylock, int rw_state, int check,
struct lockdep_map *nest_lock, unsigned long ip);
extern void lock_release(struct lockdep_map *lock, int nested,
@@ -419,11 +454,15 @@ static inline void print_irqtrace_events(struct task_struct *curr)
#ifdef CONFIG_DEBUG_LOCK_ALLOC
# ifdef CONFIG_PROVE_LOCKING
-# define spin_acquire(l, s, t, i) lock_acquire(l, s, t, 0, 2, NULL, i)
-# define spin_acquire_nest(l, s, t, n, i) lock_acquire(l, s, t, 0, 2, n, i)
+# define spin_acquire(l, s, t, i) \
+ lock_acquire(l, s, t, STATE_WRITE, 2, NULL, i)
+# define spin_acquire_nest(l, s, t, n, i) \
+ lock_acquire(l, s, t, STATE_WRITE, 2, n, i)
# else
-# define spin_acquire(l, s, t, i) lock_acquire(l, s, t, 0, 1, NULL, i)
-# define spin_acquire_nest(l, s, t, n, i) lock_acquire(l, s, t, 0, 1, NULL, i)
+# define spin_acquire(l, s, t, i) \
+ lock_acquire(l, s, t, STATE_WRITE, 1, NULL, i)
+# define spin_acquire_nest(l, s, t, n, i) \
+ lock_acquire(l, s, t, STATE_WRITE, 1, NULL, i)
# endif
# define spin_release(l, n, i) lock_release(l, n, i)
#else
@@ -433,11 +472,15 @@ static inline void print_irqtrace_events(struct task_struct *curr)
#ifdef CONFIG_DEBUG_LOCK_ALLOC
# ifdef CONFIG_PROVE_LOCKING
-# define rwlock_acquire(l, s, t, i) lock_acquire(l, s, t, 0, 2, NULL, i)
-# define rwlock_acquire_read(l, s, t, i) lock_acquire(l, s, t, 2, 2, NULL, i)
+# define rwlock_acquire(l, s, t, i) \
+ lock_acquire(l, s, t, STATE_WRITE, 2, NULL, i)
+# define rwlock_acquire_read(l, s, t, i) \
+ lock_acquire(l, s, t, STATE_RECURSIVE_READ, 2, NULL, i)
# else
-# define rwlock_acquire(l, s, t, i) lock_acquire(l, s, t, 0, 1, NULL, i)
-# define rwlock_acquire_read(l, s, t, i) lock_acquire(l, s, t, 2, 1, NULL, i)
+# define rwlock_acquire(l, s, t, i) \
+ lock_acquire(l, s, t, STATE_WRITE, 1, NULL, i)
+# define rwlock_acquire_read(l, s, t, i) \
+ lock_acquire(l, s, t, STATE_RECURSIVE_READ, 1, NULL, i)
# endif
# define rwlock_release(l, n, i) lock_release(l, n, i)
#else
@@ -448,9 +491,11 @@ static inline void print_irqtrace_events(struct task_struct *curr)
#ifdef CONFIG_DEBUG_LOCK_ALLOC
# ifdef CONFIG_PROVE_LOCKING
-# define mutex_acquire(l, s, t, i) lock_acquire(l, s, t, 0, 2, NULL, i)
+# define mutex_acquire(l, s, t, i) \
+ lock_acquire(l, s, t, STATE_WRITE, 2, NULL, i)
# else
-# define mutex_acquire(l, s, t, i) lock_acquire(l, s, t, 0, 1, NULL, i)
+# define mutex_acquire(l, s, t, i) \
+ lock_acquire(l, s, t, STATE_WRITE, 1, NULL, i)
# endif
# define mutex_release(l, n, i) lock_release(l, n, i)
#else
@@ -460,11 +505,15 @@ static inline void print_irqtrace_events(struct task_struct *curr)
#ifdef CONFIG_DEBUG_LOCK_ALLOC
# ifdef CONFIG_PROVE_LOCKING
-# define rwsem_acquire(l, s, t, i) lock_acquire(l, s, t, 0, 2, NULL, i)
-# define rwsem_acquire_read(l, s, t, i) lock_acquire(l, s, t, 1, 2, NULL, i)
+# define rwsem_acquire(l, s, t, i) \
+ lock_acquire(l, s, t, STATE_WRITE, 2, NULL, i)
+# define rwsem_acquire_read(l, s, t, i) \
+ lock_acquire(l, s, t, STATE_READ, 2, NULL, i)
# else
-# define rwsem_acquire(l, s, t, i) lock_acquire(l, s, t, 0, 1, NULL, i)
-# define rwsem_acquire_read(l, s, t, i) lock_acquire(l, s, t, 1, 1, NULL, i)
+# define rwsem_acquire(l, s, t, i) \
+ lock_acquire(l, s, t, STATE_WRITE, 1, NULL, i)
+# define rwsem_acquire_read(l, s, t, i) \
+ lock_acquire(l, s, t, STATE_READ, 1, NULL, i)
# endif
# define rwsem_release(l, n, i) lock_release(l, n, i)
#else
@@ -475,9 +524,11 @@ static inline void print_irqtrace_events(struct task_struct *curr)
#ifdef CONFIG_DEBUG_LOCK_ALLOC
# ifdef CONFIG_PROVE_LOCKING
-# define lock_map_acquire(l) lock_acquire(l, 0, 0, 0, 2, NULL, _THIS_IP_)
+# define lock_map_acquire(l) \
+ lock_acquire(l, 0, 0, STATE_WRITE, 2, NULL, _THIS_IP_)
# else
-# define lock_map_acquire(l) lock_acquire(l, 0, 0, 0, 1, NULL, _THIS_IP_)
+# define lock_map_acquire(l) \
+ lock_acquire(l, 0, 0, STATE_WRITE, 1, NULL, _THIS_IP_)
# endif
# define lock_map_release(l) lock_release(l, 1, _THIS_IP_)
#else
@@ -489,13 +540,15 @@ static inline void print_irqtrace_events(struct task_struct *curr)
# define might_lock(lock) \
do { \
typecheck(struct lockdep_map *, &(lock)->dep_map); \
- lock_acquire(&(lock)->dep_map, 0, 0, 0, 2, NULL, _THIS_IP_); \
+ lock_acquire(&(lock)->dep_map, 0, 0, STATE_WRITE, \
+ 2, NULL, _THIS_IP_); \
lock_release(&(lock)->dep_map, 0, _THIS_IP_); \
} while (0)
# define might_lock_read(lock) \
do { \
typecheck(struct lockdep_map *, &(lock)->dep_map); \
- lock_acquire(&(lock)->dep_map, 0, 0, 1, 2, NULL, _THIS_IP_); \
+ lock_acquire(&(lock)->dep_map, 0, 0, STATE_READ, \
+ 2, NULL, _THIS_IP_); \
lock_release(&(lock)->dep_map, 0, _THIS_IP_); \
} while (0)
#else
diff --git a/kernel/lockdep.c b/kernel/lockdep.c
index 597479a..5f92ea4 100644
--- a/kernel/lockdep.c
+++ b/kernel/lockdep.c
@@ -239,7 +239,7 @@ static void lock_release_holdtime(struct held_lock *hlock)
holdtime = sched_clock() - hlock->holdtime_stamp;
stats = get_lock_stats(hlock_class(hlock));
- if (hlock->read)
+ if (is_read(hlock->rw_state))
lock_time_inc(&stats->read_holdtime, holdtime);
else
lock_time_inc(&stats->write_holdtime, holdtime);
@@ -1408,7 +1408,7 @@ print_deadlock_bug(struct task_struct *curr, struct held_lock *prev,
*/
static int
check_deadlock(struct task_struct *curr, struct held_lock *next,
- struct lockdep_map *next_instance, int read)
+ struct lockdep_map *next_instance, int rw_state)
{
struct held_lock *prev;
struct held_lock *nest = NULL;
@@ -1427,7 +1427,7 @@ check_deadlock(struct task_struct *curr, struct held_lock *next,
* Allow read-after-read recursion of the same
* lock class (i.e. read_lock(lock)+read_lock(lock)):
*/
- if ((read == 2) && prev->read)
+ if (is_rec_read(rw_state) && is_read(prev->rw_state))
return 2;
/*
@@ -1496,7 +1496,7 @@ check_prev_add(struct task_struct *curr, struct held_lock *prev,
* write-lock never takes any other locks, then the reads are
* equivalent to a NOP.
*/
- if (next->read == 2)
+ if (is_rec_read(next->rw_state))
return 1;
/*
* Is the <prev> -> <next> dependency already present?
@@ -1581,7 +1581,7 @@ check_prevs_add(struct task_struct *curr, struct held_lock *next)
* Only non-recursive-read entries get new dependencies
* added:
*/
- if (hlock->read != 2) {
+ if (!is_rec_read(hlock->rw_state)) {
if (!check_prev_add(curr, hlock, next, distance))
return 0;
/*
@@ -1749,7 +1749,7 @@ static int validate_chain(struct task_struct *curr, struct lockdep_map *lock,
* any of these scenarios could lead to a deadlock. If
* All validations
*/
- int ret = check_deadlock(curr, hlock, lock, hlock->read);
+ int ret = check_deadlock(curr, hlock, lock, hlock->rw_state);
if (!ret)
return 0;
@@ -2077,7 +2077,7 @@ mark_held_locks(struct task_struct *curr, enum mark_type mark)
hlock = curr->held_locks + i;
usage_bit = 2 + (mark << 2); /* ENABLED */
- if (hlock->read)
+ if (hlock->rw_state)
usage_bit += 1; /* READ */
BUG_ON(usage_bit >= LOCK_USAGE_STATES);
@@ -2301,7 +2301,7 @@ static int mark_irqflags(struct task_struct *curr, struct held_lock *hlock)
* mark the lock as used in these contexts:
*/
if (!hlock->trylock) {
- if (hlock->read) {
+ if (is_read(hlock->rw_state)) {
if (curr->hardirq_context)
if (!mark_lock(curr, hlock,
LOCK_USED_IN_HARDIRQ_READ))
@@ -2320,7 +2320,7 @@ static int mark_irqflags(struct task_struct *curr, struct held_lock *hlock)
}
}
if (!hlock->hardirqs_off) {
- if (hlock->read) {
+ if (is_read(hlock->rw_state)) {
if (!mark_lock(curr, hlock,
LOCK_ENABLED_HARDIRQ_READ))
return 0;
@@ -2346,7 +2346,7 @@ static int mark_irqflags(struct task_struct *curr, struct held_lock *hlock)
* allocation).
*/
if (!hlock->trylock && (curr->lockdep_reclaim_gfp & __GFP_FS)) {
- if (hlock->read) {
+ if (is_read(hlock->rw_state)) {
if (!mark_lock(curr, hlock, LOCK_USED_IN_RECLAIM_FS_READ))
return 0;
} else {
@@ -2521,8 +2521,9 @@ EXPORT_SYMBOL_GPL(lockdep_init_map);
* We maintain the dependency maps and validate the locking attempt:
*/
static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
- int trylock, int read, int check, int hardirqs_off,
- struct lockdep_map *nest_lock, unsigned long ip)
+ int trylock, int rw_state, int check,
+ int hardirqs_off, struct lockdep_map *nest_lock,
+ unsigned long ip)
{
struct task_struct *curr = current;
struct lock_class *class = NULL;
@@ -2584,7 +2585,7 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
hlock->instance = lock;
hlock->nest_lock = nest_lock;
hlock->trylock = trylock;
- hlock->read = read;
+ hlock->rw_state = rw_state;
hlock->check = check;
hlock->hardirqs_off = !!hardirqs_off;
#ifdef CONFIG_LOCK_STAT
@@ -2736,8 +2737,9 @@ found_it:
hlock = curr->held_locks + i;
if (!__lock_acquire(hlock->instance,
hlock_class(hlock)->subclass, hlock->trylock,
- hlock->read, hlock->check, hlock->hardirqs_off,
- hlock->nest_lock, hlock->acquire_ip))
+ hlock->rw_state, hlock->check,
+ hlock->hardirqs_off, hlock->nest_lock,
+ hlock->acquire_ip))
return 0;
}
@@ -2797,8 +2799,9 @@ found_it:
hlock = curr->held_locks + i;
if (!__lock_acquire(hlock->instance,
hlock_class(hlock)->subclass, hlock->trylock,
- hlock->read, hlock->check, hlock->hardirqs_off,
- hlock->nest_lock, hlock->acquire_ip))
+ hlock->rw_state, hlock->check,
+ hlock->hardirqs_off, hlock->nest_lock,
+ hlock->acquire_ip))
return 0;
}
@@ -2936,12 +2939,13 @@ DEFINE_TRACE(lock_acquire);
* and also avoid lockdep recursion:
*/
void lock_acquire(struct lockdep_map *lock, unsigned int subclass,
- int trylock, int read, int check,
+ int trylock, int rw_state, int check,
struct lockdep_map *nest_lock, unsigned long ip)
{
unsigned long flags;
- trace_lock_acquire(lock, subclass, trylock, read, check, nest_lock, ip);
+ trace_lock_acquire(lock, subclass, trylock, rw_state, check,
+ nest_lock, ip);
if (unlikely(current->lockdep_recursion))
return;
@@ -2950,7 +2954,7 @@ void lock_acquire(struct lockdep_map *lock, unsigned int subclass,
check_flags(flags);
current->lockdep_recursion = 1;
- __lock_acquire(lock, subclass, trylock, read, check,
+ __lock_acquire(lock, subclass, trylock, rw_state, check,
irqs_disabled_flags(flags), nest_lock, ip);
current->lockdep_recursion = 0;
raw_local_irq_restore(flags);
@@ -3057,7 +3061,8 @@ found_it:
if (contending_point < LOCKSTAT_POINTS)
stats->contending_point[contending_point]++;
if (lock->cpu != smp_processor_id())
- stats->bounces[bounce_contended + !!hlock->read]++;
+ stats->bounces[bounce_contended +
+ !!is_read(hlock->rw_state)]++;
put_lock_stats(stats);
}
@@ -3101,13 +3106,14 @@ found_it:
stats = get_lock_stats(hlock_class(hlock));
if (waittime) {
- if (hlock->read)
+ if (is_read(hlock->rw_state))
lock_time_inc(&stats->read_waittime, waittime);
else
lock_time_inc(&stats->write_waittime, waittime);
}
if (lock->cpu != cpu)
- stats->bounces[bounce_acquired + !!hlock->read]++;
+ stats->bounces[bounce_acquired +
+ !!is_read(hlock->rw_state)]++;
put_lock_stats(stats);
lock->cpu = cpu;
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [RFC/PATCH PATCH 4/6] lockdep: Separate lock ids for read/write acquires.
2009-05-11 12:39 [RFC/PATCH 0/6] lockdep: Maintain read/write states for lock_acquires Gautham R Shenoy
` (2 preceding siblings ...)
2009-05-11 12:39 ` [RFC/PATCH PATCH 3/6] lockdep: Annotate Read/write states Gautham R Shenoy
@ 2009-05-11 12:39 ` Gautham R Shenoy
2009-05-11 12:39 ` [RFC/PATCH PATCH 5/6] lockdep: Maintain rw_state entries in locklist Gautham R Shenoy
` (2 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: Gautham R Shenoy @ 2009-05-11 12:39 UTC (permalink / raw)
To: Ingo Molnar, Peter Zijlstra, Paul E McKenney, Oleg Nesterov; +Cc: linux-kernel
The key of a dependency chain is obtained from the keys of the locks
involved in the chain. The keys of the locks are defined
as their lock_class ids. Since we don't distinguish between the read and
write acquires of a particular lock, we get the same key for two different
chains involving the same locks in the same dependency order, but in
different read/write states.
Eg:
lock(A) ---> Rlock(B)
lock(A) ---> Wlock(B)
Fix this by distinguishing between the read/write acquires of a lock
by defining the lock's key as follows:
class_id + is_read_lock(lock) * MAX_LOCKDEP_KEYS.
Thus for the purpose of chain key calculation, a lock's key can be in
the range 1 to 2*MAX_LOCKDEP_KEYS - 1.
Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
---
include/linux/lockdep.h | 14 ++++++++++++++
kernel/lockdep.c | 19 +++++++++++++++++--
2 files changed, 31 insertions(+), 2 deletions(-)
diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index 161deee..98cb434 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -163,6 +163,7 @@ struct lock_chain {
};
#define MAX_LOCKDEP_KEYS_BITS 13
+
/*
* Subtract one because we offset hlock->class_idx by 1 in order
* to make 0 mean no class. This avoids overflowing the class_idx
@@ -170,6 +171,19 @@ struct lock_chain {
*/
#define MAX_LOCKDEP_KEYS ((1UL << MAX_LOCKDEP_KEYS_BITS) - 1)
+/*
+ * A lock's class id is used to calculate the chain-key. Since we need to
+ * differentiate between the chains which contain the read acquire of
+ * a lock from the chains having write acquire of the same lock,
+ * we offset the class_idx by MAX_LOCKDEP_KEYS if it is a read acquire.
+ *
+ * Thus the lock's key during a chain-key calculation can be in the range
+ * 1 to 2 * MAX_LOCKDEP_KEYS - 1.
+ *
+ * LOCKDEP_CHAIN_KEY_BITS holds the number of bits required to
+ * represent this range.
+ */
+#define LOCKDEP_CHAIN_KEY_BITS (MAX_LOCKDEP_KEYS_BITS + 1)
struct held_lock {
/*
* One-way hash of the dependency chain up to this point. We
diff --git a/kernel/lockdep.c b/kernel/lockdep.c
index 5f92ea4..c644af8 100644
--- a/kernel/lockdep.c
+++ b/kernel/lockdep.c
@@ -286,8 +286,8 @@ static struct list_head chainhash_table[CHAINHASH_SIZE];
* unique.
*/
#define iterate_chain_key(key1, key2) \
- (((key1) << MAX_LOCKDEP_KEYS_BITS) ^ \
- ((key1) >> (64-MAX_LOCKDEP_KEYS_BITS)) ^ \
+ (((key1) << LOCKDEP_CHAIN_KEY_BITS) ^ \
+ ((key1) >> (64 - LOCKDEP_CHAIN_KEY_BITS)) ^ \
(key2))
void lockdep_off(void)
@@ -1802,6 +1802,9 @@ static void check_chain_key(struct task_struct *curr)
if (DEBUG_LOCKS_WARN_ON(id >= MAX_LOCKDEP_KEYS))
return;
+ if (is_read(hlock->rw_state))
+ id += MAX_LOCKDEP_KEYS;
+
if (prev_hlock && (prev_hlock->irq_context !=
hlock->irq_context))
chain_key = 0;
@@ -2614,6 +2617,18 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
if (DEBUG_LOCKS_WARN_ON(id >= MAX_LOCKDEP_KEYS))
return 0;
+ /*
+ * Factor in the read/write state in the chain key calculation.
+ *
+ * Two chains containing lock dependencies in the same order can
+ * still differ due to their read/write state
+ * eg: lock(A)->Rlock(B) is different from lock(A)->Wlock(B)
+ *
+ * Hence distinguish between such chains.
+ */
+ if (is_read(rw_state))
+ id += MAX_LOCKDEP_KEYS;
+
chain_key = curr->curr_chain_key;
if (!depth) {
if (DEBUG_LOCKS_WARN_ON(chain_key != 0))
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [RFC/PATCH PATCH 5/6] lockdep: Maintain rw_state entries in locklist
2009-05-11 12:39 [RFC/PATCH 0/6] lockdep: Maintain read/write states for lock_acquires Gautham R Shenoy
` (3 preceding siblings ...)
2009-05-11 12:39 ` [RFC/PATCH PATCH 4/6] lockdep: Separate lock ids for read/write acquires Gautham R Shenoy
@ 2009-05-11 12:39 ` Gautham R Shenoy
2009-05-11 12:39 ` [RFC/PATCH PATCH 6/6] lockdep: Consider the rw_state of lock while validating the chain Gautham R Shenoy
2009-05-11 13:09 ` [RFC/PATCH 0/6] lockdep: Maintain read/write states for lock_acquires Peter Zijlstra
6 siblings, 0 replies; 8+ messages in thread
From: Gautham R Shenoy @ 2009-05-11 12:39 UTC (permalink / raw)
To: Ingo Molnar, Peter Zijlstra, Paul E McKenney, Oleg Nesterov; +Cc: linux-kernel
The dependencies are currently maintained using a structure named lock_list.
For a dependency A --> B, it saves B's lock_class in an entry that is
linked to A's locks_after list.
Enhance this infrastructure to save the read/write states of A and B for each
dependency. Also, rename a variable in lock_list to reflect its usage better,
and change the parameter list of add_lock_to_list() to allow us to record the
r/w states of the two locks involved in a dependency.
Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
---
include/linux/lockdep.h | 16 ++++++++++++++
kernel/lockdep.c | 52 ++++++++++++++++++++++++++++++-----------------
kernel/lockdep_proc.c | 4 ++--
3 files changed, 50 insertions(+), 22 deletions(-)
diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index 98cb434..d5c246f 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -141,14 +141,28 @@ struct lockdep_map {
};
/*
+ * lock_list: List of the other locks taken before or after a particular lock.
+ *
+ * @entry - List head linking this particular entry to the
+ * lock_after/lock_before list of a particular lock.
+ * @dep_class - lock_class of the lock which is involved in a dependency with
+ * the lock to which this entry is linked to.
+ * @this_lock_rw_state - The read/write state of the lock to which this
+ * dependency entry belongs to.
+ * @dep_lock_rw_state - The read/write state of the lock with the lock class
+ * dep_class in this particular dependency involvement.
+ *
+ *
* Every lock has a list of other locks that were taken after it.
* We only grow the list, never remove from it:
*/
struct lock_list {
struct list_head entry;
- struct lock_class *class;
+ struct lock_class *dep_class;
struct stack_trace trace;
int distance;
+ unsigned int this_lock_rw_state:3;
+ unsigned int dep_lock_rw_state:3;
};
/*
diff --git a/kernel/lockdep.c b/kernel/lockdep.c
index c644af8..e3134f0 100644
--- a/kernel/lockdep.c
+++ b/kernel/lockdep.c
@@ -590,10 +590,10 @@ print_lock_dependencies(struct lock_class *class, int depth)
print_lock_class_header(class, depth);
list_for_each_entry(entry, &class->locks_after, entry) {
- if (DEBUG_LOCKS_WARN_ON(!entry->class))
+ if (DEBUG_LOCKS_WARN_ON(!entry->dep_class))
return;
- print_lock_dependencies(entry->class, depth + 1);
+ print_lock_dependencies(entry->dep_class, depth + 1);
printk("%*s ... acquired at:\n",depth,"");
print_stack_trace(&entry->trace, 2);
@@ -866,10 +866,13 @@ static struct lock_list *alloc_list_entry(void)
/*
* Add a new dependency to the head of the list:
*/
-static int add_lock_to_list(struct lock_class *class, struct lock_class *this,
- struct list_head *head, unsigned long ip, int distance)
+static int add_lock_to_list(struct held_lock *this_hlock,
+ struct held_lock *dep_hlock,
+ struct list_head *head, unsigned long ip,
+ int distance)
{
struct lock_list *entry;
+ struct lock_class *dep_class = hlock_class(dep_hlock);
/*
* Lock not present yet - get a new dependency struct and
* add it to the list:
@@ -881,8 +884,10 @@ static int add_lock_to_list(struct lock_class *class, struct lock_class *this,
if (!save_trace(&entry->trace))
return 0;
- entry->class = this;
+ entry->dep_class = dep_class;
entry->distance = distance;
+ entry->this_lock_rw_state = this_hlock->rw_state;
+ entry->dep_lock_rw_state = dep_hlock->rw_state;
/*
* Since we never remove from the dependency list, the list can
* be walked lockless by other CPUs, it's only allocation
@@ -916,7 +921,7 @@ print_circular_bug_entry(struct lock_list *target, unsigned int depth)
if (debug_locks_silent)
return 0;
printk("\n-> #%u", depth);
- print_lock_name(target->class);
+ print_lock_name(target->dep_class);
printk(":\n");
print_stack_trace(&target->trace, 6);
@@ -960,7 +965,7 @@ static noinline int print_circular_bug_tail(void)
if (debug_locks_silent)
return 0;
- this.class = hlock_class(check_source);
+ this.dep_class = hlock_class(check_source);
if (!save_trace(&this.trace))
return 0;
@@ -1000,7 +1005,8 @@ unsigned long __lockdep_count_forward_deps(struct lock_class *class,
* Recurse this class's dependency list:
*/
list_for_each_entry(entry, &class->locks_after, entry)
- ret += __lockdep_count_forward_deps(entry->class, depth + 1);
+ ret += __lockdep_count_forward_deps(entry->dep_class,
+ depth + 1);
return ret;
}
@@ -1030,7 +1036,8 @@ unsigned long __lockdep_count_backward_deps(struct lock_class *class,
* Recurse this class's dependency list:
*/
list_for_each_entry(entry, &class->locks_before, entry)
- ret += __lockdep_count_backward_deps(entry->class, depth + 1);
+ ret += __lockdep_count_backward_deps(entry->dep_class,
+ depth + 1);
return ret;
}
@@ -1069,10 +1076,10 @@ check_noncircular(struct lock_class *source, unsigned int depth)
* Check this lock's dependency list:
*/
list_for_each_entry(entry, &source->locks_after, entry) {
- if (entry->class == hlock_class(check_target))
+ if (entry->dep_class == hlock_class(check_target))
return print_circular_bug_header(entry, depth+1);
debug_atomic_inc(&nr_cyclic_checks);
- if (!check_noncircular(entry->class, depth+1))
+ if (!check_noncircular(entry->dep_class, depth+1))
return print_circular_bug_entry(entry, depth+1);
}
return 1;
@@ -1122,7 +1129,7 @@ find_usage_forwards(struct lock_class *source, unsigned int depth)
*/
list_for_each_entry(entry, &source->locks_after, entry) {
debug_atomic_inc(&nr_find_usage_forwards_recursions);
- ret = find_usage_forwards(entry->class, depth+1);
+ ret = find_usage_forwards(entry->dep_class, depth+1);
if (ret == 2 || ret == 0)
return ret;
}
@@ -1172,7 +1179,7 @@ find_usage_backwards(struct lock_class *source, unsigned int depth)
*/
list_for_each_entry(entry, &source->locks_before, entry) {
debug_atomic_inc(&nr_find_usage_backwards_recursions);
- ret = find_usage_backwards(entry->class, depth+1);
+ ret = find_usage_backwards(entry->dep_class, depth+1);
if (ret == 2 || ret == 0)
return ret;
}
@@ -1507,26 +1514,33 @@ check_prev_add(struct task_struct *curr, struct held_lock *prev,
* L2 added to its dependency list, due to the first chain.)
*/
list_for_each_entry(entry, &hlock_class(prev)->locks_after, entry) {
- if (entry->class == hlock_class(next)) {
+ if (entry->dep_class == hlock_class(next)) {
if (distance == 1)
entry->distance = 1;
+ entry->this_lock_rw_state |= prev->rw_state;
+ entry->dep_lock_rw_state |= next->rw_state;
return 2;
}
}
+ list_for_each_entry(entry, &hlock_class(next)->locks_before, entry) {
+ if (entry->dep_class == hlock_class(prev)) {
+ entry->this_lock_rw_state |= next->rw_state;
+ entry->dep_lock_rw_state |= prev->rw_state;
+ }
+ }
+
/*
* Ok, all validations passed, add the new lock
* to the previous lock's dependency list:
*/
- ret = add_lock_to_list(hlock_class(prev), hlock_class(next),
- &hlock_class(prev)->locks_after,
+ ret = add_lock_to_list(prev, next, &hlock_class(prev)->locks_after,
next->acquire_ip, distance);
if (!ret)
return 0;
- ret = add_lock_to_list(hlock_class(next), hlock_class(prev),
- &hlock_class(next)->locks_before,
+ ret = add_lock_to_list(next, prev, &hlock_class(next)->locks_before,
next->acquire_ip, distance);
if (!ret)
return 0;
@@ -3215,7 +3229,7 @@ static void zap_class(struct lock_class *class)
* involved in:
*/
for (i = 0; i < nr_list_entries; i++) {
- if (list_entries[i].class == class)
+ if (list_entries[i].dep_class == class)
list_del_rcu(&list_entries[i].entry);
}
/*
diff --git a/kernel/lockdep_proc.c b/kernel/lockdep_proc.c
index d7135aa..da4880f 100644
--- a/kernel/lockdep_proc.c
+++ b/kernel/lockdep_proc.c
@@ -109,8 +109,8 @@ static int l_show(struct seq_file *m, void *v)
list_for_each_entry(entry, &class->locks_after, entry) {
if (entry->distance == 1) {
- seq_printf(m, " -> [%p] ", entry->class->key);
- print_name(m, entry->class);
+ seq_printf(m, " -> [%p] ", entry->dep_class->key);
+ print_name(m, entry->dep_class);
seq_puts(m, "\n");
}
}
* [RFC/PATCH PATCH 6/6] lockdep: Consider the rw_state of lock while validating the chain.
2009-05-11 12:39 [RFC/PATCH 0/6] lockdep: Maintain read/write states for lock_acquires Gautham R Shenoy
` (4 preceding siblings ...)
2009-05-11 12:39 ` [RFC/PATCH PATCH 5/6] lockdep: Maintain rw_state entries in locklist Gautham R Shenoy
@ 2009-05-11 12:39 ` Gautham R Shenoy
2009-05-11 13:09 ` [RFC/PATCH 0/6] lockdep: Maintain read/write states for lock_acquires Peter Zijlstra
6 siblings, 0 replies; 8+ messages in thread
From: Gautham R Shenoy @ 2009-05-11 12:39 UTC (permalink / raw)
To: Ingo Molnar, Peter Zijlstra, Paul E McKenney, Oleg Nesterov; +Cc: linux-kernel
Currently, while validating a particular dependency chain, we check if, by
following the locks_after list of the current lock, we can reach one of the
previous locks in the current chain.
i.e., if the current chain is A --> B --> C,
then starting from C->locks_after, can we reach B?
However, if there are dependency chains:
C --> D --> Rlock(E)
Rlock(E) --> A,
where E is a recursive read lock, this chain cannot cause a deadlock because
of the recursive nature of E.
However, if the dependency chain were
Wlock(E) --> A,
it would have caused a deadlock.
We solve this problem by defining conflicting states of each lock where
Conflict(WRITE)          = WRITE, READ, RECURSIVE_READ
Conflict(READ)           = WRITE, READ
Conflict(RECURSIVE_READ) = WRITE
Thus, while traversing the locks_after list of a particular lock A, we visit
only those entries whose recorded state of A conflicts with the state of A
with which we started the traversal.
Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
---
include/linux/lockdep.h | 20 ++++++++++++++++
kernel/lockdep.c | 60 +++++++++++++++++++++++++++++------------------
2 files changed, 57 insertions(+), 23 deletions(-)
diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index d5c246f..83d92a6 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -326,6 +326,26 @@ extern void lockdep_init_map(struct lockdep_map *lock, const char *name,
#define is_first_rec_read(state) (state & STATE_FIRST_RECURSIVE_READ)
/*
+ * Conflicting states:
+ * - A recursive read conflicts only with a write.
+ * - A non-recursive read conflicts with a non-recursive read and a write.
+ * - A write conflicts with a write, a simple read and a recursive read.
+ */
+
+#define get_rec_read_conflict(state) \
+ ((((is_write(state)) << (_RR_ - _W_))) & STATE_RECURSIVE_READ)
+
+#define get_simple_read_conflict(state) \
+ (((((is_write(state)) << (_R_ - _W_))) | (is_simple_read(state))) \
+ & STATE_READ)
+#define get_write_conflict(state) STATE_WRITE
+
+#define get_lock_conflict_states(state) \
+ (get_rec_read_conflict(state) | \
+ get_simple_read_conflict(state) | \
+ get_write_conflict(state))
+
+/*
* Acquire a lock.
*
* Values for "rw_state":
diff --git a/kernel/lockdep.c b/kernel/lockdep.c
index e3134f0..19936a5 100644
--- a/kernel/lockdep.c
+++ b/kernel/lockdep.c
@@ -1060,7 +1060,8 @@ unsigned long lockdep_count_backward_deps(struct lock_class *class)
* lead to <target>. Print an error and return 0 if it does.
*/
static noinline int
-check_noncircular(struct lock_class *source, unsigned int depth)
+check_noncircular(struct lock_class *source, unsigned int depth,
+ unsigned int conflict_state)
{
struct lock_list *entry;
@@ -1076,10 +1077,13 @@ check_noncircular(struct lock_class *source, unsigned int depth)
* Check this lock's dependency list:
*/
list_for_each_entry(entry, &source->locks_after, entry) {
+ if (!(entry->this_lock_rw_state & conflict_state))
+ continue;
if (entry->dep_class == hlock_class(check_target))
return print_circular_bug_header(entry, depth+1);
debug_atomic_inc(&nr_cyclic_checks);
- if (!check_noncircular(entry->dep_class, depth+1))
+ if (!check_noncircular(entry->dep_class, depth+1,
+ get_lock_conflict_states(entry->dep_lock_rw_state)))
return print_circular_bug_entry(entry, depth+1);
}
return 1;
@@ -1105,7 +1109,8 @@ static struct lock_class *forwards_match, *backwards_match;
* Return 0 on error.
*/
static noinline int
-find_usage_forwards(struct lock_class *source, unsigned int depth)
+find_usage_forwards(struct lock_class *source, unsigned int depth,
+ unsigned int conflict_states)
{
struct lock_list *entry;
int ret;
@@ -1129,7 +1134,10 @@ find_usage_forwards(struct lock_class *source, unsigned int depth)
*/
list_for_each_entry(entry, &source->locks_after, entry) {
debug_atomic_inc(&nr_find_usage_forwards_recursions);
- ret = find_usage_forwards(entry->dep_class, depth+1);
+ if (!(entry->this_lock_rw_state & conflict_states))
+ continue;
+ ret = find_usage_forwards(entry->dep_class, depth+1,
+ get_lock_conflict_states(entry->dep_lock_rw_state));
if (ret == 2 || ret == 0)
return ret;
}
@@ -1147,7 +1155,8 @@ find_usage_forwards(struct lock_class *source, unsigned int depth)
* Return 0 on error.
*/
static noinline int
-find_usage_backwards(struct lock_class *source, unsigned int depth)
+find_usage_backwards(struct lock_class *source, unsigned int depth,
+ unsigned int conflict_states)
{
struct lock_list *entry;
int ret;
@@ -1179,7 +1188,10 @@ find_usage_backwards(struct lock_class *source, unsigned int depth)
*/
list_for_each_entry(entry, &source->locks_before, entry) {
debug_atomic_inc(&nr_find_usage_backwards_recursions);
- ret = find_usage_backwards(entry->dep_class, depth+1);
+ if (!(entry->this_lock_rw_state & conflict_states))
+ continue;
+ ret = find_usage_backwards(entry->dep_class, depth+1,
+ get_lock_conflict_states(entry->dep_lock_rw_state));
if (ret == 2 || ret == 0)
return ret;
}
@@ -1256,12 +1268,14 @@ check_usage(struct task_struct *curr, struct held_lock *prev,
find_usage_bit = bit_backwards;
/* fills in <backwards_match> */
- ret = find_usage_backwards(hlock_class(prev), 0);
+ ret = find_usage_backwards(hlock_class(prev), 0,
+ get_lock_conflict_states(prev->rw_state));
if (!ret || ret == 1)
return ret;
find_usage_bit = bit_forwards;
- ret = find_usage_forwards(hlock_class(next), 0);
+ ret = find_usage_forwards(hlock_class(next), 0,
+ get_lock_conflict_states(next->rw_state));
if (!ret || ret == 1)
return ret;
/* ret == 2 */
@@ -1489,23 +1503,14 @@ check_prev_add(struct task_struct *curr, struct held_lock *prev,
*/
check_source = next;
check_target = prev;
- if (!(check_noncircular(hlock_class(next), 0)))
+ if (!(check_noncircular(hlock_class(next), 0,
+ get_lock_conflict_states(next->rw_state))))
return print_circular_bug_tail();
if (!check_prev_add_irq(curr, prev, next))
return 0;
/*
- * For recursive read-locks we do all the dependency checks,
- * but we dont store read-triggered dependencies (only
- * write-triggered dependencies). This ensures that only the
- * write-side dependencies matter, and that if for example a
- * write-lock never takes any other locks, then the reads are
- * equivalent to a NOP.
- */
- if (is_rec_read(next->rw_state))
- return 1;
- /*
* Is the <prev> -> <next> dependency already present?
*
* (this may occur even though this is a new chain: consider
@@ -1595,7 +1600,8 @@ check_prevs_add(struct task_struct *curr, struct held_lock *next)
* Only non-recursive-read entries get new dependencies
* added:
*/
- if (!is_rec_read(hlock->rw_state)) {
+ if (!is_rec_read(hlock->rw_state) ||
+ is_first_rec_read(hlock->rw_state)) {
if (!check_prev_add(curr, hlock, next, distance))
return 0;
/*
@@ -1767,9 +1773,15 @@ static int validate_chain(struct task_struct *curr, struct lockdep_map *lock,
if (!ret)
return 0;
+
+ if (ret != 2 && is_rec_read(hlock->rw_state))
+ hlock->rw_state |= STATE_FIRST_RECURSIVE_READ;
+
+
/*
* Add dependency only if this lock is not the head
- * of the chain, and if it's not a secondary read-lock:
+ * of the chain, and if it's not the first instance of
+ * the recursive read-lock:
*/
if (!chain_head && ret != 2)
if (!check_prevs_add(curr, hlock))
@@ -1940,7 +1952,8 @@ check_usage_forwards(struct task_struct *curr, struct held_lock *this,
find_usage_bit = bit;
/* fills in <forwards_match> */
- ret = find_usage_forwards(hlock_class(this), 0);
+ ret = find_usage_forwards(hlock_class(this), 0,
+ get_lock_conflict_states(this->rw_state));
if (!ret || ret == 1)
return ret;
@@ -1959,7 +1972,8 @@ check_usage_backwards(struct task_struct *curr, struct held_lock *this,
find_usage_bit = bit;
/* fills in <backwards_match> */
- ret = find_usage_backwards(hlock_class(this), 0);
+ ret = find_usage_backwards(hlock_class(this), 0,
+ get_lock_conflict_states(this->rw_state));
if (!ret || ret == 1)
return ret;
* Re: [RFC/PATCH 0/6] lockdep: Maintain read/write states for lock_acquires
2009-05-11 12:39 [RFC/PATCH 0/6] lockdep: Maintain read/write states for lock_acquires Gautham R Shenoy
` (5 preceding siblings ...)
2009-05-11 12:39 ` [RFC/PATCH PATCH 6/6] lockdep: Consider the rw_state of lock while validating the chain Gautham R Shenoy
@ 2009-05-11 13:09 ` Peter Zijlstra
6 siblings, 0 replies; 8+ messages in thread
From: Peter Zijlstra @ 2009-05-11 13:09 UTC (permalink / raw)
To: Gautham R Shenoy
Cc: Ingo Molnar, Paul E McKenney, Oleg Nesterov, linux-kernel
On Mon, 2009-05-11 at 18:09 +0530, Gautham R Shenoy wrote:
> Feedback is very much appreciated.
Awesome work :-)
Didn't spot any obvious (in so far as that word can apply to this
series) boo-boos.
More in-depth review will follow a bit later.
end of thread, other threads:[~2009-05-11 13:09 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-05-11 12:39 [RFC/PATCH 0/6] lockdep: Maintain read/write states for lock_acquires Gautham R Shenoy
2009-05-11 12:39 ` [RFC/PATCH PATCH 1/6] lockdep: Remove redundant read checks Gautham R Shenoy
2009-05-11 12:39 ` [RFC/PATCH PATCH 2/6] lockdep: Remove strict " Gautham R Shenoy
2009-05-11 12:39 ` [RFC/PATCH PATCH 3/6] lockdep: Annotate Read/write states Gautham R Shenoy
2009-05-11 12:39 ` [RFC/PATCH PATCH 4/6] lockdep: Seperate lock ids for read/write acquires Gautham R Shenoy
2009-05-11 12:39 ` [RFC/PATCH PATCH 5/6] lockdep: Maintain rw_state entries in locklist Gautham R Shenoy
2009-05-11 12:39 ` [RFC/PATCH PATCH 6/6] lockdep: Consider the rw_state of lock while validating the chain Gautham R Shenoy
2009-05-11 13:09 ` [RFC/PATCH 0/6] lockdep: Maintain read/write states for lock_acquires Peter Zijlstra