public inbox for linux-kernel@vger.kernel.org
* [RFC][PATCH 0/7] lockdep: Support recursive-read locks
@ 2011-04-17  9:45 Peter Zijlstra
  2011-04-17  9:45 ` [RFC][PATCH 1/7] lockdep: Implement extra recursive-read lock tests Peter Zijlstra
                   ` (7 more replies)
  0 siblings, 8 replies; 30+ messages in thread
From: Peter Zijlstra @ 2011-04-17  9:45 UTC (permalink / raw)
  To: Ingo Molnar, LKML
  Cc: Tetsuo Handa, Steven Rostedt, Thomas Gleixner, Peter Zijlstra

This series is a rebase of an earlier series by Gautham.

  http://lkml.org/lkml/2009/5/11/203

I left out the second patch:

  http://lkml.org/lkml/2009/5/11/205

because I couldn't reconstruct the thinking behind it, nor could I find
a good reason for it. Furthermore, I added a patch at the start that
implements a few extra locking self-tests.

This series wants testing, and the self-tests really want to be extended
to cover more of the recursive-read lock semantics, to gain confidence
that it all works as intended.

I've also rewritten most of the changelogs to (hopefully) be better.




^ permalink raw reply	[flat|nested] 30+ messages in thread

* [RFC][PATCH 1/7] lockdep: Implement extra recursive-read lock tests
  2011-04-17  9:45 [RFC][PATCH 0/7] lockdep: Support recursive-read locks Peter Zijlstra
@ 2011-04-17  9:45 ` Peter Zijlstra
  2011-04-17  9:45 ` [RFC][PATCH 2/7] lockdep: Remove redundant read checks Peter Zijlstra
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 30+ messages in thread
From: Peter Zijlstra @ 2011-04-17  9:45 UTC (permalink / raw)
  To: Ingo Molnar, LKML
  Cc: Tetsuo Handa, Steven Rostedt, Thomas Gleixner, Peter Zijlstra

[-- Attachment #1: lockdep-test-recursive-read.patch --]
[-- Type: text/plain, Size: 2658 bytes --]

Lockdep is not able to detect simple rw deadlocks because it is not
tracking recursive read dependencies:

        read_lock(A) --> spin_lock(B)
        spin_lock(B) --> write_lock(A) /* should fail */

Furthermore, the following should not result in a deadlock for
recursive read locks:

        read_lock(A) --> spin_lock(B)
        spin_lock(B) --> read_lock(A) /* success */

Add these checks so that we may improve lockdep to accurately track
recursive read locks and report the correct answers.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 lib/locking-selftest.c |   61 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 61 insertions(+)

Index: tip/lib/locking-selftest.c
===================================================================
--- tip.orig/lib/locking-selftest.c
+++ tip/lib/locking-selftest.c
@@ -185,13 +185,16 @@ static void init_shared_classes(void)
 
 #define ML(x)			mutex_lock(&mutex_##x)
 #define MU(x)			mutex_unlock(&mutex_##x)
+#define MLU(x)			ML(x); MU(x)
 #define MI(x)			mutex_init(&mutex_##x)
 
 #define WSL(x)			down_write(&rwsem_##x)
 #define WSU(x)			up_write(&rwsem_##x)
+#define WSLU(x)			WSL(x); WSU(x)
 
 #define RSL(x)			down_read(&rwsem_##x)
 #define RSU(x)			up_read(&rwsem_##x)
+#define RSLU(x)			RSL(x); RSU(x)
 #define RWSI(x)			init_rwsem(&rwsem_##x)
 
 #define LOCK_UNLOCK_2(x,y)	LOCK(x); LOCK(y); UNLOCK(y); UNLOCK(x)
@@ -300,6 +303,50 @@ static void rsem_AA3(void)
 	RSL(X2); // this one should fail
 }
 
+static void rwlock_spinlock_ABBA(void)
+{
+	RL(A);
+	LU(B);
+	RU(A);
+
+	L(B);
+	WLU(A);	/* fail */
+	U(B);
+}
+
+static void rlock_spinlock_ABBA(void)
+{
+	RL(A);
+	LU(B);
+	RU(A);
+
+	L(B);
+	RLU(A); /* not fail */
+	U(B);
+}
+
+static void rwsem_mutex_ABBA(void)
+{
+	RSL(A);
+	MLU(B);
+	RSU(A);
+
+	ML(B);
+	WSLU(A); /* fail */
+	MU(B);
+}
+
+static void rsem_mutex_ABBA(void)
+{
+	RSL(A);
+	MLU(B);
+	RSU(A);
+
+	ML(B);
+	RSLU(A); /* fail */
+	MU(B);
+}
+
 /*
  * ABBA deadlock:
  */
@@ -1174,6 +1221,20 @@ void locking_selftest(void)
 	dotest(rsem_AA3, FAILURE, LOCKTYPE_RWSEM);
 	printk("\n");
 
+	print_testname("mixed write-spin-read-lock");
+	printk("             |");
+	dotest(rwlock_spinlock_ABBA, FAILURE, LOCKTYPE_RWLOCK);
+	printk("             |");
+	dotest(rwsem_mutex_ABBA, FAILURE, LOCKTYPE_RWSEM);
+	printk("\n");
+
+	print_testname("mixed read-spin-read-lock");
+	printk("             |");
+	dotest(rlock_spinlock_ABBA, SUCCESS, LOCKTYPE_RWLOCK);
+	printk("             |");
+	dotest(rsem_mutex_ABBA, FAILURE, LOCKTYPE_RWSEM);
+	printk("\n");
+
 	printk("  --------------------------------------------------------------------------\n");
 
 	/*




* [RFC][PATCH 2/7] lockdep: Remove redundant read checks
  2011-04-17  9:45 [RFC][PATCH 0/7] lockdep: Support recursive-read locks Peter Zijlstra
  2011-04-17  9:45 ` [RFC][PATCH 1/7] lockdep: Implement extra recursive-read lock tests Peter Zijlstra
@ 2011-04-17  9:45 ` Peter Zijlstra
  2011-04-18 14:28   ` Steven Rostedt
  2011-04-17  9:45 ` [RFC][PATCH 3/7] lockdep: Annotate read/write states Peter Zijlstra
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 30+ messages in thread
From: Peter Zijlstra @ 2011-04-17  9:45 UTC (permalink / raw)
  To: Ingo Molnar, LKML
  Cc: Tetsuo Handa, Steven Rostedt, Thomas Gleixner, Peter Zijlstra

[-- Attachment #1: gautham_r_shenoy-lockdep-remove_redundant_read_checks_.patch --]
[-- Type: text/plain, Size: 1773 bytes --]

From: Gautham R Shenoy <ego@in.ibm.com>

Make various simplifications:

1) In kernel/lockdep.c::validate_chain():

	ret = check_deadlock(curr, hlock, lock, hlock->read);

ret == 2 only if hlock->read == 2.

Hence:

	if (ret == 2)
		hlock->read = 2;

is redundant and can be removed.

2) In kernel/lockdep.c::check_prevs_add(curr, next):

	if (hlock->read != 2)
		check_prev_add(curr, hlock, next, distance);

Thus, check_prev_add is called only when hlock->read != 2.

From the conclusions of 2):

kernel/lockdep.c::check_prev_add(curr, prev, next, distance) gets called
iff prev->read != 2.

Hence, in kernel/lockdep.c::check_prev_add(curr, prev, next, distance):

	if (prev->read == 2)
		return 1;

is redundant and can be removed.

Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 kernel/lockdep.c |    9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)

Index: tip/kernel/lockdep.c
===================================================================
--- tip.orig/kernel/lockdep.c
+++ tip/kernel/lockdep.c
@@ -1676,7 +1676,7 @@ check_prev_add(struct task_struct *curr,
 	 * write-lock never takes any other locks, then the reads are
 	 * equivalent to a NOP.
 	 */
-	if (next->read == 2 || prev->read == 2)
+	if (next->read == 2)
 		return 1;
 	/*
 	 * Is the <prev> -> <next> dependency already present?
@@ -1940,13 +1940,6 @@ static int validate_chain(struct task_st
 		if (!ret)
 			return 0;
 		/*
-		 * Mark recursive read, as we jump over it when
-		 * building dependencies (just like we jump over
-		 * trylock entries):
-		 */
-		if (ret == 2)
-			hlock->read = 2;
-		/*
 		 * Add dependency only if this lock is not the head
 		 * of the chain, and if it's not a secondary read-lock:
 		 */




* [RFC][PATCH 3/7] lockdep: Annotate read/write states
  2011-04-17  9:45 [RFC][PATCH 0/7] lockdep: Support recursive-read locks Peter Zijlstra
  2011-04-17  9:45 ` [RFC][PATCH 1/7] lockdep: Implement extra recursive-read lock tests Peter Zijlstra
  2011-04-17  9:45 ` [RFC][PATCH 2/7] lockdep: Remove redundant read checks Peter Zijlstra
@ 2011-04-17  9:45 ` Peter Zijlstra
  2011-04-18 13:34   ` Yong Zhang
                     ` (3 more replies)
  2011-04-17  9:45 ` [RFC][PATCH 4/7] lockdep: Separate lock ids for read/write acquires Peter Zijlstra
                   ` (4 subsequent siblings)
  7 siblings, 4 replies; 30+ messages in thread
From: Peter Zijlstra @ 2011-04-17  9:45 UTC (permalink / raw)
  To: Ingo Molnar, LKML
  Cc: Tetsuo Handa, Steven Rostedt, Thomas Gleixner, Peter Zijlstra

[-- Attachment #1: gautham_r_shenoy-lockdep-annotate_read_write_states.patch --]
[-- Type: text/plain, Size: 15491 bytes --]

From: Gautham R Shenoy <ego@in.ibm.com>

Currently we do not save recursive read dependencies in the dependency
chain. As a result, a deadlock caused by the following chains is not spotted,
since we never have chain 1 in our dependency list:

 1: Rlock(A) --> lock(B)
 2: lock(B)  --> Wlock(A), where A is a recursive read lock.

Before adding the Recursive Read locks to the dependency chains, we need to
distinguish them from the normal read locks since the conflicting states for
these two are quite different.

Currently the read/write status of a lock while it's acquired is denoted by a
monotonically increasing variable where:

 0 - WRITE
 1 - READ
 2 - RECURSIVE READ

In this patch, we propose to modify this distinction from a monotonically
increasing variable to a bit mask where:

 0x1 - WRITE
 0x2 - READ
 0x4 - RECURSIVE READ

This makes it easy to define the conflicting states for each lock.
The conflicting states for a given state are defined as follows:

 Conflicting_states(WRITE):             RECURSIVE_READ | READ | WRITE
 Conflicting_states(READ):                               READ | WRITE
 Conflicting_states(RECURSIVE_READ):                            WRITE

Also, we use one more bit in the bitmask to distinguish the first recursive
read in the current chain from the others, since it is sufficient to add only
this dependency to the dependency list.

Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 include/linux/lockdep.h |  107 ++++++++++++++++++++++++++++++++++++------------
 kernel/lockdep.c        |   46 ++++++++++----------
 2 files changed, 105 insertions(+), 48 deletions(-)

Index: tip/include/linux/lockdep.h
===================================================================
--- tip.orig/include/linux/lockdep.h
+++ tip/include/linux/lockdep.h
@@ -233,10 +233,11 @@ struct held_lock {
 	unsigned int irq_context:2; /* bit 0 - soft, bit 1 - hard */
 	unsigned int trylock:1;						/* 16 bits */
 
-	unsigned int read:2;        /* see lock_acquire() comment */
+	unsigned int rw_state:4;        /* see lock states comment */
 	unsigned int check:2;       /* see lock_acquire() comment */
 	unsigned int hardirqs_off:1;
-	unsigned int references:11;					/* 32 bits */
+
+	unsigned short references;
 };
 
 /*
@@ -286,6 +287,15 @@ extern void lockdep_init_map(struct lock
 
 #define lockdep_set_novalidate_class(lock) \
 	lockdep_set_class(lock, &__lockdep_no_validate__)
+
+/* lock_state bits */
+#define _W_		0
+#define _R_		(_W_ + 1)
+#define _RR_		(_W_ + 2)
+#define _FIRST_RR_	(_W_ + 3)
+
+#define get_lock_state(lock_bit)	(1 << lock_bit)
+
 /*
  * Compare locking classes
  */
@@ -298,13 +308,40 @@ static inline int lockdep_match_key(stru
 }
 
 /*
- * Acquire a lock.
+ * lock states:
+ *
+ * A given lock can have one of the following states:
+ * - STATE_WRITE: Exclusive Write.
+ * - STATE_READ: Non-recursive read.
+ * - STATE_RECURSIVE_READ: Recursive Read.
  *
- * Values for "read":
+ * These states are represented as bitmasks
+ */
+#define STATE_WRITE		get_lock_state(_W_)
+#define STATE_READ		get_lock_state(_R_)
+#define STATE_RECURSIVE_READ	get_lock_state(_RR_)
+
+/* Helpers to check the current lock's state */
+#define is_simple_read(state)	(state & STATE_READ)
+#define is_rec_read(state)	(state & STATE_RECURSIVE_READ)
+#define is_read(state)		(is_simple_read(state) | is_rec_read(state))
+#define is_write(state)		(state & STATE_WRITE)
+
+/*
+ * When maintaining dependency chains for locks acquired with
+ * STATE_RECURSIVE_READ state, we add a particular instance of the lock to the
+ * dependency chain only if it's the first recursive read instance in the
+ * current dependency chain. We don't add recurring instances of the lock
+ * to the dependency chains.
+ */
+#define STATE_FIRST_RECURSIVE_READ	get_lock_state(_FIRST_RR_)
+#define is_first_rec_read(state)	(state & STATE_FIRST_RECURSIVE_READ)
+
+/*
+ * Acquire a lock.
  *
- *   0: exclusive (write) acquire
- *   1: read-acquire (no recursion allowed)
- *   2: read-acquire with same-instance recursion allowed
+ * Values for "rw_state":
+ *	See comment for lock states:
  *
  * Values for check:
  *
@@ -313,7 +350,7 @@ static inline int lockdep_match_key(stru
  *   2: full validation
  */
 extern void lock_acquire(struct lockdep_map *lock, unsigned int subclass,
-			 int trylock, int read, int check,
+			 int trylock, int rw_state, int check,
 			 struct lockdep_map *nest_lock, unsigned long ip);
 
 extern void lock_release(struct lockdep_map *lock, int nested,
@@ -457,11 +494,15 @@ static inline void print_irqtrace_events
 
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 # ifdef CONFIG_PROVE_LOCKING
-#  define spin_acquire(l, s, t, i)		lock_acquire(l, s, t, 0, 2, NULL, i)
-#  define spin_acquire_nest(l, s, t, n, i)	lock_acquire(l, s, t, 0, 2, n, i)
+#  define spin_acquire(l, s, t, i)		\
+		lock_acquire(l, s, t, STATE_WRITE, 2, NULL, i)
+#  define spin_acquire_nest(l, s, t, n, i)	\
+		lock_acquire(l, s, t, STATE_WRITE, 2, n, i)
 # else
-#  define spin_acquire(l, s, t, i)		lock_acquire(l, s, t, 0, 1, NULL, i)
-#  define spin_acquire_nest(l, s, t, n, i)	lock_acquire(l, s, t, 0, 1, NULL, i)
+#  define spin_acquire(l, s, t, i)		\
+		lock_acquire(l, s, t, STATE_WRITE, 1, NULL, i)
+#  define spin_acquire_nest(l, s, t, n, i)	\
+		lock_acquire(l, s, t, STATE_WRITE, 1, NULL, i)
 # endif
 # define spin_release(l, n, i)			lock_release(l, n, i)
 #else
@@ -471,11 +512,15 @@ static inline void print_irqtrace_events
 
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 # ifdef CONFIG_PROVE_LOCKING
-#  define rwlock_acquire(l, s, t, i)		lock_acquire(l, s, t, 0, 2, NULL, i)
-#  define rwlock_acquire_read(l, s, t, i)	lock_acquire(l, s, t, 2, 2, NULL, i)
+#  define rwlock_acquire(l, s, t, i)		\
+		lock_acquire(l, s, t, STATE_WRITE, 2, NULL, i)
+#  define rwlock_acquire_read(l, s, t, i)	\
+		lock_acquire(l, s, t, STATE_RECURSIVE_READ, 2, NULL, i)
 # else
-#  define rwlock_acquire(l, s, t, i)		lock_acquire(l, s, t, 0, 1, NULL, i)
-#  define rwlock_acquire_read(l, s, t, i)	lock_acquire(l, s, t, 2, 1, NULL, i)
+#  define rwlock_acquire(l, s, t, i)		\
+		lock_acquire(l, s, t, STATE_WRITE, 1, NULL, i)
+#  define rwlock_acquire_read(l, s, t, i)	\
+		lock_acquire(l, s, t, STATE_RECURSIVE_READ, 1, NULL, i)
 # endif
 # define rwlock_release(l, n, i)		lock_release(l, n, i)
 #else
@@ -486,9 +531,11 @@ static inline void print_irqtrace_events
 
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 # ifdef CONFIG_PROVE_LOCKING
-#  define mutex_acquire(l, s, t, i)		lock_acquire(l, s, t, 0, 2, NULL, i)
+#  define mutex_acquire(l, s, t, i)		\
+		lock_acquire(l, s, t, STATE_WRITE, 2, NULL, i)
 # else
-#  define mutex_acquire(l, s, t, i)		lock_acquire(l, s, t, 0, 1, NULL, i)
+#  define mutex_acquire(l, s, t, i)		\
+		lock_acquire(l, s, t, STATE_WRITE, 1, NULL, i)
 # endif
 # define mutex_release(l, n, i)			lock_release(l, n, i)
 #else
@@ -498,11 +545,15 @@ static inline void print_irqtrace_events
 
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 # ifdef CONFIG_PROVE_LOCKING
-#  define rwsem_acquire(l, s, t, i)		lock_acquire(l, s, t, 0, 2, NULL, i)
-#  define rwsem_acquire_read(l, s, t, i)	lock_acquire(l, s, t, 1, 2, NULL, i)
+#  define rwsem_acquire(l, s, t, i)		\
+		lock_acquire(l, s, t, STATE_WRITE, 2, NULL, i)
+#  define rwsem_acquire_read(l, s, t, i)	\
+		lock_acquire(l, s, t, STATE_READ, 2, NULL, i)
 # else
-#  define rwsem_acquire(l, s, t, i)		lock_acquire(l, s, t, 0, 1, NULL, i)
-#  define rwsem_acquire_read(l, s, t, i)	lock_acquire(l, s, t, 1, 1, NULL, i)
+#  define rwsem_acquire(l, s, t, i)		\
+		lock_acquire(l, s, t, STATE_WRITE, 1, NULL, i)
+#  define rwsem_acquire_read(l, s, t, i)	\
+		lock_acquire(l, s, t, STATE_READ, 1, NULL, i)
 # endif
 # define rwsem_release(l, n, i)			lock_release(l, n, i)
 #else
@@ -513,11 +564,13 @@ static inline void print_irqtrace_events
 
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 # ifdef CONFIG_PROVE_LOCKING
-#  define lock_map_acquire(l)		lock_acquire(l, 0, 0, 0, 2, NULL, _THIS_IP_)
 #  define lock_map_acquire_read(l)	lock_acquire(l, 0, 0, 2, 2, NULL, _THIS_IP_)
+#  define lock_map_acquire(l)		\
+		lock_acquire(l, 0, 0, STATE_WRITE, 2, NULL, _THIS_IP_)
 # else
-#  define lock_map_acquire(l)		lock_acquire(l, 0, 0, 0, 1, NULL, _THIS_IP_)
 #  define lock_map_acquire_read(l)	lock_acquire(l, 0, 0, 2, 1, NULL, _THIS_IP_)
+#  define lock_map_acquire(l)		\
+		lock_acquire(l, 0, 0, STATE_WRITE, 1, NULL, _THIS_IP_)
 # endif
 # define lock_map_release(l)			lock_release(l, 1, _THIS_IP_)
 #else
@@ -530,13 +583,15 @@ static inline void print_irqtrace_events
 # define might_lock(lock) 						\
 do {									\
 	typecheck(struct lockdep_map *, &(lock)->dep_map);		\
-	lock_acquire(&(lock)->dep_map, 0, 0, 0, 2, NULL, _THIS_IP_);	\
+	lock_acquire(&(lock)->dep_map, 0, 0, STATE_WRITE,		\
+						2, NULL, _THIS_IP_);	\
 	lock_release(&(lock)->dep_map, 0, _THIS_IP_);			\
 } while (0)
 # define might_lock_read(lock) 						\
 do {									\
 	typecheck(struct lockdep_map *, &(lock)->dep_map);		\
-	lock_acquire(&(lock)->dep_map, 0, 0, 1, 2, NULL, _THIS_IP_);	\
+	lock_acquire(&(lock)->dep_map, 0, 0, STATE_READ,		\
+						2, NULL, _THIS_IP_);	\
 	lock_release(&(lock)->dep_map, 0, _THIS_IP_);			\
 } while (0)
 #else
Index: tip/kernel/lockdep.c
===================================================================
--- tip.orig/kernel/lockdep.c
+++ tip/kernel/lockdep.c
@@ -256,7 +256,7 @@ static void lock_release_holdtime(struct
 	holdtime = lockstat_clock() - hlock->holdtime_stamp;
 
 	stats = get_lock_stats(hlock_class(hlock));
-	if (hlock->read)
+	if (is_read(hlock->rw_state))
 		lock_time_inc(&stats->read_holdtime, holdtime);
 	else
 		lock_time_inc(&stats->write_holdtime, holdtime);
@@ -1575,7 +1575,7 @@ print_deadlock_bug(struct task_struct *c
  */
 static int
 check_deadlock(struct task_struct *curr, struct held_lock *next,
-	       struct lockdep_map *next_instance, int read)
+	       struct lockdep_map *next_instance, int rw_state)
 {
 	struct held_lock *prev;
 	struct held_lock *nest = NULL;
@@ -1594,7 +1594,7 @@ check_deadlock(struct task_struct *curr,
 		 * Allow read-after-read recursion of the same
 		 * lock class (i.e. read_lock(lock)+read_lock(lock)):
 		 */
-		if ((read == 2) && prev->read)
+		if (is_rec_read(rw_state) && is_read(prev->rw_state))
 			return 2;
 
 		/*
@@ -1676,7 +1676,7 @@ check_prev_add(struct task_struct *curr,
 	 * write-lock never takes any other locks, then the reads are
 	 * equivalent to a NOP.
 	 */
-	if (next->read == 2)
+	if (is_rec_read(next->rw_state))
 		return 1;
 	/*
 	 * Is the <prev> -> <next> dependency already present?
@@ -1765,7 +1765,7 @@ check_prevs_add(struct task_struct *curr
 		 * Only non-recursive-read entries get new dependencies
 		 * added:
 		 */
-		if (hlock->read != 2) {
+		if (!is_rec_read(hlock->rw_state)) {
 			if (!check_prev_add(curr, hlock, next,
 						distance, trylock_loop))
 				return 0;
@@ -1935,7 +1935,7 @@ static int validate_chain(struct task_st
 		 * any of these scenarios could lead to a deadlock. If
 		 * All validations
 		 */
-		int ret = check_deadlock(curr, hlock, lock, hlock->read);
+		int ret = check_deadlock(curr, hlock, lock, hlock->rw_state);
 
 		if (!ret)
 			return 0;
@@ -2273,7 +2273,7 @@ mark_held_locks(struct task_struct *curr
 		hlock = curr->held_locks + i;
 
 		usage_bit = 2 + (mark << 2); /* ENABLED */
-		if (hlock->read)
+		if (is_read(hlock->rw_state))
 			usage_bit += 1; /* READ */
 
 		BUG_ON(usage_bit >= LOCK_USAGE_STATES);
@@ -2486,7 +2486,7 @@ static int mark_irqflags(struct task_str
 	 * mark the lock as used in these contexts:
 	 */
 	if (!hlock->trylock) {
-		if (hlock->read) {
+		if (is_read(hlock->rw_state)) {
 			if (curr->hardirq_context)
 				if (!mark_lock(curr, hlock,
 						LOCK_USED_IN_HARDIRQ_READ))
@@ -2505,7 +2505,7 @@ static int mark_irqflags(struct task_str
 		}
 	}
 	if (!hlock->hardirqs_off) {
-		if (hlock->read) {
+		if (is_read(hlock->rw_state)) {
 			if (!mark_lock(curr, hlock,
 					LOCK_ENABLED_HARDIRQ_READ))
 				return 0;
@@ -2531,7 +2531,7 @@ static int mark_irqflags(struct task_str
 	 * allocation).
 	 */
 	if (!hlock->trylock && (curr->lockdep_reclaim_gfp & __GFP_FS)) {
-		if (hlock->read) {
+		if (is_read(hlock->rw_state)) {
 			if (!mark_lock(curr, hlock, LOCK_USED_IN_RECLAIM_FS_READ))
 					return 0;
 		} else {
@@ -2712,9 +2712,9 @@ struct lock_class_key __lockdep_no_valid
  * We maintain the dependency maps and validate the locking attempt:
  */
 static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
-			  int trylock, int read, int check, int hardirqs_off,
-			  struct lockdep_map *nest_lock, unsigned long ip,
-			  int references)
+			  int trylock, int rw_state, int check,
+			  int hardirqs_off, struct lockdep_map *nest_lock,
+			  unsigned long ip, int references)
 {
 	struct task_struct *curr = current;
 	struct lock_class *class = NULL;
@@ -2786,7 +2786,7 @@ static int __lock_acquire(struct lockdep
 	hlock->instance = lock;
 	hlock->nest_lock = nest_lock;
 	hlock->trylock = trylock;
-	hlock->read = read;
+	hlock->rw_state = rw_state;
 	hlock->check = check;
 	hlock->hardirqs_off = !!hardirqs_off;
 	hlock->references = references;
@@ -2963,7 +2963,7 @@ found_it:
 		hlock = curr->held_locks + i;
 		if (!__lock_acquire(hlock->instance,
 			hlock_class(hlock)->subclass, hlock->trylock,
-				hlock->read, hlock->check, hlock->hardirqs_off,
+				hlock->rw_state, hlock->check, hlock->hardirqs_off,
 				hlock->nest_lock, hlock->acquire_ip,
 				hlock->references))
 			return 0;
@@ -3039,7 +3039,7 @@ found_it:
 		hlock = curr->held_locks + i;
 		if (!__lock_acquire(hlock->instance,
 			hlock_class(hlock)->subclass, hlock->trylock,
-				hlock->read, hlock->check, hlock->hardirqs_off,
+				hlock->rw_state, hlock->check, hlock->hardirqs_off,
 				hlock->nest_lock, hlock->acquire_ip,
 				hlock->references))
 			return 0;
@@ -3192,7 +3192,7 @@ EXPORT_SYMBOL_GPL(lock_set_class);
  * and also avoid lockdep recursion:
  */
 void lock_acquire(struct lockdep_map *lock, unsigned int subclass,
-			  int trylock, int read, int check,
+			  int trylock, int rw_state, int check,
 			  struct lockdep_map *nest_lock, unsigned long ip)
 {
 	unsigned long flags;
@@ -3204,8 +3204,8 @@ void lock_acquire(struct lockdep_map *lo
 	check_flags(flags);
 
 	current->lockdep_recursion = 1;
-	trace_lock_acquire(lock, subclass, trylock, read, check, nest_lock, ip);
-	__lock_acquire(lock, subclass, trylock, read, check,
+	trace_lock_acquire(lock, subclass, trylock, rw_state, check, nest_lock, ip);
+	__lock_acquire(lock, subclass, trylock, rw_state, check,
 		       irqs_disabled_flags(flags), nest_lock, ip, 0);
 	current->lockdep_recursion = 0;
 	raw_local_irq_restore(flags);
@@ -3332,7 +3332,8 @@ found_it:
 	if (contending_point < LOCKSTAT_POINTS)
 		stats->contending_point[contending_point]++;
 	if (lock->cpu != smp_processor_id())
-		stats->bounces[bounce_contended + !!hlock->read]++;
+		stats->bounces[bounce_contended +
+				!!is_read(hlock->rw_state)]++;
 	put_lock_stats(stats);
 }
 
@@ -3380,13 +3381,14 @@ found_it:
 
 	stats = get_lock_stats(hlock_class(hlock));
 	if (waittime) {
-		if (hlock->read)
+		if (is_read(hlock->rw_state))
 			lock_time_inc(&stats->read_waittime, waittime);
 		else
 			lock_time_inc(&stats->write_waittime, waittime);
 	}
 	if (lock->cpu != cpu)
-		stats->bounces[bounce_acquired + !!hlock->read]++;
+		stats->bounces[bounce_acquired +
+				!!is_read(hlock->rw_state)]++;
 	put_lock_stats(stats);
 
 	lock->cpu = cpu;




* [RFC][PATCH 4/7] lockdep: Separate lock ids for read/write acquires
  2011-04-17  9:45 [RFC][PATCH 0/7] lockdep: Support recursive-read locks Peter Zijlstra
                   ` (2 preceding siblings ...)
  2011-04-17  9:45 ` [RFC][PATCH 3/7] lockdep: Annotate read/write states Peter Zijlstra
@ 2011-04-17  9:45 ` Peter Zijlstra
  2011-04-18 16:46   ` Steven Rostedt
  2011-04-18 22:07   ` Peter Zijlstra
  2011-04-17  9:45 ` [RFC][PATCH 5/7] lockdep: Rename lock_list::class Peter Zijlstra
                   ` (3 subsequent siblings)
  7 siblings, 2 replies; 30+ messages in thread
From: Peter Zijlstra @ 2011-04-17  9:45 UTC (permalink / raw)
  To: Ingo Molnar, LKML
  Cc: Tetsuo Handa, Steven Rostedt, Thomas Gleixner, Peter Zijlstra

[-- Attachment #1: gautham_r_shenoy-lockdep-seperate_lock_ids_for_read_write.patch --]
[-- Type: text/plain, Size: 4522 bytes --]

From: Gautham R Shenoy <ego@in.ibm.com>

In order to support recursive read locks we need to support the
previously mentioned lock state conflict matrix:

 Conflicting_states(WRITE):             RECURSIVE_READ | READ | WRITE
 Conflicting_states(READ):                               READ | WRITE
 Conflicting_states(RECURSIVE_READ):                            WRITE

Since this introduces asymmetry between recursive read and write, we
need to split the lock dependency chains such that we can traverse
WRITE chains without observing RECURSIVE_READ|READ chains.

Previously we didn't distinguish between the read and write acquire of
a particular lock, so we got the same keys for two different chains
involving the same locks in the same dependency order, but in
different read/write states.

For example:

	lock(A) ---> Rlock(B)
	lock(A) ---> Wlock(B)

Will overlap and produce a single chain. This is fine for the
symmetric conflict states of exclusive locks (trivially, WRITE excludes
WRITE) and fair read/write locks (note that READ and WRITE
exclude each other), but is not sufficient for the reader-preferring
read/write lock, since the recursive read state does not exclude
itself.

The implementation is straightforward, since the key of a dependency
chain is obtained from the keys of the locks involved in the
dependency chain. The keys of the locks are defined as their
lock_class ids.

We change this to distinguish between the read and write acquires of
a lock by defining the lock's key as follows:

	class_id + is_read_lock(lock) * MAX_LOCKDEP_KEYS

This gains us the proposed split, and thus for the purpose of chain-key
calculation a lock's key can be in the range 1 to 2*MAX_LOCKDEP_KEYS-1.

Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 include/linux/lockdep.h |   14 ++++++++++++++
 kernel/lockdep.c        |   19 +++++++++++++++++--
 2 files changed, 31 insertions(+), 2 deletions(-)

Index: linux-2.6/include/linux/lockdep.h
===================================================================
--- linux-2.6.orig/include/linux/lockdep.h
+++ linux-2.6/include/linux/lockdep.h
@@ -186,6 +186,7 @@ struct lock_chain {
 };
 
 #define MAX_LOCKDEP_KEYS_BITS		13
+
 /*
  * Subtract one because we offset hlock->class_idx by 1 in order
  * to make 0 mean no class. This avoids overflowing the class_idx
@@ -193,6 +194,19 @@ struct lock_chain {
  */
 #define MAX_LOCKDEP_KEYS		((1UL << MAX_LOCKDEP_KEYS_BITS) - 1)
 
+/*
+ * A lock's class id is used to calculate the chain-key. Since we need to
+ * differentiate the chains which contain a read acquire of a lock
+ * from the chains having a write acquire of the same lock,
+ * we offset the class_idx by MAX_LOCKDEP_KEYS if it is a read acquire.
+ *
+ * Thus the lock's key during a chain-key calculation can be in the range
+ * 1 to 2 * MAX_LOCKDEP_KEYS - 1.
+ *
+ * LOCKDEP_CHAIN_KEY_BITS holds the number of bits required to
+ * represent this range.
+ */
+#define LOCKDEP_CHAIN_KEY_BITS	(MAX_LOCKDEP_KEYS_BITS + 1)
 struct held_lock {
 	/*
 	 * One-way hash of the dependency chain up to this point. We
Index: linux-2.6/kernel/lockdep.c
===================================================================
--- linux-2.6.orig/kernel/lockdep.c
+++ linux-2.6/kernel/lockdep.c
@@ -303,8 +303,8 @@ static struct list_head chainhash_table[
  * unique.
  */
 #define iterate_chain_key(key1, key2) \
-	(((key1) << MAX_LOCKDEP_KEYS_BITS) ^ \
-	((key1) >> (64-MAX_LOCKDEP_KEYS_BITS)) ^ \
+	(((key1) << LOCKDEP_CHAIN_KEY_BITS) ^ \
+	((key1) >> (64 - LOCKDEP_CHAIN_KEY_BITS)) ^ \
 	(key2))
 
 void lockdep_off(void)
@@ -1988,6 +1988,9 @@ static void check_chain_key(struct task_
 		if (DEBUG_LOCKS_WARN_ON(id >= MAX_LOCKDEP_KEYS))
 			return;
 
+		if (is_read(hlock->rw_state))
+			id += MAX_LOCKDEP_KEYS;
+
 		if (prev_hlock && (prev_hlock->irq_context !=
 							hlock->irq_context))
 			chain_key = 0;
@@ -2815,6 +2818,18 @@ static int __lock_acquire(struct lockdep
 	if (DEBUG_LOCKS_WARN_ON(id >= MAX_LOCKDEP_KEYS))
 		return 0;
 
+	/*
+	 * Factor in the read/write state in the chain key calculation.
+	 *
+	 * Two chains containing lock dependencies in the same order can
+	 * still differ due to their read/write state
+	 * eg: lock(A)->Rlock(B) is different from lock(A)->Wlock(B)
+	 *
+	 * Hence distinguish between such chains.
+	 */
+	if (is_read(rw_state))
+		id += MAX_LOCKDEP_KEYS;
+
 	chain_key = curr->curr_chain_key;
 	if (!depth) {
 		if (DEBUG_LOCKS_WARN_ON(chain_key != 0))




* [RFC][PATCH 5/7] lockdep: Rename lock_list::class
  2011-04-17  9:45 [RFC][PATCH 0/7] lockdep: Support recursive-read locks Peter Zijlstra
                   ` (3 preceding siblings ...)
  2011-04-17  9:45 ` [RFC][PATCH 4/7] lockdep: Separate lock ids for read/write acquires Peter Zijlstra
@ 2011-04-17  9:45 ` Peter Zijlstra
  2011-04-17  9:45 ` [RFC][PATCH 6/7] lockdep: Maintain rw_state entries in locklist Peter Zijlstra
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 30+ messages in thread
From: Peter Zijlstra @ 2011-04-17  9:45 UTC (permalink / raw)
  To: Ingo Molnar, LKML
  Cc: Tetsuo Handa, Steven Rostedt, Thomas Gleixner, Peter Zijlstra

[-- Attachment #1: gautham_r_shenoy-lockdep-rename-lock_list-class.patch --]
[-- Type: text/plain, Size: 9332 bytes --]

From: Gautham R Shenoy <ego@in.ibm.com>

Rename a variable in struct lock_list to better reflect its usage.

Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 include/linux/lockdep.h |    9 ++++++-
 kernel/lockdep.c        |   56 ++++++++++++++++++++++++------------------------
 kernel/lockdep_proc.c   |    4 +--
 3 files changed, 38 insertions(+), 31 deletions(-)

Index: linux-2.6/include/linux/lockdep.h
===================================================================
--- linux-2.6.orig/include/linux/lockdep.h
+++ linux-2.6/include/linux/lockdep.h
@@ -158,12 +158,19 @@ struct lockdep_map {
 };
 
 /*
+ * lock_list: List of the other locks taken before or after a particular lock.
+ *
+ * @entry -	List head linking this particular entry to the
+ *		locks_after/locks_before list of a particular lock.
+ * @dep_class - lock_class of the lock which is involved in a dependency with
+ *		 the lock to which this entry is linked.
+ *
  * Every lock has a list of other locks that were taken after it.
  * We only grow the list, never remove from it:
  */
 struct lock_list {
 	struct list_head		entry;
-	struct lock_class		*class;
+	struct lock_class		*dep_class;
 	struct stack_trace		trace;
 	int				distance;
 
Index: linux-2.6/kernel/lockdep.c
===================================================================
--- linux-2.6.orig/kernel/lockdep.c
+++ linux-2.6/kernel/lockdep.c
@@ -829,7 +829,7 @@ static int add_lock_to_list(struct lock_
 	if (!entry)
 		return 0;
 
-	entry->class = this;
+	entry->dep_class = this;
 	entry->distance = distance;
 	entry->trace = *trace;
 	/*
@@ -916,7 +916,7 @@ static inline void mark_lock_accessed(st
 	nr = lock - list_entries;
 	WARN_ON(nr >= nr_list_entries);
 	lock->parent = parent;
-	lock->class->dep_gen_id = lockdep_dependency_gen_id;
+	lock->dep_class->dep_gen_id = lockdep_dependency_gen_id;
 }
 
 static inline unsigned long lock_accessed(struct lock_list *lock)
@@ -925,7 +925,7 @@ static inline unsigned long lock_accesse
 
 	nr = lock - list_entries;
 	WARN_ON(nr >= nr_list_entries);
-	return lock->class->dep_gen_id == lockdep_dependency_gen_id;
+	return lock->dep_class->dep_gen_id == lockdep_dependency_gen_id;
 }
 
 static inline struct lock_list *get_lock_parent(struct lock_list *child)
@@ -963,9 +963,9 @@ static int __bfs(struct lock_list *sourc
 	}
 
 	if (forward)
-		head = &source_entry->class->locks_after;
+		head = &source_entry->dep_class->locks_after;
 	else
-		head = &source_entry->class->locks_before;
+		head = &source_entry->dep_class->locks_before;
 
 	if (list_empty(head))
 		goto exit;
@@ -978,15 +978,15 @@ static int __bfs(struct lock_list *sourc
 
 		__cq_dequeue(cq, (unsigned long *)&lock);
 
-		if (!lock->class) {
+		if (!lock->dep_class) {
 			ret = -2;
 			goto exit;
 		}
 
 		if (forward)
-			head = &lock->class->locks_after;
+			head = &lock->dep_class->locks_after;
 		else
-			head = &lock->class->locks_before;
+			head = &lock->dep_class->locks_before;
 
 		list_for_each_entry(entry, head, entry) {
 			if (!lock_accessed(entry)) {
@@ -1046,7 +1046,7 @@ print_circular_bug_entry(struct lock_lis
 	if (debug_locks_silent)
 		return 0;
 	printk("\n-> #%u", depth);
-	print_lock_name(target->class);
+	print_lock_name(target->dep_class);
 	printk(":\n");
 	print_stack_trace(&target->trace, 6);
 
@@ -1086,7 +1086,7 @@ print_circular_bug_header(struct lock_li
 
 static inline int class_equal(struct lock_list *entry, void *data)
 {
-	return entry->class == data;
+	return entry->dep_class == data;
 }
 
 static noinline int print_circular_bug(struct lock_list *this,
@@ -1155,7 +1155,7 @@ unsigned long lockdep_count_forward_deps
 	struct lock_list this;
 
 	this.parent = NULL;
-	this.class = class;
+	this.dep_class = class;
 
 	local_irq_save(flags);
 	arch_spin_lock(&lockdep_lock);
@@ -1182,7 +1182,7 @@ unsigned long lockdep_count_backward_dep
 	struct lock_list this;
 
 	this.parent = NULL;
-	this.class = class;
+	this.dep_class = class;
 
 	local_irq_save(flags);
 	arch_spin_lock(&lockdep_lock);
@@ -1219,14 +1219,14 @@ check_noncircular(struct lock_list *root
 
 static inline int usage_match(struct lock_list *entry, void *bit)
 {
-	return entry->class->usage_mask & (1 << (enum lock_usage_bit)bit);
+	return entry->dep_class->usage_mask & (1 << (enum lock_usage_bit)bit);
 }
 
 
 
 /*
  * Find a node in the forwards-direction dependency sub-graph starting
- * at @root->class that matches @bit.
+ * at @root->dep_class that matches @bit.
  *
  * Return 0 if such a node exists in the subgraph, and put that node
  * into *@target_entry.
@@ -1249,7 +1249,7 @@ find_usage_forwards(struct lock_list *ro
 
 /*
  * Find a node in the backwards-direction dependency sub-graph starting
- * at @root->class that matches @bit.
+ * at @root->dep_class that matches @bit.
  *
  * Return 0 if such a node exists in the subgraph, and put that node
  * into *@target_entry.
@@ -1308,7 +1308,7 @@ print_shortest_lock_dependencies(struct
 	depth = get_lock_depth(leaf);
 
 	do {
-		print_lock_class_header(entry->class, depth);
+		print_lock_class_header(entry->dep_class, depth);
 		printk("%*s ... acquired at:\n", depth, "");
 		print_stack_trace(&entry->trace, 2);
 		printk("\n");
@@ -1363,17 +1363,17 @@ print_bad_irq_dependency(struct task_str
 
 	printk("\nbut this new dependency connects a %s-irq-safe lock:\n",
 		irqclass);
-	print_lock_name(backwards_entry->class);
+	print_lock_name(backwards_entry->dep_class);
 	printk("\n... which became %s-irq-safe at:\n", irqclass);
 
-	print_stack_trace(backwards_entry->class->usage_traces + bit1, 1);
+	print_stack_trace(backwards_entry->dep_class->usage_traces + bit1, 1);
 
 	printk("\nto a %s-irq-unsafe lock:\n", irqclass);
-	print_lock_name(forwards_entry->class);
+	print_lock_name(forwards_entry->dep_class);
 	printk("\n... which became %s-irq-unsafe at:\n", irqclass);
 	printk("...");
 
-	print_stack_trace(forwards_entry->class->usage_traces + bit2, 1);
+	print_stack_trace(forwards_entry->dep_class->usage_traces + bit2, 1);
 
 	printk("\nother info that might help us debug this:\n\n");
 	lockdep_print_held_locks(curr);
@@ -1408,7 +1408,7 @@ check_usage(struct task_struct *curr, st
 
 	this.parent = NULL;
 
-	this.class = hlock_class(prev);
+	this.dep_class = hlock_class(prev);
 	ret = find_usage_backwards(&this, bit_backwards, &target_entry);
 	if (ret < 0)
 		return print_bfs_bug(ret);
@@ -1416,7 +1416,7 @@ check_usage(struct task_struct *curr, st
 		return ret;
 
 	that.parent = NULL;
-	that.class = hlock_class(next);
+	that.dep_class = hlock_class(next);
 	ret = find_usage_forwards(&that, bit_forwards, &target_entry1);
 	if (ret < 0)
 		return print_bfs_bug(ret);
@@ -1657,7 +1657,7 @@ check_prev_add(struct task_struct *curr,
 	 * We are using global variables to control the recursion, to
 	 * keep the stackframe size of the recursive functions low:
 	 */
-	this.class = hlock_class(next);
+	this.dep_class = hlock_class(next);
 	this.parent = NULL;
 	ret = check_noncircular(&this, hlock_class(prev), &target_entry);
 	if (unlikely(!ret))
@@ -1687,7 +1687,7 @@ check_prev_add(struct task_struct *curr,
 	 *  L2 added to its dependency list, due to the first chain.)
 	 */
 	list_for_each_entry(entry, &hlock_class(prev)->locks_after, entry) {
-		if (entry->class == hlock_class(next)) {
+		if (entry->dep_class == hlock_class(next)) {
 			if (distance == 1)
 				entry->distance = 1;
 			return 2;
@@ -2083,7 +2083,7 @@ print_irq_inversion_bug(struct task_stru
 		printk("but this lock took another, %s-unsafe lock in the past:\n", irqclass);
 	else
 		printk("but this lock was taken by another, %s-safe lock in the past:\n", irqclass);
-	print_lock_name(other->class);
+	print_lock_name(other->dep_class);
 	printk("\n\nand interrupts could create inverse lock ordering between them.\n\n");
 
 	printk("\nother info that might help us debug this:\n");
@@ -2113,7 +2113,7 @@ check_usage_forwards(struct task_struct
 	struct lock_list *uninitialized_var(target_entry);
 
 	root.parent = NULL;
-	root.class = hlock_class(this);
+	root.dep_class = hlock_class(this);
 	ret = find_usage_forwards(&root, bit, &target_entry);
 	if (ret < 0)
 		return print_bfs_bug(ret);
@@ -2137,7 +2137,7 @@ check_usage_backwards(struct task_struct
 	struct lock_list *uninitialized_var(target_entry);
 
 	root.parent = NULL;
-	root.class = hlock_class(this);
+	root.dep_class = hlock_class(this);
 	ret = find_usage_backwards(&root, bit, &target_entry);
 	if (ret < 0)
 		return print_bfs_bug(ret);
@@ -3482,7 +3482,7 @@ static void zap_class(struct lock_class
 	 * involved in:
 	 */
 	for (i = 0; i < nr_list_entries; i++) {
-		if (list_entries[i].class == class)
+		if (list_entries[i].dep_class == class)
 			list_del_rcu(&list_entries[i].entry);
 	}
 	/*
Index: linux-2.6/kernel/lockdep_proc.c
===================================================================
--- linux-2.6.orig/kernel/lockdep_proc.c
+++ linux-2.6/kernel/lockdep_proc.c
@@ -83,8 +83,8 @@ static int l_show(struct seq_file *m, vo
 
 	list_for_each_entry(entry, &class->locks_after, entry) {
 		if (entry->distance == 1) {
-			seq_printf(m, " -> [%p] ", entry->class->key);
-			print_name(m, entry->class);
+			seq_printf(m, " -> [%p] ", entry->dep_class->key);
+			print_name(m, entry->dep_class);
 			seq_puts(m, "\n");
 		}
 	}



^ permalink raw reply	[flat|nested] 30+ messages in thread

* [RFC][PATCH 6/7] lockdep: Maintain rw_state entries in locklist
  2011-04-17  9:45 [RFC][PATCH 0/7] lockdep: Support recursive-read locks Peter Zijlstra
                   ` (4 preceding siblings ...)
  2011-04-17  9:45 ` [RFC][PATCH 5/7] lockdep: Rename lock_list::class Peter Zijlstra
@ 2011-04-17  9:45 ` Peter Zijlstra
  2011-04-18 13:37   ` Yong Zhang
  2011-04-17  9:45 ` [RFC][PATCH 7/7] lockdep: Consider the rw_state of lock while validating the chain Peter Zijlstra
  2011-04-18  3:41 ` [RFC][PATCH 0/7] lockdep: Support recursive-read locks Tetsuo Handa
  7 siblings, 1 reply; 30+ messages in thread
From: Peter Zijlstra @ 2011-04-17  9:45 UTC (permalink / raw)
  To: Ingo Molnar, LKML
  Cc: Tetsuo Handa, Steven Rostedt, Thomas Gleixner, Peter Zijlstra

[-- Attachment #1: gautham_r_shenoy-lockdep-maintain_rw_state_entries_in.patch --]
[-- Type: text/plain, Size: 4216 bytes --]

From: Gautham R Shenoy <ego@in.ibm.com>

The dependencies are currently maintained using a structure named
locklist. For a dependency A --> B, it saves B's lock_class in an
entry that would be linked to A's locks_after list.

However, in order to make use of the split chains introduced in the
previous patch, we need to enhance this infrastructure to save the
read/write states of A and B for each dependency, so that we can
distinguish between the read and write chains.
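The idea of recording and accumulating a per-dependency pair of read/write
states can be sketched outside the kernel as follows; the names, bit values
and helper are hypothetical stand-ins for the patch's own constants, not the
kernel's actual encoding:

```c
#include <assert.h>

/* Hypothetical bit encoding for a lock's read/write state. */
#define RW_WRITE	0x1
#define RW_READ		0x2	/* non-recursive read */
#define RW_REC_READ	0x4	/* recursive read */

/* Mirrors the two new bitfields added to struct lock_list below. */
struct dep_entry {
	unsigned int this_lock_rw_state:3;	/* state of the owning lock */
	unsigned int dep_lock_rw_state:3;	/* state of the dependent lock */
};

/*
 * When the same dependency is observed again in a different mode, OR the
 * states in, so a single entry summarizes every mode ever observed.
 */
static void record_dep(struct dep_entry *e, unsigned int this_state,
		       unsigned int dep_state)
{
	e->this_lock_rw_state |= this_state;
	e->dep_lock_rw_state |= dep_state;
}
```

This is why check_prev_add() below ORs the states into an already-existing
entry instead of allocating a second one for the new mode.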

Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 include/linux/lockdep.h |    6 ++++++
 kernel/lockdep.c        |   23 +++++++++++++++++------
 2 files changed, 23 insertions(+), 6 deletions(-)

Index: linux-2.6/include/linux/lockdep.h
===================================================================
--- linux-2.6.orig/include/linux/lockdep.h
+++ linux-2.6/include/linux/lockdep.h
@@ -164,6 +164,10 @@ struct lockdep_map {
  *		lock_after/lock_before list of a particular lock.
  * @dep_class - lock_class of the lock which is involved in a dependency with
  *		 the lock to which this entry is linked to.
+ * @this_lock_rw_state - The read/write state of the lock to which this
+ *			 dependency entry belongs.
+ * @dep_lock_rw_state - The read/write state of the lock with the lock class
+ *			dep_class in this particular dependency involvement.
  *
  * Every lock has a list of other locks that were taken after it.
  * We only grow the list, never remove from it:
@@ -173,6 +177,8 @@ struct lock_list {
 	struct lock_class		*dep_class;
 	struct stack_trace		trace;
 	int				distance;
+	unsigned int			this_lock_rw_state:3;
+	unsigned int			dep_lock_rw_state:3;
 
 	/*
 	 * The parent field is used to implement breadth-first search, and the
Index: linux-2.6/kernel/lockdep.c
===================================================================
--- linux-2.6.orig/kernel/lockdep.c
+++ linux-2.6/kernel/lockdep.c
@@ -816,11 +816,13 @@ static struct lock_list *alloc_list_entr
 /*
  * Add a new dependency to the head of the list:
  */
-static int add_lock_to_list(struct lock_class *class, struct lock_class *this,
+static int add_lock_to_list(struct held_lock *this_hlock,
+			    struct held_lock *dep_hlock,
 			    struct list_head *head, unsigned long ip,
 			    int distance, struct stack_trace *trace)
 {
 	struct lock_list *entry;
+	struct lock_class *dep_class = hlock_class(dep_hlock);
 	/*
 	 * Lock not present yet - get a new dependency struct and
 	 * add it to the list:
@@ -829,9 +831,11 @@ static int add_lock_to_list(struct lock_
 	if (!entry)
 		return 0;
 
-	entry->dep_class = this;
+	entry->dep_class = dep_class;
 	entry->distance = distance;
 	entry->trace = *trace;
+	entry->this_lock_rw_state = this_hlock->rw_state;
+	entry->dep_lock_rw_state = dep_hlock->rw_state;
 	/*
 	 * Since we never remove from the dependency list, the list can
 	 * be walked lockless by other CPUs, it's only allocation
@@ -1690,6 +1694,8 @@ check_prev_add(struct task_struct *curr,
 		if (entry->dep_class == hlock_class(next)) {
 			if (distance == 1)
 				entry->distance = 1;
+			entry->this_lock_rw_state |= prev->rw_state;
+			entry->dep_lock_rw_state |= next->rw_state;
 			return 2;
 		}
 	}
@@ -1697,19 +1703,24 @@ check_prev_add(struct task_struct *curr,
 	if (!trylock_loop && !save_trace(&trace))
 		return 0;
 
+	list_for_each_entry(entry, &hlock_class(next)->locks_before, entry) {
+		if (entry->dep_class == hlock_class(prev)) {
+			entry->this_lock_rw_state |= next->rw_state;
+			entry->dep_lock_rw_state |= prev->rw_state;
+		}
+	}
+
 	/*
 	 * Ok, all validations passed, add the new lock
 	 * to the previous lock's dependency list:
 	 */
-	ret = add_lock_to_list(hlock_class(prev), hlock_class(next),
-			       &hlock_class(prev)->locks_after,
+	ret = add_lock_to_list(prev, next, &hlock_class(prev)->locks_after,
 			       next->acquire_ip, distance, &trace);
 
 	if (!ret)
 		return 0;
 
-	ret = add_lock_to_list(hlock_class(next), hlock_class(prev),
-			       &hlock_class(next)->locks_before,
+	ret = add_lock_to_list(next, prev, &hlock_class(next)->locks_before,
 			       next->acquire_ip, distance, &trace);
 	if (!ret)
 		return 0;



^ permalink raw reply	[flat|nested] 30+ messages in thread

* [RFC][PATCH 7/7] lockdep: Consider the rw_state of lock while validating the chain
  2011-04-17  9:45 [RFC][PATCH 0/7] lockdep: Support recursive-read locks Peter Zijlstra
                   ` (5 preceding siblings ...)
  2011-04-17  9:45 ` [RFC][PATCH 6/7] lockdep: Maintain rw_state entries in locklist Peter Zijlstra
@ 2011-04-17  9:45 ` Peter Zijlstra
  2011-04-18  3:41 ` [RFC][PATCH 0/7] lockdep: Support recursive-read locks Tetsuo Handa
  7 siblings, 0 replies; 30+ messages in thread
From: Peter Zijlstra @ 2011-04-17  9:45 UTC (permalink / raw)
  To: Ingo Molnar, LKML
  Cc: Tetsuo Handa, Steven Rostedt, Thomas Gleixner, Peter Zijlstra

[-- Attachment #1: gautham_r_shenoy-lockdep-consider_the_rw_state_of_lock_while.patch --]
[-- Type: text/plain, Size: 4919 bytes --]

From: Gautham R Shenoy <ego@in.ibm.com>

Currently, while validating a particular dependency chain, we check whether,
by following the locks_after list of the current lock, we can reach one of
the previous locks in the chain.

I.e., if the current chain is A --> B --> C, then starting from C's
locks_after list, can we reach B?

However, if there are dependency chains:

	C --> D --> Rlock(E)

	Rlock(E) --> A,

where E is a Recursive Read lock, this chain cannot cause a deadlock because
of the recursive nature of E.

However, if the dependency chain was:

	Wlock(E) --> A,

it could have caused a deadlock.

We solve this problem by defining conflicting states of each lock where:

	Conflict(WRITE)		  =  WRITE, READ, RECURSIVE_READ
	Conflict(READ)		  =  WRITE, READ
	Conflict(RECURSIVE_READ)  =  WRITE

Thus, while traversing the locks_after chain of a particular lock A, we
follow only those entries in which the recorded state of A conflicts with
the state of A with which we started the traversal.
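A standalone sketch of this conflict relation (the bit values and helper
names here are hypothetical, not the patch's actual STATE_* constants):

```c
#include <assert.h>

/* Hypothetical bit encoding for a lock's read/write state. */
#define ST_WRITE	0x1
#define ST_READ		0x2	/* non-recursive read */
#define ST_REC_READ	0x4	/* recursive read */

/*
 * Conflict(WRITE)          = WRITE | READ | RECURSIVE_READ
 * Conflict(READ)           = WRITE | READ
 * Conflict(RECURSIVE_READ) = WRITE
 */
static unsigned int conflict_mask(unsigned int state)
{
	unsigned int mask = 0;

	if (state & ST_WRITE)
		mask |= ST_WRITE | ST_READ | ST_REC_READ;
	if (state & ST_READ)
		mask |= ST_WRITE | ST_READ;
	if (state & ST_REC_READ)
		mask |= ST_WRITE;
	return mask;
}

/*
 * An edge in the dependency graph is followed only if the state recorded
 * in the entry intersects the conflict mask of the state we arrived with.
 */
static int follow_edge(unsigned int cur_state, unsigned int entry_state)
{
	return (entry_state & conflict_mask(cur_state)) != 0;
}
```

With this, a Rlock(E) --> Rlock(E) edge is never followed (a recursive read
conflicts only with a write), while a Wlock(E) --> Rlock(E) edge still is.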

Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 kernel/lockdep.c |   43 ++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 40 insertions(+), 3 deletions(-)

Index: linux-2.6/kernel/lockdep.c
===================================================================
--- linux-2.6.orig/kernel/lockdep.c
+++ linux-2.6/kernel/lockdep.c
@@ -949,6 +949,26 @@ static inline int get_lock_depth(struct
 	return depth;
 }
 
+/*
+ * Conflicting states:
+ * - A Recursive read conflicts only with a write.
+ * - A Non-recursive read can conflict with a non-recursive read and write.
+ * - A Write conflicts with Write, simple read and recursive read.
+ */
+
+#define get_rec_read_conflict(state)		\
+	((((is_write(state)) << (_RR_ - _W_))) & STATE_RECURSIVE_READ)
+
+#define get_simple_read_conflict(state)		\
+	(((((is_write(state)) << (_R_ - _W_))) | (is_simple_read(state))) \
+							& STATE_READ)
+#define get_write_conflict(state)	STATE_WRITE
+
+#define get_lock_conflict_states(state)		\
+	(get_rec_read_conflict(state) |	\
+	get_simple_read_conflict(state) |	\
+	get_write_conflict(state))
+
 static int __bfs(struct lock_list *source_entry,
 		 void *data,
 		 int (*match)(struct lock_list *entry, void *data),
@@ -958,6 +978,7 @@ static int __bfs(struct lock_list *sourc
 	struct lock_list *entry;
 	struct list_head *head;
 	struct circular_queue *cq = &lock_cq;
+	unsigned int conflict;
 	int ret = 1;
 
 	if (match(source_entry, data)) {
@@ -992,9 +1013,13 @@ static int __bfs(struct lock_list *sourc
 		else
 			head = &lock->dep_class->locks_before;
 
+		conflict = get_lock_conflict_states(lock->dep_lock_rw_state);
+
 		list_for_each_entry(entry, head, entry) {
 			if (!lock_accessed(entry)) {
 				unsigned int cq_depth;
+				if (!(entry->this_lock_rw_state & conflict))
+					continue;
 				mark_lock_accessed(entry, lock);
 				if (match(entry, data)) {
 					*target_entry = entry;
@@ -1411,8 +1436,9 @@ check_usage(struct task_struct *curr, st
 	struct lock_list *uninitialized_var(target_entry1);
 
 	this.parent = NULL;
-
 	this.dep_class = hlock_class(prev);
+	this.this_lock_rw_state = next->rw_state;
+	this.dep_lock_rw_state = prev->rw_state;
 	ret = find_usage_backwards(&this, bit_backwards, &target_entry);
 	if (ret < 0)
 		return print_bfs_bug(ret);
@@ -1421,6 +1447,8 @@ check_usage(struct task_struct *curr, st
 
 	that.parent = NULL;
 	that.dep_class = hlock_class(next);
+	that.this_lock_rw_state = prev->rw_state;
+	that.dep_lock_rw_state = next->rw_state;
 	ret = find_usage_forwards(&that, bit_forwards, &target_entry1);
 	if (ret < 0)
 		return print_bfs_bug(ret);
@@ -1663,6 +1691,8 @@ check_prev_add(struct task_struct *curr,
 	 */
 	this.dep_class = hlock_class(next);
 	this.parent = NULL;
+	this.this_lock_rw_state = prev->rw_state;
+	this.dep_lock_rw_state = next->rw_state;
 	ret = check_noncircular(&this, hlock_class(prev), &target_entry);
 	if (unlikely(!ret))
 		return print_circular_bug(&this, target_entry, next, prev);
@@ -1776,7 +1806,8 @@ check_prevs_add(struct task_struct *curr
 		 * Only non-recursive-read entries get new dependencies
 		 * added:
 		 */
-		if (!is_rec_read(hlock->rw_state)) {
+		if (!is_rec_read(hlock->rw_state) ||
+				is_first_rec_read(hlock->rw_state)) {
 			if (!check_prev_add(curr, hlock, next,
 						distance, trylock_loop))
 				return 0;
@@ -1950,9 +1981,15 @@ static int validate_chain(struct task_st
 
 		if (!ret)
 			return 0;
+
+		if (ret != 2 && is_rec_read(hlock->rw_state))
+			hlock->rw_state |= STATE_FIRST_RECURSIVE_READ;
+
+
 		/*
 		 * Add dependency only if this lock is not the head
-		 * of the chain, and if it's not a secondary read-lock:
+		 * of the chain, and if it's not the first instance of
+		 * the recursive read-lock:
 		 */
 		if (!chain_head && ret != 2)
 			if (!check_prevs_add(curr, hlock))



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC][PATCH 0/7] lockdep: Support recursive-read locks
  2011-04-17  9:45 [RFC][PATCH 0/7] lockdep: Support recursive-read locks Peter Zijlstra
                   ` (6 preceding siblings ...)
  2011-04-17  9:45 ` [RFC][PATCH 7/7] lockdep: Consider the rw_state of lock while validating the chain Peter Zijlstra
@ 2011-04-18  3:41 ` Tetsuo Handa
  2011-04-22  7:19   ` Yong Zhang
  7 siblings, 1 reply; 30+ messages in thread
From: Tetsuo Handa @ 2011-04-18  3:41 UTC (permalink / raw)
  To: a.p.zijlstra; +Cc: rostedt, tglx, mingo, linux-kernel

(Continued from https://lkml.org/lkml/2011/3/31/240 "Re: [RFC] seqlock,lockdep: Add lock primitives to read_seqbegin().")
Peter Zijlstra wrote:
> On Wed, 2011-03-30 at 21:17 +0900, Tetsuo Handa wrote:
> > Peter Zijlstra wrote:
> > > >  Also, I assume you meant to call
> > > > spin_acquire() before entering the spin state (as with
> > > > 
> > > >   static inline void __raw_spin_lock(raw_spinlock_t *lock)
> > > >   {
> > > >         preempt_disable();
> > > >         spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
> > > >         LOCK_CONTENDED(lock, do_raw_spin_trylock, do_raw_spin_lock);
> > > >   }
> > > > 
> > > > . Otherwise, lockdep cannot report it when hit this bug upon the first call to
> > > > this function). 
> > > 
> > > Huh no, of course not, a seqlock read side cannot contend in the classic
> > > sense.
> > 
> > I couldn't understand what 'contend' means. I think
> > 
> >   static __always_inline unsigned read_seqbegin(const seqlock_t *sl)
> >   {
> >   	unsigned ret;
> >   repeat:
> >   	ret = sl->sequence;
> >   	smp_rmb();
> >   	if (unlikely(ret & 1)) {
> >   		cpu_relax();
> >   		goto repeat;
> >   	}
> >   	return ret;
> >   }
> > 
> > is equivalent (except that above one will not write to any kernel memory) to
> > 
> >   static __always_inline unsigned read_seqbegin(seqlock_t *sl)
> >   {
> >   	unsigned ret;
> >   	unsigned long flags;
> >   	spin_lock_irqsave(&sl->lock, flags);
> >   	ret = sl->sequence;
> >   	spin_unlock_irqrestore(&sl->lock, flags);
> >   	return ret;
> >   }
> > 
> > because read_seqbegin() cannot return to the reader until the writer (if there
> > is one) calls write_sequnlock().
> 
> It is more or less that, but conceptually the seqlock read-side is a
> non-blocking algorithm and thus doesn't block or contend. The above
> initial wait is merely an optimization to avoid having to retry, which
> could be more expensive than simply waiting there.

I agree that the loop in read_seqbegin() is an optimization. But I don't agree
with the location at which the lockdep annotation is inserted, because the
place where I got a freeze between "seqlock_t rename_lock" and
"DEFINE_BRLOCK(vfsmount_lock)" was the loop in read_seqbegin(). Conceptually
the seqlock read-side is a non-blocking algorithm and thus doesn't block or
contend. But when the locking dependency is inappropriate, the seqlock
read-side actually blocks forever in read_seqbegin(). In order to catch such an
inappropriate locking dependency, I still think the lockdep annotation should
be inserted in a way equivalent to

  static __always_inline unsigned read_seqbegin(seqlock_t *sl)
  {
  	unsigned ret;
  	unsigned long flags;
  	spin_lock_irqsave(&sl->lock, flags);
  	ret = sl->sequence;
  	spin_unlock_irqrestore(&sl->lock, flags);
  	return ret;
  }

.

> > Anyway, could you show me read_seqbegin2()/read_seqretry2() for testing with
> > locktest module?
> 
> Like I wrote before:
> 
> > @@ -94,6 +94,7 @@ static __always_inline unsigned read_seqbegin(const seqlock_t *sl)
> >                 cpu_relax();
> >                 goto repeat;
> >         }
> > +       rwlock_acquire_read(&sl->lock->dep_map, 0, 0, _RET_IP_);
> >  
> >         return ret;
> >  }
> > @@ -107,6 +108,8 @@ static __always_inline int read_seqretry(const seqlock_t *sl, unsigned start)
> >  {
> >         smp_rmb();
> >  
> > +       rwlock_release(&sl->lock->dep_map, 1, _RET_IP_);
> > +
> >         return unlikely(sl->sequence != start);
> >  }
> 
> Should do, except that lockdep doesn't properly work for read-recursive
> locks, which needs to get fixed.

I assume this patchset is meant to fix this problem. I applied it on d733ed6c
"Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block" and ran the
locktest module.

---------- Start of locktest.c ----------
#include <linux/module.h>
#include <linux/seqlock.h>
#include <linux/lglock.h>
#include <linux/proc_fs.h>

static seqlock_t seqlock1;
static DEFINE_BRLOCK(brlock1);
static DEFINE_RWLOCK(rwlock1);

static __always_inline unsigned read_seqbegin2(seqlock_t *sl)
{
	unsigned ret;

repeat:
	ret = sl->sequence;
	smp_rmb();
	if (unlikely(ret & 1)) {
		cpu_relax();
		goto repeat;
	}
	//spin_acquire(&sl->lock.dep_map, 0, 0, _RET_IP_);
	rwlock_acquire_read(&sl->lock.dep_map, 0, 0, _RET_IP_);
	return ret;
}

static __always_inline int read_seqretry2(seqlock_t *sl, unsigned start)
{
	smp_rmb();
	//spin_release(&sl->lock.dep_map, 1, _RET_IP_);
	rwlock_release(&sl->lock.dep_map, 1, _RET_IP_);
	return unlikely(sl->sequence != start);
}

static int locktest_open1(struct inode *inode, struct file *file)
{
	write_seqlock(&seqlock1);
	br_read_lock(brlock1);
	br_read_unlock(brlock1);
	write_sequnlock(&seqlock1);
	return -EINVAL;
}

static int locktest_open2(struct inode *inode, struct file *file)
{
	unsigned int seq;
	br_write_lock(brlock1);
	seq = read_seqbegin2(&seqlock1);
	read_seqretry2(&seqlock1, seq);
	br_write_unlock(brlock1);
	return -EINVAL;
}

static int locktest_open3(struct inode *inode, struct file *file)
{
	write_seqlock(&seqlock1);
	read_lock(&rwlock1);
	read_unlock(&rwlock1);
	write_sequnlock(&seqlock1);
	return -EINVAL;
}

static int locktest_open4(struct inode *inode, struct file *file)
{
	unsigned int seq;
	write_lock(&rwlock1);
	seq = read_seqbegin2(&seqlock1);
	read_seqretry2(&seqlock1, seq);
	write_unlock(&rwlock1);
	return -EINVAL;
}

static const struct file_operations locktest_operations1 = {
	.open = locktest_open1
};

static const struct file_operations locktest_operations2 = {
	.open = locktest_open2
};

static const struct file_operations locktest_operations3 = {
	.open = locktest_open3
};

static const struct file_operations locktest_operations4 = {
	.open = locktest_open4
};

static int __init locktest_init(void)
{
	struct proc_dir_entry *entry;
	seqlock_init(&seqlock1);
	entry = create_proc_entry("locktest1", 0666, NULL);
	if (entry)
		entry->proc_fops = &locktest_operations1;
	entry = create_proc_entry("locktest2", 0666, NULL);
	if (entry)
		entry->proc_fops = &locktest_operations2;
	entry = create_proc_entry("locktest3", 0666, NULL);
	if (entry)
		entry->proc_fops = &locktest_operations3;
	entry = create_proc_entry("locktest4", 0666, NULL);
	if (entry)
		entry->proc_fops = &locktest_operations4;
	return 0;
}

static void __exit locktest_exit(void)
{
	remove_proc_entry("locktest1", NULL);
	remove_proc_entry("locktest2", NULL);
	remove_proc_entry("locktest3", NULL);
	remove_proc_entry("locktest4", NULL);
}

module_init(locktest_init);
module_exit(locktest_exit);
MODULE_LICENSE("GPL");
---------- End of locktest.c ----------

Test results for the above program:

"cat /proc/locktest1 /proc/locktest2" => Detect fail
"cat /proc/locktest2 /proc/locktest1" => Detect fail
"cat /proc/locktest3 /proc/locktest4" => Detect fail
"cat /proc/locktest4 /proc/locktest3" => Detect fail

If I change from rwlock_acquire_read() to spin_acquire() in read_seqbegin2()
and from rwlock_release() to spin_release() in read_seqretry2():

"cat /proc/locktest1 /proc/locktest2" => Detect fail
"cat /proc/locktest2 /proc/locktest1" => Detect OK (shown below)
"cat /proc/locktest3 /proc/locktest4" => Detect fail
"cat /proc/locktest4 /proc/locktest3" => Detect OK (shown below)

Judging from my testcases, read_seqbegin2() will fail to detect the problem
unless we use spin_acquire()/spin_release() rather than
rwlock_acquire_read()/rwlock_release().

Also, something is still wrong, because lockdep fails to detect the problem
in the "cat /proc/locktest1 /proc/locktest2" and
"cat /proc/locktest3 /proc/locktest4" cases.

[   83.551455] 
[   83.551457] =======================================================
[   83.555293] [ INFO: possible circular locking dependency detected ]
[   83.555293] 2.6.39-rc3-00228-gd733ed6-dirty #259
[   83.555293] -------------------------------------------------------
[   83.555293] cat/2768 is trying to acquire lock:
[   83.555293]  (brlock1_lock_dep_map){++++..}, at: [<e08150b0>] brlock1_local_lock+0x0/0x90 [locktest]
[   83.555293] 
[   83.555293] but task is already holding lock:
[   83.555293]  (&(&(&seqlock1)->lock)->rlock){+.+...}, at: [<e08154dd>] locktest_open1+0xd/0x40 [locktest]
[   83.555293] 
[   83.555293] which lock already depends on the new lock.
[   83.555293] 
[   83.555293] 
[   83.555293] the existing dependency chain (in reverse order) is:
[   83.555293] 
[   83.555293] -> #1 (&(&(&seqlock1)->lock)->rlock){+.+...}:
[   83.635281]        [<c106c499>] check_prevs_add+0xb9/0x110
[   83.635281]        [<c106c840>] validate_chain+0x320/0x5a0
[   83.635281]        [<c106e927>] __lock_acquire+0x2a7/0x8f0
[   83.635281]        [<c107001a>] lock_acquire+0x7a/0xa0
[   83.635281]        [<e0815555>] locktest_open2+0x45/0x70 [locktest]
[   83.635281]        [<c1118355>] proc_reg_open+0x65/0xe0
[   83.635281]        [<c10ce78f>] __dentry_open+0x16f/0x2e0
[   83.635281]        [<c10ce9fe>] nameidata_to_filp+0x5e/0x70
[   83.635281]        [<c10dc1d8>] do_last+0xf8/0x6c0
[   83.635281]        [<c10dc846>] path_openat+0xa6/0x340
[   83.635281]        [<c10dcb10>] do_filp_open+0x30/0x80
[   83.635281]        [<c10cefa1>] do_sys_open+0x101/0x1a0
[   83.635281]        [<c10cf069>] sys_open+0x29/0x40
[   83.635281]        [<c13b43c1>] syscall_call+0x7/0xb
[   83.635281] 
[   83.635281] -> #0 (brlock1_lock_dep_map){++++..}:
[   83.635281]        [<c106c34c>] check_prev_add+0x78c/0x820
[   83.635281]        [<c106c499>] check_prevs_add+0xb9/0x110
[   83.635281]        [<c106c840>] validate_chain+0x320/0x5a0
[   83.635281]        [<c106e927>] __lock_acquire+0x2a7/0x8f0
[   83.635281]        [<c107001a>] lock_acquire+0x7a/0xa0
[   83.635281]        [<e08150e3>] brlock1_local_lock+0x33/0x90 [locktest]
[   83.635281]        [<e08154e8>] locktest_open1+0x18/0x40 [locktest]
[   83.635281]        [<c1118355>] proc_reg_open+0x65/0xe0
[   83.635281]        [<c10ce78f>] __dentry_open+0x16f/0x2e0
[   83.635281]        [<c10ce9fe>] nameidata_to_filp+0x5e/0x70
[   83.635281]        [<c10dc1d8>] do_last+0xf8/0x6c0
[   83.635281]        [<c10dc846>] path_openat+0xa6/0x340
[   83.635281]        [<c10dcb10>] do_filp_open+0x30/0x80
[   83.635281]        [<c10cefa1>] do_sys_open+0x101/0x1a0
[   83.635281]        [<c10cf069>] sys_open+0x29/0x40
[   83.635281]        [<c13b43c1>] syscall_call+0x7/0xb
[   83.635281] 
[   83.635281] other info that might help us debug this:
[   83.635281] 
[   83.635281] 1 lock held by cat/2768:
[   83.635281]  #0:  (&(&(&seqlock1)->lock)->rlock){+.+...}, at: [<e08154dd>] locktest_open1+0xd/0x40 [locktest]
[   83.635281] 
[   83.635281] stack backtrace:
[   83.635281] Pid: 2768, comm: cat Not tainted 2.6.39-rc3-00228-gd733ed6-dirty #259
[   83.635281] Call Trace:
[   83.635281]  [<c106ade6>] print_circular_bug+0xc6/0xd0
[   83.635281]  [<c106c34c>] check_prev_add+0x78c/0x820
[   83.635281]  [<c1005d3b>] ? print_context_stack+0x3b/0xa0
[   83.635281]  [<c1004fa1>] ? dump_trace+0x81/0xe0
[   83.635281]  [<c106c499>] check_prevs_add+0xb9/0x110
[   83.635281]  [<c106c840>] validate_chain+0x320/0x5a0
[   83.635281]  [<c106df7c>] ? mark_lock+0x21c/0x3c0
[   83.635281]  [<c106e927>] __lock_acquire+0x2a7/0x8f0
[   83.635281]  [<c107001a>] lock_acquire+0x7a/0xa0
[   83.635281]  [<e08150b0>] ? brlock1_lock_init+0xb0/0xb0 [locktest]
[   83.635281]  [<e08154d0>] ? brlock1_global_unlock+0xa0/0xa0 [locktest]
[   83.635281]  [<e08150e3>] brlock1_local_lock+0x33/0x90 [locktest]
[   83.635281]  [<e08150b0>] ? brlock1_lock_init+0xb0/0xb0 [locktest]
[   83.635281]  [<e08154e8>] locktest_open1+0x18/0x40 [locktest]
[   83.635281]  [<c1118355>] proc_reg_open+0x65/0xe0
[   83.635281]  [<c10ce78f>] __dentry_open+0x16f/0x2e0
[   83.635281]  [<c10ce9fe>] nameidata_to_filp+0x5e/0x70
[   83.635281]  [<c11182f0>] ? proc_reg_mmap+0x80/0x80
[   83.635281]  [<c10dc1d8>] do_last+0xf8/0x6c0
[   83.635281]  [<c10db00c>] ? path_init+0x2cc/0x3c0
[   83.635281]  [<c10dc846>] path_openat+0xa6/0x340
[   83.635281]  [<c106d80b>] ? trace_hardirqs_off+0xb/0x10
[   83.635281]  [<c10dcb10>] do_filp_open+0x30/0x80
[   83.635281]  [<c13b3a5d>] ? _raw_spin_unlock+0x1d/0x20
[   83.635281]  [<c10e9f11>] ? alloc_fd+0xe1/0x1a0
[   83.635281]  [<c10cefa1>] do_sys_open+0x101/0x1a0
[   83.635281]  [<c10cfc8b>] ? vfs_write+0x10b/0x130
[   83.635281]  [<c10cf069>] sys_open+0x29/0x40
[   83.635281]  [<c13b43c1>] syscall_call+0x7/0xb



[   82.758647] 
[   82.758649] =======================================================
[   82.762520] [ INFO: possible circular locking dependency detected ]
[   82.762520] 2.6.39-rc3-00228-gd733ed6-dirty #259
[   82.762520] -------------------------------------------------------
[   82.762520] cat/2768 is trying to acquire lock:
[   82.762520]  (rwlock1){++++..}, at: [<e081559d>] locktest_open3+0x1d/0x40 [locktest]
[   82.762520] 
[   82.762520] but task is already holding lock:
[   82.762520]  (&(&(&seqlock1)->lock)->rlock){+.+...}, at: [<e081558d>] locktest_open3+0xd/0x40 [locktest]
[   82.762520] 
[   82.762520] which lock already depends on the new lock.
[   82.762520] 
[   82.762520] 
[   82.762520] the existing dependency chain (in reverse order) is:
[   82.762520] 
[   82.762520] -> #1 (&(&(&seqlock1)->lock)->rlock){+.+...}:
[   82.841627]        [<c106c499>] check_prevs_add+0xb9/0x110
[   82.841627]        [<c106c840>] validate_chain+0x320/0x5a0
[   82.841627]        [<c106e927>] __lock_acquire+0x2a7/0x8f0
[   82.841627]        [<c107001a>] lock_acquire+0x7a/0xa0
[   82.841627]        [<e081560a>] locktest_open4+0x4a/0x90 [locktest]
[   82.841627]        [<c1118355>] proc_reg_open+0x65/0xe0
[   82.841627]        [<c10ce78f>] __dentry_open+0x16f/0x2e0
[   82.841627]        [<c10ce9fe>] nameidata_to_filp+0x5e/0x70
[   82.841627]        [<c10dc1d8>] do_last+0xf8/0x6c0
[   82.841627]        [<c10dc846>] path_openat+0xa6/0x340
[   82.841627]        [<c10dcb10>] do_filp_open+0x30/0x80
[   82.841627]        [<c10cefa1>] do_sys_open+0x101/0x1a0
[   82.841627]        [<c10cf069>] sys_open+0x29/0x40
[   82.841627]        [<c13b43c1>] syscall_call+0x7/0xb
[   82.841627] 
[   82.841627] -> #0 (rwlock1){++++..}:
[   82.841627]        [<c106c34c>] check_prev_add+0x78c/0x820
[   82.841627]        [<c106c499>] check_prevs_add+0xb9/0x110
[   82.841627]        [<c106c840>] validate_chain+0x320/0x5a0
[   82.841627]        [<c106e927>] __lock_acquire+0x2a7/0x8f0
[   82.841627]        [<c107001a>] lock_acquire+0x7a/0xa0
[   82.841627]        [<c13b3ba9>] _raw_read_lock+0x39/0x70
[   82.841627]        [<e081559d>] locktest_open3+0x1d/0x40 [locktest]
[   82.841627]        [<c1118355>] proc_reg_open+0x65/0xe0
[   82.841627]        [<c10ce78f>] __dentry_open+0x16f/0x2e0
[   82.841627]        [<c10ce9fe>] nameidata_to_filp+0x5e/0x70
[   82.841627]        [<c10dc1d8>] do_last+0xf8/0x6c0
[   82.841627]        [<c10dc846>] path_openat+0xa6/0x340
[   82.841627]        [<c10dcb10>] do_filp_open+0x30/0x80
[   82.841627]        [<c10cefa1>] do_sys_open+0x101/0x1a0
[   82.841627]        [<c10cf069>] sys_open+0x29/0x40
[   82.841627]        [<c13b43c1>] syscall_call+0x7/0xb
[   82.841627] 
[   82.841627] other info that might help us debug this:
[   82.841627] 
[   82.841627] 1 lock held by cat/2768:
[   82.841627]  #0:  (&(&(&seqlock1)->lock)->rlock){+.+...}, at: [<e081558d>] locktest_open3+0xd/0x40 [locktest]
[   82.841627] 
[   82.841627] stack backtrace:
[   82.841627] Pid: 2768, comm: cat Not tainted 2.6.39-rc3-00228-gd733ed6-dirty #259
[   82.841627] Call Trace:
[   82.841627]  [<c106ade6>] print_circular_bug+0xc6/0xd0
[   82.841627]  [<c106c34c>] check_prev_add+0x78c/0x820
[   82.841627]  [<c1005d3b>] ? print_context_stack+0x3b/0xa0
[   82.841627]  [<c1004fa1>] ? dump_trace+0x81/0xe0
[   82.841627]  [<c106c499>] check_prevs_add+0xb9/0x110
[   82.841627]  [<c106c840>] validate_chain+0x320/0x5a0
[   82.841627]  [<c106df7c>] ? mark_lock+0x21c/0x3c0
[   82.841627]  [<c106e927>] __lock_acquire+0x2a7/0x8f0
[   82.841627]  [<c107001a>] lock_acquire+0x7a/0xa0
[   82.841627]  [<e081559d>] ? locktest_open3+0x1d/0x40 [locktest]
[   82.841627]  [<e0815580>] ? locktest_open2+0x70/0x70 [locktest]
[   82.841627]  [<c13b3ba9>] _raw_read_lock+0x39/0x70
[   82.841627]  [<e081559d>] ? locktest_open3+0x1d/0x40 [locktest]
[   82.841627]  [<e081559d>] locktest_open3+0x1d/0x40 [locktest]
[   82.841627]  [<c1118355>] proc_reg_open+0x65/0xe0
[   82.841627]  [<c10ce78f>] __dentry_open+0x16f/0x2e0
[   82.841627]  [<c10ce9fe>] nameidata_to_filp+0x5e/0x70
[   82.841627]  [<c11182f0>] ? proc_reg_mmap+0x80/0x80
[   82.841627]  [<c10dc1d8>] do_last+0xf8/0x6c0
[   82.841627]  [<c10db00c>] ? path_init+0x2cc/0x3c0
[   82.841627]  [<c10dc846>] path_openat+0xa6/0x340
[   82.841627]  [<c106d80b>] ? trace_hardirqs_off+0xb/0x10
[   82.841627]  [<c10dcb10>] do_filp_open+0x30/0x80
[   82.841627]  [<c13b3a5d>] ? _raw_spin_unlock+0x1d/0x20
[   82.841627]  [<c10e9f11>] ? alloc_fd+0xe1/0x1a0
[   82.841627]  [<c10cefa1>] do_sys_open+0x101/0x1a0
[   82.841627]  [<c10cfc8b>] ? vfs_write+0x10b/0x130
[   82.841627]  [<c10cf069>] sys_open+0x29/0x40
[   82.841627]  [<c13b43c1>] syscall_call+0x7/0xb

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC][PATCH 3/7] lockdep: Annotate read/write states
  2011-04-17  9:45 ` [RFC][PATCH 3/7] lockdep: Annotate read/write states Peter Zijlstra
@ 2011-04-18 13:34   ` Yong Zhang
  2011-04-18 16:34     ` Steven Rostedt
  2011-04-18 16:26   ` Steven Rostedt
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 30+ messages in thread
From: Yong Zhang @ 2011-04-18 13:34 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, LKML, Tetsuo Handa, Steven Rostedt, Thomas Gleixner

On Sun, Apr 17, 2011 at 11:45:08AM +0200, Peter Zijlstra wrote:
> From: Gautham R Shenoy <ego@in.ibm.com>
> 
> Currently we do not save the recursive read dependencies in the dependency
> chain. As a result, a deadlock caused by the following chains is not spotted,
> since we never have the chain 1 in our dependency list:
> 
>  1: Rlock(A) --> lock(B)
>  2: lock(B)  --> Wlock(A), where A is a recursive read lock.
> 
> Before adding the Recursive Read locks to the dependency chains, we need to
> distinguish them from the normal read locks since the conflicting states for
> these two are quite different.
> 
> Currently the read/write status of a lock while it's acquired is denoted by a
> monotonically increasing variable where:
> 
>  0 - WRITE
>  1 - READ
>  2 - RECURSIVE READ
> 
> In this patch, we propose to modify this distinction from a monotonically
> increasing variable to a bit mask where:
> 
>  0x1 - WRITE
>  0x2 - READ
>  0x4 - RECURSIVE READ
> 
> This helps us to define the conflicting states for each lock with ease:
> Thereby, the conflicting states for a given state are defined as follows:
> 
>  Conflicting_states(WRITE):             RECURSIVE_READ | READ | WRITE
>  Conflicting_states(READ):                               READ | WRITE
>  Conflicting_states(RECURSIVE_READ):                            WRITE
> 
> Also, we use one more bit in the bitmask to distinguish the first recursive
> read in the current chain from the others, since it is sufficient to add only
> this dependency to the dependency list.
> 
> Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> ---
>  include/linux/lockdep.h |  107 ++++++++++++++++++++++++++++++++++++------------
>  kernel/lockdep.c        |   46 ++++++++++----------
>  2 files changed, 105 insertions(+), 48 deletions(-)
> 
> @@ -2273,7 +2273,7 @@ mark_held_locks(struct task_struct *curr
>  		hlock = curr->held_locks + i;
>  
>  		usage_bit = 2 + (mark << 2); /* ENABLED */
> -		if (hlock->read)
> +		if (hlock->rw_state)

		is_read(hlock->rw_state) ?



* Re: [RFC][PATCH 6/7] lockdep: Maintain rw_state entries in locklist
  2011-04-17  9:45 ` [RFC][PATCH 6/7] lockdep: Maintain rw_state entries in locklist Peter Zijlstra
@ 2011-04-18 13:37   ` Yong Zhang
  0 siblings, 0 replies; 30+ messages in thread
From: Yong Zhang @ 2011-04-18 13:37 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, LKML, Tetsuo Handa, Steven Rostedt, Thomas Gleixner

On Sun, Apr 17, 2011 at 11:45:11AM +0200, Peter Zijlstra wrote:
> From: Gautham R Shenoy <ego@in.ibm.com>
> 
> The dependencies are currently maintained using a structure named
> locklist. For a dependency A --> B, it saves B's lock_class in an
> entry that would be linked to A's locks_after list.
> 
> However, in order to make use of the split chains introduced in the
> previous patch, we need to enhance this infrastructure to save the
> read/write states of A and B for each dependency such that we might
> distinguish between the read and write chains.
> 
> Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> ---
>  include/linux/lockdep.h |    6 ++++++
>  kernel/lockdep.c        |   23 +++++++++++++++++------
>  2 files changed, 23 insertions(+), 6 deletions(-)
>
> @@ -1690,6 +1694,8 @@ check_prev_add(struct task_struct *curr,
>  		if (entry->dep_class == hlock_class(next)) {
>  			if (distance == 1)
>  				entry->distance = 1;
> +			entry->this_lock_rw_state |= prev->rw_state;
> +			entry->dep_lock_rw_state |= next->rw_state;

If we could do this and return,

>  			return 2;
>  		}
>  	}
> @@ -1697,19 +1703,24 @@ check_prev_add(struct task_struct *curr,
>  	if (!trylock_loop && !save_trace(&trace))
>  		return 0;
>  
> +	list_for_each_entry(entry, &hlock_class(next)->locks_before, entry) {
> +		if (entry->dep_class == hlock_class(prev)) {
> +			entry->this_lock_rw_state |= next->rw_state;
> +			entry->dep_lock_rw_state |= prev->rw_state;
> +		}
> +	}
> +

Do we still need the change above?

Or am I missing something?

Thanks,
Yong


* Re: [RFC][PATCH 2/7] lockdep: Remove redundant read checks
  2011-04-17  9:45 ` [RFC][PATCH 2/7] lockdep: Remove redundant read checks Peter Zijlstra
@ 2011-04-18 14:28   ` Steven Rostedt
  0 siblings, 0 replies; 30+ messages in thread
From: Steven Rostedt @ 2011-04-18 14:28 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Ingo Molnar, LKML, Tetsuo Handa, Thomas Gleixner

On Sun, 2011-04-17 at 11:45 +0200, Peter Zijlstra wrote:
> plain text document attachment
> (gautham_r_shenoy-lockdep-remove_redundant_read_checks_.patch)
> From: Gautham R Shenoy <ego@in.ibm.com>
> 
> Do various simplifications:
> 
> 1) In kernel/lockdep.c::validate_chain():
> 
> 	ret = check_deadlock(curr, hlock, lock, hlock->read);
> 
> ret == 2 only if hlock->read == 2.
> 
> Hence:
> 
> 	if (ret == 2)
> 		hlock->read = 2;
> 
> is redundant and can be removed.

Is this really true? From check_deadlock():

		/*
		 * Allow read-after-read recursion of the same
		 * lock class (i.e. read_lock(lock)+read_lock(lock)):
		 */
		if ((read == 2) && prev->read)
			return 2;

		/*
		 * We're holding the nest_lock, which serializes this lock's
		 * nesting behaviour.
		 */
		if (nest)
			return 2;

We return '2' also when we nest.

> 
> 2) In kernel/lockdep.c::check_prevs_add(curr, next):
> 
> 	if (hlock->read != 2)
> 		check_prev_add(curr, hlock, next, distance);
> 
> Thus, check_prev_add is called only when hlock->read != 2.
> 
> From the conclusions of 2):
> 
> kernel/lockdep.c::check_prev_add(curr, prev, next, distance) gets called
> iff prev->read != 2.
> 
> Hence, in kernel/lockdep.c::check_prev_add(curr, prev, next, distance):
> 
> 	if (prev->read == 2)
> 		return 1;
> 
> is redundant and can be removed.

I agree with this one, but perhaps a comment should be added in its
place.

-- Steve

> 
> Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> ---
>  kernel/lockdep.c |    9 +--------
>  1 file changed, 1 insertion(+), 8 deletions(-)
> 
> Index: tip/kernel/lockdep.c
> ===================================================================
> --- tip.orig/kernel/lockdep.c
> +++ tip/kernel/lockdep.c
> @@ -1676,7 +1676,7 @@ check_prev_add(struct task_struct *curr,
>  	 * write-lock never takes any other locks, then the reads are
>  	 * equivalent to a NOP.
>  	 */
> -	if (next->read == 2 || prev->read == 2)
> +	if (next->read == 2)
>  		return 1;
>  	/*
>  	 * Is the <prev> -> <next> dependency already present?
> @@ -1940,13 +1940,6 @@ static int validate_chain(struct task_st
>  		if (!ret)
>  			return 0;
>  		/*
> -		 * Mark recursive read, as we jump over it when
> -		 * building dependencies (just like we jump over
> -		 * trylock entries):
> -		 */
> -		if (ret == 2)
> -			hlock->read = 2;
> -		/*
>  		 * Add dependency only if this lock is not the head
>  		 * of the chain, and if it's not a secondary read-lock:
>  		 */
> 




* Re: [RFC][PATCH 3/7] lockdep: Annotate read/write states
  2011-04-17  9:45 ` [RFC][PATCH 3/7] lockdep: Annotate read/write states Peter Zijlstra
  2011-04-18 13:34   ` Yong Zhang
@ 2011-04-18 16:26   ` Steven Rostedt
  2011-04-18 16:27   ` Steven Rostedt
  2011-04-18 16:31   ` Steven Rostedt
  3 siblings, 0 replies; 30+ messages in thread
From: Steven Rostedt @ 2011-04-18 16:26 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Ingo Molnar, LKML, Tetsuo Handa, Thomas Gleixner

On Sun, 2011-04-17 at 11:45 +0200, Peter Zijlstra wrote:

> Currently the read/write status of a lock while it's acquired is denoted by a
> monotonically increasing variable where:
> 
>  0 - WRITE
>  1 - READ
>  2 - RECURSIVE READ
> 
> In this patch, we propose to modify this distinction from a monotonically
> increasing variable to a bit mask where:
> 
>  0x1 - WRITE
>  0x2 - READ
>  0x4 - RECURSIVE READ
> 


>  /*
> @@ -286,6 +287,15 @@ extern void lockdep_init_map(struct lock
>  
>  #define lockdep_set_novalidate_class(lock) \
>  	lockdep_set_class(lock, &__lockdep_no_validate__)
> +
> +/* lock_state bits */
> +#define _W_		0
> +#define _R_		(_W_ + 1)
> +#define _RR_		(_W_ + 2)
> +#define _FIRST_RR_	(_W_ + 3)

Eek!!! Please comment these. I hate trying to figure out what RR means.
If I did not read the change log, my first thought was "Round Robin".

Adding something like:

/*
 * lock_state bits:
 *   WRITE
 *   READ
 *   RECURSIVE_READ
 *   FIRST RECURSIVE_READ
 */

Above that would help tremendously. This code is obfuscated enough, we
need many more comments. (I plan on sending out patches to do this too).





* Re: [RFC][PATCH 3/7] lockdep: Annotate read/write states
  2011-04-17  9:45 ` [RFC][PATCH 3/7] lockdep: Annotate read/write states Peter Zijlstra
  2011-04-18 13:34   ` Yong Zhang
  2011-04-18 16:26   ` Steven Rostedt
@ 2011-04-18 16:27   ` Steven Rostedt
  2011-04-18 16:31   ` Steven Rostedt
  3 siblings, 0 replies; 30+ messages in thread
From: Steven Rostedt @ 2011-04-18 16:27 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Ingo Molnar, LKML, Tetsuo Handa, Thomas Gleixner

On Sun, 2011-04-17 at 11:45 +0200, Peter Zijlstra wrote:
> plain text document attachment
> (gautham_r_shenoy-lockdep-annotate_read_write_states.patch)
> From: Gautham R Shenoy <ego@in.ibm.com>
> 
> Currently we do not save the recursive read dependencies in the dependency
> chain. As a result, a deadlock caused by the following chains is not spotted,
> since we never have the chain 1 in our dependency list:
> 
>  1: Rlock(A) --> lock(B)
>  2: lock(B)  --> Wlock(A), where A is a recursive read lock.
> 
> Before adding the Recursive Read locks to the dependency chains, we need to
> distinguish them from the normal read locks since the conflicting states for
> these two are quite different.
> 
> Currently the read/write status of a lock while it's acquired is denoted by a
> monotonically increasing variable where:
> 
>  0 - WRITE
>  1 - READ
>  2 - RECURSIVE READ
> 
> In this patch, we propose to modify this distinction from a monotonically
> increasing variable to a bit mask where:
> 
>  0x1 - WRITE
>  0x2 - READ
>  0x4 - RECURSIVE READ
> 
> This helps us to define the conflicting states for each lock with ease:
> Thereby, the conflicting states for a given state are defined as follows:
> 
>  Conflicting_states(WRITE):             RECURSIVE_READ | READ | WRITE
>  Conflicting_states(READ):                               READ | WRITE
>  Conflicting_states(RECURSIVE_READ):                            WRITE
> 
> Also, we use one more bit in the bitmask to distinguish the first recursive
> read in the current chain from the others, since it is sufficient to add only
> this dependency to the dependency list.
> 
> Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> ---
>  include/linux/lockdep.h |  107 ++++++++++++++++++++++++++++++++++++------------
>  kernel/lockdep.c        |   46 ++++++++++----------
>  2 files changed, 105 insertions(+), 48 deletions(-)
> 
> Index: tip/include/linux/lockdep.h
> ===================================================================
> --- tip.orig/include/linux/lockdep.h
> +++ tip/include/linux/lockdep.h
> @@ -233,10 +233,11 @@ struct held_lock {
>  	unsigned int irq_context:2; /* bit 0 - soft, bit 1 - hard */
>  	unsigned int trylock:1;						/* 16 bits */
>  
> -	unsigned int read:2;        /* see lock_acquire() comment */
> +	unsigned int rw_state:4;        /* see lock states comment */
>  	unsigned int check:2;       /* see lock_acquire() comment */
>  	unsigned int hardirqs_off:1;
> -	unsigned int references:11;					/* 32 bits */
> +
> +	unsigned short references;
>  };
>  
>  /*
> @@ -286,6 +287,15 @@ extern void lockdep_init_map(struct lock
>  
>  #define lockdep_set_novalidate_class(lock) \
>  	lockdep_set_class(lock, &__lockdep_no_validate__)
> +
> +/* lock_state bits */
> +#define _W_		0
> +#define _R_		(_W_ + 1)
> +#define _RR_		(_W_ + 2)
> +#define _FIRST_RR_	(_W_ + 3)
> +
> +#define get_lock_state(lock_bit)	(1 << lock_bit)
> +
>  /*
>   * Compare locking classes
>   */
> @@ -298,13 +308,40 @@ static inline int lockdep_match_key(stru
>  }
>  
>  /*
> - * Acquire a lock.
> + * lock states:
> + *
> + * A given lock can have one of the following states:
> + * - STATE_WRITE: Exclusive Write.
> + * - STATE_READ: Non-recursive read.
> + * - STATE_RECURSIVE_READ: Recursive Read.
>   *

Heh, you did comment them. But please move this up, or duplicate them.

;)

-- Steve




* Re: [RFC][PATCH 3/7] lockdep: Annotate read/write states
  2011-04-17  9:45 ` [RFC][PATCH 3/7] lockdep: Annotate read/write states Peter Zijlstra
                     ` (2 preceding siblings ...)
  2011-04-18 16:27   ` Steven Rostedt
@ 2011-04-18 16:31   ` Steven Rostedt
  3 siblings, 0 replies; 30+ messages in thread
From: Steven Rostedt @ 2011-04-18 16:31 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Ingo Molnar, LKML, Tetsuo Handa, Thomas Gleixner

On Sun, 2011-04-17 at 11:45 +0200, Peter Zijlstra wrote:
>  #ifdef CONFIG_DEBUG_LOCK_ALLOC
>  # ifdef CONFIG_PROVE_LOCKING
> -#  define rwlock_acquire(l, s, t, i)           lock_acquire(l, s, t, 0, 2, NULL, i)
> -#  define rwlock_acquire_read(l, s, t, i)      lock_acquire(l, s, t, 2, 2, NULL, i)
> +#  define rwlock_acquire(l, s, t, i)           \
> +               lock_acquire(l, s, t, STATE_WRITE, 2, NULL, i)
> +#  define rwlock_acquire_read(l, s, t, i)      \
> +               lock_acquire(l, s, t, STATE_RECURSIVE_READ, 2, NULL, i)
>  # else
> -#  define rwlock_acquire(l, s, t, i)           lock_acquire(l, s, t, 0, 1, NULL, i)
> -#  define rwlock_acquire_read(l, s, t, i)      lock_acquire(l, s, t, 2, 1, NULL, i)
> +#  define rwlock_acquire(l, s, t, i)           \
> +               lock_acquire(l, s, t, STATE_WRITE, 1, NULL, i)
> +#  define rwlock_acquire_read(l, s, t, i)      \
> +               lock_acquire(l, s, t, STATE_RECURSIVE_READ, 1, NULL, i)
>  # endif
>  # define rwlock_release(l, n, i)               lock_release(l, n, i)
>  #else 

OT, I like this; we need to get rid of that other 2 and 1 too, as it is
very confusing as to what they mean. But that can be done in a later
cleanup.

-- Steve




* Re: [RFC][PATCH 3/7] lockdep: Annotate read/write states
  2011-04-18 13:34   ` Yong Zhang
@ 2011-04-18 16:34     ` Steven Rostedt
  2011-04-18 16:36       ` Peter Zijlstra
  0 siblings, 1 reply; 30+ messages in thread
From: Steven Rostedt @ 2011-04-18 16:34 UTC (permalink / raw)
  To: Yong Zhang
  Cc: Peter Zijlstra, Ingo Molnar, LKML, Tetsuo Handa, Thomas Gleixner

On Mon, 2011-04-18 at 21:34 +0800, Yong Zhang wrote:

> > Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> > ---
> >  include/linux/lockdep.h |  107 ++++++++++++++++++++++++++++++++++++------------
> >  kernel/lockdep.c        |   46 ++++++++++----------
> >  2 files changed, 105 insertions(+), 48 deletions(-)
> > 
> > @@ -2273,7 +2273,7 @@ mark_held_locks(struct task_struct *curr
> >  		hlock = curr->held_locks + i;
> >  
> >  		usage_bit = 2 + (mark << 2); /* ENABLED */
> > -		if (hlock->read)
> > +		if (hlock->rw_state)
> 
> 		is_read(hlock->rw_state) ?

Yeah, I think this should be a is_read() or add a comment explaining why
not.

-- Steve




* Re: [RFC][PATCH 3/7] lockdep: Annotate read/write states
  2011-04-18 16:34     ` Steven Rostedt
@ 2011-04-18 16:36       ` Peter Zijlstra
  0 siblings, 0 replies; 30+ messages in thread
From: Peter Zijlstra @ 2011-04-18 16:36 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Yong Zhang, Ingo Molnar, LKML, Tetsuo Handa, Thomas Gleixner

On Mon, 2011-04-18 at 12:34 -0400, Steven Rostedt wrote:
> On Mon, 2011-04-18 at 21:34 +0800, Yong Zhang wrote:
> 
> > > Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> > > ---
> > >  include/linux/lockdep.h |  107 ++++++++++++++++++++++++++++++++++++------------
> > >  kernel/lockdep.c        |   46 ++++++++++----------
> > >  2 files changed, 105 insertions(+), 48 deletions(-)
> > > 
> > > @@ -2273,7 +2273,7 @@ mark_held_locks(struct task_struct *curr
> > >  		hlock = curr->held_locks + i;
> > >  
> > >  		usage_bit = 2 + (mark << 2); /* ENABLED */
> > > -		if (hlock->read)
> > > +		if (hlock->rw_state)
> > 
> > 		is_read(hlock->rw_state) ?
> 
> Yeah, I think this should be a is_read() or add a comment explaining why
> not.

Yeah it should be is_read().


* Re: [RFC][PATCH 4/7] lockdep: Seperate lock ids for read/write acquires
  2011-04-17  9:45 ` [RFC][PATCH 4/7] lockdep: Seperate lock ids for read/write acquires Peter Zijlstra
@ 2011-04-18 16:46   ` Steven Rostedt
  2011-04-18 16:49     ` Peter Zijlstra
  2011-04-18 22:07   ` Peter Zijlstra
  1 sibling, 1 reply; 30+ messages in thread
From: Steven Rostedt @ 2011-04-18 16:46 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Ingo Molnar, LKML, Tetsuo Handa, Thomas Gleixner

On Sun, 2011-04-17 at 11:45 +0200, Peter Zijlstra wrote:
>  
> +/*
> + * A lock's class id is used to calculate the chain-key. Since we need to
> + * differentiate between the chains which contain the read acquire of
> + * a lock from the chains having write acquire of the same lock,
> + * we offset the class_idx by MAX_LOCKDEP_KEYS if it is a read acquire.

Don't we only care to do this if we have a recursive read? I thought
simple reads still work fine with the current algorithm?

> + *
> + * Thus the lock's key during a chain-key calculation can be in the range
> + * 1 to 2 * MAX_LOCKDEP_KEYS - 1.
> + *
> + * LOCKDEP_CHAIN_KEY_BITS holds the number of bits required to
> + * represent this range.
> + */
> +#define LOCKDEP_CHAIN_KEY_BITS	(MAX_LOCKDEP_KEYS_BITS + 1)
>  struct held_lock {
>  	/*
>  	 * One-way hash of the dependency chain up to this point. We
> Index: linux-2.6/kernel/lockdep.c
> ===================================================================
> --- linux-2.6.orig/kernel/lockdep.c
> +++ linux-2.6/kernel/lockdep.c
> @@ -303,8 +303,8 @@ static struct list_head chainhash_table[
>   * unique.
>   */
>  #define iterate_chain_key(key1, key2) \
> -	(((key1) << MAX_LOCKDEP_KEYS_BITS) ^ \
> -	((key1) >> (64-MAX_LOCKDEP_KEYS_BITS)) ^ \
> +	(((key1) << LOCKDEP_CHAIN_KEY_BITS) ^ \
> +	((key1) >> (64 - LOCKDEP_CHAIN_KEY_BITS)) ^ \
>  	(key2))
>  
>  void lockdep_off(void)
> @@ -1988,6 +1988,9 @@ static void check_chain_key(struct task_
>  		if (DEBUG_LOCKS_WARN_ON(id >= MAX_LOCKDEP_KEYS))
>  			return;
>  
> +		if (is_read(hlock->rw_state))
> +			id += MAX_LOCKDEP_KEYS;

Again, isn't this about recursive reads? Or am I just confused ;)

-- Steve

> +
>  		if (prev_hlock && (prev_hlock->irq_context !=
>  							hlock->irq_context))
>  			chain_key = 0;
> @@ -2815,6 +2818,18 @@ static int __lock_acquire(struct lockdep
>  	if (DEBUG_LOCKS_WARN_ON(id >= MAX_LOCKDEP_KEYS))
>  		return 0;
>  
> +	/*
> +	 * Factor in the read/write state in the chain key calculation.
> +	 *
> +	 * Two chains containing lock dependencies in the same order can
> +	 * still differ due to their read/write state
> +	 * eg: lock(A)->Rlock(B) is different from lock(A)->Wlock(B)
> +	 *
> +	 * Hence distinguish between such chains.
> +	 */
> +	if (is_read(rw_state))
> +		id += MAX_LOCKDEP_KEYS;
> +
>  	chain_key = curr->curr_chain_key;
>  	if (!depth) {
>  		if (DEBUG_LOCKS_WARN_ON(chain_key != 0))
> 




* Re: [RFC][PATCH 4/7] lockdep: Seperate lock ids for read/write acquires
  2011-04-18 16:46   ` Steven Rostedt
@ 2011-04-18 16:49     ` Peter Zijlstra
  2011-04-18 17:33       ` Steven Rostedt
  0 siblings, 1 reply; 30+ messages in thread
From: Peter Zijlstra @ 2011-04-18 16:49 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Ingo Molnar, LKML, Tetsuo Handa, Thomas Gleixner

On Mon, 2011-04-18 at 12:46 -0400, Steven Rostedt wrote:
> > +/*
> > + * A lock's class id is used to calculate the chain-key. Since we need to
> > + * differentiate between the chains which contain the read acquire of
> > + * a lock from the chains having write acquire of the same lock,
> > + * we offset the class_idx by MAX_LOCKDEP_KEYS if it is a read acquire.
> 
> Don't we only care to do this if we have a recursive read? I thought
> simple reads still work fine with the current algorithm?
> 
> > + *
> > + * Thus the lock's key during a chain-key calculation can be in the range
> > + * 1 to 2 * MAX_LOCKDEP_KEYS - 1.
> > + *
> > + * LOCKDEP_CHAIN_KEY_BITS holds the number of bits required to
> > + * represent this range.
> > + */
> > +#define LOCKDEP_CHAIN_KEY_BITS       (MAX_LOCKDEP_KEYS_BITS + 1)
> >  struct held_lock {
> >       /*
> >        * One-way hash of the dependency chain up to this point. We
> > Index: linux-2.6/kernel/lockdep.c
> > ===================================================================
> > --- linux-2.6.orig/kernel/lockdep.c
> > +++ linux-2.6/kernel/lockdep.c
> > @@ -303,8 +303,8 @@ static struct list_head chainhash_table[
> >   * unique.
> >   */
> >  #define iterate_chain_key(key1, key2) \
> > -     (((key1) << MAX_LOCKDEP_KEYS_BITS) ^ \
> > -     ((key1) >> (64-MAX_LOCKDEP_KEYS_BITS)) ^ \
> > +     (((key1) << LOCKDEP_CHAIN_KEY_BITS) ^ \
> > +     ((key1) >> (64 - LOCKDEP_CHAIN_KEY_BITS)) ^ \
> >       (key2))
> >  
> >  void lockdep_off(void)
> > @@ -1988,6 +1988,9 @@ static void check_chain_key(struct task_
> >               if (DEBUG_LOCKS_WARN_ON(id >= MAX_LOCKDEP_KEYS))
> >                       return;
> >  
> > +             if (is_read(hlock->rw_state))
> > +                     id += MAX_LOCKDEP_KEYS;
> 
> Again, isn't this about recursive reads? Or am I just confused ;) 

So what we do here is split off the write chain; the above could have
been written as if (!is_write()) to clarify that.

Everything except recursive read validation will traverse both chains,
the recursive read validation will only traverse the write chains and
ignore the combined read/recursive-read chain.


* Re: [RFC][PATCH 4/7] lockdep: Seperate lock ids for read/write acquires
  2011-04-18 16:49     ` Peter Zijlstra
@ 2011-04-18 17:33       ` Steven Rostedt
  0 siblings, 0 replies; 30+ messages in thread
From: Steven Rostedt @ 2011-04-18 17:33 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Ingo Molnar, LKML, Tetsuo Handa, Thomas Gleixner

On Mon, 2011-04-18 at 18:49 +0200, Peter Zijlstra wrote:
> On Mon, 2011-04-18 at 12:46 -0400, Steven Rostedt wrote:
>  
> > >  void lockdep_off(void)
> > > @@ -1988,6 +1988,9 @@ static void check_chain_key(struct task_
> > >               if (DEBUG_LOCKS_WARN_ON(id >= MAX_LOCKDEP_KEYS))
> > >                       return;
> > >  
> > > +             if (is_read(hlock->rw_state))
> > > +                     id += MAX_LOCKDEP_KEYS;
> > 
> > Again, isn't this about recursive reads? Or am I just confused ;) 
> 
> So what we do here is split off the write chain; the above could have
> been written as if (!is_write()) to clarify that.

Or just add a comment ;)

-- Steve

> 
> Everything except recursive read validation will traverse both chains,
> the recursive read validation will only traverse the write chains and
> ignore the combined read/recursive-read chain.




* Re: [RFC][PATCH 4/7] lockdep: Seperate lock ids for read/write acquires
  2011-04-17  9:45 ` [RFC][PATCH 4/7] lockdep: Seperate lock ids for read/write acquires Peter Zijlstra
  2011-04-18 16:46   ` Steven Rostedt
@ 2011-04-18 22:07   ` Peter Zijlstra
  1 sibling, 0 replies; 30+ messages in thread
From: Peter Zijlstra @ 2011-04-18 22:07 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Tetsuo Handa, Steven Rostedt, Thomas Gleixner,
	Gautham R Shenoy

On Sun, 2011-04-17 at 11:45 +0200, Peter Zijlstra wrote:
> In order to support recursive read locks we need to support the
> previously mentioned lock state conflict matrix:
> 
>  Conflicting_states(WRITE):             RECURSIVE_READ | READ | WRITE
>  Conflicting_states(READ):                               READ | WRITE
>  Conflicting_states(RECURSIVE_READ):                            WRITE
> 
> Since this introduces asymmetry between recursive read and write, we
> need to split the lock dependency chains such that we can traverse
> WRITE chains without observing RECURSIVE_READ|READ chains. 

So while we split off the WRITE chain from the RECURSIVE_READ|READ
chains, shouldn't we split it into three, because the READ conflict state
only has READ|WRITE, not RECURSIVE_READ?

Therefore a RECURSIVE_READ dependency in the READ chain could throw off
the regular READ cycle detector, no?


 A(r) -> B(r) -> C(rr)

 C(w) -> B(r)

would close the cycle and report a problem, but doesn't match the
conflict states.


* Re: [RFC][PATCH 0/7] lockdep: Support recurise-read locks
  2011-04-18  3:41 ` [RFC][PATCH 0/7] lockdep: Support recurise-read locks Tetsuo Handa
@ 2011-04-22  7:19   ` Yong Zhang
  2011-04-22  7:27     ` Yong Zhang
  2011-04-22  7:44     ` Tetsuo Handa
  0 siblings, 2 replies; 30+ messages in thread
From: Yong Zhang @ 2011-04-22  7:19 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: a.p.zijlstra, rostedt, tglx, mingo, linux-kernel

2011/4/18 Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>:
> (Continued from https://lkml.org/lkml/2011/3/31/240 "Re: [RFC] seqlock,lockdep: Add lock primitives to read_seqbegin().")
> Test results for above program:
>
> "cat /proc/locktest1 /proc/locktest2" => Detect fail
> "cat /proc/locktest2 /proc/locktest1" => Detect fail
> "cat /proc/locktest3 /proc/locktest4" => Detect fail
> "cat /proc/locktest4 /proc/locktest3" => Detect fail
>
> If I change from rwlock_acquire_read() to spin_acquire() in read_seqbegin2()
> and from rwlock_release() to spin_release() in read_seqretry2():
>
> "cat /proc/locktest1 /proc/locktest2" => Detect fail
> "cat /proc/locktest2 /proc/locktest1" => Detect OK (shown below)
> "cat /proc/locktest3 /proc/locktest4" => Detect fail
> "cat /proc/locktest4 /proc/locktest3" => Detect OK (shown below)
>
> Guessing from my testcases, read_seqbegin2() will fail to detect the problem
> unless we use spin_acquire()/spin_release() rather than
> rwlock_acquire_read()/rwlock_release().
>
> Also, something is still wrong because lockdep fails to detect the problem
> for "cat /proc/locktest1 /proc/locktest2" and
> "cat /proc/locktest3 /proc/locktest4" cases.

It fails because we never add the recursive read to prev->after even if
it passed the validation.

check_prev_add()::1671
	/*
	 * For recursive read-locks we do all the dependency checks,
	 * but we dont store read-triggered dependencies (only
	 * write-triggered dependencies). This ensures that only the
	 * write-side dependencies matter, and that if for example a
	 * write-lock never takes any other locks, then the reads are
	 * equivalent to a NOP.
	 */
	if (next->read == 2 || prev->read == 2)
		return 1;

So we have no chain after opening locktest1 or locktest3.

But adding recursive read to prev->after will introduce spurious
lockdep warnings.

Thanks,
Yong

>
> [   83.551455]
> [   83.551457] =======================================================
> [   83.555293] [ INFO: possible circular locking dependency detected ]
> [   83.555293] 2.6.39-rc3-00228-gd733ed6-dirty #259
> [   83.555293] -------------------------------------------------------
> [   83.555293] cat/2768 is trying to acquire lock:
> [   83.555293]  (brlock1_lock_dep_map){++++..}, at: [<e08150b0>] brlock1_local_lock+0x0/0x90 [locktest]
> [   83.555293]
> [   83.555293] but task is already holding lock:
> [   83.555293]  (&(&(&seqlock1)->lock)->rlock){+.+...}, at: [<e08154dd>] locktest_open1+0xd/0x40 [locktest]
> [   83.555293]
> [   83.555293] which lock already depends on the new lock.
> [   83.555293]
> [   83.555293]
> [   83.555293] the existing dependency chain (in reverse order) is:
> [   83.555293]
> [   83.555293] -> #1 (&(&(&seqlock1)->lock)->rlock){+.+...}:
> [   83.635281]        [<c106c499>] check_prevs_add+0xb9/0x110
> [   83.635281]        [<c106c840>] validate_chain+0x320/0x5a0
> [   83.635281]        [<c106e927>] __lock_acquire+0x2a7/0x8f0
> [   83.635281]        [<c107001a>] lock_acquire+0x7a/0xa0
> [   83.635281]        [<e0815555>] locktest_open2+0x45/0x70 [locktest]
> [   83.635281]        [<c1118355>] proc_reg_open+0x65/0xe0
> [   83.635281]        [<c10ce78f>] __dentry_open+0x16f/0x2e0
> [   83.635281]        [<c10ce9fe>] nameidata_to_filp+0x5e/0x70
> [   83.635281]        [<c10dc1d8>] do_last+0xf8/0x6c0
> [   83.635281]        [<c10dc846>] path_openat+0xa6/0x340
> [   83.635281]        [<c10dcb10>] do_filp_open+0x30/0x80
> [   83.635281]        [<c10cefa1>] do_sys_open+0x101/0x1a0
> [   83.635281]        [<c10cf069>] sys_open+0x29/0x40
> [   83.635281]        [<c13b43c1>] syscall_call+0x7/0xb
> [   83.635281]
> [   83.635281] -> #0 (brlock1_lock_dep_map){++++..}:
> [   83.635281]        [<c106c34c>] check_prev_add+0x78c/0x820
> [   83.635281]        [<c106c499>] check_prevs_add+0xb9/0x110
> [   83.635281]        [<c106c840>] validate_chain+0x320/0x5a0
> [   83.635281]        [<c106e927>] __lock_acquire+0x2a7/0x8f0
> [   83.635281]        [<c107001a>] lock_acquire+0x7a/0xa0
> [   83.635281]        [<e08150e3>] brlock1_local_lock+0x33/0x90 [locktest]
> [   83.635281]        [<e08154e8>] locktest_open1+0x18/0x40 [locktest]
> [   83.635281]        [<c1118355>] proc_reg_open+0x65/0xe0
> [   83.635281]        [<c10ce78f>] __dentry_open+0x16f/0x2e0
> [   83.635281]        [<c10ce9fe>] nameidata_to_filp+0x5e/0x70
> [   83.635281]        [<c10dc1d8>] do_last+0xf8/0x6c0
> [   83.635281]        [<c10dc846>] path_openat+0xa6/0x340
> [   83.635281]        [<c10dcb10>] do_filp_open+0x30/0x80
> [   83.635281]        [<c10cefa1>] do_sys_open+0x101/0x1a0
> [   83.635281]        [<c10cf069>] sys_open+0x29/0x40
> [   83.635281]        [<c13b43c1>] syscall_call+0x7/0xb
> [   83.635281]
> [   83.635281] other info that might help us debug this:
> [   83.635281]
> [   83.635281] 1 lock held by cat/2768:
> [   83.635281]  #0:  (&(&(&seqlock1)->lock)->rlock){+.+...}, at: [<e08154dd>] locktest_open1+0xd/0x40 [locktest]
> [   83.635281]
> [   83.635281] stack backtrace:
> [   83.635281] Pid: 2768, comm: cat Not tainted 2.6.39-rc3-00228-gd733ed6-dirty #259
> [   83.635281] Call Trace:
> [   83.635281]  [<c106ade6>] print_circular_bug+0xc6/0xd0
> [   83.635281]  [<c106c34c>] check_prev_add+0x78c/0x820
> [   83.635281]  [<c1005d3b>] ? print_context_stack+0x3b/0xa0
> [   83.635281]  [<c1004fa1>] ? dump_trace+0x81/0xe0
> [   83.635281]  [<c106c499>] check_prevs_add+0xb9/0x110
> [   83.635281]  [<c106c840>] validate_chain+0x320/0x5a0
> [   83.635281]  [<c106df7c>] ? mark_lock+0x21c/0x3c0
> [   83.635281]  [<c106e927>] __lock_acquire+0x2a7/0x8f0
> [   83.635281]  [<c107001a>] lock_acquire+0x7a/0xa0
> [   83.635281]  [<e08150b0>] ? brlock1_lock_init+0xb0/0xb0 [locktest]
> [   83.635281]  [<e08154d0>] ? brlock1_global_unlock+0xa0/0xa0 [locktest]
> [   83.635281]  [<e08150e3>] brlock1_local_lock+0x33/0x90 [locktest]
> [   83.635281]  [<e08150b0>] ? brlock1_lock_init+0xb0/0xb0 [locktest]
> [   83.635281]  [<e08154e8>] locktest_open1+0x18/0x40 [locktest]
> [   83.635281]  [<c1118355>] proc_reg_open+0x65/0xe0
> [   83.635281]  [<c10ce78f>] __dentry_open+0x16f/0x2e0
> [   83.635281]  [<c10ce9fe>] nameidata_to_filp+0x5e/0x70
> [   83.635281]  [<c11182f0>] ? proc_reg_mmap+0x80/0x80
> [   83.635281]  [<c10dc1d8>] do_last+0xf8/0x6c0
> [   83.635281]  [<c10db00c>] ? path_init+0x2cc/0x3c0
> [   83.635281]  [<c10dc846>] path_openat+0xa6/0x340
> [   83.635281]  [<c106d80b>] ? trace_hardirqs_off+0xb/0x10
> [   83.635281]  [<c10dcb10>] do_filp_open+0x30/0x80
> [   83.635281]  [<c13b3a5d>] ? _raw_spin_unlock+0x1d/0x20
> [   83.635281]  [<c10e9f11>] ? alloc_fd+0xe1/0x1a0
> [   83.635281]  [<c10cefa1>] do_sys_open+0x101/0x1a0
> [   83.635281]  [<c10cfc8b>] ? vfs_write+0x10b/0x130
> [   83.635281]  [<c10cf069>] sys_open+0x29/0x40
> [   83.635281]  [<c13b43c1>] syscall_call+0x7/0xb
>
>
>
> [   82.758647]
> [   82.758649] =======================================================
> [   82.762520] [ INFO: possible circular locking dependency detected ]
> [   82.762520] 2.6.39-rc3-00228-gd733ed6-dirty #259
> [   82.762520] -------------------------------------------------------
> [   82.762520] cat/2768 is trying to acquire lock:
> [   82.762520]  (rwlock1){++++..}, at: [<e081559d>] locktest_open3+0x1d/0x40 [locktest]
> [   82.762520]
> [   82.762520] but task is already holding lock:
> [   82.762520]  (&(&(&seqlock1)->lock)->rlock){+.+...}, at: [<e081558d>] locktest_open3+0xd/0x40 [locktest]
> [   82.762520]
> [   82.762520] which lock already depends on the new lock.
> [   82.762520]
> [   82.762520]
> [   82.762520] the existing dependency chain (in reverse order) is:
> [   82.762520]
> [   82.762520] -> #1 (&(&(&seqlock1)->lock)->rlock){+.+...}:
> [   82.841627]        [<c106c499>] check_prevs_add+0xb9/0x110
> [   82.841627]        [<c106c840>] validate_chain+0x320/0x5a0
> [   82.841627]        [<c106e927>] __lock_acquire+0x2a7/0x8f0
> [   82.841627]        [<c107001a>] lock_acquire+0x7a/0xa0
> [   82.841627]        [<e081560a>] locktest_open4+0x4a/0x90 [locktest]
> [   82.841627]        [<c1118355>] proc_reg_open+0x65/0xe0
> [   82.841627]        [<c10ce78f>] __dentry_open+0x16f/0x2e0
> [   82.841627]        [<c10ce9fe>] nameidata_to_filp+0x5e/0x70
> [   82.841627]        [<c10dc1d8>] do_last+0xf8/0x6c0
> [   82.841627]        [<c10dc846>] path_openat+0xa6/0x340
> [   82.841627]        [<c10dcb10>] do_filp_open+0x30/0x80
> [   82.841627]        [<c10cefa1>] do_sys_open+0x101/0x1a0
> [   82.841627]        [<c10cf069>] sys_open+0x29/0x40
> [   82.841627]        [<c13b43c1>] syscall_call+0x7/0xb
> [   82.841627]
> [   82.841627] -> #0 (rwlock1){++++..}:
> [   82.841627]        [<c106c34c>] check_prev_add+0x78c/0x820
> [   82.841627]        [<c106c499>] check_prevs_add+0xb9/0x110
> [   82.841627]        [<c106c840>] validate_chain+0x320/0x5a0
> [   82.841627]        [<c106e927>] __lock_acquire+0x2a7/0x8f0
> [   82.841627]        [<c107001a>] lock_acquire+0x7a/0xa0
> [   82.841627]        [<c13b3ba9>] _raw_read_lock+0x39/0x70
> [   82.841627]        [<e081559d>] locktest_open3+0x1d/0x40 [locktest]
> [   82.841627]        [<c1118355>] proc_reg_open+0x65/0xe0
> [   82.841627]        [<c10ce78f>] __dentry_open+0x16f/0x2e0
> [   82.841627]        [<c10ce9fe>] nameidata_to_filp+0x5e/0x70
> [   82.841627]        [<c10dc1d8>] do_last+0xf8/0x6c0
> [   82.841627]        [<c10dc846>] path_openat+0xa6/0x340
> [   82.841627]        [<c10dcb10>] do_filp_open+0x30/0x80
> [   82.841627]        [<c10cefa1>] do_sys_open+0x101/0x1a0
> [   82.841627]        [<c10cf069>] sys_open+0x29/0x40
> [   82.841627]        [<c13b43c1>] syscall_call+0x7/0xb
> [   82.841627]
> [   82.841627] other info that might help us debug this:
> [   82.841627]
> [   82.841627] 1 lock held by cat/2768:
> [   82.841627]  #0:  (&(&(&seqlock1)->lock)->rlock){+.+...}, at: [<e081558d>] locktest_open3+0xd/0x40 [locktest]
> [   82.841627]
> [   82.841627] stack backtrace:
> [   82.841627] Pid: 2768, comm: cat Not tainted 2.6.39-rc3-00228-gd733ed6-dirty #259
> [   82.841627] Call Trace:
> [   82.841627]  [<c106ade6>] print_circular_bug+0xc6/0xd0
> [   82.841627]  [<c106c34c>] check_prev_add+0x78c/0x820
> [   82.841627]  [<c1005d3b>] ? print_context_stack+0x3b/0xa0
> [   82.841627]  [<c1004fa1>] ? dump_trace+0x81/0xe0
> [   82.841627]  [<c106c499>] check_prevs_add+0xb9/0x110
> [   82.841627]  [<c106c840>] validate_chain+0x320/0x5a0
> [   82.841627]  [<c106df7c>] ? mark_lock+0x21c/0x3c0
> [   82.841627]  [<c106e927>] __lock_acquire+0x2a7/0x8f0
> [   82.841627]  [<c107001a>] lock_acquire+0x7a/0xa0
> [   82.841627]  [<e081559d>] ? locktest_open3+0x1d/0x40 [locktest]
> [   82.841627]  [<e0815580>] ? locktest_open2+0x70/0x70 [locktest]
> [   82.841627]  [<c13b3ba9>] _raw_read_lock+0x39/0x70
> [   82.841627]  [<e081559d>] ? locktest_open3+0x1d/0x40 [locktest]
> [   82.841627]  [<e081559d>] locktest_open3+0x1d/0x40 [locktest]
> [   82.841627]  [<c1118355>] proc_reg_open+0x65/0xe0
> [   82.841627]  [<c10ce78f>] __dentry_open+0x16f/0x2e0
> [   82.841627]  [<c10ce9fe>] nameidata_to_filp+0x5e/0x70
> [   82.841627]  [<c11182f0>] ? proc_reg_mmap+0x80/0x80
> [   82.841627]  [<c10dc1d8>] do_last+0xf8/0x6c0
> [   82.841627]  [<c10db00c>] ? path_init+0x2cc/0x3c0
> [   82.841627]  [<c10dc846>] path_openat+0xa6/0x340
> [   82.841627]  [<c106d80b>] ? trace_hardirqs_off+0xb/0x10
> [   82.841627]  [<c10dcb10>] do_filp_open+0x30/0x80
> [   82.841627]  [<c13b3a5d>] ? _raw_spin_unlock+0x1d/0x20
> [   82.841627]  [<c10e9f11>] ? alloc_fd+0xe1/0x1a0
> [   82.841627]  [<c10cefa1>] do_sys_open+0x101/0x1a0
> [   82.841627]  [<c10cfc8b>] ? vfs_write+0x10b/0x130
> [   82.841627]  [<c10cf069>] sys_open+0x29/0x40
> [   82.841627]  [<c13b43c1>] syscall_call+0x7/0xb
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>



-- 
Only stand for myself

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC][PATCH 0/7] lockdep: Support recurise-read locks
  2011-04-22  7:19   ` Yong Zhang
@ 2011-04-22  7:27     ` Yong Zhang
  2011-04-22  7:44     ` Tetsuo Handa
  1 sibling, 0 replies; 30+ messages in thread
From: Yong Zhang @ 2011-04-22  7:27 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: a.p.zijlstra, rostedt, tglx, mingo, linux-kernel

On Fri, Apr 22, 2011 at 3:19 PM, Yong Zhang <yong.zhang0@gmail.com> wrote:
> 2011/4/18 Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>:
>> (Continued from https://lkml.org/lkml/2011/3/31/240 "Re: [RFC] seqlock,lockdep: Add lock primitives to read_seqbegin().")
>> Test results for above program:
>>
>> "cat /proc/locktest1 /proc/locktest2" => Detect fail
>> "cat /proc/locktest2 /proc/locktest1" => Detect fail
>> "cat /proc/locktest3 /proc/locktest4" => Detect fail
>> "cat /proc/locktest4 /proc/locktest3" => Detect fail
>>
>> If I change from rwlock_acquire_read() to spin_acquire() in read_seqbegin2()
>> and from rwlock_release() to spin_release() in read_seqretry2():
>>
>> "cat /proc/locktest1 /proc/locktest2" => Detect fail
>> "cat /proc/locktest2 /proc/locktest1" => Detect OK (shown below)
>> "cat /proc/locktest3 /proc/locktest4" => Detect fail
>> "cat /proc/locktest4 /proc/locktest3" => Detect OK (shown below)
>>
>> Guessing from my testcases, read_seqbegin2() will fail to detect the problem
>> unless we use spin_acquire()/spin_release() rather than
>> rwlock_acquire_read()/rwlock_release().
>>
>> Also, something is still wrong because lockdep fails to detect the problem
>> for "cat /proc/locktest1 /proc/locktest2" and
>> "cat /proc/locktest3 /proc/locktest4" cases.
>
> It fails because we never add the recursive read to prev->after, even if
> it passed the validation.
>
> check_prev_add()::1671
>        /*
>         * For recursive read-locks we do all the dependency checks,
>         * but we dont store read-triggered dependencies (only
>         * write-triggered dependencies). This ensures that only the
>         * write-side dependencies matter, and that if for example a
>         * write-lock never takes any other locks, then the reads are
>         * equivalent to a NOP.
>         */
>        if (next->read == 2 || prev->read == 2)

And this semantic is unchanged by Peter's patches.

>                return 1;
>
> So we have no chain after opening locktest1 or locktest3.
>
> But adding recursive read to prev->after will introduce spurious
> lockdep warnings.
>
> Thanks,
> Yong
>
>> [... lockdep reports and mailing-list footer trimmed; identical to those quoted in full above ...]
>
>
>
> --
> Only stand for myself
>



-- 
Only stand for myself


* Re: [RFC][PATCH 0/7] lockdep: Support recurise-read locks
  2011-04-22  7:19   ` Yong Zhang
  2011-04-22  7:27     ` Yong Zhang
@ 2011-04-22  7:44     ` Tetsuo Handa
  2011-04-22  8:01       ` Yong Zhang
  1 sibling, 1 reply; 30+ messages in thread
From: Tetsuo Handa @ 2011-04-22  7:44 UTC (permalink / raw)
  To: yong.zhang0; +Cc: a.p.zijlstra, rostedt, tglx, mingo, linux-kernel

Yong Zhang wrote:
> > Also, something is still wrong because lockdep fails to detect the problem
> > for "cat /proc/locktest1 /proc/locktest2" and
> > "cat /proc/locktest3 /proc/locktest4" cases.
> 
> It fails because we never add the recursive read to prev->after, even if
> it passed the validation.
> 
Thanks. But why is "cat /proc/locktest1 /proc/locktest2" "the recursive read"
while "cat /proc/locktest2 /proc/locktest1" is not?
Both are serialized, and both hold and release the same locks.
The only difference is which function was called first,
yet whether lockdep raises an alert depends on that ordering.

It sounds to me that Documentation/lockdep-design.txt says that
timing (i.e. which function was called first) is not important:

172 Proof of 100% correctness:
173 --------------------------
174 
175 The validator achieves perfect, mathematical 'closure' (proof of locking
176 correctness) in the sense that for every simple, standalone single-task
177 locking sequence that occurred at least once during the lifetime of the
178 kernel, the validator proves it with a 100% certainty that no
179 combination and timing of these locking sequences can cause any class of
180 lock related deadlock. [*]
181 
182 I.e. complex multi-CPU and multi-task locking scenarios do not have to
183 occur in practice to prove a deadlock: only the simple 'component'
184 locking chains have to occur at least once (anytime, in any
185 task/context) for the validator to be able to prove correctness. (For
186 example, complex deadlocks that would normally need more than 3 CPUs and
187 a very unlikely constellation of tasks, irq-contexts and timings to
188 occur, can be detected on a plain, lightly loaded single-CPU system as
189 well!)


* Re: [RFC][PATCH 0/7] lockdep: Support recurise-read locks
  2011-04-22  7:44     ` Tetsuo Handa
@ 2011-04-22  8:01       ` Yong Zhang
  2011-04-22  8:31         ` Tetsuo Handa
  0 siblings, 1 reply; 30+ messages in thread
From: Yong Zhang @ 2011-04-22  8:01 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: a.p.zijlstra, rostedt, tglx, mingo, linux-kernel

2011/4/22 Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>:
> Yong Zhang wrote:
>> > Also, something is still wrong because lockdep fails to detect the problem
>> > for "cat /proc/locktest1 /proc/locktest2" and
>> > "cat /proc/locktest3 /proc/locktest4" cases.
>>
>> It fails because we never add the recursive read to prev->after evev if
>> it passed the validation.
>>
> Thanks. But why "cat /proc/locktest1 /proc/locktest2" is "the recursive read"
> and "cat /proc/locktest2 /proc/locktest1" is not "the recursive read"?
> Both are serialized. Both hold and release the same lock.
> The only difference is which function was called first,

When you use rwlock_acquire*(), all four of your testcases fail,
for the reason I explained earlier.

When you use spin_acquire()/spin_release() in read_seqbegin2()/
read_seqretry2(), opening locktest2/locktest4 first establishes the chain,
so lockdep warns when locktest1/locktest3 is opened afterwards.
But if you open locktest1/locktest3 first, the chain is not established,
simply because the recursive read is not added to prev->after.

> and lockdep alert depends on which function was called first.

No, it is not a matter of which function happens to be called first;
this is simply the current behavior of lockdep.

>
> It sounds to me that Documentation/lockdep-design.txt says
> timing (i.e. which function was called first) is not important.
>
> 172 Proof of 100% correctness:
> 173 --------------------------
> 174
> 175 The validator achieves perfect, mathematical 'closure' (proof of locking
> 176 correctness) in the sense that for every simple, standalone single-task
> 177 locking sequence that occurred at least once during the lifetime of the
> 178 kernel, the validator proves it with a 100% certainty that no
> 179 combination and timing of these locking sequences can cause any class of
> 180 lock related deadlock. [*]
> 181
> 182 I.e. complex multi-CPU and multi-task locking scenarios do not have to
> 183 occur in practice to prove a deadlock: only the simple 'component'
> 184 locking chains have to occur at least once (anytime, in any
> 185 task/context) for the validator to be able to prove correctness. (For
> 186 example, complex deadlocks that would normally need more than 3 CPUs and
> 187 a very unlikely constellation of tasks, irq-contexts and timings to
> 188 occur, can be detected on a plain, lightly loaded single-CPU system as
> 189 well!)

This is true, but currently we handle recursive-read validation
differently, and that seems to need improvement. :)

Thanks,
Yong



-- 
Only stand for myself


* Re: [RFC][PATCH 0/7] lockdep: Support recurise-read locks
  2011-04-22  8:01       ` Yong Zhang
@ 2011-04-22  8:31         ` Tetsuo Handa
  2011-04-22  8:59           ` Yong Zhang
  0 siblings, 1 reply; 30+ messages in thread
From: Tetsuo Handa @ 2011-04-22  8:31 UTC (permalink / raw)
  To: yong.zhang0; +Cc: a.p.zijlstra, rostedt, tglx, mingo, linux-kernel

Yong Zhang wrote:
> When you are using spin_acquire()/spin_release() in read_seqbegin2()/
> read_seqretry2(), if you call locktest2/locktest4 firstly, the chain will
> be established, so when call locktest1/locktest3, lockdep warns on it.

This part is OK.

> But if you call locktest1/locktest2 firstly, the chain will not be established
> just because recursive read is not added to prev->after.

This part is not OK. At least, I think lockdep should be able to establish the
chain when locktest1 is called AGAIN after locktest2 (i.e. the
"cat /proc/locktest1 /proc/locktest2 /proc/locktest1" case). But lockdep can
establish the chain only for the "cat /proc/locktest2 /proc/locktest1" case.
I think there is a bug that prevents lockdep from establishing the chain
when locktest1 is called again after locktest2. If we can't fix the bug,
we should consider revalidating already-established chains periodically
(or upon printing stall warning messages).


* Re: [RFC][PATCH 0/7] lockdep: Support recurise-read locks
  2011-04-22  8:31         ` Tetsuo Handa
@ 2011-04-22  8:59           ` Yong Zhang
  2011-04-22  9:19             ` Tetsuo Handa
  0 siblings, 1 reply; 30+ messages in thread
From: Yong Zhang @ 2011-04-22  8:59 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: a.p.zijlstra, rostedt, tglx, mingo, linux-kernel

2011/4/22 Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>:
>> But if you call locktest1/locktest2 firstly, the chain will not be established
>> just because recursive read is not added to prev->after.
>
> This part is not OK. At least, I think lockdep should be able to establish the
> chain when locktest1 is called AGAIN after locktest2 is called (i.e.
> "cat /proc/locktest1 /proc/locktest2 /proc/locktest1" case).

I guess lockdep will warn on
"cat /proc/locktest1 /proc/locktest2 /proc/locktest1".

> But lockdep can
> establish the chain for only "cat /proc/locktest2 /proc/locktest1" case.
> I think there is a bug that prevents the lockdep from establishing the chain
> when locktest1 is called AGAIN after locktest2 is called.

If we want to fully support recursive-read validation, it's a bug; but that
also means some head-burning work :)

> If we can't fix the
> bug, we should consider periodically (or upon printing stall warning messages)
> revalidating already established chains.

I don't think periodic revalidation makes sense, because lockdep already
validates everything in real time.

Thanks,
Yong



-- 
Only stand for myself


* Re: [RFC][PATCH 0/7] lockdep: Support recurise-read locks
  2011-04-22  8:59           ` Yong Zhang
@ 2011-04-22  9:19             ` Tetsuo Handa
  2011-04-23 12:33               ` [PATCH] lockdep: ignore cached chain key for recursive read Yong Zhang
  0 siblings, 1 reply; 30+ messages in thread
From: Tetsuo Handa @ 2011-04-22  9:19 UTC (permalink / raw)
  To: yong.zhang0; +Cc: a.p.zijlstra, rostedt, tglx, mingo, linux-kernel

Yong Zhang wrote:
> 2011/4/22 Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>:
> >> But if you call locktest1/locktest2 firstly, the chain will not be established
> >> just because recursive read is not added to prev->after.
> >
> > This part is not OK. At least, I think lockdep should be able to establish the
> > chain when locktest1 is called AGAIN after locktest2 is called (i.e.
> > "cat /proc/locktest1 /proc/locktest2 /proc/locktest1" case).
> 
> I guess lockdep will warn on "cat /proc/locktest1 /proc/locktest2
> /proc/locktest1"

It should warn, but it doesn't warn.
You can confirm it using locktest.c in this thread.


* [PATCH] lockdep: ignore cached chain key for recursive read
  2011-04-22  9:19             ` Tetsuo Handa
@ 2011-04-23 12:33               ` Yong Zhang
  2011-04-23 13:04                 ` Tetsuo Handa
  0 siblings, 1 reply; 30+ messages in thread
From: Yong Zhang @ 2011-04-23 12:33 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: a.p.zijlstra, rostedt, tglx, mingo, linux-kernel

On Fri, Apr 22, 2011 at 06:19:32PM +0900, Tetsuo Handa wrote:
> Yong Zhang wrote:
> > 2011/4/22 Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>:
> > >> But if you call locktest1/locktest2 firstly, the chain will not be established
> > >> just because recursive read is not added to prev->after.
> > >
> > > This part is not OK. At least, I think lockdep should be able to establish the
> > > chain when locktest1 is called AGAIN after locktest2 is called (i.e.
> > > "cat /proc/locktest1 /proc/locktest2 /proc/locktest1" case).
> > 
> > I guess lockdep will warn on "cat /proc/locktest1 /proc/locktest2
> > /proc/locktest1"
> 
> It should warn, but it doesn't warn.
> You can confirm it using locktest.c in this thread.

Just confirmed it on my side.

I think the patch below should fix it.
BTW, I made it on top of Peter's patch; if you want to apply
it to a vanilla kernel, just change "is_rec_read(hlock->rw_state)"
to "hlock->read == 2".

Thanks,
Yong

---
Subject: [PATCH] lockdep: ignore cached chain key for recursive read

Currently we don't add recursive read to the dependency
chain, but we do cache the chain key.

So for recursive read, we should validate it all the time,
regardless of whether it's cached.

If we have such sequence:
1) lock(A); rlock(B);
2) wlock(B); lock(A);
3) lock(A); rlock(B);
lockdep should warn at 3 for "possible circular locking dependency",
but it fails because we have cached the key at 1 and don't validate
again.

Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
---
 kernel/lockdep.c |   18 +++++++++++++++++-
 1 files changed, 17 insertions(+), 1 deletions(-)

diff --git a/kernel/lockdep.c b/kernel/lockdep.c
index da6a8be..3ad3442 100644
--- a/kernel/lockdep.c
+++ b/kernel/lockdep.c
@@ -1885,7 +1885,23 @@ cache_hit:
 					"%016Lx tail class: [%p] %s\n",
 					(unsigned long long)chain_key,
 					class->key, class->name);
-			return 0;
+			/*
+			 * For recursive read, we validate it all the time,
+			 * since we don't know when wlock is coming.
+			 *
+			 * If we have such sequence:
+			 * 1) lock(A); rlock(B);
+			 * 2) wlock(B); lock(A);
+			 * 3) lock(A); rlock(B);
+			 * lockdep should warn at 3 for "possible circular
+			 * locking dependency", but it fails because
+			 * we have cached the key at 1 and don't validate
+			 * again.
+			 */
+			if (is_rec_read(hlock->rw_state) && graph_lock())
+				return 1;
+			else
+				return 0;
 		}
 	}
 	if (very_verbose(class))
-- 
1.7.1
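
The cache-hit blind spot that the changelog describes can be illustrated with
a toy user-space model (a hypothetical sketch only; the names `acquire`,
`would_cycle`, `deps` and `chain_cache` are made up and bear no relation to
lockdep's real data structures): recursive-read acquisitions add no edge to
the dependency graph, yet the chain key is still cached, so revisiting the
same chain later skips the cycle check that would by then fail.

```python
deps = {}           # lock -> set of locks acquired while holding it (the graph)
chain_cache = set() # stand-in for lockdep's cached chain keys

def would_cycle(prev, nxt):
    """True if recording prev -> nxt would close a cycle,
    i.e. nxt already reaches prev through existing edges."""
    stack, seen = [nxt], set()
    while stack:
        cur = stack.pop()
        if cur == prev:
            return True
        if cur in seen:
            continue
        seen.add(cur)
        stack.extend(deps.get(cur, ()))
    return False

def acquire(chain, recursive_read_last, fixed=False):
    """Validate a two-lock chain (prev, nxt).

    Without the fix, a chain-cache hit skips validation entirely.
    With fixed=True, a chain ending in a recursive read is
    revalidated even on a cache hit.
    """
    if chain in chain_cache and not (fixed and recursive_read_last):
        return 'ok (cached, not validated)'
    chain_cache.add(chain)
    prev, nxt = chain
    if would_cycle(prev, nxt):
        return 'deadlock'
    if not recursive_read_last:          # recursive read: edge NOT added,
        deps.setdefault(prev, set()).add(nxt)  # mirroring prev->after behaviour
    return 'ok'

# The changelog's sequence:
print(acquire(('A', 'B'), recursive_read_last=True))   # 1) lock(A); rlock(B)
print(acquire(('B', 'A'), recursive_read_last=False))  # 2) wlock(B); lock(A)
print(acquire(('A', 'B'), recursive_read_last=True))   # 3) cache hit hides the cycle
print(acquire(('A', 'B'), recursive_read_last=True, fixed=True))  # fix: revalidates
```

Step 3 returns without validating even though the edge B -> A learned at
step 2 now makes A -> B -> A a cycle; forcing revalidation for chains ending
in a recursive read (the `fixed=True` path, analogous to the patch's
`is_rec_read()` check before `return 0`) catches it.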



* Re: [PATCH] lockdep: ignore cached chain key for recursive read
  2011-04-23 12:33               ` [PATCH] lockdep: ignore cached chain key for recursive read Yong Zhang
@ 2011-04-23 13:04                 ` Tetsuo Handa
  0 siblings, 0 replies; 30+ messages in thread
From: Tetsuo Handa @ 2011-04-23 13:04 UTC (permalink / raw)
  To: yong.zhang0; +Cc: a.p.zijlstra, rostedt, tglx, mingo, linux-kernel

Yong Zhang wrote:
> I think below patch could fix it.
Great!

With this patch applied (on 2.6.39-rc4), lockdep warns on
"cat /proc/locktest1 /proc/locktest2 /proc/locktest1" case
as with "cat /proc/locktest2 /proc/locktest1" case.

Thank you.

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.39-rc4 #3
-------------------------------------------------------
cat/4363 is trying to acquire lock:
 (brlock1_lock_dep_map){++++..}, at: [<e0838000>] brlock1_local_lock+0x0/0x60 [locktest]

but task is already holding lock:
 (&(&(&seqlock1)->lock)->rlock){+.+...}, at: [<e083811d>] locktest_open1+0xd/0x40 [locktest]

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (&(&(&seqlock1)->lock)->rlock){+.+...}:
       [<c1068d14>] __lock_acquire+0x244/0x6d0
       [<c106921b>] lock_acquire+0x7b/0xa0
       [<e0838575>] locktest_open2+0x45/0x70 [locktest]
       [<c1107ec9>] proc_reg_open+0x79/0x100
       [<c10bf35e>] __dentry_open+0xce/0x2f0
       [<c10bff7e>] nameidata_to_filp+0x5e/0x70
       [<c10cd35b>] do_last+0x21b/0x930
       [<c10ce3aa>] path_openat+0x9a/0x360
       [<c10ce750>] do_filp_open+0x30/0x80
       [<c10c0a35>] do_sys_open+0xe5/0x1a0
       [<c10c0b59>] sys_open+0x29/0x40
       [<c13a40cc>] sysenter_do_call+0x12/0x32

-> #0 (brlock1_lock_dep_map){++++..}:
       [<c1068ac5>] validate_chain+0x1135/0x1140
       [<c1068d14>] __lock_acquire+0x244/0x6d0
       [<c106921b>] lock_acquire+0x7b/0xa0
       [<e0838033>] brlock1_local_lock+0x33/0x60 [locktest]
       [<e0838129>] locktest_open1+0x19/0x40 [locktest]
       [<c1107ec9>] proc_reg_open+0x79/0x100
       [<c10bf35e>] __dentry_open+0xce/0x2f0
       [<c10bff7e>] nameidata_to_filp+0x5e/0x70
       [<c10cd35b>] do_last+0x21b/0x930
       [<c10ce3aa>] path_openat+0x9a/0x360
       [<c10ce750>] do_filp_open+0x30/0x80
       [<c10c0a35>] do_sys_open+0xe5/0x1a0
       [<c10c0b59>] sys_open+0x29/0x40
       [<c13a40cc>] sysenter_do_call+0x12/0x32

other info that might help us debug this:

1 lock held by cat/4363:
 #0:  (&(&(&seqlock1)->lock)->rlock){+.+...}, at: [<e083811d>] locktest_open1+0xd/0x40 [locktest]

stack backtrace:
Pid: 4363, comm: cat Not tainted 2.6.39-rc4 #3
Call Trace:
 [<c103acab>] ? printk+0x1b/0x20
 [<c10673cb>] print_circular_bug+0xbb/0xc0
 [<c1068ac5>] validate_chain+0x1135/0x1140
 [<c1068d14>] __lock_acquire+0x244/0x6d0
 [<c106921b>] lock_acquire+0x7b/0xa0
 [<e0838000>] ? 0xe0837fff
 [<e0838110>] ? locktest_open4+0xb0/0xb0 [locktest]
 [<e0838033>] brlock1_local_lock+0x33/0x60 [locktest]
 [<e0838000>] ? 0xe0837fff
 [<e0838129>] locktest_open1+0x19/0x40 [locktest]
 [<c1107ec9>] proc_reg_open+0x79/0x100
 [<c10bf35e>] __dentry_open+0xce/0x2f0
 [<c10bff7e>] nameidata_to_filp+0x5e/0x70
 [<c1107e50>] ? proc_reg_release+0x100/0x100
 [<c10cd35b>] do_last+0x21b/0x930
 [<c10ce3aa>] path_openat+0x9a/0x360
 [<c1059149>] ? sched_clock_cpu+0x119/0x160
 [<c10ce750>] do_filp_open+0x30/0x80
 [<c13a380d>] ? _raw_spin_unlock+0x1d/0x20
 [<c10dadf1>] ? alloc_fd+0x171/0x1b0
 [<c10c0a35>] do_sys_open+0xe5/0x1a0
 [<c10c0b59>] sys_open+0x29/0x40
 [<c13a40cc>] sysenter_do_call+0x12/0x32


Thread overview: 30+ messages
2011-04-17  9:45 [RFC][PATCH 0/7] lockdep: Support recurise-read locks Peter Zijlstra
2011-04-17  9:45 ` [RFC][PATCH 1/7] lockdep: Implement extra recursive-read lock tests Peter Zijlstra
2011-04-17  9:45 ` [RFC][PATCH 2/7] lockdep: Remove redundant read checks Peter Zijlstra
2011-04-18 14:28   ` Steven Rostedt
2011-04-17  9:45 ` [RFC][PATCH 3/7] lockdep: Annotate read/write states Peter Zijlstra
2011-04-18 13:34   ` Yong Zhang
2011-04-18 16:34     ` Steven Rostedt
2011-04-18 16:36       ` Peter Zijlstra
2011-04-18 16:26   ` Steven Rostedt
2011-04-18 16:27   ` Steven Rostedt
2011-04-18 16:31   ` Steven Rostedt
2011-04-17  9:45 ` [RFC][PATCH 4/7] lockdep: Seperate lock ids for read/write acquires Peter Zijlstra
2011-04-18 16:46   ` Steven Rostedt
2011-04-18 16:49     ` Peter Zijlstra
2011-04-18 17:33       ` Steven Rostedt
2011-04-18 22:07   ` Peter Zijlstra
2011-04-17  9:45 ` [RFC][PATCH 5/7] lockdep: Rename lock_list::class Peter Zijlstra
2011-04-17  9:45 ` [RFC][PATCH 6/7] lockdep: Maintain rw_state entries in locklist Peter Zijlstra
2011-04-18 13:37   ` Yong Zhang
2011-04-17  9:45 ` [RFC][PATCH 7/7] lockdep: Consider the rw_state of lock while validating the chain Peter Zijlstra
2011-04-18  3:41 ` [RFC][PATCH 0/7] lockdep: Support recurise-read locks Tetsuo Handa
2011-04-22  7:19   ` Yong Zhang
2011-04-22  7:27     ` Yong Zhang
2011-04-22  7:44     ` Tetsuo Handa
2011-04-22  8:01       ` Yong Zhang
2011-04-22  8:31         ` Tetsuo Handa
2011-04-22  8:59           ` Yong Zhang
2011-04-22  9:19             ` Tetsuo Handa
2011-04-23 12:33               ` [PATCH] lockdep: ignore cached chain key for recursive read Yong Zhang
2011-04-23 13:04                 ` Tetsuo Handa
