public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/7] sched: Drop the need_resched() loop from cond_resched()
@ 2009-07-16  6:28 Frederic Weisbecker
  2009-07-16  6:28 ` [PATCH 2/7] sched: Remove obsolete comment in __cond_resched() Frederic Weisbecker
                   ` (6 more replies)
  0 siblings, 7 replies; 25+ messages in thread
From: Frederic Weisbecker @ 2009-07-16  6:28 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: LKML, Frederic Weisbecker, Peter Zijlstra, Thomas Gleixner

The schedule() function is a loop that reschedules the current task
while the TIF_NEED_RESCHED flag is set:

void schedule(void)
{
need_resched:
	/* schedule code */
	if (need_resched())
		goto need_resched;
}

And cond_resched() repeats this loop:

do {
	add_preempt_count(PREEMPT_ACTIVE);
	schedule();
	sub_preempt_count(PREEMPT_ACTIVE);
} while(need_resched());

This loop is needless because schedule() already did the check and
nothing can set TIF_NEED_RESCHED between schedule() exit and the loop
check in need_resched().

So remove this needless loop.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 kernel/sched.c |    8 +++-----
 1 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/kernel/sched.c b/kernel/sched.c
index 75d2c1d..264b163 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -6618,11 +6618,9 @@ static void __cond_resched(void)
 	 * PREEMPT_ACTIVE, which could trigger a second
 	 * cond_resched() call.
 	 */
-	do {
-		add_preempt_count(PREEMPT_ACTIVE);
-		schedule();
-		sub_preempt_count(PREEMPT_ACTIVE);
-	} while (need_resched());
+	add_preempt_count(PREEMPT_ACTIVE);
+	schedule();
+	sub_preempt_count(PREEMPT_ACTIVE);
 }
 
 int __sched _cond_resched(void)
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 2/7] sched: Remove obsolete comment in __cond_resched()
  2009-07-16  6:28 [PATCH 1/7] sched: Drop the need_resched() loop from cond_resched() Frederic Weisbecker
@ 2009-07-16  6:28 ` Frederic Weisbecker
  2009-07-18 14:21   ` [tip:sched/core] " tip-bot for Frederic Weisbecker
  2009-07-16  6:28 ` [PATCH 3/7] sched: Cover the CONFIG_DEBUG_SPINLOCK_SLEEP off-case for __might_sleep() Frederic Weisbecker
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 25+ messages in thread
From: Frederic Weisbecker @ 2009-07-16  6:28 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: LKML, Frederic Weisbecker, Peter Zijlstra, Thomas Gleixner

Remove the outdated comment from __cond_resched() related
to the now-removed Big Kernel Semaphore.

Reported-by: Arnd Bergmann <arnd@arndb.de>
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 kernel/sched.c |    5 -----
 1 files changed, 0 insertions(+), 5 deletions(-)

diff --git a/kernel/sched.c b/kernel/sched.c
index 264b163..bb11547 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -6613,11 +6613,6 @@ static void __cond_resched(void)
 #ifdef CONFIG_DEBUG_SPINLOCK_SLEEP
 	__might_sleep(__FILE__, __LINE__);
 #endif
-	/*
-	 * The BKS might be reacquired before we have dropped
-	 * PREEMPT_ACTIVE, which could trigger a second
-	 * cond_resched() call.
-	 */
 	add_preempt_count(PREEMPT_ACTIVE);
 	schedule();
 	sub_preempt_count(PREEMPT_ACTIVE);
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 3/7] sched: Cover the CONFIG_DEBUG_SPINLOCK_SLEEP off-case for __might_sleep()
  2009-07-16  6:28 [PATCH 1/7] sched: Drop the need_resched() loop from cond_resched() Frederic Weisbecker
  2009-07-16  6:28 ` [PATCH 2/7] sched: Remove obsolete comment in __cond_resched() Frederic Weisbecker
@ 2009-07-16  6:28 ` Frederic Weisbecker
  2009-07-18 14:21   ` [tip:sched/core] " tip-bot for Frederic Weisbecker
  2009-07-16  6:28 ` [PATCH 4/7] sched: Add a preempt count base offset to __might_sleep() Frederic Weisbecker
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 25+ messages in thread
From: Frederic Weisbecker @ 2009-07-16  6:28 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: LKML, Frederic Weisbecker, Peter Zijlstra, Thomas Gleixner

Cover the off case for __might_sleep(), so that we avoid #ifdefs
in files that make use of it. In particular, this prepares for
pulling the __might_sleep() check up into cond_resched().

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 include/linux/kernel.h |    1 +
 kernel/sched.c         |    3 +--
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index b0ff486..99882e8 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -231,6 +231,7 @@ extern int _cond_resched(void);
 # define might_sleep() \
 	do { __might_sleep(__FILE__, __LINE__); might_resched(); } while (0)
 #else
+  static inline void __might_sleep(char *file, int line) { }
 # define might_sleep() do { might_resched(); } while (0)
 #endif
 
diff --git a/kernel/sched.c b/kernel/sched.c
index bb11547..ac334ba 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -6610,9 +6610,8 @@ static inline int should_resched(void)
 
 static void __cond_resched(void)
 {
-#ifdef CONFIG_DEBUG_SPINLOCK_SLEEP
 	__might_sleep(__FILE__, __LINE__);
-#endif
+
 	add_preempt_count(PREEMPT_ACTIVE);
 	schedule();
 	sub_preempt_count(PREEMPT_ACTIVE);
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 4/7] sched: Add a preempt count base offset to __might_sleep()
  2009-07-16  6:28 [PATCH 1/7] sched: Drop the need_resched() loop from cond_resched() Frederic Weisbecker
  2009-07-16  6:28 ` [PATCH 2/7] sched: Remove obsolete comment in __cond_resched() Frederic Weisbecker
  2009-07-16  6:28 ` [PATCH 3/7] sched: Cover the CONFIG_DEBUG_SPINLOCK_SLEEP off-case for __might_sleep() Frederic Weisbecker
@ 2009-07-16  6:28 ` Frederic Weisbecker
  2009-07-16 14:14   ` Peter Zijlstra
  2009-07-18 14:22   ` [tip:sched/core] " tip-bot for Frederic Weisbecker
  2009-07-16  6:28 ` [PATCH 5/7] sched: Remove the CONFIG_PREEMPT_BKL case definition of cond_resched() Frederic Weisbecker
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 25+ messages in thread
From: Frederic Weisbecker @ 2009-07-16  6:28 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: LKML, Frederic Weisbecker, Peter Zijlstra, Thomas Gleixner

Add a preempt count base offset to compare against the current
preempt count. This prepares for pulling the might_sleep()
check up from cond_resched() into cond_resched_lock() and
cond_resched_bh().

For these two helpers, we need to ensure that once we unlock the
given spinlock or re-enable local softirqs, respectively, we
will reach a sleepable state.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 include/linux/hardirq.h |    7 +++++++
 include/linux/kernel.h  |    6 +++---
 kernel/sched.c          |    8 ++++----
 3 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/include/linux/hardirq.h b/include/linux/hardirq.h
index 8246c69..d55b0be 100644
--- a/include/linux/hardirq.h
+++ b/include/linux/hardirq.h
@@ -103,6 +103,13 @@
  */
 #define in_atomic()	((preempt_count() & ~PREEMPT_ACTIVE) != PREEMPT_INATOMIC_BASE)
 
+static inline int current_preempt_equals(int preempt_offset)
+{
+	int nested = preempt_count() & ~PREEMPT_ACTIVE;
+
+	return (nested == PREEMPT_INATOMIC_BASE + preempt_offset);
+}
+
 /*
  * Check whether we were atomic before we did preempt_disable():
  * (used by the scheduler, *after* releasing the kernel lock)
diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index 99882e8..f61039e 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -217,7 +217,7 @@ extern int _cond_resched(void);
 #endif
 
 #ifdef CONFIG_DEBUG_SPINLOCK_SLEEP
-  void __might_sleep(char *file, int line);
+  void __might_sleep(char *file, int line, int preempt_offset);
 /**
  * might_sleep - annotation for functions that can sleep
  *
@@ -229,9 +229,9 @@ extern int _cond_resched(void);
  * supposed to.
  */
 # define might_sleep() \
-	do { __might_sleep(__FILE__, __LINE__); might_resched(); } while (0)
+	do { __might_sleep(__FILE__, __LINE__, 0); might_resched(); } while (0)
 #else
-  static inline void __might_sleep(char *file, int line) { }
+  static inline void __might_sleep(char *file, int line, int preempt_offset) { }
 # define might_sleep() do { might_resched(); } while (0)
 #endif
 
diff --git a/kernel/sched.c b/kernel/sched.c
index ac334ba..847e8fb 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -6610,7 +6610,7 @@ static inline int should_resched(void)
 
 static void __cond_resched(void)
 {
-	__might_sleep(__FILE__, __LINE__);
+	__might_sleep(__FILE__, __LINE__, 0);
 
 	add_preempt_count(PREEMPT_ACTIVE);
 	schedule();
@@ -9444,13 +9444,13 @@ void __init sched_init(void)
 }
 
 #ifdef CONFIG_DEBUG_SPINLOCK_SLEEP
-void __might_sleep(char *file, int line)
+void __might_sleep(char *file, int line, int preempt_offset)
 {
 #ifdef in_atomic
 	static unsigned long prev_jiffy;	/* ratelimiting */
 
-	if ((!in_atomic() && !irqs_disabled()) ||
-		    system_state != SYSTEM_RUNNING || oops_in_progress)
+	if ((current_preempt_equals(preempt_offset) && !irqs_disabled()) ||
+	    system_state != SYSTEM_RUNNING || oops_in_progress)
 		return;
 	if (time_before(jiffies, prev_jiffy + HZ) && prev_jiffy)
 		return;
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 5/7] sched: Remove the CONFIG_PREEMPT_BKL case definition of cond_resched()
  2009-07-16  6:28 [PATCH 1/7] sched: Drop the need_resched() loop from cond_resched() Frederic Weisbecker
                   ` (2 preceding siblings ...)
  2009-07-16  6:28 ` [PATCH 4/7] sched: Add a preempt count base offset to __might_sleep() Frederic Weisbecker
@ 2009-07-16  6:28 ` Frederic Weisbecker
  2009-07-18 14:22   ` [tip:sched/core] " tip-bot for Frederic Weisbecker
  2009-07-16  6:28 ` [PATCH 6/7 v3] sched: Pull up the might_sleep() check in cond_resched() Frederic Weisbecker
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 25+ messages in thread
From: Frederic Weisbecker @ 2009-07-16  6:28 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: LKML, Frederic Weisbecker, Peter Zijlstra, Thomas Gleixner

CONFIG_PREEMPT_BKL doesn't exist anymore. So remove the
CONFIG_PREEMPT_BKL-conditional definition of cond_resched().

Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 include/linux/sched.h |    9 ++-------
 1 files changed, 2 insertions(+), 7 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 9bada20..e2bdf18 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2285,17 +2285,12 @@ static inline int need_resched(void)
  * cond_resched_softirq() will enable bhs before scheduling.
  */
 extern int _cond_resched(void);
-#ifdef CONFIG_PREEMPT_BKL
-static inline int cond_resched(void)
-{
-	return 0;
-}
-#else
+
 static inline int cond_resched(void)
 {
 	return _cond_resched();
 }
-#endif
+
 extern int cond_resched_lock(spinlock_t * lock);
 extern int cond_resched_softirq(void);
 static inline int cond_resched_bkl(void)
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 6/7 v3] sched: Pull up the might_sleep() check in cond_resched()
  2009-07-16  6:28 [PATCH 1/7] sched: Drop the need_resched() loop from cond_resched() Frederic Weisbecker
                   ` (3 preceding siblings ...)
  2009-07-16  6:28 ` [PATCH 5/7] sched: Remove the CONFIG_PREEMPT_BKL case definition of cond_resched() Frederic Weisbecker
@ 2009-07-16  6:28 ` Frederic Weisbecker
  2009-07-18 14:22   ` [tip:sched/core] sched: Pull up the might_sleep() check into cond_resched() tip-bot for Frederic Weisbecker
  2009-07-16  6:28 ` [PATCH 7/7] sched: Convert the only user of cond_resched_bkl to use cond_resched() Frederic Weisbecker
  2009-07-18 14:21 ` [tip:sched/core] sched: Drop the need_resched() loop from cond_resched() tip-bot for Frederic Weisbecker
  6 siblings, 1 reply; 25+ messages in thread
From: Frederic Weisbecker @ 2009-07-16  6:28 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: LKML, Frederic Weisbecker, Peter Zijlstra, Thomas Gleixner

might_sleep() is called late in cond_resched(), after the
need_resched() / preemption-enabled / system-running tests have
been checked.

It's better to check for sleeping while atomic earlier, without
depending on environment data that reduces the chances of
detecting a problem.

Also define the cond_resched_*() helpers as macros, so that the FILE/LINE
reported in the sleeping-while-atomic warning points at the real origin
and not sched.h.

Changes in v2:
- call __might_sleep() directly instead of might_sleep() which may call
  cond_resched()
- turn cond_resched() into a macro so that the file:line pair reported
  refers to the caller of cond_resched() and not __cond_resched() itself.

Changes in v3:
- also propagate this __might_sleep() pull up to cond_resched_lock() and
  cond_resched_softirq()

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 fs/dcache.c           |    1 +
 include/linux/sched.h |   29 +++++++++++++++++++----------
 kernel/sched.c        |   12 +++++-------
 3 files changed, 25 insertions(+), 17 deletions(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index 5ecb8e1..64698d6 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -32,6 +32,7 @@
 #include <linux/swap.h>
 #include <linux/bootmem.h>
 #include <linux/fs_struct.h>
+#include <linux/hardirq.h>
 #include "internal.h"
 
 int sysctl_vfs_cache_pressure __read_mostly = 100;
diff --git a/include/linux/sched.h b/include/linux/sched.h
index e2bdf18..099408f 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2286,17 +2286,26 @@ static inline int need_resched(void)
  */
 extern int _cond_resched(void);
 
-static inline int cond_resched(void)
-{
-	return _cond_resched();
-}
+#define cond_resched() ({			\
+	__might_sleep(__FILE__, __LINE__, 0);	\
+	_cond_resched();			\
+})
 
-extern int cond_resched_lock(spinlock_t * lock);
-extern int cond_resched_softirq(void);
-static inline int cond_resched_bkl(void)
-{
-	return _cond_resched();
-}
+extern int __cond_resched_lock(spinlock_t * lock);
+
+#define cond_resched_lock(lock) ({				\
+	__might_sleep(__FILE__, __LINE__, PREEMPT_OFFSET);	\
+	__cond_resched_lock(lock);				\
+})
+
+extern int __cond_resched_softirq(void);
+
+#define cond_resched_softirq() ({				\
+	__might_sleep(__FILE__, __LINE__, SOFTIRQ_OFFSET);	\
+	__cond_resched_softirq();				\
+})
+
+#define cond_resched_bkl()	cond_resched()
 
 /*
  * Does a critical section need to be broken due to another
diff --git a/kernel/sched.c b/kernel/sched.c
index 847e8fb..b9950b1 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -6610,8 +6610,6 @@ static inline int should_resched(void)
 
 static void __cond_resched(void)
 {
-	__might_sleep(__FILE__, __LINE__, 0);
-
 	add_preempt_count(PREEMPT_ACTIVE);
 	schedule();
 	sub_preempt_count(PREEMPT_ACTIVE);
@@ -6628,14 +6626,14 @@ int __sched _cond_resched(void)
 EXPORT_SYMBOL(_cond_resched);
 
 /*
- * cond_resched_lock() - if a reschedule is pending, drop the given lock,
+ * __cond_resched_lock() - if a reschedule is pending, drop the given lock,
  * call schedule, and on return reacquire the lock.
  *
  * This works OK both with and without CONFIG_PREEMPT. We do strange low-level
  * operations here to prevent schedule() from being called twice (once via
  * spin_unlock(), once by hand).
  */
-int cond_resched_lock(spinlock_t *lock)
+int __cond_resched_lock(spinlock_t *lock)
 {
 	int resched = should_resched();
 	int ret = 0;
@@ -6651,9 +6649,9 @@ int cond_resched_lock(spinlock_t *lock)
 	}
 	return ret;
 }
-EXPORT_SYMBOL(cond_resched_lock);
+EXPORT_SYMBOL(__cond_resched_lock);
 
-int __sched cond_resched_softirq(void)
+int __sched __cond_resched_softirq(void)
 {
 	BUG_ON(!in_softirq());
 
@@ -6665,7 +6663,7 @@ int __sched cond_resched_softirq(void)
 	}
 	return 0;
 }
-EXPORT_SYMBOL(cond_resched_softirq);
+EXPORT_SYMBOL(__cond_resched_softirq);
 
 /**
  * yield - yield the current processor to other threads.
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 7/7] sched: Convert the only user of cond_resched_bkl to use cond_resched()
  2009-07-16  6:28 [PATCH 1/7] sched: Drop the need_resched() loop from cond_resched() Frederic Weisbecker
                   ` (4 preceding siblings ...)
  2009-07-16  6:28 ` [PATCH 6/7 v3] sched: Pull up the might_sleep() check in cond_resched() Frederic Weisbecker
@ 2009-07-16  6:28 ` Frederic Weisbecker
  2009-07-18 14:22   ` [tip:sched/core] " tip-bot for Frederic Weisbecker
  2009-07-18 14:21 ` [tip:sched/core] sched: Drop the need_resched() loop from cond_resched() tip-bot for Frederic Weisbecker
  6 siblings, 1 reply; 25+ messages in thread
From: Frederic Weisbecker @ 2009-07-16  6:28 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: LKML, Frederic Weisbecker, Peter Zijlstra, Thomas Gleixner

fs/locks.c:flock_lock_file() is the only user of cond_resched_bkl().

This helper does nothing more than cond_resched(), and the latter
name is enough to explain that we reschedule if needed.

The bkl suffix suggests different semantics, but it's actually
just a synonym for cond_resched().

Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 fs/locks.c            |    2 +-
 include/linux/sched.h |    2 --
 2 files changed, 1 insertions(+), 3 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index 09502f3..0fb66dd 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -768,7 +768,7 @@ static int flock_lock_file(struct file *filp, struct file_lock *request)
 	 * give it the opportunity to lock the file.
 	 */
 	if (found)
-		cond_resched_bkl();
+		cond_resched();
 
 find_conflict:
 	for_each_lock(inode, before) {
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 099408f..d7ffd3f 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2305,8 +2305,6 @@ extern int __cond_resched_softirq(void);
 	__cond_resched_softirq();				\
 })
 
-#define cond_resched_bkl()	cond_resched()
-
 /*
  * Does a critical section need to be broken due to another
  * task waiting?: (technically does not depend on CONFIG_PREEMPT,
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/7] sched: Add a preempt count base offset to __might_sleep()
  2009-07-16  6:28 ` [PATCH 4/7] sched: Add a preempt count base offset to __might_sleep() Frederic Weisbecker
@ 2009-07-16 14:14   ` Peter Zijlstra
  2009-07-16 14:34     ` Peter Zijlstra
  2009-07-18 14:22   ` [tip:sched/core] " tip-bot for Frederic Weisbecker
  1 sibling, 1 reply; 25+ messages in thread
From: Peter Zijlstra @ 2009-07-16 14:14 UTC (permalink / raw)
  To: Frederic Weisbecker; +Cc: Ingo Molnar, LKML, Thomas Gleixner

On Thu, 2009-07-16 at 02:28 -0400, Frederic Weisbecker wrote:
> +++ b/include/linux/hardirq.h
> @@ -103,6 +103,13 @@
>   */
>  #define in_atomic()    ((preempt_count() & ~PREEMPT_ACTIVE) != PREEMPT_INATOMIC_BASE)
>  
> +static inline int current_preempt_equals(int preempt_offset)
> +{
> +       int nested = preempt_count() & ~PREEMPT_ACTIVE;
> +
> +       return (nested == PREEMPT_INATOMIC_BASE + preempt_offset);
> +}

I'm not sure about it being in hardirq.h, I think we should keep this in
sched.c.

Also, I'm not sure about the name, but then I suck at naming too. How
about something like: preempt_count_equals() ?

Other than that the series looks nice and I've got it queued.


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/7] sched: Add a preempt count base offset to __might_sleep()
  2009-07-16 14:14   ` Peter Zijlstra
@ 2009-07-16 14:34     ` Peter Zijlstra
  2009-07-16 14:42       ` Frederic Weisbecker
  0 siblings, 1 reply; 25+ messages in thread
From: Peter Zijlstra @ 2009-07-16 14:34 UTC (permalink / raw)
  To: Frederic Weisbecker; +Cc: Ingo Molnar, LKML, Thomas Gleixner

On Thu, 2009-07-16 at 16:14 +0200, Peter Zijlstra wrote:
> On Thu, 2009-07-16 at 02:28 -0400, Frederic Weisbecker wrote:
> > +++ b/include/linux/hardirq.h
> > @@ -103,6 +103,13 @@
> >   */
> >  #define in_atomic()    ((preempt_count() & ~PREEMPT_ACTIVE) != PREEMPT_INATOMIC_BASE)
> >  
> > +static inline int current_preempt_equals(int preempt_offset)
> > +{
> > +       int nested = preempt_count() & ~PREEMPT_ACTIVE;
> > +
> > +       return (nested == PREEMPT_INATOMIC_BASE + preempt_offset);
> > +}
> 
> I'm not sure about it being in hardirq.h, I think we should keep this in
> sched.c.
> 
> Also, I'm not sure about the name, but then I suck at naming too. How
> about something like: preempt_count_equals() ?
> 
> Other than that the series looks nice and I've got it queued.

I've added this on top:

---
Index: linux-2.6/include/linux/hardirq.h
===================================================================
--- linux-2.6.orig/include/linux/hardirq.h
+++ linux-2.6/include/linux/hardirq.h
@@ -103,13 +103,6 @@
  */
 #define in_atomic()	((preempt_count() & ~PREEMPT_ACTIVE) != PREEMPT_INATOMIC_BASE)
 
-static inline int current_preempt_equals(int preempt_offset)
-{
-	int nested = preempt_count() & ~PREEMPT_ACTIVE;
-
-	return (nested == PREEMPT_INATOMIC_BASE + preempt_offset);
-}
-
 /*
  * Check whether we were atomic before we did preempt_disable():
  * (used by the scheduler, *after* releasing the kernel lock)
Index: linux-2.6/kernel/sched.c
===================================================================
--- linux-2.6.orig/kernel/sched.c
+++ linux-2.6/kernel/sched.c
@@ -9444,12 +9444,19 @@ void __init sched_init(void)
 }
 
 #ifdef CONFIG_DEBUG_SPINLOCK_SLEEP
+static inline int preempt_count_equals(int preempt_offset)
+{
+	int nested = preempt_count() & ~PREEMPT_ACTIVE;
+
+	return (nested == PREEMPT_INATOMIC_BASE + preempt_offset);
+}
+
 void __might_sleep(char *file, int line, int preempt_offset)
 {
 #ifdef in_atomic
 	static unsigned long prev_jiffy;	/* ratelimiting */
 
-	if ((current_preempt_equals(preempt_offset) && !irqs_disabled()) ||
+	if ((preempt_count_equals(preempt_offset) && !irqs_disabled()) ||
 	    system_state != SYSTEM_RUNNING || oops_in_progress)
 		return;
 	if (time_before(jiffies, prev_jiffy + HZ) && prev_jiffy)



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/7] sched: Add a preempt count base offset to __might_sleep()
  2009-07-16 14:34     ` Peter Zijlstra
@ 2009-07-16 14:42       ` Frederic Weisbecker
  0 siblings, 0 replies; 25+ messages in thread
From: Frederic Weisbecker @ 2009-07-16 14:42 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Ingo Molnar, LKML, Thomas Gleixner

On Thu, Jul 16, 2009 at 04:34:53PM +0200, Peter Zijlstra wrote:
> On Thu, 2009-07-16 at 16:14 +0200, Peter Zijlstra wrote:
> > On Thu, 2009-07-16 at 02:28 -0400, Frederic Weisbecker wrote:
> > > +++ b/include/linux/hardirq.h
> > > @@ -103,6 +103,13 @@
> > >   */
> > >  #define in_atomic()    ((preempt_count() & ~PREEMPT_ACTIVE) != PREEMPT_INATOMIC_BASE)
> > >  
> > > +static inline int current_preempt_equals(int preempt_offset)
> > > +{
> > > +       int nested = preempt_count() & ~PREEMPT_ACTIVE;
> > > +
> > > +       return (nested == PREEMPT_INATOMIC_BASE + preempt_offset);
> > > +}
> > 
> > I'm not sure about it being in hardirq.h, I think we should keep this in
> > sched.c.
> > 
> > Also, I'm not sure about the name, but then I suck at naming too. How
> > about something like: preempt_count_equals() ?
> > 
> > Other than that the series looks nice and I've got it queued.
> 
> I've added this on top:
> 
> ---
> Index: linux-2.6/include/linux/hardirq.h
> ===================================================================
> --- linux-2.6.orig/include/linux/hardirq.h
> +++ linux-2.6/include/linux/hardirq.h
> @@ -103,13 +103,6 @@
>   */
>  #define in_atomic()	((preempt_count() & ~PREEMPT_ACTIVE) != PREEMPT_INATOMIC_BASE)
>  
> -static inline int current_preempt_equals(int preempt_offset)
> -{
> -	int nested = preempt_count() & ~PREEMPT_ACTIVE;
> -
> -	return (nested == PREEMPT_INATOMIC_BASE + preempt_offset);
> -}
> -
>  /*
>   * Check whether we were atomic before we did preempt_disable():
>   * (used by the scheduler, *after* releasing the kernel lock)
> Index: linux-2.6/kernel/sched.c
> ===================================================================
> --- linux-2.6.orig/kernel/sched.c
> +++ linux-2.6/kernel/sched.c
> @@ -9444,12 +9444,19 @@ void __init sched_init(void)
>  }
>  
>  #ifdef CONFIG_DEBUG_SPINLOCK_SLEEP
> +static inline int preempt_count_equals(int preempt_offset)
> +{
> +	int nested = preempt_count() & ~PREEMPT_ACTIVE;
> +
> +	return (nested == PREEMPT_INATOMIC_BASE + preempt_offset);
> +}
> +
>  void __might_sleep(char *file, int line, int preempt_offset)
>  {
>  #ifdef in_atomic
>  	static unsigned long prev_jiffy;	/* ratelimiting */
>  
> -	if ((current_preempt_equals(preempt_offset) && !irqs_disabled()) ||
> +	if ((preempt_count_equals(preempt_offset) && !irqs_disabled()) ||
>  	    system_state != SYSTEM_RUNNING || oops_in_progress)
>  		return;
>  	if (time_before(jiffies, prev_jiffy + HZ) && prev_jiffy)


Ok. Yeah the naming and the place are better there.

Thanks.


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [tip:sched/core] sched: Drop the need_resched() loop from cond_resched()
  2009-07-16  6:28 [PATCH 1/7] sched: Drop the need_resched() loop from cond_resched() Frederic Weisbecker
                   ` (5 preceding siblings ...)
  2009-07-16  6:28 ` [PATCH 7/7] sched: Convert the only user of cond_resched_bkl to use cond_resched() Frederic Weisbecker
@ 2009-07-18 14:21 ` tip-bot for Frederic Weisbecker
  6 siblings, 0 replies; 25+ messages in thread
From: tip-bot for Frederic Weisbecker @ 2009-07-18 14:21 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, fweisbec, a.p.zijlstra, tglx, mingo

Commit-ID:  e7aaaa6934636d7a6cadd9e2a05250fbb6a34f65
Gitweb:     http://git.kernel.org/tip/e7aaaa6934636d7a6cadd9e2a05250fbb6a34f65
Author:     Frederic Weisbecker <fweisbec@gmail.com>
AuthorDate: Thu, 16 Jul 2009 15:44:29 +0200
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Sat, 18 Jul 2009 15:51:38 +0200

sched: Drop the need_resched() loop from cond_resched()

The schedule() function is a loop that reschedules the current
task while the TIF_NEED_RESCHED flag is set:

void schedule(void)
{
need_resched:
	/* schedule code */
	if (need_resched())
		goto need_resched;
}

And cond_resched() repeats this loop:

do {
	add_preempt_count(PREEMPT_ACTIVE);
	schedule();
	sub_preempt_count(PREEMPT_ACTIVE);
} while(need_resched());

This loop is needless because schedule() already did the check
and nothing can set TIF_NEED_RESCHED between schedule() exit
and the loop check in need_resched().

So remove this needless loop.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1247725694-6082-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


---
 kernel/sched.c |    8 +++-----
 1 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/kernel/sched.c b/kernel/sched.c
index 03f7e3f..4c5ee84 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -6618,11 +6618,9 @@ static void __cond_resched(void)
 	 * PREEMPT_ACTIVE, which could trigger a second
 	 * cond_resched() call.
 	 */
-	do {
-		add_preempt_count(PREEMPT_ACTIVE);
-		schedule();
-		sub_preempt_count(PREEMPT_ACTIVE);
-	} while (need_resched());
+	add_preempt_count(PREEMPT_ACTIVE);
+	schedule();
+	sub_preempt_count(PREEMPT_ACTIVE);
 }
 
 int __sched _cond_resched(void)

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [tip:sched/core] sched: Remove obsolete comment in __cond_resched()
  2009-07-16  6:28 ` [PATCH 2/7] sched: Remove obsolete comment in __cond_resched() Frederic Weisbecker
@ 2009-07-18 14:21   ` tip-bot for Frederic Weisbecker
  0 siblings, 0 replies; 25+ messages in thread
From: tip-bot for Frederic Weisbecker @ 2009-07-18 14:21 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, a.p.zijlstra, arnd, fweisbec, tglx,
	mingo

Commit-ID:  4b2155678d7cc7b5f45d6b36049091376c3408a2
Gitweb:     http://git.kernel.org/tip/4b2155678d7cc7b5f45d6b36049091376c3408a2
Author:     Frederic Weisbecker <fweisbec@gmail.com>
AuthorDate: Thu, 16 Jul 2009 15:44:29 +0200
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Sat, 18 Jul 2009 15:51:39 +0200

sched: Remove obsolete comment in __cond_resched()

Remove the outdated comment from __cond_resched() related to
the now removed Big Kernel Semaphore.

Reported-by: Arnd Bergmann <arnd@arndb.de>
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1247725694-6082-2-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


---
 kernel/sched.c |    5 -----
 1 files changed, 0 insertions(+), 5 deletions(-)

diff --git a/kernel/sched.c b/kernel/sched.c
index 4c5ee84..4d39e96 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -6613,11 +6613,6 @@ static void __cond_resched(void)
 #ifdef CONFIG_DEBUG_SPINLOCK_SLEEP
 	__might_sleep(__FILE__, __LINE__);
 #endif
-	/*
-	 * The BKS might be reacquired before we have dropped
-	 * PREEMPT_ACTIVE, which could trigger a second
-	 * cond_resched() call.
-	 */
 	add_preempt_count(PREEMPT_ACTIVE);
 	schedule();
 	sub_preempt_count(PREEMPT_ACTIVE);

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [tip:sched/core] sched: Cover the CONFIG_DEBUG_SPINLOCK_SLEEP off-case for __might_sleep()
  2009-07-16  6:28 ` [PATCH 3/7] sched: Cover the CONFIG_DEBUG_SPINLOCK_SLEEP off-case for __might_sleep() Frederic Weisbecker
@ 2009-07-18 14:21   ` tip-bot for Frederic Weisbecker
  0 siblings, 0 replies; 25+ messages in thread
From: tip-bot for Frederic Weisbecker @ 2009-07-18 14:21 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, fweisbec, a.p.zijlstra, tglx, mingo

Commit-ID:  e09758fae8ccde97e026c704319eaa18d488dc86
Gitweb:     http://git.kernel.org/tip/e09758fae8ccde97e026c704319eaa18d488dc86
Author:     Frederic Weisbecker <fweisbec@gmail.com>
AuthorDate: Thu, 16 Jul 2009 15:44:29 +0200
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Sat, 18 Jul 2009 15:51:41 +0200

sched: Cover the CONFIG_DEBUG_SPINLOCK_SLEEP off-case for __might_sleep()

Cover the off case for __might_sleep(), so that we avoid
#ifdefs in files that make use of it. In particular, this
prepares for pulling the __might_sleep() check up into
cond_resched().

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1247725694-6082-3-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


---
 include/linux/kernel.h |    1 +
 kernel/sched.c         |    3 +--
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index d6320a3..b804f69 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -139,6 +139,7 @@ extern int _cond_resched(void);
 # define might_sleep() \
 	do { __might_sleep(__FILE__, __LINE__); might_resched(); } while (0)
 #else
+  static inline void __might_sleep(char *file, int line) { }
 # define might_sleep() do { might_resched(); } while (0)
 #endif
 
diff --git a/kernel/sched.c b/kernel/sched.c
index 4d39e96..370a6c3 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -6610,9 +6610,8 @@ static inline int should_resched(void)
 
 static void __cond_resched(void)
 {
-#ifdef CONFIG_DEBUG_SPINLOCK_SLEEP
 	__might_sleep(__FILE__, __LINE__);
-#endif
+
 	add_preempt_count(PREEMPT_ACTIVE);
 	schedule();
 	sub_preempt_count(PREEMPT_ACTIVE);

^ permalink raw reply related	[flat|nested] 25+ messages in thread
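
The stub pattern this patch applies — a real declaration under the config
option, an empty static inline otherwise — can be sketched in plain
userspace C. This is a toy illustration, not kernel code: DEBUG_SLEEP_CHECK,
debug_check() and do_work() are invented names standing in for
CONFIG_DEBUG_SPINLOCK_SLEEP, __might_sleep() and an arbitrary caller:

```c
#include <stdio.h>
#include <assert.h>

/* Hypothetical compile-time switch, standing in for
 * CONFIG_DEBUG_SPINLOCK_SLEEP. Uncomment to enable the check. */
/* #define DEBUG_SLEEP_CHECK 1 */

#ifdef DEBUG_SLEEP_CHECK
static void debug_check(const char *file, int line)
{
	fprintf(stderr, "debug check at %s:%d\n", file, line);
}
#else
/* Off-case stub: callers compile unchanged, with no #ifdef of
 * their own, and the empty inline optimizes away entirely. */
static inline void debug_check(const char *file, int line)
{
	(void)file;
	(void)line;
}
#endif

static int do_work(void)
{
	debug_check(__FILE__, __LINE__);	/* no #ifdef needed here */
	return 42;
}
```

Every call site stays identical whether the debug option is on or off;
only the single definition in the header changes.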

* [tip:sched/core] sched: Add a preempt count base offset to __might_sleep()
  2009-07-16  6:28 ` [PATCH 4/7] sched: Add a preempt count base offset to __might_sleep() Frederic Weisbecker
  2009-07-16 14:14   ` Peter Zijlstra
@ 2009-07-18 14:22   ` tip-bot for Frederic Weisbecker
  1 sibling, 0 replies; 25+ messages in thread
From: tip-bot for Frederic Weisbecker @ 2009-07-18 14:22 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, fweisbec, a.p.zijlstra, tglx, mingo

Commit-ID:  e4aafea2d4bde8b44d6500c4ee7195bbfc51269e
Gitweb:     http://git.kernel.org/tip/e4aafea2d4bde8b44d6500c4ee7195bbfc51269e
Author:     Frederic Weisbecker <fweisbec@gmail.com>
AuthorDate: Thu, 16 Jul 2009 15:44:29 +0200
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Sat, 18 Jul 2009 15:51:42 +0200

sched: Add a preempt count base offset to __might_sleep()

Add a preempt count base offset to compare against the current
preempt level count. This prepares for pulling the might_sleep
check up from cond_resched() into cond_resched_lock() and
cond_resched_bh().

For these two helpers, we need to ensure that once we unlock the
given spinlock / re-enable local softirqs respectively, we will
reach a sleepable state.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
[ Move and rename preempt_count_equals() ]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1247725694-6082-4-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


---
 include/linux/kernel.h |    6 +++---
 kernel/sched.c         |   15 +++++++++++----
 2 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index b804f69..2b5b1e0 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -125,7 +125,7 @@ extern int _cond_resched(void);
 #endif
 
 #ifdef CONFIG_DEBUG_SPINLOCK_SLEEP
-  void __might_sleep(char *file, int line);
+  void __might_sleep(char *file, int line, int preempt_offset);
 /**
  * might_sleep - annotation for functions that can sleep
  *
@@ -137,9 +137,9 @@ extern int _cond_resched(void);
  * supposed to.
  */
 # define might_sleep() \
-	do { __might_sleep(__FILE__, __LINE__); might_resched(); } while (0)
+	do { __might_sleep(__FILE__, __LINE__, 0); might_resched(); } while (0)
 #else
-  static inline void __might_sleep(char *file, int line) { }
+  static inline void __might_sleep(char *file, int line, int preempt_offset) { }
 # define might_sleep() do { might_resched(); } while (0)
 #endif
 
diff --git a/kernel/sched.c b/kernel/sched.c
index 370a6c3..3ff4d00 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -6610,7 +6610,7 @@ static inline int should_resched(void)
 
 static void __cond_resched(void)
 {
-	__might_sleep(__FILE__, __LINE__);
+	__might_sleep(__FILE__, __LINE__, 0);
 
 	add_preempt_count(PREEMPT_ACTIVE);
 	schedule();
@@ -9429,13 +9429,20 @@ void __init sched_init(void)
 }
 
 #ifdef CONFIG_DEBUG_SPINLOCK_SLEEP
-void __might_sleep(char *file, int line)
+static inline int preempt_count_equals(int preempt_offset)
+{
+	int nested = preempt_count() & ~PREEMPT_ACTIVE;
+
+	return (nested == PREEMPT_INATOMIC_BASE + preempt_offset);
+}
+
+void __might_sleep(char *file, int line, int preempt_offset)
 {
 #ifdef in_atomic
 	static unsigned long prev_jiffy;	/* ratelimiting */
 
-	if ((!in_atomic() && !irqs_disabled()) ||
-		    system_state != SYSTEM_RUNNING || oops_in_progress)
+	if ((preempt_count_equals(preempt_offset) && !irqs_disabled()) ||
+	    system_state != SYSTEM_RUNNING || oops_in_progress)
 		return;
 	if (time_before(jiffies, prev_jiffy + HZ) && prev_jiffy)
 		return;

^ permalink raw reply related	[flat|nested] 25+ messages in thread
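
The preempt_count_equals() idea can be modeled in userspace with an
ordinary counter standing in for the kernel's per-task preempt count.
A sketch under that assumption only — preempt_count_sim, sim_spin_lock()
and would_warn() are made-up names, and the real kernel check also masks
out PREEMPT_ACTIVE and folds in irqs_disabled():

```c
#include <assert.h>

#define PREEMPT_OFFSET	1	/* cost of one held spinlock */

static int preempt_count_sim;	/* stands in for preempt_count() */

static void sim_spin_lock(void)   { preempt_count_sim += PREEMPT_OFFSET; }
static void sim_spin_unlock(void) { preempt_count_sim -= PREEMPT_OFFSET; }

/* Analogue of preempt_count_equals(): the context counts as sleepable
 * if the count sits exactly at the expected base offset. */
static int preempt_count_equals(int preempt_offset)
{
	return preempt_count_sim == preempt_offset;
}

/* A might_sleep()-style check parameterized by the offset the caller
 * promises to drop before actually sleeping. */
static int would_warn(int preempt_offset)
{
	return !preempt_count_equals(preempt_offset);
}
```

cond_resched() would pass an offset of 0, while cond_resched_lock()
passes one spinlock's worth, because it will release that lock before
scheduling.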

* [tip:sched/core] sched: Remove the CONFIG_PREEMPT_BKL case definition of cond_resched()
  2009-07-16  6:28 ` [PATCH 5/7] sched: Remove the CONFIG_PREEMPT_BKL case definition of cond_resched() Frederic Weisbecker
@ 2009-07-18 14:22   ` tip-bot for Frederic Weisbecker
  0 siblings, 0 replies; 25+ messages in thread
From: tip-bot for Frederic Weisbecker @ 2009-07-18 14:22 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, fweisbec, a.p.zijlstra, tglx, mingo

Commit-ID:  6f80bd985fe242c2e6a8b6209ed20b0495d3d63b
Gitweb:     http://git.kernel.org/tip/6f80bd985fe242c2e6a8b6209ed20b0495d3d63b
Author:     Frederic Weisbecker <fweisbec@gmail.com>
AuthorDate: Thu, 16 Jul 2009 15:44:29 +0200
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Sat, 18 Jul 2009 15:51:43 +0200

sched: Remove the CONFIG_PREEMPT_BKL case definition of cond_resched()

CONFIG_PREEMPT_BKL doesn't exist anymore, so remove the
definition of cond_resched() for that config case.

Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1247725694-6082-5-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


---
 include/linux/sched.h |    9 ++-------
 1 files changed, 2 insertions(+), 7 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 9bada20..e2bdf18 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2285,17 +2285,12 @@ static inline int need_resched(void)
  * cond_resched_softirq() will enable bhs before scheduling.
  */
 extern int _cond_resched(void);
-#ifdef CONFIG_PREEMPT_BKL
-static inline int cond_resched(void)
-{
-	return 0;
-}
-#else
+
 static inline int cond_resched(void)
 {
 	return _cond_resched();
 }
-#endif
+
 extern int cond_resched_lock(spinlock_t * lock);
 extern int cond_resched_softirq(void);
 static inline int cond_resched_bkl(void)

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [tip:sched/core] sched: Pull up the might_sleep() check into cond_resched()
  2009-07-16  6:28 ` [PATCH 6/7 v3] sched: Pull up the might_sleep() check in cond_resched() Frederic Weisbecker
@ 2009-07-18 14:22   ` tip-bot for Frederic Weisbecker
  2009-07-20  6:50     ` Li Zefan
  0 siblings, 1 reply; 25+ messages in thread
From: tip-bot for Frederic Weisbecker @ 2009-07-18 14:22 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, fweisbec, a.p.zijlstra, tglx, mingo

Commit-ID:  613afbf83298efaead05ebcac23d2285609d7160
Gitweb:     http://git.kernel.org/tip/613afbf83298efaead05ebcac23d2285609d7160
Author:     Frederic Weisbecker <fweisbec@gmail.com>
AuthorDate: Thu, 16 Jul 2009 15:44:29 +0200
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Sat, 18 Jul 2009 15:51:44 +0200

sched: Pull up the might_sleep() check into cond_resched()

might_sleep() is called late in cond_resched(), after the
need_resched()/preempt-enabled/system-running tests have been
checked.

It's better to run the sleeping-while-atomic check earlier, so
that it does not depend on environment conditions that reduce
the chances of detecting a problem.

Also define the cond_resched_*() helpers as macros, so that the
FILE/LINE pair reported in the sleeping-while-atomic warning
shows the real origin and not sched.h.

Changes in v2:

 - Call __might_sleep() directly instead of might_sleep() which
   may call cond_resched()

 - Turn cond_resched() into a macro so that the file:line
   couple reported refers to the caller of cond_resched() and
   not __cond_resched() itself.

Changes in v3:

 - Also propagate this __might_sleep() pull up to
   cond_resched_lock() and cond_resched_softirq()

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1247725694-6082-6-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


---
 fs/dcache.c           |    1 +
 include/linux/sched.h |   29 +++++++++++++++++++----------
 kernel/sched.c        |   12 +++++-------
 3 files changed, 25 insertions(+), 17 deletions(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index 9e5cd3c..a100fa3 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -32,6 +32,7 @@
 #include <linux/swap.h>
 #include <linux/bootmem.h>
 #include <linux/fs_struct.h>
+#include <linux/hardirq.h>
 #include "internal.h"
 
 int sysctl_vfs_cache_pressure __read_mostly = 100;
diff --git a/include/linux/sched.h b/include/linux/sched.h
index e2bdf18..c41d424 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2286,17 +2286,26 @@ static inline int need_resched(void)
  */
 extern int _cond_resched(void);
 
-static inline int cond_resched(void)
-{
-	return _cond_resched();
-}
+#define cond_resched() ({			\
+	__might_sleep(__FILE__, __LINE__, 0);	\
+	_cond_resched();			\
+})
 
-extern int cond_resched_lock(spinlock_t * lock);
-extern int cond_resched_softirq(void);
-static inline int cond_resched_bkl(void)
-{
-	return _cond_resched();
-}
+extern int __cond_resched_lock(spinlock_t *lock);
+
+#define cond_resched_lock(lock) ({				\
+	__might_sleep(__FILE__, __LINE__, PREEMPT_OFFSET);	\
+	__cond_resched_lock(lock);				\
+})
+
+extern int __cond_resched_softirq(void);
+
+#define cond_resched_softirq() ({				\
+	__might_sleep(__FILE__, __LINE__, SOFTIRQ_OFFSET);	\
+	__cond_resched_softirq();				\
+})
+
+#define cond_resched_bkl()	cond_resched()
 
 /*
  * Does a critical section need to be broken due to another
diff --git a/kernel/sched.c b/kernel/sched.c
index 3ff4d00..1f7919a 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -6610,8 +6610,6 @@ static inline int should_resched(void)
 
 static void __cond_resched(void)
 {
-	__might_sleep(__FILE__, __LINE__, 0);
-
 	add_preempt_count(PREEMPT_ACTIVE);
 	schedule();
 	sub_preempt_count(PREEMPT_ACTIVE);
@@ -6628,14 +6626,14 @@ int __sched _cond_resched(void)
 EXPORT_SYMBOL(_cond_resched);
 
 /*
- * cond_resched_lock() - if a reschedule is pending, drop the given lock,
+ * __cond_resched_lock() - if a reschedule is pending, drop the given lock,
  * call schedule, and on return reacquire the lock.
  *
  * This works OK both with and without CONFIG_PREEMPT. We do strange low-level
  * operations here to prevent schedule() from being called twice (once via
  * spin_unlock(), once by hand).
  */
-int cond_resched_lock(spinlock_t *lock)
+int __cond_resched_lock(spinlock_t *lock)
 {
 	int resched = should_resched();
 	int ret = 0;
@@ -6651,9 +6649,9 @@ int cond_resched_lock(spinlock_t *lock)
 	}
 	return ret;
 }
-EXPORT_SYMBOL(cond_resched_lock);
+EXPORT_SYMBOL(__cond_resched_lock);
 
-int __sched cond_resched_softirq(void)
+int __sched __cond_resched_softirq(void)
 {
 	BUG_ON(!in_softirq());
 
@@ -6665,7 +6663,7 @@ int __sched cond_resched_softirq(void)
 	}
 	return 0;
 }
-EXPORT_SYMBOL(cond_resched_softirq);
+EXPORT_SYMBOL(__cond_resched_softirq);
 
 /**
  * yield - yield the current processor to other threads.

^ permalink raw reply related	[flat|nested] 25+ messages in thread
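
The reason this patch turns cond_resched() into a macro is that
__FILE__/__LINE__ expand where they are written: inside a function body
they always name the header's definition, inside a macro they name the
call site. A minimal standalone illustration (report(), check_fn() and
check_macro() are invented names, not kernel APIs):

```c
#include <assert.h>
#include <stddef.h>

static const char *last_file;
static int last_line;

static void report(const char *file, int line)
{
	last_file = file;
	last_line = line;
}

/* Function form: __FILE__/__LINE__ are frozen at this definition,
 * so every caller is reported as this one location. */
static void check_fn(void)
{
	report(__FILE__, __LINE__);
}

/* Macro form: the expansion happens at the call site, so the
 * reported file:line is the caller's - which is what the
 * sleeping-while-atomic warning wants to print. */
#define check_macro()	report(__FILE__, __LINE__)
```

This is exactly why the warning from the function-based cond_resched()
used to blame sched.h instead of the real caller.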

* [tip:sched/core] sched: Convert the only user of cond_resched_bkl to use cond_resched()
  2009-07-16  6:28 ` [PATCH 7/7] sched: Convert the only user of cond_resched_bkl to use cond_resched() Frederic Weisbecker
@ 2009-07-18 14:22   ` tip-bot for Frederic Weisbecker
  0 siblings, 0 replies; 25+ messages in thread
From: tip-bot for Frederic Weisbecker @ 2009-07-18 14:22 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, fweisbec, a.p.zijlstra, tglx, mingo

Commit-ID:  def01bc53d03881acfc393bd10a5c7575187e008
Gitweb:     http://git.kernel.org/tip/def01bc53d03881acfc393bd10a5c7575187e008
Author:     Frederic Weisbecker <fweisbec@gmail.com>
AuthorDate: Thu, 16 Jul 2009 15:44:29 +0200
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Sat, 18 Jul 2009 15:51:45 +0200

sched: Convert the only user of cond_resched_bkl to use cond_resched()

fs/locks.c:flock_lock_file() is the only user of
cond_resched_bkl().

This helper doesn't do anything more than cond_resched(), whose
name is enough to explain that we are rescheduling if needed.

The bkl suffix suggests different semantics, but it's actually a
synonym of cond_resched().

Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1247725694-6082-7-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


---
 fs/locks.c            |    2 +-
 include/linux/sched.h |    2 --
 2 files changed, 1 insertions(+), 3 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index b6440f5..2eb8197 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -768,7 +768,7 @@ static int flock_lock_file(struct file *filp, struct file_lock *request)
 	 * give it the opportunity to lock the file.
 	 */
 	if (found)
-		cond_resched_bkl();
+		cond_resched();
 
 find_conflict:
 	for_each_lock(inode, before) {
diff --git a/include/linux/sched.h b/include/linux/sched.h
index c41d424..cbbfca6 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2305,8 +2305,6 @@ extern int __cond_resched_softirq(void);
 	__cond_resched_softirq();				\
 })
 
-#define cond_resched_bkl()	cond_resched()
-
 /*
  * Does a critical section need to be broken due to another
  * task waiting?: (technically does not depend on CONFIG_PREEMPT,

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [tip:sched/core] sched: Pull up the might_sleep() check into cond_resched()
  2009-07-18 14:22   ` [tip:sched/core] sched: Pull up the might_sleep() check into cond_resched() tip-bot for Frederic Weisbecker
@ 2009-07-20  6:50     ` Li Zefan
  2009-07-20  8:12       ` Frederic Weisbecker
  0 siblings, 1 reply; 25+ messages in thread
From: Li Zefan @ 2009-07-20  6:50 UTC (permalink / raw)
  To: fweisbec; +Cc: hpa, linux-kernel, a.p.zijlstra, tglx, mingo, linux-tip-commits

> Commit-ID:  613afbf83298efaead05ebcac23d2285609d7160
> Gitweb:     http://git.kernel.org/tip/613afbf83298efaead05ebcac23d2285609d7160
> Author:     Frederic Weisbecker <fweisbec@gmail.com>
> AuthorDate: Thu, 16 Jul 2009 15:44:29 +0200
> Committer:  Ingo Molnar <mingo@elte.hu>
> CommitDate: Sat, 18 Jul 2009 15:51:44 +0200
> 
> sched: Pull up the might_sleep() check into cond_resched()
> 
> might_sleep() is called late-ish in cond_resched(), after the
> need_resched()/preempt enabled/system running tests are
> checked.
> 
> It's better to check the sleeps while atomic earlier and not
> depend on some environment datas that reduce the chances to
> detect a problem.
> 
> Also define cond_resched_*() helpers as macros, so that the
> FILE/LINE reported in the sleeping while atomic warning
> displays the real origin and not sched.h
> 

I guess it's this patch that causes lots of "BUG"

BUG: sleeping function called from invalid context at fs/jbd/commit.c:902        
in_atomic(): 0, irqs_disabled(): 0, pid: 64, name: kjournald                     
INFO: lockdep is turned off.                                                     
Pid: 64, comm: kjournald Tainted: GF          2.6.31-rc3-tip #15                 
Call Trace:                                                                      
 [<c042cbd1>] __might_sleep+0xda/0xdf                                            
 [<c053e9f4>] journal_commit_transaction+0xb03/0xc5f                             
 [<c043ecc4>] ? try_to_del_timer_sync+0x48/0x4f                                  
 [<c0541394>] kjournald+0xcf/0x1fe                                               
 [<c0448998>] ? autoremove_wake_function+0x0/0x34                                
 [<c05412c5>] ? kjournald+0x0/0x1fe                                              
 [<c0448708>] kthread+0x6b/0x70                                                  
 [<c044869d>] ? kthread+0x0/0x70                                                 
 [<c040364b>] kernel_thread_helper+0x7/0x10                                      
BUG: sleeping function called from invalid context at fs/dcache.c:512            
in_atomic(): 0, irqs_disabled(): 0, pid: 2005, name: bash                        
INFO: lockdep is turned off.                                                     
Pid: 2005, comm: bash Tainted: GF          2.6.31-rc3-tip #15                    
Call Trace:                                                                      
 [<c042cbd1>] __might_sleep+0xda/0xdf                                            
 [<c04cae29>] __shrink_dcache_sb+0x208/0x27a                                     
 [<c04cb038>] shrink_dcache_parent+0x2c/0xcf                                     
 [<c04f8371>] proc_flush_task+0xa7/0x194                                         
 [<c0437553>] release_task+0x29/0x3b4                                            
 [<c0437fe0>] wait_consider_task+0x702/0xa91                                     
 [<c043844d>] do_wait+0xde/0x276                                                 
 [<c0430f6e>] ? default_wake_function+0x0/0x12                                   
 [<c0438672>] sys_wait4+0x8d/0xa6                                                
 [<c04a3c65>] ? might_fault+0x85/0x87                                            
 [<c04386a3>] sys_waitpid+0x18/0x1a                                              
 [<c0402ab8>] sysenter_do_call+0x12/0x36                       
...
...

> Changes in v2:
> 
>  - Call __might_sleep() directly instead of might_sleep() which
>    may call cond_resched()
> 
>  - Turn cond_resched() into a macro so that the file:line
>    couple reported refers to the caller of cond_resched() and
>    not __cond_resched() itself.
> 
> Changes in v3:
> 
>  - Also propagate this __might_sleep() pull up to
>    cond_resched_lock() and cond_resched_softirq()
> 
> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> LKML-Reference: <1247725694-6082-6-git-send-email-fweisbec@gmail.com>
> Signed-off-by: Ingo Molnar <mingo@elte.hu>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [tip:sched/core] sched: Pull up the might_sleep() check into cond_resched()
  2009-07-20  6:50     ` Li Zefan
@ 2009-07-20  8:12       ` Frederic Weisbecker
  2009-07-20  8:49         ` Peter Zijlstra
  2009-07-20 11:39         ` Peter Zijlstra
  0 siblings, 2 replies; 25+ messages in thread
From: Frederic Weisbecker @ 2009-07-20  8:12 UTC (permalink / raw)
  To: Li Zefan; +Cc: hpa, linux-kernel, a.p.zijlstra, tglx, mingo, linux-tip-commits

On Mon, Jul 20, 2009 at 02:50:19PM +0800, Li Zefan wrote:
> > Commit-ID:  613afbf83298efaead05ebcac23d2285609d7160
> > Gitweb:     http://git.kernel.org/tip/613afbf83298efaead05ebcac23d2285609d7160
> > Author:     Frederic Weisbecker <fweisbec@gmail.com>
> > AuthorDate: Thu, 16 Jul 2009 15:44:29 +0200
> > Committer:  Ingo Molnar <mingo@elte.hu>
> > CommitDate: Sat, 18 Jul 2009 15:51:44 +0200
> > 
> > sched: Pull up the might_sleep() check into cond_resched()
> > 
> > might_sleep() is called late-ish in cond_resched(), after the
> > need_resched()/preempt enabled/system running tests are
> > checked.
> > 
> > It's better to check the sleeps while atomic earlier and not
> > depend on some environment datas that reduce the chances to
> > detect a problem.
> > 
> > Also define cond_resched_*() helpers as macros, so that the
> > FILE/LINE reported in the sleeping while atomic warning
> > displays the real origin and not sched.h
> > 
> 
> I guess it's this patch that causes lots of "BUG"
> 
> BUG: sleeping function called from invalid context at fs/jbd/commit.c:902        
> in_atomic(): 0, irqs_disabled(): 0, pid: 64, name: kjournald                     
> INFO: lockdep is turned off.                                                     
> Pid: 64, comm: kjournald Tainted: GF          2.6.31-rc3-tip #15                 
> Call Trace:                                                                      
>  [<c042cbd1>] __might_sleep+0xda/0xdf                                            
>  [<c053e9f4>] journal_commit_transaction+0xb03/0xc5f                             
>  [<c043ecc4>] ? try_to_del_timer_sync+0x48/0x4f                                  
>  [<c0541394>] kjournald+0xcf/0x1fe                                               
>  [<c0448998>] ? autoremove_wake_function+0x0/0x34                                
>  [<c05412c5>] ? kjournald+0x0/0x1fe                                              
>  [<c0448708>] kthread+0x6b/0x70                                                  
>  [<c044869d>] ? kthread+0x0/0x70                                                 
>  [<c040364b>] kernel_thread_helper+0x7/0x10                                      
> BUG: sleeping function called from invalid context at fs/dcache.c:512            
> in_atomic(): 0, irqs_disabled(): 0, pid: 2005, name: bash                        
> INFO: lockdep is turned off.                                                     
> Pid: 2005, comm: bash Tainted: GF          2.6.31-rc3-tip #15                    
> Call Trace:                                                                      
>  [<c042cbd1>] __might_sleep+0xda/0xdf                                            
>  [<c04cae29>] __shrink_dcache_sb+0x208/0x27a                                     
>  [<c04cb038>] shrink_dcache_parent+0x2c/0xcf                                     
>  [<c04f8371>] proc_flush_task+0xa7/0x194                                         
>  [<c0437553>] release_task+0x29/0x3b4                                            
>  [<c0437fe0>] wait_consider_task+0x702/0xa91                                     
>  [<c043844d>] do_wait+0xde/0x276                                                 
>  [<c0430f6e>] ? default_wake_function+0x0/0x12                                   
>  [<c0438672>] sys_wait4+0x8d/0xa6                                                
>  [<c04a3c65>] ? might_fault+0x85/0x87                                            
>  [<c04386a3>] sys_waitpid+0x18/0x1a                                              
>  [<c0402ab8>] sysenter_do_call+0x12/0x36                       


Hm, I can read that in fs/dcache.c:512

/* dentry->d_lock was dropped in prune_one_dentry() */
cond_resched_lock(&dcache_lock);

Isn't that a misuse of cond_resched_lock()?
If so, dcache.c should be fixed.

Anyway a generic fix could be the following.
Can you tell me if this works for you?

Thanks!
---
From: Frederic Weisbecker <fweisbec@gmail.com>
Subject: [PATCH] sched: Check if the spinlock is locked in cond_resched_lock()

Some uses of cond_resched_lock() might involve an unlocked
spinlock, resulting in spurious sleeping-while-atomic warnings.

Check whether the spinlock is actually locked and take that into
account in the might_sleep() check.

Reported-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
diff --git a/include/linux/sched.h b/include/linux/sched.h
index cb070dc..2789658 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2294,9 +2294,10 @@ extern int _cond_resched(void);
 
 extern int __cond_resched_lock(spinlock_t *lock);
 
-#define cond_resched_lock(lock) ({				\
-	__might_sleep(__FILE__, __LINE__, PREEMPT_OFFSET);	\
-	__cond_resched_lock(lock);				\
+#define cond_resched_lock(lock) ({				  \
+	__might_sleep(__FILE__, __LINE__, spin_is_locked(lock) ?  \
+					  PREEMPT_OFFSET : 0);	  \
+	__cond_resched_lock(lock);				  \
 })
 
 extern int __cond_resched_softirq(void);



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [tip:sched/core] sched: Pull up the might_sleep() check into cond_resched()
  2009-07-20  8:12       ` Frederic Weisbecker
@ 2009-07-20  8:49         ` Peter Zijlstra
  2009-07-20  9:13           ` Frederic Weisbecker
  2009-07-20 11:56           ` Ingo Molnar
  2009-07-20 11:39         ` Peter Zijlstra
  1 sibling, 2 replies; 25+ messages in thread
From: Peter Zijlstra @ 2009-07-20  8:49 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Li Zefan, hpa, linux-kernel, tglx, mingo, linux-tip-commits

On Mon, 2009-07-20 at 04:12 -0400, Frederic Weisbecker wrote:

> From: Frederic Weisbecker <fweisbec@gmail.com>
> Subject: [PATCH] sched: Check if the spinlock is locked in cond_resched_lock()
> 
> Some uses of cond_resched_lock() might involve an
> unlocked spinlock, resulting in spurious sleep in
> atomic warnings.
> Check whether the spinlock is actually locked and
> take that into account in the might_sleep() check.
> 
> Reported-by: Li Zefan <lizf@cn.fujitsu.com>
> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> ---
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index cb070dc..2789658 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -2294,9 +2294,10 @@ extern int _cond_resched(void);
>  
>  extern int __cond_resched_lock(spinlock_t *lock);
>  
> -#define cond_resched_lock(lock) ({				\
> -	__might_sleep(__FILE__, __LINE__, PREEMPT_OFFSET);	\
> -	__cond_resched_lock(lock);				\
> +#define cond_resched_lock(lock) ({				  \
> +	__might_sleep(__FILE__, __LINE__, spin_is_locked(lock) ?  \
> +					  PREEMPT_OFFSET : 0);	  \
> +	__cond_resched_lock(lock);				  \
>  })
>  
>  extern int __cond_resched_softirq(void);


No, this looks utterly broken.. who is to say it doesn't get unlocked
right after that spin_is_locked() check?

cond_resched_lock() callers must hold the lock they use it on, not doing
so is broken.

So I would suggest something like the below instead:

(utterly untested)

Almost-signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 include/linux/lockdep.h |    8 ++++++++
 kernel/lockdep.c        |   33 +++++++++++++++++++++++++++++++++
 kernel/sched.c          |    2 ++
 3 files changed, 43 insertions(+), 0 deletions(-)

diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index 12aabfc..920df42 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -296,6 +296,10 @@ extern void lock_acquire(struct lockdep_map *lock, unsigned int subclass,
 extern void lock_release(struct lockdep_map *lock, int nested,
 			 unsigned long ip);
 
+#define lockdep_is_held(lock)	lock_is_held(&(lock)->dep_map)
+
+extern int lock_is_held(struct lockdep_map *lock);
+
 extern void lock_set_class(struct lockdep_map *lock, const char *name,
 			   struct lock_class_key *key, unsigned int subclass,
 			   unsigned long ip);
@@ -314,6 +318,8 @@ extern void lockdep_trace_alloc(gfp_t mask);
 
 #define lockdep_depth(tsk)	(debug_locks ? (tsk)->lockdep_depth : 0)
 
+#define lockdep_assert_held(l)	WARN_ON(!lockdep_is_held(l))
+
 #else /* !LOCKDEP */
 
 static inline void lockdep_off(void)
@@ -358,6 +364,8 @@ struct lock_class_key { };
 
 #define lockdep_depth(tsk)	(0)
 
+#define lockdep_assert_held(l)			do { } while (0)
+
 #endif /* !LOCKDEP */
 
 #ifdef CONFIG_LOCK_STAT
diff --git a/kernel/lockdep.c b/kernel/lockdep.c
index 3718a98..fd626ea 100644
--- a/kernel/lockdep.c
+++ b/kernel/lockdep.c
@@ -3044,6 +3044,19 @@ __lock_release(struct lockdep_map *lock, int nested, unsigned long ip)
 	check_chain_key(curr);
 }
 
+static int __lock_is_held(struct lockdep_map *lock)
+{
+	struct task_struct *curr = current;
+	int i;
+
+	for (i = 0; i < curr->lockdep_depth; i++) {
+		if (curr->held_locks[i].instance == lock)
+			return 1;
+	}
+
+	return 0;
+}
+
 /*
  * Check whether we follow the irq-flags state precisely:
  */
@@ -3145,6 +3158,26 @@ void lock_release(struct lockdep_map *lock, int nested,
 }
 EXPORT_SYMBOL_GPL(lock_release);
 
+int lock_is_held(struct lockdep_map *lock)
+{
+	unsigned long flags;
+	int ret = 0;
+
+	if (unlikely(current->lockdep_recursion))
+		return 0;
+
+	raw_local_irq_save(flags);
+	check_flags(flags);
+
+	current->lockdep_recursion = 1;
+	ret = __lock_is_held(lock);
+	current->lockdep_recursion = 0;
+	raw_local_irq_restore(flags);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(lock_is_held);
+
 void lockdep_set_current_reclaim_state(gfp_t gfp_mask)
 {
 	current->lockdep_reclaim_gfp = gfp_mask;
diff --git a/kernel/sched.c b/kernel/sched.c
index 56fb225..6255f82 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -6638,6 +6638,8 @@ int __cond_resched_lock(spinlock_t *lock)
 	int resched = should_resched();
 	int ret = 0;
 
+	lockdep_assert_held(lock);
+
 	if (spin_needbreak(lock) || resched) {
 		spin_unlock(lock);
 		if (resched)



^ permalink raw reply related	[flat|nested] 25+ messages in thread
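
The lockdep_assert_held() approach above avoids the race by checking the
API contract — "the caller holds this lock" — rather than sampling the
lock word. A single-threaded toy model of that contract (struct toy_lock
and the toy_* functions are invented; real lockdep walks the task's
held_locks array instead of a flag):

```c
#include <assert.h>

/* Records whether the (single) caller currently holds the lock,
 * standing in for lockdep's per-task held-lock list. */
struct toy_lock {
	int held;
};

static void toy_acquire(struct toy_lock *l)       { l->held = 1; }
static void toy_release(struct toy_lock *l)       { l->held = 0; }
static int  toy_is_held(const struct toy_lock *l) { return l->held; }

/* Analogue of __cond_resched_lock(): assert the contract on entry;
 * only then is it safe to drop the lock, schedule, and retake it. */
static int toy_cond_resched_lock(struct toy_lock *l)
{
	assert(toy_is_held(l));		/* lockdep_assert_held(lock) */
	toy_release(l);
	/* schedule() would run here, with the lock dropped */
	toy_acquire(l);
	return 1;
}
```

Unlike the spin_is_locked() test, this cannot race: it asserts a
property of the current task (it owns the lock), not a momentary
property of the lock itself, which another CPU could change at any time.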

* Re: [tip:sched/core] sched: Pull up the might_sleep() check into cond_resched()
  2009-07-20  8:49         ` Peter Zijlstra
@ 2009-07-20  9:13           ` Frederic Weisbecker
  2009-07-20 11:56           ` Ingo Molnar
  1 sibling, 0 replies; 25+ messages in thread
From: Frederic Weisbecker @ 2009-07-20  9:13 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Li Zefan, hpa, linux-kernel, tglx, mingo, linux-tip-commits

On Mon, Jul 20, 2009 at 10:49:07AM +0200, Peter Zijlstra wrote:
> On Mon, 2009-07-20 at 04:12 -0400, Frederic Weisbecker wrote:
> 
> > From: Frederic Weisbecker <fweisbec@gmail.com>
> > Subject: [PATCH] sched: Check if the spinlock is locked in cond_resched_lock()
> > 
> > Some uses of cond_resched_lock() might involve an
> > unlocked spinlock, resulting in spurious sleep in
> > atomic warnings.
> > Check whether the spinlock is actually locked and
> > take that into account in the might_sleep() check.
> > 
> > Reported-by: Li Zefan <lizf@cn.fujitsu.com>
> > Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> > ---
> > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > index cb070dc..2789658 100644
> > --- a/include/linux/sched.h
> > +++ b/include/linux/sched.h
> > @@ -2294,9 +2294,10 @@ extern int _cond_resched(void);
> >  
> >  extern int __cond_resched_lock(spinlock_t *lock);
> >  
> > -#define cond_resched_lock(lock) ({				\
> > -	__might_sleep(__FILE__, __LINE__, PREEMPT_OFFSET);	\
> > -	__cond_resched_lock(lock);				\
> > +#define cond_resched_lock(lock) ({				  \
> > +	__might_sleep(__FILE__, __LINE__, spin_is_locked(lock) ?  \
> > +					  PREEMPT_OFFSET : 0);	  \
> > +	__cond_resched_lock(lock);				  \
> >  })
> >  
> >  extern int __cond_resched_softirq(void);
> 
> 
> No, this looks utterly broken.. who is to say it doesn't get unlocked
> right after that spin_is_locked() check?


Oh, indeed, that was silly...


> 
> cond_resched_lock() callers must hold the lock they use it on, not doing
> so is broken.
> 
> So I would suggest something like the below instead:
> 
> (utterly untested)


Looks better anyway.

Thanks,
Frederic.


> 
> Almost-signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> ---
>  include/linux/lockdep.h |    8 ++++++++
>  kernel/lockdep.c        |   33 +++++++++++++++++++++++++++++++++
>  kernel/sched.c          |    2 ++
>  3 files changed, 43 insertions(+), 0 deletions(-)
> 
> diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
> index 12aabfc..920df42 100644
> --- a/include/linux/lockdep.h
> +++ b/include/linux/lockdep.h
> @@ -296,6 +296,10 @@ extern void lock_acquire(struct lockdep_map *lock, unsigned int subclass,
>  extern void lock_release(struct lockdep_map *lock, int nested,
>  			 unsigned long ip);
>  
> +#define lockdep_is_held(lock)	lock_is_held(&(lock)->dep_map)
> +
> +extern int lock_is_held(struct lockdep_map *lock);
> +
>  extern void lock_set_class(struct lockdep_map *lock, const char *name,
>  			   struct lock_class_key *key, unsigned int subclass,
>  			   unsigned long ip);
> @@ -314,6 +318,8 @@ extern void lockdep_trace_alloc(gfp_t mask);
>  
>  #define lockdep_depth(tsk)	(debug_locks ? (tsk)->lockdep_depth : 0)
>  
> +#define lockdep_assert_held(l)	WARN_ON(!lockdep_is_held(l))
> +
>  #else /* !LOCKDEP */
>  
>  static inline void lockdep_off(void)
> @@ -358,6 +364,8 @@ struct lock_class_key { };
>  
>  #define lockdep_depth(tsk)	(0)
>  
> +#define lockdep_assert_held(l)			do { } while (0)
> +
>  #endif /* !LOCKDEP */
>  
>  #ifdef CONFIG_LOCK_STAT
> diff --git a/kernel/lockdep.c b/kernel/lockdep.c
> index 3718a98..fd626ea 100644
> --- a/kernel/lockdep.c
> +++ b/kernel/lockdep.c
> @@ -3044,6 +3044,19 @@ __lock_release(struct lockdep_map *lock, int nested, unsigned long ip)
>  	check_chain_key(curr);
>  }
>  
> +static int __lock_is_held(struct lockdep_map *lock)
> +{
> +	struct task_struct *curr = current;
> +	int i;
> +
> +	for (i = 0; i < curr->lockdep_depth; i++) {
> +		if (curr->held_locks[i].instance == lock)
> +			return 1;
> +	}
> +
> +	return 0;
> +}
> +
>  /*
>   * Check whether we follow the irq-flags state precisely:
>   */
> @@ -3145,6 +3158,26 @@ void lock_release(struct lockdep_map *lock, int nested,
>  }
>  EXPORT_SYMBOL_GPL(lock_release);
>  
> +int lock_is_held(struct lockdep_map *lock)
> +{
> +	unsigned long flags;
> +	int ret = 0;
> +
> +	if (unlikely(current->lockdep_recursion))
> +		return ret;
> +
> +	raw_local_irq_save(flags);
> +	check_flags(flags);
> +
> +	current->lockdep_recursion = 1;
> +	ret = __lock_is_held(lock);
> +	current->lockdep_recursion = 0;
> +	raw_local_irq_restore(flags);
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(lock_is_held);
> +
>  void lockdep_set_current_reclaim_state(gfp_t gfp_mask)
>  {
>  	current->lockdep_reclaim_gfp = gfp_mask;
> diff --git a/kernel/sched.c b/kernel/sched.c
> index 56fb225..6255f82 100644
> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -6638,6 +6638,8 @@ int __cond_resched_lock(spinlock_t *lock)
>  	int resched = should_resched();
>  	int ret = 0;
>  
> +	lockdep_assert_held(lock);
> +
>  	if (spin_needbreak(lock) || resched) {
>  		spin_unlock(lock);
>  		if (resched)
> 
> 


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [tip:sched/core] sched: Pull up the might_sleep() check into cond_resched()
  2009-07-20  8:12       ` Frederic Weisbecker
  2009-07-20  8:49         ` Peter Zijlstra
@ 2009-07-20 11:39         ` Peter Zijlstra
  2009-07-22 17:17           ` Frédéric Weisbecker
  1 sibling, 1 reply; 25+ messages in thread
From: Peter Zijlstra @ 2009-07-20 11:39 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Li Zefan, hpa, linux-kernel, tglx, mingo, linux-tip-commits

On Mon, 2009-07-20 at 04:12 -0400, Frederic Weisbecker wrote:

> Hm, I can read that in fs/dcache.c:512
> 
> /* dentry->d_lock was dropped in prune_one_dentry() */
> cond_resched_lock(&dcache_lock);

dentry->d_lock != dcache_lock

> Isn't it a misuse of cond_resched_lock() ?
> In this case, dcache.c should be fixed.

Both sites look good wrt locking. So there's something else fishy.


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [tip:sched/core] sched: Pull up the might_sleep() check into cond_resched()
  2009-07-20  8:49         ` Peter Zijlstra
  2009-07-20  9:13           ` Frederic Weisbecker
@ 2009-07-20 11:56           ` Ingo Molnar
  1 sibling, 0 replies; 25+ messages in thread
From: Ingo Molnar @ 2009-07-20 11:56 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Frederic Weisbecker, Li Zefan, hpa, linux-kernel, tglx,
	linux-tip-commits


* Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:

> On Mon, 2009-07-20 at 04:12 -0400, Frederic Weisbecker wrote:
> 
> > From: Frederic Weisbecker <fweisbec@gmail.com>
> > Subject: [PATCH] sched: Check if the spinlock is locked in cond_resched_lock()
> > 
> > Some uses of cond_resched_lock() might involve an
> > unlocked spinlock, resulting in spurious "sleeping
> > while atomic" warnings.
> > Check whether the spinlock is actually locked and
> > take that into account in the might_sleep() check.
> > 
> > Reported-by: Li Zefan <lizf@cn.fujitsu.com>
> > Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> > ---
> > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > index cb070dc..2789658 100644
> > --- a/include/linux/sched.h
> > +++ b/include/linux/sched.h
> > @@ -2294,9 +2294,10 @@ extern int _cond_resched(void);
> >  
> >  extern int __cond_resched_lock(spinlock_t *lock);
> >  
> > -#define cond_resched_lock(lock) ({				\
> > -	__might_sleep(__FILE__, __LINE__, PREEMPT_OFFSET);	\
> > -	__cond_resched_lock(lock);				\
> > +#define cond_resched_lock(lock) ({				  \
> > +	__might_sleep(__FILE__, __LINE__, spin_is_locked(lock) ?  \
> > +					  PREEMPT_OFFSET : 0);	  \
> > +	__cond_resched_lock(lock);				  \
> >  })
> >  
> >  extern int __cond_resched_softirq(void);
> 
> 
> No, this looks utterly broken.. who is to say it doesn't get unlocked
> right after that spin_is_locked() check?
> 
> cond_resched_lock() callers must hold the lock they use it on, not doing
> so is broken.
> 
> So I would suggest something like the below instead:
> 
> (utterly untested)

FYI, i've undone the tip:sched/core bits from tip:master for now - 
please send a delta patch against tip:sched/core once this is fixed.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [tip:sched/core] sched: Pull up the might_sleep() check into  cond_resched()
  2009-07-20 11:39         ` Peter Zijlstra
@ 2009-07-22 17:17           ` Frédéric Weisbecker
  2009-07-22 17:56             ` Peter Zijlstra
  0 siblings, 1 reply; 25+ messages in thread
From: Frédéric Weisbecker @ 2009-07-22 17:17 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Li Zefan, hpa, linux-kernel, tglx, mingo, linux-tip-commits

2009/7/20 Peter Zijlstra <a.p.zijlstra@chello.nl>
>
> On Mon, 2009-07-20 at 04:12 -0400, Frederic Weisbecker wrote:
>
> > Hm, I can read that in fs/dcache.c:512
> >
> > /* dentry->d_lock was dropped in prune_one_dentry() */
> > cond_resched_lock(&dcache_lock);
>
> dentry->d_lock != dcache_lock
>
> > Isn't it a misuse of cond_resched_lock() ?
> > In this case, dcache.c should be fixed.
>
> Both sites look good wrt locking. So there's something else fishy.
>

(Resend in text/plain, sorry.)


Ah, I think I've got it. With !CONFIG_PREEMPT, the spinlocks of
course don't touch preemption, so taking one leaves preempt_count()
unchanged. The assumption that preempt_count() == 1 while the lock
is held is then false.
I'll prepare a patch.
Thanks.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [tip:sched/core] sched: Pull up the might_sleep() check into  cond_resched()
  2009-07-22 17:17           ` Frédéric Weisbecker
@ 2009-07-22 17:56             ` Peter Zijlstra
  0 siblings, 0 replies; 25+ messages in thread
From: Peter Zijlstra @ 2009-07-22 17:56 UTC (permalink / raw)
  To: Frédéric Weisbecker
  Cc: Li Zefan, hpa, linux-kernel, tglx, mingo, linux-tip-commits

On Wed, 2009-07-22 at 19:17 +0200, Frédéric Weisbecker wrote:

> Ah, I think I've got it. With !CONFIG_PREEMPT, the spinlocks of
> course don't touch preemption, so taking one leaves preempt_count()
> unchanged. The assumption that preempt_count() == 1 while the lock
> is held is then false.

D'0h indeed. And I didn't hit that because the last time I built a
!CONFIG_PREEMPT kernel was many years ago ;-)

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2009-07-22 17:57 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-07-16  6:28 [PATCH 1/7] sched: Drop the need_resched() loop from cond_resched() Frederic Weisbecker
2009-07-16  6:28 ` [PATCH 2/7] sched: Remove obsolete comment in __cond_resched() Frederic Weisbecker
2009-07-18 14:21   ` [tip:sched/core] " tip-bot for Frederic Weisbecker
2009-07-16  6:28 ` [PATCH 3/7] sched: Cover the CONFIG_DEBUG_SPINLOCK_SLEEP off-case for __might_sleep() Frederic Weisbecker
2009-07-18 14:21   ` [tip:sched/core] " tip-bot for Frederic Weisbecker
2009-07-16  6:28 ` [PATCH 4/7] sched: Add a preempt count base offset to __might_sleep() Frederic Weisbecker
2009-07-16 14:14   ` Peter Zijlstra
2009-07-16 14:34     ` Peter Zijlstra
2009-07-16 14:42       ` Frederic Weisbecker
2009-07-18 14:22   ` [tip:sched/core] " tip-bot for Frederic Weisbecker
2009-07-16  6:28 ` [PATCH 5/7] sched: Remove the CONFIG_PREEMPT_BKL case definition of cond_resched() Frederic Weisbecker
2009-07-18 14:22   ` [tip:sched/core] " tip-bot for Frederic Weisbecker
2009-07-16  6:28 ` [PATCH 6/7 v3] sched: Pull up the might_sleep() check in cond_resched() Frederic Weisbecker
2009-07-18 14:22   ` [tip:sched/core] sched: Pull up the might_sleep() check into cond_resched() tip-bot for Frederic Weisbecker
2009-07-20  6:50     ` Li Zefan
2009-07-20  8:12       ` Frederic Weisbecker
2009-07-20  8:49         ` Peter Zijlstra
2009-07-20  9:13           ` Frederic Weisbecker
2009-07-20 11:56           ` Ingo Molnar
2009-07-20 11:39         ` Peter Zijlstra
2009-07-22 17:17           ` Frédéric Weisbecker
2009-07-22 17:56             ` Peter Zijlstra
2009-07-16  6:28 ` [PATCH 7/7] sched: Convert the only user of cond_resched_bkl to use cond_resched() Frederic Weisbecker
2009-07-18 14:22   ` [tip:sched/core] " tip-bot for Frederic Weisbecker
2009-07-18 14:21 ` [tip:sched/core] sched: Drop the need_resched() loop from cond_resched() tip-bot for Frederic Weisbecker

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox