netdev.vger.kernel.org archive mirror
* Re: [RFC][PATCH 0/7] nested sleeps, fixes and debug infra
       [not found] <20140804103025.478913141@infradead.org>
@ 2014-08-05  8:33 ` Ilya Dryomov
  2014-08-05 13:06   ` Peter Zijlstra
  0 siblings, 1 reply; 6+ messages in thread
From: Ilya Dryomov @ 2014-08-05  8:33 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, oleg, Linus Torvalds, tglx, Mike Galbraith,
	Linux Kernel Mailing List, netdev, linux-mm

On Mon, Aug 4, 2014 at 2:30 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> Hi,
>
> Ilya recently tripped over a nested sleep which made Ingo suggest we should
> have debug checks for that. So I did some, see patch 7. Of course that
> triggered a whole bunch of fail the instant I tried to boot my machine.
>
> With this series I can boot my test box and build a kernel on it, I'm fairly
> sure that's far too limited a test to have found all, but it's a start.

FWIW, I'm getting a lot of these during light rbd testing.  CC'ed
netdev and linux-mm.

WARNING: CPU: 2 PID: 1978 at kernel/sched/core.c:7094 __might_sleep+0x5b/0x1e0()
do not call blocking ops when !TASK_RUNNING; state=1 set at
[<ffffffff81070640>] prepare_to_wait+0x50/0xa0
Modules linked in:
CPU: 2 PID: 1978 Comm: ceph-osd Not tainted 3.16.0-vm+ #109
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
 0000000000001bb6 ffff8800126739e8 ffffffff8156ec1d 0000000000000000
 ffff880012673a38 ffff880012673a28 ffffffff81032c27 ffff880012673a58
 0000000000000200 ffff8800150fa060 00000000000007ad ffffffff817ed352
Call Trace:
 [<ffffffff8156ec1d>] dump_stack+0x4f/0x7c
 [<ffffffff81032c27>] warn_slowpath_common+0x87/0xb0
 [<ffffffff81032cf1>] warn_slowpath_fmt+0x41/0x50
 [<ffffffff814f23cf>] ? tcp_v4_do_rcv+0x10f/0x4a0
 [<ffffffff81070640>] ? prepare_to_wait+0x50/0xa0
 [<ffffffff81070640>] ? prepare_to_wait+0x50/0xa0
 [<ffffffff8105b53b>] __might_sleep+0x5b/0x1e0
 [<ffffffff8148d73d>] release_sock+0x13d/0x200
 [<ffffffff81498223>] sk_stream_wait_memory+0x133/0x2d0
 [<ffffffff810701d0>] ? woken_wake_function+0x10/0x10
 [<ffffffff814dfdbf>] tcp_sendmsg+0xb6f/0xd70
 [<ffffffff815096cf>] inet_sendmsg+0xdf/0x100
 [<ffffffff815095f0>] ? inet_recvmsg+0x100/0x100
 [<ffffffff814896d7>] sock_sendmsg+0x67/0x90
 [<ffffffff810fd961>] ? might_fault+0x51/0xb0
 [<ffffffff81489a22>] ___sys_sendmsg+0x2d2/0x2e0
 [<ffffffff81095e58>] ? futex_wake+0x128/0x140
 [<ffffffff81095d31>] ? futex_wake+0x1/0x140
 [<ffffffff81141dd0>] ? do_dup2+0xd0/0xd0
 [<ffffffff8105fa31>] ? get_parent_ip+0x11/0x50
 [<ffffffff813cea27>] ? debug_smp_processor_id+0x17/0x20
 [<ffffffff813c33c5>] ? delay_tsc+0x85/0xb0
 [<ffffffff81141ead>] ? __fget+0xdd/0xf0
 [<ffffffff81141dd0>] ? do_dup2+0xd0/0xd0
 [<ffffffff81141f05>] ? __fget_light+0x45/0x60
 [<ffffffff81141f2e>] ? __fdget+0xe/0x10
 [<ffffffff8148a4e4>] __sys_sendmsg+0x44/0x70
 [<ffffffff8148a519>] SyS_sendmsg+0x9/0x10
 [<ffffffff81575b92>] system_call_fastpath+0x16/0x1b

WARNING: CPU: 0 PID: 380 at kernel/sched/core.c:7094 __might_sleep+0x5b/0x1e0()
do not call blocking ops when !TASK_RUNNING; state=1 set at
[<ffffffff81070640>] prepare_to_wait+0x50/0xa0
Modules linked in:
CPU: 0 PID: 380 Comm: kswapd0 Tainted: G        W     3.16.0-vm+ #109
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
 0000000000001bb6 ffff88007b64bc68 ffffffff8156ec1d 0000000000000000
 ffff88007b64bcb8 ffff88007b64bca8 ffffffff81032c27 0000000000000000
 0000000000000000 ffff88007c062060 0000000000000065 ffffffff8179ca1f
Call Trace:
 [<ffffffff8156ec1d>] dump_stack+0x4f/0x7c
 [<ffffffff81032c27>] warn_slowpath_common+0x87/0xb0
 [<ffffffff81032cf1>] warn_slowpath_fmt+0x41/0x50
 [<ffffffff81070640>] ? prepare_to_wait+0x50/0xa0
 [<ffffffff81070640>] ? prepare_to_wait+0x50/0xa0
 [<ffffffff8105b53b>] __might_sleep+0x5b/0x1e0
 [<ffffffff810f7fd3>] __reset_isolation_suitable+0x83/0x140
 [<ffffffff810f83f3>] reset_isolation_suitable+0x33/0x50
 [<ffffffff810eb717>] kswapd+0x2e7/0x4d0
 [<ffffffff810701d0>] ? woken_wake_function+0x10/0x10
 [<ffffffff810eb430>] ? balance_pgdat+0x5b0/0x5b0
 [<ffffffff810539ab>] kthread+0xfb/0x110
 [<ffffffff810538b0>] ? flush_kthread_worker+0x130/0x130
 [<ffffffff81575aec>] ret_from_fork+0x7c/0xb0
 [<ffffffff810538b0>] ? flush_kthread_worker+0x130/0x130

Thanks,

                Ilya

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

* Re: [RFC][PATCH 0/7] nested sleeps, fixes and debug infra
  2014-08-05  8:33 ` [RFC][PATCH 0/7] nested sleeps, fixes and debug infra Ilya Dryomov
@ 2014-08-05 13:06   ` Peter Zijlstra
  2014-08-06  7:51     ` Ilya Dryomov
  0 siblings, 1 reply; 6+ messages in thread
From: Peter Zijlstra @ 2014-08-05 13:06 UTC (permalink / raw)
  To: Ilya Dryomov
  Cc: Ingo Molnar, oleg, Linus Torvalds, tglx, Mike Galbraith,
	Linux Kernel Mailing List, netdev, linux-mm

On Tue, Aug 05, 2014 at 12:33:16PM +0400, Ilya Dryomov wrote:
> On Mon, Aug 4, 2014 at 2:30 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> > Hi,
> >
> > Ilya recently tripped over a nested sleep which made Ingo suggest we should
> > have debug checks for that. So I did some, see patch 7. Of course that
> > triggered a whole bunch of fail the instant I tried to boot my machine.
> >
> > With this series I can boot my test box and build a kernel on it, I'm fairly
> > sure that's far too limited a test to have found all, but it's a start.
> 
> FWIW, I'm getting a lot of these during light rbd testing.  CC'ed
> netdev and linux-mm.

Both are cond_resched() calls, and that's not blocking as such, just a
preemption point, so let's exclude those.

From the school of '_' are free:

---
 include/linux/kernel.h |    3 +++
 include/linux/sched.h  |    6 +++---
 kernel/sched/core.c    |   12 +++++++++---
 3 files changed, 15 insertions(+), 6 deletions(-)

--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -162,6 +162,7 @@ extern int _cond_resched(void);
 #endif
 
 #ifdef CONFIG_DEBUG_ATOMIC_SLEEP
+  void ___might_sleep(const char *file, int line, int preempt_offset);
   void __might_sleep(const char *file, int line, int preempt_offset);
 /**
  * might_sleep - annotation for functions that can sleep
@@ -176,6 +177,8 @@ extern int _cond_resched(void);
 # define might_sleep() \
 	do { __might_sleep(__FILE__, __LINE__, 0); might_resched(); } while (0)
 #else
+  static inline void ___might_sleep(const char *file, int line,
+				   int preempt_offset) { }
   static inline void __might_sleep(const char *file, int line,
 				   int preempt_offset) { }
 # define might_sleep() do { might_resched(); } while (0)
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2754,7 +2754,7 @@ static inline int signal_pending_state(l
 extern int _cond_resched(void);
 
 #define cond_resched() ({			\
-	__might_sleep(__FILE__, __LINE__, 0);	\
+	___might_sleep(__FILE__, __LINE__, 0);	\
 	_cond_resched();			\
 })
 
@@ -2767,14 +2767,14 @@ extern int __cond_resched_lock(spinlock_
 #endif
 
 #define cond_resched_lock(lock) ({				\
-	__might_sleep(__FILE__, __LINE__, PREEMPT_LOCK_OFFSET);	\
+	___might_sleep(__FILE__, __LINE__, PREEMPT_LOCK_OFFSET);\
 	__cond_resched_lock(lock);				\
 })
 
 extern int __cond_resched_softirq(void);
 
 #define cond_resched_softirq() ({					\
-	__might_sleep(__FILE__, __LINE__, SOFTIRQ_DISABLE_OFFSET);	\
+	___might_sleep(__FILE__, __LINE__, SOFTIRQ_DISABLE_OFFSET);	\
 	__cond_resched_softirq();					\
 })
 
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7078,8 +7078,6 @@ static inline int preempt_count_equals(i
 
 void __might_sleep(const char *file, int line, int preempt_offset)
 {
-	static unsigned long prev_jiffy;	/* ratelimiting */
-
 	/*
 	 * Blocking primitives will set (and therefore destroy) current->state,
 	 * since we will exit with TASK_RUNNING make sure we enter with it,
@@ -7093,6 +7091,14 @@ void __might_sleep(const char *file, int
 			(void *)current->task_state_change))
 		__set_current_state(TASK_RUNNING);
 
+	___might_sleep(file, line, preempt_offset);
+}
+EXPORT_SYMBOL(__might_sleep);
+
+void ___might_sleep(const char *file, int line, int preempt_offset)
+{
+	static unsigned long prev_jiffy;	/* ratelimiting */
+
 	rcu_sleep_check(); /* WARN_ON_ONCE() by default, no rate limit reqd. */
 	if ((preempt_count_equals(preempt_offset) && !irqs_disabled() &&
 	     !is_idle_task(current)) ||
@@ -7122,7 +7128,7 @@ void __might_sleep(const char *file, int
 #endif
 	dump_stack();
 }
-EXPORT_SYMBOL(__might_sleep);
+EXPORT_SYMBOL(___might_sleep);
 #endif
 
 #ifdef CONFIG_MAGIC_SYSRQ
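
Editorial sketch: the split made by the patch above can be modelled in
user space to show why cond_resched() no longer trips the task-state
warning while a genuinely blocking call still does. This is only a rough
model under stated assumptions, not kernel code: every model_* name is
hypothetical, the task state and preempt count are plain struct fields,
and the warnings are counters rather than WARN output.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical user-space stand-ins; none of these names are kernel API. */
enum model_task_state { MODEL_TASK_RUNNING = 0, MODEL_TASK_INTERRUPTIBLE = 1 };

struct model_task {
	enum model_task_state state; /* models current->state */
	int preempt_count;           /* models preempt_count() */
	int state_warnings;          /* "blocking op when !TASK_RUNNING" reports */
	int atomic_warnings;         /* "sleeping in atomic context" reports */
};

/* Inner check, modelled on ___might_sleep(): atomic-context test only. */
static void model_atomic_check(struct model_task *t, int preempt_offset)
{
	if (t->preempt_count != preempt_offset)
		t->atomic_warnings++;
}

/* Outer check, modelled on __might_sleep(): task-state test, then inner. */
static void model_blocking_check(struct model_task *t, int preempt_offset)
{
	if (t->state != MODEL_TASK_RUNNING) {
		t->state_warnings++;
		t->state = MODEL_TASK_RUNNING; /* state is reset after warning */
	}
	model_atomic_check(t, preempt_offset);
}

/* A preemption point: after the patch it uses only the inner check. */
static void model_cond_resched(struct model_task *t)
{
	model_atomic_check(t, 0);
}

/* A genuinely blocking operation, e.g. taking a sleeping lock. */
static void model_blocking_op(struct model_task *t)
{
	model_blocking_check(t, 0);
}

/*
 * Issue one preemption point and one blocking call as if both ran between
 * prepare_to_wait() and schedule(). Returns the task-state warning count.
 */
static int model_demo(void)
{
	struct model_task t = { MODEL_TASK_INTERRUPTIBLE, 0, 0, 0 };

	model_cond_resched(&t);  /* no warning, state left untouched */
	assert(t.state == MODEL_TASK_INTERRUPTIBLE);

	model_blocking_op(&t);   /* one warning, state forced back to running */
	printf("state warnings: %d, atomic warnings: %d\n",
	       t.state_warnings, t.atomic_warnings);
	return t.state_warnings;
}
```

Calling model_demo() from a main() exercises both paths; the only
task-state warning comes from model_blocking_op(), which mirrors the
intent of routing the cond_resched*() macros to ___might_sleep().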

* Re: [RFC][PATCH 0/7] nested sleeps, fixes and debug infra
  2014-08-05 13:06   ` Peter Zijlstra
@ 2014-08-06  7:51     ` Ilya Dryomov
  2014-08-06  8:31       ` Peter Zijlstra
  0 siblings, 1 reply; 6+ messages in thread
From: Ilya Dryomov @ 2014-08-06  7:51 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, oleg, Linus Torvalds, tglx, Mike Galbraith,
	Linux Kernel Mailing List, netdev, linux-mm

On Tue, Aug 5, 2014 at 5:06 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Tue, Aug 05, 2014 at 12:33:16PM +0400, Ilya Dryomov wrote:
>> On Mon, Aug 4, 2014 at 2:30 PM, Peter Zijlstra <peterz@infradead.org> wrote:
>> > Hi,
>> >
>> > Ilya recently tripped over a nested sleep which made Ingo suggest we should
>> > have debug checks for that. So I did some, see patch 7. Of course that
>> > triggered a whole bunch of fail the instant I tried to boot my machine.
>> >
>> > With this series I can boot my test box and build a kernel on it, I'm fairly
>> > sure that's far too limited a test to have found all, but it's a start.
>>
>> FWIW, I'm getting a lot of these during light rbd testing.  CC'ed
>> netdev and linux-mm.
>
> Both are cond_resched() calls, and that's not blocking as such, just a
> preemption point, so let's exclude those.

OK, this one is a bit different.

WARNING: CPU: 1 PID: 1744 at kernel/sched/core.c:7104 __might_sleep+0x58/0x90()
do not call blocking ops when !TASK_RUNNING; state=1 set at
[<ffffffff81070e10>] prepare_to_wait+0x50/0xa0
Modules linked in:
CPU: 1 PID: 1744 Comm: lt-ceph_test_li Not tainted 3.16.0-vm+ #113
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
 0000000000001bc0 ffff88006c4479d8 ffffffff8156f455 0000000000000000
 ffff88006c447a28 ffff88006c447a18 ffffffff81033357 0000000000000001
 0000000000000000 0000000000000950 ffffffff817ee48a ffff88006dba6120
Call Trace:
 [<ffffffff8156f455>] dump_stack+0x4f/0x7c
 [<ffffffff81033357>] warn_slowpath_common+0x87/0xb0
 [<ffffffff81033421>] warn_slowpath_fmt+0x41/0x50
 [<ffffffff81078bb2>] ? trace_hardirqs_on_caller+0x182/0x1f0
 [<ffffffff81070e10>] ? prepare_to_wait+0x50/0xa0
 [<ffffffff81070e10>] ? prepare_to_wait+0x50/0xa0
 [<ffffffff8105bc38>] __might_sleep+0x58/0x90
 [<ffffffff8148c671>] lock_sock_nested+0x31/0xb0
 [<ffffffff8148dfeb>] ? release_sock+0x1bb/0x200
 [<ffffffff81498aaa>] sk_stream_wait_memory+0x18a/0x2d0
 [<ffffffff810709a0>] ? woken_wake_function+0x10/0x10
 [<ffffffff814e058f>] tcp_sendmsg+0xb6f/0xd70
 [<ffffffff81509e9f>] inet_sendmsg+0xdf/0x100
 [<ffffffff81509dc0>] ? inet_recvmsg+0x100/0x100
 [<ffffffff81489f07>] sock_sendmsg+0x67/0x90
 [<ffffffff810fe391>] ? might_fault+0x51/0xb0
 [<ffffffff8148a252>] ___sys_sendmsg+0x2d2/0x2e0
 [<ffffffff811428a0>] ? do_dup2+0xd0/0xd0
 [<ffffffff811428a0>] ? do_dup2+0xd0/0xd0
 [<ffffffff8105bfe0>] ? finish_task_switch+0x50/0x100
 [<ffffffff811429d5>] ? __fget_light+0x45/0x60
 [<ffffffff811429fe>] ? __fdget+0xe/0x10
 [<ffffffff8148ad14>] __sys_sendmsg+0x44/0x70
 [<ffffffff8148ad49>] SyS_sendmsg+0x9/0x10
 [<ffffffff815764d2>] system_call_fastpath+0x16/0x1b

Thanks,

                Ilya

* Re: [RFC][PATCH 0/7] nested sleeps, fixes and debug infra
  2014-08-06  7:51     ` Ilya Dryomov
@ 2014-08-06  8:31       ` Peter Zijlstra
  2014-08-06 21:16         ` David Miller
  0 siblings, 1 reply; 6+ messages in thread
From: Peter Zijlstra @ 2014-08-06  8:31 UTC (permalink / raw)
  To: Ilya Dryomov
  Cc: Ingo Molnar, oleg, Linus Torvalds, tglx, Mike Galbraith,
	Linux Kernel Mailing List, netdev, linux-mm

On Wed, Aug 06, 2014 at 11:51:29AM +0400, Ilya Dryomov wrote:

> OK, this one is a bit different.
> 
> WARNING: CPU: 1 PID: 1744 at kernel/sched/core.c:7104 __might_sleep+0x58/0x90()
> do not call blocking ops when !TASK_RUNNING; state=1 set at [<ffffffff81070e10>] prepare_to_wait+0x50/0xa0

>  [<ffffffff8105bc38>] __might_sleep+0x58/0x90
>  [<ffffffff8148c671>] lock_sock_nested+0x31/0xb0
>  [<ffffffff81498aaa>] sk_stream_wait_memory+0x18a/0x2d0

Urgh, tedious. It's not an actual bug as is. Due to the condition check
in sk_wait_event() we can call lock_sock() with ->state != TASK_RUNNING.

I'm not entirely sure what the cleanest way is to make this go away.
Possibly something like so:

---
 include/net/sock.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/net/sock.h b/include/net/sock.h
index 156350745700..37902176c5ab 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -886,6 +886,7 @@ static inline void sock_rps_reset_rxhash(struct sock *sk)
 		if (!__rc) {						\
 			*(__timeo) = schedule_timeout(*(__timeo));	\
 		}							\
+		__set_current_state(TASK_RUNNING);			\
 		lock_sock(__sk);					\
 		__rc = __condition;					\
 		__rc;							\
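
Editorial sketch: the control flow behind this warning can be modelled
in user space. The assumption, taken from the explanation above and the
visible tail of the macro, is that the caller has already run
prepare_to_wait() and that schedule_timeout() is skipped when the
condition is already true, so lock_sock() is reached with the task state
still set. All model_* names are hypothetical and the warning is just a
counter.

```c
#include <stdbool.h>

/* Hypothetical user-space stand-ins; none of these names are kernel API. */
enum model_state { MODEL_RUNNING, MODEL_INTERRUPTIBLE };

struct model_ctx {
	enum model_state state; /* models current->state */
	int warnings;           /* "blocking op when !TASK_RUNNING" reports */
};

/* Models prepare_to_wait(): marks the task as about to sleep. */
static void model_prepare_to_wait(struct model_ctx *c)
{
	c->state = MODEL_INTERRUPTIBLE;
}

/* Models schedule_timeout(): the task is running again once it returns. */
static void model_schedule_timeout(struct model_ctx *c)
{
	c->state = MODEL_RUNNING;
}

/* Models lock_sock(): may sleep, so it carries the debug check. */
static void model_lock_sock(struct model_ctx *c)
{
	if (c->state != MODEL_RUNNING) {
		c->warnings++;
		c->state = MODEL_RUNNING;
	}
}

/*
 * Models one pass through the wait path. 'condition' is the result of the
 * first condition check; 'reset_state' selects whether the one-line fix
 * (setting the state back to running before lock_sock()) is applied.
 * Returns the number of warnings recorded so far.
 */
static int model_wait_event(struct model_ctx *c, bool condition,
			    bool reset_state)
{
	model_prepare_to_wait(c);
	if (!condition)
		model_schedule_timeout(c); /* sleeping clears the state */
	if (reset_state)
		c->state = MODEL_RUNNING;  /* the added line in the patch */
	model_lock_sock(c);
	return c->warnings;
}
```

With condition true and reset_state false the model records one warning;
with reset_state true, or with condition false so that the sleep actually
happens, it records none. That matches the description of the report as a
false positive rather than a real nested sleep.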

* Re: [RFC][PATCH 0/7] nested sleeps, fixes and debug infra
  2014-08-06  8:31       ` Peter Zijlstra
@ 2014-08-06 21:16         ` David Miller
  2014-08-07  8:10           ` Peter Zijlstra
  0 siblings, 1 reply; 6+ messages in thread
From: David Miller @ 2014-08-06 21:16 UTC (permalink / raw)
  To: peterz
  Cc: ilya.dryomov, mingo, oleg, torvalds, tglx, umgwanakikbuti,
	linux-kernel, netdev, linux-mm

From: Peter Zijlstra <peterz@infradead.org>
Date: Wed, 6 Aug 2014 10:31:34 +0200

> On Wed, Aug 06, 2014 at 11:51:29AM +0400, Ilya Dryomov wrote:
> 
>> OK, this one is a bit different.
>> 
>> WARNING: CPU: 1 PID: 1744 at kernel/sched/core.c:7104 __might_sleep+0x58/0x90()
>> do not call blocking ops when !TASK_RUNNING; state=1 set at [<ffffffff81070e10>] prepare_to_wait+0x50/0xa0
> 
>>  [<ffffffff8105bc38>] __might_sleep+0x58/0x90
>>  [<ffffffff8148c671>] lock_sock_nested+0x31/0xb0
>>  [<ffffffff81498aaa>] sk_stream_wait_memory+0x18a/0x2d0
> 
> Urgh, tedious. It's not an actual bug as is. Due to the condition check
> in sk_wait_event() we can call lock_sock() with ->state != TASK_RUNNING.
> 
> I'm not entirely sure what the cleanest way is to make this go away.
> Possibly something like so:

If you submit this formally to netdev with a signoff I'm willing to apply
this if it helps the debug infrastructure.

* Re: [RFC][PATCH 0/7] nested sleeps, fixes and debug infra
  2014-08-06 21:16         ` David Miller
@ 2014-08-07  8:10           ` Peter Zijlstra
  0 siblings, 0 replies; 6+ messages in thread
From: Peter Zijlstra @ 2014-08-07  8:10 UTC (permalink / raw)
  To: David Miller
  Cc: ilya.dryomov, mingo, oleg, torvalds, tglx, umgwanakikbuti,
	linux-kernel, netdev, linux-mm

On Wed, Aug 06, 2014 at 02:16:03PM -0700, David Miller wrote:
> From: Peter Zijlstra <peterz@infradead.org>
> Date: Wed, 6 Aug 2014 10:31:34 +0200
> 
> > On Wed, Aug 06, 2014 at 11:51:29AM +0400, Ilya Dryomov wrote:
> > 
> >> OK, this one is a bit different.
> >> 
> >> WARNING: CPU: 1 PID: 1744 at kernel/sched/core.c:7104 __might_sleep+0x58/0x90()
> > >> do not call blocking ops when !TASK_RUNNING; state=1 set at [<ffffffff81070e10>] prepare_to_wait+0x50/0xa0
> > 
> >>  [<ffffffff8105bc38>] __might_sleep+0x58/0x90
> >>  [<ffffffff8148c671>] lock_sock_nested+0x31/0xb0
> >>  [<ffffffff81498aaa>] sk_stream_wait_memory+0x18a/0x2d0
> > 
> > > Urgh, tedious. It's not an actual bug as is. Due to the condition check
> > in sk_wait_event() we can call lock_sock() with ->state != TASK_RUNNING.
> > 
> > I'm not entirely sure what the cleanest way is to make this go away.
> > Possibly something like so:
> 
> If you submit this formally to netdev with a signoff I'm willing to apply
> this if it helps the debug infrastructure.

Thanks, for now I'm just collecting things to see how far I can take
this. But I'll certainly include you and netdev on a next posting.
