public inbox for linux-kernel@vger.kernel.org
* 2.6.21-rc7-mm1 + sysfs-oops-workaround.patch -- software suspend failed (1 tasks refusing to freeze)
@ 2007-04-25  5:27 Miles Lane
  2007-04-25  5:41 ` Andrew Morton
  0 siblings, 1 reply; 10+ messages in thread
From: Miles Lane @ 2007-04-25  5:27 UTC (permalink / raw)
  To: Andrew Morton, LKML

[ 1251.506964] PM: Preparing system for mem sleep
[ 1251.514790] Stopping tasks ...
[ 1271.456065] Stopping user space processes timed out after 20 seconds (1 tasks refusing to freeze):
[ 1271.456243]  multiload-apple
[ 1271.456291] Restarting tasks ... done.

This isn't happening under earlier builds I've tested.  How can I debug this?

Thanks,
          Miles


* Re: 2.6.21-rc7-mm1 + sysfs-oops-workaround.patch -- software suspend failed (1 tasks refusing to freeze)
  2007-04-25  5:27 2.6.21-rc7-mm1 + sysfs-oops-workaround.patch -- software suspend failed (1 tasks refusing to freeze) Miles Lane
@ 2007-04-25  5:41 ` Andrew Morton
  2007-04-25  5:49   ` Miles Lane
  0 siblings, 1 reply; 10+ messages in thread
From: Andrew Morton @ 2007-04-25  5:41 UTC (permalink / raw)
  To: Miles Lane; +Cc: LKML, Rafael J. Wysocki, Oleg Nesterov

On Tue, 24 Apr 2007 22:27:44 -0700 "Miles Lane" <miles.lane@gmail.com> wrote:

> [ 1251.506964] PM: Preparing system for mem sleep
> [ 1251.514790] Stopping tasks ...
> [ 1271.456065] Stopping user space processes timed out after 20
> seconds (1 tasks refusing to freeze):
> [ 1271.456243]  multiload-apple
> [ 1271.456291] Restarting tasks ... done.
> 
> This isn't happening under earlier builds I've tested.  How can I debug this?
> 

hm, that's multiload-applet, some gnome thing.

sysrq-T, perhaps?  Perhaps the process is sleeping in the kernel somewhere.
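
For reference, the same task dump can be requested without a keyboard by writing the command character to /proc/sysrq-trigger. The following is only a minimal sketch: the helper name trigger_sysrq is made up for illustration, and it assumes the kernel was built with magic SysRq support and that the caller is root.

```python
from pathlib import Path

# Standard procfs trigger file for magic SysRq commands.
SYSRQ_TRIGGER = Path("/proc/sysrq-trigger")


def trigger_sysrq(key: str, trigger: Path = SYSRQ_TRIGGER) -> None:
    """Write one SysRq command character (hypothetical helper name)."""
    if len(key) != 1:
        raise ValueError("SysRq commands are a single character")
    # "t" asks the kernel to dump the state of every task to the kernel log.
    trigger.write_text(key)
```

Calling trigger_sysrq("t") as root right after the failed suspend and then reading dmesg should show what each task was doing, including the one that refused to freeze.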



* Re: 2.6.21-rc7-mm1 + sysfs-oops-workaround.patch -- software suspend failed (1 tasks refusing to freeze)
  2007-04-25  5:41 ` Andrew Morton
@ 2007-04-25  5:49   ` Miles Lane
  2007-04-25  5:54     ` Andrew Morton
  0 siblings, 1 reply; 10+ messages in thread
From: Miles Lane @ 2007-04-25  5:49 UTC (permalink / raw)
  To: Andrew Morton; +Cc: LKML, Rafael J. Wysocki, Oleg Nesterov

On 4/24/07, Andrew Morton <akpm@linux-foundation.org> wrote:
> On Tue, 24 Apr 2007 22:27:44 -0700 "Miles Lane" <miles.lane@gmail.com> wrote:
>
> > [ 1251.506964] PM: Preparing system for mem sleep
> > [ 1251.514790] Stopping tasks ...
> > [ 1271.456065] Stopping user space processes timed out after 20
> > seconds (1 tasks refusing to freeze):
> > [ 1271.456243]  multiload-apple
> > [ 1271.456291] Restarting tasks ... done.
> >
> > This isn't happening under earlier builds I've tested.  How can I debug this?
> >
>
> hm, that's multiload-applet, some gnome thing.
>
> sysrq-T, perhaps?  Perhaps the process is sleeping in the kernel somewhere.

Should I wait for the next patch from Tejun before retesting?  Perhaps
this suspend problem is a side effect of the locking problem he
mentioned.

         Miles


* Re: 2.6.21-rc7-mm1 + sysfs-oops-workaround.patch -- software suspend failed (1 tasks refusing to freeze)
  2007-04-25  5:49   ` Miles Lane
@ 2007-04-25  5:54     ` Andrew Morton
  2007-04-25  6:41       ` Miles Lane
  0 siblings, 1 reply; 10+ messages in thread
From: Andrew Morton @ 2007-04-25  5:54 UTC (permalink / raw)
  To: Miles Lane; +Cc: LKML, Rafael J. Wysocki, Oleg Nesterov

On Tue, 24 Apr 2007 22:49:48 -0700 "Miles Lane" <miles.lane@gmail.com> wrote:

> On 4/24/07, Andrew Morton <akpm@linux-foundation.org> wrote:
> > On Tue, 24 Apr 2007 22:27:44 -0700 "Miles Lane" <miles.lane@gmail.com> wrote:
> >
> > > [ 1251.506964] PM: Preparing system for mem sleep
> > > [ 1251.514790] Stopping tasks ...
> > > [ 1271.456065] Stopping user space processes timed out after 20
> > > seconds (1 tasks refusing to freeze):
> > > [ 1271.456243]  multiload-apple
> > > [ 1271.456291] Restarting tasks ... done.
> > >
> > > This isn't happening under earlier builds I've tested.  How can I debug this?
> > >
> >
> > hm, that's multiload-applet, some gnome thing.
> >
> > sysrq-T, perhaps?  Perhaps the process is sleeping in the kernel somewhere.
> 
> Should I wait for the next patch from Tejun before retesting?  Perhaps
> this suspend problem is a side effect of the locking problem he
> mentioned.

It's unlikely to be related to Tejun's sysfs changes.


* Re: 2.6.21-rc7-mm1 + sysfs-oops-workaround.patch -- software suspend failed (1 tasks refusing to freeze)
  2007-04-25  5:54     ` Andrew Morton
@ 2007-04-25  6:41       ` Miles Lane
  2007-04-25  7:07         ` Andrew Morton
  0 siblings, 1 reply; 10+ messages in thread
From: Miles Lane @ 2007-04-25  6:41 UTC (permalink / raw)
  To: Andrew Morton; +Cc: LKML, Rafael J. Wysocki, Oleg Nesterov

On 4/24/07, Andrew Morton <akpm@linux-foundation.org> wrote:
> On Tue, 24 Apr 2007 22:49:48 -0700 "Miles Lane" <miles.lane@gmail.com> wrote:
>
> > On 4/24/07, Andrew Morton <akpm@linux-foundation.org> wrote:
> > > On Tue, 24 Apr 2007 22:27:44 -0700 "Miles Lane" <miles.lane@gmail.com> wrote:
> > >
> > > > [ 1251.506964] PM: Preparing system for mem sleep
> > > > [ 1251.514790] Stopping tasks ...
> > > > [ 1271.456065] Stopping user space processes timed out after 20
> > > > seconds (1 tasks refusing to freeze):
> > > > [ 1271.456243]  multiload-apple
> > > > [ 1271.456291] Restarting tasks ... done.
> > > >
> > > > This isn't happening under earlier builds I've tested.  How can I debug this?
> > > >
> > >
> > > hm, that's multiload-applet, some gnome thing.
> > >
> > > sysrq-T, perhaps?  Perhaps the process is sleeping in the kernel somewhere.
> >
> > Should I wait for the next patch from Tejun before retesting?  Perhaps
> > this suspend problem is a side effect of the locking problem he
> > mentioned.
>
> It's unlikely to be related to Tejun's sysfs changes.
>
I tried to reproduce this, but this time when I tried to suspend, the
machine just hung with a message saying the system was
suspending.  When I touched my Synaptics mousepad after about ten
seconds of waiting, the system suddenly suspended.  Upon resuming, I
checked dmesg and found that the time seems to be totally out of whack:

[ 1334.589074] pci 0000:00:1f.6: LATE suspend
[ 1334.589080] Intel ICH 0000:00:1f.5: LATE suspend
[ 1334.589085] pci 0000:00:1f.3: LATE suspend
[ 1334.589091] pci 0000:00:1f.0: LATE suspend
[ 1334.589096] pci 0000:00:1e.0: LATE suspend, may wakeup
[ 1334.589102] pci 0000:00:02.1: LATE suspend
[ 1334.589108] pci 0000:00:02.0: LATE suspend
[ 1334.589113] pci 0000:00:00.3: LATE suspend
[ 1334.589118] pci 0000:00:00.1: LATE suspend
[ 1334.589124] agpgart-intel 0000:00:00.0: LATE suspend
[ 1334.589733]  hwsleep-0323 [03] enter_sleep_state     : Entering sleep state [S3]
[18014527.889728] Intel machine check architecture supported.
[18014527.889749] Intel machine check reporting enabled on CPU#0.
[18014527.889783] Back to C!
[18014527.889783] agpgart-intel 0000:00:00.0: EARLY resume
[18014527.889783] PCI: Calling quirk c01c94b2 for 0000:00:00.0


* Re: 2.6.21-rc7-mm1 + sysfs-oops-workaround.patch -- software suspend failed (1 tasks refusing to freeze)
  2007-04-25  6:41       ` Miles Lane
@ 2007-04-25  7:07         ` Andrew Morton
  2007-04-25  7:32           ` Andi Kleen
       [not found]           ` <a44ae5cd0704250048i474d7249l7d5b27f6adaf2742@mail.gmail.com>
  0 siblings, 2 replies; 10+ messages in thread
From: Andrew Morton @ 2007-04-25  7:07 UTC (permalink / raw)
  To: Miles Lane
  Cc: LKML, Rafael J. Wysocki, Oleg Nesterov, Gautham R Shenoy,
	Andi Kleen

On Tue, 24 Apr 2007 23:41:32 -0700 "Miles Lane" <miles.lane@gmail.com> wrote:

> On 4/24/07, Andrew Morton <akpm@linux-foundation.org> wrote:
> > On Tue, 24 Apr 2007 22:49:48 -0700 "Miles Lane" <miles.lane@gmail.com> wrote:
> >
> > > On 4/24/07, Andrew Morton <akpm@linux-foundation.org> wrote:
> > > > On Tue, 24 Apr 2007 22:27:44 -0700 "Miles Lane" <miles.lane@gmail.com> wrote:
> > > >
> > > > > [ 1251.506964] PM: Preparing system for mem sleep
> > > > > [ 1251.514790] Stopping tasks ...
> > > > > [ 1271.456065] Stopping user space processes timed out after 20
> > > > > seconds (1 tasks refusing to freeze):
> > > > > [ 1271.456243]  multiload-apple
> > > > > [ 1271.456291] Restarting tasks ... done.
> > > > >
> > > > > This isn't happening under earlier builds I've tested.  How can I debug this?
> > > > >
> > > >
> > > > hm, that's multiload-applet, some gnome thing.
> > > >
> > > > sysrq-T, perhaps?  Perhaps the process is sleeping in the kernel somewhere.
> > >
> > > Should I wait for the next patch from Tejun before retesting?  Perhaps
> > > this suspend problem is a side effect of the locking problem he
> > > mentioned.
> >
> > It's unlikely to be related to Tejun's sysfs changes.
> >
> I tried to reproduce this, but this time when I tried to suspend, the
> machine just hung with a message showing saying the system was
> suspending.  When I touched my Synaptics mousepad after about ten
> seconds of waiting, the system suddenly suspended.

Fun.  Could be one of Greg's trees, could be the swsusp patches, could be
the freezer changes, could be USB, could be the input layer, could be
anything.  I don't see how we can merge anything at all into 2.6.22 at
present.  The only thing to be said for doing that is that we'd increase
our pool of bisection-searchers.  argh.

Rafael, Oleg: do we have a way of exercising the freezer from userspace? 
Just do a freeze/unfreeze?  We should.

Also, can we get better diagnostics when the freeze fails?  Say, do the
equivalent of a sysrq-T?  It could be that multiload-applet is waiting on
activity from an already-frozen thread or something.

>  Upon resuming, I
> checked dmesg and found the time seems to be totally out of whack:

So sched_clock() went bad.  That's another tree or three we can't merge.

Is the system time also wrong?

> [ 1334.589074] pci 0000:00:1f.6: LATE suspend
> [ 1334.589080] Intel ICH 0000:00:1f.5: LATE suspend
> [ 1334.589085] pci 0000:00:1f.3: LATE suspend
> [ 1334.589091] pci 0000:00:1f.0: LATE suspend
> [ 1334.589096] pci 0000:00:1e.0: LATE suspend, may wakeup
> [ 1334.589102] pci 0000:00:02.1: LATE suspend
> [ 1334.589108] pci 0000:00:02.0: LATE suspend
> [ 1334.589113] pci 0000:00:00.3: LATE suspend
> [ 1334.589118] pci 0000:00:00.1: LATE suspend
> [ 1334.589124] agpgart-intel 0000:00:00.0: LATE suspend
> [ 1334.589733]  hwsleep-0323 [03] enter_sleep_state     : Entering
> sleep state [S3]
> [18014527.889728] Intel machine check architecture supported.
> [18014527.889749] Intel machine check reporting enabled on CPU#0.
> [18014527.889783] Back to C!
> [18014527.889783] agpgart-intel 0000:00:00.0: EARLY resume
> [18014527.889783] PCI: Calling quirk c01c94b2 for 0000:00:00.0
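
The size of the jump in the quoted log can be checked mechanically. This is a rough sketch only: find_clock_jumps and the 60-second threshold are illustrative choices, and the sample lines are copied from the dmesg excerpt above.

```python
import re

# printk timestamp at the start of a log line, e.g. "[ 1334.589733]".
TIMESTAMP = re.compile(r"\[\s*(\d+\.\d+)\]")


def find_clock_jumps(lines, max_gap=60.0):
    """Return (previous, current) timestamp pairs more than max_gap seconds apart."""
    jumps = []
    previous = None
    for line in lines:
        match = TIMESTAMP.search(line)
        if match is None:
            continue  # wrapped continuation lines carry no timestamp
        current = float(match.group(1))
        if previous is not None and abs(current - previous) > max_gap:
            jumps.append((previous, current))
        previous = current
    return jumps


log = """\
[ 1334.589124] agpgart-intel 0000:00:00.0: LATE suspend
[ 1334.589733]  hwsleep-0323 [03] enter_sleep_state     : Entering
sleep state [S3]
[18014527.889728] Intel machine check architecture supported.
[18014527.889749] Intel machine check reporting enabled on CPU#0.
""".splitlines()

for before, after in find_clock_jumps(log):
    days = (after - before) / 86400
    print(f"{before:.6f} -> {after:.6f}: jump of about {days:.1f} days")
```

On these lines the only discontinuity is across the S3 transition, and it amounts to roughly 208 days of apparent elapsed time, which is why the printk clock rather than the actual sleep duration is suspected.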


* Re: 2.6.21-rc7-mm1 + sysfs-oops-workaround.patch -- software suspend failed (1 tasks refusing to freeze)
  2007-04-25  7:07         ` Andrew Morton
@ 2007-04-25  7:32           ` Andi Kleen
       [not found]           ` <a44ae5cd0704250048i474d7249l7d5b27f6adaf2742@mail.gmail.com>
  1 sibling, 0 replies; 10+ messages in thread
From: Andi Kleen @ 2007-04-25  7:32 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Miles Lane, LKML, Rafael J. Wysocki, Oleg Nesterov,
	Gautham R Shenoy


> >  Upon resuming, I
> > checked dmesg and found the time seems to be totally out of whack:
> 
> So sched_clock() went bad.  That's another tree or three we can't merge.

Right now sched_clock doesn't have a callback to resync for suspend/resume. I will add one.

-Andi


* Re: 2.6.21-rc7-mm1 + sysfs-oops-workaround.patch -- software suspend failed (1 tasks refusing to freeze)
       [not found]           ` <a44ae5cd0704250048i474d7249l7d5b27f6adaf2742@mail.gmail.com>
@ 2007-04-25 19:52             ` Rafael J. Wysocki
       [not found]             ` <20070425011902.714d85d6.akpm@linux-foundation.org>
  1 sibling, 0 replies; 10+ messages in thread
From: Rafael J. Wysocki @ 2007-04-25 19:52 UTC (permalink / raw)
  To: Miles Lane
  Cc: Andrew Morton, LKML, Oleg Nesterov, Gautham R Shenoy, Andi Kleen

On Wednesday, 25 April 2007 09:48, Miles Lane wrote:
> On 4/25/07, Andrew Morton <akpm@linux-foundation.org> wrote:
> > On Tue, 24 Apr 2007 23:41:32 -0700 "Miles Lane" <miles.lane@gmail.com> wrote:
> >
> > > On 4/24/07, Andrew Morton <akpm@linux-foundation.org> wrote:
> > > > On Tue, 24 Apr 2007 22:49:48 -0700 "Miles Lane" <miles.lane@gmail.com> wrote:
> > > >
> > > > > On 4/24/07, Andrew Morton <akpm@linux-foundation.org> wrote:
> > > > > > On Tue, 24 Apr 2007 22:27:44 -0700 "Miles Lane" <miles.lane@gmail.com> wrote:
> > > > > >
> > > > > > > [ 1251.506964] PM: Preparing system for mem sleep
> > > > > > > [ 1251.514790] Stopping tasks ...
> > > > > > > [ 1271.456065] Stopping user space processes timed out after 20
> > > > > > > seconds (1 tasks refusing to freeze):
> > > > > > > [ 1271.456243]  multiload-apple
> > > > > > > [ 1271.456291] Restarting tasks ... done.
> > > > > > >
> > > > > > > This isn't happening under earlier builds I've tested.  How can I debug this?
> > > > > > >
> > > > > >
> > > > > > hm, that's multiload-applet, some gnome thing.
> > > > > >
> > > > > > sysrq-T, perhaps?  Perhaps the process is sleeping in the kernel somewhere.
> > > > >
> > > > > Should I wait for the next patch from Tejun before retesting?  Perhaps
> > > > > this suspend problem is a side effect of the locking problem he
> > > > > mentioned.
> > > >
> > > > It's unlikely to be related to Tejun's sysfs changes.
> > > >
> > > I tried to reproduce this, but this time when I tried to suspend, the
> > > machine just hung with a message showing saying the system was
> > > suspending.  When I touched my Synaptics mousepad after about ten
> > > seconds of waiting, the system suddenly suspended.
> >
> > Fun.  Could be one of Greg's trees, could be the swsusp patches, could be
> > the freezer changes, could be USB, could be the input layer, could be
> > anything.  I don't see how we can merge anything at all into 2.6.22 at
> > present.  The only thing to be said for doing that is that we'd increase
> > our pool of bisection-searchers.  argh.
> >
> > Rafael, Oleg: do we have a way of exercising the freezer from userspace?
> > Just do a freeze/unfreeze?  We should.
> >
> > Also, can we get better diagnostics when the freeze fails?  Say, go the
> > equivalent of a sysrq-T?  It could be that multiload-applet is waiting on
> > activity from an already-frozen thread or something.
> >
> > >  Upon resuming, I
> > > checked dmesg and found the time seems to be totally out of whack:
> >
> > So sched_clock() went bad.  That's another tree or three we can't merge.
> >
> > Is the system time also wrong?
> >
> > > [ 1334.589074] pci 0000:00:1f.6: LATE suspend
> > > [ 1334.589080] Intel ICH 0000:00:1f.5: LATE suspend
> > > [ 1334.589085] pci 0000:00:1f.3: LATE suspend
> > > [ 1334.589091] pci 0000:00:1f.0: LATE suspend
> > > [ 1334.589096] pci 0000:00:1e.0: LATE suspend, may wakeup
> > > [ 1334.589102] pci 0000:00:02.1: LATE suspend
> > > [ 1334.589108] pci 0000:00:02.0: LATE suspend
> > > [ 1334.589113] pci 0000:00:00.3: LATE suspend
> > > [ 1334.589118] pci 0000:00:00.1: LATE suspend
> > > [ 1334.589124] agpgart-intel 0000:00:00.0: LATE suspend
> > > [ 1334.589733]  hwsleep-0323 [03] enter_sleep_state     : Entering
> > > sleep state [S3]
> > > [18014527.889728] Intel machine check architecture supported.
> > > [18014527.889749] Intel machine check reporting enabled on CPU#0.
> > > [18014527.889783] Back to C!
> > > [18014527.889783] agpgart-intel 0000:00:00.0: EARLY resume
> > > [18014527.889783] PCI: Calling quirk c01c94b2 for 0000:00:00.0
> >
> 
> After several attempts, I reproduced getting tasks that won't freeze.
> Sorry for the size of this text.  I don't know how to delete the unneeded stuff.

Which wireless driver do you use?

Can you try to suspend with the driver unloaded?

Rafael


* Re: 2.6.21-rc7-mm1 + sysfs-oops-workaround.patch -- software suspend failed (1 tasks refusing to freeze)
       [not found]               ` <a44ae5cd0704251300o2804e71dwebeda266415b691b@mail.gmail.com>
@ 2007-04-25 21:33                 ` Gautham R Shenoy
       [not found]                 ` <20070425141245.1bb29930.akpm@linux-foundation.org>
  1 sibling, 0 replies; 10+ messages in thread
From: Gautham R Shenoy @ 2007-04-25 21:33 UTC (permalink / raw)
  To: Miles Lane
  Cc: Andrew Morton, Rafael J. Wysocki, Oleg Nesterov, Andi Kleen,
	linux-kernel

Hi Miles, 

Looks like the following processes failed to freeze while waiting
on the rtnl_mutex.

Either somebody is holding the rtnl_mutex for a really long time,
or the task holding rtnl_mutex is frozen.

I don't quite know about the former, but if it's the latter 
then we definitely have a bug.

Thanks and Regards
gautham.


> =======================
> INFO: lockdep is turned off.
> avahi-daemon  D 000003D0     0  2853      1 (NOTLB)
>       c32e1c80 00000092 a4206282 000003d0 00000000 00000046 a4206282 
>       000003d0
>       c1ca35e4 c03b9b54 c30fc2f0 c1ca34d0 c036f520 c32e1cb0 c036f4e0 
>       00000246
>       c32e1cd0 c0299cc7 00000000 00000002 c0299dfe 00000246 00000202 
>       c1f7f060
> Call Trace:
> [__mutex_lock_slowpath+330/610] __mutex_lock_slowpath+0x14a/0x262
> [mutex_lock+31/35] mutex_lock+0x1f/0x23
> [rtnl_lock+16/18] rtnl_lock+0x10/0x12
> [ip_mc_leave_group+24/170] ip_mc_leave_group+0x18/0xaa
> [ip_setsockopt+1608/2365] ip_setsockopt+0x648/0x93d
> [udp_setsockopt+67/74] udp_setsockopt+0x43/0x4a
> [sock_common_setsockopt+30/37] sock_common_setsockopt+0x1e/0x25
> [sys_setsockopt+105/133] sys_setsockopt+0x69/0x85
> [sys_socketcall+488/577] sys_socketcall+0x1e8/0x241
> [sysenter_past_esp+95/153] sysenter_past_esp+0x5f/0x99
> =======================


> =======================
> INFO: lockdep is turned off.
> multiload-app D 000003D0     0  4159      1 (NOTLB)
>       c733ee10 00200092 aa7df471 000003d0 00000000 00200046 aa7df471 
>       000003d0
>       c7277624 c03b9b54 c72fea98 00000000 c036f520 c733ee40 c036f4e0 
>       00200246
>       c733ee60 c0299cc7 00000000 00000002 c0299dfe 00200246 00000000 
>       00000000
> Call Trace:
> [__mutex_lock_slowpath+330/610] __mutex_lock_slowpath+0x14a/0x262
> [mutex_lock+31/35] mutex_lock+0x1f/0x23
> [rtnl_lock+16/18] rtnl_lock+0x10/0x12
> [dev_ioctl+1066/1134] dev_ioctl+0x42a/0x46e
> [sock_ioctl+446/458] sock_ioctl+0x1be/0x1ca
> [do_ioctl+34/104] do_ioctl+0x22/0x68
> [vfs_ioctl+575/594] vfs_ioctl+0x23f/0x252
> [sys_ioctl+49/74] sys_ioctl+0x31/0x4a
> [sysenter_past_esp+95/153] sysenter_past_esp+0x5f/0x99
> =======================


> =======================
> INFO: lockdep is turned off.
> wpa_supplican D 000003D0     0  6511      1 (NOTLB)
>       c724cc50 00200092 a33ddada 000003d0 00000000 00200046 a33ddada 
>       000003d0
>       c7191624 c03b9b54 c72fe330 c46c64d0 c036f520 c724cc80 c036f4e0 
>       00200246
>       c724cca0 c0299cc7 00000000 00000002 c0299dfe 00200046 00000000 
>       00000000
> Call Trace:
> [__mutex_lock_slowpath+330/610] __mutex_lock_slowpath+0x14a/0x262
> [mutex_lock+31/35] mutex_lock+0x1f/0x23
> [rtnetlink_rcv+26/68] rtnetlink_rcv+0x1a/0x44
> [netlink_data_ready+21/85] netlink_data_ready+0x15/0x55
> [netlink_sendskb+34/83] netlink_sendskb+0x22/0x53
> [netlink_unicast+443/469] netlink_unicast+0x1bb/0x1d5
> [netlink_sendmsg+582/594] netlink_sendmsg+0x246/0x252
> [sock_sendmsg+204/229] sock_sendmsg+0xcc/0xe5
> [sys_sendto+204/236] sys_sendto+0xcc/0xec
> [sys_send+54/56] sys_send+0x36/0x38
> [sys_socketcall+318/577] sys_socketcall+0x13e/0x241
> [sysenter_past_esp+95/153] sysenter_past_esp+0x5f/0x99
> =======================


> INFO: lockdep is turned off.
> dhclient      D 000003D0     0  6998   2816 (NOTLB)
>       c9094e10 00000092 ac00d705 000003d0 00000000 00000046 ac00d705 
>       000003d0
>       c6f45664 c03b9b54 c905da18 00000000 c036f520 c9094e40 c036f4e0 
>       00000246
>       c9094e60 c0299cc7 00000000 00000002 c0299dfe 00000000 c965614c 
>       c6f45550
> Call Trace:
> [__mutex_lock_slowpath+330/610] __mutex_lock_slowpath+0x14a/0x262
> [mutex_lock+31/35] mutex_lock+0x1f/0x23
> [rtnl_lock+16/18] rtnl_lock+0x10/0x12
> [dev_ioctl+30/1134] dev_ioctl+0x1e/0x46e
> [sock_ioctl+446/458] sock_ioctl+0x1be/0x1ca
> [do_ioctl+34/104] do_ioctl+0x22/0x68
> [vfs_ioctl+575/594] vfs_ioctl+0x23f/0x252
> [sys_ioctl+49/74] sys_ioctl+0x31/0x4a
> [sysenter_past_esp+95/153] sysenter_past_esp+0x5f/0x99
> =======================
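
To see at a glance that every blocked task above went to sleep under the same lock, the traces can be grouped by whichever function called mutex_lock(). A minimal sketch, assuming the trace layout shown in this thread (innermost frame first); mutex_lock_callers is a made-up helper name and the sample is an abridged copy of two of the traces above.

```python
import re
from collections import defaultdict

# Task header of a task in state D, tolerating "> " email quote markers.
HEADER = re.compile(r"^(?:>\s?)*\s*(\S+)\s+D\s+[0-9A-Fa-f]{8}\s")
# Stack frame such as "[rtnl_lock+16/18] rtnl_lock+0x10/0x12".
FRAME = re.compile(r"\]\s+(\w+)\+0x[0-9a-fA-F]+/0x[0-9a-fA-F]+")


def mutex_lock_callers(dump):
    """Map each function that called mutex_lock() to the blocked tasks below it."""
    callers = defaultdict(list)
    task = None
    previous_frame = None
    for line in dump.splitlines():
        header = HEADER.match(line)
        if header:
            task, previous_frame = header.group(1), None
            continue
        frame = FRAME.search(line)
        if frame is None or task is None:
            continue
        # The frame printed right after mutex_lock is taken to be its caller.
        if previous_frame == "mutex_lock":
            callers[frame.group(1)].append(task)
        previous_frame = frame.group(1)
    return dict(callers)


dump = """\
> avahi-daemon  D 000003D0     0  2853      1 (NOTLB)
> Call Trace:
> [__mutex_lock_slowpath+330/610] __mutex_lock_slowpath+0x14a/0x262
> [mutex_lock+31/35] mutex_lock+0x1f/0x23
> [rtnl_lock+16/18] rtnl_lock+0x10/0x12
> [ip_mc_leave_group+24/170] ip_mc_leave_group+0x18/0xaa
> wpa_supplican D 000003D0     0  6511      1 (NOTLB)
> Call Trace:
> [__mutex_lock_slowpath+330/610] __mutex_lock_slowpath+0x14a/0x262
> [mutex_lock+31/35] mutex_lock+0x1f/0x23
> [rtnetlink_rcv+26/68] rtnetlink_rcv+0x1a/0x44
"""

for caller, tasks in mutex_lock_callers(dump).items():
    print(caller, tasks)
```

In the traces quoted here the callers are rtnl_lock and rtnetlink_rcv, so all four tasks are queued on the rtnl mutex; the dump does not by itself show which task is holding it.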

-- 
Gautham R Shenoy
Linux Technology Center
IBM India.
"Freedom comes with a price tag of responsibility, which is still a bargain,
because Freedom is priceless!"



* Re: 2.6.21-rc7-mm1 + sysfs-oops-workaround.patch -- software suspend failed (1 tasks refusing to freeze)
       [not found]                 ` <20070425141245.1bb29930.akpm@linux-foundation.org>
@ 2007-04-25 21:37                   ` Gautham R Shenoy
  0 siblings, 0 replies; 10+ messages in thread
From: Gautham R Shenoy @ 2007-04-25 21:37 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Miles Lane, Rafael J. Wysocki, Oleg Nesterov, Andi Kleen,
	linux-kernel

Hi Andrew,

Looks like we went off-list :)

On Wed, Apr 25, 2007 at 02:12:45PM -0700, Andrew Morton wrote:
> On Wed, 25 Apr 2007 13:00:01 -0700
> "Miles Lane" <miles.lane@gmail.com> wrote:
> 
> > SysRq : Show State
> 
> OK.  Your freezer failure is caused by lots of breakage with networking's
> rtnl_lock.
> 
> My usual first step with these traces is to search for " D ":
> 
> avahi-daemon  D 000003D0     0  2853      1 (NOTLB)
>        c32e1c80 00000092 a4206282 000003d0 00000000 00000046 a4206282 000003d0
>        c1ca35e4 c03b9b54 c30fc2f0 c1ca34d0 c036f520 c32e1cb0 c036f4e0 00000246
>        c32e1cd0 c0299cc7 00000000 00000002 c0299dfe 00000246 00000202 c1f7f060
> Call Trace:
>  [__mutex_lock_slowpath+330/610] __mutex_lock_slowpath+0x14a/0x262
>  [mutex_lock+31/35] mutex_lock+0x1f/0x23
>  [rtnl_lock+16/18] rtnl_lock+0x10/0x12
>  [ip_mc_leave_group+24/170] ip_mc_leave_group+0x18/0xaa
>  [ip_setsockopt+1608/2365] ip_setsockopt+0x648/0x93d
>  [udp_setsockopt+67/74] udp_setsockopt+0x43/0x4a
>  [sock_common_setsockopt+30/37] sock_common_setsockopt+0x1e/0x25
>  [sys_setsockopt+105/133] sys_setsockopt+0x69/0x85
>  [sys_socketcall+488/577] sys_socketcall+0x1e8/0x241
>  [sysenter_past_esp+95/153] sysenter_past_esp+0x5f/0x99
> 
> and
> 
> multiload-app D 000003D0     0  4159      1 (NOTLB)
>        c733ee10 00200092 aa7df471 000003d0 00000000 00200046 aa7df471 000003d0
>        c7277624 c03b9b54 c72fea98 00000000 c036f520 c733ee40 c036f4e0 00200246
>        c733ee60 c0299cc7 00000000 00000002 c0299dfe 00200246 00000000 00000000
> Call Trace:
>  [__mutex_lock_slowpath+330/610] __mutex_lock_slowpath+0x14a/0x262
>  [mutex_lock+31/35] mutex_lock+0x1f/0x23
>  [rtnl_lock+16/18] rtnl_lock+0x10/0x12
>  [dev_ioctl+1066/1134] dev_ioctl+0x42a/0x46e
>  [sock_ioctl+446/458] sock_ioctl+0x1be/0x1ca
>  [do_ioctl+34/104] do_ioctl+0x22/0x68
>  [vfs_ioctl+575/594] vfs_ioctl+0x23f/0x252
>  [sys_ioctl+49/74] sys_ioctl+0x31/0x4a
>  [sysenter_past_esp+95/153] sysenter_past_esp+0x5f/0x99
> 
> 
> and
> 
> wpa_supplican D 000003D0     0  6511      1 (NOTLB)
>        c724cc50 00200092 a33ddada 000003d0 00000000 00200046 a33ddada 000003d0
>        c7191624 c03b9b54 c72fe330 c46c64d0 c036f520 c724cc80 c036f4e0 00200246
>        c724cca0 c0299cc7 00000000 00000002 c0299dfe 00200046 00000000 00000000
> Call Trace:
>  [__mutex_lock_slowpath+330/610] __mutex_lock_slowpath+0x14a/0x262
>  [mutex_lock+31/35] mutex_lock+0x1f/0x23
>  [rtnetlink_rcv+26/68] rtnetlink_rcv+0x1a/0x44
>  [netlink_data_ready+21/85] netlink_data_ready+0x15/0x55
>  [netlink_sendskb+34/83] netlink_sendskb+0x22/0x53
>  [netlink_unicast+443/469] netlink_unicast+0x1bb/0x1d5
>  [netlink_sendmsg+582/594] netlink_sendmsg+0x246/0x252
>  [sock_sendmsg+204/229] sock_sendmsg+0xcc/0xe5
>  [sys_sendto+204/236] sys_sendto+0xcc/0xec
>  [sys_send+54/56] sys_send+0x36/0x38
>  [sys_socketcall+318/577] sys_socketcall+0x13e/0x241
>  [sysenter_past_esp+95/153] sysenter_past_esp+0x5f/0x99
> 
> and lots more of the same.
> 
> 
> There is something seriously scrogged with networking locking in -mm.  We
> know there's one locking bug which manifests in the bare net-2.6.22 tree,
> and I'm suspecting that there are other (albeit perhaps related) locking
> bugs triggered by something else in -mm.  Something which precedes
> git-net.patch in the series file (ie: another subsystem tree).
> 
> So for now, we should assume that there's no freezer-related problem being
> demonstrated here.

Doesn't look like it ATM. 
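
The " D " search Andrew describes can also be scripted when the SysRq dump is long. This is a hedged sketch rather than an established tool: blocked_tasks is a hypothetical helper, and the regular expression assumes the header layout seen in this thread (command name, state letter, an 8-digit hex field, free stack, PID, parent PID).

```python
import re

# Task header as printed by sysrq-T in this thread, for example
#   "avahi-daemon  D 000003D0     0  2853      1 (NOTLB)"
# Leading "> " quote markers from email replies are tolerated.
TASK_HEADER = re.compile(
    r"^(?:>\s?)*\s*(?P<comm>\S+)\s+(?P<state>[A-Z])\s+"
    r"[0-9A-Fa-f]{8}\s+\d+\s+(?P<pid>\d+)\s+\d+"
)


def blocked_tasks(dump):
    """Return (comm, pid) for every task shown in state D (uninterruptible sleep)."""
    found = []
    for line in dump.splitlines():
        match = TASK_HEADER.match(line)
        if match and match.group("state") == "D":
            found.append((match.group("comm"), int(match.group("pid"))))
    return found


dump = """\
avahi-daemon  D 000003D0     0  2853      1 (NOTLB)
Call Trace:
 [rtnl_lock+16/18] rtnl_lock+0x10/0x12
multiload-app D 000003D0     0  4159      1 (NOTLB)
wpa_supplican D 000003D0     0  6511      1 (NOTLB)
"""

print(blocked_tasks(dump))
```

Matching on the whole header instead of a bare " D " avoids false hits inside stack words or symbol names, at the cost of depending on this particular dump format.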

> 

Regards
gautham.
-- 
Gautham R Shenoy
Linux Technology Center
IBM India.
"Freedom comes with a price tag of responsibility, which is still a bargain,
because Freedom is priceless!"
