PM: cannot hibernate -- BUG at kernel/workqueue.c:3659

linux-pm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* PM: cannot hibernate -- BUG at kernel/workqueue.c:3659
@ 2012-01-24 15:05 Jiri Slaby
  2012-01-24 16:18 ` [linux-pm] " Srivatsa S. Bhat
  0 siblings, 1 reply; 15+ messages in thread
From: Jiri Slaby @ 2012-01-24 15:05 UTC (permalink / raw)
  To: Rafael J. Wysocki, Linux-pm mailing list, LKML, Jiri Slaby

Hi,

this is a freshly booted system. When I do s2dsk, I see:
...
Freezing remaining freezable tasks ... BUG: 'workqueue_freezing' is true!
------------[ cut here ]------------
kernel BUG at /l/latest/linux/kernel/workqueue.c:3659!
invalid opcode: 0000 [#1] SMP
CPU 0
Modules linked in:

Pid: 2669, comm: s2disk Not tainted 3.3.0-rc1-next-20120124_64+ #1627
Bochs Bochs
RIP: 0010:[<ffffffff8107e365>]  [<ffffffff8107e365>]
freeze_workqueues_begin+0x195/0x1a0
RSP: 0018:ffff880046f01d68  EFLAGS: 00010292
RAX: 0000000000000023 RBX: 0000000000000001 RCX: 00000000000000c9
RDX: 0000000000000077 RSI: 0000000000000046 RDI: ffffffff81b51f7c
RBP: ffff880046f01d98 R08: ffffffff81a9d760 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 00007fff579464dc R14: ffffffffffffffff R15: 0000000000000004
FS:  00007f3c65d54700(0000) GS:ffff880049600000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007f3c64f58c20 CR3: 0000000045b64000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process s2disk (pid: 2669, threadinfo ffff880046f00000, task
ffff880047251980)
Stack:
 ffff880046f01d98 0000000000000001 0000000000000000 00007fff579464dc
 ffffffffffffffff 0000000000000004 ffff880046f01e18 ffffffff81096cb9
 00000000ffff0124 0000000000000004 ffff880046f01e18 000000004f1ec7d1
Call Trace:
 [<ffffffff81096cb9>] try_to_freeze_tasks+0x1b9/0x2d0
 [<ffffffff81096ed5>] freeze_kernel_threads+0x25/0x90
 [<ffffffff81097b55>] hibernation_snapshot+0x75/0x2e0
 [<ffffffff8109d724>] snapshot_ioctl+0x314/0x4e0
 [<ffffffff81130856>] do_vfs_ioctl+0x96/0x550
 [<ffffffff8111ff7b>] ? vfs_write+0x10b/0x180
 [<ffffffff81130d5a>] sys_ioctl+0x4a/0x80
 [<ffffffff81630e22>] system_call_fastpath+0x16/0x1b
Code: c7 c6 0a a4 92 81 48 c7 c7 16 65 92 81 31 c0 e8 19 94 5a 00 0f 0b
48 c7 c6 27 a4 92 81 48 c7 c7 16 65 92 81 31 c0 e8 02 94 5a 00 <0f> 0b
66 0f 1f 84 00 00 00 00 00 55 48 c7 c7 82 4b b9 81 48 89
RIP  [<ffffffff8107e365>] freeze_workqueues_begin+0x195/0x1a0
 RSP <ffff880046f01d68>
---[ end trace 632574abdc098963 ]---

I wanted to reproduce a different crash which happens on 3.2.1. When
there is not enough free swap, the kernel crashes (when s2disk is used).
I will post more info after the one above is resolved and I can
reproduce the latter :).

thanks,
-- 
js
suse labs

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [linux-pm] PM: cannot hibernate -- BUG at kernel/workqueue.c:3659
  2012-01-24 15:05 PM: cannot hibernate -- BUG at kernel/workqueue.c:3659 Jiri Slaby
@ 2012-01-24 16:18 ` Srivatsa S. Bhat
  2012-01-24 16:24   ` Jiri Slaby
  0 siblings, 1 reply; 15+ messages in thread
From: Srivatsa S. Bhat @ 2012-01-24 16:18 UTC (permalink / raw)
  To: Jiri Slaby; +Cc: Rafael J. Wysocki, Linux-pm mailing list, LKML, Jiri Slaby

Hi Jiri,

On 01/24/2012 08:35 PM, Jiri Slaby wrote:

> Hi,
> 
> this is a freshly booted system. When I do s2dsk, I see:
> ...
> Freezing remaining freezable tasks ... BUG: 'workqueue_freezing' is true!
> ------------[ cut here ]------------
> kernel BUG at /l/latest/linux/kernel/workqueue.c:3659!
> invalid opcode: 0000 [#1] SMP
> CPU 0
> Modules linked in:
> 
> Pid: 2669, comm: s2disk Not tainted 3.3.0-rc1-next-20120124_64+ #1627
> Bochs Bochs
> RIP: 0010:[<ffffffff8107e365>]  [<ffffffff8107e365>]
> freeze_workqueues_begin+0x195/0x1a0
> RSP: 0018:ffff880046f01d68  EFLAGS: 00010292
> RAX: 0000000000000023 RBX: 0000000000000001 RCX: 00000000000000c9
> RDX: 0000000000000077 RSI: 0000000000000046 RDI: ffffffff81b51f7c
> RBP: ffff880046f01d98 R08: ffffffff81a9d760 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> R13: 00007fff579464dc R14: ffffffffffffffff R15: 0000000000000004
> FS:  00007f3c65d54700(0000) GS:ffff880049600000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 00007f3c64f58c20 CR3: 0000000045b64000 CR4: 00000000000006f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process s2disk (pid: 2669, threadinfo ffff880046f00000, task
> ffff880047251980)
> Stack:
>  ffff880046f01d98 0000000000000001 0000000000000000 00007fff579464dc
>  ffffffffffffffff 0000000000000004 ffff880046f01e18 ffffffff81096cb9
>  00000000ffff0124 0000000000000004 ffff880046f01e18 000000004f1ec7d1
> Call Trace:
>  [<ffffffff81096cb9>] try_to_freeze_tasks+0x1b9/0x2d0
>  [<ffffffff81096ed5>] freeze_kernel_threads+0x25/0x90
>  [<ffffffff81097b55>] hibernation_snapshot+0x75/0x2e0
>  [<ffffffff8109d724>] snapshot_ioctl+0x314/0x4e0
>  [<ffffffff81130856>] do_vfs_ioctl+0x96/0x550
>  [<ffffffff8111ff7b>] ? vfs_write+0x10b/0x180
>  [<ffffffff81130d5a>] sys_ioctl+0x4a/0x80
>  [<ffffffff81630e22>] system_call_fastpath+0x16/0x1b
> Code: c7 c6 0a a4 92 81 48 c7 c7 16 65 92 81 31 c0 e8 19 94 5a 00 0f 0b
> 48 c7 c6 27 a4 92 81 48 c7 c7 16 65 92 81 31 c0 e8 02 94 5a 00 <0f> 0b
> 66 0f 1f 84 00 00 00 00 00 55 48 c7 c7 82 4b b9 81 48 89
> RIP  [<ffffffff8107e365>] freeze_workqueues_begin+0x195/0x1a0
>  RSP <ffff880046f01d68>
> ---[ end trace 632574abdc098963 ]---
> 


I couldn't find any obvious root-cause from a quick check. Is this completely
reproducible upon a fresh boot?
 
Regards,
Srivatsa S. Bhat
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: PM: cannot hibernate -- BUG at kernel/workqueue.c:3659
  2012-01-24 16:18 ` [linux-pm] " Srivatsa S. Bhat
@ 2012-01-24 16:24   ` Jiri Slaby
  2012-01-24 22:36     ` Rafael J. Wysocki
  0 siblings, 1 reply; 15+ messages in thread
From: Jiri Slaby @ 2012-01-24 16:24 UTC (permalink / raw)
  To: Srivatsa S. Bhat; +Cc: Linux-pm mailing list, LKML, Jiri Slaby

On 01/24/2012 05:18 PM, Srivatsa S. Bhat wrote:
> Hi Jiri,
> 
> On 01/24/2012 08:35 PM, Jiri Slaby wrote:
> 
>> Hi,
>>
>> this is a freshly booted system. When I do s2dsk, I see:
>> ...
>> Freezing remaining freezable tasks ... BUG: 'workqueue_freezing' is true!
>> ------------[ cut here ]------------
>> kernel BUG at /l/latest/linux/kernel/workqueue.c:3659!
>> invalid opcode: 0000 [#1] SMP
>> CPU 0
>> Modules linked in:
>>
>> Pid: 2669, comm: s2disk Not tainted 3.3.0-rc1-next-20120124_64+ #1627
>> Bochs Bochs
>> RIP: 0010:[<ffffffff8107e365>]  [<ffffffff8107e365>]
>> freeze_workqueues_begin+0x195/0x1a0
>> RSP: 0018:ffff880046f01d68  EFLAGS: 00010292
>> RAX: 0000000000000023 RBX: 0000000000000001 RCX: 00000000000000c9
>> RDX: 0000000000000077 RSI: 0000000000000046 RDI: ffffffff81b51f7c
>> RBP: ffff880046f01d98 R08: ffffffff81a9d760 R09: 0000000000000000
>> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
>> R13: 00007fff579464dc R14: ffffffffffffffff R15: 0000000000000004
>> FS:  00007f3c65d54700(0000) GS:ffff880049600000(0000) knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>> CR2: 00007f3c64f58c20 CR3: 0000000045b64000 CR4: 00000000000006f0
>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> Process s2disk (pid: 2669, threadinfo ffff880046f00000, task
>> ffff880047251980)
>> Stack:
>>  ffff880046f01d98 0000000000000001 0000000000000000 00007fff579464dc
>>  ffffffffffffffff 0000000000000004 ffff880046f01e18 ffffffff81096cb9
>>  00000000ffff0124 0000000000000004 ffff880046f01e18 000000004f1ec7d1
>> Call Trace:
>>  [<ffffffff81096cb9>] try_to_freeze_tasks+0x1b9/0x2d0
>>  [<ffffffff81096ed5>] freeze_kernel_threads+0x25/0x90
>>  [<ffffffff81097b55>] hibernation_snapshot+0x75/0x2e0
>>  [<ffffffff8109d724>] snapshot_ioctl+0x314/0x4e0
>>  [<ffffffff81130856>] do_vfs_ioctl+0x96/0x550
>>  [<ffffffff8111ff7b>] ? vfs_write+0x10b/0x180
>>  [<ffffffff81130d5a>] sys_ioctl+0x4a/0x80
>>  [<ffffffff81630e22>] system_call_fastpath+0x16/0x1b
>> Code: c7 c6 0a a4 92 81 48 c7 c7 16 65 92 81 31 c0 e8 19 94 5a 00 0f 0b
>> 48 c7 c6 27 a4 92 81 48 c7 c7 16 65 92 81 31 c0 e8 02 94 5a 00 <0f> 0b
>> 66 0f 1f 84 00 00 00 00 00 55 48 c7 c7 82 4b b9 81 48 89
>> RIP  [<ffffffff8107e365>] freeze_workqueues_begin+0x195/0x1a0
>>  RSP <ffff880046f01d68>
>> ---[ end trace 632574abdc098963 ]---
>>
> 
> 
> I couldn't find any obvious root-cause from a quick check. Is this completely
> reproducible upon a fresh boot?

True.

The cause is that the function is called twice:
# s2disk
PM: Marking nosave pages: 000000000009f000 - 0000000000100000
PM: Basic memory bitmaps created
Syncing filesystems ... done.
Freezing user space processes ... (elapsed 0.01 seconds) done.
PM: Preallocating image memory... done (allocated 264170 pages)
PM: Allocated 1056680 kbytes in 0.45 seconds (2348.17 MB/s)
Freezing remaining freezable tasks ... Pid: 2650, comm: s2disk Not
tainted 3.3.0-rc1-next-20120124_64+ #1628
Call Trace:
 [<ffffffff8109edd2>] ? getnstimeofday+0x52/0xf0
 [<ffffffff8107e206>] freeze_workqueues_begin+0x36/0x1b0
 [<ffffffff81096cc9>] try_to_freeze_tasks+0x1b9/0x2d0
 [<ffffffff81096ee5>] freeze_kernel_threads+0x25/0x90
 [<ffffffff81097b65>] hibernation_snapshot+0x75/0x2e0
 [<ffffffff8109d734>] snapshot_ioctl+0x314/0x4e0
 [<ffffffff81130866>] do_vfs_ioctl+0x96/0x550
 [<ffffffff8111ff8b>] ? vfs_write+0x10b/0x180
 [<ffffffff81130d6a>] sys_ioctl+0x4a/0x80
 [<ffffffff81630e22>] system_call_fastpath+0x16/0x1b
(elapsed 0.03 seconds) done.
PM: freeze of devices complete after 57.216 msecs
PM: late freeze of devices complete after 0.720 msecs
ACPI: Preparing to enter system sleep state S4
PM: Saving platform NVS memory
Disabling non-boot CPUs ...
CPU 1 is now offline
PM: Creating hibernation image:
PM: Need to copy 24112 pages
PM: Normal pages needed: 24112 + 1024, available pages: 277076
PM: Hibernation image created (24112 pages copied)
Enabling non-boot CPUs ...
Booting Node 0 Processor 1 APIC 0x1
smpboot cpu 1: start_ip = 9a000
Calibrating delay loop (skipped) already calibrated this CPU
NMI watchdog disabled (cpu1): hardware events not enabled
CPU1 is up
ACPI: Waking up from system sleep state S4
PM: early thaw of devices complete after 0.093 msecs
ata_piix 0000:00:01.1: setting latency timer to 64
e1000 0000:00:03.0: setting latency timer to 64
e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
ata2.01: NODEV after polling detection
ata1.00: configured for MWDMA2
ata2.00: configured for MWDMA2
ata1.01: configured for MWDMA2
sd 0:0:1:0: [sdb] Starting disk
sd 0:0:0:0: [sda] Starting disk
PM: thaw of devices complete after 192.317 msecs
PM: Preallocating image memory... done (allocated 264295 pages)
PM: Allocated 1057180 kbytes in 0.12 seconds (8809.83 MB/s)
Freezing remaining freezable tasks ... BUG: 'workqueue_freezing' is true!
------------[ cut here ]------------
kernel BUG at /l/latest/linux/kernel/workqueue.c:3659!
invalid opcode: 0000 [#1] SMP
CPU 1
Modules linked in:

Pid: 2650, comm: s2disk Not tainted 3.3.0-rc1-next-20120124_64+ #1628
Bochs Bochs
RIP: 0010:[<ffffffff8107e371>]  [<ffffffff8107e371>]
freeze_workqueues_begin+0x1a1/0x1b0
RSP: 0018:ffff8800456d7d68  EFLAGS: 00010292
RAX: 0000000000000023 RBX: 0000000000000001 RCX: 0000000000000073
RDX: 00000000000000ea RSI: 0000000000000046 RDI: ffffffff81b51f7c
RBP: ffff8800456d7d98 R08: ffffffff81a9d760 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 00007fff11d8709c R14: ffffffffffffffff R15: 0000000000000004
FS:  00007f4744397700(0000) GS:ffff880049700000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000001f46bc0 CR3: 000000004483b000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process s2disk (pid: 2650, threadinfo ffff8800456d6000, task
ffff880046c93300)
Stack:
 ffff8800456d7d98 0000000000000001 0000000000000000 00007fff11d8709c
 ffffffffffffffff 0000000000000004 ffff8800456d7e18 ffffffff81096cc9
 00000000ffff008f 0000000000000004 ffff8800456d7e18 000000004f1edaaf
Call Trace:
 [<ffffffff81096cc9>] try_to_freeze_tasks+0x1b9/0x2d0
 [<ffffffff81096ee5>] freeze_kernel_threads+0x25/0x90
 [<ffffffff81097b65>] hibernation_snapshot+0x75/0x2e0
 [<ffffffff8109d734>] snapshot_ioctl+0x314/0x4e0
 [<ffffffff81130866>] do_vfs_ioctl+0x96/0x550
 [<ffffffff8111ff8b>] ? vfs_write+0x10b/0x180
 [<ffffffff81130d6a>] sys_ioctl+0x4a/0x80
 [<ffffffff81630e22>] system_call_fastpath+0x16/0x1b
Code: c7 c6 0a a4 92 81 48 c7 c7 16 65 92 81 31 c0 e8 1d 94 5a 00 0f 0b
48 c7 c6 27 a4 92 81 48 c7 c7 16 65 92 81 31 c0 e8 06 94 5a 00 <0f> 0b
66 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 c7 c7 82 4b
RIP  [<ffffffff8107e371>] freeze_workqueues_begin+0x1a1/0x1b0
 RSP <ffff8800456d7d68>
---[ end trace 696a23a038efce41 ]---



>  
> Regards,
> Srivatsa S. Bhat
> IBM Linux Technology Center
> 


-- 
js
suse labs

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: PM: cannot hibernate -- BUG at kernel/workqueue.c:3659
  2012-01-24 16:24   ` Jiri Slaby
@ 2012-01-24 22:36     ` Rafael J. Wysocki
  2012-01-24 22:47       ` Jiri Slaby
  0 siblings, 1 reply; 15+ messages in thread
From: Rafael J. Wysocki @ 2012-01-24 22:36 UTC (permalink / raw)
  To: Jiri Slaby; +Cc: Linux-pm mailing list, LKML, Srivatsa S. Bhat, Jiri Slaby

On Tuesday, January 24, 2012, Jiri Slaby wrote:
> On 01/24/2012 05:18 PM, Srivatsa S. Bhat wrote:
> > Hi Jiri,
> > 
> > On 01/24/2012 08:35 PM, Jiri Slaby wrote:
> > 
> >> Hi,
> >>
> >> this is a freshly booted system. When I do s2dsk, I see:
> >> ...
> >> Freezing remaining freezable tasks ... BUG: 'workqueue_freezing' is true!
> >> ------------[ cut here ]------------
> >> kernel BUG at /l/latest/linux/kernel/workqueue.c:3659!
> >> invalid opcode: 0000 [#1] SMP
> >> CPU 0
> >> Modules linked in:
> >>
> >> Pid: 2669, comm: s2disk Not tainted 3.3.0-rc1-next-20120124_64+ #1627
> >> Bochs Bochs
> >> RIP: 0010:[<ffffffff8107e365>]  [<ffffffff8107e365>]
> >> freeze_workqueues_begin+0x195/0x1a0
> >> RSP: 0018:ffff880046f01d68  EFLAGS: 00010292
> >> RAX: 0000000000000023 RBX: 0000000000000001 RCX: 00000000000000c9
> >> RDX: 0000000000000077 RSI: 0000000000000046 RDI: ffffffff81b51f7c
> >> RBP: ffff880046f01d98 R08: ffffffff81a9d760 R09: 0000000000000000
> >> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> >> R13: 00007fff579464dc R14: ffffffffffffffff R15: 0000000000000004
> >> FS:  00007f3c65d54700(0000) GS:ffff880049600000(0000) knlGS:0000000000000000
> >> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> >> CR2: 00007f3c64f58c20 CR3: 0000000045b64000 CR4: 00000000000006f0
> >> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> >> Process s2disk (pid: 2669, threadinfo ffff880046f00000, task
> >> ffff880047251980)
> >> Stack:
> >>  ffff880046f01d98 0000000000000001 0000000000000000 00007fff579464dc
> >>  ffffffffffffffff 0000000000000004 ffff880046f01e18 ffffffff81096cb9
> >>  00000000ffff0124 0000000000000004 ffff880046f01e18 000000004f1ec7d1
> >> Call Trace:
> >>  [<ffffffff81096cb9>] try_to_freeze_tasks+0x1b9/0x2d0
> >>  [<ffffffff81096ed5>] freeze_kernel_threads+0x25/0x90
> >>  [<ffffffff81097b55>] hibernation_snapshot+0x75/0x2e0
> >>  [<ffffffff8109d724>] snapshot_ioctl+0x314/0x4e0
> >>  [<ffffffff81130856>] do_vfs_ioctl+0x96/0x550
> >>  [<ffffffff8111ff7b>] ? vfs_write+0x10b/0x180
> >>  [<ffffffff81130d5a>] sys_ioctl+0x4a/0x80
> >>  [<ffffffff81630e22>] system_call_fastpath+0x16/0x1b
> >> Code: c7 c6 0a a4 92 81 48 c7 c7 16 65 92 81 31 c0 e8 19 94 5a 00 0f 0b
> >> 48 c7 c6 27 a4 92 81 48 c7 c7 16 65 92 81 31 c0 e8 02 94 5a 00 <0f> 0b
> >> 66 0f 1f 84 00 00 00 00 00 55 48 c7 c7 82 4b b9 81 48 89
> >> RIP  [<ffffffff8107e365>] freeze_workqueues_begin+0x195/0x1a0
> >>  RSP <ffff880046f01d68>
> >> ---[ end trace 632574abdc098963 ]---
> >>
> > 
> > 
> > I couldn't find any obvious root-cause from a quick check. Is this completely
> > reproducible upon a fresh boot?
> 
> True.
> 
> The cause is that the function is called twice:

Which function?

Rafael


> # s2disk
> PM: Marking nosave pages: 000000000009f000 - 0000000000100000
> PM: Basic memory bitmaps created
> Syncing filesystems ... done.
> Freezing user space processes ... (elapsed 0.01 seconds) done.
> PM: Preallocating image memory... done (allocated 264170 pages)
> PM: Allocated 1056680 kbytes in 0.45 seconds (2348.17 MB/s)
> Freezing remaining freezable tasks ... Pid: 2650, comm: s2disk Not
> tainted 3.3.0-rc1-next-20120124_64+ #1628
> Call Trace:
>  [<ffffffff8109edd2>] ? getnstimeofday+0x52/0xf0
>  [<ffffffff8107e206>] freeze_workqueues_begin+0x36/0x1b0
>  [<ffffffff81096cc9>] try_to_freeze_tasks+0x1b9/0x2d0
>  [<ffffffff81096ee5>] freeze_kernel_threads+0x25/0x90
>  [<ffffffff81097b65>] hibernation_snapshot+0x75/0x2e0
>  [<ffffffff8109d734>] snapshot_ioctl+0x314/0x4e0
>  [<ffffffff81130866>] do_vfs_ioctl+0x96/0x550
>  [<ffffffff8111ff8b>] ? vfs_write+0x10b/0x180
>  [<ffffffff81130d6a>] sys_ioctl+0x4a/0x80
>  [<ffffffff81630e22>] system_call_fastpath+0x16/0x1b
> (elapsed 0.03 seconds) done.
> PM: freeze of devices complete after 57.216 msecs
> PM: late freeze of devices complete after 0.720 msecs
> ACPI: Preparing to enter system sleep state S4
> PM: Saving platform NVS memory
> Disabling non-boot CPUs ...
> CPU 1 is now offline
> PM: Creating hibernation image:
> PM: Need to copy 24112 pages
> PM: Normal pages needed: 24112 + 1024, available pages: 277076
> PM: Hibernation image created (24112 pages copied)
> Enabling non-boot CPUs ...
> Booting Node 0 Processor 1 APIC 0x1
> smpboot cpu 1: start_ip = 9a000
> Calibrating delay loop (skipped) already calibrated this CPU
> NMI watchdog disabled (cpu1): hardware events not enabled
> CPU1 is up
> ACPI: Waking up from system sleep state S4
> PM: early thaw of devices complete after 0.093 msecs
> ata_piix 0000:00:01.1: setting latency timer to 64
> e1000 0000:00:03.0: setting latency timer to 64
> e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
> ata2.01: NODEV after polling detection
> ata1.00: configured for MWDMA2
> ata2.00: configured for MWDMA2
> ata1.01: configured for MWDMA2
> sd 0:0:1:0: [sdb] Starting disk
> sd 0:0:0:0: [sda] Starting disk
> PM: thaw of devices complete after 192.317 msecs
> PM: Preallocating image memory... done (allocated 264295 pages)
> PM: Allocated 1057180 kbytes in 0.12 seconds (8809.83 MB/s)
> Freezing remaining freezable tasks ... BUG: 'workqueue_freezing' is true!
> ------------[ cut here ]------------
> kernel BUG at /l/latest/linux/kernel/workqueue.c:3659!
> invalid opcode: 0000 [#1] SMP
> CPU 1
> Modules linked in:
> 
> Pid: 2650, comm: s2disk Not tainted 3.3.0-rc1-next-20120124_64+ #1628
> Bochs Bochs
> RIP: 0010:[<ffffffff8107e371>]  [<ffffffff8107e371>]
> freeze_workqueues_begin+0x1a1/0x1b0
> RSP: 0018:ffff8800456d7d68  EFLAGS: 00010292
> RAX: 0000000000000023 RBX: 0000000000000001 RCX: 0000000000000073
> RDX: 00000000000000ea RSI: 0000000000000046 RDI: ffffffff81b51f7c
> RBP: ffff8800456d7d98 R08: ffffffff81a9d760 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> R13: 00007fff11d8709c R14: ffffffffffffffff R15: 0000000000000004
> FS:  00007f4744397700(0000) GS:ffff880049700000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 0000000001f46bc0 CR3: 000000004483b000 CR4: 00000000000006e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process s2disk (pid: 2650, threadinfo ffff8800456d6000, task
> ffff880046c93300)
> Stack:
>  ffff8800456d7d98 0000000000000001 0000000000000000 00007fff11d8709c
>  ffffffffffffffff 0000000000000004 ffff8800456d7e18 ffffffff81096cc9
>  00000000ffff008f 0000000000000004 ffff8800456d7e18 000000004f1edaaf
> Call Trace:
>  [<ffffffff81096cc9>] try_to_freeze_tasks+0x1b9/0x2d0
>  [<ffffffff81096ee5>] freeze_kernel_threads+0x25/0x90
>  [<ffffffff81097b65>] hibernation_snapshot+0x75/0x2e0
>  [<ffffffff8109d734>] snapshot_ioctl+0x314/0x4e0
>  [<ffffffff81130866>] do_vfs_ioctl+0x96/0x550
>  [<ffffffff8111ff8b>] ? vfs_write+0x10b/0x180
>  [<ffffffff81130d6a>] sys_ioctl+0x4a/0x80
>  [<ffffffff81630e22>] system_call_fastpath+0x16/0x1b
> Code: c7 c6 0a a4 92 81 48 c7 c7 16 65 92 81 31 c0 e8 1d 94 5a 00 0f 0b
> 48 c7 c6 27 a4 92 81 48 c7 c7 16 65 92 81 31 c0 e8 06 94 5a 00 <0f> 0b
> 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 c7 c7 82 4b
> RIP  [<ffffffff8107e371>] freeze_workqueues_begin+0x1a1/0x1b0
>  RSP <ffff8800456d7d68>
> ---[ end trace 696a23a038efce41 ]---
> 
> 
> 
> >  
> > Regards,
> > Srivatsa S. Bhat
> > IBM Linux Technology Center
> > 
> 
> 
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: PM: cannot hibernate -- BUG at kernel/workqueue.c:3659
  2012-01-24 22:36     ` Rafael J. Wysocki
@ 2012-01-24 22:47       ` Jiri Slaby
  2012-01-24 23:02         ` Rafael J. Wysocki
  0 siblings, 1 reply; 15+ messages in thread
From: Jiri Slaby @ 2012-01-24 22:47 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux-pm mailing list, LKML, Srivatsa S. Bhat, Jiri Slaby

On 01/24/2012 11:36 PM, Rafael J. Wysocki wrote:
> On Tuesday, January 24, 2012, Jiri Slaby wrote:
>> On 01/24/2012 05:18 PM, Srivatsa S. Bhat wrote:
>>> Hi Jiri,
>>>
>>> On 01/24/2012 08:35 PM, Jiri Slaby wrote:
>>>
>>>> Hi,
>>>>
>>>> this is a freshly booted system. When I do s2dsk, I see:
>>>> ...
>>>> Freezing remaining freezable tasks ... BUG: 'workqueue_freezing' is true!
>>>> ------------[ cut here ]------------
>>>> kernel BUG at /l/latest/linux/kernel/workqueue.c:3659!
>>>> invalid opcode: 0000 [#1] SMP
>>>> CPU 0
>>>> Modules linked in:
>>>>
>>>> Pid: 2669, comm: s2disk Not tainted 3.3.0-rc1-next-20120124_64+ #1627
>>>> Bochs Bochs
>>>> RIP: 0010:[<ffffffff8107e365>]  [<ffffffff8107e365>]
>>>> freeze_workqueues_begin+0x195/0x1a0
>>>> RSP: 0018:ffff880046f01d68  EFLAGS: 00010292
>>>> RAX: 0000000000000023 RBX: 0000000000000001 RCX: 00000000000000c9
>>>> RDX: 0000000000000077 RSI: 0000000000000046 RDI: ffffffff81b51f7c
>>>> RBP: ffff880046f01d98 R08: ffffffff81a9d760 R09: 0000000000000000
>>>> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
>>>> R13: 00007fff579464dc R14: ffffffffffffffff R15: 0000000000000004
>>>> FS:  00007f3c65d54700(0000) GS:ffff880049600000(0000) knlGS:0000000000000000
>>>> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>>>> CR2: 00007f3c64f58c20 CR3: 0000000045b64000 CR4: 00000000000006f0
>>>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>>> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>>>> Process s2disk (pid: 2669, threadinfo ffff880046f00000, task
>>>> ffff880047251980)
>>>> Stack:
>>>>  ffff880046f01d98 0000000000000001 0000000000000000 00007fff579464dc
>>>>  ffffffffffffffff 0000000000000004 ffff880046f01e18 ffffffff81096cb9
>>>>  00000000ffff0124 0000000000000004 ffff880046f01e18 000000004f1ec7d1
>>>> Call Trace:
>>>>  [<ffffffff81096cb9>] try_to_freeze_tasks+0x1b9/0x2d0
>>>>  [<ffffffff81096ed5>] freeze_kernel_threads+0x25/0x90
>>>>  [<ffffffff81097b55>] hibernation_snapshot+0x75/0x2e0
>>>>  [<ffffffff8109d724>] snapshot_ioctl+0x314/0x4e0
>>>>  [<ffffffff81130856>] do_vfs_ioctl+0x96/0x550
>>>>  [<ffffffff8111ff7b>] ? vfs_write+0x10b/0x180
>>>>  [<ffffffff81130d5a>] sys_ioctl+0x4a/0x80
>>>>  [<ffffffff81630e22>] system_call_fastpath+0x16/0x1b
>>>> Code: c7 c6 0a a4 92 81 48 c7 c7 16 65 92 81 31 c0 e8 19 94 5a 00 0f 0b
>>>> 48 c7 c6 27 a4 92 81 48 c7 c7 16 65 92 81 31 c0 e8 02 94 5a 00 <0f> 0b
>>>> 66 0f 1f 84 00 00 00 00 00 55 48 c7 c7 82 4b b9 81 48 89
>>>> RIP  [<ffffffff8107e365>] freeze_workqueues_begin+0x195/0x1a0
>>>>  RSP <ffff880046f01d68>
>>>> ---[ end trace 632574abdc098963 ]---
>>>>
>>>
>>>
>>> I couldn't find any obvious root-cause from a quick check. Is this completely
>>> reproducible upon a fresh boot?
>>
>> True.
>>
>> The cause is that the function is called twice:
> 
> Which function?

The one where the BUG is. Maybe the functions which should clear the
flag is not called in between? See:

>>  [<ffffffff8107e206>] freeze_workqueues_begin+0x36/0x1b0
                         ^^^^^^^^^^^^^^^^^^^^^^^
>>  [<ffffffff81096cc9>] try_to_freeze_tasks+0x1b9/0x2d0
>>  [<ffffffff81096ee5>] freeze_kernel_threads+0x25/0x90
>>  [<ffffffff81097b65>] hibernation_snapshot+0x75/0x2e0
>>  [<ffffffff8109d734>] snapshot_ioctl+0x314/0x4e0
>>  [<ffffffff81130866>] do_vfs_ioctl+0x96/0x550
>>  [<ffffffff8111ff8b>] ? vfs_write+0x10b/0x180
>>  [<ffffffff81130d6a>] sys_ioctl+0x4a/0x80
>>  [<ffffffff81630e22>] system_call_fastpath+0x16/0x1b
>> (elapsed 0.03 seconds) done.
...
>> Freezing remaining freezable tasks ... BUG: 'workqueue_freezing' is true!
>> ------------[ cut here ]------------
>> kernel BUG at /l/latest/linux/kernel/workqueue.c:3659!
...
>> RIP: 0010:[<ffffffff8107e371>]  [<ffffffff8107e371>
>> freeze_workqueues_begin+0x1a1/0x1b0
   ^^^^^^^^^^^^^^^^^^^^^^^
>> Call Trace:
>>  [<ffffffff81096cc9>] try_to_freeze_tasks+0x1b9/0x2d0
>>  [<ffffffff81096ee5>] freeze_kernel_threads+0x25/0x90
>>  [<ffffffff81097b65>] hibernation_snapshot+0x75/0x2e0
>>  [<ffffffff8109d734>] snapshot_ioctl+0x314/0x4e0
>>  [<ffffffff81130866>] do_vfs_ioctl+0x96/0x550
>>  [<ffffffff8111ff8b>] ? vfs_write+0x10b/0x180
>>  [<ffffffff81130d6a>] sys_ioctl+0x4a/0x80
>>  [<ffffffff81630e22>] system_call_fastpath+0x16/0x1b

-- 
js
suse labs

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: PM: cannot hibernate -- BUG at kernel/workqueue.c:3659
  2012-01-24 22:47       ` Jiri Slaby
@ 2012-01-24 23:02         ` Rafael J. Wysocki
  2012-01-25  0:04           ` Jiri Slaby
  0 siblings, 1 reply; 15+ messages in thread
From: Rafael J. Wysocki @ 2012-01-24 23:02 UTC (permalink / raw)
  To: Jiri Slaby; +Cc: Linux-pm mailing list, LKML, Srivatsa S. Bhat, Jiri Slaby

On Tuesday, January 24, 2012, Jiri Slaby wrote:
> On 01/24/2012 11:36 PM, Rafael J. Wysocki wrote:
> > On Tuesday, January 24, 2012, Jiri Slaby wrote:
> >> On 01/24/2012 05:18 PM, Srivatsa S. Bhat wrote:
> >>> Hi Jiri,
> >>>
> >>> On 01/24/2012 08:35 PM, Jiri Slaby wrote:
> >>>
> >>>> Hi,
> >>>>
> >>>> this is a freshly booted system. When I do s2dsk, I see:
> >>>> ...
> >>>> Freezing remaining freezable tasks ... BUG: 'workqueue_freezing' is true!
> >>>> ------------[ cut here ]------------
> >>>> kernel BUG at /l/latest/linux/kernel/workqueue.c:3659!
> >>>> invalid opcode: 0000 [#1] SMP
> >>>> CPU 0
> >>>> Modules linked in:
> >>>>
> >>>> Pid: 2669, comm: s2disk Not tainted 3.3.0-rc1-next-20120124_64+ #1627
> >>>> Bochs Bochs
> >>>> RIP: 0010:[<ffffffff8107e365>]  [<ffffffff8107e365>]
> >>>> freeze_workqueues_begin+0x195/0x1a0
> >>>> RSP: 0018:ffff880046f01d68  EFLAGS: 00010292
> >>>> RAX: 0000000000000023 RBX: 0000000000000001 RCX: 00000000000000c9
> >>>> RDX: 0000000000000077 RSI: 0000000000000046 RDI: ffffffff81b51f7c
> >>>> RBP: ffff880046f01d98 R08: ffffffff81a9d760 R09: 0000000000000000
> >>>> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> >>>> R13: 00007fff579464dc R14: ffffffffffffffff R15: 0000000000000004
> >>>> FS:  00007f3c65d54700(0000) GS:ffff880049600000(0000) knlGS:0000000000000000
> >>>> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> >>>> CR2: 00007f3c64f58c20 CR3: 0000000045b64000 CR4: 00000000000006f0
> >>>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >>>> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> >>>> Process s2disk (pid: 2669, threadinfo ffff880046f00000, task
> >>>> ffff880047251980)
> >>>> Stack:
> >>>>  ffff880046f01d98 0000000000000001 0000000000000000 00007fff579464dc
> >>>>  ffffffffffffffff 0000000000000004 ffff880046f01e18 ffffffff81096cb9
> >>>>  00000000ffff0124 0000000000000004 ffff880046f01e18 000000004f1ec7d1
> >>>> Call Trace:
> >>>>  [<ffffffff81096cb9>] try_to_freeze_tasks+0x1b9/0x2d0
> >>>>  [<ffffffff81096ed5>] freeze_kernel_threads+0x25/0x90
> >>>>  [<ffffffff81097b55>] hibernation_snapshot+0x75/0x2e0
> >>>>  [<ffffffff8109d724>] snapshot_ioctl+0x314/0x4e0
> >>>>  [<ffffffff81130856>] do_vfs_ioctl+0x96/0x550
> >>>>  [<ffffffff8111ff7b>] ? vfs_write+0x10b/0x180
> >>>>  [<ffffffff81130d5a>] sys_ioctl+0x4a/0x80
> >>>>  [<ffffffff81630e22>] system_call_fastpath+0x16/0x1b
> >>>> Code: c7 c6 0a a4 92 81 48 c7 c7 16 65 92 81 31 c0 e8 19 94 5a 00 0f 0b
> >>>> 48 c7 c6 27 a4 92 81 48 c7 c7 16 65 92 81 31 c0 e8 02 94 5a 00 <0f> 0b
> >>>> 66 0f 1f 84 00 00 00 00 00 55 48 c7 c7 82 4b b9 81 48 89
> >>>> RIP  [<ffffffff8107e365>] freeze_workqueues_begin+0x195/0x1a0
> >>>>  RSP <ffff880046f01d68>
> >>>> ---[ end trace 632574abdc098963 ]---
> >>>>
> >>>
> >>>
> >>> I couldn't find any obvious root-cause from a quick check. Is this completely
> >>> reproducible upon a fresh boot?
> >>
> >> True.
> >>
> >> The cause is that the function is called twice:
> > 
> > Which function?
> 
> The one where the BUG is. Maybe the functions which should clear the
> flag is not called in between? See:
> 
> >>  [<ffffffff8107e206>] freeze_workqueues_begin+0x36/0x1b0
>                          ^^^^^^^^^^^^^^^^^^^^^^^
> >>  [<ffffffff81096cc9>] try_to_freeze_tasks+0x1b9/0x2d0
> >>  [<ffffffff81096ee5>] freeze_kernel_threads+0x25/0x90
> >>  [<ffffffff81097b65>] hibernation_snapshot+0x75/0x2e0
> >>  [<ffffffff8109d734>] snapshot_ioctl+0x314/0x4e0
> >>  [<ffffffff81130866>] do_vfs_ioctl+0x96/0x550
> >>  [<ffffffff8111ff8b>] ? vfs_write+0x10b/0x180
> >>  [<ffffffff81130d6a>] sys_ioctl+0x4a/0x80
> >>  [<ffffffff81630e22>] system_call_fastpath+0x16/0x1b
> >> (elapsed 0.03 seconds) done.
> ...
> >> Freezing remaining freezable tasks ... BUG: 'workqueue_freezing' is true!
> >> ------------[ cut here ]------------
> >> kernel BUG at /l/latest/linux/kernel/workqueue.c:3659!
> ...
> >> RIP: 0010:[<ffffffff8107e371>]  [<ffffffff8107e371>
> >> freeze_workqueues_begin+0x1a1/0x1b0
>    ^^^^^^^^^^^^^^^^^^^^^^^
> >> Call Trace:
> >>  [<ffffffff81096cc9>] try_to_freeze_tasks+0x1b9/0x2d0
> >>  [<ffffffff81096ee5>] freeze_kernel_threads+0x25/0x90
> >>  [<ffffffff81097b65>] hibernation_snapshot+0x75/0x2e0
> >>  [<ffffffff8109d734>] snapshot_ioctl+0x314/0x4e0
> >>  [<ffffffff81130866>] do_vfs_ioctl+0x96/0x550
> >>  [<ffffffff8111ff8b>] ? vfs_write+0x10b/0x180
> >>  [<ffffffff81130d6a>] sys_ioctl+0x4a/0x80
> >>  [<ffffffff81630e22>] system_call_fastpath+0x16/0x1b

Ah.  So this is linux-next, right?

Can you please test the linux-next branch of the linux-pm tree and see if
the problem is reproducible in there?

Rafael

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: PM: cannot hibernate -- BUG at kernel/workqueue.c:3659
  2012-01-24 23:02         ` Rafael J. Wysocki
@ 2012-01-25  0:04           ` Jiri Slaby
  2012-01-25  0:10             ` Rafael J. Wysocki
  0 siblings, 1 reply; 15+ messages in thread
From: Jiri Slaby @ 2012-01-25  0:04 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux-pm mailing list, Jiri Slaby, LKML, Srivatsa S. Bhat

On 01/25/2012 12:02 AM, Rafael J. Wysocki wrote:
> On Tuesday, January 24, 2012, Jiri Slaby wrote:
>> On 01/24/2012 11:36 PM, Rafael J. Wysocki wrote:
>>> On Tuesday, January 24, 2012, Jiri Slaby wrote:
>>>> On 01/24/2012 05:18 PM, Srivatsa S. Bhat wrote:
>>>>> Hi Jiri,
>>>>>
>>>>> On 01/24/2012 08:35 PM, Jiri Slaby wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> this is a freshly booted system. When I do s2dsk, I see:
>>>>>> ...
>>>>>> Freezing remaining freezable tasks ... BUG: 'workqueue_freezing' is true!
>>>>>> ------------[ cut here ]------------
>>>>>> kernel BUG at /l/latest/linux/kernel/workqueue.c:3659!
>>>>>> invalid opcode: 0000 [#1] SMP
>>>>>> CPU 0
>>>>>> Modules linked in:
>>>>>>
>>>>>> Pid: 2669, comm: s2disk Not tainted 3.3.0-rc1-next-20120124_64+ #1627
>>>>>> Bochs Bochs
>>>>>> RIP: 0010:[<ffffffff8107e365>]  [<ffffffff8107e365>]
>>>>>> freeze_workqueues_begin+0x195/0x1a0
>>>>>> RSP: 0018:ffff880046f01d68  EFLAGS: 00010292
>>>>>> RAX: 0000000000000023 RBX: 0000000000000001 RCX: 00000000000000c9
>>>>>> RDX: 0000000000000077 RSI: 0000000000000046 RDI: ffffffff81b51f7c
>>>>>> RBP: ffff880046f01d98 R08: ffffffff81a9d760 R09: 0000000000000000
>>>>>> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
>>>>>> R13: 00007fff579464dc R14: ffffffffffffffff R15: 0000000000000004
>>>>>> FS:  00007f3c65d54700(0000) GS:ffff880049600000(0000) knlGS:0000000000000000
>>>>>> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>>>>>> CR2: 00007f3c64f58c20 CR3: 0000000045b64000 CR4: 00000000000006f0
>>>>>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>>>>> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>>>>>> Process s2disk (pid: 2669, threadinfo ffff880046f00000, task
>>>>>> ffff880047251980)
>>>>>> Stack:
>>>>>>  ffff880046f01d98 0000000000000001 0000000000000000 00007fff579464dc
>>>>>>  ffffffffffffffff 0000000000000004 ffff880046f01e18 ffffffff81096cb9
>>>>>>  00000000ffff0124 0000000000000004 ffff880046f01e18 000000004f1ec7d1
>>>>>> Call Trace:
>>>>>>  [<ffffffff81096cb9>] try_to_freeze_tasks+0x1b9/0x2d0
>>>>>>  [<ffffffff81096ed5>] freeze_kernel_threads+0x25/0x90
>>>>>>  [<ffffffff81097b55>] hibernation_snapshot+0x75/0x2e0
>>>>>>  [<ffffffff8109d724>] snapshot_ioctl+0x314/0x4e0
>>>>>>  [<ffffffff81130856>] do_vfs_ioctl+0x96/0x550
>>>>>>  [<ffffffff8111ff7b>] ? vfs_write+0x10b/0x180
>>>>>>  [<ffffffff81130d5a>] sys_ioctl+0x4a/0x80
>>>>>>  [<ffffffff81630e22>] system_call_fastpath+0x16/0x1b
>>>>>> Code: c7 c6 0a a4 92 81 48 c7 c7 16 65 92 81 31 c0 e8 19 94 5a 00 0f 0b
>>>>>> 48 c7 c6 27 a4 92 81 48 c7 c7 16 65 92 81 31 c0 e8 02 94 5a 00 <0f> 0b
>>>>>> 66 0f 1f 84 00 00 00 00 00 55 48 c7 c7 82 4b b9 81 48 89
>>>>>> RIP  [<ffffffff8107e365>] freeze_workqueues_begin+0x195/0x1a0
>>>>>>  RSP <ffff880046f01d68>
>>>>>> ---[ end trace 632574abdc098963 ]---
>>>>>>
>>>>>
>>>>>
>>>>> I couldn't find any obvious root-cause from a quick check. Is this completely
>>>>> reproducible upon a fresh boot?
>>>>
>>>> True.
>>>>
>>>> The cause is that the function is called twice:
>>>
>>> Which function?
>>
>> The one where the BUG is. Maybe the functions which should clear the
>> flag is not called in between? See:
>>
>>>>  [<ffffffff8107e206>] freeze_workqueues_begin+0x36/0x1b0
>>                          ^^^^^^^^^^^^^^^^^^^^^^^
>>>>  [<ffffffff81096cc9>] try_to_freeze_tasks+0x1b9/0x2d0
>>>>  [<ffffffff81096ee5>] freeze_kernel_threads+0x25/0x90
>>>>  [<ffffffff81097b65>] hibernation_snapshot+0x75/0x2e0
>>>>  [<ffffffff8109d734>] snapshot_ioctl+0x314/0x4e0
>>>>  [<ffffffff81130866>] do_vfs_ioctl+0x96/0x550
>>>>  [<ffffffff8111ff8b>] ? vfs_write+0x10b/0x180
>>>>  [<ffffffff81130d6a>] sys_ioctl+0x4a/0x80
>>>>  [<ffffffff81630e22>] system_call_fastpath+0x16/0x1b
>>>> (elapsed 0.03 seconds) done.
>> ...
>>>> Freezing remaining freezable tasks ... BUG: 'workqueue_freezing' is true!
>>>> ------------[ cut here ]------------
>>>> kernel BUG at /l/latest/linux/kernel/workqueue.c:3659!
>> ...
>>>> RIP: 0010:[<ffffffff8107e371>]  [<ffffffff8107e371>
>>>> freeze_workqueues_begin+0x1a1/0x1b0
>>    ^^^^^^^^^^^^^^^^^^^^^^^
>>>> Call Trace:
>>>>  [<ffffffff81096cc9>] try_to_freeze_tasks+0x1b9/0x2d0
>>>>  [<ffffffff81096ee5>] freeze_kernel_threads+0x25/0x90
>>>>  [<ffffffff81097b65>] hibernation_snapshot+0x75/0x2e0
>>>>  [<ffffffff8109d734>] snapshot_ioctl+0x314/0x4e0
>>>>  [<ffffffff81130866>] do_vfs_ioctl+0x96/0x550
>>>>  [<ffffffff8111ff8b>] ? vfs_write+0x10b/0x180
>>>>  [<ffffffff81130d6a>] sys_ioctl+0x4a/0x80
>>>>  [<ffffffff81630e22>] system_call_fastpath+0x16/0x1b
> 
> Ah.  So this is linux-next, right?

Right.

> Can you please test the linux-next branch of the linux-pm tree and see if
> the problem is reproducible in there?

Yeah, 100%. Just try it with a small enough swap.

thanks,
-- 
js

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: PM: cannot hibernate -- BUG at kernel/workqueue.c:3659
  2012-01-25  0:04           ` Jiri Slaby
@ 2012-01-25  0:10             ` Rafael J. Wysocki
  2012-01-25 14:25               ` Jiri Slaby
  2012-01-25 15:31               ` [linux-pm] " Srivatsa S. Bhat
  0 siblings, 2 replies; 15+ messages in thread
From: Rafael J. Wysocki @ 2012-01-25  0:10 UTC (permalink / raw)
  To: Jiri Slaby; +Cc: Linux-pm mailing list, Jiri Slaby, LKML, Srivatsa S. Bhat

On Wednesday, January 25, 2012, Jiri Slaby wrote:
> On 01/25/2012 12:02 AM, Rafael J. Wysocki wrote:
> > On Tuesday, January 24, 2012, Jiri Slaby wrote:
> >> On 01/24/2012 11:36 PM, Rafael J. Wysocki wrote:
> >>> On Tuesday, January 24, 2012, Jiri Slaby wrote:
> >>>> On 01/24/2012 05:18 PM, Srivatsa S. Bhat wrote:
> >>>>> Hi Jiri,
> >>>>>
> >>>>> On 01/24/2012 08:35 PM, Jiri Slaby wrote:
> >>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> this is a freshly booted system. When I do s2dsk, I see:
> >>>>>> ...
> >>>>>> Freezing remaining freezable tasks ... BUG: 'workqueue_freezing' is true!
> >>>>>> ------------[ cut here ]------------
> >>>>>> kernel BUG at /l/latest/linux/kernel/workqueue.c:3659!
> >>>>>> invalid opcode: 0000 [#1] SMP
> >>>>>> CPU 0
> >>>>>> Modules linked in:
> >>>>>>
> >>>>>> Pid: 2669, comm: s2disk Not tainted 3.3.0-rc1-next-20120124_64+ #1627
> >>>>>> Bochs Bochs
> >>>>>> RIP: 0010:[<ffffffff8107e365>]  [<ffffffff8107e365>]
> >>>>>> freeze_workqueues_begin+0x195/0x1a0
> >>>>>> RSP: 0018:ffff880046f01d68  EFLAGS: 00010292
> >>>>>> RAX: 0000000000000023 RBX: 0000000000000001 RCX: 00000000000000c9
> >>>>>> RDX: 0000000000000077 RSI: 0000000000000046 RDI: ffffffff81b51f7c
> >>>>>> RBP: ffff880046f01d98 R08: ffffffff81a9d760 R09: 0000000000000000
> >>>>>> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> >>>>>> R13: 00007fff579464dc R14: ffffffffffffffff R15: 0000000000000004
> >>>>>> FS:  00007f3c65d54700(0000) GS:ffff880049600000(0000) knlGS:0000000000000000
> >>>>>> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> >>>>>> CR2: 00007f3c64f58c20 CR3: 0000000045b64000 CR4: 00000000000006f0
> >>>>>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >>>>>> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> >>>>>> Process s2disk (pid: 2669, threadinfo ffff880046f00000, task
> >>>>>> ffff880047251980)
> >>>>>> Stack:
> >>>>>>  ffff880046f01d98 0000000000000001 0000000000000000 00007fff579464dc
> >>>>>>  ffffffffffffffff 0000000000000004 ffff880046f01e18 ffffffff81096cb9
> >>>>>>  00000000ffff0124 0000000000000004 ffff880046f01e18 000000004f1ec7d1
> >>>>>> Call Trace:
> >>>>>>  [<ffffffff81096cb9>] try_to_freeze_tasks+0x1b9/0x2d0
> >>>>>>  [<ffffffff81096ed5>] freeze_kernel_threads+0x25/0x90
> >>>>>>  [<ffffffff81097b55>] hibernation_snapshot+0x75/0x2e0
> >>>>>>  [<ffffffff8109d724>] snapshot_ioctl+0x314/0x4e0
> >>>>>>  [<ffffffff81130856>] do_vfs_ioctl+0x96/0x550
> >>>>>>  [<ffffffff8111ff7b>] ? vfs_write+0x10b/0x180
> >>>>>>  [<ffffffff81130d5a>] sys_ioctl+0x4a/0x80
> >>>>>>  [<ffffffff81630e22>] system_call_fastpath+0x16/0x1b
> >>>>>> Code: c7 c6 0a a4 92 81 48 c7 c7 16 65 92 81 31 c0 e8 19 94 5a 00 0f 0b
> >>>>>> 48 c7 c6 27 a4 92 81 48 c7 c7 16 65 92 81 31 c0 e8 02 94 5a 00 <0f> 0b
> >>>>>> 66 0f 1f 84 00 00 00 00 00 55 48 c7 c7 82 4b b9 81 48 89
> >>>>>> RIP  [<ffffffff8107e365>] freeze_workqueues_begin+0x195/0x1a0
> >>>>>>  RSP <ffff880046f01d68>
> >>>>>> ---[ end trace 632574abdc098963 ]---
> >>>>>>
> >>>>>
> >>>>>
> >>>>> I couldn't find any obvious root-cause from a quick check. Is this completely
> >>>>> reproducible upon a fresh boot?
> >>>>
> >>>> True.
> >>>>
> >>>> The cause is that the function is called twice:
> >>>
> >>> Which function?
> >>
> >> The one where the BUG is. Maybe the functions which should clear the
> >> flag is not called in between? See:
> >>
> >>>>  [<ffffffff8107e206>] freeze_workqueues_begin+0x36/0x1b0
> >>                          ^^^^^^^^^^^^^^^^^^^^^^^
> >>>>  [<ffffffff81096cc9>] try_to_freeze_tasks+0x1b9/0x2d0
> >>>>  [<ffffffff81096ee5>] freeze_kernel_threads+0x25/0x90
> >>>>  [<ffffffff81097b65>] hibernation_snapshot+0x75/0x2e0
> >>>>  [<ffffffff8109d734>] snapshot_ioctl+0x314/0x4e0
> >>>>  [<ffffffff81130866>] do_vfs_ioctl+0x96/0x550
> >>>>  [<ffffffff8111ff8b>] ? vfs_write+0x10b/0x180
> >>>>  [<ffffffff81130d6a>] sys_ioctl+0x4a/0x80
> >>>>  [<ffffffff81630e22>] system_call_fastpath+0x16/0x1b
> >>>> (elapsed 0.03 seconds) done.
> >> ...
> >>>> Freezing remaining freezable tasks ... BUG: 'workqueue_freezing' is true!
> >>>> ------------[ cut here ]------------
> >>>> kernel BUG at /l/latest/linux/kernel/workqueue.c:3659!
> >> ...
> >>>> RIP: 0010:[<ffffffff8107e371>]  [<ffffffff8107e371>
> >>>> freeze_workqueues_begin+0x1a1/0x1b0
> >>    ^^^^^^^^^^^^^^^^^^^^^^^
> >>>> Call Trace:
> >>>>  [<ffffffff81096cc9>] try_to_freeze_tasks+0x1b9/0x2d0
> >>>>  [<ffffffff81096ee5>] freeze_kernel_threads+0x25/0x90
> >>>>  [<ffffffff81097b65>] hibernation_snapshot+0x75/0x2e0
> >>>>  [<ffffffff8109d734>] snapshot_ioctl+0x314/0x4e0
> >>>>  [<ffffffff81130866>] do_vfs_ioctl+0x96/0x550
> >>>>  [<ffffffff8111ff8b>] ? vfs_write+0x10b/0x180
> >>>>  [<ffffffff81130d6a>] sys_ioctl+0x4a/0x80
> >>>>  [<ffffffff81630e22>] system_call_fastpath+0x16/0x1b
> > 
> > Ah.  So this is linux-next, right?
> 
> Right.
> 
> > Can you please test the linux-next branch of the linux-pm tree and see if
> > the problem is reproducible in there?
> 
> Yeah, 100%. Just try it with a small enough swap.

Ah, thanks, so that's an error code path problem and most likely in the Linus'
tree.

Srivatsa, any ideas?

Rafael

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: PM: cannot hibernate -- BUG at kernel/workqueue.c:3659
  2012-01-25  0:10             ` Rafael J. Wysocki
@ 2012-01-25 14:25               ` Jiri Slaby
  2012-01-25 15:31               ` [linux-pm] " Srivatsa S. Bhat
  1 sibling, 0 replies; 15+ messages in thread
From: Jiri Slaby @ 2012-01-25 14:25 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux-pm mailing list, Jiri Slaby, LKML, Srivatsa S. Bhat

On 01/25/2012 01:10 AM, Rafael J. Wysocki wrote:
> On Wednesday, January 25, 2012, Jiri Slaby wrote:
>> Yeah, 100%. Just try it with a small enough swap.
> 
> Ah, thanks, so that's an error code path problem and most likely in the Linus'
> tree.

Just to emphasize what I wrote earlier. I would apprectiate this fixed
in 3.2 stable too.

-- 
js

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [linux-pm] PM: cannot hibernate -- BUG at kernel/workqueue.c:3659
  2012-01-25  0:10             ` Rafael J. Wysocki
  2012-01-25 14:25               ` Jiri Slaby
@ 2012-01-25 15:31               ` Srivatsa S. Bhat
  2012-01-25 16:00                 ` Srivatsa S. Bhat
  2012-01-26 19:39                 ` Srivatsa S. Bhat
  1 sibling, 2 replies; 15+ messages in thread
From: Srivatsa S. Bhat @ 2012-01-25 15:31 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Jiri Slaby, Linux-pm mailing list, Jiri Slaby, LKML, Baohua.Song,
	Tejun Heo, pavel@ucw.cz

On 01/25/2012 05:40 AM, Rafael J. Wysocki wrote:

> On Wednesday, January 25, 2012, Jiri Slaby wrote:
>> On 01/25/2012 12:02 AM, Rafael J. Wysocki wrote:
>>> On Tuesday, January 24, 2012, Jiri Slaby wrote:
>>>> On 01/24/2012 11:36 PM, Rafael J. Wysocki wrote:
>>>>> On Tuesday, January 24, 2012, Jiri Slaby wrote:
>>>>>> On 01/24/2012 05:18 PM, Srivatsa S. Bhat wrote:
>>>>>>> Hi Jiri,
>>>>>>>
>>>>>>> On 01/24/2012 08:35 PM, Jiri Slaby wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> this is a freshly booted system. When I do s2dsk, I see:
>>>>>>>> ...
>>>>>>>> Freezing remaining freezable tasks ... BUG: 'workqueue_freezing' is true!
>>>>>>>> ------------[ cut here ]------------
>>>>>>>> kernel BUG at /l/latest/linux/kernel/workqueue.c:3659!
>>>>>>>> invalid opcode: 0000 [#1] SMP
>>>>>>>> CPU 0
>>>>>>>> Modules linked in:
>>>>>>>>
>>>>>>>> Pid: 2669, comm: s2disk Not tainted 3.3.0-rc1-next-20120124_64+ #1627
>>>>>>>> Bochs Bochs
>>>>>>>> RIP: 0010:[<ffffffff8107e365>]  [<ffffffff8107e365>]
>>>>>>>> freeze_workqueues_begin+0x195/0x1a0
>>>>>>>> RSP: 0018:ffff880046f01d68  EFLAGS: 00010292
>>>>>>>> RAX: 0000000000000023 RBX: 0000000000000001 RCX: 00000000000000c9
>>>>>>>> RDX: 0000000000000077 RSI: 0000000000000046 RDI: ffffffff81b51f7c
>>>>>>>> RBP: ffff880046f01d98 R08: ffffffff81a9d760 R09: 0000000000000000
>>>>>>>> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
>>>>>>>> R13: 00007fff579464dc R14: ffffffffffffffff R15: 0000000000000004
>>>>>>>> FS:  00007f3c65d54700(0000) GS:ffff880049600000(0000) knlGS:0000000000000000
>>>>>>>> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>>>>>>>> CR2: 00007f3c64f58c20 CR3: 0000000045b64000 CR4: 00000000000006f0
>>>>>>>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>>>>>>> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>>>>>>>> Process s2disk (pid: 2669, threadinfo ffff880046f00000, task
>>>>>>>> ffff880047251980)
>>>>>>>> Stack:
>>>>>>>>  ffff880046f01d98 0000000000000001 0000000000000000 00007fff579464dc
>>>>>>>>  ffffffffffffffff 0000000000000004 ffff880046f01e18 ffffffff81096cb9
>>>>>>>>  00000000ffff0124 0000000000000004 ffff880046f01e18 000000004f1ec7d1
>>>>>>>> Call Trace:
>>>>>>>>  [<ffffffff81096cb9>] try_to_freeze_tasks+0x1b9/0x2d0
>>>>>>>>  [<ffffffff81096ed5>] freeze_kernel_threads+0x25/0x90
>>>>>>>>  [<ffffffff81097b55>] hibernation_snapshot+0x75/0x2e0
>>>>>>>>  [<ffffffff8109d724>] snapshot_ioctl+0x314/0x4e0
>>>>>>>>  [<ffffffff81130856>] do_vfs_ioctl+0x96/0x550
>>>>>>>>  [<ffffffff8111ff7b>] ? vfs_write+0x10b/0x180
>>>>>>>>  [<ffffffff81130d5a>] sys_ioctl+0x4a/0x80
>>>>>>>>  [<ffffffff81630e22>] system_call_fastpath+0x16/0x1b
>>>>>>>> Code: c7 c6 0a a4 92 81 48 c7 c7 16 65 92 81 31 c0 e8 19 94 5a 00 0f 0b
>>>>>>>> 48 c7 c6 27 a4 92 81 48 c7 c7 16 65 92 81 31 c0 e8 02 94 5a 00 <0f> 0b
>>>>>>>> 66 0f 1f 84 00 00 00 00 00 55 48 c7 c7 82 4b b9 81 48 89
>>>>>>>> RIP  [<ffffffff8107e365>] freeze_workqueues_begin+0x195/0x1a0
>>>>>>>>  RSP <ffff880046f01d68>
>>>>>>>> ---[ end trace 632574abdc098963 ]---
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I couldn't find any obvious root-cause from a quick check. Is this completely
>>>>>>> reproducible upon a fresh boot?
>>>>>>
>>>>>> True.
>>>>>>
>>>>>> The cause is that the function is called twice:
>>>>>
>>>>> Which function?
>>>>
>>>> The one where the BUG is. Maybe the functions which should clear the
>>>> flag is not called in between? See:
>>>>
>>>>>>  [<ffffffff8107e206>] freeze_workqueues_begin+0x36/0x1b0
>>>>                          ^^^^^^^^^^^^^^^^^^^^^^^
>>>>>>  [<ffffffff81096cc9>] try_to_freeze_tasks+0x1b9/0x2d0
>>>>>>  [<ffffffff81096ee5>] freeze_kernel_threads+0x25/0x90
>>>>>>  [<ffffffff81097b65>] hibernation_snapshot+0x75/0x2e0
>>>>>>  [<ffffffff8109d734>] snapshot_ioctl+0x314/0x4e0
>>>>>>  [<ffffffff81130866>] do_vfs_ioctl+0x96/0x550
>>>>>>  [<ffffffff8111ff8b>] ? vfs_write+0x10b/0x180
>>>>>>  [<ffffffff81130d6a>] sys_ioctl+0x4a/0x80
>>>>>>  [<ffffffff81630e22>] system_call_fastpath+0x16/0x1b
>>>>>> (elapsed 0.03 seconds) done.
>>>> ...
>>>>>> Freezing remaining freezable tasks ... BUG: 'workqueue_freezing' is true!
>>>>>> ------------[ cut here ]------------
>>>>>> kernel BUG at /l/latest/linux/kernel/workqueue.c:3659!
>>>> ...
>>>>>> RIP: 0010:[<ffffffff8107e371>]  [<ffffffff8107e371>
>>>>>> freeze_workqueues_begin+0x1a1/0x1b0
>>>>    ^^^^^^^^^^^^^^^^^^^^^^^
>>>>>> Call Trace:
>>>>>>  [<ffffffff81096cc9>] try_to_freeze_tasks+0x1b9/0x2d0
>>>>>>  [<ffffffff81096ee5>] freeze_kernel_threads+0x25/0x90
>>>>>>  [<ffffffff81097b65>] hibernation_snapshot+0x75/0x2e0
>>>>>>  [<ffffffff8109d734>] snapshot_ioctl+0x314/0x4e0
>>>>>>  [<ffffffff81130866>] do_vfs_ioctl+0x96/0x550
>>>>>>  [<ffffffff8111ff8b>] ? vfs_write+0x10b/0x180
>>>>>>  [<ffffffff81130d6a>] sys_ioctl+0x4a/0x80
>>>>>>  [<ffffffff81630e22>] system_call_fastpath+0x16/0x1b
>>>
>>> Ah.  So this is linux-next, right?
>>
>> Right.
>>
>>> Can you please test the linux-next branch of the linux-pm tree and see if
>>> the problem is reproducible in there?
>>
>> Yeah, 100%. Just try it with a small enough swap.
> 
> Ah, thanks, so that's an error code path problem and most likely in the Linus'
> tree.
> 
> Srivatsa, any ideas?
> 


Ok, I will need to quote a part of the userspace utility to explain the
problem.

In suspend.c inside the suspend-utils userspace package, I see a loop such
as:

        error = freeze(snapshot_fd);
	...
        attempts = 2;
        do {
                if (set_image_size(snapshot_fd, image_size)) {
                        error = errno;
                        break;
                }
                if (atomic_snapshot(snapshot_fd, &in_suspend)) {
                        error = errno;
                        break;
                }
                if (!in_suspend) {
                        /* first unblank the console, see console_codes(4) */
                        printf("\e[13]");
                        printf("%s: returned to userspace\n", my_name);
                        free_snapshot(snapshot_fd);
                        break;
                }

                error = write_image(snapshot_fd, resume_fd, -1);
                if (error) {
                        free_swap_pages(snapshot_fd);
                        free_snapshot(snapshot_fd);
                        image_size = 0;
                        error = -error;
                        if (error != ENOSPC)
                                break;
                } else {
                        splash.progress(100);
#ifdef CONFIG_BOTH
                        if (s2ram_kms || s2ram) {
                                /* If we die (and allow system to continue)
                                 * between now and reset_signature(), very bad
                                 * things will happen. */
                                error = suspend_to_ram(snapshot_fd);
                                if (error)
                                        goto Shutdown;
                                reset_signature(resume_fd);
                                free_swap_pages(snapshot_fd);
                                free_snapshot(snapshot_fd);
                                if (!s2ram_kms)
                                        s2ram_resume();
                                goto Unfreeze;
                        }
Shutdown:
#endif
                        close(resume_fd);
                        suspend_shutdown(snapshot_fd);
                }
        } while (--attempts);

...
Unfreeze:
        unfreeze(snapshot_fd);


Let me reply to this thread so that I can comment on the above code.

Regards,
Srivatsa S. Bhat

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [linux-pm] PM: cannot hibernate -- BUG at kernel/workqueue.c:3659
  2012-01-25 15:31               ` [linux-pm] " Srivatsa S. Bhat
@ 2012-01-25 16:00                 ` Srivatsa S. Bhat
  2012-01-25 18:44                   ` Srivatsa S. Bhat
  2012-01-25 22:00                   ` Jiri Slaby
  2012-01-26 19:39                 ` Srivatsa S. Bhat
  1 sibling, 2 replies; 15+ messages in thread
From: Srivatsa S. Bhat @ 2012-01-25 16:00 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Jiri Slaby, Linux-pm mailing list, Jiri Slaby, LKML, Baohua.Song,
	Tejun Heo, pavel@ucw.cz

On 01/25/2012 09:01 PM, Srivatsa S. Bhat wrote:

> On 01/25/2012 05:40 AM, Rafael J. Wysocki wrote:
> 
>> On Wednesday, January 25, 2012, Jiri Slaby wrote:
>>> On 01/25/2012 12:02 AM, Rafael J. Wysocki wrote:
>>>> On Tuesday, January 24, 2012, Jiri Slaby wrote:
>>>>> On 01/24/2012 11:36 PM, Rafael J. Wysocki wrote:
>>>>>> On Tuesday, January 24, 2012, Jiri Slaby wrote:
>>>>>>> On 01/24/2012 05:18 PM, Srivatsa S. Bhat wrote:
>>>>>>>> Hi Jiri,
>>>>>>>>
>>>>>>>> On 01/24/2012 08:35 PM, Jiri Slaby wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> this is a freshly booted system. When I do s2dsk, I see:
>>>>>>>>> ...
>>>>>>>>> Freezing remaining freezable tasks ... BUG: 'workqueue_freezing' is true!
>>>>>>>>> ------------[ cut here ]------------
>>>>>>>>> kernel BUG at /l/latest/linux/kernel/workqueue.c:3659!
>>>>>>>>> invalid opcode: 0000 [#1] SMP
>>>>>>>>> CPU 0
>>>>>>>>> Modules linked in:
>>>>>>>>>
>>>>>>>>> Pid: 2669, comm: s2disk Not tainted 3.3.0-rc1-next-20120124_64+ #1627
>>>>>>>>> Bochs Bochs
>>>>>>>>> RIP: 0010:[<ffffffff8107e365>]  [<ffffffff8107e365>]
>>>>>>>>> freeze_workqueues_begin+0x195/0x1a0
>>>>>>>>> RSP: 0018:ffff880046f01d68  EFLAGS: 00010292
>>>>>>>>> RAX: 0000000000000023 RBX: 0000000000000001 RCX: 00000000000000c9
>>>>>>>>> RDX: 0000000000000077 RSI: 0000000000000046 RDI: ffffffff81b51f7c
>>>>>>>>> RBP: ffff880046f01d98 R08: ffffffff81a9d760 R09: 0000000000000000
>>>>>>>>> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
>>>>>>>>> R13: 00007fff579464dc R14: ffffffffffffffff R15: 0000000000000004
>>>>>>>>> FS:  00007f3c65d54700(0000) GS:ffff880049600000(0000) knlGS:0000000000000000
>>>>>>>>> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>>>>>>>>> CR2: 00007f3c64f58c20 CR3: 0000000045b64000 CR4: 00000000000006f0
>>>>>>>>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>>>>>>>> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>>>>>>>>> Process s2disk (pid: 2669, threadinfo ffff880046f00000, task
>>>>>>>>> ffff880047251980)
>>>>>>>>> Stack:
>>>>>>>>>  ffff880046f01d98 0000000000000001 0000000000000000 00007fff579464dc
>>>>>>>>>  ffffffffffffffff 0000000000000004 ffff880046f01e18 ffffffff81096cb9
>>>>>>>>>  00000000ffff0124 0000000000000004 ffff880046f01e18 000000004f1ec7d1
>>>>>>>>> Call Trace:
>>>>>>>>>  [<ffffffff81096cb9>] try_to_freeze_tasks+0x1b9/0x2d0
>>>>>>>>>  [<ffffffff81096ed5>] freeze_kernel_threads+0x25/0x90
>>>>>>>>>  [<ffffffff81097b55>] hibernation_snapshot+0x75/0x2e0
>>>>>>>>>  [<ffffffff8109d724>] snapshot_ioctl+0x314/0x4e0
>>>>>>>>>  [<ffffffff81130856>] do_vfs_ioctl+0x96/0x550
>>>>>>>>>  [<ffffffff8111ff7b>] ? vfs_write+0x10b/0x180
>>>>>>>>>  [<ffffffff81130d5a>] sys_ioctl+0x4a/0x80
>>>>>>>>>  [<ffffffff81630e22>] system_call_fastpath+0x16/0x1b
>>>>>>>>> Code: c7 c6 0a a4 92 81 48 c7 c7 16 65 92 81 31 c0 e8 19 94 5a 00 0f 0b
>>>>>>>>> 48 c7 c6 27 a4 92 81 48 c7 c7 16 65 92 81 31 c0 e8 02 94 5a 00 <0f> 0b
>>>>>>>>> 66 0f 1f 84 00 00 00 00 00 55 48 c7 c7 82 4b b9 81 48 89
>>>>>>>>> RIP  [<ffffffff8107e365>] freeze_workqueues_begin+0x195/0x1a0
>>>>>>>>>  RSP <ffff880046f01d68>
>>>>>>>>> ---[ end trace 632574abdc098963 ]---
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> I couldn't find any obvious root-cause from a quick check. Is this completely
>>>>>>>> reproducible upon a fresh boot?
>>>>>>>
>>>>>>> True.
>>>>>>>
>>>>>>> The cause is that the function is called twice:
>>>>>>
>>>>>> Which function?
>>>>>
>>>>> The one where the BUG is. Maybe the functions which should clear the
>>>>> flag is not called in between? See:
>>>>>
>>>>>>>  [<ffffffff8107e206>] freeze_workqueues_begin+0x36/0x1b0
>>>>>                          ^^^^^^^^^^^^^^^^^^^^^^^
>>>>>>>  [<ffffffff81096cc9>] try_to_freeze_tasks+0x1b9/0x2d0
>>>>>>>  [<ffffffff81096ee5>] freeze_kernel_threads+0x25/0x90
>>>>>>>  [<ffffffff81097b65>] hibernation_snapshot+0x75/0x2e0
>>>>>>>  [<ffffffff8109d734>] snapshot_ioctl+0x314/0x4e0
>>>>>>>  [<ffffffff81130866>] do_vfs_ioctl+0x96/0x550
>>>>>>>  [<ffffffff8111ff8b>] ? vfs_write+0x10b/0x180
>>>>>>>  [<ffffffff81130d6a>] sys_ioctl+0x4a/0x80
>>>>>>>  [<ffffffff81630e22>] system_call_fastpath+0x16/0x1b
>>>>>>> (elapsed 0.03 seconds) done.
>>>>> ...
>>>>>>> Freezing remaining freezable tasks ... BUG: 'workqueue_freezing' is true!
>>>>>>> ------------[ cut here ]------------
>>>>>>> kernel BUG at /l/latest/linux/kernel/workqueue.c:3659!
>>>>> ...
>>>>>>> RIP: 0010:[<ffffffff8107e371>]  [<ffffffff8107e371>
>>>>>>> freeze_workqueues_begin+0x1a1/0x1b0
>>>>>    ^^^^^^^^^^^^^^^^^^^^^^^
>>>>>>> Call Trace:
>>>>>>>  [<ffffffff81096cc9>] try_to_freeze_tasks+0x1b9/0x2d0
>>>>>>>  [<ffffffff81096ee5>] freeze_kernel_threads+0x25/0x90
>>>>>>>  [<ffffffff81097b65>] hibernation_snapshot+0x75/0x2e0
>>>>>>>  [<ffffffff8109d734>] snapshot_ioctl+0x314/0x4e0
>>>>>>>  [<ffffffff81130866>] do_vfs_ioctl+0x96/0x550
>>>>>>>  [<ffffffff8111ff8b>] ? vfs_write+0x10b/0x180
>>>>>>>  [<ffffffff81130d6a>] sys_ioctl+0x4a/0x80
>>>>>>>  [<ffffffff81630e22>] system_call_fastpath+0x16/0x1b
>>>>
>>>> Ah.  So this is linux-next, right?
>>>
>>> Right.
>>>
>>>> Can you please test the linux-next branch of the linux-pm tree and see if
>>>> the problem is reproducible in there?
>>>
>>> Yeah, 100%. Just try it with a small enough swap.
>>
>> Ah, thanks, so that's an error code path problem and most likely in the Linus'
>> tree.
>>
>> Srivatsa, any ideas?
>>
> 
> 
> Ok, I will need to quote a part of the userspace utility to explain the
> problem.
> 




Commit 2aede851 (PM / Hibernate: Freeze kernel threads after preallocating
memory) split up the freezing phase and moved the freezing of kernel threads
to hibernation_snapshot() function.
(This is a pretty old commit btw, I think it came in sometime around 3.2).

And there is a BUG_ON() in freeze_workqueues_begin() which complains if
this function is called when the kernel threads are already frozen.

Now, coming to the userspace utility:

> In suspend.c inside the suspend-utils userspace package, I see a loop such
> as:
> 
>         error = freeze(snapshot_fd);
> 	...
>         attempts = 2;
>         do {
>                 if (set_image_size(snapshot_fd, image_size)) {
>                         error = errno;
>                         break;
>                 }
>                 if (atomic_snapshot(snapshot_fd, &in_suspend)) {



                        ^^^ 
        This ends up in the call to hibernation_snapshot().

>                         error = errno;
>                         break;
>                 }
>                 if (!in_suspend) {
>                         /* first unblank the console, see console_codes(4) */
>                         printf("\e[13]");
>                         printf("%s: returned to userspace\n", my_name);
>                         free_snapshot(snapshot_fd);
>                         break;
>                 }
> 
>                 error = write_image(snapshot_fd, resume_fd, -1);
>                 if (error) {




        Since swap space is limited, we come here.

>                         free_swap_pages(snapshot_fd);
>                         free_snapshot(snapshot_fd);
>                         image_size = 0;
>                         error = -error;
>                         if (error != ENOSPC)
>                                 break;




        And because of the above 'if' condition, we don't break here.
        IOW, we will run the loop a second time.

>                 } else {
>                         splash.progress(100);
> #ifdef CONFIG_BOTH
>                         if (s2ram_kms || s2ram) {
>                                 /* If we die (and allow system to continue)
>                                  * between now and reset_signature(), very bad
>                                  * things will happen. */
>                                 error = suspend_to_ram(snapshot_fd);
>                                 if (error)
>                                         goto Shutdown;
>                                 reset_signature(resume_fd);
>                                 free_swap_pages(snapshot_fd);
>                                 free_snapshot(snapshot_fd);
>                                 if (!s2ram_kms)
>                                         s2ram_resume();
>                                 goto Unfreeze;
>                         }
> Shutdown:
> #endif
>                         close(resume_fd);
>                         suspend_shutdown(snapshot_fd);
>                 }
>         } while (--attempts);
> 
> ...
> Unfreeze:
>         unfreeze(snapshot_fd);
> 
> 


So, since the loop is run a second time if not enough space was available,
we end up calling hibernation_snapshot(), IOW freeze_workqueues_begin() a
second time - and this is what makes the BUG_ON() to fire!

To solve this, I feel we can simply omit the BUG_ON() inside
freeze_workqueues_begin() and replace it with a simple check (just like
what we do in thaw_workqueues())...

And moreover, after the change that moved the freezing of kernel threads
to hibernation_snapshot(), it is quite natural to hit a scenario such
as this, because the entire freezing phase is not in one single place.
IOW, the existing BUG_ON() is not qualified to be there anymore!

(Also, exposing a straight-forward method like this to userspace, to
trigger a BUG_ON sounds ridiculous to start with!)

Of course, the other method would be to refactor the kernel code and stuff
like that, but that not only would be messy but it would also involve
breaking existing userspace applications, AFAICS.



So, Jiri, can you please try the following patch and see if it works for
you as expected? I'll be happy to provide a formal patch with a changelog
if this works.


---
 kernel/workqueue.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index bec7b5b..cb26c5d 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -3656,7 +3656,9 @@ void freeze_workqueues_begin(void)
 
 	spin_lock(&workqueue_lock);
 
-	BUG_ON(workqueue_freezing);
+	if (workqueue_freezing)
+		goto out_unlock;
+
 	workqueue_freezing = true;
 
 	for_each_gcwq_cpu(cpu) {
@@ -3678,6 +3680,7 @@ void freeze_workqueues_begin(void)
 		spin_unlock_irq(&gcwq->lock);
 	}
 
+out_unlock:
 	spin_unlock(&workqueue_lock);
 }
 
-- 
1.7.6

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: PM: cannot hibernate -- BUG at kernel/workqueue.c:3659
  2012-01-25 16:00                 ` Srivatsa S. Bhat
@ 2012-01-25 18:44                   ` Srivatsa S. Bhat
  2012-01-25 22:00                   ` Jiri Slaby
  1 sibling, 0 replies; 15+ messages in thread
From: Srivatsa S. Bhat @ 2012-01-25 18:44 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Jiri Slaby, Linux PM mailing list, LKML, Baohua.Song, Tejun Heo,
	Linux-pm mailing list, Jiri Slaby

> 

> Commit 2aede851 (PM / Hibernate: Freeze kernel threads after preallocating
> memory) split up the freezing phase and moved the freezing of kernel threads
> to hibernation_snapshot() function.
> (This is a pretty old commit btw, I think it came in sometime around 3.2).
> 
> And there is a BUG_ON() in freeze_workqueues_begin() which complains if
> this function is called when the kernel threads are already frozen.
> 
> Now, coming to the userspace utility:
> 
>> In suspend.c inside the suspend-utils userspace package, I see a loop such
>> as:
>>
>>         error = freeze(snapshot_fd);
>> 	...
>>         attempts = 2;
>>         do {
>>                 if (set_image_size(snapshot_fd, image_size)) {
>>                         error = errno;
>>                         break;
>>                 }
>>                 if (atomic_snapshot(snapshot_fd, &in_suspend)) {
>                         ^^^ 
>         This ends up in the call to hibernation_snapshot().
> 
>>                         error = errno;
>>                         break;
>>                 }
>>                 if (!in_suspend) {
>>                         /* first unblank the console, see console_codes(4) */
>>                         printf("\e[13]");
>>                         printf("%s: returned to userspace\n", my_name);
>>                         free_snapshot(snapshot_fd);
>>                         break;
>>                 }
>>
>>                 error = write_image(snapshot_fd, resume_fd, -1);
>>                 if (error) {
> 
>         Since swap space is limited, we come here.
> 
>>                         free_swap_pages(snapshot_fd);
>>                         free_snapshot(snapshot_fd);

				^^^
	This function makes the ioctl call SNAPSHOT_FREE.

>>                         image_size = 0;
>>                         error = -error;
>>                         if (error != ENOSPC)
>>                                 break;
> 
> 
> 
> 
>         And because of the above 'if' condition, we don't break here.
>         IOW, we will run the loop a second time.
> 
>>                 } else {
>>                         splash.progress(100);
>> #ifdef CONFIG_BOTH
>>                         if (s2ram_kms || s2ram) {
>>                                 /* If we die (and allow system to continue)
>>                                  * between now and reset_signature(), very bad
>>                                  * things will happen. */
>>                                 error = suspend_to_ram(snapshot_fd);
>>                                 if (error)
>>                                         goto Shutdown;
>>                                 reset_signature(resume_fd);
>>                                 free_swap_pages(snapshot_fd);
>>                                 free_snapshot(snapshot_fd);
>>                                 if (!s2ram_kms)
>>                                         s2ram_resume();
>>                                 goto Unfreeze;
>>                         }
>> Shutdown:
>> #endif
>>                         close(resume_fd);
>>                         suspend_shutdown(snapshot_fd);
>>                 }
>>         } while (--attempts);
>>
>> ...
>> Unfreeze:
>>         unfreeze(snapshot_fd);
>>
>>
> 
> 
> So, since the loop is run a second time if not enough space was available,
> we end up calling hibernation_snapshot(), IOW freeze_workqueues_begin() a
> second time - and this is what makes the BUG_ON() to fire!
> 


SNAPSHOT_CREATE_IMAGE has a check for data->ready such as:

        if (data->mode != O_RDONLY || !data->frozen  || data->ready) {
                error = -EPERM;
                break;
        }

data->ready would be set to 1 only under SNAPSHOT_CREATE_IMAGE. However,
SNAPSHOT_FREE (invoked at the place shown above) will reset the value to 0.
This makes it possible for hibernation_snapshot() and hence
freeze_workqueues_begin() to be called a second time, which is unfortunate.

And actually, the patch I posted in my previous mail is not really the right
long-term fix, though it might fix the particular issue that Jiri is facing..

Because, allowing hibernation_snapshot() to get called a second time while
kernel threads are still frozen brings us to the same situation that commit
2aede851 (PM / Hibernate: Freeze kernel threads after preallocating memory)
tried to prevent! IOW, a call to hibernate_preallocate_memory() would be
done inside hibernation_snapshot(), when kernel threads are frozen.. which
is known to break XFS, to give one example as mentioned in the changelog
of the above commit.

So, the right way to fix this IMHO, would be to split up thaw_processes()
just like freezing phase:

/* freezes or thaws user space processes */
freeze_processes() - thaw_processes()

/* freezes or thaws kernel threads */
freeze_kernel_threads() - thaw_kernel_threads()

We have to insert this thaw_kernel_threads() at appropriate places in such a
way as to not require another ioctl if possible... Then things would be
more symmetric (and hence more easy to understand) and we can avoid getting
into strange situations as discussed here.

But before we venture into that, it would be good to know if the patch posted
in the previous mail fixes the particular problem reported in this thread,
atleast just to see if there are other problems lurking that we aren't aware
of yet..

Regards,
Srivatsa S. Bhat

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: PM: cannot hibernate -- BUG at kernel/workqueue.c:3659
  2012-01-25 16:00                 ` Srivatsa S. Bhat
  2012-01-25 18:44                   ` Srivatsa S. Bhat
@ 2012-01-25 22:00                   ` Jiri Slaby
  1 sibling, 0 replies; 15+ messages in thread
From: Jiri Slaby @ 2012-01-25 22:00 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: LKML, Baohua.Song, Tejun Heo, Linux-pm mailing list, Jiri Slaby

On 01/25/2012 05:00 PM, Srivatsa S. Bhat wrote:
> So, Jiri, can you please try the following patch and see if it works for
> you as expected? I'll be happy to provide a formal patch with a changelog
> if this works.

FWIW it works, thanks. If you want me test a proper fix, jsut let me know.

> ---
>  kernel/workqueue.c |    5 ++++-
>  1 files changed, 4 insertions(+), 1 deletions(-)
> 
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index bec7b5b..cb26c5d 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -3656,7 +3656,9 @@ void freeze_workqueues_begin(void)
>  
>  	spin_lock(&workqueue_lock);
>  
> -	BUG_ON(workqueue_freezing);
> +	if (workqueue_freezing)
> +		goto out_unlock;
> +
>  	workqueue_freezing = true;
>  
>  	for_each_gcwq_cpu(cpu) {
> @@ -3678,6 +3680,7 @@ void freeze_workqueues_begin(void)
>  		spin_unlock_irq(&gcwq->lock);
>  	}
>  
> +out_unlock:
>  	spin_unlock(&workqueue_lock);
>  }
>  


-- 
js

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: PM: cannot hibernate -- BUG at kernel/workqueue.c:3659
  2012-01-25 15:31               ` [linux-pm] " Srivatsa S. Bhat
  2012-01-25 16:00                 ` Srivatsa S. Bhat
@ 2012-01-26 19:39                 ` Srivatsa S. Bhat
  2012-01-27  1:10                   ` Rafael J. Wysocki
  1 sibling, 1 reply; 15+ messages in thread
From: Srivatsa S. Bhat @ 2012-01-26 19:39 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Jiri Slaby, LKML, Baohua.Song, Tejun Heo, Linux-pm mailing list,
	Jiri Slaby


Hi Rafael,

On 01/25/2012 09:01 PM, Srivatsa S. Bhat wrote:

> 
> Ok, I will need to quote a part of the userspace utility to explain the
> problem.
> 
> In suspend.c inside the suspend-utils userspace package, I see a loop such
> as:
> 
>         error = freeze(snapshot_fd);
> 	...
>         attempts = 2;
>         do {
>                 if (set_image_size(snapshot_fd, image_size)) {
>                         error = errno;
>                         break;
>                 }
>                 if (atomic_snapshot(snapshot_fd, &in_suspend)) {
>                         error = errno;
>                         break;
>                 }
>                 if (!in_suspend) {
>                         /* first unblank the console, see console_codes(4) */
>                         printf("\e[13]");
>                         printf("%s: returned to userspace\n", my_name);
>                         free_snapshot(snapshot_fd);
>                         break;
>                 }
> 
>                 error = write_image(snapshot_fd, resume_fd, -1);
>                 if (error) {
>                         free_swap_pages(snapshot_fd);
>                         free_snapshot(snapshot_fd);
>                         image_size = 0;
>                         error = -error;
>                         if (error != ENOSPC)
>                                 break;
>                 } else {
>                         splash.progress(100);
> #ifdef CONFIG_BOTH
>                         if (s2ram_kms || s2ram) {
>                                 /* If we die (and allow system to continue)
>                                  * between now and reset_signature(), very bad
>                                  * things will happen. */
>                                 error = suspend_to_ram(snapshot_fd);
>                                 if (error)
>                                         goto Shutdown;
>                                 reset_signature(resume_fd);
>                                 free_swap_pages(snapshot_fd);
>                                 free_snapshot(snapshot_fd);
>                                 if (!s2ram_kms)
>                                         s2ram_resume();


Your patch alters how SNAPSHOT_FREE (IOW, free_snapshot() in this utility) is
handled. So, I was trying to see if there are any points of concern...

In the above code, s2ram_resume() gets invoked after free_snapshot(). Will that
pose any problems because kernel threads would have been thawed at that point,
after applying your patch?

And other than that, do you foresee any problems arising from the change caused
to SNAPSHOT_FREE by your patch? I mean, s2ram/s2disk/suspend-utils package are
not the only userspace utilities after all... so I just wanted to ensure that
we don't over-fit our solution to this particular utility and end up breaking
others...

Regards,
Srivatsa S. Bhat

>                                 goto Unfreeze;
>                         }
> Shutdown:
> #endif
>                         close(resume_fd);
>                         suspend_shutdown(snapshot_fd);
>                 }
>         } while (--attempts);
> 
> ...
> Unfreeze:
>         unfreeze(snapshot_fd);
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: PM: cannot hibernate -- BUG at kernel/workqueue.c:3659
  2012-01-26 19:39                 ` Srivatsa S. Bhat
@ 2012-01-27  1:10                   ` Rafael J. Wysocki
  0 siblings, 0 replies; 15+ messages in thread
From: Rafael J. Wysocki @ 2012-01-27  1:10 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: Jiri Slaby, LKML, Baohua.Song, Tejun Heo, Linux-pm mailing list,
	Jiri Slaby

On Thursday, January 26, 2012, Srivatsa S. Bhat wrote:
> 
> Hi Rafael,
> 
> On 01/25/2012 09:01 PM, Srivatsa S. Bhat wrote:
> 
> > 
> > Ok, I will need to quote a part of the userspace utility to explain the
> > problem.
> > 
> > In suspend.c inside the suspend-utils userspace package, I see a loop such
> > as:
> > 
> >         error = freeze(snapshot_fd);
> > 	...
> >         attempts = 2;
> >         do {
> >                 if (set_image_size(snapshot_fd, image_size)) {
> >                         error = errno;
> >                         break;
> >                 }
> >                 if (atomic_snapshot(snapshot_fd, &in_suspend)) {
> >                         error = errno;
> >                         break;
> >                 }
> >                 if (!in_suspend) {
> >                         /* first unblank the console, see console_codes(4) */
> >                         printf("\e[13]");
> >                         printf("%s: returned to userspace\n", my_name);
> >                         free_snapshot(snapshot_fd);
> >                         break;
> >                 }
> > 
> >                 error = write_image(snapshot_fd, resume_fd, -1);
> >                 if (error) {
> >                         free_swap_pages(snapshot_fd);
> >                         free_snapshot(snapshot_fd);
> >                         image_size = 0;
> >                         error = -error;
> >                         if (error != ENOSPC)
> >                                 break;
> >                 } else {
> >                         splash.progress(100);
> > #ifdef CONFIG_BOTH
> >                         if (s2ram_kms || s2ram) {
> >                                 /* If we die (and allow system to continue)
> >                                  * between now and reset_signature(), very bad
> >                                  * things will happen. */
> >                                 error = suspend_to_ram(snapshot_fd);
> >                                 if (error)
> >                                         goto Shutdown;
> >                                 reset_signature(resume_fd);
> >                                 free_swap_pages(snapshot_fd);
> >                                 free_snapshot(snapshot_fd);
> >                                 if (!s2ram_kms)
> >                                         s2ram_resume();
> 
> 
> Your patch alters how SNAPSHOT_FREE (IOW, free_snapshot() in this utility) is
> handled. So, I was trying to see if there are any points of concern...
> 
> In the above code, s2ram_resume() gets invoked after free_snapshot(). Will that
> pose any problems because kernel threads would have been thawed at that point,
> after applying your patch?

No, it shouldn't.  s2ram_resume() only executes quirks needed to restore the
state of graphics if KMS is not being used.  That shouldn't interfere with
any kernel threads.

> And other than that, do you foresee any problems arising from the change caused
> to SNAPSHOT_FREE by your patch? I mean, s2ram/s2disk/suspend-utils package are
> not the only userspace utilities after all... so I just wanted to ensure that
> we don't over-fit our solution to this particular utility and end up breaking
> others...

I'm quite sure they are the only package using the interface in
kernel/power/user.c.  At least, I'm not aware of any other users. :-)

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2012-01-27  1:10 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-01-24 15:05 PM: cannot hibernate -- BUG at kernel/workqueue.c:3659 Jiri Slaby
2012-01-24 16:18 ` [linux-pm] " Srivatsa S. Bhat
2012-01-24 16:24   ` Jiri Slaby
2012-01-24 22:36     ` Rafael J. Wysocki
2012-01-24 22:47       ` Jiri Slaby
2012-01-24 23:02         ` Rafael J. Wysocki
2012-01-25  0:04           ` Jiri Slaby
2012-01-25  0:10             ` Rafael J. Wysocki
2012-01-25 14:25               ` Jiri Slaby
2012-01-25 15:31               ` [linux-pm] " Srivatsa S. Bhat
2012-01-25 16:00                 ` Srivatsa S. Bhat
2012-01-25 18:44                   ` Srivatsa S. Bhat
2012-01-25 22:00                   ` Jiri Slaby
2012-01-26 19:39                 ` Srivatsa S. Bhat
2012-01-27  1:10                   ` Rafael J. Wysocki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).