From: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
To: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Jiri Slaby <jirislaby@gmail.com>,
Linux-pm mailing list <linux-pm@lists.linux-foundation.org>,
Jiri Slaby <jslaby@suse.cz>, LKML <linux-kernel@vger.kernel.org>,
Baohua.Song@csr.com, Tejun Heo <tj@kernel.org>,
"pavel@ucw.cz" <pavel@ucw.cz>
Subject: Re: [linux-pm] PM: cannot hibernate -- BUG at kernel/workqueue.c:3659
Date: Wed, 25 Jan 2012 21:01:27 +0530
Message-ID: <4F20204F.6040606@linux.vnet.ibm.com>
In-Reply-To: <201201250110.44360.rjw@sisk.pl>
On 01/25/2012 05:40 AM, Rafael J. Wysocki wrote:
> On Wednesday, January 25, 2012, Jiri Slaby wrote:
>> On 01/25/2012 12:02 AM, Rafael J. Wysocki wrote:
>>> On Tuesday, January 24, 2012, Jiri Slaby wrote:
>>>> On 01/24/2012 11:36 PM, Rafael J. Wysocki wrote:
>>>>> On Tuesday, January 24, 2012, Jiri Slaby wrote:
>>>>>> On 01/24/2012 05:18 PM, Srivatsa S. Bhat wrote:
>>>>>>> Hi Jiri,
>>>>>>>
>>>>>>> On 01/24/2012 08:35 PM, Jiri Slaby wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> this is a freshly booted system. When I do s2dsk, I see:
>>>>>>>> ...
>>>>>>>> Freezing remaining freezable tasks ... BUG: 'workqueue_freezing' is true!
>>>>>>>> ------------[ cut here ]------------
>>>>>>>> kernel BUG at /l/latest/linux/kernel/workqueue.c:3659!
>>>>>>>> invalid opcode: 0000 [#1] SMP
>>>>>>>> CPU 0
>>>>>>>> Modules linked in:
>>>>>>>>
>>>>>>>> Pid: 2669, comm: s2disk Not tainted 3.3.0-rc1-next-20120124_64+ #1627
>>>>>>>> Bochs Bochs
>>>>>>>> RIP: 0010:[<ffffffff8107e365>] [<ffffffff8107e365>]
>>>>>>>> freeze_workqueues_begin+0x195/0x1a0
>>>>>>>> RSP: 0018:ffff880046f01d68 EFLAGS: 00010292
>>>>>>>> RAX: 0000000000000023 RBX: 0000000000000001 RCX: 00000000000000c9
>>>>>>>> RDX: 0000000000000077 RSI: 0000000000000046 RDI: ffffffff81b51f7c
>>>>>>>> RBP: ffff880046f01d98 R08: ffffffff81a9d760 R09: 0000000000000000
>>>>>>>> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
>>>>>>>> R13: 00007fff579464dc R14: ffffffffffffffff R15: 0000000000000004
>>>>>>>> FS: 00007f3c65d54700(0000) GS:ffff880049600000(0000) knlGS:0000000000000000
>>>>>>>> CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>>>>>>>> CR2: 00007f3c64f58c20 CR3: 0000000045b64000 CR4: 00000000000006f0
>>>>>>>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>>>>>>> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>>>>>>>> Process s2disk (pid: 2669, threadinfo ffff880046f00000, task
>>>>>>>> ffff880047251980)
>>>>>>>> Stack:
>>>>>>>> ffff880046f01d98 0000000000000001 0000000000000000 00007fff579464dc
>>>>>>>> ffffffffffffffff 0000000000000004 ffff880046f01e18 ffffffff81096cb9
>>>>>>>> 00000000ffff0124 0000000000000004 ffff880046f01e18 000000004f1ec7d1
>>>>>>>> Call Trace:
>>>>>>>> [<ffffffff81096cb9>] try_to_freeze_tasks+0x1b9/0x2d0
>>>>>>>> [<ffffffff81096ed5>] freeze_kernel_threads+0x25/0x90
>>>>>>>> [<ffffffff81097b55>] hibernation_snapshot+0x75/0x2e0
>>>>>>>> [<ffffffff8109d724>] snapshot_ioctl+0x314/0x4e0
>>>>>>>> [<ffffffff81130856>] do_vfs_ioctl+0x96/0x550
>>>>>>>> [<ffffffff8111ff7b>] ? vfs_write+0x10b/0x180
>>>>>>>> [<ffffffff81130d5a>] sys_ioctl+0x4a/0x80
>>>>>>>> [<ffffffff81630e22>] system_call_fastpath+0x16/0x1b
>>>>>>>> Code: c7 c6 0a a4 92 81 48 c7 c7 16 65 92 81 31 c0 e8 19 94 5a 00 0f 0b
>>>>>>>> 48 c7 c6 27 a4 92 81 48 c7 c7 16 65 92 81 31 c0 e8 02 94 5a 00 <0f> 0b
>>>>>>>> 66 0f 1f 84 00 00 00 00 00 55 48 c7 c7 82 4b b9 81 48 89
>>>>>>>> RIP [<ffffffff8107e365>] freeze_workqueues_begin+0x195/0x1a0
>>>>>>>> RSP <ffff880046f01d68>
>>>>>>>> ---[ end trace 632574abdc098963 ]---
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I couldn't find any obvious root-cause from a quick check. Is this completely
>>>>>>> reproducible upon a fresh boot?
>>>>>>
>>>>>> True.
>>>>>>
>>>>>> The cause is that the function is called twice:
>>>>>
>>>>> Which function?
>>>>
>>>> The one where the BUG is. Maybe the functions which should clear the
>>>> flag is not called in between? See:
>>>>
>>>>>> [<ffffffff8107e206>] freeze_workqueues_begin+0x36/0x1b0
>>>> ^^^^^^^^^^^^^^^^^^^^^^^
>>>>>> [<ffffffff81096cc9>] try_to_freeze_tasks+0x1b9/0x2d0
>>>>>> [<ffffffff81096ee5>] freeze_kernel_threads+0x25/0x90
>>>>>> [<ffffffff81097b65>] hibernation_snapshot+0x75/0x2e0
>>>>>> [<ffffffff8109d734>] snapshot_ioctl+0x314/0x4e0
>>>>>> [<ffffffff81130866>] do_vfs_ioctl+0x96/0x550
>>>>>> [<ffffffff8111ff8b>] ? vfs_write+0x10b/0x180
>>>>>> [<ffffffff81130d6a>] sys_ioctl+0x4a/0x80
>>>>>> [<ffffffff81630e22>] system_call_fastpath+0x16/0x1b
>>>>>> (elapsed 0.03 seconds) done.
>>>> ...
>>>>>> Freezing remaining freezable tasks ... BUG: 'workqueue_freezing' is true!
>>>>>> ------------[ cut here ]------------
>>>>>> kernel BUG at /l/latest/linux/kernel/workqueue.c:3659!
>>>> ...
>>>>>> RIP: 0010:[<ffffffff8107e371>] [<ffffffff8107e371>]
>>>>>> freeze_workqueues_begin+0x1a1/0x1b0
>>>> ^^^^^^^^^^^^^^^^^^^^^^^
>>>>>> Call Trace:
>>>>>> [<ffffffff81096cc9>] try_to_freeze_tasks+0x1b9/0x2d0
>>>>>> [<ffffffff81096ee5>] freeze_kernel_threads+0x25/0x90
>>>>>> [<ffffffff81097b65>] hibernation_snapshot+0x75/0x2e0
>>>>>> [<ffffffff8109d734>] snapshot_ioctl+0x314/0x4e0
>>>>>> [<ffffffff81130866>] do_vfs_ioctl+0x96/0x550
>>>>>> [<ffffffff8111ff8b>] ? vfs_write+0x10b/0x180
>>>>>> [<ffffffff81130d6a>] sys_ioctl+0x4a/0x80
>>>>>> [<ffffffff81630e22>] system_call_fastpath+0x16/0x1b
>>>
>>> Ah. So this is linux-next, right?
>>
>> Right.
>>
>>> Can you please test the linux-next branch of the linux-pm tree and see if
>>> the problem is reproducible in there?
>>
>> Yeah, 100%. Just try it with a small enough swap.
>
> Ah, thanks, so that's an error code path problem and most likely in the Linus'
> tree.
>
> Srivatsa, any ideas?
>
Ok, I will need to quote part of the userspace utility to explain the
problem.

In suspend.c, in the suspend-utils userspace package, there is a loop like
this:

	error = freeze(snapshot_fd);
	...
	attempts = 2;
	do {
		if (set_image_size(snapshot_fd, image_size)) {
			error = errno;
			break;
		}
		if (atomic_snapshot(snapshot_fd, &in_suspend)) {
			error = errno;
			break;
		}
		if (!in_suspend) {
			/* first unblank the console, see console_codes(4) */
			printf("\e[13]");
			printf("%s: returned to userspace\n", my_name);
			free_snapshot(snapshot_fd);
			break;
		}
		error = write_image(snapshot_fd, resume_fd, -1);
		if (error) {
			free_swap_pages(snapshot_fd);
			free_snapshot(snapshot_fd);
			image_size = 0;
			error = -error;
			if (error != ENOSPC)
				break;
		} else {
			splash.progress(100);
#ifdef CONFIG_BOTH
			if (s2ram_kms || s2ram) {
				/* If we die (and allow system to continue)
				 * between now and reset_signature(), very bad
				 * things will happen. */
				error = suspend_to_ram(snapshot_fd);
				if (error)
					goto Shutdown;
				reset_signature(resume_fd);
				free_swap_pages(snapshot_fd);
				free_snapshot(snapshot_fd);
				if (!s2ram_kms)
					s2ram_resume();
				goto Unfreeze;
			}
Shutdown:
#endif
			close(resume_fd);
			suspend_shutdown(snapshot_fd);
		}
	} while (--attempts);
	...
Unfreeze:
	unfreeze(snapshot_fd);
Let me reply to this thread so that I can comment on the above code.
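
As a rough illustration of why the retry in the loop quoted above could run
into the check seen in the traces, here is a small standalone model. It is
only a sketch under assumptions: the names model_freeze_workqueues_begin(),
model_thaw_workqueues() and run_attempts() are hypothetical, the flag merely
stands in for the kernel's workqueue_freezing, and the idea that the ENOSPC
error path leaves kernel threads frozen is a guess consistent with the
traces and with "the function is called twice", not something established
in this message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's workqueue_freezing flag. */
static bool workqueue_freezing;
/* Counts how often the "already freezing" condition would have been hit. */
static int double_freeze_hits;

/* Models the check in freeze_workqueues_begin(): the flag must be clear. */
static void model_freeze_workqueues_begin(void)
{
	if (workqueue_freezing) {
		double_freeze_hits++;	/* the kernel reports a BUG here */
		return;
	}
	workqueue_freezing = true;
}

/* Models whatever is supposed to clear the flag again. */
static void model_thaw_workqueues(void)
{
	workqueue_freezing = false;
}

/*
 * Models the s2disk retry loop. Each attempt stands for one
 * atomic_snapshot() call, which (per the call traces) reaches
 * freeze_workqueues_begin() via freeze_kernel_threads().
 * failing_writes is how many times write_image() fails with ENOSPC;
 * thaw_on_error selects whether the error path clears the flag
 * (an assumption about kernel behaviour, see above).
 * Returns the number of times the flag was found already set.
 */
int run_attempts(int attempts, bool thaw_on_error, int failing_writes)
{
	assert(attempts > 0);
	workqueue_freezing = false;
	double_freeze_hits = 0;

	do {
		model_freeze_workqueues_begin();
		if (failing_writes > 0) {
			/* write_image() failed with ENOSPC: free and retry */
			failing_writes--;
			if (thaw_on_error)
				model_thaw_workqueues();
			continue;
		}
		break;	/* image written, no retry needed */
	} while (--attempts);

	return double_freeze_hits;
}
```

In this model, run_attempts(2, false, 1) reports one hit on the second
attempt, while run_attempts(2, true, 1) reports none. That matches the
observation that a small enough swap (write_image() failing once) is what
makes the problem reproducible.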
Regards,
Srivatsa S. Bhat