public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Stephen Berman <stephen.berman@gmx.net>
To: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	Peter Zijlstra <peterz@infradead.org>,
	linux-kernel@vger.kernel.org
Subject: Re: power-off delay/hang due to commit 6d25be57 (mainline)
Date: Thu, 11 Jun 2020 00:49:26 +0200	[thread overview]
Message-ID: <87imfyh6yx.fsf@gmx.net> (raw)
In-Reply-To: <20200610102514.4vdzu5u7d6vnpicn@linutronix.de> (Sebastian Andrzej Siewior's message of "Wed, 10 Jun 2020 12:25:14 +0200")

[-- Attachment #1: Type: text/plain, Size: 2018 bytes --]

On Wed, 10 Jun 2020 12:25:14 +0200 Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote:

> On 2020-06-10 10:21:29 [+0200], Stephen Berman wrote:
>> (GMX put your email into my spam folder, so I didn't see it before I
>> sent my followup about removing the wifi firmware.)
>
> okay.
>
>> On Tue, 9 Jun 2020 22:23:39 +0200 Sebastian Andrzej Siewior
>> <bigeasy@linutronix.de> wrote:
>> >   scripts/decode_stacktrace.sh vmlinux $(pwd)/ kernel_power_off+0x3d/0x70
>> >
>> > should reveal that.
>>
>> I saved the call trace lines to a file and did this (with `$(pwd)/' the
>> script did not run):
>>
>> steve [ /sources/linux-5.6.4 ]$ ./scripts/decode_stacktrace.sh vmlinux
>> . kernel_power_off+0x3d/0x70 < /home/steve/temp/call_trace
>> [1352.***] Call Trace:
>> [1352.***] ? __schedule (core.c:?)
>> [1352.***] schedule (??:?)
>> [1352.***] schedule_timeout (??:?)
>> [1352.***] ? acpi_os_release_object (??:?)
>> [1352.***] ? acpi_ut_update_object_reference (??:?)
>> [1352.***] wait_for_completion (??:?)
>> [1352.***] ? wake_up_q (??:?)
>> [1352.***] flush_workqueue (??:?)
>> [1352.***] kernel_power_off (??:?)
>> [1352.***] __do_sys_reboot (reboot.c:?)
>> [1352.***] ? do_send_sig_info (??:?)
>> [1352.***] ? kill_pid_info (??:?)
>> [1352.***] ? kill_something_info (signal.c:?)
>> [1352.***] ? __x64_sys_kill (??:?)
>> [1352.***] do_syscall_64 (??:?)
>> [1352.***] entry_SYSCALL_64_after_hwframe (??:?)
>> [1352.***] RIP: 0033:0x7f95dd6992c3
>> [1352.***] Code: Bad RIP value.
>> objdump: '/tmp/tmp.BlpNDdSQpG.o': No such file
>>
>> Did I do something wrong or do I need to pass certain compiler flags
>> when building the kernel (when I build Emacs I pass CFLAGS='-Og -g3')?
>
> the arguments last argument should not be kernel_power_off+0x3d/0x70. I
> should have been entered after "enter" but you can also pipe the whole
> file as you did.
> If it gives you the ?? then might need
>   CONFIG_DEBUG_INFO

Thanks.  I set that, recompiled and now get this:


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: decode_stacktrace --]
[-- Type: application/text/plain, Size: 2667 bytes --]

steve [ /sources/linux-5.6.4 ]$ ./scripts/decode_stacktrace.sh vmlinux $(pwd)/ < /home/steve/temp/call_trace
[1352.***] Call Trace:
[1352.***] ? __schedule (/sources/linux-5.6.4/kernel/sched/core.c:3380 /sources/linux-5.6.4/kernel/sched/core.c:4080) 
[1352.***] schedule (/sources/linux-5.6.4/./arch/x86/include/asm/bitops.h:207 (discriminator 1) /sources/linux-5.6.4/./include/asm-generic/bitops/instrumented-non-atomic.h:111 (discriminator 1) /sources/linux-5.6.4/./include/linux/thread_info.h:84 (discriminator 1) /sources/linux-5.6.4/./include/linux/sched.h:1820 (discriminator 1) /sources/linux-5.6.4/kernel/sched/core.c:4156 (discriminator 1)) 
[1352.***] schedule_timeout (/sources/linux-5.6.4/kernel/time/timer.c:1872) 
[1352.***] ? acpi_os_release_object (/sources/linux-5.6.4/drivers/acpi/osl.c:1698) 
[1352.***] ? acpi_ut_update_object_reference (/sources/linux-5.6.4/drivers/acpi/acpica/utdelete.c:631) 
[1352.***] wait_for_completion (/sources/linux-5.6.4/./include/linux/spinlock.h:363 /sources/linux-5.6.4/kernel/sched/completion.c:84 /sources/linux-5.6.4/kernel/sched/completion.c:104 /sources/linux-5.6.4/kernel/sched/completion.c:115 /sources/linux-5.6.4/kernel/sched/completion.c:136) 
[1352.***] ? wake_up_q (/sources/linux-5.6.4/kernel/sched/core.c:4348) 
[1352.***] flush_workqueue (/sources/linux-5.6.4/kernel/workqueue.c:2831) 
[1352.***] kernel_power_off (/sources/linux-5.6.4/kernel/reboot.c:292) 
[1352.***] __do_sys_reboot (/sources/linux-5.6.4/kernel/reboot.c:366) 
[1352.***] ? do_send_sig_info (/sources/linux-5.6.4/kernel/signal.c:1295) 
[1352.***] ? kill_pid_info (/sources/linux-5.6.4/kernel/signal.c:1447) 
[1352.***] ? kill_something_info (/sources/linux-5.6.4/kernel/signal.c:1562) 
[1352.***] ? __x64_sys_kill (/sources/linux-5.6.4/kernel/signal.c:3639) 
[1352.***] do_syscall_64 (/sources/linux-5.6.4/arch/x86/entry/common.c:294) 
[1352.***] entry_SYSCALL_64_after_hwframe (/sources/linux-5.6.4/arch/x86/entry/entry_64.S:184) 
[1352.***] RIP: 0033:0x7f95dd6992c3
[1352.***] Code: Bad RIP value.
objdump: '/tmp/tmp.AjlgcOiArF.o': No such file

Code starting with the faulting instruction
===========================================
[1352.***] RSP: 002b:00007ffc4a3e4b88 EFLAGS: 00000202 ORIG_RAX: 00000000000000a9
[1352.***] RAX: ffffffffffffffda RBX: 00007ffc4a3e4cd8 RCX: 00007f95dd6992c3
[1352.***] RDX: 000000004321fedc RSI: 0000000028121969 RDI: 00000000fee1dead
[1352.***] RBP: 0000000000000005 R08: 0000000000000000 R09: 0000000000000000
[1352.***] R10: 0000000000400968 R11: 0000000000000282 R12: 0000000000000000
[1352.***] R13: 0000000000000000 R14: 00000000ffffffff R15: 0000000000000000

[-- Attachment #3: Type: text/plain, Size: 2937 bytes --]


[...]
>> By the other patch do you mean the following?  (This email was also put
>> into my spam by GMX and AFAICT has not yet shown up on the vger list.)
>
> Yes. But now that I played a little with it, I made dis:
[...]
> So you should have output like:
> |[   14.131696] Flushing events_freezable_power_
> the flush that hangs
>
> |[   17.203246] Flushing events_freezable_power_ over, 0
> It has been noticed (0 at the end)
>
> |[   17.203849] Showing busy workqueues and worker pools:
> |[   17.204400] workqueue events_freezable_power_: flags=0x84
> |[   17.205009]   pwq 12: cpus=6 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
> |[   17.205774]     in-flight: 2147:disk_events_workfn
> |[   17.206268] workqueue kblockd: flags=0x18
> |[   17.206682]   pwq 13: cpus=6 node=0 flags=0x0 nice=-20 active=1/256 refcnt=2
> |[   17.207428]     pending: blk_mq_requeue_work
> |[ 17.207958] pool 12: cpus=6 node=0 flags=0x0 nice=0 hung=2s workers=3 idle:
> | 535 42
>
> A dump of all pending worqueue items.
>
> |[   20.275239] Flushing events_freezable_power_ over, 0
> |[   20.275812] Showing busy workqueues and worker pools:
> |[   20.276385] workqueue events_freezable_power_: flags=0x84
> |[   20.276972]   pwq 12: cpus=6 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
> |[   20.277712]     in-flight: 2147:disk_events_workfn
> |[   20.278245] workqueue kblockd: flags=0x18
> |[   20.278685]   pwq 13: cpus=6 node=0 flags=0x0 nice=-20 active=1/256 refcnt=2
> |[   20.279471]     pending: blk_mq_requeue_work
> |[ 20.279989] pool 12: cpus=6 node=0 flags=0x0 nice=0 hung=5s workers=3 idle:
> | 535 42
>
> Another atempt. As you see `blk_mq_requeue_work' is references in both
> outputs but it makes progress (it is scheduled multiple times). The item
> `disk_events_workfn' stucks.

I rebuilt kernel 5.6.4 with that patch, but before calling `shutdown -h
now' I guess I should have set printk_delay, because this time there was
a lot of output and it went by too fast.  I'll try to delay it the next
time.  After the initial output (which went by too fast for me to
photograph) there were endlessly repeated screenfuls of output like this:

[ 2578.402398] , acpi_os_execute_deferred, acpi_os_execute_deferred, acpi_os_execute_deferred, acpi_os_execute_deferred, acpi_os_execute_deferred, acpi_os_execute_deferred, acpi_os_execute_deferred, acpi_os_execute_deferred, acpi_os_execute_deferred, acpi_os_execute_deferred, acpi_os_execute_deferred

and at the bottom of the screen this:

[ 2578.427680] pool 0: cpus=0 node=0 flags=0x0 nice=0 hung=0s workers=4 idle: 1331

After this line there was a brief pause (so I was able to photograph it)
and then it repeated, always the same.  In the last line pool, cpus,
node, flags, nice and hung were always 0, and workers was always 4.
Only idle changed, but seemed to cycle through a few values.  After
watching this repetition for several minutes, I finally did a hard
reboot.

Steve Berman

  reply	other threads:[~2020-06-10 22:49 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-01 15:46 power-off delay/hang due to commit 6d25be57 (mainline) Stephen Berman
2020-05-06 21:57 ` Sebastian Andrzej Siewior
2020-05-08 21:30   ` Stephen Berman
2020-05-13 22:04     ` Sebastian Andrzej Siewior
2020-05-14 21:39       ` Stephen Berman
2020-05-22 16:40         ` Sebastian Andrzej Siewior
2020-06-09 10:06           ` Stephen Berman
2020-06-09 20:23             ` Sebastian Andrzej Siewior
2020-06-10  8:21               ` Stephen Berman
2020-06-10 10:25                 ` Sebastian Andrzej Siewior
2020-06-10 22:49                   ` Stephen Berman [this message]
2020-06-11 15:39                     ` Stephen Berman
2020-06-12 11:01                       ` Sebastian Andrzej Siewior
2020-06-14 12:12                         ` Stephen Berman
2020-06-14 17:10                           ` Sebastian Andrzej Siewior
2020-06-15  7:58                             ` Stephen Berman
2020-06-15 14:51                               ` Sebastian Andrzej Siewior
2020-06-15 15:41                                 ` Stephen Berman
2020-06-15 15:58                                   ` Sebastian Andrzej Siewior
2020-06-15 16:19                                     ` Stephen Berman
2020-06-15 16:32                                       ` Sebastian Andrzej Siewior
2020-06-16  7:14                                 ` Stephen Berman
2020-06-16  7:38                                   ` Sebastian Andrzej Siewior
2020-06-16  8:13                                     ` Stephen Berman
2020-06-16 15:55                                       ` Sebastian Andrzej Siewior
2020-06-16 20:28                                         ` Stephen Berman
2020-06-17 14:27                                           ` Sebastian Andrzej Siewior
2020-06-17 21:09                                             ` Stephen Berman
2020-06-24 20:11                                               ` Sebastian Andrzej Siewior
2020-06-24 21:49                                                 ` Stephen Berman
2020-07-14 13:44                                                   ` Sebastian Andrzej Siewior
2020-07-14 13:54                                                     ` Rafael J. Wysocki
2020-07-14 14:11                                                       ` Sebastian Andrzej Siewior
2020-07-14 15:53                                                         ` Rafael J. Wysocki
2020-07-14 16:10                                                           ` Sebastian Andrzej Siewior
2020-08-11 10:27                                                           ` Sebastian Andrzej Siewior
2020-08-11 14:02                                                             ` Rafael J. Wysocki
2020-07-19 10:07                                                         ` Stephen Berman
2020-08-11 11:58                                                           ` Stephen Berman
2020-08-11 13:29                                                             ` Sebastian Andrzej Siewior
2020-08-11 14:34                                                               ` Rafael J. Wysocki
2020-08-11 15:25                                                                 ` Sebastian Andrzej Siewior
2020-08-11 17:22                                                                   ` Stephen Berman
2020-08-11 18:49                                                                     ` Sebastian Andrzej Siewior
2020-10-06 21:49                                                                       ` Sebastian Andrzej Siewior
2020-10-07 16:18                                                                         ` Rafael J. Wysocki
2020-10-26 17:20                                                                           ` Sebastian Andrzej Siewior
2020-12-02 18:03                                                                             ` Sebastian Andrzej Siewior
2020-12-02 18:31                                                                               ` Rafael J. Wysocki
2020-12-02 19:13                                                                                 ` Rafael J. Wysocki
2020-12-31 20:46                                                                                   ` Rafael J. Wysocki
2021-01-02 11:03                                                                                     ` Rafael J. Wysocki
2021-01-04 15:38                                                                                     ` Stephen Berman
2021-01-24 13:49                                                                                       ` Stephen Berman
2021-01-25 16:25                                                                                         ` Rafael J. Wysocki
2020-06-20 19:08               ` [PATCH] SCSI: Disable CD-ROM poll on shutdown kernel test robot
2020-06-09 21:26             ` power-off delay/hang due to commit 6d25be57 (mainline) Stephen Berman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87imfyh6yx.fsf@gmx.net \
    --to=stephen.berman@gmx.net \
    --cc=bigeasy@linutronix.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox