From: stuart hayes <stuart.w.hayes@gmail.com>
To: Bert Karwatzki <spasswolf@web.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
linux-kernel@vger.kernel.org, linux-next@vger.kernel.org,
Tejun Heo <tj@kernel.org>
Subject: Re: hung tasks on shutdown in linux-next-202409{20,23,24,25}
Date: Mon, 30 Sep 2024 16:11:55 -0500
Message-ID: <f4547877-8aa2-45a0-b05d-624eb4e2d296@gmail.com>
In-Reply-To: <20240929105329.4797-1-spasswolf@web.de>
On 9/29/2024 5:53 AM, Bert Karwatzki wrote:
> Summary: The introduction of async reboot in commit 8064952c6504
> ("driver core: shut down devices asynchronously") leads to frequent hangs on
> shutdown even after commit 4f2c346e6216 ("driver core: fix async device shutdown hang")
> is introduced.
>
> I did some further experimenting (and lots of reboots ...) and found out that
> the bug is preemption related, for me it only occurs when using CONFIG_PREEMPT=y
> or CONFIG_PREEMPT_RT=y. When using CONFIG_PREEMPT_NONE=y or
> CONFIG_PREEMPT_VOLUNTARY=y everything works fine.
>
> Test results (linux-next-20240925):
> PREEMPT_NONE 20 reboots, no fail
> PREEMPT_VOLUNTARY 20 reboots, no fail
> PREEMPT 3 reboots, 4th reboot failed
> PREEMPT_RT 2 reboots, 3rd reboot failed
>
> The behaviour can be improved by increasing the number of min_active items
> in the async workqueue:
>
Thank you for continuing to look at this! That is interesting data.
I see from an earlier message that drm_atomic_helper_dirtyfb is holding a lock when
the hang occurs:
> T115;4 locks held by kworker/7:2/343:
> T115; #0: ffff91ea00050d48 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x4a4/0x580
> T115; #1: ffffbaf182e07e58 ((work_completion)(&helper->damage_work)){+.+.}-{0:0}, at: process_one_work+0x1c7/0x580
> T115; #2: ffffbaf182e07d00 (crtc_ww_class_acquire){+.+.}-{0:0}, at: drm_atomic_helper_dirtyfb+0x47/0x280
> T115; #3: ffff91ea13b80528 (crtc_ww_class_mutex){+.+.}-{3:3}, at: modeset_lock+0xbf/0x1b0
Except for NVMe drives, the shutdown process with the async shutdown patches should be
the same as the shutdown process without them--that is, the devices should be shut
down one after the other, in the same order... the only difference is that the individual
device shutdowns are scheduled in a workqueue where they wait for the previous device
shutdown to finish, instead of being shut down one at a time in a loop in the systemd
task. So I'm wondering if the async shutdown could somehow be exposing some sort of race in
a display device driver's shutdown function.
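To illustrate the ordering described above, here is a rough userspace toy model in Python. It is not the driver core code and only mirrors the description in this mail: every shutdown is scheduled up front, but each one waits for the previous device to finish. The function name and the device names are made up for the example.

```python
import threading


def shutdown_in_order(devices):
    """Toy model: start all shutdowns at once, each waiting on its predecessor.

    Hypothetical helper for illustration only; it does not reflect the
    actual kernel implementation.
    """
    order = []
    # One completion flag per device, set when that device's shutdown is done.
    done = [threading.Event() for _ in devices]

    def shutdown_one(i):
        if i > 0:
            done[i - 1].wait()      # block until the previous device finished
        order.append(devices[i])    # stand-in for the device's shutdown call
        done[i].set()

    threads = [threading.Thread(target=shutdown_one, args=(i,))
               for i in range(len(devices))]
    # Start in reverse on purpose: the wait chain, not the start order,
    # decides the order in which the shutdowns actually run.
    for t in reversed(threads):
        t.start()
    for t in threads:
        t.join()
    return order
```

Calling shutdown_in_order(["gpu", "nvme0", "usb"]) returns the devices in the same order they were passed in, even though the workers are started in reverse, which is the sense in which the async path should behave like the old loop.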
A full CPU backtrace (which you could get by setting /proc/sys/kernel/hung_task_all_cpu_backtrace
to 1 before reproducing the error) would be extremely helpful if you have the inclination... :)
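In case it is useful, a minimal sketch of turning that setting on from Python. The helper name is made up; it assumes the script runs as root and that the kernel exposes this file (as far as I know that needs hung task detection to be built in). The path argument is only there so the helper can be tried against an ordinary file.

```python
from pathlib import Path

# Sysctl file named in this mail; writing "1" asks the hung task detector
# to dump backtraces from all CPUs when it reports a hung task.
HUNG_TASK_BACKTRACE = Path("/proc/sys/kernel/hung_task_all_cpu_backtrace")


def enable_all_cpu_backtrace(path=HUNG_TASK_BACKTRACE):
    """Write 1 to the sysctl file and return the value read back.

    Hypothetical helper; raises OSError if the file is missing or the
    caller lacks permission to write it.
    """
    path = Path(path)
    path.write_text("1\n")
    return path.read_text().strip()
```

The same thing can of course be done with a plain write to the file from a root shell; the setting does not persist across the reboot, so it has to be set again before each attempt to reproduce the hang.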