public inbox for linux-kernel@vger.kernel.org
From: Khasnis Soumya <soumya.khasnis@sony.com>
To: Daniel Lezcano <daniel.lezcano@linaro.org>, gregkh@linuxfoundation.org
Cc: rafael@kernel.org, linux-kernel@vger.kernel.org,
	daniel.lezcano@linaro.org, festevam@denx.de, lee@kernel.org,
	benjamin.bara@skidata.com, dmitry.osipenko@collabora.com,
	ldmldm05@gmail.com, soumya.khasnis@sony.com,
	srinavasa.nagaraju@sony.com, Madhusudan.Bobbili@sony.com,
	shingo.takeuchi@sony.com, keita.aihara@sony.com,
	masaya.takahashi@sony.com
Subject: Re: [PATCH v5] driver core: Add timeout for device shutdown
Date: Wed, 19 Jun 2024 10:00:00 +0000
Message-ID: <20240619100000.GA10362@sony.com>
In-Reply-To: <ecf55d97-363d-4731-bcfa-81cb4e58f2c7@linaro.org>

On Thu, Jun 13, 2024 at 01:51:57PM +0200, Daniel Lezcano wrote:
> On 13/06/2024 10:43, Greg KH wrote:
> > On Thu, Jun 13, 2024 at 08:32:26AM +0000, Soumya Khasnis wrote:
> >> The device shutdown callbacks invoked during shutdown/reboot
> >> are prone to errors depending on the device state or mishandling
> >> by one or more driver. In order to prevent a device hang in such
> >> scenarios, we bail out after a timeout while dumping a meaningful
> >> call trace of the shutdown callback to kernel logs, which blocks
> >> the shutdown or reboot process.
> > 
> > Again, this is not a "device shutdown" timeout, it is a "the whole
> > system has not shutdown this fast" timeout.
> > 
> > And in looking at my system, it doesn't shutdown in 10 seconds as it is
> > madly flushing a ton of stuff out to the disks, and they are slow
> > beasts.  So your 10 second default would cause me data loss on my
> > workstation, not good!
> 
> Thanks for pointing this out. It is exactly what I was worried about ...

Thank you for the comments, Daniel and Greg. Let me explain.

Typically, the reboot/shutdown sequence involves the following steps in userland before the kernel restart/shutdown sequence is entered.

1.	Terminate all services (except shutdown-critical tasks)
2.	Sync file systems
3.	Unmount file systems
4.	Trigger the kernel reboot() system call (LINUX_REBOOT_CMD_RESTART/LINUX_REBOOT_CMD_POWER_OFF)
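
For illustration only, steps 2 and 4 could look roughly like the sketch below from the userland side. The names final_action, reboot_cmd_for() and final_steps() are made up for this example and are not taken from any init system; the dry_run flag is only there so the code can be tried without actually rebooting. Steps 1 and 3 are assumed to be handled elsewhere by init.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/reboot.h>

/* Hypothetical type for this sketch: what userland wants the kernel to do. */
enum final_action { ACTION_RESTART, ACTION_POWER_OFF };

/* Map the requested action to the reboot(2) command constant. */
static unsigned int reboot_cmd_for(enum final_action action)
{
	return action == ACTION_POWER_OFF ? LINUX_REBOOT_CMD_POWER_OFF
					  : LINUX_REBOOT_CMD_RESTART;
}

/*
 * Step 2 (sync) followed by step 4 (reboot system call).
 * With dry_run set, nothing is synced and the kernel is not entered;
 * the function only reports what it would do and returns 0.
 * Without dry_run it needs CAP_SYS_BOOT and normally does not return.
 */
static long final_steps(enum final_action action, int dry_run)
{
	unsigned int cmd = reboot_cmd_for(action);

	if (dry_run) {
		printf("would sync and call reboot(cmd=0x%x)\n", cmd);
		return 0;
	}

	sync();
	return syscall(SYS_reboot, LINUX_REBOOT_MAGIC1, LINUX_REBOOT_MAGIC2,
		       cmd, NULL);
}
```

Calling final_steps(ACTION_RESTART, 1) only prints the command that would be used; everything after the system call is the kernel's responsibility.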

A userspace watchdog can be set up for the steps above, as already exists on Android systems.
It needs a large timeout value because these steps involve syncing data to disks.
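
As a rough sketch of what such a userspace watchdog involves, the helpers below use the standard Linux watchdog character-device ioctls from <linux/watchdog.h>. The helper names watchdog_arm() and watchdog_pet() and the idea of passing the device path as a parameter are assumptions for this example; this is not how Android implements it.

```c
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/watchdog.h>

/*
 * Open the watchdog device (typically /dev/watchdog) and request a
 * timeout in seconds. Returns the open fd, or -1 on failure.
 * Note: on many drivers, opening the device already starts the timer,
 * and a plain close() does not necessarily stop it.
 */
static int watchdog_arm(const char *path, int timeout_sec)
{
	int fd = open(path, O_WRONLY);

	if (fd < 0)
		return -1;

	if (ioctl(fd, WDIOC_SETTIMEOUT, &timeout_sec) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/*
 * Pet the watchdog. Userland calls this periodically while the shutdown
 * steps make progress; if it stops, the hardware resets the machine when
 * the timeout expires. Returns 0 on success, -1 on error.
 */
static int watchdog_pet(int fd)
{
	return ioctl(fd, WDIOC_KEEPALIVE, 0);
}
```

The timeout passed to watchdog_arm() has to cover the slowest expected sync and unmount, which is why it ends up large.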

Below is the kernel restart sequence once control moves to the kernel in step 4.
The issue we intend to address here is that device driver shutdown callbacks may hang
due to an unresponsive device or a broken driver.

|-kernel_restart()
              |- kernel_restart_prepare()
                     |- device_shutdown() // Iterates over the device hierarchy and invokes the shutdown callbacks (class/bus/driver->shutdown)
              |- syscore_shutdown()
              |- machine_restart()
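
To make the idea concrete, here is a small userspace model of the behaviour: walk a device list backward, invoke each shutdown callback, and bail out with a message naming the stuck device if one timer covering the whole walk expires. This is only an analogy under stated assumptions, not the code from the patch: struct fake_device, shutdown_all(), cb_ok() and cb_hang() are invented for the example, and SIGALRM plus siglongjmp() stands in for a kernel timer and call-trace dump.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical stand-in for a device with an optional shutdown callback. */
struct fake_device {
	const char *name;
	void (*shutdown)(void);
};

static sigjmp_buf bail_out;
static const char *volatile current_dev;
static volatile int shutdown_calls;

/* Example callbacks: one well behaved, one that never returns. */
static void cb_ok(void) { shutdown_calls++; }
static void cb_hang(void) { for (;;) pause(); }

/* Timer expiry: abandon whatever callback is currently running. */
static void on_timeout(int sig)
{
	(void)sig;
	siglongjmp(bail_out, 1);
}

/*
 * Walk the list backward and call each shutdown callback.
 * Returns 0 if every callback returned, -1 if the timer fired first.
 */
static int shutdown_all(struct fake_device *devs, int n, unsigned int timeout_sec)
{
	struct sigaction sa = { 0 };
	volatile int i;

	sa.sa_handler = on_timeout;
	sigemptyset(&sa.sa_mask);
	sigaction(SIGALRM, &sa, NULL);

	if (sigsetjmp(bail_out, 1)) {
		/* Reached via siglongjmp() from the handler. */
		fprintf(stderr, "shutdown timed out in %s\n", current_dev);
		return -1;
	}

	alarm(timeout_sec);		/* one timer for the whole walk */
	for (i = n - 1; i >= 0; i--) {
		current_dev = devs[i].name;
		if (devs[i].shutdown)	/* not every device has a callback */
			devs[i].shutdown();
	}
	alarm(0);			/* all callbacks returned in time */
	return 0;
}
```

With a list whose last entry uses cb_hang(), shutdown_all() returns -1 after the timeout and reports that entry's name; the entries before it in the list are never reached because the walk runs backward.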

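I still believe a default timeout of 10 seconds is reasonable for device_shutdown(), since the file systems have typically already been synced and unmounted in userland (steps 2 and 3) by the time it runs.
Not all drivers implement a shutdown callback, and the timeout can be configured for large systems as needed.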


> 
> [ ... ]
> 
> > Isn't this just a bug in your drivers?  Why not fix them?  Or if you
> > really have to have 10 seconds to shut down, use a watchdog timer that
> > you trigger from userspace and stop petting once you want to shut down.
> > Then, if it expires it will reset the machine, all of your policy
> > decisions would have been done in userspace, no need to get the kernel
> > involved at all.
> 
> +1
> 
> 
> -- 
> <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs
> 
> Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
> <http://twitter.com/#!/linaroorg> Twitter |
> <http://www.linaro.org/linaro-blog/> Blog
> 

Thread overview: 5+ messages
2024-06-13  8:32 [PATCH v5] driver core: Add timeout for device shutdown Soumya Khasnis
2024-06-13  8:43 ` Greg KH
2024-06-13 11:51   ` Daniel Lezcano
2024-06-19 10:00     ` Khasnis Soumya [this message]
2024-06-19 10:48       ` Greg KH
