* kexec: device shutdown vs. remove
@ 2016-07-23 20:51 Benjamin Herrenschmidt
2016-07-24 5:18 ` Guenter Roeck
2016-07-24 5:24 ` Eric W. Biederman
0 siblings, 2 replies; 8+ messages in thread
From: Benjamin Herrenschmidt @ 2016-07-23 20:51 UTC (permalink / raw)
To: linux-kernel@vger.kernel.org
Cc: Eric W. Biederman, Joel Stanley, Jeremy Kerr, Greg KH
Hi !
This is somewhat of a recurring issue, some of my previous attempts on
lkml, I suspect, were just drowned in the noise. Eric, we had a quick
discussion about this a while back but I don't think we reached a
conclusion.
A bit of context: On OpenPOWER machines, we have a Linux based
bootloader, so we rely heavily on kexec to boot distro kernels, and
this has been causing us grief, mostly in the device driver space.
Device drivers need to be quiesced before kexec. More specifically
the device *hardware* needs that, ie we want DMAs to stop and the
device to be put into a state where it can reliably be picked up by the
driver in the new kernel.
Today, kexec calls device_shutdown() to achieve that. I argue that this
is the wrong thing to do and instead we should do something that causes
the various drivers' ->remove() function to be called (whether that
implies actually unbinding the driver or not).
I believe we do this for historical reasons, as ->remove() used to
depend on CONFIG_HOTPLUG while ->shutdown() was always around but that
is no longer the case.
The most visible issue with ->shutdown() that we encounter is that a lot
of drivers simply don't implement it.
The *real* issue however is that it's the wrong thing to do anyway. It
is a callback intended for when the machine is about to be shut down; as
such, not only is it very much optional (and rarely implemented), but it
can also (and will in some cases) power bits of hardware off, which is
not what you want to do if a new driver will try to pick up the pieces.
Arguably, the most correct semantic is provided by ->remove() since
that corresponds to removing a driver and binding a new one to the
device, i.e. the same flow as doing rmmod/insmod of a new driver.
In practice, we observe that a lot more drivers implement ->remove(). A
few were "fixed" to have ->shutdown() for kexec's sake over time, but in
many cases it's a duplication of ->remove() (ugh...).
So I would like to discuss this or at least get feedback and an overall
agreement. I can provide patches to test fairly soon.
Cheers,
Ben.
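The difference between the two callbacks described in this message can be illustrated with a toy userspace model. This is only a sketch under stated assumptions: `struct mock_hw` and the `mock_*` functions are hypothetical names invented for illustration, not kernel code, and real drivers vary widely in what their callbacks actually do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of device hardware state; not a kernel structure. */
struct mock_hw {
    bool dma_active;   /* device is still issuing DMA */
    bool powered;      /* device has power and can be probed */
};

/* Models a ->shutdown() written with real machine power-off in mind. */
void mock_shutdown(struct mock_hw *hw)
{
    assert(hw != NULL);
    hw->dma_active = false;
    hw->powered = false;   /* harmless at power-off, harmful before kexec */
}

/* Models a ->remove(): stop DMA but leave the device usable. */
void mock_remove(struct mock_hw *hw)
{
    assert(hw != NULL);
    hw->dma_active = false;
}

/* Models the next kernel's probe: it needs a quiet, powered device. */
bool mock_probe_ok(const struct mock_hw *hw)
{
    assert(hw != NULL);
    return hw->powered && !hw->dma_active;
}
```

In this model, a device that went through `mock_shutdown()` fails `mock_probe_ok()`, while one that went through `mock_remove()` passes, which is the situation the message argues kexec should aim for.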
* Re: kexec: device shutdown vs. remove
2016-07-23 20:51 kexec: device shutdown vs. remove Benjamin Herrenschmidt
@ 2016-07-24 5:18 ` Guenter Roeck
2016-07-24 13:13 ` Benjamin Herrenschmidt
2016-07-24 5:24 ` Eric W. Biederman
1 sibling, 1 reply; 8+ messages in thread
From: Guenter Roeck @ 2016-07-24 5:18 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: linux-kernel@vger.kernel.org, Eric W. Biederman, Joel Stanley,
Jeremy Kerr, Greg KH
On Sun, Jul 24, 2016 at 06:51:52AM +1000, Benjamin Herrenschmidt wrote:
> Hi !
>
> This is somewhat of a recurring issue, some of my previous attempts on
> lkml, I suspect, were just drowned in the noise. Eric, we had a quick
> discussion about this a while back but I don't think we reached a
> conclusion.
>
> A bit of context: On OpenPOWER machines, we have a Linux based
> bootloader, so we rely heavily on kexec to boot distro kernels, and
> this has been causing us grief, mostly in the device driver space.
>
> Device drivers need to be quiesced before kexec. More specifically
> the device *hardware* needs that, ie we want DMAs to stop and the
> device to be put into a state where it can reliably be picked up by the
> driver in the new kernel.
>
> Today, kexec calls device_shutdown() to achieve that. I argue that this
> is the wrong thing to do and instead we should do something that causes
> the various drivers' ->remove() function to be called (whether that
> implies actually unbinding the driver or not).
>
> I believe we do this for historical reasons, as ->remove() used to
> depend on CONFIG_HOTPLUG while ->shutdown() was always around but that
> is no longer the case.
>
> The most visible issue with ->shutdown() that we encounter is that a lot
> of drivers simply don't implement it.
>
> The *real* issue however is that it's the wrong thing to do anyway. It
> is a callback intended for when the machine is about to be shut down; as
> such, not only is it very much optional (and rarely implemented), but it
> can also (and will in some cases) power bits of hardware off, which is
> not what you want to do if a new driver will try to pick up the pieces.
>
> Arguably, the most correct semantic is provided by ->remove() since
> that corresponds to removing a driver and binding a new one to the
> device, i.e. the same flow as doing rmmod/insmod of a new driver.
>
> In practice, we observe that a lot more drivers implement ->remove(). A
> few were "fixed" to have ->shutdown() for kexec's sake over time, but in
> many cases it's a duplication of ->remove() (ugh...).
>
> So I would like to discuss this or at least get feedback and an overall
> agreement. I can provide patches to test fairly soon.
>
I suspect that using (or depending on) the remove function may not be feasible
anymore after the recent effort by Paul Gortmaker to make drivers explicitly
non-modular if they are only configurable as boolean. In many cases, this
involved dropping remove functions.
Guenter
* Re: kexec: device shutdown vs. remove
2016-07-23 20:51 kexec: device shutdown vs. remove Benjamin Herrenschmidt
2016-07-24 5:18 ` Guenter Roeck
@ 2016-07-24 5:24 ` Eric W. Biederman
2016-07-24 13:15 ` Benjamin Herrenschmidt
1 sibling, 1 reply; 8+ messages in thread
From: Eric W. Biederman @ 2016-07-24 5:24 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: linux-kernel@vger.kernel.org, Joel Stanley, Jeremy Kerr, Greg KH
Benjamin Herrenschmidt <benh@kernel.crashing.org> writes:
> Hi !
>
> This is somewhat of a recurring issue, some of my previous attempts on
> lkml, I suspect, were just drowned in the noise. Eric, we had a quick
> discussion about this a while back but I don't think we reached a
> conclusion.
>
> A bit of context: On OpenPOWER machines, we have a Linux based
> bootloader, so we rely heavily on kexec to boot distro kernels, and
> this has been causing us grief, mostly in the device driver space.
>
> Device drivers need to be quiesced before kexec. More specifically
> the device *hardware* needs that, ie we want DMAs to stop and the
> device to be put into a state where it can reliably be picked up by the
> driver in the new kernel.
>
> Today, kexec calls device_shutdown() to achieve that. I argue that this
> is the wrong thing to do and instead we should do something that causes
> the various drivers' ->remove() function to be called (whether that
> implies actually unbinding the driver or not).
>
> I believe we do this for historical reasons, as ->remove() used to
> depend on CONFIG_HOTPLUG while ->shutdown() was always around but that
> is no longer the case.
>
> The most visible issue with ->shutdown() that we encounter is that a lot
> of drivers simply don't implement it.
>
> The *real* issue however is that it's the wrong thing to do anyway. It
> is a callback intended for when the machine is about to be shut down; as
> such, not only is it very much optional (and rarely implemented), but it
> can also (and will in some cases) power bits of hardware off, which is
> not what you want to do if a new driver will try to pick up the pieces.
>
> Arguably, the most correct semantic is provided by ->remove() since
> that corresponds to removing a driver and binding a new one to the
> device, i.e. the same flow as doing rmmod/insmod of a new driver.
>
> In practice, we observe that a lot more drivers implement ->remove(). A
> few were "fixed" to have ->shutdown() for kexec's sake over time, but in
> many cases it's a duplication of ->remove() (ugh...).
>
> So I would like to discuss this or at least get feedback and an overall
> agreement. I can provide patches to test fairly soon.
I thought I had given that feedback a while ago.
To recap: I wanted the reboot path and the kexec path to be the same
(because arguably they are the same and have the same requirements,
although a lot of firmware toggles the machine's reset line in that case,
making that less true).
People didn't want to have all of the non-hardware specific cleanup
people do in the reboot path because it might cause problems with
machines. So shutdown was born.
In practice as you have observed the remove code is tested and the
shutdown code is not.
In practice we have an emergency reboot path that doesn't do any
hardware shutdown. Which probably better fills the original need
of a reboot that doesn't spend time cleaning up.
If you are willing to do the work to merge shutdown into remove and
simplify the drivers, perform the testing and the other state, I am in
favor of the change. I think we have had enough time to see if having two
methods was maintainable for the driver authors.
Eric
* Re: kexec: device shutdown vs. remove
2016-07-24 5:18 ` Guenter Roeck
@ 2016-07-24 13:13 ` Benjamin Herrenschmidt
2016-07-24 21:36 ` Eric W. Biederman
0 siblings, 1 reply; 8+ messages in thread
From: Benjamin Herrenschmidt @ 2016-07-24 13:13 UTC (permalink / raw)
To: Guenter Roeck
Cc: linux-kernel@vger.kernel.org, Eric W. Biederman, Joel Stanley,
Jeremy Kerr, Greg KH
On Sat, 2016-07-23 at 22:18 -0700, Guenter Roeck wrote:
> I suspect that using (or depending on) the remove function may not be feasible
> anymore after the recent effort by Paul Gortmaker to make drivers explicitly
> non-modular if they are only configurable as boolean. In many cases, this
> involved dropping remove functions.
A lot of drivers we care about are modular. But maybe the right
approach is to do something like remove() if it exists and shutdown() if
it doesn't ? Or a new callback for kexec ? quiesce() ?
Cheers,
Ben.
* Re: kexec: device shutdown vs. remove
2016-07-24 5:24 ` Eric W. Biederman
@ 2016-07-24 13:15 ` Benjamin Herrenschmidt
2016-07-24 21:35 ` Eric W. Biederman
0 siblings, 1 reply; 8+ messages in thread
From: Benjamin Herrenschmidt @ 2016-07-24 13:15 UTC (permalink / raw)
To: Eric W. Biederman
Cc: linux-kernel@vger.kernel.org, Joel Stanley, Jeremy Kerr, Greg KH
On Sun, 2016-07-24 at 00:24 -0500, Eric W. Biederman wrote:
> If you are willing to do the work to merge shutdown into remove and
> simplify the drivers, perform the testing and the other state, I am in
> favor of the change. I think we have had enough time to see if having two
> methods was maintainable for the driver authors.
Well, remove is going away in some drivers at least...
Also shutdown() has two different meanings between kexec and actual
machine shutdown...
Should we create a new one instead ? Something like quiesce() ? If
absent, look for remove(), if absent too, look for shutdown() ...
Or we continue doing shutdown() for now with a fallback to remove() if
shutdown is NULL (this is what I've been toying with internally).
Cheers,
ben.
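The two fallback orders floated in this message can be sketched as a small callback-selection routine. This is a userspace illustration only: `struct mock_driver`, `struct mock_dev`, and every function below are hypothetical names, and the `quiesce` member in particular is the proposed callback from this thread, not something the kernel's driver structures are assumed to provide.

```c
#include <assert.h>
#include <stddef.h>

struct mock_dev;
typedef void (*mock_cb)(struct mock_dev *dev);

/* Hypothetical stand-in for a driver's callbacks. */
struct mock_driver {
    mock_cb quiesce;   /* proposed new callback (assumption) */
    mock_cb remove;
    mock_cb shutdown;
};

struct mock_dev {
    const struct mock_driver *drv;
    int quiesced;      /* set by the example callback below */
};

/* Example callback used to observe which path ran. */
void mock_mark_quiesced(struct mock_dev *dev)
{
    dev->quiesced = 1;
}

/* Option A: prefer quiesce(), then remove(), then shutdown(). */
mock_cb pick_quiesce_first(const struct mock_driver *drv)
{
    assert(drv != NULL);
    if (drv->quiesce)
        return drv->quiesce;
    if (drv->remove)
        return drv->remove;
    return drv->shutdown;   /* may be NULL: nothing to call */
}

/* Option B: keep calling shutdown(), fall back to remove() if NULL. */
mock_cb pick_shutdown_first(const struct mock_driver *drv)
{
    assert(drv != NULL);
    return drv->shutdown ? drv->shutdown : drv->remove;
}

/* Run whichever callback the policy selected; returns 1 if one ran. */
int mock_kexec_quiesce(struct mock_dev *dev,
                       mock_cb (*policy)(const struct mock_driver *))
{
    mock_cb cb = policy(dev->drv);

    if (!cb)
        return 0;
    cb(dev);
    return 1;
}
```

Option B changes nothing for drivers that already implement shutdown(), so it is the less invasive of the two; Option A would let a driver opt in to kexec-specific behaviour without touching its existing callbacks.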
* Re: kexec: device shutdown vs. remove
2016-07-24 13:15 ` Benjamin Herrenschmidt
@ 2016-07-24 21:35 ` Eric W. Biederman
0 siblings, 0 replies; 8+ messages in thread
From: Eric W. Biederman @ 2016-07-24 21:35 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: linux-kernel@vger.kernel.org, Joel Stanley, Jeremy Kerr, Greg KH
Benjamin Herrenschmidt <benh@kernel.crashing.org> writes:
> On Sun, 2016-07-24 at 00:24 -0500, Eric W. Biederman wrote:
>> If you are willing to do the work to merge shutdown into remove and
>> simplify the drivers, perform the testing and the other state, I am in
>> favor of the change. I think we have had enough time to see if having two
>> methods was maintainable for the driver authors.
>
> Well, remove is going away in some drivers at least...
>
> Also shutdown() has two different meanings between kexec and actual
> machine shutdown...
>
> Should we create a new one instead ? Something like quiesce() ? If
> absent, look for remove(), if absent too, look for shutdown() ...
>
> Or we continue doing shutdown() for now with a fallback to remove() if
> shutdown is NULL (this is what I've been toying with internally).
A shutdown method that doesn't work for kexec is a poorly tested buggy
implementation of shutdown.
I don't blame driver authors for the confusion, but it remains true that
shutdown has always been called in the kexec path. So if your shutdown
method does not work for kexec it is buggy (by definition).
Eric
* Re: kexec: device shutdown vs. remove
2016-07-24 13:13 ` Benjamin Herrenschmidt
@ 2016-07-24 21:36 ` Eric W. Biederman
2016-07-25 0:28 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 8+ messages in thread
From: Eric W. Biederman @ 2016-07-24 21:36 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: Guenter Roeck, linux-kernel@vger.kernel.org, Joel Stanley,
Jeremy Kerr, Greg KH
Benjamin Herrenschmidt <benh@kernel.crashing.org> writes:
> On Sat, 2016-07-23 at 22:18 -0700, Guenter Roeck wrote:
>> I suspect that using (or depending on) the remove function may not be feasible
>> anymore after the recent effort by Paul Gortmaker to make drivers explicitly
>> non-modular if they are only configurable as boolean. In many cases, this
>> involved dropping remove functions.
>
> A lot of drivers we care about are modular. But maybe the right
> approach is to do something like remove() if it exists and shutdown() if
> it doesn't ? Or a new callback for kexec ? quiesce() ?
Perhaps remove if shutdown does not exist. What this really takes is
someone to care enough to sort through this mess.
Eric
* Re: kexec: device shutdown vs. remove
2016-07-24 21:36 ` Eric W. Biederman
@ 2016-07-25 0:28 ` Benjamin Herrenschmidt
0 siblings, 0 replies; 8+ messages in thread
From: Benjamin Herrenschmidt @ 2016-07-25 0:28 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Guenter Roeck, linux-kernel@vger.kernel.org, Joel Stanley,
Jeremy Kerr, Greg KH
On Sun, 2016-07-24 at 16:36 -0500, Eric W. Biederman wrote:
> > A lot of drivers we care about are modular. But maybe the right
> > approach is to do something like remove() if it exists and shutdown() if
> > it doesn't ? Or a new callback for kexec ? quiesce() ?
>
> Perhaps remove if shutdown does not exist. What this really takes is
> someone to care enough to sort through this mess.
Right, I have a test patch doing just that which I'm about to start
testing internally.
Cheers,
Ben.