[Q] Why does kexec use device_shutdown rather than ubind them

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: linux-kernel@vger.kernel.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Matthew Garrett <matthew.garrett@nebula.com>,
	Vivek Goyal <vgoyal@redhat.com>,
	Eric Biederman <ebiederm@xmission.com>,
	kexec@lists.infradead.org
Subject: [Q] Why does kexec use device_shutdown rather than ubind them
Date: Fri, 17 Jan 2014 14:39:48 +1100	[thread overview]
Message-ID: <1389929988.7406.18.camel@pasglop> (raw)

Hi Folks !

Sorry for the semi-random CC list, not sure who owns kexec nowadays. So
we are working on a new crop of power servers for which the bootloader
is going to be using kexec.

As expected, we've been chasing a number of reliability issues mostly
due to drivers not behaving properly, such as leaving devices DMA'ing or
in a state that upsets the new kernel etc...

So far our approach has been to fix the drivers one by one, adding the
shutdown() method when it's missing, etc...

But that lead me to wonder ... why shutdown() in the first place ? The
semantic of shutdown() is that we are going to power the machine off. In
some cases, that method will actively participate in the shutdown,
powering things off, spinning disk down, etc....

It doesn't have the semantic of "put the device into a clean state for a
new driver to pick up". It's also rarely implemented.

On the other hand, the remove() routine is almost everywhere, and is
already well understood as needing to leave the device in a clean state,
as it's often used for rmmod (often by the driver developer
him/herself), more likely to be tested in a condition that doesn't
involve having the machine off immediately afterward but on the contrare
in a condition where a new driver can come and try to pick the device
up.

Additionally, remove() is also what KVM does when assigning devices to
guest, ie, the original driver is unbound from the host, and VFIO is
bound in its place.

So we have common purpose with kexec (somewhat) and possibly common (and
better) testing coverage with remove() than with shutdown().

I plan to experiment a bit in our bootloader see if that makes a
difference, maybe doing a first pass of unbind for anything that can be
unbound, and shutdown for the rest.

(I'll probably also sneak it a PCIe hot reset at the end but that's more
platform specific).

Any opinion ?

Cheers,
Ben.

next             reply	other threads:[~2014-01-17  3:40 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-01-17  3:39 Benjamin Herrenschmidt [this message]
2014-01-17  4:52 ` [Q] Why does kexec use device_shutdown rather than ubind them Eric W. Biederman
2014-01-17  5:59   ` Benjamin Herrenschmidt
2014-01-17 14:13     ` Vivek Goyal
2014-01-17 15:57       ` Sumner, William
2014-01-17 21:00       ` Benjamin Herrenschmidt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1389929988.7406.18.camel@pasglop \
    --to=benh@kernel.crashing.org \
    --cc=akpm@linux-foundation.org \
    --cc=ebiederm@xmission.com \
    --cc=kexec@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=matthew.garrett@nebula.com \
    --cc=torvalds@linux-foundation.org \
    --cc=vgoyal@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox