linux-pci.vger.kernel.org archive mirror
From: ebiederm@xmission.com (Eric W. Biederman)
To: Khalid Aziz <khalid.aziz@hp.com>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>,
	linux-kernel@vger.kernel.org, bhelgaas@google.com,
	linux-pci@vger.kernel.org
Subject: Re: [PATCH] Disable Bus Master on PCI device shutdown
Date: Wed, 06 Jun 2012 12:42:07 -0700
Message-ID: <87obowxm5s.fsf@xmission.com>
In-Reply-To: <1339006060.25761.689.camel@lyra> (Khalid Aziz's message of "Wed, 06 Jun 2012 12:07:40 -0600")

Khalid Aziz <khalid.aziz@hp.com> writes:

> On Wed, 2012-06-06 at 18:42 +0100, Matthew Garrett wrote:
>> On Wed, Jun 06, 2012 at 11:32:36AM -0600, Khalid Aziz wrote:
>> 
>> > Do we agree that if device shutdown routine cleanly shuts down all I/O,
>> > clearing PCI Bus Master bit should be safe?
>> 
>> In the absence of hardware that dislikes the bus master bit ever being 
>> disabled, yes. Do we know if hardware is ever tested in that situation?
>
> I will wait for device vendors to comment on that. I can't claim I have
> tested more than a few devices that way.

Testing is easy.  kexec into a new kernel.  Shrug.  A long-standing
useful kernel feature.  In all other cases I expect the firmware triggers
a board level reset of the hardware to avoid issues during reboot.

>> > If yes, then we only have to deal with broken devices. So the approach 
>> > could be to disable Bus Master bit unless the device ID matches a 
>> > blacklist which we update as we find broken devices. I really don't 
>> > like the idea of maintaining blacklists in the kernel for such things 
>> > but is that a more practical approach? If blacklist does not sound 
>> > good, maybe we can ask drivers to tell PCI subsystem if they are not 
>> > ok with clearing Bus Master bit and then PCI subsystem could skip 
>> > those devices.
>> 
>> Or we could just put responsibility on the drivers to ensure that the 
>> hardware won't continue doing any DMA, either by shutting down the 
>> engines or clearing the bit.

Where the responsibility has squarely been for the last decade, and we
still have issues in the common case.

> I assume device shutdown routine should stop all I/O and shut down the
> DMA engine. Disabling Bus Master bit is just an extra measure of safety.
> I do like the idea of disabling Bus Master bit in device shutdown
> routine. After all, drivers know their hardware best. On the other hand,
> it is a change to lots of driver code to implement this which means it
> will end up happening slowly over a period of time. I don't mind doing the
> work up front on a good number of drivers I feel comfortable modifying.
> I am ok with pulling out code to clear bus master bit from PCI subsystem
> and replacing it with modified shutdown routines for a few drivers to
> start with.

Absent anyone even knowing if there are devices that exist that can not
tolerate their bus master bit being flipped when DMA is not ongoing I
think the current state of the code is good.  When we find the broken
hardware that can not tolerate a standard PCI bit being used in a
standard way we can add a flag in the core to avoid doing that.

pci_device_shutdown calls drv->shutdown before calling
pci_disable_device.  This means that only devices that have trouble
with this bit being flipped while DMA is ongoing and don't bother
to stop their own DMA will have a problem.
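
A rough illustration of that ordering, as a userspace model rather than
the kernel code itself.  Every name below (model_dev,
model_device_shutdown and so on) is hypothetical and made up for the
sketch; only the Bus Master bit value (0x4 in the PCI command register)
is taken from the PCI definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PCI_COMMAND_MASTER 0x4	/* Bus Master enable bit, PCI command register */

/* Hypothetical model of a PCI device; not a kernel structure. */
struct model_dev {
	uint16_t command;	/* modelled PCI command register */
	bool dma_active;	/* true while the device's DMA engine runs */
	bool faulted;		/* set if the bit is cleared during active DMA */
	void (*shutdown)(struct model_dev *dev);	/* driver hook, may be NULL */
};

/* Clear Bus Master; record a fault if the device was still doing DMA. */
static void model_clear_master(struct model_dev *dev)
{
	if (dev->dma_active)
		dev->faulted = true;
	dev->command &= (uint16_t)~PCI_COMMAND_MASTER;
}

/* Driver shutdown hook runs first, Bus Master is cleared afterwards. */
static void model_device_shutdown(struct model_dev *dev)
{
	assert(dev != NULL);
	if (dev->shutdown)
		dev->shutdown(dev);
	model_clear_master(dev);
}

/* A well-behaved driver hook: quiesce DMA before returning. */
static void good_driver_shutdown(struct model_dev *dev)
{
	dev->dma_active = false;
}
```

In this model a device whose hook stops DMA never sees the bit change
mid-transfer.  Only a device with no hook, or a hook that leaves DMA
running, ends up with faulted set.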

As for shifting problems I do think we have shifted the problem in a
very positive way.  Now instead of having a random failure at a random
location caused by DMA happening at a random moment for no expected reason
we have failures happening when we disable or enable a device, which
should be much more debuggable.

If we encounter devices that can't have their bus master bit disabled at
all we can move that functionality into the drivers or add some sort of
flag so that pci_device_shutdown avoids this on real hardware.
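
One possible shape for such a flag, again as a userspace sketch under
assumptions.  The structure, the no_clear_master field and the function
name are all invented for illustration; nothing like this is claimed to
exist in the PCI core.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PCI_COMMAND_MASTER 0x4	/* Bus Master enable bit, PCI command register */

/* Hypothetical per-device state; not a kernel structure. */
struct quirk_dev {
	uint16_t command;	/* modelled PCI command register */
	bool no_clear_master;	/* hypothetical flag set for broken hardware */
};

/*
 * Clear Bus Master at shutdown unless the device is flagged as unable
 * to tolerate it.  Returns true if the bit was cleared.
 */
static bool quirk_shutdown_clear_master(struct quirk_dev *dev)
{
	assert(dev != NULL);
	if (dev->no_clear_master)
		return false;
	dev->command &= (uint16_t)~PCI_COMMAND_MASTER;
	return true;
}
```

The default stays "clear the bit", and the flag is only set for
hardware that has actually been shown to break.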

> Does any one see any other issues with modifying driver shutdown
> routines for disabling Bus Master bit? Bjorn, any opinions?

I don't have a problem with moving it all of the way into the drivers
I just think it might be a little bit silly at this point.

Ultimately I don't see the complaint raised by this thread.  Either
the drivers for the Broadcom devices in question were buggy before we
added the pci_disable_device call or those drivers are not buggy.

If we really want to do something to reduce the testing burden and make
certain things work better in general we need to merge the device
shutdown and the device remove methods.  Shrug.  People keep getting
squeamish when I suggest that.

Eric
