From: Khalid Aziz <khalid@gonehiking.org>
To: Bjorn Helgaas <bhelgaas@google.com>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>,
	e1000-devel@lists.sourceforge.net, linux-pci@vger.kernel.org,
	"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
	linux-kernel@vger.kernel.org, Andi Kleen <ak@linux.intel.com>,
	Alan Cox <alan@lxorguk.ukuu.org.uk>,
	Matthew Garrett <mjg@redhat.com>,
	khalid.aziz@oracle.com
Subject: Re: [PATCH v2 2/7] PCI: don't touch enable_cnt in pci_device_shutdown()
Date: Tue, 05 Feb 2013 08:28:11 -0700
Message-ID: <1360078091.4485.32.camel@rhapsody>
In-Reply-To: <CAErSpo7k96noiCtGYUFyuUQhE2GQA8ro5-9WC-JU_1HOWKUGvg@mail.gmail.com>

On Mon, 2013-02-04 at 16:13 -0700, Bjorn Helgaas wrote:
> On Mon, Feb 4, 2013 at 3:20 PM, Khalid Aziz <khalid@gonehiking.org> wrote:
> > On Mon, 2013-02-04 at 15:55 +0400, Konstantin Khlebnikov wrote:
> >> Matthew Garrett and Alan Cox said (see LKML link below) that clearing bus-master
> >> for all PCI devices may lead to unpredictable consequences: some devices ignore
> >> this bit and continue DMA, and some hang afterwards or crash the whole system.
> >> We should probably leave only a warning here and disable bus-mastering in each
> >> driver individually, in its ->shutdown() callback.
> >
> > Agreed that the right place for shutting down a PCI device properly and
> > clearing its Bus Master bit is the driver shutdown routine, if only all
> > drivers supplied one. As it is today, too many drivers do not provide a
> > shutdown routine: ata_piix, the Marvell SATA driver and the ATI AGP
> > driver, to name just a few. Yet kexec is expected to work in spite of
> > these drivers, especially since kdump depends on it. So until all PCI
> > drivers supply a shutdown routine, disabling the interrupt and clearing
> > the Bus Master bit in pci_device_shutdown() is just a band-aid. Most
> > drivers do seem to supply suspend and resume functions, and it was
> > discussed many years ago whether it is feasible to use the suspend()
> > routine for drivers to shut devices down cleanly. Maybe it is time to
> > revisit that discussion.
> 
> This patch as posted doesn't do anything with IRQs.  It only clears
> PCI_COMMAND_MASTER.
> 
> I'm open to considering something with IRQs, but I don't understand
> exactly what we should do.  In your response to the previous version
> (https://lkml.org/lkml/2013/1/28/720) you suggested this:
> 
>   pci_clear_master(pci_dev);
>   pcibios_disable_device(pci_dev);
> 
> Did you figure out specifically why pcibios_disable_device() helps?
> Using pcibios_disable_device() doesn't seem like the ideal solution
> because on most architectures, it is an empty function with no obvious
> connection to IRQs.  On x86 with ACPI, it cleans up some ACPI PCI IRQ
> stuff, but as far as I can tell, it doesn't actually touch the PCI
> device itself or even the IOAPIC to which it's connected, so I'm not
> sure how this would help kexec.
> 
> Bjorn

Hi Bjorn,

My reading of the code is that pcibios_disable_device() does clear the
interrupt on x86 and ia64. I am not deeply familiar with the ACPI code,
so please correct me if I am misreading it. Here is the call sequence I
see:

pcibios_disable_device() ->
   pcibios_disable_irq() ->
       acpi_pci_irq_disable() ->
           acpi_pci_link_free_irq() ->
              acpi_evaluate_object(link->device->handle, "_DIS", NULL, NULL);

My understanding is that evaluating the ACPI _DIS method will disable
the interrupt from the device. Does that sound reasonable?

The problem this code attempts to solve is I/O devices remaining active
as we start to boot a kexec'd kernel. That activity can come from DMA
(it has definitely been seen with NICs, but it can also happen with a
SATA/IDE controller when a pending read completes). When DMA overwrites
memory in use by the new kexec'd kernel, it takes a lot of work to
narrow that memory corruption down. The right way to quiesce I/O devices
is to call the shutdown() function of every active driver, which
pci_device_shutdown() does today. If every driver provided a proper
shutdown() function, we would be done. Since that is not the case (too
many drivers rely upon firmware reinitializing the device when the
system is shut down), we need to stop potentially active devices from
interfering with the kexec'd kernel. The two ways I can think of are to
stop DMA by clearing the Bus Master bit and to turn off the interrupt,
and these have been shown to get kexec (and thus kdump) working on
machines where it did not work before.
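
To make the fallback described above concrete, here is a rough userspace
sketch of the ordering: run the driver's shutdown() if it has one, then
clear the Bus Master bit as a backstop. This is not kernel code.
struct mock_pci_dev, mock_clear_master() and mock_device_shutdown() are
hypothetical names made up for illustration, and the "command register"
is just a field in memory. The only real value is PCI_COMMAND_MASTER
(0x4), as defined in pci_regs.h. The interrupt side is not modeled here.

```c
#include <stddef.h>
#include <stdint.h>

/* Bus Master enable bit of the PCI command register (pci_regs.h value). */
#define PCI_COMMAND_MASTER 0x4

/* Hypothetical stand-in for a PCI device; NOT the kernel's struct pci_dev. */
struct mock_pci_dev {
	uint16_t command;                           /* simulated PCI_COMMAND */
	void (*shutdown)(struct mock_pci_dev *dev); /* NULL if driver has none */
};

/* Clear Bus Master, touching the register only if the bit was set. */
void mock_clear_master(struct mock_pci_dev *dev)
{
	uint16_t cmd = dev->command & (uint16_t)~PCI_COMMAND_MASTER;

	if (cmd != dev->command)
		dev->command = cmd;
}

/*
 * Assumed ordering: give the driver the first chance to quiesce its
 * device, then stop DMA regardless of whether a shutdown() existed.
 */
void mock_device_shutdown(struct mock_pci_dev *dev)
{
	if (dev->shutdown)
		dev->shutdown(dev);

	mock_clear_master(dev);
}
```

With a device whose simulated command register is 0x0007 and no
shutdown() callback, mock_device_shutdown() leaves the register at
0x0003, and a second call changes nothing. As noted earlier in this
thread, real devices may ignore the bit or misbehave once it is cleared,
which a sketch like this cannot capture.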

This is a non-trivial problem to solve and I am very open to better
ideas.

--
Khalid

