linux-pci.vger.kernel.org archive mirror
From: Keith Busch <keith.busch@intel.com>
To: Lukas Wunner <lukas@wunner.de>
Cc: Linux PCI <linux-pci@vger.kernel.org>,
	Bjorn Helgaas <bhelgaas@google.com>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Sinan Kaya <okaya@kernel.org>, Thomas Tai <thomas.tai@oracle.com>,
	"poza@codeaurora.org" <poza@codeaurora.org>,
	Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCHv2 16/20] PCI/pciehp: Implement error handling callbacks
Date: Mon, 10 Sep 2018 11:22:46 -0600
Message-ID: <20180910172246.GA7753@localhost.localdomain>
In-Reply-To: <20180910170848.7q2qii2mm655eghw@wunner.de>

On Mon, Sep 10, 2018 at 07:08:48PM +0200, Lukas Wunner wrote:
> On Mon, Sep 10, 2018 at 10:45:28AM -0600, Keith Busch wrote:
> > On Mon, Sep 10, 2018 at 06:09:26PM +0200, Lukas Wunner wrote:
> > > On Mon, Sep 10, 2018 at 08:56:42AM -0600, Keith Busch wrote:
> > > > The sysfs entries still function. Their actions are only temporarily
> > > > stalled during error handling. Once the slot reset is called, the
> > > > ctrl->pending_events is queried to take requested actions.
> > > 
> > > Okay I see.  Still, releasing the IRQ and requesting it again seems fairly
> > > heavyweight.  Why not just acquire ctrl->reset_lock?
> > 
> > That was looking like a nice way to handle it, but it introduces
> > circular locking between ctrl->reset_lock and pci_bus_sem:
> > 
> > CPU A                               CPU B
> > ---------------------------------   ------------------------
> > pci_walk_bus                        pciehp_ist
> >  down_read(&pci_bus_sem)             down_read(&ctrl->reset_lock);
> >   pcie_portdrv_error_detected         pciehp_handle_presence_or_link_change
> >    pciehp_error_detected               pciehp_unconfigure_device
> >     down_write(&ctrl->reset_lock)       pci_stop_and_remove_bus_device
> >                                          down_write(&pci_bus_sem);
> 
> Why is pciehp bringing down the slot?  Is that in reaction to a
> Link Down caused by the error?  

This could just be something that happens if a hotplug event and an
error event occur at about the same time.

> Can this be solved with Sinan's
> approach to check in pciehp whether PCI_EXP_DEVSTA_FED is set
> and if so, waiting for it to be handled?

That only helps if the downstream port detected a fatal error, but
error handling happens for any device-reported error.

> FWIW you can use synchronize_irq() in pciehp_error_detected()
> if you need to wait for the IRQ thread to stop before taking
> the reset_lock.

That would just be a different deadlock: pciehp_error_detected() holds
a read lock on pci_bus_sem while it waits for the IRQ thread, and the
IRQ thread may request a write lock on pci_bus_sem, so each thread
ends up waiting on the other.
