All of lore.kernel.org
 help / color / mirror / Atom feed
From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: okaya@codeaurora.org
Cc: linux-pci@vger.kernel.org, Bjorn Helgaas <bhelgaas@google.com>,
	Michael Neuling <michael.neuling@au1.ibm.com>,
	linux-pci-owner@vger.kernel.org
Subject: Re: PCIe resets/restore and lack of CRS wait
Date: Thu, 22 Mar 2018 16:12:11 +1100	[thread overview]
Message-ID: <1521695531.16434.294.camel@kernel.crashing.org> (raw)
In-Reply-To: <f6af9b6e424c469f433735a106b8ee69@codeaurora.org>

On Thu, 2018-03-22 at 00:36 -0400, okaya@codeaurora.org wrote:
> On 2018-03-22 00:03, Benjamin Herrenschmidt wrote:
> > Hi Folks !
> > 
> > So while chasing some issues in our EEH error handling, we noticed that
> > the generic code has about a bazillion of "reset" path for devices,
> > most of them seemingly missing a wait for CRS after the reset.
> > 
> > That includes PM based resets or wakeups (can a D3->D0 transition cause
> > CRS to be returned ? Unclear but we should try to be safe), but mostly
> > it includes anything that resets the pcie port (PERST) or the secondary
> > bridge reset (hot resets).
> > 
> > For example take __pci_reset_function_locked(...), it can call
> > pci_parent_bus_reset() which will perform a hot reset but will *not*
> > wait for CRS.
> > 
> > There are a plethora of reset path in there that are similar, it's
> > actually hard to figure out which is what, but they all have in common
> > that they don't wait for CRS with the notable exception of the FLR
> > case.
> 
> Bjorn merged my change for 4.17. Kernel should handle these now.

Ah nice ! I'll check that out, I was checking 4.16-rc6 !

> > I'm keen on doing a rather "blanket" fix by adding a CRS wait inside
> > pci_dev_restore(). Would you guys agree ?
> > 
> > Also why does pci_flr_wait() not use vendor/device ID but instead waits
> > on the COMMAND register being all 1's ? It's not clear to me ...
> > VID/DID will give a very specific signature for CRS which is ffff0001
> > while COMMAND could return all 1's for other reasons (device unplugged
> > for example).
> > 
> 
> Because if you read vendor id of a virtual function, you get 0xffffffff

Ah indeed, I forgot about that... 

Thanks !

Cheers,
Ben.

  reply	other threads:[~2018-03-22  5:12 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-03-22  4:03 PCIe resets/restore and lack of CRS wait Benjamin Herrenschmidt
2018-03-22  4:36 ` okaya
2018-03-22  5:12   ` Benjamin Herrenschmidt [this message]
2018-03-22  6:24     ` Benjamin Herrenschmidt
2018-03-22 11:25       ` okaya
2018-03-22 13:46         ` Benjamin Herrenschmidt
2018-03-22 13:58           ` Sinan Kaya
2018-03-22 14:14             ` Bjorn Helgaas
2018-03-22 17:54               ` Sinan Kaya
2018-03-22 22:02                 ` Benjamin Herrenschmidt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1521695531.16434.294.camel@kernel.crashing.org \
    --to=benh@kernel.crashing.org \
    --cc=bhelgaas@google.com \
    --cc=linux-pci-owner@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=michael.neuling@au1.ibm.com \
    --cc=okaya@codeaurora.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.