All of lore.kernel.org
 help / color / mirror / Atom feed
From: Brian Norris <briannorris@chromium.org>
To: Alexander Duyck <alexander.duyck@gmail.com>
Cc: bhelgaas@google.com, linux-pci@vger.kernel.org,
	alex.williamson@redhat.com,
	Jeffy Chen <jeffy.chen@rock-chips.com>,
	Shawn Lin <shawn.lin@rock-chips.com>
Subject: Re: [PATCH] pci: Disable master abort while waiting for an FLR to complete
Date: Mon, 15 May 2017 09:36:24 -0700	[thread overview]
Message-ID: <20170515163623.GA15237@google.com> (raw)
In-Reply-To: <20170512181107.13853.34680.stgit@localhost.localdomain>

On Fri, May 12, 2017 at 11:13:57AM -0700, Alexander Duyck wrote:
> From: Alexander Duyck <alexander.h.duyck@intel.com>
> 
> This patch is meant to address issues seen when performing on FLR on some
> systems as the long wait can result in a master abort when reading a
> function that is not yet ready.
> 
> To prevent the master abort from being reported up we should disable
> reporting for it while waiting on the reset. Once the reset is completed
> then we can re-enable the master abort for the bus.
> 
> Fixes: 5adecf817dd6 ("PCI: Wait for up to 1000ms after FLR reset")
> 
> Reported-by: Brian Norris <briannorris@chromium.org>
> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
> ---
> 
> I haven't been able to test this code as I don't have a system that can
> reproduce the issue. The fix itself is based on the issue as reported by
> Brian so I am hoping he can test this on the Samsung Chromebook Plus with
> RK399 OP1 that this was seen on.

FYI, it's "RK3399" not "RK399". No harm done :)

>  drivers/pci/pci.c |   17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 7904d02ffdb9..acbdbabeaa0e 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -3758,14 +3758,31 @@ int pci_wait_for_pending_transaction(struct pci_dev *dev)
>   */
>  static void pci_flr_wait(struct pci_dev *dev)
>  {
> +	struct pci_dev *bridge = dev->bus->self;
>  	int i = 0;
> +	u16 bctl;
>  	u32 id;
>  
> +	/*
> +	 * Disable MasterAbortMode while we are waiting to avoid reporting
> +	 * bus errors for reading the command register on a device that is
> +	 * not ready (in some architectures)

I assume it's intentional to only do this *after* you've started the
reset (but before you start polling)?

> +	 */
> +	if (bridge) {
> +		pci_read_config_word(bridge, PCI_BRIDGE_CONTROL, &bctl);
> +		pci_write_config_word(bridge, PCI_BRIDGE_CONTROL,
> +				      bctl & ~PCI_BRIDGE_CTL_MASTER_ABORT);
> +	}
> +
>  	do {
>  		msleep(100);
>  		pci_read_config_dword(dev, PCI_COMMAND, &id);

IIUC, the patch works fine, in that I no longer get an abort from the
RC.

>  	} while (i++ < 10 && id == ~0);

BTW, the RK3399 host controller doesn't actually return all 1's on
aborted transactions. So this loop doesn't really work for it at all.

I take it that this (seemingly widely used) pattern all derives from the
PCI spec, which notes:

  "The host bus bridge, in PC compatible systems, must return all 1's on
  a read transaction and discard data on a write transaction when
  terminated with Master-Abort."

RK3399 is far from a "PC compatible system", but I don't know if that's
a real excuse here...

>  
> +	/* restore bridge state */
> +	if (bridge)
> +		pci_write_config_word(bridge, PCI_BRIDGE_CONTROL, bctl);
> +
>  	if (id == ~0)
>  		dev_warn(&dev->dev, "Failed to return from FLR\n");

And there's no actual error code returned here; so since several of the
error cases I can trigger on my hardware seem to never actually recover
(no longer how long I wait and/or poll), it seems like we should
propagate the error upward. Right now, we still unconditionally continue
with the pci_dev_restore(), even though that's doomed to abort too...

So there may be several implementation bugs with RK3399. I don't know
how much can be fixed on Rockchip's side, vs. how much can be
accomodated in the PCI core.

Brian

>  	else if (i > 1)
> 

  reply	other threads:[~2017-05-15 16:36 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-05-12 18:13 [PATCH] pci: Disable master abort while waiting for an FLR to complete Alexander Duyck
2017-05-15 16:36 ` Brian Norris [this message]
2017-05-15 19:48   ` Alexander Duyck
2017-05-15 20:00     ` Alex Williamson
2017-05-15 21:00     ` Brian Norris
2017-05-15 21:51       ` Brian Norris
2017-06-14 22:34 ` Bjorn Helgaas

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170515163623.GA15237@google.com \
    --to=briannorris@chromium.org \
    --cc=alex.williamson@redhat.com \
    --cc=alexander.duyck@gmail.com \
    --cc=bhelgaas@google.com \
    --cc=jeffy.chen@rock-chips.com \
    --cc=linux-pci@vger.kernel.org \
    --cc=shawn.lin@rock-chips.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.