From: Bjorn Helgaas <bhelgaas@google.com>
To: Yijing Wang <wangyijing@huawei.com>
Cc: linux-pci@vger.kernel.org, Jordan_Hargrave@Dell.com,
	keith.busch@intel.com, jon.mason@intel.com,
	Jon Mason <jdmason@kudzu.us>
Subject: Re: [PATCH] PCI: update device mps when doing pci hotplug
Date: Wed, 3 Sep 2014 16:42:01 -0600
Message-ID: <20140903224201.GD26073@google.com>
In-Reply-To: <1406621877-12022-1-git-send-email-wangyijing@huawei.com>

On Tue, Jul 29, 2014 at 04:17:57PM +0800, Yijing Wang wrote:
> Currently we don't update device's mps value when doing
> pci device hot-add. The hot-added device's mps will be set
> to default value (128B). But the upstream port device's mps
> may be larger than 128B which was set by firmware during
> system bootup. In this case the new added device may not
> work normally. This issue was found in huawei 5885 server
> and Dell R620 server. And if we run the platform with windows,
> this problem is gone. This patch try to update the hot added
> device mps equal to its parent mps, if device mpss < parent mps,
> print warning.
> 
> References: https://bugzilla.kernel.org/show_bug.cgi?id=60671
> Reported-by: Keith Busch <keith.busch@intel.com>
> Reported-by: Jordan_Hargrave@Dell.com
> Reported-by: Yijing Wang <wangyijing@huawei.com>
> Signed-off-by: Yijing Wang <wangyijing@huawei.com>
> Cc: Jon Mason <jdmason@kudzu.us>
> ---
>  drivers/pci/probe.c |   39 +++++++++++++++++++++++++++++++++++++++
>  1 files changed, 39 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index e3cf8a2..583ca52 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -1613,6 +1613,44 @@ static void pcie_write_mrrs(struct pci_dev *dev)
>  		dev_err(&dev->dev, "MRRS was unable to be configured with a safe value.  If problems are experienced, try running with pci=pcie_bus_safe\n");
>  }
>  
> +/**
> + * pcie_bus_update_set - update device mps when device doing hot-add
> + * @dev: PCI device to set
> + * 
> + * After device hot add, mps will be set to default(128B), But the 
> + * upstream port device's mps may be larger than 128B which was set 
> + * by firmware during system bootup. Then we should update the device
> + * mps to equal to its parent mps, Or the device can not work normally.
> + */
> +static void pcie_bus_update_set(struct pci_dev *dev)
> +{
> +	int mps, p_mps, mpss;
> +	struct pci_dev *parent;
> +
> +	if (!pci_is_pcie(dev) || !dev->bus->self 
> +			|| !dev->bus->self->is_hotplug_bridge)

Part of this looks redundant because pcie_bus_configure_set() already
checks pci_is_pcie().  And I don't know why we need to test
is_hotplug_bridge here; MPS settings need to be consistent regardless of
whether the upstream bridge supports hotplug.

> +		return;
> +	
> +	parent = dev->bus->self;
> +	mps = pcie_get_mps(dev);
> +	p_mps = pcie_get_mps(parent);
> +
> +	if (mps >= p_mps)
> +		return;
> +
> +	mpss = 128 << dev->pcie_mpss;
> +	if (mpss < p_mps) {
> +		dev_warn(&dev->dev, "MPSS %d smaller than upstream MPS %d\n"
> +				"If necessary, use \"pci=pcie_bus_safe\" boot parameter to avoid this problem\n",
> +				mpss, p_mps);
> +		return;

Since we can't configure the new device correctly, we really shouldn't
allow a driver to bind to it.  The current design doesn't have much
provision for doing that, so warning is probably all we can do.
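
For illustration only (not part of the patch), here is a minimal userspace
sketch of the size check in this hunk.  The helper names mpss_to_bytes()
and pick_hotadd_mps() are hypothetical; the only things taken from the
patch are the "128 << pcie_mpss" decoding and the order of the
comparisons.  The 0..5 range reflects the encodings the PCIe spec defines
for Max_Payload_Size Supported (128B through 4096B).

```c
#include <assert.h>

/* Hypothetical helper: decode the MPSS field (0..5) into bytes. */
static int mpss_to_bytes(int mpss_field)
{
	assert(mpss_field >= 0 && mpss_field <= 5);
	return 128 << mpss_field;
}

/*
 * Hypothetical helper mirroring the decision in the quoted hunk.
 * Returns the MPS (in bytes) the hot-added device should end up with,
 * or -1 when the device cannot support the upstream port's MPS and
 * all we can do is warn.
 */
static int pick_hotadd_mps(int dev_mps, int dev_mpss_field, int parent_mps)
{
	if (dev_mps >= parent_mps)
		return dev_mps;		/* already at or above upstream MPS */

	if (mpss_to_bytes(dev_mpss_field) < parent_mps)
		return -1;		/* cannot match upstream MPS */

	return parent_mps;		/* raise device MPS to match upstream */
}
```

With an upstream MPS of 256, a device at the 128B default whose MPSS
field is 1 (256B) would be raised to 256, while one whose MPSS field is 0
(128B) falls into the warning case.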

> +	}
> +
> +	pcie_write_mps(dev, p_mps);
> +	dev_info(&dev->dev, "Max Payload Size set to %4d/%4d (was %4d)\n", 
> +			pcie_get_mps(dev), 128 << dev->pcie_mpss, mps);
> +}
> +
>  static void pcie_bus_detect_mps(struct pci_dev *dev)
>  {
>  	struct pci_dev *bridge = dev->bus->self;
> @@ -1637,6 +1675,7 @@ static int pcie_bus_configure_set(struct pci_dev *dev, void *data)
>  		return 0;
>  
>  	if (pcie_bus_config == PCIE_BUS_TUNE_OFF) {
> +		pcie_bus_update_set(dev);

You're only adding this to the PCIE_BUS_TUNE_OFF path.  Can't the same
problem occur for other pcie_bus_config settings?

>  		pcie_bus_detect_mps(dev);
>  		return 0;
>  	}

I have some long-term ideas here (below), but to make progress in the short
term, I think we just need to make sure this handles all pcie_bus_config
settings.

Bjorn



Stepping back a long ways, I think the current design is hard to use.
It's set up with the idea that we (1) enumerate all the devices in the
system, and then (2) configure MPS for everything all at once.

That's not a very good fit when we start hotplugging devices, and it's
part of the reason MPS configuration is not well integrated into the PCI
core and doesn't get done at all for most architectures.

What I'd prefer is something that could be done in the core as each device
is enumerated, e.g., in or near pci_device_add().  I know there's tension
between the need to do this before drivers bind to the device and the
desire to enumerate the whole hierarchy before committing to MPS settings.
But we need to handle that tension anyway for hot-added devices, so we
might as well deal with it at boot-time and use the same code path for
both boot-time and hot-add time.

I have in mind something like this:

  void pcie_configure_mps(struct pci_dev *dev)
  {
    int ret;

    if (!pci_is_pcie(dev))
      return;

    if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT) {
      /* set my MPS to dev->pcie_mpss (max supported size) */
      return;
    }

    if (dev->pcie_mpss >= upstream bridge MPS) {
      /* set my MPS to upstream bridge MPS */
      return;
    }

    ret = pcie_set_hierarchy_mps(pcie_root_port(dev), dev->pcie_mpss);
    if (ret == failure)
      /* emit warning, can't enable this device */
  }

  struct pci_dev *pcie_root_port(struct pci_dev *dev)
  {
    if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT)
      return dev;

    return pcie_root_port(dev->bus->self);
  }

  int pcie_set_hierarchy_mps(struct pci_dev *root, int mpss)
  {
    struct pci_bus *secondary;
    struct pci_dev *dev;
    int ret;

    if (root->driver)
      return -EINVAL;

    secondary = root->subordinate;
    if (secondary) {
      list_for_each_entry(dev, &secondary->devices, bus_list) {
	ret = pcie_set_hierarchy_mps(dev, mpss);
	if (ret)
	  return ret;
      }
    }

    /* set my MPS to mpss */
    return 0;
  }
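
To make the intent of the outline above easier to experiment with, here
is a rough userspace model of it in plain C.  This is only a sketch under
stated assumptions: struct toy_dev and the toy_*() functions are
hypothetical stand-ins, not kernel APIs; sizes are kept in bytes rather
than in the encoded register form; and "driver bound" is reduced to a
flag.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for struct pci_dev; every field here is hypothetical. */
struct toy_dev {
	int is_root_port;	/* nonzero for a Root Port */
	int mpss;		/* max payload size supported, in bytes */
	int mps;		/* currently programmed MPS, in bytes */
	int has_driver;		/* nonzero once a driver is bound */
	struct toy_dev *parent;	/* upstream bridge, NULL for a Root Port */
	struct toy_dev *child;	/* first device on the secondary bus */
	struct toy_dev *sibling;	/* next device on the same bus */
};

/* Walk upstream until a Root Port (or the top of the toy tree). */
static struct toy_dev *toy_root_port(struct toy_dev *dev)
{
	while (!dev->is_root_port && dev->parent != NULL)
		dev = dev->parent;
	return dev;
}

/* Set everything at and below dev to mps; refuse if a driver is bound. */
static int toy_set_hierarchy_mps(struct toy_dev *dev, int mps)
{
	struct toy_dev *child;
	int ret;

	if (dev->has_driver)
		return -1;

	for (child = dev->child; child != NULL; child = child->sibling) {
		ret = toy_set_hierarchy_mps(child, mps);
		if (ret)
			return ret;
	}

	dev->mps = mps;
	return 0;
}

/* Returns 0 on success, -1 if the device cannot be enabled safely. */
static int toy_configure_mps(struct toy_dev *dev)
{
	assert(dev->is_root_port || dev->parent != NULL);

	if (dev->is_root_port) {
		dev->mps = dev->mpss;		/* largest size we support */
		return 0;
	}
	if (dev->mpss >= dev->parent->mps) {
		dev->mps = dev->parent->mps;	/* match upstream bridge */
		return 0;
	}
	/* New device is the limiting factor: lower the whole hierarchy. */
	return toy_set_hierarchy_mps(toy_root_port(dev), dev->mpss);
}
```

Calling toy_configure_mps() on each device as it is added covers both the
boot-time and hot-add cases in this model.  Like the outline, it stops at
the first device with a driver bound, so devices visited earlier in the
walk may already have been lowered; a real implementation would probably
want to check the whole hierarchy before changing anything.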
