From: Bjorn Helgaas <bhelgaas@google.com>
To: Yijing Wang <wangyijing@huawei.com>
Cc: Jon Mason <jdmason@kudzu.us>,
linux-pci@vger.kernel.org, Hanjun Guo <guohanjun@huawei.com>,
jiang.liu@huawei.com, joe.jin@oracle.com
Subject: Re: [PATCH v8 6/6] PCI: update device mps when doing pci hotplug
Date: Thu, 22 Aug 2013 12:18:23 -0600
Message-ID: <20130822181823.GA25721@google.com>
In-Reply-To: <1377141888-7000-7-git-send-email-wangyijing@huawei.com>
[+cc Joe]
On Thu, Aug 22, 2013 at 11:24:48AM +0800, Yijing Wang wrote:
> Currently we don't update device's mps value when doing
> pci device hot-add. The hot-added device's mps will be set
> to default value (128B). But the upstream port device's mps
> may be larger than 128B which was set by firmware during
> system bootup. In this case the new added device may not
> work normally. This patch try to update the hot added device
> mps equal to its parent mps, if device mpss < parent mps,
> print warning.
>
> References: https://bugzilla.kernel.org/show_bug.cgi?id=60671
> Reported-by: Yijing Wang <wangyijing@huawei.com>
> Signed-off-by: Yijing Wang <wangyijing@huawei.com>
> Cc: Jon Mason <jdmason@kudzu.us>
> Cc: stable@vger.kernel.org # 3.4+
> ---
> drivers/pci/probe.c | 48 +++++++++++++++++++++++++++++++++++++++++++++++-
> 1 files changed, 47 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index 4afd158..06e88c5 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -1602,6 +1602,43 @@ static int pcie_bus_configure_set(struct pci_dev *dev, void *data)
>  	return 0;
>  }
>
> +static int pcie_bus_update_set(struct pci_dev *dev, void *data)
> +{
> +	int mps, p_mps, mpss;
> +	struct pci_dev *parent;
> +
> +	if (!pci_is_pcie(dev) || !dev->bus->self)
> +		return 0;
> +
> +	parent = dev->bus->self;
> +	mps = pcie_get_mps(dev);
> +	p_mps = pcie_get_mps(dev->bus->self);
> +
> +	if (mps >= p_mps)
> +		return 0;
> +
> +	/* we only update the device mps, unless its parent device is root port,
> +	 * and it is the only slot directly connected to root port.
> +	 */
> +	mpss = 128 << dev->pcie_mpss;
> +	if (mpss >= p_mps) {
> +		pcie_write_mps(dev, p_mps);
> +	} else if (pci_pcie_type(parent) == PCI_EXP_TYPE_ROOT_PORT) {
> +		pcie_write_mps(parent, mpss);
> +		pcie_write_mps(dev, mpss);
> +	} else
> +		dev_warn(&dev->dev, "MPS %d MPSS %d both smaller than upstream MPS %d\n"
> +			"If necessary, use \"pci=pcie_bus_peer2peer\" boot parameter to avoid this problem\n",
> +			mps, 128 << dev->pcie_mpss, p_mps);
> +	return 0;
> +}
> +
> +static void pcie_bus_update_setting(struct pci_bus *bus)
> +{
> +	if (bus->self->is_hotplug_bridge)
> +		pci_walk_bus(bus, pcie_bus_update_set, NULL);
> +}
> +
>  /* pcie_bus_configure_settings requires that pci_walk_bus work in a top-down,
>   * parents then children fashion.  If this changes, then this code will not
>   * work as designed.
> @@ -1616,8 +1653,17 @@ void pcie_bus_configure_settings(struct pci_bus *bus)
>  	if (!pci_is_pcie(bus->self))
>  		return;
>
> -	if (pcie_bus_config == PCIE_BUS_TUNE_OFF)
> +	if (pcie_bus_config == PCIE_BUS_TUNE_OFF) {
> +		/* Sometimes we should update device mps here,
> +		 * eg. after hot add, device mps value will be
> +		 * set to default(128B), but the upstream port
> +		 * mps value may be larger than 128B, if we do
> +		 * not update the device mps, it maybe can not
> +		 * work normally.
> +		 */
> +		pcie_bus_update_setting(bus);
I think the strategy of updating the device MPS when possible makes
sense, but I don't think we should do it in PCIE_BUS_TUNE_OFF mode.
That mode is documented as "Disable PCIe MPS tuning and use the
BIOS-configured MPS defaults."  This patch changes that to something
like "Disable PCIe MPS tuning, except for hot-added devices" and there
is no longer a way to tell Linux to never touch MPS.

Eventually, I think the default mode should change to PCIE_BUS_SAFE,
where Linux changes MPS settings at boot-time and at hotplug-time to
make sure every device works.  (This mode assumes no peer-to-peer
DMA.)  I know this was tried in the past, and we tripped over all
sorts of issues, but it's not clear how many were problems with the
Linux code and how many were unsolvable BIOS or platform issues.

Then we'd have these choices:

  PCIE_BUS_TUNE_OFF     Never touch MPS
  PCIE_BUS_PEER2PEER    Set all MPS to 128, so peer-to-peer DMA works
  PCIE_BUS_SAFE         Configure each device with largest safe MPS
                        (assumes no peer-to-peer DMA)
  PCIE_BUS_PERFORMANCE  Use MRRS in addition to MPS
                        (assumes no peer-to-peer DMA)

The hot-add issue [1] could be regarded as a BIOS bug -- the BIOS
programmed a hotplug bridge with MPS=256.  A hot-added device powers
up with MPS=128, so it's only safe for BIOS to set MPS=256 if the OS
is smart enough to change the bridge MPS, the device MPS, or both, at
hot-add time.  That doesn't seem like a good assumption for a BIOS to
make.
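To make the hot-add mismatch concrete, here is a minimal userspace sketch, not kernel code.  It assumes only the encoding the patch itself uses (bytes = 128 << code); struct link_end, mps_code_to_bytes() and mps_mismatch() are hypothetical names invented for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* PCIe encodes Max Payload Size as a small code: bytes = 128 << code. */
static int mps_code_to_bytes(int code)
{
	assert(code >= 0 && code <= 5);	/* 128 .. 4096 bytes */
	return 128 << code;
}

/* Hypothetical model of one end of a link; not a kernel structure. */
struct link_end {
	int mps;	/* currently programmed MPS, in bytes */
	int mpss;	/* largest MPS the device supports, in bytes */
};

/*
 * The device may be handed packets larger than it accepts whenever the
 * port above it is programmed with a larger MPS than the device itself.
 */
static bool mps_mismatch(const struct link_end *bridge,
			 const struct link_end *dev)
{
	return dev->mps < bridge->mps;
}
```

With a bridge left at 256 by the BIOS and a hot-added device that powers up at 128, mps_mismatch() reports the problem until one side is reprogrammed.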

I think we should always *warn* about potential MPS issues, even in
PCIE_BUS_TUNE_OFF mode.  That would help diagnose the hot-add issue as
well as issues like the ones Joe Jin reported [2] and [3].

I think what we should do is *always* call pcie_bus_configure_set(),
no matter what mode we're in, but make pcie_bus_configure_set() smart
enough to do different things (print warnings, adjust settings, do the
stuff you added in pcie_bus_update_set(), etc.) depending on what mode
we're in.

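A rough userspace sketch of that shape, under stated assumptions: one per-device handler that always runs, adjusts MPS only when the mode allows it, and warns about whatever mismatch is left.  The enum, struct fake_dev and configure_one() are hypothetical stand-ins, not the kernel's pcie_bus_configure_set(); MRRS handling for the performance mode is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Local stand-ins for the four modes listed in this mail. */
enum mps_mode { MODE_TUNE_OFF, MODE_PEER2PEER, MODE_SAFE, MODE_PERFORMANCE };

/* Hypothetical device model; sizes are in bytes. */
struct fake_dev {
	const char *name;
	int mps;	/* currently programmed MPS */
	int mpss;	/* largest MPS the device supports */
};

/*
 * Always called, whatever the mode.  Returns true if the device is still
 * programmed below its parent's MPS afterwards (and warns in that case).
 */
static bool configure_one(enum mps_mode mode, const struct fake_dev *parent,
			  struct fake_dev *dev)
{
	switch (mode) {
	case MODE_TUNE_OFF:
		break;			/* never touch MPS */
	case MODE_PEER2PEER:
		dev->mps = 128;		/* every device gets the minimum */
		break;
	case MODE_SAFE:
	case MODE_PERFORMANCE:		/* MRRS tuning omitted in this sketch */
		if (dev->mps < parent->mps && dev->mpss >= parent->mps)
			dev->mps = parent->mps;
		break;
	}

	if (dev->mps < parent->mps) {
		fprintf(stderr, "%s: MPS %d smaller than upstream MPS %d\n",
			dev->name, dev->mps, parent->mps);
		return true;
	}
	return false;
}
```

In MODE_TUNE_OFF the handler only reports the hot-add mismatch; in MODE_SAFE it raises the device to the parent's MPS when the device's MPSS permits, which corresponds to the first branch of pcie_bus_update_set() in the patch.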
Bjorn
>  		return;
> +	}
>
>  	/* FIXME - Peer to peer DMA is possible, though the endpoint would need
>  	 * to be aware to the MPS of the destination.  To work around this,
> --
> 1.7.1
>
>
[1] https://bugzilla.kernel.org/show_bug.cgi?id=60671
[2] http://lkml.kernel.org/r/4FFA9B96.6040901@oracle.com
[3] http://lkml.kernel.org/r/509B5038.8090304@oracle.com