From: Mika Westerberg <mika.westerberg@linux.intel.com>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>,
"Rafael J . Wysocki" <rafael.j.wysocki@intel.com>,
Andy Shevchenko <andriy.shevchenko@linux.intel.com>,
Jonathan Cameron <Jonathan.Cameron@huawei.com>,
Lukas Wunner <lukas@wunner.de>,
Chris Chiu <chris.chiu@canonical.com>,
linux-pci@vger.kernel.org
Subject: Re: [PATCH 1/2] PCI: Take multifunction devices into account when distributing resources
Date: Wed, 9 Nov 2022 14:41:15 +0200 [thread overview]
Message-ID: <Y2uf62ifyeb5RciE@black.fi.intel.com> (raw)
In-Reply-To: <20221108211130.GA501583@bhelgaas>
On Tue, Nov 08, 2022 at 03:11:30PM -0600, Bjorn Helgaas wrote:
> On Thu, Nov 03, 2022 at 12:32:53PM +0200, Mika Westerberg wrote:
> > It is possible to have PCIe switch upstream port a multifunction device.
>
> I can't quite parse this. I guess the point is that a Switch Upstream
> Port may be one of the functions of a multifunction device?
Yes.
> > The resource distribution code does not take this into account properly
> > and therefore it expands the upstream port resource windows too much,
> > not leaving space for the other functions (in the multifunction device)
> > and this leads to an issue that Jonathan reported. He runs QEMU with
> > the following topoology (QEMU parameters):
> >
> > -device pcie-root-port,port=0,id=root_port13,chassis=0,slot=2 \
> > -device x3130-upstream,id=sw1,bus=root_port13,multifunction=on \
> > -device e1000,bus=root_port13,addr=0.1 \
> > -device xio3130-downstream,id=fun1,bus=sw1,chassis=0,slot=3 \
> > -device e1000,bus=fun1
> >
> > The first e1000 NIC here is another function in the switch upstream
> > port. This leads to following errors:
> >
> > pci 0000:00:04.0: bridge window [mem 0x10200000-0x103fffff] to [bus 02-04]
> > pci 0000:02:00.0: bridge window [mem 0x10200000-0x103fffff] to [bus 03-04]
> > pci 0000:02:00.1: BAR 0: failed to assign [mem size 0x00020000]
> > e1000 0000:02:00.1: can't ioremap BAR 0: [??? 0x00000000 flags 0x0]
> >
> > Fix this by taking into account the possible multifunction devices when
> > uptream port resources are distributed.
>
> Can you include the link to Jonathan's report?
Sure I will.
> > Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
> > ---
> > Hi,
> >
> > This is the formal patch that resulted from the discussion here:
> >
> > https://lore.kernel.org/linux-pci/20220905080232.36087-5-mika.westerberg@linux.intel.com/T/#m724289d0ee0c1ae07628744c283116e60efaeaf1
> >
> > Only change from that version is that we loop through all resources of
> > the multifunction device.
> >
> > drivers/pci/setup-bus.c | 63 ++++++++++++++++++++++++++++++++++++++---
> > 1 file changed, 59 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
> > index b4096598dbcb..c8787b187ee4 100644
> > --- a/drivers/pci/setup-bus.c
> > +++ b/drivers/pci/setup-bus.c
> > @@ -1830,10 +1830,65 @@ static void pci_bus_distribute_available_resources(struct pci_bus *bus,
> > * bridges below.
> > */
> > if (hotplug_bridges + normal_bridges == 1) {
> > - dev = list_first_entry(&bus->devices, struct pci_dev, bus_list);
> > - if (dev->subordinate)
> > - pci_bus_distribute_available_resources(dev->subordinate,
> > - add_list, io, mmio, mmio_pref);
> > + /* Upstream port must be the first */
>
> Do you have any citation or reasoning for this handy? We had this
> assumption before, and it seems true that an Upstream Port must be
> Function 0 because a variety of Link-related things have to be in
> Function 0, e.g., ARI ASPM Control, ARI Clock PM, Autonomous Width
> Disable, Flit Mode Disable, LTR Enable, OBFF Enable, etc. But those
> are all pretty oblique.
>
> I guess it's better to have the comment than not, but is the sort of
> assertion that makes one wonder why it is true.
Unfortunately I was not able to find such reference from the PCIe spec :(
> > + bridge = list_first_entry(&bus->devices, struct pci_dev, bus_list);
> > + if (!bridge->subordinate)
> > + return;
> > +
> > + /*
> > + * It is possible to have switch upstream port as a part
> > + * of a multifunction device. For this reason reduce the
> > + * resources occupied by the other functions before
> > + * distributing the rest.
>
> The space consumed by the peer functions of the Switch Upstream Port
> is determined by their BAR sizes, so I don't think we actually reduce
> that.
>
> I *think* the point here is to reduce the space available for
> distribution by the amount required by the peers of the Switch
> Upstream Port, right? I.e., "mmio" is the amount of space we have to
> distribute, and before splitting it across devices on the secondary
> bus, we need to save out the space required for peers on the primary
> bus.
Yes, I will update the comment accordingly.
> > + */
> > + list_for_each_entry(dev, &bus->devices, bus_list) {
> > + int i;
> > +
> > + if (dev == bridge)
> > + continue;
> > +
> > + /*
> > + * It should be multifunction but if not stop
> > + * the distribution and bail out.
> > + */
> > + if (!dev->multifunction)
> > + return;
>
> Why do we bother with this? If there are multiple devices on the bus,
> don't we want to consider them all, regardless of whether
> dev->multifunction is set? It seems like a gratuitous check.
Agreed, I will remove it.
>
> > + for (i = 0; i < PCI_NUM_RESOURCES; i++) {
> > + const struct resource *dev_res = &dev->resource[i];
> > + resource_size_t dev_sz;
> > + struct resource *b_res;
> > +
> > + if (dev_res->flags & IORESOURCE_IO) {
> > + b_res = &io;
> > + } else if (dev_res->flags & IORESOURCE_MEM) {
> > + if (dev_res->flags & IORESOURCE_PREFETCH)
> > + b_res = &mmio_pref;
> > + else
> > + b_res = &mmio;
> > + } else {
> > + continue;
> > + }
> > +
> > + /* Size aligned to bridge window */
> > + align = pci_resource_alignment(bridge, b_res);
> > + dev_sz = ALIGN(resource_size(dev_res), align);
> > +
> > + pci_dbg(dev, "%pR aligned to %llx\n", dev_res,
>
> %#llx to avoid confusion and match other output.
OK.
prev parent reply other threads:[~2022-11-09 12:40 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-11-03 10:32 [PATCH 1/2] PCI: Take multifunction devices into account when distributing resources Mika Westerberg
2022-11-03 10:32 ` [PATCH 2/2] Revert "Revert "PCI: Distribute available resources for root buses, too"" Mika Westerberg
2022-11-08 21:11 ` [PATCH 1/2] PCI: Take multifunction devices into account when distributing resources Bjorn Helgaas
2022-11-09 12:41 ` Mika Westerberg [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Y2uf62ifyeb5RciE@black.fi.intel.com \
--to=mika.westerberg@linux.intel.com \
--cc=Jonathan.Cameron@huawei.com \
--cc=andriy.shevchenko@linux.intel.com \
--cc=bhelgaas@google.com \
--cc=chris.chiu@canonical.com \
--cc=helgaas@kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=lukas@wunner.de \
--cc=rafael.j.wysocki@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).