From: Bjorn Helgaas <helgaas@kernel.org>
To: "Veal, Bryan E." <bryan.e.veal@intel.com>
Cc: Keith Busch <keith.busch@intel.com>,
LKML <linux-kernel@vger.kernel.org>,
x86@kernel.org, linux-pci@vger.kernel.org,
Thomas Gleixner <tglx@linutronix.de>,
Bjorn Helgaas <bhelgaas@google.com>,
Dan Williams <dan.j.williams@intel.com>,
Jon Derrick <jonathan.derrick@intel.com>
Subject: Re: [PATCHv8 0/5] Driver for new "VMD" device
Date: Mon, 22 Feb 2016 16:10:24 -0600 [thread overview]
Message-ID: <20160222221024.GA20879@localhost> (raw)
In-Reply-To: <20160120220111.GE7973@localhost>
On Wed, Jan 20, 2016 at 04:01:11PM -0600, Bjorn Helgaas wrote:
> On Sat, Jan 16, 2016 at 02:19:38PM -0800, Veal, Bryan E. wrote:
> > On Fri, Jan 15, 2016 at 03:49:02PM -0600, Bjorn Helgaas wrote:
> > > Even though you found this issue before posting the RFC code, I assume
> > > the issue is still relevant in the current code, and you still want to
> > > clear IORESOURCE_MEM_64, right?
> >
> > Yes.
> >
> > > This is where I get confused. IORESOURCE_MEM_64 *should* mean "the
> > > hardware register associated with this resource can accommodate a
> > > 64-bit value." If we're using IORESOURCE_MEM_64 to decide whether to
> > > use a prefetchable vs. a non-prefetchable window, that sounds broken.
> > >
> > > Can you point me to the relevant code, and maybe give an example? I'm
> > > pretty sure the code doesn't completely match the spec, and maybe this
> > > is a case where we have to set the flags non-intuitively to get the
> > > desired result.
> > >
> > > > Below the port, the "prefetchable" propoerty
> > > > *is* restrictive: the addresses can't be used for non-prefetchable BARs.
> > > >
> > > > Thus, in the specific case where a 64-bit non-prefetchable VMD bar happens
> > > > to contain a 32-bit address, removing the IORESOURCE_MEM_64 flag allows
> > > > the address resource to be used for *any* non-prefetchable BARs (32-bit or
> > > > 64-bit) downstream.
> > >
> > > If I understand correctly, these VMD BARs (VMD_MEMBAR1 and
> > > VMD_MEMBAR2) effectively become the host bridge windows available for
> > > devices below the VMD.
> > >
> > > I infer that if the VMD host bridge window is non-prefetchable and has
> > > IORESOURCE_MEM_64 set, we won't put a 32-bit non-prefetchable BAR in
> > > that window. That sounds like a bug, but let me be the first to admit
> > > that I don't understand our PCI resource allocation code.
> >
> > I don't think anything is broken. You are correct that the MEMBARs are
> > used as a host bridge window. The reason to clear the flag is a side
> > effect of that.
> >
> > For BARs, the flags describe capabilities. For resources, they are
> > interpreted as restrictions.
> >
> > If VMD has a 32-bit resource in a 64-bit non-prefetchable BAR, without
> > clearing the flag, it yields a host bridge resource, and thus root bus
> > resource, with IORESOURCE_MEM_64 set.
> >
> > Downstream of VMD, the root port's 32-bit non-prefetchable base/limit
> > registers can't handle the 64-bit resource, but the 64-bit prefetchable
> > window can, so that's where it ends up. (See pci_bus_alloc_resource().)
>
> OK, I think I finally found the critical comment, which is in
> __pci_assign_resource():
>
> Even if a 64-bit prefetchable bridge window is below 4GB, we can't
> put a 32-bit prefetchable resource in it because pbus_size_mem()
> assumes a 64-bit window will contain no 32-bit resources. If we
> assign things differently than they were sized, not everything will
> fit.
>
> There's no reason we can't put a Root Port's 32-bit non-prefetchable
> window inside a 64-bit VMD window that happens to be below 4GB,
> *except* for the fact that pbus_size_mem() assumes we won't do that.
>
> The VMD code needs a reference to that comment.
>
> I guess you're relying on BIOS to assign your non-prefetchable VMD BAR
> below 4GB even though it's a 64-bit BAR? If Linux assigned that BAR,
> e.g., after a hot-add of a VMD, we might put it above 4GB, and then
> Root Ports downstream from the VMD would not be able to use any
> non-prefetchable space.
I see another VMD patch on the list, but I'm still waiting for
resolution to this comment and question. For the first one, about
clearing IORESOURCE_MEM_64, I have in mind something like the
following patch.
I'm not sure how to deal with the question of a hot-added VMD. Maybe
all we can do now is add a comment to the effect that we assume BIOS
has assigned the non-prefetchable BAR below 4GB, and if Linux assigns
that BAR for hot-added VMDs, that assumption will likely break.
Bjorn
diff --git a/arch/x86/pci/vmd.c b/arch/x86/pci/vmd.c
index d57e480..7554722 100644
--- a/arch/x86/pci/vmd.c
+++ b/arch/x86/pci/vmd.c
@@ -532,6 +532,16 @@ static int vmd_enable_domain(struct vmd_dev *vmd)
.flags = IORESOURCE_BUS | IORESOURCE_PCI_FIXED,
};
+ /*
+ * If the window is below 4GB, clear IORESOURCE_MEM_64 so we can
+ * put 32-bit resources in the window.
+ *
+ * There's no hardware reason why a 64-bit window *couldn't*
+ * contain a 32-bit resource, but pbus_size_mem() computes the
+ * bridge window size assuming a 64-bit window will contain no
+ * 32-bit resources. __pci_assign_resource() enforces that
+ * artificial restriction to make sure everything will fit.
+ */
res = &vmd->dev->resource[VMD_MEMBAR1];
upper_bits = upper_32_bits(res->end);
flags = res->flags & ~IORESOURCE_SIZEALIGN;
next prev parent reply other threads:[~2016-02-22 22:10 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-01-12 20:18 [PATCHv8 0/5] Driver for new "VMD" device Keith Busch
2016-01-12 20:18 ` [PATCHv8 1/5] msi: Relax msi_domain_alloc() to support parentless MSI irqdomains Keith Busch
2016-01-12 20:18 ` [PATCHv8 2/5] x86/IRQ: Export IRQ domain function for module use Keith Busch
2016-01-12 20:18 ` [PATCHv8 3/5] x86/PCI: Allow PCI domain specific dma ops Keith Busch
2016-01-12 20:18 ` [PATCHv8 4/5] PCI/AER: Use 32 bit int type domains Keith Busch
2016-01-12 20:18 ` [PATCHv8 5/5] x86/PCI: Initial commit for new VMD device driver Keith Busch
2016-01-15 18:19 ` [PATCHv8 0/5] Driver for new "VMD" device Bjorn Helgaas
2016-01-15 19:31 ` Veal, Bryan E.
2016-01-15 21:49 ` Bjorn Helgaas
2016-01-16 22:19 ` Veal, Bryan E.
2016-01-20 22:01 ` Bjorn Helgaas
2016-02-22 22:10 ` Bjorn Helgaas [this message]
2016-02-23 18:24 ` Keith Busch
2016-02-25 14:42 ` Bjorn Helgaas
2016-02-25 14:50 ` Keith Busch
2016-02-26 15:29 ` Keith Busch
2016-01-15 19:32 ` Keith Busch
2016-01-15 19:44 ` Thomas Gleixner
2016-01-15 20:03 ` Bjorn Helgaas
2016-01-15 20:14 ` Thomas Gleixner
2016-01-15 19:48 ` Derrick, Jonathan
2016-01-15 19:54 ` Keith Busch
2016-01-15 20:02 ` Jon Derrick
2016-01-15 22:06 ` Bjorn Helgaas
2016-01-19 15:38 ` Keith Busch
2016-01-19 16:02 ` Christoph Hellwig
2016-01-19 16:36 ` Keith Busch
2016-01-19 22:05 ` Veal, Bryan E.
2016-01-20 20:43 ` Bjorn Helgaas
2016-01-26 16:46 ` Christoph Hellwig
2016-01-26 18:23 ` Veal, Bryan E.
2016-01-17 17:58 ` Christoph Hellwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160222221024.GA20879@localhost \
--to=helgaas@kernel.org \
--cc=bhelgaas@google.com \
--cc=bryan.e.veal@intel.com \
--cc=dan.j.williams@intel.com \
--cc=jonathan.derrick@intel.com \
--cc=keith.busch@intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=tglx@linutronix.de \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.