From: Alex Williamson <alex.williamson@redhat.com>
To: "Sean O. Stalley" <sean.stalley@intel.com>
Cc: bhelgaas@google.com, linux-pci@vger.kernel.org, david.daney@cavium.com
Subject: Re: [RFC PATCH] pci: Identify Enhanced Allocation (EA) BAR Equivalent resources
Date: Thu, 14 Jan 2016 14:14:52 -0700 [thread overview]
Message-ID: <1452806092.14628.110.camel@redhat.com> (raw)
In-Reply-To: <20160114202347.GB3381@sean.stalley.intel.com>
On Thu, 2016-01-14 at 12:23 -0800, Sean O. Stalley wrote:
> On Thu, Jan 14, 2016 at 12:16:02PM -0700, Alex Williamson wrote:
> > On Thu, 2016-01-14 at 10:34 -0800, Sean O. Stalley wrote:
> > > On Thu, Jan 14, 2016 at 10:26:56AM -0700, Alex Williamson wrote:
> > > > We've done a pretty good job of abstracting EA from drivers, but
> > > > there
> > > > are some properties of BAR Equivalent resources that don't really
> > > > jive
> > > > with traditional PCI BARs. In particular, natural alignment is
> > > > only
> > > > encouraged, not required.
> > > >
> > > > Why does this matter? There are drivers like vfio-pci that will
> > > > happily gobble up the EA abstraction that's been implemented and
> > > > expose a device using EA to userspace as if those resources are
> > > > traditional BARs. Pretty cool. The vfio API is bus agnostic, so
> > > > it
> > > > doesn't care about alignment. The problem comes with PCI config
> > > > space
> > > > emulation where we don't let userspace manipulate the BAR value,
> > > > but
> > > > we do emulate BAR sizing. The abstraction kind of falls apart if
> > > > userspace gets garbage when they try to size what appears to be a
> > > > traditional BAR, but is actually a BAR equivalent.
> > > >
> > > > We could simply round up the size in vfio to make it naturally
> > > > aligned, but then we're imposing artificial sizes to the user and
> > > > we
> > > > have the discontinuity that BAR size emulation and vfio region size
> > > > reporting don't agree on the size. I think what we want to do is
> > > > expose EA to the user, reporting traditional BARs with BEIs as
> > > > zero-sized and providing additional regions for the user to access
> > > > each EA region, whether it has a BEI or not.
> > > >
> > > > To facilitate that, a flag indicating whether a PCI resource is a
> > > > traditional BAR or BAR equivalent seems much nicer than attempting
> > > > to size the BAR ourselves or deducing it through the EA capability.
> > >
> > > If vfio does size the resource, EA entries that are aligned could
> > > still be emulated as BARs, correct?
> > >
> > > I would think that emulating a BAR would be preferred when possible,
> > > for backwards-compatibility.
> >
> > If a BEI is naturally aligned, I can't think of any problems with
> > exposing it as a traditional BAR to userspace. I agree that there may
> > be some compatibility benefits there, so it may be useful to offer both
> > options. I don't think we can combine them though, it would violate
> > the EA spec to expose the traditional BAR and and the matching BEI.
> > We'd either need to hide the fake BAR or hide the EA entry defining
> > that BEI. A module option could define which is preferred or maybe an
> > ioctl.
>
> Would any functionality be lost if vfio:
> - emulates BARs & hide EA entry when EA resources are aligned.
> - exposes EA entries when the resources aren't aligned (no BAR emulation).
> ?
>
> I'm just wondering if giving userspace the option to pick is necessary,
> or if there is a setting that is always ideal.
That certainly might be a good default and the only use case I can
think where it wouldn't be ideal is if we want to expose EA to a VM for
the purpose of doing EA testing and development in a guest. A module
option would make more sense than defining a user interface for that
case though.
> > > > Thoughts?
> > >
> > > I like the idea of adding an EA flag.
> > >
> > > There were some cases in the kernel where it would be nice to know if a
> > > resource was fixed because it was EA or if something else was fixing it.
> > > Adding that flag was discussed during the code review of the EA code,
> > > but it was decided that we could get by without it.
> > >
> > > IIRC, most of the cases that required the flag had to do with EA entries
> > > for bridges. Since bridge support wasn't added, we didn't need the flag.
> >
> > By my reading of the spec, not all BEIs need to be fixed, is this just
> > a simplification to avoid sizing and mapping a BAR that doesn't exist
> > in the traditional sense? A flag on the resource seems like it would
> > be useful for that as well if we ever wanted to add the case where an
> > AE BAR equivalent could be remapped. Thanks,
>
> All of the usable BEIs have a HwInit Base & MaxOffset, and therefore a
> fixed range. The "unavailable for use" resources aren't explicitly HwInit,
> but the spec doesn't define how/when you can move them.
>
> The spec does define a writeable bit for resources,
> but doesn't define how to use it either. I think the intention was to
> be able to expand EA in the future to cover movable resources.
Wow, that's really confusing to have a writable bit per entry and then
define properties which forbid its use. We'd need to go through the
SIG and define a whole new set of properties just to get movable
versions, crazy. Thanks for the explanation.
>
> Anyway, I think having an explicit flag that says "This Resource is from EA"
> that is independent of "This resource is fixed" is a good idea.
>
>
> Acked-by: Sean O. Stalley <sean.stalley@intel.com>
Cool, thanks
Alex
> > > > Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> > > > ---
> > > > drivers/pci/pci.c | 2 +-
> > > > include/linux/ioport.h | 2 ++
> > > > 2 files changed, 3 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > > > index 314db8c..174c734 100644
> > > > --- a/drivers/pci/pci.c
> > > > +++ b/drivers/pci/pci.c
> > > > @@ -2229,7 +2229,7 @@ void pci_pm_init(struct pci_dev *dev)
> > > >
> > > > static unsigned long pci_ea_flags(struct pci_dev *dev, u8
> > > > prop)
> > > > {
> > > > - unsigned long flags = IORESOURCE_PCI_FIXED;
> > > > + unsigned long flags = IORESOURCE_PCI_FIXED |
> > > > IORESOURCE_PCI_EA_BEI;
> > > >
> > > > switch (prop) {
> > > > case PCI_EA_P_MEM:
> > > > diff --git a/include/linux/ioport.h b/include/linux/ioport.h
> > > > index 24bea08..5acc194 100644
> > > > --- a/include/linux/ioport.h
> > > > +++ b/include/linux/ioport.h
> > > > @@ -105,6 +105,8 @@ struct resource {
> > > > /* PCI control bits. Shares IORESOURCE_BITS with above PCI
> > > > ROM. */
> > > > #define IORESOURCE_PCI_FIXED (1<<4) /*
> > > > Do
> > > > not move resource */
> > > >
> > > > +/* PCI Enhanced Allocation defined BAR equivalent resource *
> > > > +#define IORESOURCE_PCI_EA_BEI (1<<5)
> > > >
> > > > /* helpers to define resources */
> > > > #define DEFINE_RES_NAMED(_start, _size, _name, _flags)
> > > > \
> > > >
> >
next prev parent reply other threads:[~2016-01-14 21:14 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-01-14 17:26 [RFC PATCH] pci: Identify Enhanced Allocation (EA) BAR Equivalent resources Alex Williamson
2016-01-14 18:34 ` Sean O. Stalley
2016-01-14 19:16 ` Alex Williamson
2016-01-14 20:23 ` Sean O. Stalley
2016-01-14 21:14 ` Alex Williamson [this message]
2016-01-14 23:02 ` Sean O. Stalley
2016-01-14 18:54 ` David Daney
2016-01-14 19:20 ` Alex Williamson
2016-01-14 19:27 ` Sean O. Stalley
2016-01-20 20:20 ` Alex Williamson
2016-01-21 17:48 ` Sean O. Stalley
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1452806092.14628.110.camel@redhat.com \
--to=alex.williamson@redhat.com \
--cc=bhelgaas@google.com \
--cc=david.daney@cavium.com \
--cc=linux-pci@vger.kernel.org \
--cc=sean.stalley@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.