From: Yu Zhao <yuzhao@google.com>
To: Kelly.Zytaruk@amd.com
Cc: bhelgaas@google.com, linux-pci@vger.kernel.org,
alex.williamson@redhat.com
Subject: Re: Architectural question regarding IOV support in Linux 3.13.4
Date: Mon, 24 Feb 2014 13:57:08 -0800 [thread overview]
Message-ID: <20140224215708.GC9421@google.com> (raw)
In-Reply-To: <1393275541.9111.219.camel@ul30vt.home>
On Mon, Feb 24, 2014 at 01:59:01PM -0700, Alex Williamson wrote:
> On Mon, 2014-02-24 at 19:22 +0000, Zytaruk, Kelly wrote:
> > >On Fri, Feb 21, 2014 at 2:03 PM, Alex Williamson <alex.williamson@redhat.com> wrote:
> > >On Fri, 2014-02-21 at 14:11 -0700, Bjorn Helgaas wrote:
> > >> [+cc Alex, Yu]
> > >>
> > >> On Fri, Feb 21, 2014 at 10:45 AM, Zytaruk, Kelly <Kelly.Zytaruk@amd.com> wrote:
> > >> >
> > >> > I am working with SR-IOV and I have a question regarding the function
> > >> > sriov_init() in ../drivers/pci/iov.c (linux versions 3.4.9 and 3.13.4)
> > >> >
> > >> > In sriov_init() the code first checks whether the PF is a Root complex
> > >> > endpoint (0x9) or an Express Endpoint (0x0) as shown in the code
> > >> > snippet below. If it is neither it returns the No device error.
> > >> >
> > >> > static int sriov_init(struct pci_dev *dev, int pos)
> > >> > {
> > >> > int i;
> > >> > int rc;
> > >> > int nres;
> > >> > u32 pgsz;
> > >> > u16 ctrl, total, offset, stride;
> > >> > struct pci_sriov *iov;
> > >> > struct resource *res;
> > >> > struct pci_dev *pdev;
> > >> >
> > >> > if (pci_pcie_type(dev) != PCI_EXP_TYPE_RC_END &&
> > >> > pci_pcie_type(dev) != PCI_EXP_TYPE_ENDPOINT)
> > >> > return -ENODEV;
> > >> >
> > >> > My question is why PCI_EXP_TYPE_LEG_END (0x1) is omitted as being a
> > >> > valid endpoint. By excluding Legacy endpoints it fails enabling
> > >> > SR-IOV on a VGA PF.
> > >> >
> > >> > Is there a design/specification reason why legacy was excluded or was
> > >> > it just an assumption that VGA would never support SR-IOV?
> > >> >
> > >> > If there is no valid reason to exclude PCI_EXP_TYPE_LEG_END, I would
> > >> > like to discuss having it included as a valid endpoint for SR-IOV.
> > >>
> > >> Good question. It looks like it's been that way since the beginning
> > >> [1], but I don't know why. I don't see any restriction in the spec
> > >> about SR-IOV and legacy endpoints.
> > >>
> > >> I also don't know whether VGA is an issue. There are some legacy
> > >> addressing issues for [mem 0xa0000-0xbffff] and [io 0x3b0-0x3bb] and
> > >> [io 0x3c0-0x3df]. For example, when a bridge has its VGA Enable bit
> > >> set, it positively decodes [mem 0xa0000-0xbffff] even if that range
> > >> isn't included in one of the bridge windows. I don't know whether a
> > >> VGA device is similarly allowed to decode that range even if it's not
> > >> in a BAR. If it is, I could imagine issues if enabling SR-IOV created
> > >> several VGA VFs.
> > >VFs cannot support I/O port space by definition, so I don't think a "VGA
> > >VF" could actually exist. There would be nothing wrong with a non-VGA
> > >GPU VF though. I also don't see how the differences in Legacy Endpoint
> > >rules versus a standard Endpoint would preclude supporting SR-IOV. I
> > >don't think the SR-IOV spec makes any demands on whether the PF requires
> > >I/O port resources, which I assume is the main reason for this to call
> > >itself Legacy. I'd guess it was likely just an oversight and we should
> > >add legacy endpoints (or remove the test altogether and trust that if a
> > >device has an SR-IOV capability, we should initialize it). Thanks,
> > >
> > >Alex
> > >
> > >I agree. I vaguely remember there was some reason that excludes legacy
> > >endpoints from using SR-IOV. But after a quick look at the latest specs,
> > >I didn't find any.
> > >
> > Only Legacy Endpoints can claim I/O. VFs are not allowed to claim I/O.
> > VGA devices claim I/O.
> >
> > The SR-IOV spec 1.1 section 3.4.1.6 states
> > “The Class Code register is read-only and is used to identify the generic function
> > of the device and, in some cases, a specific register level programming interface.
> > The field in the PF and associated VFs must return the same value when read.”
> >
> > If the PF is a VGA device then by the definition in the SR-IOV spec the class code
> > of the VF would also indicate it as a VGA device, ie Subclass 0x0 = VGA,
> > subclass 0x80 = OTHER_DISPLAY_CONTROLLER.
> >
> > Ideally we would want to have the PF sub-class as 0x0 and the VF subclass as 0x80
> > But the spec doesn't support this.
> >
> > One speculation as to why Legacy endpoints were omitted might be the assumption
> > that doing so would allow VGA VFs to be created.
> >
> > It is not reasonable to prevent a VGA PF from enabling SR-IOV as this is a real world
> > possibility. We might need to add more code elsewhere however to prevent a VF
> > from becoming a VGA device outside of passing it through to a guest VM.
> >
> > Any thoughts on this?
>
> Good catch, I think you'll need to contact the PCI SIG for
> clarification. A VF with a VGA class code implies I/O space that a VF
> is not allowed to have. I wouldn't be surprised if they had hoped VGA
> was a non-issue by this point and legacy-free graphics were standard.
>
> Probably the most appealing solution would be an ECN allowing the VF to
> expose a different class code from the PF. This seems generally useful
> even beyond the scope of this legacy resource issue. Probably a less
> appealing option for you would be to expose the PF as a non-VGA display
> controller, which avoids both the legacy endpoint problem and the VF
> class code problem. I suspect we wouldn't find a lot of support for a
> software only solution to virtualize the class code unless it was
> standardized. Such a change would require support in every hypervisor.
> Thanks,
>
> Alex
>
SR-IOV device can have mixed function types. So the third option would
be to make the function 0 regular VGA device and function 1 PF with
subclass 0x80. By doing this, you retain the compatibility for your
host VGA driver and solve the subclass problem.
prev parent reply other threads:[~2014-02-24 22:04 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-02-21 17:45 Architectural question regarding IOV support in Linux 3.13.4 Zytaruk, Kelly
2014-02-21 21:11 ` Bjorn Helgaas
2014-02-21 22:03 ` Alex Williamson
[not found] ` <CAOUHufai=eoR+tgVB5Zdp6veCQgY=pHDSAESZ5cLOjPTw-R8CQ@mail.gmail.com>
2014-02-24 19:22 ` Zytaruk, Kelly
2014-02-24 20:51 ` Bjorn Helgaas
2014-02-24 21:08 ` Zytaruk, Kelly
2014-12-02 16:42 ` Zytaruk, Kelly
2014-12-02 21:01 ` Bjorn Helgaas
2014-12-02 21:11 ` Zytaruk, Kelly
2014-12-02 21:26 ` Bjorn Helgaas
2014-02-24 20:59 ` Alex Williamson
2014-02-24 21:57 ` Yu Zhao [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140224215708.GC9421@google.com \
--to=yuzhao@google.com \
--cc=Kelly.Zytaruk@amd.com \
--cc=alex.williamson@redhat.com \
--cc=bhelgaas@google.com \
--cc=linux-pci@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).