qemu-devel.nongnu.org archive mirror
* [Qemu-devel] Attaching PCI devices to the PCIe root complex
@ 2013-09-24 10:01 Laine Stump
  2013-09-25  7:01 ` Michael S. Tsirkin
  2013-10-01 21:14 ` Michael S. Tsirkin
  0 siblings, 2 replies; 19+ messages in thread
From: Laine Stump @ 2013-09-24 10:01 UTC (permalink / raw)
  To: qemu list

When I added support for the Q35-based machine types to libvirt, I
specifically prohibited attaching any PCI devices (with the exception of
graphics controllers) to the PCIe root complex, and had planned to
prevent attaching them to PCIe root ports (ioh3420 device) and PCIe
downstream switch ports (xio3130-downstream device) as well. I did this
because, even though qemu currently allows attaching a conventional PCI
device in any of these three places, the restriction exists for real
hardware, and I didn't see any guarantee that qemu wouldn't add the
restriction in the future in order to more closely emulate real hardware.

However, since I did that, I've learned that many of the qemu "pci"
devices really should be considered as "pci or pcie". Gerd Hoffmann lists
some of these cases in a bug he filed against libvirt:

   https://bugzilla.redhat.com/show_bug.cgi?id=1003983

I would like to loosen up the restrictions in libvirt, but want to make
sure that I don't allow something that could later be forbidden by qemu
(thus creating a compatibility problem during upgrades). Beyond Gerd's
specific requests to allow ehci, uhci, and hda controllers to attach to
PCIe ports, are there any other devices that I specifically should or
shouldn't allow? (I would rather be conservative in what I allow - it's
easy to allow more things later, but nearly impossible to revoke
permission once it's been allowed).

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Qemu-devel] Attaching PCI devices to the PCIe root complex
  2013-09-24 10:01 [Qemu-devel] Attaching PCI devices to the PCIe root complex Laine Stump
@ 2013-09-25  7:01 ` Michael S. Tsirkin
  2013-09-25  8:48   ` Marcel Apfelbaum
  2013-10-01 21:14 ` Michael S. Tsirkin
  1 sibling, 1 reply; 19+ messages in thread
From: Michael S. Tsirkin @ 2013-09-25  7:01 UTC (permalink / raw)
  To: Laine Stump; +Cc: qemu list

On Tue, Sep 24, 2013 at 06:01:02AM -0400, Laine Stump wrote:
> When I added support for the Q35-based machinetypes to libvirt, I
> specifically prohibited attaching any PCI devices (with the exception of
> graphics controllers) to the PCIe root complex,

That's wrong I think. Anything attached to RC is an integrated
endpoint, and these can be PCI devices.

> and had planned to
> prevent attaching them to PCIe root ports (ioh3420 device) and PCIe
> downstream switch ports (xio-3130 device) as well. I did this because,
> even though qemu currently allows attaching a normal PCI device in any
> of these three places, the restriction exists for real hardware and I
> didn't see any guarantee that qemu wouldn't add the restriction in the
> future in order to more closely emulate real hardware.
> 
> However, since I did that, I've learned that many of the qemu "pci"
> devices really should be considered as "pci or pcie". Gerd Hoffman lists
> some of these cases in a bug he filed against libvirt:
> 
>    https://bugzilla.redhat.com/show_bug.cgi?id=1003983
> 
> I would like to loosen up the restrictions in libvirt, but want to make
> sure that I don't allow something that could later be forbidden by qemu
> (thus creating a compatibility problem during upgrades). Beyond Gerd's
> specific requests to allow ehci, uhci, and hda controllers to attach to
> PCIe ports, are there any other devices that I specifically should or
> shouldn't allow? (I would rather be conservative in what I allow - it's
> easy to allow more things later, but nearly impossible to revoke
> permission once it's been allowed).

IMO, we really need to grow an interface to query this kind of thing.
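
To make the idea concrete, here is a purely hypothetical sketch (this is
not an existing QMP command; the function name, device models, and field
names are all invented) of the kind of per-device placement metadata such
a query interface might return:

```python
# Hypothetical sketch only: the shape of placement info a future
# query interface might expose. Nothing here exists in QEMU today.

def query_plug_options(device_model):
    """Return invented placement metadata for a device model."""
    # Toy data for two device models mentioned in this thread.
    table = {
        "e1000":   {"plugs_into": ["pci", "pcie-port"], "hotpluggable": True},
        "ioh3420": {"plugs_into": ["pcie-root"],        "hotpluggable": False},
    }
    return table[device_model]

info = query_plug_options("e1000")
assert "pcie-port" in info["plugs_into"]
```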

-- 
MST


* Re: [Qemu-devel] Attaching PCI devices to the PCIe root complex
  2013-09-25  7:01 ` Michael S. Tsirkin
@ 2013-09-25  8:48   ` Marcel Apfelbaum
  2013-09-25  8:59     ` Michael S. Tsirkin
                       ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Marcel Apfelbaum @ 2013-09-25  8:48 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: qemu list, Laine Stump

On Wed, 2013-09-25 at 10:01 +0300, Michael S. Tsirkin wrote:
> On Tue, Sep 24, 2013 at 06:01:02AM -0400, Laine Stump wrote:
> > When I added support for the Q35-based machinetypes to libvirt, I
> > specifically prohibited attaching any PCI devices (with the exception of
> > graphics controllers) to the PCIe root complex,
> 
> That's wrong I think. Anything attached to RC is an integrated
> endpoint, and these can be PCI devices.
I couldn't find in the PCIe spec any mention that a "Root Complex Integrated
Endpoint" must be PCIe. But, from spec section 1.3.2.3:
- A Root Complex Integrated Endpoint must not require I/O resources claimed through BAR(s).
- A Root Complex Integrated Endpoint must not generate I/O Requests.
- A Root Complex Integrated Endpoint is required to support MSI or MSI-X or both if an
interrupt resource is requested.
I suppose that this restriction can be removed for PCI devices that:
1. Actually work when plugged in as a Root Complex Integrated Endpoint
2. Comply with the above limitations
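
The three spec rules quoted above can be checked mechanically. A minimal
sketch, assuming a toy device descriptor (the dictionary format is
invented for illustration, not a real QEMU structure):

```python
# Sketch of checking the Root Complex Integrated Endpoint rules from
# PCIe spec 1.3.2.3 against a toy device descriptor.

def rciep_compliant(dev):
    # Rule 1: must not require I/O resources claimed through BAR(s).
    if any(bar["type"] == "io" for bar in dev["bars"]):
        return False
    # Rule 2: must not generate I/O Requests.
    if dev.get("generates_io_requests"):
        return False
    # Rule 3: must support MSI and/or MSI-X if it requests an interrupt.
    if dev.get("wants_interrupt") and not (dev.get("msi") or dev.get("msix")):
        return False
    return True

# A device with an I/O BAR, like the integrated VGA adapter shown in
# the lspci output later in this thread, fails rule 1 even with MSI.
vga = {"bars": [{"type": "mem"}, {"type": "io"}],
       "wants_interrupt": True, "msi": True}
assert not rciep_compliant(vga)
```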

> 
> > and had planned to
> > prevent attaching them to PCIe root ports (ioh3420 device) and PCIe
> > downstream switch ports (xio-3130 device) as well. I did this because,
> > even though qemu currently allows attaching a normal PCI device in any
> > of these three places, the restriction exists for real hardware and I
> > didn't see any guarantee that qemu wouldn't add the restriction in the
> > future in order to more closely emulate real hardware.
> > 
> > However, since I did that, I've learned that many of the qemu "pci"
> > devices really should be considered as "pci or pcie". Gerd Hoffman lists
> > some of these cases in a bug he filed against libvirt:
> > 
> >    https://bugzilla.redhat.com/show_bug.cgi?id=1003983
> > 
> > I would like to loosen up the restrictions in libvirt, but want to make
> > sure that I don't allow something that could later be forbidden by qemu
> > (thus creating a compatibility problem during upgrades). Beyond Gerd's
> > specific requests to allow ehci, uhci, and hda controllers to attach to
> > PCIe ports, are there any other devices that I specifically should or
> > shouldn't allow? (I would rather be conservative in what I allow - it's
> > easy to allow more things later, but nearly impossible to revoke
> > permission once it's been allowed).
For the moment I would not remove all the restrictions, only the ones
requested and verified by somebody.

> 
> IMO, we really need to grow an interface to query this kind of thing.
Basically libvirt needs to know:
1. for (libvirt) controllers: what kind of devices can be plugged in
2. for devices (a controller is also a device):
    - which controllers it can be plugged into
    - does it support hot-plug?
3. the implicit controllers of the machine types (q35 - "pcie-root", i440fx - "pci-root")
All of the above must be exported to libvirt.

Implementation options:
1. Add a compliance field on PCI/PCIe devices and controllers stating whether
   each supports PCI, PCIe, or both (and maybe hot-plug)
   - consider plug type + compliance to figure out whether a plug can go into a socket

2. Use Markus Armbruster's idea of introducing a concept of "plugs and sockets":
   - dividing the devices into adapters and plugs
   - adding sockets to bridges (buses?)
   In this way it would be clear which devices can connect to which bridges
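
Option 1 could be sketched roughly like this (device names are taken from
this thread, but the compliance values and field names are invented for
illustration):

```python
# Sketch of option 1: a compliance field per device model plus a
# plug-type check against the socket a controller exposes.

PCI, PCIE = "pci", "pcie"

devices = {
    # device model: (bus types it complies with, supports hot-plug)
    "pci-bridge":     ({PCI}, True),
    "ich9-usb-ehci1": ({PCI, PCIE}, False),  # "pci or pcie" per Gerd's list
}

sockets = {
    # controller: bus type of the sockets it exposes
    "pcie-root":  PCIE,
    "pci-bridge": PCI,
}

def can_plug(device, controller, hot=False):
    compliance, hotpluggable = devices[device]
    if sockets[controller] not in compliance:
        return False
    return hotpluggable or not hot

assert can_plug("ich9-usb-ehci1", "pcie-root")            # cold-plug OK
assert not can_plug("ich9-usb-ehci1", "pcie-root", hot=True)
```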

Any thoughts?
Thanks,
Marcel


 



* Re: [Qemu-devel] Attaching PCI devices to the PCIe root complex
  2013-09-25  8:48   ` Marcel Apfelbaum
@ 2013-09-25  8:59     ` Michael S. Tsirkin
  2013-10-02  8:53       ` Paolo Bonzini
  2013-09-25  9:39     ` Laine Stump
  2013-09-27 17:06     ` Markus Armbruster
  2 siblings, 1 reply; 19+ messages in thread
From: Michael S. Tsirkin @ 2013-09-25  8:59 UTC (permalink / raw)
  To: marcel.a; +Cc: libvir-list, qemu list, Laine Stump

On Wed, Sep 25, 2013 at 11:48:28AM +0300, Marcel Apfelbaum wrote:
> On Wed, 2013-09-25 at 10:01 +0300, Michael S. Tsirkin wrote:
> > On Tue, Sep 24, 2013 at 06:01:02AM -0400, Laine Stump wrote:
> > > When I added support for the Q35-based machinetypes to libvirt, I
> > > specifically prohibited attaching any PCI devices (with the exception of
> > > graphics controllers) to the PCIe root complex,
> > 
> > That's wrong I think. Anything attached to RC is an integrated
> > endpoint, and these can be PCI devices.
> I couldn't find on PCIe spec any mention that "Root Complex Integrated EndPoint"
> must be PCIe. But, from spec 1.3.2.3:
> - A Root Complex Integrated Endpoint must not require I/O resources claimed through BAR(s).
> - A Root Complex Integrated Endpoint must not generate I/O Requests.
> - A Root Complex Integrated Endpoint is required to support MSI or MSI-X or both if an
> interrupt resource is requested.

Heh PCI-SIG keeps fighting against legacy interrupts and IO.
But lots of hardware happily ignores these rules.
And the reason is simple: software does not enforce them.
Here's integrated stuff on my laptop:

00:02.0 VGA compatible controller: Intel Corporation 2nd Generation Core
Processor Family Integrated Graphics Controller (rev 09) (prog-if 00
[VGA controller])
        Subsystem: Lenovo Device 21cf
        Flags: bus master, fast devsel, latency 0, IRQ 43
        Memory at f0000000 (64-bit, non-prefetchable) [size=4M]
        Memory at e0000000 (64-bit, prefetchable) [size=256M]
        I/O ports at 5000 [size=64]
        Expansion ROM at <unassigned> [disabled]
        Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit-
        Capabilities: [d0] Power Management version 2
        Capabilities: [a4] PCI Advanced Features
        Kernel driver in use: i915

So it has an IO BAR.


00:1a.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset
Family USB Enhanced Host Controller #2 (rev 04) (prog-if 20 [EHCI])
        Subsystem: Lenovo Device 21cf
        Flags: bus master, medium devsel, latency 0, IRQ 16
        Memory at f252a000 (32-bit, non-prefetchable) [size=1K]
        Capabilities: [50] Power Management version 2
        Capabilities: [58] Debug port: BAR=1 offset=00a0
        Capabilities: [98] PCI Advanced Features
        Kernel driver in use: ehci-pci

So IRQ but no MSI.


> I suppose that this restriction can be removed for PCI devices that
> 1. Actually work when plugged in into RC Integrated EndPoint
> 2. Respond to the above limitations

These limitations are just guidance for future devices.
They can't change the past; such devices have already been made.

> > 
> > > and had planned to
> > > prevent attaching them to PCIe root ports (ioh3420 device) and PCIe
> > > downstream switch ports (xio-3130 device) as well. I did this because,
> > > even though qemu currently allows attaching a normal PCI device in any
> > > of these three places, the restriction exists for real hardware and I
> > > didn't see any guarantee that qemu wouldn't add the restriction in the
> > > future in order to more closely emulate real hardware.
> > > 
> > > However, since I did that, I've learned that many of the qemu "pci"
> > > devices really should be considered as "pci or pcie". Gerd Hoffman lists
> > > some of these cases in a bug he filed against libvirt:
> > > 
> > >    https://bugzilla.redhat.com/show_bug.cgi?id=1003983
> > > 
> > > I would like to loosen up the restrictions in libvirt, but want to make
> > > sure that I don't allow something that could later be forbidden by qemu
> > > (thus creating a compatibility problem during upgrades). Beyond Gerd's
> > > specific requests to allow ehci, uhci, and hda controllers to attach to
> > > PCIe ports, are there any other devices that I specifically should or
> > > shouldn't allow? (I would rather be conservative in what I allow - it's
> > > easy to allow more things later, but nearly impossible to revoke
> > > permission once it's been allowed).
> For the moment I would not remove any restrictions, but only the ones
> requested and verified by somebody.
> 
> > 
> > IMO, we really need to grow an interface to query this kind of thing.
> Basically libvirt needs to know:
> 1. for (libvirt) controllers: what kind of devices can be plugged in
> 2. for devices (controller is also a device)
>     - to which controllers can it be plugged in
>     - does it support hot-plug?
> 3. implicit controllers of the machine types (q35 - "pcie-root", i440fx - "pci-root")
> All the above must be exported to libvirt
> 
> Implementation options:
> 1. Add a compliance field on PCI/PCIe devices and controllers stating if it supports
>    PCI/PCIe or both (and maybe hot-plug)
>    - consider plug type + compliance to figure out whether a plug can go into a socket
>    
> 2. Use Markus Armbruster idea of introducing a concept of "plug and sockets":
>    - dividing the devices into adapters and plugs
>    - adding sockets to bridges(buses?).
>    In this way it would be clear which devices can connect to bridges
> 
> Any thoughts?
> Thanks,
> Marcel

It's all not too hard to implement; we just need to know
what kind of interface makes sense for management.

So Cc'ing libvir-list@redhat.com.



* Re: [Qemu-devel] Attaching PCI devices to the PCIe root complex
  2013-09-25  8:48   ` Marcel Apfelbaum
  2013-09-25  8:59     ` Michael S. Tsirkin
@ 2013-09-25  9:39     ` Laine Stump
  2013-09-25 10:00       ` Michael S. Tsirkin
  2013-09-27 17:06     ` Markus Armbruster
  2 siblings, 1 reply; 19+ messages in thread
From: Laine Stump @ 2013-09-25  9:39 UTC (permalink / raw)
  To: marcel.a; +Cc: qemu list, Marcel Apfelbaum, Michael S. Tsirkin

On 09/25/2013 04:48 AM, Marcel Apfelbaum wrote:
> On Wed, 2013-09-25 at 10:01 +0300, Michael S. Tsirkin wrote:
>> On Tue, Sep 24, 2013 at 06:01:02AM -0400, Laine Stump wrote:
>>> When I added support for the Q35-based machinetypes to libvirt, I
>>> specifically prohibited attaching any PCI devices (with the exception of
>>> graphics controllers) to the PCIe root complex,
>> That's wrong I think. Anything attached to RC is an integrated
>> endpoint, and these can be PCI devices.
> I couldn't find on PCIe spec any mention that "Root Complex Integrated EndPoint"
> must be PCIe. But, from spec 1.3.2.3:
> - A Root Complex Integrated Endpoint must not require I/O resources claimed through BAR(s).
> - A Root Complex Integrated Endpoint must not generate I/O Requests.
> - A Root Complex Integrated Endpoint is required to support MSI or MSI-X or both if an
> interrupt resource is requested.

So much easier in the real world, where the rule is "if it fits in the
socket and the pins match up, then it's okay" :-)


>> IMO, we really need to grow an interface to query this kind of thing.
> Basically libvirt needs to know:
> 1. for (libvirt) controllers: what kind of devices can be plugged in

The controllers also need a flag indicating whether they support having
devices hot-plugged into them. For example, as far as I understand, the
PCIe root complex ("pcie-root" in libvirt) doesn't support hot-plugging
devices, nor does i82801b11-bridge ("dmi-to-pci-bridge" in libvirt), but
pci-bridge, ioh3420 ("root-port" in libvirt), and xio3130-downstream
("downstream-switch-port" in libvirt) *do* support hot-plugging devices.
(As far as I know, none of these controllers can themselves be
hot-plugged into another controller, but that could change in the
future; e.g. I recall someone saying something about hot-plug of a
pci-bridge being a future possibility.)
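
The controller properties just described could be captured in a simple
table like the following sketch (the flags mirror the statements in this
message, which are themselves hedged with "as far as I understand"):

```python
# Sketch: hot-plug capability of the controllers named above,
# keyed by their libvirt controller model name.
accepts_hotplug = {
    "pcie-root":              False,  # PCIe root complex
    "dmi-to-pci-bridge":      False,  # i82801b11-bridge
    "pci-bridge":             True,
    "root-port":              True,   # ioh3420
    "downstream-switch-port": True,   # xio3130-downstream
}

hotplug_targets = sorted(k for k, v in accepts_hotplug.items() if v)
assert hotplug_targets == ["downstream-switch-port", "pci-bridge", "root-port"]
```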


> 2. for devices (controller is also a device)
>     - to which controllers can it be plugged in
>     - does it support hot-plug?
> 3. implicit controllers of the machine types (q35 - "pcie-root", i440fx - "pci-root")
> All the above must be exported to libvirt
>
> Implementation options:
> 1. Add a compliance field on PCI/PCIe devices and controllers stating if it supports
>    PCI/PCIe or both (and maybe hot-plug)
>    - consider plug type + compliance to figure out whether a plug can go into a socket
>    
> 2. Use Markus Armbruster idea of introducing a concept of "plug and sockets":
>    - dividing the devices into adapters and plugs
>    - adding sockets to bridges(buses?).
>    In this way it would be clear which devices can connect to bridges

However it is done, each controller will need to have two sets of flags
- one for what it can plug into, and one for what can be plugged into it.


* Re: [Qemu-devel] Attaching PCI devices to the PCIe root complex
  2013-09-25  9:39     ` Laine Stump
@ 2013-09-25 10:00       ` Michael S. Tsirkin
  2013-09-25 10:14         ` Laine Stump
  0 siblings, 1 reply; 19+ messages in thread
From: Michael S. Tsirkin @ 2013-09-25 10:00 UTC (permalink / raw)
  To: Laine Stump; +Cc: qemu list, Marcel Apfelbaum, marcel.a

On Wed, Sep 25, 2013 at 05:39:37AM -0400, Laine Stump wrote:
> On 09/25/2013 04:48 AM, Marcel Apfelbaum wrote:
> > On Wed, 2013-09-25 at 10:01 +0300, Michael S. Tsirkin wrote:
> >> On Tue, Sep 24, 2013 at 06:01:02AM -0400, Laine Stump wrote:
> >>> When I added support for the Q35-based machinetypes to libvirt, I
> >>> specifically prohibited attaching any PCI devices (with the exception of
> >>> graphics controllers) to the PCIe root complex,
> >> That's wrong I think. Anything attached to RC is an integrated
> >> endpoint, and these can be PCI devices.
> > I couldn't find on PCIe spec any mention that "Root Complex Integrated EndPoint"
> > must be PCIe. But, from spec 1.3.2.3:
> > - A Root Complex Integrated Endpoint must not require I/O resources claimed through BAR(s).
> > - A Root Complex Integrated Endpoint must not generate I/O Requests.
> > - A Root Complex Integrated Endpoint is required to support MSI or MSI-X or both if an
> > interrupt resource is requested.
> 
> So much easier in the real world, where the rule is "if it fits in the
> socket and the pins match up, then it's okay" :-)
> 

Well, real-world hardware has even more limitations,
like different PCIe form factors.

Also, I'm not aware of such hardware for PCI/PCIe specifically,
but hardware with support for multiple buses
does exist; e.g. I have a disk with both USB and eSATA
interfaces.
It could also be the same chip with different interfaces on top:
at some point Intel made hardware that looked almost exactly
like a PCI-X part except it had a PCIe interface.

We can stick all this info into the hardware type, but it's
a really ugly way to do it.

See for example virtio-net-pci, which users still can't
wrap their heads around.
They really want to just say "use a virtio net device".


> >> IMO, we really need to grow an interface to query this kind of thing.
> > Basically libvirt needs to know:
> > 1. for (libvirt) controllers: what kind of devices can be plugged in
> 
> The controllers also need a flag indicating if they supporting having
> devices hot-plugged into them. For example, as far as I understand, the
> PCI root complex ("pcie-root" in libvirt) doesn't support hot-plugging
> devices,

Not exactly. It doesn't support native hotplug.

> nor does i82801b11-bridge ("dmi-to-pci-bridge" in libvirt), but
> pci-bridge, ioh3420 ("root-port" in libvirt), and xio3130-downstream
> ("downstream-switch-port" in libvirt) *do* support hot-plugging devices
> (as far as I know, none of these controllers can themselves be
> hot-plugged into another controller, but that could change in the
> future, e.g. I recall someone saying something about  hot-plug of a
> pci-bridge being a future possibility)
> 
> 
> > 2. for devices (controller is also a device)
> >     - to which controllers can it be plugged in
> >     - does it support hot-plug?
> > 3. implicit controllers of the machine types (q35 - "pcie-root", i440fx - "pci-root")
> > All the above must be exported to libvirt
> >
> > Implementation options:
> > 1. Add a compliance field on PCI/PCIe devices and controllers stating if it supports
> >    PCI/PCIe or both (and maybe hot-plug)
> >    - consider plug type + compliance to figure out whether a plug can go into a socket
> >    
> > 2. Use Markus Armbruster idea of introducing a concept of "plug and sockets":
> >    - dividing the devices into adapters and plugs
> >    - adding sockets to bridges(buses?).
> >    In this way it would be clear which devices can connect to bridges
> 
> However it is done, each controller will need to have two sets of flags
> - one for what it can plug into, and one for what can be plugged into it.

Not just into it - what can be plugged into each slot.


* Re: [Qemu-devel] Attaching PCI devices to the PCIe root complex
  2013-09-25 10:00       ` Michael S. Tsirkin
@ 2013-09-25 10:14         ` Laine Stump
  2013-09-25 10:56           ` Michael S. Tsirkin
  0 siblings, 1 reply; 19+ messages in thread
From: Laine Stump @ 2013-09-25 10:14 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: qemu list, Marcel Apfelbaum, marcel.a

On 09/25/2013 06:00 AM, Michael S. Tsirkin wrote:
> On Wed, Sep 25, 2013 at 05:39:37AM -0400, Laine Stump wrote:
>> However it is done, each controller will need to have two sets of flags
>> - one for what it can plug into, and one for what can be plugged into it.
> Not just into it - what can be plugged into each slot.

There are controllers that have different capabilities for different
slots? Well, that complicates things a bit...


* Re: [Qemu-devel] Attaching PCI devices to the PCIe root complex
  2013-09-25 10:14         ` Laine Stump
@ 2013-09-25 10:56           ` Michael S. Tsirkin
  2013-09-25 10:58             ` Michael S. Tsirkin
  0 siblings, 1 reply; 19+ messages in thread
From: Michael S. Tsirkin @ 2013-09-25 10:56 UTC (permalink / raw)
  To: Laine Stump; +Cc: qemu list, Marcel Apfelbaum, marcel.a

On Wed, Sep 25, 2013 at 06:14:27AM -0400, Laine Stump wrote:
> On 09/25/2013 06:00 AM, Michael S. Tsirkin wrote:
> > On Wed, Sep 25, 2013 at 05:39:37AM -0400, Laine Stump wrote:
> >> However it is done, each controller will need to have two sets of flags
> >> - one for what it can plug into, and one for what can be plugged into it.
> > Not just into it - what can be plugged into each slot.
> 
> There are controllers that have different capabilities for different
> slots?

For one, not all slots support hotplug.

> Well, that complicates things a bit...

Lots of controllers have multiple interfaces.
You might not call them "slots", and we might
model them as multiple devices at the moment, but that's
an artifact of QEMU - they are often a single chip.

-- 
mST


* Re: [Qemu-devel] Attaching PCI devices to the PCIe root complex
  2013-09-25 10:56           ` Michael S. Tsirkin
@ 2013-09-25 10:58             ` Michael S. Tsirkin
  0 siblings, 0 replies; 19+ messages in thread
From: Michael S. Tsirkin @ 2013-09-25 10:58 UTC (permalink / raw)
  To: Laine Stump; +Cc: qemu list, Marcel Apfelbaum, marcel.a

On Wed, Sep 25, 2013 at 01:56:20PM +0300, Michael S. Tsirkin wrote:
> On Wed, Sep 25, 2013 at 06:14:27AM -0400, Laine Stump wrote:
> > On 09/25/2013 06:00 AM, Michael S. Tsirkin wrote:
> > > On Wed, Sep 25, 2013 at 05:39:37AM -0400, Laine Stump wrote:
> > >> However it is done, each controller will need to have two sets of flags
> > >> - one for what it can plug into, and one for what can be plugged into it.
> > > Not just into it - what can be plugged into each slot.
> > 
> > There are controllers that have different capabilities for different
> > slots?
> 
> For once, not all slots can support hotplug.
> 
> > Well, that complicates things a bit...
> 
> Lots of controllers have multiple interfaces.
> You might not call them "slots" and we might
> model them as multiple devices ATM, but it's
> an artifact of QEMU - they are often a single chip.

qemu normally calls them buses

> -- 
> mST


* Re: [Qemu-devel] Attaching PCI devices to the PCIe root complex
  2013-09-25  8:48   ` Marcel Apfelbaum
  2013-09-25  8:59     ` Michael S. Tsirkin
  2013-09-25  9:39     ` Laine Stump
@ 2013-09-27 17:06     ` Markus Armbruster
  2013-09-28 18:12       ` Michael S. Tsirkin
  2 siblings, 1 reply; 19+ messages in thread
From: Markus Armbruster @ 2013-09-27 17:06 UTC (permalink / raw)
  To: marcel.a
  Cc: Anthony Liguori, Andreas Färber, qemu list, Laine Stump,
	Michael S. Tsirkin

Marcel Apfelbaum <marcel.apfelbaum@gmail.com> writes:

> On Wed, 2013-09-25 at 10:01 +0300, Michael S. Tsirkin wrote:
>> On Tue, Sep 24, 2013 at 06:01:02AM -0400, Laine Stump wrote:
>> > When I added support for the Q35-based machinetypes to libvirt, I
>> > specifically prohibited attaching any PCI devices (with the exception of
>> > graphics controllers) to the PCIe root complex,
>> 
>> That's wrong I think. Anything attached to RC is an integrated
>> endpoint, and these can be PCI devices.
> I couldn't find on PCIe spec any mention that "Root Complex Integrated EndPoint"
> must be PCIe. But, from spec 1.3.2.3:
> - A Root Complex Integrated Endpoint must not require I/O resources
> claimed through BAR(s).
> - A Root Complex Integrated Endpoint must not generate I/O Requests.
> - A Root Complex Integrated Endpoint is required to support MSI or
> MSI-X or both if an
> interrupt resource is requested.
> I suppose that this restriction can be removed for PCI devices that
> 1. Actually work when plugged in into RC Integrated EndPoint
> 2. Respond to the above limitations
>
>> 
>> > and had planned to
>> > prevent attaching them to PCIe root ports (ioh3420 device) and PCIe
>> > downstream switch ports (xio-3130 device) as well. I did this because,
>> > even though qemu currently allows attaching a normal PCI device in any
>> > of these three places, the restriction exists for real hardware and I
>> > didn't see any guarantee that qemu wouldn't add the restriction in the
>> > future in order to more closely emulate real hardware.
>> > 
>> > However, since I did that, I've learned that many of the qemu "pci"
>> > devices really should be considered as "pci or pcie". Gerd Hoffman lists
>> > some of these cases in a bug he filed against libvirt:
>> > 
>> >    https://bugzilla.redhat.com/show_bug.cgi?id=1003983
>> > 
>> > I would like to loosen up the restrictions in libvirt, but want to make
>> > sure that I don't allow something that could later be forbidden by qemu
>> > (thus creating a compatibility problem during upgrades). Beyond Gerd's
>> > specific requests to allow ehci, uhci, and hda controllers to attach to
>> > PCIe ports, are there any other devices that I specifically should or
>> > shouldn't allow? (I would rather be conservative in what I allow - it's
>> > easy to allow more things later, but nearly impossible to revoke
>> > permission once it's been allowed).
> For the moment I would not remove any restrictions, but only the ones
> requested and verified by somebody.
>
>> 
>> IMO, we really need to grow an interface to query this kind of thing.
> Basically libvirt needs to know:
> 1. for (libvirt) controllers: what kind of devices can be plugged in
> 2. for devices (controller is also a device)
>     - to which controllers can it be plugged in
>     - does it support hot-plug?
> 3. implicit controllers of the machine types (q35 - "pcie-root", i440fx - "pci-root")
> All the above must be exported to libvirt
>
> Implementation options:
> 1. Add a compliance field on PCI/PCIe devices and controllers stating if it supports
>    PCI/PCIe or both (and maybe hot-plug)
>    - consider plug type + compliance to figure out whether a plug can go into a socket
>    
> 2. Use Markus Armbruster idea of introducing a concept of "plug and sockets":
>    - dividing the devices into adapters and plugs
>    - adding sockets to bridges(buses?).
>    In this way it would be clear which devices can connect to bridges

This isn't actually my idea.  It's how things are designed to work in
qdev, at least in my admittedly limited understanding of qdev.

In traditional qdev, a device has exactly one plug (its "bus type",
shown by -device help), and it may have one or more buses.  Each bus
has a type, and you can plug a device only into a bus of the matching
type.  This was too limiting, and is not how things work now.

As far as I know, libvirt already understands that a device can only
plug into a matching bus.

In my understanding of where we're headed with qdev, things are / will
become more general, yet stay really simple:

* A device can have an arbitrary number of sockets and plugs.

* Each socket / plug has a type.

* A plug can go into a socket of the same type, and only there.

Pretty straightforward generalization of traditional qdev.  I wouldn't
expect libvirt to have serious trouble coping with it, as long as we
provide the necessary information on device models' plugs and sockets.

In this framework, there's no such thing as a device model that can plug
either into a PCI or a PCIe socket.  Makes sense to me, because there's
no such thing in the physical world, either.

Instead, and just like in the physical world, you have one separate
device variant per desired plug type.

To get that, you have to split the device into a common core and bus
adapters.  You compose the core with the PCI adapter to get the PCI
device, with the PCIe adapter to get the PCIe device, and so forth.
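
The core-plus-adapter composition described above might look like this
(a purely illustrative sketch; all class and variable names are
invented, and this is not how qdev is actually implemented):

```python
# Sketch of the plug/socket idea: a plug fits a socket only if the
# types match, and the PCI vs. PCIe variants of a device are built by
# composing one common core with different bus adapters.

class Socket:
    def __init__(self, bus_type):
        self.bus_type = bus_type

class Adapter:
    def __init__(self, plug_type):
        self.plug_type = plug_type

class Device:
    def __init__(self, core_name, adapter):
        # One device variant per desired plug type.
        self.name = f"{core_name}-{adapter.plug_type}"
        self.adapter = adapter

def plug(device, socket):
    # A plug goes into a socket of the same type, and only there.
    return device.adapter.plug_type == socket.bus_type

virtio_net_pci  = Device("virtio-net", Adapter("pci"))
virtio_net_pcie = Device("virtio-net", Adapter("pcie"))

assert plug(virtio_net_pci, Socket("pci"))
assert not plug(virtio_net_pci, Socket("pcie"))
```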

I'm not claiming that's the best way to do PCI + PCIe.  It's a purely
theoretical approach, concerned only with conceptual cleanliness, not
practical coding difficulties.

What we have now is entirely different: we've overloaded the existing
PCI plug with all the other PCI-related plug types that came with the
PCIe support, so it means pretty much nothing anymore.  In particular,
there's no way for libvirt to figure out programmatically whether some
alleged "PCI" device can go into some alleged "PCI" bus.  I call that a
mess.


* Re: [Qemu-devel] Attaching PCI devices to the PCIe root complex
  2013-09-27 17:06     ` Markus Armbruster
@ 2013-09-28 18:12       ` Michael S. Tsirkin
  2013-09-30  9:55         ` Markus Armbruster
  0 siblings, 1 reply; 19+ messages in thread
From: Michael S. Tsirkin @ 2013-09-28 18:12 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Anthony Liguori, Andreas Färber, qemu list, Laine Stump,
	marcel.a

On Fri, Sep 27, 2013 at 07:06:44PM +0200, Markus Armbruster wrote:
> Marcel Apfelbaum <marcel.apfelbaum@gmail.com> writes:
> 
> > On Wed, 2013-09-25 at 10:01 +0300, Michael S. Tsirkin wrote:
> >> On Tue, Sep 24, 2013 at 06:01:02AM -0400, Laine Stump wrote:
> >> > When I added support for the Q35-based machinetypes to libvirt, I
> >> > specifically prohibited attaching any PCI devices (with the exception of
> >> > graphics controllers) to the PCIe root complex,
> >> 
> >> That's wrong I think. Anything attached to RC is an integrated
> >> endpoint, and these can be PCI devices.
> > I couldn't find on PCIe spec any mention that "Root Complex Integrated EndPoint"
> > must be PCIe. But, from spec 1.3.2.3:
> > - A Root Complex Integrated Endpoint must not require I/O resources
> > claimed through BAR(s).
> > - A Root Complex Integrated Endpoint must not generate I/O Requests.
> > - A Root Complex Integrated Endpoint is required to support MSI or
> > MSI-X or both if an
> > interrupt resource is requested.
> > I suppose that this restriction can be removed for PCI devices that
> > 1. Actually work when plugged in into RC Integrated EndPoint
> > 2. Respond to the above limitations
> >
> >> 
> >> > and had planned to
> >> > prevent attaching them to PCIe root ports (ioh3420 device) and PCIe
> >> > downstream switch ports (xio-3130 device) as well. I did this because,
> >> > even though qemu currently allows attaching a normal PCI device in any
> >> > of these three places, the restriction exists for real hardware and I
> >> > didn't see any guarantee that qemu wouldn't add the restriction in the
> >> > future in order to more closely emulate real hardware.
> >> > 
> >> > However, since I did that, I've learned that many of the qemu "pci"
> >> > devices really should be considered as "pci or pcie". Gerd Hoffmann lists
> >> > some of these cases in a bug he filed against libvirt:
> >> > 
> >> >    https://bugzilla.redhat.com/show_bug.cgi?id=1003983
> >> > 
> >> > I would like to loosen up the restrictions in libvirt, but want to make
> >> > sure that I don't allow something that could later be forbidden by qemu
> >> > (thus creating a compatibility problem during upgrades). Beyond Gerd's
> >> > specific requests to allow ehci, uhci, and hda controllers to attach to
> >> > PCIe ports, are there any other devices that I specifically should or
> >> > shouldn't allow? (I would rather be conservative in what I allow - it's
> >> > easy to allow more things later, but nearly impossible to revoke
> >> > permission once it's been allowed).
> > For the moment I would not remove any restrictions, but only the ones
> > requested and verified by somebody.
> >
> >> 
> >> IMO, we really need to grow an interface to query this kind of thing.
> > Basically libvirt needs to know:
> > 1. for (libvirt) controllers: what kind of devices can be plugged in
> > 2. for devices (controller is also a device)
> >     - to which controllers can it be plugged in
> >     - does it support hot-plug?
> > 3. implicit controllers of the machine types (q35 - "pcie-root", i440fx - "pci-root")
> > All the above must be exported to libvirt
> >
> > Implementation options:
> > 1. Add a compliance field on PCI/PCIe devices and controllers stating if it supports
> >    PCI/PCIe or both (and maybe hot-plug)
> >    - consider plug type + compliance to figure out whether a plug can go into a socket
> >    
> > 2. Use Markus Armbruster's idea of introducing a concept of "plugs and sockets":
> >    - dividing the devices into adapters and plugs
> >    - adding sockets to bridges (buses?).
> >    In this way it would be clear which devices can connect to bridges
> 
> This isn't actually my idea.  It's how things are designed to work in
> qdev, at least in my admittedly limited understanding of qdev.
> 
> In traditional qdev, a device has exactly one plug (its "bus type",
> shown by -device help), and it may have one or more buses.  Each bus
> has a type, and you can plug a device only into a bus of the matching
> type.  This was too limiting, and is not how things work now.
> 
> As far as I know, libvirt already understands that a device can only
> plug into a matching bus.
> 
> In my understanding of where we're headed with qdev, things are / will
> become more general, yet stay really simple:
> 
> * A device can have an arbitrary number of sockets and plugs.
> 
> * Each socket / plug has a type.
> 
> * A plug can go into a socket of the same type, and only there.
> 
> Pretty straightforward generalization of traditional qdev.  I wouldn't
> expect libvirt to have serious trouble coping with it, as long as we
> provide the necessary information on device models' plugs and sockets.
> 
> In this framework, there's no such thing as a device model that can plug
> either into a PCI or a PCIe socket.  Makes sense to me, because there's
> no such thing in the physical world, either.
> Instead, and just like in the physical world, you have one separate
> device variant per desired plug type.
> 

Two types of bus is not how things work in the real world, though.
In the real world there are 3 types of express bus besides the
classical pci bus, and limitations on which devices go where
that actually apply to devices qemu emulates.
For example, a downstream switch port can only go on the
internal virtual express bus of a switch.
Devices with multiple interfaces actually do exist
in the real world - e.g. esata/usb - I never heard of a pci/pci express
one, but it's not impossible I think.
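The bus-kind constraints described above can be sketched as a small device-versus-bus table. This is a toy illustration only, not QEMU code; every name below is invented for the example:

```c
/*
 * Toy model of the point above: express brings several bus kinds
 * besides the classical PCI bus, and some devices are only valid on
 * one of them.  All names here are invented for illustration.
 */
enum bus_kind {
    BUS_PCI,                  /* classical PCI bus                    */
    BUS_PCIE_ROOT,            /* root complex bus                     */
    BUS_PCIE_PORT,            /* behind a root or downstream port     */
    BUS_PCIE_SWITCH_INTERNAL, /* virtual bus inside an express switch */
};

enum dev_kind {
    DEV_CONVENTIONAL_PCI,
    DEV_PCIE_ENDPOINT,
    DEV_PCIE_DOWNSTREAM_PORT,
};

static int can_attach(enum dev_kind dev, enum bus_kind bus)
{
    switch (dev) {
    case DEV_PCIE_DOWNSTREAM_PORT:
        /* only on the internal virtual express bus of a switch */
        return bus == BUS_PCIE_SWITCH_INTERNAL;
    case DEV_PCIE_ENDPOINT:
        /* anywhere except a classical PCI bus */
        return bus != BUS_PCI;
    case DEV_CONVENTIONAL_PCI:
        /* qemu is currently more permissive than real hardware here */
        return bus == BUS_PCI || bus == BUS_PCIE_ROOT;
    }
    return 0;
}
```

The point is only that the rule set is per device kind *and* per bus kind, which a single "PCI bus" type cannot express.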

> To get that, you have to split the device into a common core and bus
> adapters.  You compose the core with the PCI adapter to get the PCI
> device, with the PCIe adapter to get the PCIe device, and so forth.
> I'm not claiming that's the best way to do PCI + PCIe.  It's a purely
> theoretical approach, concerned only with conceptual cleanliness, not
> practical coding difficulties.

I don't mind if that's the internal implementation,
but I don't think we should expose this split
in the user interface.

> What we have now is entirely different: we've overloaded the existing
> PCI plug with all the other PCI-related plug types that came with the
> PCIe support, so it means pretty much nothing anymore.  In particular,
> there's no way for libvirt to figure out programmatically whether some
> alleged "PCI" device can go into some alleged "PCI" bus.  I call that a
> mess.

There are lots of problems.

First, bus type is not the only factor that can limit
which devices go where.
For example, specific slots might not support hotplug.
Another example: all devices in the same pci slot must have
the "multifunction" property set.
Second, there's no way to find out what is a valid
bus address. For example, with express you can
only use slot 0.
Hotplug is only supported ATM if no two devices
share a pci slot.
Also, there's apparently no way to figure
out what kind of bus (or multiple buses) is behind each device.


The solution proposed above (separating each device into
two parts) only solves the pci versus express issue without
addressing any of the other issues.
So I'm not sure it's worth the effort.

-- 
MST

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Qemu-devel] Attaching PCI devices to the PCIe root complex
  2013-09-28 18:12       ` Michael S. Tsirkin
@ 2013-09-30  9:55         ` Markus Armbruster
  2013-09-30 10:44           ` Laine Stump
  2013-09-30 10:48           ` Michael S. Tsirkin
  0 siblings, 2 replies; 19+ messages in thread
From: Markus Armbruster @ 2013-09-30  9:55 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Anthony Liguori, Andreas Färber, qemu list, Laine Stump,
	marcel.a

"Michael S. Tsirkin" <mst@redhat.com> writes:

> On Fri, Sep 27, 2013 at 07:06:44PM +0200, Markus Armbruster wrote:
>> Marcel Apfelbaum <marcel.apfelbaum@gmail.com> writes:
>> 
>> > On Wed, 2013-09-25 at 10:01 +0300, Michael S. Tsirkin wrote:
>> >> On Tue, Sep 24, 2013 at 06:01:02AM -0400, Laine Stump wrote:
>> >> > When I added support for the Q35-based machinetypes to libvirt, I
>> >> > specifically prohibited attaching any PCI devices (with the exception of
>> >> > graphics controllers) to the PCIe root complex,
>> >> 
>> >> That's wrong I think. Anything attached to RC is an integrated
>> >> endpoint, and these can be PCI devices.
>> > I couldn't find in the PCIe spec any mention that a "Root Complex
>> > Integrated EndPoint"
>> > must be PCIe. But, from spec 1.3.2.3:
>> > - A Root Complex Integrated Endpoint must not require I/O resources
>> > claimed through BAR(s).
>> > - A Root Complex Integrated Endpoint must not generate I/O Requests.
>> > - A Root Complex Integrated Endpoint is required to support MSI or
>> > MSI-X or both if an
>> > interrupt resource is requested.
>> > I suppose that this restriction can be removed for PCI devices that
>> > 1. Actually work when plugged into an RC Integrated EndPoint
>> > 2. Comply with the above limitations
>> >
>> >> 
>> >> > and had planned to
>> >> > prevent attaching them to PCIe root ports (ioh3420 device) and PCIe
>> >> > downstream switch ports (xio-3130 device) as well. I did this because,
>> >> > even though qemu currently allows attaching a normal PCI device in any
>> >> > of these three places, the restriction exists for real hardware and I
>> >> > didn't see any guarantee that qemu wouldn't add the restriction in the
>> >> > future in order to more closely emulate real hardware.
>> >> > 
>> >> > However, since I did that, I've learned that many of the qemu "pci"
>> >> > devices really should be considered as "pci or pcie". Gerd Hoffmann lists
>> >> > some of these cases in a bug he filed against libvirt:
>> >> > 
>> >> >    https://bugzilla.redhat.com/show_bug.cgi?id=1003983
>> >> > 
>> >> > I would like to loosen up the restrictions in libvirt, but want to make
>> >> > sure that I don't allow something that could later be forbidden by qemu
>> >> > (thus creating a compatibility problem during upgrades). Beyond Gerd's
>> >> > specific requests to allow ehci, uhci, and hda controllers to attach to
>> >> > PCIe ports, are there any other devices that I specifically should or
>> >> > shouldn't allow? (I would rather be conservative in what I allow - it's
>> >> > easy to allow more things later, but nearly impossible to revoke
>> >> > permission once it's been allowed).
>> > For the moment I would not remove any restrictions, but only the ones
>> > requested and verified by somebody.
>> >
>> >> 
>> >> IMO, we really need to grow an interface to query this kind of thing.
>> > Basically libvirt needs to know:
>> > 1. for (libvirt) controllers: what kind of devices can be plugged in
>> > 2. for devices (controller is also a device)
>> >     - to which controllers can it be plugged in
>> >     - does it support hot-plug?
>> > 3. implicit controllers of the machine types (q35 - "pcie-root", i440fx - "pci-root")
>> > All the above must be exported to libvirt
>> >
>> > Implementation options:
>> > 1. Add a compliance field on PCI/PCIe devices and controllers stating if it supports
>> >    PCI/PCIe or both (and maybe hot-plug)
>> >    - consider plug type + compliance to figure out whether a plug can go into a socket
>> >    
>> > 2. Use Markus Armbruster's idea of introducing a concept of "plugs and sockets":
>> >    - dividing the devices into adapters and plugs
>> >    - adding sockets to bridges (buses?).
>> >    In this way it would be clear which devices can connect to bridges
>> 
>> This isn't actually my idea.  It's how things are designed to work in
>> qdev, at least in my admittedly limited understanding of qdev.
>> 
>> In traditional qdev, a device has exactly one plug (its "bus type",
>> shown by -device help), and it may have one or more buses.  Each bus
>> has a type, and you can plug a device only into a bus of the matching
>> type.  This was too limiting, and is not how things work now.
>> 
>> As far as I know, libvirt already understands that a device can only
>> plug into a matching bus.
>> 
>> In my understanding of where we're headed with qdev, things are / will
>> become more general, yet stay really simple:
>> 
>> * A device can have an arbitrary number of sockets and plugs.
>> 
>> * Each socket / plug has a type.
>> 
>> * A plug can go into a socket of the same type, and only there.
>> 
>> Pretty straightforward generalization of traditional qdev.  I wouldn't
>> expect libvirt to have serious trouble coping with it, as long as we
>> provide the necessary information on device models' plugs and sockets.
>> 
>> In this framework, there's no such thing as a device model that can plug
>> either into a PCI or a PCIe socket.  Makes sense to me, because there's
>> no such thing in the physical world, either.
>> Instead, and just like in the physical world, you have one separate
>> device variant per desired plug type.
>> 
>
> Two types of bus is not how things work in real world though.
> In real world there are 3 types of express bus besides
> classical pci bus, and limitations on which devices go where
> that actually apply to devices qemu emulates.
> For example, a downstream switch port can only go on
> internal virtual express bus of a switch.
> Devices with multiple interfaces actually do exist
> in real world - e.g. esata/usb -

I think the orthodox way to model a disk with both eSATA and USB
connectors would be two separate plugs, where only one of them can be
used at the same time.  Not that I can see why anyone would want to
model such a device when you can just as well have separate eSATA-only
and USB-only devices, and use the one you want.

>                                  I never heard of a pci/pci express
> one but it's not impossible I think.

PCI on one side of the card, PCIe on the other, and a switchable
backplate?  Weird :)

Again, I can't see why we'd want to model this, even if it existed.

Nevertheless, point taken: devices with multiple interfaces of which
only one can be used at the same time exist, and we can't exclude the
possibility that we want to model such a device one day.

>> To get that, you have to split the device into a common core and bus
>> adapters.  You compose the core with the PCI adapter to get the PCI
>> device, with the PCIe adapter to get the PCIe device, and so forth.
>> I'm not claiming that's the best way to do PCI + PCIe.  It's a purely
>> theoretical approach, concerned only with conceptual cleanliness, not
>> practical coding difficulties.
>
> I don't mind if that's the internal implementation,
> but I don't think we should expose this split
> in the user interface.

I'm not so sure.

The current interface munges together all PCIish connectors, and the
result is a mess: users can't see which device can be plugged into which
socket.  Libvirt needs to know, and it has grown a bunch of hardcoded ad
hoc rules, which aren't quite right.

With separate types for incompatible plugs and sockets, the "what can
plug into what" question remains as trivial as it was by the initial
design.
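The "trivial by typing" property can be sketched in a few lines. This is a conceptual sketch of the plug/socket idea, not the actual qdev/QOM code, and every type name is invented:

```c
/*
 * Conceptual sketch of the plug/socket typing described above -- not
 * the actual qdev/QOM implementation.  With distinct connector types,
 * "what can plug into what" reduces to a type comparison.
 */
typedef enum {
    CONN_PCI,
    CONN_PCIE_ROOT,
    CONN_PCIE_SWITCH_INTERNAL,
} ConnectorType;

typedef struct {
    ConnectorType type;
} Plug;

typedef struct {
    ConnectorType type;
    int occupied;
} Socket;

/* Returns 1 on success; the only rule is "same type, free socket". */
static int try_plug(Socket *s, const Plug *p)
{
    if (s->occupied || s->type != p->type) {
        return 0;
    }
    s->occupied = 1;
    return 1;
}
```

A management layer like libvirt would then only need the list of socket and plug types per device model, with no ad hoc per-device rules.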

>> What we have now is entirely different: we've overloaded the existing
>> PCI plug with all the other PCI-related plug types that came with the
>> PCIe support, so it means pretty much nothing anymore.  In particular,
>> there's no way for libvirt to figure out programmatically whether some
>> alleged "PCI" device can go into some alleged "PCI" bus.  I call that a
>> mess.
>
> There are lots of problems.
>
> First, bus type is not the only factor that can limit
> which devices go where.
> For example, specific slots might not support hotplug.

We made "can hotplug" a property of the bus.  Perhaps it should be a
property of the slot.

> Another example: all devices in the same pci slot must have
> the "multifunction" property set.

PCI multifunction devices are simply done wrong in the current code.

I think the orthodox way to model multifunction devices would involve
composing the functions with a container device, resulting in a
composite PCI device that can only be plugged as a whole.

Again, this is a conceptually clean approach, unconcerned with practical
coding difficulties.
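One way to read the container idea is the following sketch, assuming a function is addressed by its index within the slot. These are invented structures, not a proposal for the real data layout:

```c
/*
 * Sketch of the "container" approach above: individual functions are
 * composed into one multifunction device, and only the container as a
 * whole can occupy a slot.  Invented names, not QEMU structures.
 */
#define PCI_MAX_FUNCS 8   /* a PCI slot addresses at most 8 functions */

typedef struct {
    const char *model;            /* e.g. "ich9-usb-uhci1" */
} PCIFunctionSketch;

typedef struct {
    PCIFunctionSketch funcs[PCI_MAX_FUNCS];
    int nfuncs;
} MultifunctionSketch;

/* Returns the function number within the slot, or -1 if full. */
static int add_function(MultifunctionSketch *d, const char *model)
{
    if (d->nfuncs >= PCI_MAX_FUNCS) {
        return -1;
    }
    d->funcs[d->nfuncs].model = model;
    return d->nfuncs++;
}
```

Hotplug and unplug would then naturally operate on the container, never on a single function.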

> Second, there's no way to find out what is a valid
> bus address. For example, with express you can
> only use slot 0.

Device introspection via QOM should let you enumerate available sockets.

> Hotplug is only supported ATM if no two devices
> share a pci slot.

Do you mean "hotplug works only with multifunction off"?  If yes, see
above.  If no, please elaborate.

> Also, there's apparently no way to figure
> out what kind of bus (or multiple buses) is behind each device.

Again, introspection via QOM should let you enumerate available sockets.

> The solution proposed above (separating each device into
> two parts) only solves the pci versus express issue without
> addressing any of the other issues.
> So I'm not sure it's worth the effort.

As I said, I'm not claiming I know the only sane solution to this
problem.  I've only described a solution that stays true to the qdev /
QOM design as I understand it.  True qdev / QOM experts, please help out
with advice.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Qemu-devel] Attaching PCI devices to the PCIe root complex
  2013-09-30  9:55         ` Markus Armbruster
@ 2013-09-30 10:44           ` Laine Stump
  2013-09-30 10:48           ` Michael S. Tsirkin
  1 sibling, 0 replies; 19+ messages in thread
From: Laine Stump @ 2013-09-30 10:44 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Andreas Färber, marcel.a, qemu list, Anthony Liguori,
	Michael S. Tsirkin

On 09/30/2013 05:55 AM, Markus Armbruster wrote:
> "Michael S. Tsirkin" <mst@redhat.com> writes:
>> I never heard of a pci/pci express
>> one but it's not impossible I think.
> PCI on one side of the card, PCIe on the other, and a switchable
> backplate?  Weird :)
>
> Again, I can't see why we'd want to model this, even if it existed.

Unfortunately that's what's been done, so libvirt will have to deal with
it, even after qemu gets it fixed.

> The current interface munges together all PCIish connectors, and the
> result is a mess: users can't see which device can be plugged into which
> socket.  Libvirt needs to know, and it has grown a bunch of hardcoded ad
> hoc rules, which aren't quite right.

The good news is that libvirt has only recently started dealing with
anything other than vanilla PCI slots. The bad news is that it has. I
guess everything should work out okay if we just keep the current "wild
guess" code around, but only fall back on it if the new more accurate
information is unavailable.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Qemu-devel] Attaching PCI devices to the PCIe root complex
  2013-09-30  9:55         ` Markus Armbruster
  2013-09-30 10:44           ` Laine Stump
@ 2013-09-30 10:48           ` Michael S. Tsirkin
  2013-09-30 16:01             ` Gerd Hoffmann
  1 sibling, 1 reply; 19+ messages in thread
From: Michael S. Tsirkin @ 2013-09-30 10:48 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Anthony Liguori, Andreas Färber, qemu list, Laine Stump,
	marcel.a

On Mon, Sep 30, 2013 at 11:55:47AM +0200, Markus Armbruster wrote:
> "Michael S. Tsirkin" <mst@redhat.com> writes:
> 
> > On Fri, Sep 27, 2013 at 07:06:44PM +0200, Markus Armbruster wrote:
> >> Marcel Apfelbaum <marcel.apfelbaum@gmail.com> writes:
> >> 
> >> > On Wed, 2013-09-25 at 10:01 +0300, Michael S. Tsirkin wrote:
> >> >> On Tue, Sep 24, 2013 at 06:01:02AM -0400, Laine Stump wrote:
> >> >> > When I added support for the Q35-based machinetypes to libvirt, I
> >> >> > specifically prohibited attaching any PCI devices (with the exception of
> >> >> > graphics controllers) to the PCIe root complex,
> >> >> 
> >> >> That's wrong I think. Anything attached to RC is an integrated
> >> >> endpoint, and these can be PCI devices.
> >> > I couldn't find in the PCIe spec any mention that a "Root Complex
> >> > Integrated EndPoint"
> >> > must be PCIe. But, from spec 1.3.2.3:
> >> > - A Root Complex Integrated Endpoint must not require I/O resources
> >> > claimed through BAR(s).
> >> > - A Root Complex Integrated Endpoint must not generate I/O Requests.
> >> > - A Root Complex Integrated Endpoint is required to support MSI or
> >> > MSI-X or both if an
> >> > interrupt resource is requested.
> >> > I suppose that this restriction can be removed for PCI devices that
> >> > 1. Actually work when plugged into an RC Integrated EndPoint
> >> > 2. Comply with the above limitations
> >> >
> >> >> 
> >> >> > and had planned to
> >> >> > prevent attaching them to PCIe root ports (ioh3420 device) and PCIe
> >> >> > downstream switch ports (xio-3130 device) as well. I did this because,
> >> >> > even though qemu currently allows attaching a normal PCI device in any
> >> >> > of these three places, the restriction exists for real hardware and I
> >> >> > didn't see any guarantee that qemu wouldn't add the restriction in the
> >> >> > future in order to more closely emulate real hardware.
> >> >> > 
> >> >> > However, since I did that, I've learned that many of the qemu "pci"
> >> >> > devices really should be considered as "pci or pcie". Gerd Hoffmann lists
> >> >> > some of these cases in a bug he filed against libvirt:
> >> >> > 
> >> >> >    https://bugzilla.redhat.com/show_bug.cgi?id=1003983
> >> >> > 
> >> >> > I would like to loosen up the restrictions in libvirt, but want to make
> >> >> > sure that I don't allow something that could later be forbidden by qemu
> >> >> > (thus creating a compatibility problem during upgrades). Beyond Gerd's
> >> >> > specific requests to allow ehci, uhci, and hda controllers to attach to
> >> >> > PCIe ports, are there any other devices that I specifically should or
> >> >> > shouldn't allow? (I would rather be conservative in what I allow - it's
> >> >> > easy to allow more things later, but nearly impossible to revoke
> >> >> > permission once it's been allowed).
> >> > For the moment I would not remove any restrictions, but only the ones
> >> > requested and verified by somebody.
> >> >
> >> >> 
> >> >> IMO, we really need to grow an interface to query this kind of thing.
> >> > Basically libvirt needs to know:
> >> > 1. for (libvirt) controllers: what kind of devices can be plugged in
> >> > 2. for devices (controller is also a device)
> >> >     - to which controllers can it be plugged in
> >> >     - does it support hot-plug?
> >> > 3. implicit controllers of the machine types (q35 - "pcie-root", i440fx - "pci-root")
> >> > All the above must be exported to libvirt
> >> >
> >> > Implementation options:
> >> > 1. Add a compliance field on PCI/PCIe devices and controllers stating if it supports
> >> >    PCI/PCIe or both (and maybe hot-plug)
> >> >    - consider plug type + compliance to figure out whether a plug can go into a socket
> >> >    
> >> > 2. Use Markus Armbruster's idea of introducing a concept of "plugs and sockets":
> >> >    - dividing the devices into adapters and plugs
> >> >    - adding sockets to bridges (buses?).
> >> >    In this way it would be clear which devices can connect to bridges
> >> 
> >> This isn't actually my idea.  It's how things are designed to work in
> >> qdev, at least in my admittedly limited understanding of qdev.
> >> 
> >> In traditional qdev, a device has exactly one plug (its "bus type",
> >> shown by -device help), and it may have one or more buses.  Each bus
> >> has a type, and you can plug a device only into a bus of the matching
> >> type.  This was too limiting, and is not how things work now.
> >> 
> >> As far as I know, libvirt already understands that a device can only
> >> plug into a matching bus.
> >> 
> >> In my understanding of where we're headed with qdev, things are / will
> >> become more general, yet stay really simple:
> >> 
> >> * A device can have an arbitrary number of sockets and plugs.
> >> 
> >> * Each socket / plug has a type.
> >> 
> >> * A plug can go into a socket of the same type, and only there.
> >> 
> >> Pretty straightforward generalization of traditional qdev.  I wouldn't
> >> expect libvirt to have serious trouble coping with it, as long as we
> >> provide the necessary information on device models' plugs and sockets.
> >> 
> >> In this framework, there's no such thing as a device model that can plug
> >> either into a PCI or a PCIe socket.  Makes sense to me, because there's
> >> no such thing in the physical world, either.
> >> Instead, and just like in the physical world, you have one separate
> >> device variant per desired plug type.
> >> 
> >
> > Two types of bus is not how things work in real world though.
> > In real world there are 3 types of express bus besides
> > classical pci bus, and limitations on which devices go where
> > that actually apply to devices qemu emulates.
> > For example, a downstream switch port can only go on
> > internal virtual express bus of a switch.
> > Devices with multiple interfaces actually do exist
> > in real world - e.g. esata/usb -
> 
> I think the orthodox way to model a disk with both eSATA and USB
> connectors would be two separate plugs, where only one of them can be
> used at the same time.  Not that I can see why anyone would want to
> model such a device when you can just as well have separate eSATA-only
> and USB-only devices, and use the one you want.
> 
> >                                  I never heard of a pci/pci express
> > one but it's not impossible I think.
> 
> PCI on one side of the card, PCIe on the other, and a switchable
> backplate?  Weird :)
> 
> Again, I can't see why we'd want to model this, even if it existed.
> 
> Nevertheless, point taken: devices with multiple interfaces of which
> only one can be used at the same time exist, and we can't exclude the
> possibility that we want to model such a device one day.
> 
> >> To get that, you have to split the device into a common core and bus
> >> adapters.  You compose the core with the PCI adapter to get the PCI
> >> device, with the PCIe adapter to get the PCIe device, and so forth.
> >> I'm not claiming that's the best way to do PCI + PCIe.  It's a purely
> >> theoretical approach, concerned only with conceptual cleanliness, not
> >> practical coding difficulties.
> >
> > I don't mind if that's the internal implementation,
> > but I don't think we should expose this split
> > in the user interface.
> 
> I'm not so sure.
> 
> The current interface munges together all PCIish connectors, and the
> result is a mess: users can't see which device can be plugged into which
> socket.  Libvirt needs to know, and it has grown a bunch of hardcoded ad
> hoc rules, which aren't quite right.
> 
> With separate types for incompatible plugs and sockets, the "what can
> plug into what" question remains as trivial as it was by the initial
> design.

Yes but, same as in the initial design,
it really makes it the user's problem.

So we'd have
virtio-net-pci-conventional
virtio-net-pci-express
virtio-net-pci-integrated


All this while users just really want to say "virtio"
(that's the expert user, what most people want is for guest to be faster).

> >> What we have now is entirely different: we've overloaded the existing
> >> PCI plug with all the other PCI-related plug types that came with the
> >> PCIe support, so it means pretty much nothing anymore.  In particular,
> >> there's no way for libvirt to figure out programmatically whether some
> >> alleged "PCI" device can go into some alleged "PCI" bus.  I call that a
> >> mess.
> >
> > There are lots of problems.
> >
> > First, bus type is not the only factor that can limit
> > which devices go where.
> > For example, specific slots might not support hotplug.
> 
> We made "can hotplug" a property of the bus.  Perhaps it should be a
> property of the slot.

Sure.

> > Another example, all devices in same pci slot must have
> > "multifunction" property set.
> 
> PCI multifunction devices are simply done wrong in the current code.
> 
> I think the orthodox way to model multifunction devices would involve
> composing the functions with a container device, resulting in a
> composite PCI device that can only be plugged as a whole.

It would also presumably involve a new bus which has
no basis in reality and a new type of device
for when it's a function within a multifunction device.

> Again, this is a conceptually clean approach, unconcerned with practical
> coding difficulties.

And need to grow new interfaces to specify these containers.

> > Second, there's no way to find out what is a valid
> > bus address. For example, with express you can
> > only use slot 0.
> 
> Device introspection via QOM should let you enumerate available sockets.

Slots would need to become sockets for this to work :)

> > Hotplug is only supported ATM if no two devices
> > share a pci slot.
> 
> Do you mean "hotplug works only with multifunction off"?  If yes, see
> above.  If no, please elaborate.

Well it does kind of work with most guests.
The way it works is a hack though.

> > Also, there's apparently no way to figure
> > out what kind of bus (or multiple buses) is behind each device.
> 
> Again, introspection via QOM should let you enumerate available sockets.

Maybe it should but it doesn't seem to let me.

> > The solution proposed above (separating each device into
> > two parts) only solves the pci versus express issue without
> > addressing any of the other issues.
> > So I'm not sure it's worth the effort.
> 
> As I said, I'm not claiming I know the only sane solution to this
> problem.  I've only described a solution that stays true to the qdev /
> QOM design as I understand it.

Right. And I don't argue from the implementation point of view.
What I am saying is, encoding everything in a single string
isn't a good user interface.

We really should let users say "virtio", detect that
it's connected to pci on one end and network on
the other, and instantiate virtio-net-pci.
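That resolution step could look roughly like this, reusing the hypothetical variant names from earlier in the thread. None of these are existing qemu options; the bus classification is deliberately simplified:

```c
#include <string.h>   /* only needed by callers comparing the result */

enum bus_kind {
    BUS_CONVENTIONAL_PCI,   /* classical PCI bus             */
    BUS_EXPRESS_PORT,       /* behind a root/downstream port */
    BUS_EXPRESS_ROOT,       /* root complex (integrated)     */
};

/*
 * Sketch of resolving a generic "virtio" network device to a concrete
 * variant based on what it is connected to.  The variant names are
 * the hypothetical ones from earlier in this thread, not real devices.
 */
static const char *resolve_virtio_net(enum bus_kind bus)
{
    switch (bus) {
    case BUS_CONVENTIONAL_PCI:
        return "virtio-net-pci-conventional";
    case BUS_EXPRESS_PORT:
        return "virtio-net-pci-express";
    case BUS_EXPRESS_ROOT:
        return "virtio-net-pci-integrated";
    }
    return NULL;
}
```

The user-visible name stays "virtio"; the per-bus variant becomes an internal detail rather than part of the command line.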

> True qdev / QOM experts, please help out with advice.

-- 
MST

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Qemu-devel] Attaching PCI devices to the PCIe root complex
  2013-09-30 10:48           ` Michael S. Tsirkin
@ 2013-09-30 16:01             ` Gerd Hoffmann
  2013-09-30 16:06               ` Michael S. Tsirkin
  0 siblings, 1 reply; 19+ messages in thread
From: Gerd Hoffmann @ 2013-09-30 16:01 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Anthony Liguori, marcel.a, qemu list, Markus Armbruster,
	Laine Stump, Andreas Färber

  Hi,

> Yes but, same as in the initial design,
> it really makes it the user's problem.
> 
> So we'd have
> virtio-net-pci-conventional
> virtio-net-pci-express
> virtio-net-pci-integrated
> 
> 
> All this while users just really want to say "virtio"
> (that's the expert user, what most people want is for guest to be faster).

And for the actual device emulation it makes almost no difference.  xhci
exists in express and integrated variants too.  The qemu-emulated device
calls pcie_endpoint_cap_init() unconditionally, so the express endpoint
capability shows up even if you plug it into the root bus.  That should
be handled better.  But I think that would be the only difference in the
xhci code.  And even that could be handled in the pci core, for example
by making pcie_endpoint_cap_init a nop unless the device actually is
an express endpoint from the bus topology point of view.

Maybe PCIDeviceClass->is_express should move to PCIDevice and
PCIDeviceClass should get a supports_express field instead.
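The suggested split might look roughly like this. The structure names are invented (the real PCIDevice/PCIDeviceClass are different), and the bus-topology check is reduced to a single flag for the sketch:

```c
/*
 * Sketch of the suggestion above: the class says what the model *can*
 * do, the instance records what it *is*, decided by where it was
 * plugged.  Invented names; not the real QEMU structures.
 */
typedef struct {
    int supports_express;   /* device model can act as an express device */
} PCIDeviceClassSketch;

typedef struct {
    const PCIDeviceClassSketch *klass;
    int is_express;         /* how this instance actually ended up */
} PCIDeviceSketch;

/* bus_is_express stands in for the real bus-topology check. */
static void realize_on_bus(PCIDeviceSketch *dev, int bus_is_express)
{
    dev->is_express = dev->klass->supports_express && bus_is_express;
}

/* The cap-init helper then becomes a nop on a non-express device. */
static int endpoint_cap_init_sketch(PCIDeviceSketch *dev)
{
    if (!dev->is_express) {
        return 0;   /* no express endpoint capability added */
    }
    return 1;       /* would add the capability here */
}
```

With this shape, a device like xhci could share one model between the express and integrated variants without the capability leaking onto conventional buses.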

cheers,
  Gerd

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Qemu-devel] Attaching PCI devices to the PCIe root complex
  2013-09-30 16:01             ` Gerd Hoffmann
@ 2013-09-30 16:06               ` Michael S. Tsirkin
  0 siblings, 0 replies; 19+ messages in thread
From: Michael S. Tsirkin @ 2013-09-30 16:06 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: Anthony Liguori, marcel.a, qemu list, Markus Armbruster,
	Laine Stump, Andreas Färber

On Mon, Sep 30, 2013 at 06:01:17PM +0200, Gerd Hoffmann wrote:
>   Hi,
> 
> > Yes but, same as in the initial design,
> > it really makes it the user's problem.
> > 
> > So we'd have
> > virtio-net-pci-conventional
> > virtio-net-pci-express
> > virtio-net-pci-integrated
> > 
> > 
> > All this while users just really want to say "virtio"
> > (that's the expert user, what most people want is for guest to be faster).
> 
> And for the actual device emulation it makes almost no difference.  xhci
> exists in express and integrated variants too.  The qemu-emulated device
> calls pcie_endpoint_cap_init() unconditionally, so the express endpoint
> capability shows up even if you plug it into the root bus.  That should
> be handled better.  But I think that would be the only difference in the
> xhci code.  And even that could be handled in the pci core, for example
> by making pcie_endpoint_cap_init a nop unless the device actually is
> an express endpoint from the bus topology point of view.
> 
> Maybe PCIDeviceClass->is_express should move to PCIDevice and
> PCIDeviceClass should get a supports_express field instead.
> 
> cheers,
>   Gerd

Not sure why you need is_express in PCIDevice.
We already have QEMU_PCI_CAP_EXPRESS set in cap_present.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Qemu-devel] Attaching PCI devices to the PCIe root complex
  2013-09-24 10:01 [Qemu-devel] Attaching PCI devices to the PCIe root complex Laine Stump
  2013-09-25  7:01 ` Michael S. Tsirkin
@ 2013-10-01 21:14 ` Michael S. Tsirkin
  1 sibling, 0 replies; 19+ messages in thread
From: Michael S. Tsirkin @ 2013-10-01 21:14 UTC (permalink / raw)
  To: Laine Stump; +Cc: qemu list

On Tue, Sep 24, 2013 at 06:01:02AM -0400, Laine Stump wrote:
> When I added support for the Q35-based machinetypes to libvirt, I
> specifically prohibited attaching any PCI devices (with the exception of
> graphics controllers) to the PCIe root complex, and had planned to
> prevent attaching them to PCIe root ports (ioh3420 device) and PCIe
> downstream switch ports (xio-3130 device) as well. I did this because,
> even though qemu currently allows attaching a normal PCI device in any
> of these three places, the restriction exists for real hardware and I
> didn't see any guarantee that qemu wouldn't add the restriction in the
> future in order to more closely emulate real hardware.
> 
> However, since I did that, I've learned that many of the qemu "pci"
> devices really should be considered as "pci or pcie". Gerd Hoffmann lists
> some of these cases in a bug he filed against libvirt:
> 
>    https://bugzilla.redhat.com/show_bug.cgi?id=1003983
> 
> I would like to loosen up the restrictions in libvirt, but want to make
> sure that I don't allow something that could later be forbidden by qemu
> (thus creating a compatibility problem during upgrades). Beyond Gerd's
> specific requests to allow ehci, uhci, and hda controllers to attach to
> PCIe ports, are there any other devices that I specifically should or
> shouldn't allow? (I would rather be conservative in what I allow - it's
> easy to allow more things later, but nearly impossible to revoke
> permission once it's been allowed).


Thinking some more about it.
At the moment qemu is very flexible, allowing you
to create all kinds of illegal configurations.
How about we teach qemu to reject illegal configs,
and libvirt simply tries to create them to find out
what's legal that way?

Same would work for detecting hotplug slots, etc.
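[Editor's note: the probe-by-trying idea could look roughly like the helper below, which just builds a throwaway command line; a nonzero exit status from running it would mean qemu rejected the placement. The binary name, machine type, and device/bus names are examples only, and a real implementation would use QMP rather than parsing exit codes.]

```c
#include <stdio.h>

/* Build a command line that tries to cold-plug @device onto @bus with
 * -S so the guest never actually runs.  Returns the formatted length,
 * as snprintf does. */
static int build_probe_cmd(char *buf, size_t len,
                           const char *device, const char *bus)
{
    return snprintf(buf, len,
                    "qemu-system-x86_64 -S -M q35 -display none "
                    "-device %s,bus=%s",
                    device, bus);
}
```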

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Qemu-devel] Attaching PCI devices to the PCIe root complex
  2013-09-25  8:59     ` Michael S. Tsirkin
@ 2013-10-02  8:53       ` Paolo Bonzini
  2013-10-02  9:28         ` Michael S. Tsirkin
  0 siblings, 1 reply; 19+ messages in thread
From: Paolo Bonzini @ 2013-10-02  8:53 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: libvir-list, qemu list, Laine Stump, marcel.a

On 25/09/2013 10:59, Michael S. Tsirkin wrote:
>> > I couldn't find on PCIe spec any mention that "Root Complex Integrated EndPoint"
>> > must be PCIe. But, from spec 1.3.2.3:
>> > - A Root Complex Integrated Endpoint must not require I/O resources claimed through BAR(s).
>> > - A Root Complex Integrated Endpoint must not generate I/O Requests.
>> > - A Root Complex Integrated Endpoint is required to support MSI or MSI-X or both if an
>> > interrupt resource is requested.
> Heh PCI-SIG keeps fighting against legacy interrupts and IO.
> But lots of hardware happily ignores these rules.
> And the reason is simple: software does not enforce them.

I think it's "must not require", not "must not have".  So it's the usual
rule that applies to PCIe devices, i.e. that they should work even if the
OS doesn't enable the I/O BARs.

Then I have no idea what the I/O BAR in i915 is for, and whether the
device can be used without that BAR.

Paolo

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Qemu-devel] Attaching PCI devices to the PCIe root complex
  2013-10-02  8:53       ` Paolo Bonzini
@ 2013-10-02  9:28         ` Michael S. Tsirkin
  0 siblings, 0 replies; 19+ messages in thread
From: Michael S. Tsirkin @ 2013-10-02  9:28 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: libvir-list, qemu list, Laine Stump, marcel.a

On Wed, Oct 02, 2013 at 10:53:07AM +0200, Paolo Bonzini wrote:
> On 25/09/2013 10:59, Michael S. Tsirkin wrote:
> >> > I couldn't find on PCIe spec any mention that "Root Complex Integrated EndPoint"
> >> > must be PCIe. But, from spec 1.3.2.3:
> >> > - A Root Complex Integrated Endpoint must not require I/O resources claimed through BAR(s).
> >> > - A Root Complex Integrated Endpoint must not generate I/O Requests.
> >> > - A Root Complex Integrated Endpoint is required to support MSI or MSI-X or both if an
> >> > interrupt resource is requested.
> > Heh PCI-SIG keeps fighting against legacy interrupts and IO.
> > But lots of hardware happily ignores these rules.
> > And the reason is simple: software does not enforce them.
> 
> I think it's "must not require", not "must not have".  So it's the usual
> rule that applies to PCIe devices, i.e. that they should work even if the
> OS doesn't enable the I/O BARs.

I agree, thanks for pointing this out.

Seems to still apply to the MSI rule.

> Then I have no idea what the I/O BAR in i915 is for, and whether the
> device can be used without that BAR.
> 
> Paolo

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2013-10-02  9:26 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-09-24 10:01 [Qemu-devel] Attaching PCI devices to the PCIe root complex Laine Stump
2013-09-25  7:01 ` Michael S. Tsirkin
2013-09-25  8:48   ` Marcel Apfelbaum
2013-09-25  8:59     ` Michael S. Tsirkin
2013-10-02  8:53       ` Paolo Bonzini
2013-10-02  9:28         ` Michael S. Tsirkin
2013-09-25  9:39     ` Laine Stump
2013-09-25 10:00       ` Michael S. Tsirkin
2013-09-25 10:14         ` Laine Stump
2013-09-25 10:56           ` Michael S. Tsirkin
2013-09-25 10:58             ` Michael S. Tsirkin
2013-09-27 17:06     ` Markus Armbruster
2013-09-28 18:12       ` Michael S. Tsirkin
2013-09-30  9:55         ` Markus Armbruster
2013-09-30 10:44           ` Laine Stump
2013-09-30 10:48           ` Michael S. Tsirkin
2013-09-30 16:01             ` Gerd Hoffmann
2013-09-30 16:06               ` Michael S. Tsirkin
2013-10-01 21:14 ` Michael S. Tsirkin
