* [Qemu-devel] [RFC] SPAPR-PCI Hotplug Support in Qemu
From: Mike Day @ 2013-10-10 22:25 UTC
To: qemu-devel, qemu-ppc; +Cc: Alexey Kardashevskiy, aliguori, David Gibson
[RFC] SPAPR-PCI Hotplug Support in Qemu

Background:

ppc64 has a unique bus structure for PCI slots: each slot is connected
to its PHB by a PCI switch. This is true in some IBM hardware as well as
paravirtual hardware (PAPR).

SLOF firmware normally scans the hardware bus and creates the correct
slot/PCI switch complex in the Open Firmware device tree. It also
configures the slot and PCI switch (BARs, etc.).

For devices set up by platform firmware, each PCI device is attached to
its PHB and correctly configured.

For Linux hot-plugged devices running under PowerVM today, each device
is created with a PCI switch hanging off the dev->subordinate
pointer. (PowerVM gets this info from the Open Firmware device tree in
RTAS.)

Problem:

The Qemu hot-plug path doesn't anticipate a PCI switch being attached
to every PHB slot.

When hot-plugging a device, Qemu qdev creates the device, which allows
the device to initialize itself. Qemu then passes this initialized
device to the ppc PHB via the hot-plug path. [1]

The current ppc hot-plug code then creates a device tree node for the
device [2] and allocates resources (BARs, etc.) for the new device. [3]

The ppc64 kernel expects each hot-plugged PCI device structure to
point to a subordinate bus (dev->subordinate). This assumption is held
throughout the ppc PCI code, and there are numerous opportunities for
panics when the device gets passed to a kernel routine without a valid
subordinate pointer. [4]
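
A minimal sketch of the failure mode described above, under stated
assumptions: `toy_dev`, `toy_bus` and `toy_child_bus_number()` are
hypothetical stand-ins invented for illustration, not the kernel's
PCI structures or any routine in rpadlpar. It only shows why code that
follows dev->subordinate needs a guard when the hot-plugged device is a
plain endpoint with no child bus.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a child bus behind a bridge or switch. */
struct toy_bus {
    int number;                   /* bus number behind the bridge */
};

/* Hypothetical stand-in for a PCI device structure. */
struct toy_dev {
    const char *name;
    struct toy_bus *subordinate;  /* NULL for a plain endpoint */
};

/*
 * Stand-in for a routine that operates on the child bus of a device.
 * Returns the child bus number, or -1 when the device has no child bus.
 * Dropping the NULL check would dereference a NULL pointer for endpoints.
 */
int toy_child_bus_number(const struct toy_dev *dev)
{
    assert(dev != NULL);
    if (dev->subordinate == NULL)
        return -1;
    return dev->subordinate->number;
}
```

With this model, a bridge-like device yields its child bus number, while
an endpoint is reported as -1 instead of crashing.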

Proposed Solutions:

(1) Create and hook an inert PCI switch to every hot-plugged PCI
device in Qemu.

(a) After the device has initialized itself, at hot-plug time, create a
new PCI switch, configure the switch, allocate BARs, and attach the
switch to the hot-plugged device (dev->subordinate).

(b) Create a new device tree node that begins with the PCI switch as
the parent of the hot-plugged device. Add the PCI switch/device
complex to the device tree under the PHB.

(2) Add each hot-plugged PCI device to its own complex of PHB
(PCI Host Bridge) and PCI switch.

Simplify (1) by creating a new PHB for each hot-plugged device.

(a) At PHB creation time, create a PCI switch device node for each PHB
slot.

(b) At hot-plug time, create and configure a new PHB and add the
hot-plugged device to one of the slots. Configure and allocate
resources as usual.
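
The two steps of solution (2) can be sketched with a toy model. Everything
here is an assumption made for illustration: the `toy_phb` and `toy_switch`
types, the fixed slot count, and the path format are hypothetical and do
not reflect Qemu's sPAPR code or real device tree node names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define TOY_SLOTS_PER_PHB 4       /* arbitrary slot count for the sketch */

/* Hypothetical switch node: one per PHB slot, at most one device below. */
struct toy_switch {
    const char *device;           /* plugged device name, NULL when empty */
};

/* Hypothetical PHB with a pre-created switch node in every slot. */
struct toy_phb {
    int index;
    struct toy_switch slot[TOY_SLOTS_PER_PHB];
};

/* Step (a): at PHB creation time, create an empty switch node per slot. */
void toy_phb_init(struct toy_phb *phb, int index)
{
    assert(phb != NULL);
    phb->index = index;
    for (int i = 0; i < TOY_SLOTS_PER_PHB; i++)
        phb->slot[i].device = NULL;
}

/* Step (b): plug a device into the first free slot; -1 if the PHB is full. */
int toy_phb_plug(struct toy_phb *phb, const char *device)
{
    assert(phb != NULL && device != NULL);
    for (int i = 0; i < TOY_SLOTS_PER_PHB; i++) {
        if (phb->slot[i].device == NULL) {
            phb->slot[i].device = device;
            return i;
        }
    }
    return -1;
}

/*
 * Write an illustrative path of the form /phb@<index>/switch@<slot>/<device>
 * into buf. Returns snprintf's result, or -1 for an invalid or empty slot.
 */
int toy_phb_path(const struct toy_phb *phb, int slot, char *buf, size_t len)
{
    assert(phb != NULL && buf != NULL);
    if (slot < 0 || slot >= TOY_SLOTS_PER_PHB || phb->slot[slot].device == NULL)
        return -1;
    return snprintf(buf, len, "/phb@%d/switch@%d/%s",
                    phb->index, slot, phb->slot[slot].device);
}
```

In this model the guest always sees a switch between the PHB and the
device, which is the property the proposal is after. Creating one
`toy_phb` per hot-plugged device corresponds to the "new PHB for each
hot-plugged device" simplification.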

Comments:

The current code has only one PHB. We know we will ultimately need to
support more than one PHB. Solution #2 is consistent with that goal.

[1] https://github.com/mdroth/qemu/blob/spapr-pci-hotplug/hw/ppc/spapr_pci.c
[2] ibm,rtas_configure_connector:
https://github.com/mdroth/qemu/blob/spapr-pci-hotplug/hw/ppc/spapr_pci.c#L575
[3] spapr_phb_add_pci_dt
https://github.com/mdroth/qemu/blob/spapr-pci-hotplug/hw/ppc/spapr_pci.c#L900
[4] dlpar_pci_add_bus
http://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/tree/drivers/pci/hotplug/rpadlpar_core.c?id=8bf3379a74bc9132751bfa685bad2da318fd59d7#n165
--
Mike Day | + 1 919 371-8786 | ncmike@ncultra.org
"Endurance is a Virtue"
* Re: [Qemu-devel] [RFC] SPAPR-PCI Hotplug Support in Qemu
From: Mike Day @ 2013-10-10 22:30 UTC
To: qemu-devel, qemu-ppc, Anthony Liguori; +Cc: Alexey Kardashevskiy, David Gibson
Adding Anthony's corrected address.
* Re: [Qemu-devel] [RFC] SPAPR-PCI Hotplug Support in Qemu
From: Benjamin Herrenschmidt @ 2013-10-11 2:47 UTC
To: Mike Day
Cc: Alexey Kardashevskiy, qemu-ppc, qemu-devel, Anthony Liguori,
David Gibson
On Thu, 2013-10-10 at 18:30 -0400, Mike Day wrote:
> Adding Anthony's corrected address.
>
>
> On Thu, Oct 10, 2013 at 6:25 PM, Mike Day <ncmike@ncultra.org> wrote:
> [RFC] SPAPR-PCI Hotplug Support in Qemu
> >
> > Background:
> > ppc64 has a unique bus structure for PCI slots: each slot is connected
> > to its PHB by a PCI switch. This is true in some IBM hardware as well as
> > paravirtual hardware (PAPR).

Not *exactly* ... in PAPR, it's frequent that PCI and PCIe devices
appear to be direct children of the PHB. In fact that has been
problematic in the past because PCIe devices are supposed to always be
behind a virtual bridge, but that's not how pHyp exposes them so you end
up with a device with a PCIe capability without a parent with such a
capability. That has confused Linux more than once.
What is very common however is for us to just pop a new virtual PHB for
a new device (or group of devices). This will be especially true with
SR-IOV where we'll pretty much expect VFs to show up behind dedicated
virtual PHBs.
Additionally, with pass-through, there is the constraint that devices in
the same iommu group appear behind a common parent bridge (and nothing
else below that bridge is on a different group) since the way we expose
the iommu to the guest is via properties in the bridge (or PHB) node of
the device-tree.
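
One way to read the constraint above is as a pairwise rule: two
passed-through devices share an iommu group if and only if they share a
parent bridge (or PHB). The following is a minimal sketch of that check
under this reading; `toy_pt_dev` and `toy_topology_ok()` are hypothetical
names made up for illustration, and it assumes each device hangs directly
below the bridge or PHB that carries the iommu properties.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flat description of a passed-through device. */
struct toy_pt_dev {
    int parent;       /* id of the bridge or PHB the device sits under */
    int iommu_group;  /* id of the iommu group the device belongs to */
};

/*
 * Returns true when every pair of devices satisfies:
 *   same iommu group  <=>  same parent bridge/PHB.
 * A group split across parents, or a parent hosting two groups, fails.
 */
bool toy_topology_ok(const struct toy_pt_dev *devs, size_t n)
{
    assert(devs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        for (size_t j = i + 1; j < n; j++) {
            bool same_group = devs[i].iommu_group == devs[j].iommu_group;
            bool same_parent = devs[i].parent == devs[j].parent;
            if (same_group != same_parent)
                return false;
        }
    }
    return true;
}
```

The quadratic pairwise loop is chosen for clarity; the number of
passed-through devices per guest is small enough that it should not matter.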
> > SLOF firmware normally scans the hardware bus and creates the correct
> > slot/PCI switch complex in the open firmware device tree. It also
> > configures the slot and PCI switch (BARs, etc.)
> >
> > For devices set up by platform firmware, each PCI device is attached to
> > its PHB and correctly configured.
> >
> > For Linux hot-plugged devices running under PowerVM today, each device
> > is created with a PCI switch hanging off the dev->subordinate
> > pointer. (PowerVM gets this info from the open firmware device tree in
> > rtas.)
That is not entirely correct, I have many cases where PowerVM does not
expose a PCI bridge/switch but shows the devices directly below the PHB.
Guests should cope either way.
The problem really lies in the fact that devices below a given bridge
(or PHB) are expected to share the iommu table.
Ben.
* Re: [Qemu-devel] [RFC] SPAPR-PCI Hotplug Support in Qemu
From: Anthony Liguori @ 2013-10-11 13:40 UTC
To: Mike Day
Cc: Alexey Kardashevskiy, open list:PowerPC, qemu-devel, David Gibson
On Thu, Oct 10, 2013 at 3:25 PM, Mike Day <ncmike@ncultra.org> wrote:
>
> [RFC] SPAPR-PCI Hotplug Support in Qemu
>
> Background:
> ppc64 has a unique bus structure for PCI slots: each slot is connected
> to its PHB by a PCI switch. This is true in some IBM hardware as well as
> paravirtual hardware (PAPR).
>
> SLOF firmware normally scans the hardware bus and creates the correct
> slot/PCI switch complex in the open firmware device tree. It also
> configures the slot and PCI switch (BARs, etc.)

So from a topological perspective, you have a bridge hotplugged into
the slot on the PHB and then a device plugged into a free slot on the
bridge?
I believe we support hotplugging bridges today already. It's a two
step process as far as QEMU is concerned (hotplugging the bridge then
plugging the card behind it) but that's just a management tool
problem.