From: Anthony Liguori <anthony@codemonkey.ws>
To: Alexander Graf <agraf@suse.de>
Cc: Wood Scott-B07421 <B07421@freescale.com>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
"dwg@au1.ibm.com" <dwg@au1.ibm.com>,
"blauwirbel@gmail.com" <blauwirbel@gmail.com>,
Yoder Stuart-B08248 <B08248@freescale.com>,
"alex.williamson@redhat.com" <alex.williamson@redhat.com>,
"paul@codesourcery.com" <paul@codesourcery.com>,
"joerg.roedel@amd.com" <joerg.roedel@amd.com>,
"armbru@redhat.com" <armbru@redhat.com>
Subject: Re: [Qemu-devel] device assignment for embedded Power
Date: Fri, 01 Jul 2011 07:13:17 -0500
Message-ID: <4E0DB9DD.8010406@codemonkey.ws>
In-Reply-To: <501496D4-41B2-4CF2-A8D1-B1BFE3A827AE@suse.de>

On 07/01/2011 06:40 AM, Alexander Graf wrote:
>
> On 01.07.2011, at 02:58, Benjamin Herrenschmidt wrote:
>
>> On Thu, 2011-06-30 at 15:59 +0000, Yoder Stuart-B08248 wrote:
>>> One feature we need for QEMU/KVM on embedded Power Architecture is the
>>> ability to do passthru assignment of SoC I/O devices and memory. An
>>> important use case in embedded is creating static partitions--
>>> taking physical memory and I/O devices (non-PCI) and partitioning
>>> them between the host Linux and several virtual machines. Things like
>>> live migration would not be needed or supported in these types of scenarios.
>>>
>>> SoC devices do not sit on a probeable bus and there are no identifiers
>>> like 01:00.0 with PCI that we can use to identify devices-- the host
>>> Linux kernel is made aware of SoC I/O devices from nodes/properties in a
>>> device tree structure passed at boot. QEMU needs to generate a
>>> device tree to pass to the guest as well with all the guest's virtual
>>> and physical resources. Today a number of mostly complete guest device
>>> trees are kept under ./pc-bios in QEMU, but this is too static and
>>> inflexible.
>>>
>>> Some new mechanism is needed to assign SoC devices to guests, and we
>>> (FSL + Alex Graf) have been discussing a few possible approaches
>>> for doing this from QEMU and would like some feedback.
>>>
>>> Some possibilities:
>>>
>>> 1. Option 1. Pass the host dev tree to QEMU and assign devices
>>> by device tree path
>>>
>>> -dtb ./mpc8572ds.dtb -device assigned-soc-dev,dev=/soc/i2c@3000
>>>
>>> /soc/i2c@3000 is the device tree path to the assigned device.
>>> The device node 'i2c@3000' has some number of properties (e.g.
>>> address, interrupt info) and possibly subnodes under
>>> it. QEMU copies that node when generating the guest dev tree.
>>> See snippet of entire node: http://paste2.org/p/1496460
>>
>> Yuck (see below)
>>
>>> 2. Option 2. Pass the entire assigned device node as a string to
>>> QEMU
>>>
>>> -device assigned-soc-dev,dev=/i2c@3000,dev-node='#address-cells =<1>;
>>> #size-cells =<0>; cell-index =<0>; compatible = "fsl-i2c";
>>> reg =<0xffe03000 0x100>; interrupts =<43 2>;
>>> interrupt-parent =<&mpic>; dfsrr;'
>>
>> Beuark ! (see below)
>>
>>> This avoids needing to pass the host device tree, but could
>>> get awkward-- the i2c example above is very simple, some device
>>> nodes are very large with a complex hierarchy of subnodes and
>>> could be hundreds of lines of text to represent a single
>>> node.
>>>
>>> It gets more complicated...
>>
>>
>> So, from a qemu command line perspective, all you should have to do is
>> pass qemu the device-tree -path- to the device you want to pass through
>> (you may support passing a full hierarchy here).
>>
>> That is for normal MMIO mapped SoC devices. Something else (individual
>> i2c, usb, ...) will use specific virtualization of the corresponding
>> busses.
>>
>> Anything else sucks too much really.
>>
>> From there, well, there are several approaches inside qemu/kvm to handle
>> that path. If you want to do things at the qemu level you can probably
>> parse /proc/device-tree. But I'd personally just make it a kernel thing.
>>
>> I.e. I would have an ioctl to "instantiate" a pass-through device, that
>> takes that path as an argument. I would make it return an anonymous fd
>> which you can then use to mmap the resources, etc...
>
> Yeah, one idea was to use VFIO here. We could, for example, modify the
> host device tree to occupy the device we want to pass through with a
> specific compatibility parameter. Or we could try to steal the node
> during runtime. But I agree, reading the device tree data from a VFIO
> node sounds reasonable. If it's required.

That makes it very specific to systems that use device trees. To do the
same for ARM platforms or x86, you would need to invent yet another
mechanism.

Passing through arbitrary MMIO is fairly straightforward (likewise with
PIO). Passing through IRQs is a bit less straightforward, and perhaps
VFIO is the answer here.

I don't see a problem with QEMU figuring out what a device's resources
are and doing the assignment.

Regards,
Anthony Liguori
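
To illustrate what "QEMU figuring out what a device's resources are" could look like on a device-tree host, here is a minimal sketch in Python. It is not QEMU code and not a proposal from the thread. It decodes the raw bytes of a node's `reg` property (as they would be read from a file such as /proc/device-tree/<node path>/reg) into (address, size) pairs. The function name `decode_reg` is hypothetical, and the cell counts of 1 and 1 for the parent node are assumptions; the `reg` values are the ones from the Option 2 example quoted above. Addresses in `reg` are in the parent bus's address space, so a real implementation may also need to translate them through the parent's `ranges` property, which this sketch does not do.

```python
import struct


def decode_reg(raw, address_cells, size_cells):
    """Decode a device tree 'reg' property into (address, size) tuples.

    raw           -- property bytes (big-endian 32-bit cells)
    address_cells -- the PARENT node's #address-cells value
    size_cells    -- the PARENT node's #size-cells value
    """
    cells_per_entry = address_cells + size_cells
    if cells_per_entry == 0 or len(raw) % (4 * cells_per_entry) != 0:
        raise ValueError("reg length does not match the cell counts")

    # Device tree cells are always stored as big-endian 32-bit integers.
    cells = struct.unpack(">%dI" % (len(raw) // 4), raw)

    regions = []
    for i in range(0, len(cells), cells_per_entry):
        address = 0
        for cell in cells[i:i + address_cells]:
            address = (address << 32) | cell
        size = 0
        for cell in cells[i + address_cells:i + cells_per_entry]:
            size = (size << 32) | cell
        regions.append((address, size))
    return regions


# Example bytes equivalent to: reg = <0xffe03000 0x100>;
# (assumption: the parent node uses one address cell and one size cell)
raw = struct.pack(">II", 0xFFE03000, 0x100)
regions = decode_reg(raw, address_cells=1, size_cells=1)
for address, size in regions:
    print("MMIO region at 0x%x, size 0x%x" % (address, size))
```

Each resulting region is what would then have to be mapped into the guest; interrupts need separate handling, as noted in the message.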