From: Scott Wood <scottwood@freescale.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: Yoder Stuart-B08248 <B08248@freescale.com>,
Benjamin Herrenschmidt <benh@kernel.crashing.org>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
Alexander Graf <agraf@suse.de>,
Wood Scott-B07421 <B07421@freescale.com>,
"Joerg.Roedel@amd.com" <Joerg.Roedel@amd.com>,
"avi@redhat.com" <avi@redhat.com>,
David Gibson <david@gibson.dropbear.id.au>
Subject: Re: RFC: vfio / device assignment -- layout of device fd files
Date: Mon, 29 Aug 2011 18:14:29 -0500
Message-ID: <4E5C1D55.4040306@freescale.com>
In-Reply-To: <1314658013.2859.399.camel@bling.home>
On 08/29/2011 05:46 PM, Alex Williamson wrote:
> On Mon, 2011-08-29 at 16:58 -0500, Scott Wood wrote:
>> On 08/29/2011 02:51 PM, Alex Williamson wrote:
>>> On Mon, 2011-08-29 at 16:51 +0000, Yoder Stuart-B08248 wrote:
>>>> The device info records following the file header have the following
>>>> record types each with content encoded in a record specific way:
>>>>
>>>> REGION - describes an addressable address range for the device
>>>> DTPATH - describes the device tree path for the device
>>>> DTINDEX - describes the index into the related device tree
>>>> property (reg,ranges,interrupts,interrupt-map)
>>>
>>> I don't quite understand if these are physical or virtual.
>>
>> If what are physical or virtual?
>
> Can you give an example of a path vs an index? I don't understand
> enough about these to ask a useful question about what they're
> describing.
You'd have both path and index.
Example, for this tree:
/ {
    ...
    foo {
        ...
        bar {
            reg = <0x1000 64 0x1800 64>;
            ranges = <0 0x20000 0x10000>;
            ...
            child {
                reg = <0x100 0x100>;
                ...
            };
        };
    };
};
There would be 4 regions if you bind to /foo/bar:

// this is 64 bytes at 0x1000
DTPATH "/foo/bar"
DTINDEX prop_type=REG prop_index=0

// this is 64 bytes at 0x1800
DTPATH "/foo/bar"
DTINDEX prop_type=REG prop_index=1

// this is 64K at 0x20000
DTPATH "/foo/bar"
DTINDEX prop_type=RANGES prop_index=0

// this is 256 bytes at 0x20100
DTPATH "/foo/bar/child"
DTINDEX prop_type=REG prop_index=0

Both ranges and the child reg are needed, since ranges could be a simple
"ranges;" that passes everything with no translation, and child nodes
could be absent-but-implied in some other cases (such as when they
represent PCI devices which can be probed -- we still need to map the
ranges that correspond to PCI controller windows).
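
The arithmetic behind those four regions can be sketched as follows. This is only an illustration, not part of the proposed interface: the Region tuple and the chunks/translate helpers are hypothetical names, and the sketch assumes #address-cells = <1> and #size-cells = <1> at every level, and that /foo/bar's own addresses need no further translation on the way up to the root.

```python
from typing import NamedTuple


class Region(NamedTuple):
    # Hypothetical in-memory view of one region plus its DTPATH/DTINDEX info.
    dtpath: str
    prop_type: str
    prop_index: int
    addr: int
    size: int


def chunks(cells, n):
    """Split a flat list of property cells into n-cell entries."""
    return [tuple(cells[i:i + n]) for i in range(0, len(cells), n)]


def translate(child_addr, ranges):
    """Map a child-bus address to the parent bus through a ranges property."""
    for child_base, parent_base, size in ranges:
        if child_base <= child_addr < child_base + size:
            return parent_base + (child_addr - child_base)
    raise ValueError(f"address {child_addr:#x} is not covered by ranges")


# Property values copied from the example tree (assumed 1 address cell, 1 size cell).
bar_reg = chunks([0x1000, 64, 0x1800, 64], 2)      # (addr, size) entries
bar_ranges = chunks([0, 0x20000, 0x10000], 3)      # (child, parent, size) entries
child_reg = chunks([0x100, 0x100], 2)              # (addr, size) entries

regions = []
for i, (addr, size) in enumerate(bar_reg):
    regions.append(Region("/foo/bar", "REG", i, addr, size))
for i, (_child_base, parent_base, size) in enumerate(bar_ranges):
    regions.append(Region("/foo/bar", "RANGES", i, parent_base, size))
for i, (addr, size) in enumerate(child_reg):
    # The child's reg is expressed in the child bus address space, so it
    # has to be translated through the parent's ranges.
    regions.append(
        Region("/foo/bar/child", "REG", i, translate(addr, bar_ranges), size)
    )

for r in regions:
    print(f"{r.dtpath} {r.prop_type}[{r.prop_index}]: {r.size:#x} bytes at {r.addr:#x}")
```

The last entry shows why the child reg goes through the parent's ranges: offset 0x100 on the child bus lands at 0x20100 on the parent bus.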
>>>> INTERRUPT - describes an interrupt for the device
>>>> PCI_CONFIG_SPACE - describes config space for the device
>>>
>>> I would have expected this to be a REGION with a property of
>>> PCI_CONFIG_SPACE.
>>
>> Could be, if physical address is made optional.
>
> Or physical address is also a property, aka sub-region.
A subrecord of REGION is fine with me.
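
As a rough sketch of what "subrecord of REGION" might look like in memory: every name below (the Subrecord and RegionRecord classes, the PHYS_ADDR tag, the find helper) is made up for illustration and is not a proposed encoding. The only point is that a region can carry a physical address when it has one and omit it when it does not.

```python
class Subrecord:
    """Hypothetical tagged subrecord attached to a REGION record."""

    def __init__(self, kind, value=None):
        self.kind = kind
        self.value = value


class RegionRecord:
    """Hypothetical REGION record: a size plus optional subrecords."""

    def __init__(self, size, subrecords=()):
        self.size = size
        self.subrecords = list(subrecords)

    def has(self, kind):
        """Return True if a subrecord of the given kind is attached."""
        return any(s.kind == kind for s in self.subrecords)

    def find(self, kind):
        """Return the value of the first subrecord of this kind, or None."""
        for s in self.subrecords:
            if s.kind == kind:
                return s.value
        return None


# A device-tree MMIO region: the physical address is present as a subrecord.
mmio = RegionRecord(64, [
    Subrecord("PHYS_ADDR", 0x1000),
    Subrecord("DTPATH", "/foo/bar"),
    Subrecord("DTINDEX", ("REG", 0)),
])

# PCI config space: a region with no physical address subrecord at all.
config = RegionRecord(256, [Subrecord("PCI_CONFIG_SPACE")])

print(mmio.find("PHYS_ADDR"), config.has("PHYS_ADDR"))
```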
>>> Would we only need to expose phys addr for 1:1 mapping requirements?
>>> I'm not sure why we'd care to expose this otherwise.
>>
>> It's more important for non-PCI, where it avoids the need for userspace
>> to parse the device tree to find the guest address (we'll usually want
>> 1:1), or to consolidate pages shared by multiple regions. It could be
>> nice for debugging, as well.
>
> So the device tree path is ripped straight from the system, so it's the
> actual 1:1, matching physical hardware, path.
Yes.
>>> Even for non-PCI we need to
>>> know if the region is pio/mmio32/mmio64/prefetchable/etc.
>>
>> Outside of PCI, what standardized form would you put such information
>> in? Where would the kernel get this information? What does
>> mmio32/mmio64 mean in this context?
>
> I could imagine a platform device described by ACPI that might want to
> differentiate. The physical device doesn't get moved of course, but
> guest drivers might care how the device is described if we need to
> rebuild those ACPI tables. ACPI might even be a good place to leverage
> these data structures... /me ducks.
ACPI info could be another subrecord type, but in the device tree
system-bus case we generally don't have this information at the generic
infrastructure level. Drivers are expected to know how their devices'
regions should be mapped.
>>> BAR index could really just translate to a REGION instance number.
>>
>> How would that work if you make non-BAR things (such as config space)
>> into regions?
>
> Put their instance numbers outside of the BAR region? We have a fixed
> REGION space on PCI, so we could just define BAR0 == instance 0, BAR1 ==
> instance 1... ROM == instance 6, CONFIG == instance 0xF (or 7).
Seems more awkward than just having each region say what it is. What do
you do to fill in the gaps?
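
To make the gap question concrete, here is a small sketch comparing the two schemes for a hypothetical PCI device that implements only BAR0, BAR2 and config space. It assumes the CONFIG == instance 0xF variant of the fixed numbering; the table layout and the "what" key are invented for this example.

```python
# Fixed numbering as suggested in the quoted text (0xF variant assumed).
FIXED_SLOTS = {i: f"BAR{i}" for i in range(6)}
FIXED_SLOTS[6] = "ROM"
FIXED_SLOTS[0xF] = "CONFIG"

# Hypothetical device: only these regions actually exist.
present = {"BAR0", "BAR2", "CONFIG"}

# Fixed instance numbers: instances 0..0xF are all addressable, so each one
# that has no backing region needs some kind of placeholder.
fixed_table = []
for instance in range(0x10):
    name = FIXED_SLOTS.get(instance)
    fixed_table.append(name if name in present else None)
gaps = [i for i, name in enumerate(fixed_table) if name is None]

# Self-describing regions: one entry per region that exists, no placeholders.
tagged = [{"what": name} for name in sorted(present)]

print(f"fixed numbering: {len(fixed_table)} instances, {len(gaps)} placeholders")
print(f"self-describing: {len(tagged)} regions")
```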
-Scott