From: Alex Williamson <alex.williamson@redhat.com>
To: Stuart Yoder <b08248@gmail.com>
Cc: kvm@vger.kernel.org, qemu-devel@nongnu.org, agraf@suse.de,
avi@redhat.com, Scott Wood <scottwood@freescale.com>
Subject: Re: [Qemu-devel] RFC [v2]: vfio / device assignment -- layout of device fd files
Date: Mon, 19 Sep 2011 09:16:00 -0600
Message-ID: <1316445361.4443.29.camel@bling.home>
In-Reply-To: <CALRxmdCmJ8u7913iBnfdWfo3G_O1ij7iJ8bxiC7L+8Ne5_482A@mail.gmail.com>
On Fri, 2011-09-09 at 08:11 -0500, Stuart Yoder wrote:
> Based on the discussions over the last couple of weeks
> I have updated the device fd file layout proposal and
> tried to specify it a bit more formally.
>
> ===============================================================
>
> 1. Overview
>
> This specification describes the layout of device files
> used in the context of vfio, which gives user space
> direct access to I/O devices that have been bound to
> vfio.
>
> When a device fd is opened and read, offset 0x0 contains
> a fixed sized header followed by a number of variable length
> records that describe different characteristics
> of the device-- addressable regions, interrupts, etc.
>
> 0x0 +---------------------------+
>     |           magic           | u32 // identifies this as a vfio device file
>     +---------------------------+     // and identifies the type of bus
>     |          version          | u32 // specifies the version of this
>     +---------------------------+
>     |           flags           | u32 // encodes any flags
>     +---------------------------+
>     |     dev info record 0     |
>     |         type              | u32 // type of record
>     |         rec_len           | u32 // length in bytes of record
>     |                           |     // (including record header)
>     |         flags             | u32 // type specific flags
>     |         ...content...     |     // record content, which could
>     +---------------------------+     // include sub-records
>     |     dev info record 1     |
>     +---------------------------+
>     |     dev info record N     |
>     +---------------------------+
>
> The device info records following the file header may have
> the following record types each with content encoded in
> a record specific way:
>
> ---------------------------------------------------------------------------
> Record           | Type num | Description
> ---------------------------------------------------------------------------
> REGION           | 1        | describes an addressable address range for the device
> DTPATH           | 2        | describes the device tree path for the device
> DTINDEX          | 3        | describes the index into the related device tree
>                  |          | property (reg,ranges,interrupts,interrupt-map)
> INTERRUPT        | 4        | describes an interrupt for the device
> PCI_CONFIG_SPACE | 5        | property identifying a region as PCI config space
> PCI_BAR_INDEX    | 6        | describes the BAR index for a PCI region
> PHYS_ADDR        | 7        | describes the physical address of the region
> ---------------------------------------------------------------------------
>
> 2. Header
>
> The header is located at offset 0x0 in the device fd
> and has the following format:
>
> struct devfd_header {
> __u32 magic;
> __u32 version;
> __u32 flags;
> };
>
> The 'magic' field contains a magic value that will
> identify the type of bus the device is on. Valid values
> are:
>
> 0x70636900 // "pci" - PCI device
> 0x64740000 // "dt" - device tree (system bus)
>
> 3. Region
>
> A REGION record describes an addressable region for the device.
>
> struct devfd_region {
> __u32 type; // must be 0x1
> __u32 record_len;
> __u32 flags;
> __u64 offset; // seek offset to region from beginning of file
> __u64 len; // length of the region
> };
>
> The 'flags' field supports one flag:
>
> IS_MMAPABLE
>
> 4. Device Tree Path (DTPATH)
>
> A DTPATH record is a sub-record of a REGION and describes
> the path to a device tree node for the region.
Can we better distinguish sub-records from records? I assume we're
trying to be as versatile as possible by having a single "type"
namespace, but is this going to lead to implementation problems? As
written, nothing stops a DTPATH from showing up as a top-level record
or an INTERRUPT as a sub-record. Should we instead have a "subtype"
namespace per "type" and per device type? For a "dt" device, it looks
like we really have:
      * REGION (type 0)
          * DTPATH (subtype 0)
          * DTINDEX (subtype 1)
          * PHYS_ADDR (subtype 2)
      * INTERRUPT (type 1)
          * DTPATH (subtype 0)
          * DTINDEX (subtype 1)

While "pci" is:

      * REGION (type 0)
          * PCI_CONFIG_SPACE (subtype 0)
          * PCI_BAR_INDEX (subtype 1)
      * INTERRUPT (type 1)
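
To make the split concrete, here is a rough sketch of what per-type,
per-bus subtype numbering could look like. Every enum and function name
below is made up for illustration, and the numbering simply follows the
lists above; none of it is part of the proposal.

```c
#include <stdint.h>

/* Bus magic values as given in the proposal's header section. */
#define DEVFD_MAGIC_PCI 0x70636900u
#define DEVFD_MAGIC_DT  0x64740000u

/* Hypothetical top-level record types, shared by all bus types. */
enum devfd_type { DEVFD_REGION = 0, DEVFD_INTERRUPT = 1 };

/* Hypothetical subtype numbering for "dt" devices. */
enum devfd_dt_subtype {
    DEVFD_DT_DTPATH = 0,
    DEVFD_DT_DTINDEX = 1,
    DEVFD_DT_PHYS_ADDR = 2,   /* only meaningful under REGION */
};

/* Hypothetical subtype numbering for "pci" devices (REGION only). */
enum devfd_pci_subtype {
    DEVFD_PCI_CONFIG_SPACE = 0,
    DEVFD_PCI_BAR_INDEX = 1,
};

/* Returns 1 if subtype is allowed under (magic, type), else 0. */
static int devfd_subtype_valid(uint32_t magic, uint32_t type, uint32_t subtype)
{
    if (magic == DEVFD_MAGIC_DT) {
        if (type == DEVFD_REGION)
            return subtype <= (uint32_t)DEVFD_DT_PHYS_ADDR;
        if (type == DEVFD_INTERRUPT)
            return subtype <= (uint32_t)DEVFD_DT_DTINDEX;
    } else if (magic == DEVFD_MAGIC_PCI) {
        if (type == DEVFD_REGION)
            return subtype <= (uint32_t)DEVFD_PCI_BAR_INDEX;
        /* INTERRUPT carries no sub-records for pci in the list above. */
    }
    return 0;
}
```

With something like this a parser can reject, say, a PHYS_ADDR under an
INTERRUPT outright instead of having to decide what it would mean.
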
> struct devfd_dtpath {
> __u32 type; // must be 0x2
> __u32 record_len;
> char path[]; // device tree path string
> };
>
> 5. Device Tree Index (DTINDEX)
>
> A DTINDEX record is a sub-record of a REGION and specifies
> the index into the resource list encoded in the associated
> device tree property-- "reg", "ranges", "interrupts", or
> "interrupt-map".
>
> struct devfd_dtindex {
> __u32 type; // must be 0x3
> __u32 record_len;
> __u32 prop_type;
> __u32 prop_index; // index into the resource list
> };
>
> prop_type must have one of the following values:
> 1 // "reg" property
> 2 // "ranges" property
> 3 // "interrupts" property
> 4 // "interrupt-map" property
>
> Note: prop_index is not the byte offset into the property,
> but the logical index.
>
> 6. Interrupts (INTERRUPT)
>
> An INTERRUPT record describes one of a device's interrupts.
> The handle field is an argument to VFIO_DEVICE_GET_IRQ_FD
> which user space can use to receive device interrupts.
>
> struct devfd_interrupts {
> __u32 type; // must be 0x4
> __u32 record_len;
> __u32 flags;
> __u32 handle; // parameter to VFIO_DEVICE_GET_IRQ_FD
> };
I'm still on the fence about whether we should implement INTERRUPT for
PCI, or just assume handle 0x0, or maybe assume handle == interrupt pin.
>
> 7. PCI Config Space (PCI_CONFIG_SPACE)
>
> A PCI_CONFIG_SPACE record is a sub-record of a REGION record
> and identifies the region as PCI configuration space.
>
> struct devfd_cfgspace {
> __u32 type; // must be 0x5
> __u32 record_len;
> __u32 flags;
> };
>
> 8. PCI Bar Index (PCI_BAR_INDEX)
>
> A PCI_BAR_INDEX record is a sub-record of a REGION record
> and identifies the PCI BAR index for the region.
>
> struct devfd_barindex {
> __u32 type; // must be 0x6
> __u32 record_len;
> __u32 flags;
> __u32 bar_index;
> };
I suppose we're more concerned with easy parsing and alignment than
compactness, so a u32 to differentiate 6 BARs + 1 ROM is probably ok.
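
For what it's worth, a quick sketch of the alignment argument. The
struct mirrors the one quoted above with fixed-width types; the index
values (BARs 0-5, ROM as 6) and the helper name are my assumptions, not
something the proposal defines.

```c
#include <stddef.h>
#include <stdint.h>

/* Mirrors the proposed devfd_barindex record, using fixed-width types. */
struct devfd_barindex {
    uint32_t type;        /* 0x6 in the proposal */
    uint32_t record_len;
    uint32_t flags;
    uint32_t bar_index;
};

/* Hypothetical index values: BARs 0-5, expansion ROM as index 6. */
enum { DEVFD_BAR_ROM = 6, DEVFD_BAR_COUNT = 7 };

/* Four u32 fields: no padding, every field naturally aligned. */
_Static_assert(sizeof(struct devfd_barindex) == 16, "no padding expected");
_Static_assert(offsetof(struct devfd_barindex, bar_index) == 12,
               "bar_index sits on a 4-byte boundary");

/* Returns 1 if bar_index names one of the 6 BARs or the ROM, else 0. */
static int devfd_bar_index_valid(uint32_t bar_index)
{
    return bar_index < DEVFD_BAR_COUNT;
}
```

A u8 would be enough to hold the value, but it would either need
explicit padding or leave the following record unaligned.
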
>
> 9. Physical Address (PHYS_ADDR)
>
> A PHYS_ADDR record is a sub-record of a REGION record
> and specifies the physical address of the region.
>
> struct devfd_physaddr {
> __u32 type; // must be 0x7
> __u32 record_len;
> __u32 flags;
> __u64 phys_addr;
> };
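
As a rough illustration of how user space might consume this layout,
here is a minimal sketch of a record walker. It relies only on what
every quoted struct has in common (a u32 type followed by a u32 length
that includes the header). The names devfd_rec_hdr and devfd_walk and
the callback signature are hypothetical, and the buffer is assumed to
start just past the file header.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fields common to every record in the proposal: type, then length. */
struct devfd_rec_hdr {
    uint32_t type;
    uint32_t record_len;   /* bytes, including this header */
};

/*
 * Walk the records in buf[0..len) and call fn (if non-NULL) for each.
 * Returns the number of records visited, or -1 on a malformed length.
 */
static int devfd_walk(const uint8_t *buf, size_t len,
                      void (*fn)(const struct devfd_rec_hdr *hdr,
                                 const uint8_t *rec, void *opaque),
                      void *opaque)
{
    size_t off = 0;
    int count = 0;

    while (off < len) {
        struct devfd_rec_hdr hdr;

        if (len - off < sizeof(hdr))
            return -1;                        /* truncated header */
        memcpy(&hdr, buf + off, sizeof(hdr)); /* avoid unaligned access */
        if (hdr.record_len < sizeof(hdr) || hdr.record_len > len - off)
            return -1;                        /* too short or overruns */
        if (fn)
            fn(&hdr, buf + off, opaque);
        off += hdr.record_len;                /* step to the next record */
        count++;
    }
    return count;
}
```

If a parent's record_len covers its sub-records, the same loop could be
run again over the bytes that follow the parent's fixed fields, which is
where knowing the valid subtypes per type would help.
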
Thanks,
Alex