From: Alex Williamson <alex.williamson@redhat.com>
To: John G Johnson <john.g.johnson@oracle.com>
Cc: "Walker, Benjamin" <benjamin.walker@intel.com>,
"Elena Ufimtseva" <elena.ufimtseva@oracle.com>,
"Jag Raman" <jag.raman@oracle.com>,
"Swapnil Ingle" <swapnil.ingle@nutanix.com>,
"Harris, James R" <james.r.harris@intel.com>,
"Konrad Rzeszutek Wilk" <konrad.wilk@oracle.com>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
"Raphael Norwitz" <raphael.norwitz@nutanix.com>,
"Marc-André Lureau" <marcandre.lureau@redhat.com>,
"Kirti Wankhede" <kwankhede@nvidia.com>,
"Kanth Ghatraju" <Kanth.Ghatraju@oracle.com>,
"Stefan Hajnoczi" <stefanha@redhat.com>,
"Felipe Franciosi" <felipe@nutanix.com>,
"Thanos Makatos" <thanos.makatos@nutanix.com>,
"Zhang, Tina" <tina.zhang@intel.com>,
"Liu, Changpeng" <changpeng.liu@intel.com>,
"dgilbert@redhat.com" <dgilbert@redhat.com>
Subject: Re: RFC: use VFIO over a UNIX domain socket to implement device offloading
Date: Thu, 14 May 2020 13:20:32 -0600
Message-ID: <20200514132032.5e635249@w520.home>
In-Reply-To: <8101D131-3B95-4CF5-8D46-8755593AA97D@oracle.com>
On Thu, 14 May 2020 09:32:15 -0700
John G Johnson <john.g.johnson@oracle.com> wrote:
> Thanos and I have made some changes to the doc in response to the
> feedback we’ve received. The biggest difference is that it is less reliant
> on the reader being familiar with the current VFIO implementation. We’d
> appreciate any additional feedback you could give on the changes. Thanks
> in advance.
>
> Thanos and JJ
>
>
> The link remains the same:
>
> https://docs.google.com/document/d/1FspkL0hVEnZqHbdoqGLUpyC38rSk_7HhY471TsVwyK8/edit?usp=sharing
Hi,
I'm confused by VFIO_USER_ADD_MEMORY_REGION vs VFIO_USER_IOMMU_MAP_DMA.
The former seems intended to provide the server with access to the
entire GPA space, while the latter indicates an IOVA to GPA mapping of
those regions. Doesn't this break the basic isolation of a vIOMMU?
This essentially says to me "here's all the guest memory, but please
only access these regions for which we're providing DMA mappings".
That invites abuse.
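To make the overlap concrete, here's how I read the two commands as a
minimal C sketch; the field names are my own guesses for illustration,
not taken from the doc:

#include <stdint.h>

/* Hypothetical wire layouts, field names invented for illustration. */

/* VFIO_USER_ADD_MEMORY_REGION: the client exports a chunk of guest
 * memory (GPA-addressed, fd-backed) that the server can mmap(). */
struct vfio_user_add_mem_region {
    uint64_t gpa;       /* guest-physical base of the region */
    uint64_t size;      /* region size in bytes */
    uint64_t fd_offset; /* offset into the fd passed as meta-data */
};

/* VFIO_USER_IOMMU_MAP_DMA: the client describes which IOVA window
 * translates to which GPA range. */
struct vfio_user_iommu_map_dma {
    uint64_t iova;      /* device-visible address */
    uint64_t gpa;       /* guest-physical address it maps to */
    uint64_t size;      /* mapping size in bytes */
};

/* Once ADD_MEMORY_REGION has exported all of guest memory, the
 * MAP_DMA entries are only advisory; nothing prevents a buggy or
 * malicious server from touching GPAs outside any IOVA mapping. */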
Also regarding VFIO_USER_ADD_MEMORY_REGION, it's not clear to me how
"an array of file descriptors will be sent as part of the message
meta-data" works (see the sketch below). Consider s/SUB/DEL/ as well.
Why is the Device ID in the table specified as 0? How does a client
learn its Device ID?
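If the intent is the standard SCM_RIGHTS ancillary-data mechanism for
UNIX domain sockets, the doc should say so explicitly; a minimal
sketch of what I'd expect the sending side to look like:

#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Send `buf' plus an array of up to 16 fds as SCM_RIGHTS ancillary
 * data on a connected UNIX domain socket. */
static ssize_t send_with_fds(int sock, const void *buf, size_t len,
                             const int *fds, int nfds)
{
    char cbuf[CMSG_SPACE(16 * sizeof(int))];
    struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };
    struct msghdr msg = {
        .msg_iov = &iov,
        .msg_iovlen = 1,
        .msg_control = cbuf,
        .msg_controllen = CMSG_SPACE(nfds * sizeof(int)),
    };
    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);

    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(nfds * sizeof(int));
    memcpy(CMSG_DATA(cmsg), fds, nfds * sizeof(int));

    return sendmsg(sock, &msg, 0);
}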
For VFIO_USER_DEVICE_GET_REGION_INFO (or anything else making use of a
capability chain), the cap_offset and next pointers within the chain
need to specify what their offsets are relative to (i.e., the start of
the packet, the start of the VFIO-compatible data structure, etc.). I
assume the latter for client compatibility.
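For reference, the kernel's convention, which I'd suggest adopting
verbatim, is that both offsets are relative to the start of the info
structure itself; copied from <linux/vfio.h>:

#include <linux/types.h>

/* Both cap_offset and next are offsets from the start of the info
 * structure (here vfio_region_info), not from the start of any
 * enclosing packet.  next is 0 on the last capability. */
struct vfio_info_cap_header {
    __u16 id;         /* identifies the capability */
    __u16 version;    /* capability-specific version */
    __u32 next;       /* offset of the next capability */
};

struct vfio_region_info {
    __u32 argsz;
    __u32 flags;
    __u32 index;      /* region index */
    __u32 cap_offset; /* offset of the first capability, 0 if none */
    __u64 size;       /* region size in bytes */
    __u64 offset;     /* region offset from the start of the device fd */
};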
Also on REGION_INFO, offset is specified as "the base offset to be
given to the mmap() call for regions with the MMAP attribute". Base
offset from what? Is the mmap performed on the socket fd? Do we not
allow read/write, and must we use VFIO_USER_MMIO_READ/WRITE instead?
Why do we specify "MMIO" in those operations rather than simply
"REGION"? Are we arbitrarily excluding support for I/O port regions or
device-specific regions? If these commands replace direct read and
write on an fd offset, how is PCI config space handled?
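For comparison, a kernel VFIO client reaches config space today
through the region info offset on the device fd; a sketch of the
existing flow that the socket protocol needs an equivalent for:

#include <stdint.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/vfio.h>

/* Read the vendor and device IDs from the PCI config space of a
 * kernel VFIO device: look up the config region's offset within the
 * device fd, then pread() at that offset. */
static int read_pci_ids(int device_fd, uint32_t *ids)
{
    struct vfio_region_info info = {
        .argsz = sizeof(info),
        .index = VFIO_PCI_CONFIG_REGION_INDEX,
    };

    if (ioctl(device_fd, VFIO_DEVICE_GET_REGION_INFO, &info) < 0)
        return -1;

    /* Config space starts at info.offset within the device fd. */
    if (pread(device_fd, ids, sizeof(*ids), info.offset) != sizeof(*ids))
        return -1;

    return 0;
}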
VFIO_USER_MMIO_READ specifies that the count field is zero and that
the reply will include a count specifying the amount of data read.
How does the client specify how much data to read? Via the message
size?
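What I'd expect instead is the usual pattern where the request itself
carries the count; a hypothetical layout, field names invented for
illustration:

#include <stdint.h>

/* Hypothetical read request: the client states up front how many
 * bytes it wants; the reply echoes region/offset and carries exactly
 * that much data (or an error). */
struct vfio_user_region_read {
    uint32_t region; /* region index to read from */
    uint32_t count;  /* number of bytes requested */
    uint64_t offset; /* offset within the region */
};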
For VFIO_USER_DMA_READ/WRITE, is the address a GPA or an IOVA? IMO the
device should only ever have access via IOVA, which implies a DMA
mapping exists for the device. Can you provide an example of why we
need these commands? There seems little point to this interface if a
device cannot directly interact with VM memory.
The IOMMU commands should be unnecessary; a vIOMMU should be
transparent to the server by virtue of the device only knowing about
the IOVA mappings accessible to it. Requiring the client to expose
all memory to the server implies that the server must always be
trusted.
In the interrupt info format, s/type/index/ and s/vector/subindex/.
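That's to match the kernel's existing terminology, where index selects
the interrupt type (INTx, MSI, MSI-X) and start is the first vector
(subindex) within it; copied from <linux/vfio.h> for reference:

#include <linux/types.h>

/* VFIO_DEVICE_SET_IRQS payload: index names the interrupt type and
 * start/count select a range of vectors (subindices) within it. */
struct vfio_irq_set {
    __u32 argsz;
    __u32 flags;
    __u32 index; /* IRQ index, e.g. VFIO_PCI_MSIX_IRQ_INDEX */
    __u32 start; /* first vector (subindex) affected */
    __u32 count; /* number of vectors */
    __u8  data[];
};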
In addition to the unused ioctls, the entire concept of groups and
containers is absent from this specification. To some degree that
makes sense; mdevs, and typically SR-IOV VFs as well, have a 1:1
device-to-group relationship. However, the container is very much
involved in the development of migration support, where it's the
container that provides dirty bitmaps. Since we're doing map and
unmap without that container concept here, perhaps we'd equally apply
those APIs to this same socket. Thanks,
Alex