From: "Daniel P. Berrangé" <berrange@redhat.com>
To: Thanos Makatos <thanos.makatos@nutanix.com>
Cc: "Walker, Benjamin" <benjamin.walker@intel.com>,
"Elena Ufimtseva" <elena.ufimtseva@oracle.com>,
"Jag Raman" <jag.raman@oracle.com>,
"Harris, James R" <james.r.harris@intel.com>,
"Swapnil Ingle" <swapnil.ingle@nutanix.com>,
"John G Johnson" <john.g.johnson@oracle.com>,
"Stefan Hajnoczi" <stefanha@gmail.com>,
"Konrad Rzeszutek Wilk" <konrad.wilk@oracle.com>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
"Raphael Norwitz" <raphael.norwitz@nutanix.com>,
"Kirti Wankhede" <kwankhede@nvidia.com>,
"Alex Williamson" <alex.williamson@redhat.com>,
"Stefan Hajnoczi" <stefanha@redhat.com>,
"Felipe Franciosi" <felipe@nutanix.com>,
"Kanth Ghatraju" <Kanth.Ghatraju@oracle.com>,
"Marc-André Lureau" <marcandre.lureau@redhat.com>,
"Zhang, Tina" <tina.zhang@intel.com>,
"Liu, Changpeng" <changpeng.liu@intel.com>,
"dgilbert@redhat.com" <dgilbert@redhat.com>
Subject: Re: RFC: use VFIO over a UNIX domain socket to implement device offloading
Date: Thu, 30 Apr 2020 12:40:41 +0100
Message-ID: <20200430114041.GN2084570@redhat.com>
In-Reply-To: <MW2PR02MB372319618A59DA06851BBFB48BAA0@MW2PR02MB3723.namprd02.prod.outlook.com>
On Thu, Apr 30, 2020 at 11:23:34AM +0000, Thanos Makatos wrote:
> > > > I've just shared with you the Google doc we've been working on with John,
> > > > where we've been drafting the protocol specification; we think it's time
> > > > for some first comments. Please feel free to comment/edit and suggest
> > > > more people to be on the reviewers list.
> > > >
> > > > You can also find the Google doc here:
> > > >
> > > >
> > > > https://docs.google.com/document/d/1FspkL0hVEnZqHbdoqGLUpyC38rSk_7HhY471TsVwyK8/edit?usp=sharing
> > > >
> > > > If a Google doc doesn't work for you we're open to suggestions.
> > >
> > > I can't add comments to the document so I've inlined them here:
> > >
> > > The spec assumes the reader is already familiar with VFIO and does not
> > > explain concepts like the device lifecycle, regions, interrupts, etc.
> > > We don't need to duplicate detailed VFIO information, but I think the
> > > device model should be explained so that anyone can start from the
> > > VFIO-user spec and begin working on an implementation. Right now I
> > > think they would have to do some serious investigation of VFIO first in
> > > order to be able to write code.
> >
> > I've added a high-level overview of how VFIO is used in this context.
> >
> > > "only the source header files are used"
> > > I notice the current <linux/vfio.h> header is licensed "GPL-2.0 WITH
> > > Linux-syscall-note". I'm not a lawyer but I guess this means there are
> > > some restrictions on using this header file. The <linux/virtio*.h>
> > > header files were explicitly licensed under the BSD license to make it
> > > easy to use the non __KERNEL__ parts.
> >
> > My impression is that this note actually relaxes the licensing requirements, so
> > that proprietary software can use the system call headers and run on Linux
> > without being considered derived work. In any case I'll double check with our
> > legal team.
> >
> > > VFIO-user Command Types: please indicate for each request type whether
> > > it is client->server, server->client, or both. Also is it a "command"
> > > or "request"?
> >
> > Will do. It's a command.
> >
> >
> > > vfio_user_req_type <-- is this an extension on top of <linux/vfio.h>?
> > > Please make it clear what is part of the base <linux/vfio.h> protocol
> > > and what is specific to vfio-user.
> >
> > Correct, it's an extension over <linux/vfio.h>. I've clarified the two symbol
> > namespaces.
> >
> >
> > > VFIO_USER_READ/WRITE serve completely different purposes depending on
> > > whether they are sent client->server or server->client. I suggest
> > > defining separate request type constants instead of overloading them.
> >
> > Fixed.
> >
> > > What is the difference between VFIO_USER_MAP_DMA and
> > > VFIO_USER_REG_MEM?
> > > They both seem to be client->server messages for setting up memory but
> > > I'm not sure why two request types are needed.
> >
> > John will provide more information on this.
> >
> > > struct vfio_user_req->data. Is this really a union so that every
> > > message has the same size, regardless of how many parameters are passed
> > > in the data field?
> >
> > Correct, it's a union so that every message has the same length.
> >
> > > "a framebuffer where the guest does multiple stores to the virtual
> > > device." Do you mean in SMP guests? Or even in a single CPU guest?
> >
> > @John
> >
> > > Also, is there any concurrency requirement on the client and server
> > > side? Can I implement a client/server that processes requests
> > > sequentially and completes them before moving on to the next request or
> > > would that deadlock for certain message types?
> >
> > I believe that this might also depend on the device semantics, will need to
> > think about it in greater detail.
>
> I've looked at this but can't provide a definitive answer yet. I believe the
> safest thing to do is for the server to process requests in order.
>
> > More importantly, considering:
> > a) Marc-André's comments about data alignment etc., and
> > b) the possibility to run the server on another guest or host,
> > we won't be able to use native VFIO types. If we do want to support that
> > then we'll have to redefine all data formats, similar to
> > https://github.com/qemu/qemu/blob/master/docs/interop/vhost-user.rst .
> >
> > So the protocol will be more like an enhanced version of the Vhost-user
> > protocol than VFIO. I'm fine with either direction (VFIO vs. enhanced
> > Vhost-user), so we need to decide before proceeding as the request format is
> > substantially different.
>
> Regarding the ability to use the protocol on non-AF_UNIX sockets, we can
> support this future use case without unnecessarily complicating the protocol by
> defining the C structs and stating that data alignment and endianness for the
> non-AF_UNIX case must be those used by GCC on an x86_64 machine, or can be
> overridden as required.

Defining it to be x86_64 semantics is effectively saying "we're not going
to do anything and it is up to other arch maintainers to fix the inevitable
portability problems that arise".
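
To make the portability concern concrete, here is a minimal sketch in C. The
struct `example_msg` and both helper functions are hypothetical and are not
taken from the draft spec; they only show that a plain C struct leaves padding
and field offsets to whatever ABI the compiler targets.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical message layout, for illustration only (not from the spec). */
struct example_msg {
    uint16_t id;    /* always at offset 0 */
    uint64_t addr;  /* offset depends on the ABI's alignment of uint64_t */
    uint32_t len;
};

/* Offset of 'addr' as chosen by the compiler for the current target. */
static size_t example_addr_offset(void)
{
    return offsetof(struct example_msg, addr);
}

/* Total in-memory size, including any padding the ABI inserts. */
static size_t example_msg_size(void)
{
    return sizeof(struct example_msg);
}
```

With GCC on x86_64 Linux this layout typically puts `addr` at offset 8 and
pads the struct to 24 bytes, while the 32-bit x86 System V ABI aligns 64-bit
members to 4 bytes, giving offset 4 and a size of 16. Sending the raw struct
between such peers would misplace every field after `id`.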

Since this is a new protocol, should we take the opportunity to model it
explicitly in some common standard RPC protocol language? This would have
the benefit of allowing implementors to use off-the-shelf APIs for their
wire protocol marshalling, and would eliminate questions about endianness
and alignment across architectures.
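
As a rough illustration of what an explicit wire definition buys, the sketch
below hand-encodes a hypothetical message into a fixed little-endian byte
layout. It is not an RPC protocol language and not part of the proposed spec:
`example_msg`, `EXAMPLE_WIRE_SIZE`, and all function names are assumptions
made for this example. An off-the-shelf marshalling library would generate
equivalent code from a schema.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical message, for illustration only (not from the spec). */
struct example_msg {
    uint16_t id;
    uint64_t addr;
    uint32_t len;
};

/* Wire layout: id (2 bytes) | addr (8 bytes) | len (4 bytes), no padding. */
#define EXAMPLE_WIRE_SIZE 14

/* Store the low n bytes of v at p, least significant byte first. */
static void put_le(uint8_t *p, uint64_t v, size_t n)
{
    for (size_t i = 0; i < n; i++)
        p[i] = (uint8_t)(v >> (8 * i));
}

/* Load an n-byte little-endian value from p. */
static uint64_t get_le(const uint8_t *p, size_t n)
{
    uint64_t v = 0;
    for (size_t i = 0; i < n; i++)
        v |= (uint64_t)p[i] << (8 * i);
    return v;
}

/* Encode m into out using fixed offsets, independent of host ABI. */
static void example_encode(const struct example_msg *m,
                           uint8_t out[EXAMPLE_WIRE_SIZE])
{
    assert(m != NULL && out != NULL);
    put_le(out + 0, m->id, 2);
    put_le(out + 2, m->addr, 8);
    put_le(out + 10, m->len, 4);
}

/* Decode the same fixed layout back into a host-native struct. */
static void example_decode(const uint8_t in[EXAMPLE_WIRE_SIZE],
                           struct example_msg *m)
{
    assert(in != NULL && m != NULL);
    m->id = (uint16_t)get_le(in + 0, 2);
    m->addr = get_le(in + 2, 8);
    m->len = (uint32_t)get_le(in + 10, 4);
}
```

Because every field is written byte by byte at a stated offset, the bytes on
the socket are the same regardless of the sender's endianness or struct
padding rules.
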
Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|