qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Wei Wang <wei.w.wang@intel.com>
To: "Marc-André Lureau" <marcandre.lureau@gmail.com>
Cc: virtio-comment@lists.oasis-open.org, qemu-devel@nongnu.org,
	mst@redhat.com, stefanha@redhat.com, pbonzini@redhat.com
Subject: Re: [Qemu-devel] [virtio-comment] Re: [PATCH v1] docs/vhost-user: extend the vhost-user protocol to support the vhost-pci based inter-vm communication
Date: Tue, 08 Nov 2016 19:35:32 +0800	[thread overview]
Message-ID: <5821B884.5080309@intel.com> (raw)
In-Reply-To: <CAJ+F1C+9k9+X7RCVQ5HMXUxRmdA7WecQ2eTZJ_87ZPHht9s=_g@mail.gmail.com>

On 11/08/2016 03:47 PM, Marc-André Lureau wrote:
> Hi
>
> I suggest you split this patch for the various "features" you propose.

OK. I'll make it several small patches.   <*v1-AR1*>
>
>      Master and slave can be either a client (i.e. connecting) or
>     server (listening)
>      in the socket communication.
>
"Client" and "Server" have already been used in the doc here.
>
>     +The current vhost-user protocol is extended to support the
>     vhost-pci based inter-VM
>     +communication. In this case, Slave is a QEMU which runs a
>     vhost-pci server, and
>     +Master is another QEMU which runs a vhost-pci client.
>     +
>
>
>
> Why introduce new terminology "server" and "client"? What does it 
> change? This is confusing with socket client/server configuration.

OK. I will try with "Slave" and "Master" in this doc when it's possible. 
<*v1-AR2*>

>
>      Message Specification
>      ---------------------
>
>      Note that all numbers are in the machine native byte order. A
>     vhost-user message
>     -consists of 3 header fields and a payload:
>     +consists of 4 header fields and a payload:
>
>     -------------------------------------
>     -| request | flags | size | payload |
>     -------------------------------------
>     +----------------------------------------------
>     +| request | flags | conn_id | size | payload |
>     +----------------------------------------------
>
>       * Request: 32-bit type of the request
>       * Flags: 32-bit bit field:
>         - Lower 2 bits are the version (currently 0x01)
>     -   - Bit 2 is the reply flag - needs to be sent on each reply
>     from the slave
>     +   - Bit 2 is the reply flag - needs to be sent on each reply
>         - Bit 3 is the need_reply flag - see
>     VHOST_USER_PROTOCOL_F_REPLY_ACK for
>           details.
>     + * Conn_id: 64-bit connection id to indentify a client socket
>     connection. It is
>     +            introduced in version 0x02 to support the
>     "1-server-N-client" model
>     +            and an asynchronous client read implementation. The
>     connection id,
>     +            0xFFFFFFFFFFFFFFFF, is used by an anonymous client
>     (e.g. a client who
>     +            has not got its connection id from the server in the
>     initial talk)
>
>
> I don't understand why you need a connection id, on each message. 
> What's the purpose? Since the communication is unicast, a single 
> message should be enough.

Sure, please let me explain more:
The QEMU socket is going to be upgraded to support 1 server socket being 
connected by multiple client sockets (I've made patches to achieve 
this). In other words, here, multiple masters will connect to one slave, 
and the slave creates a vhost-pci device for each master after receiving 
the necessary message info. The slave needs to know which master it is 
talking to when receiving a message, as it maintains multiple 
connections at the same time.
Also shed some light on the implementation:
The slave maintains a table for those masters. Each master has an entry 
in the table (indexed by a "conn_id"). When the slave receives a 
message, the payload is added to the corresponding table entry. When 
things are ready (i.e. it has received enough info to create a vhost-pci 
device for the master), the device creation code creates and initializes 
a vhost-pci device (e.g. initialize VirtioPCIProxy in virtio-pci.c) from 
the corresponding table entry.

>
>       * Size - 32-bit size of the payload
>
>
>     @@ -97,6 +106,13 @@ Depending on the request type, payload can be:
>         log offset: offset from start of supplied file descriptor
>             where logging starts (i.e. where guest address 0 would be
>     logged)
>
>     +* Device info
>     +   --------------------
>     +   | virito id | uuid |
>     +   --------------------
>     +   Virtio id: 16-bit virtio id of the device
>     +   UUID: 128-bit UUID to identify the QEMU instance that creates
>     the device
>     +
>
>
> I wonder if UUID should be a different message.
>
We can make uuid another message if it has other usages.
Do you see any other usages of uuid?

>
>      In QEMU the vhost-user message is implemented with the following
>     struct:
>
>      typedef struct VhostUserMsg {
>     @@ -109,6 +125,7 @@ typedef struct VhostUserMsg {
>              struct vhost_vring_addr addr;
>              VhostUserMemory memory;
>              VhostUserLog log;
>     +        DeviceInfo dev_info;
>          };
>      } QEMU_PACKED VhostUserMsg;
>
>     @@ -119,17 +136,25 @@ The protocol for vhost-user is based on the
>     existing implementation of vhost
>      for the Linux Kernel. Most messages that can be sent via the Unix
>     domain socket
>      implementing vhost-user have an equivalent ioctl to the kernel
>     implementation.
>
>     -The communication consists of master sending message requests and
>     slave sending
>     -message replies. Most of the requests don't require replies. Here
>     is a list of
>     -the ones that do:
>     +Traditionally, the communication consists of master sending
>     message requests
>     +and slave sending message replies. Most of the requests don't
>     require replies.
>     +Here is a list of the ones that do:
>
>       * VHOST_GET_FEATURES
>       * VHOST_GET_PROTOCOL_FEATURES
>       * VHOST_GET_VRING_BASE
>       * VHOST_SET_LOG_BASE (if VHOST_USER_PROTOCOL_F_LOG_SHMFD)
>     + * VHOST_USER_GET_CONN_ID
>     + * VHOST_USER_SET_PEER_CONNECTION
>
> Let's also fix  the VHOST_USER prefix of the above requests.
>
Sure, will do. <*v1-AR3*>

>      [ Also see the section on REPLY_ACK protocol extension. ]
>
>     +Currently, the communication also supports the Slave (server)
>     sending messages
>     +to the Master (client). Here is a list of them:
>     + * VHOST_USER_SET_FEATURES
>
>     + * VHOST_USER_SET_PEER_CONNECTION (the serve may actively request
>     to disconnect
>     +   with the client)
>
>
> Oh, you are making the communication bidirectional? This is a 
> fundamental change in the protocol. This may be difficult to implement 
> in qemu, since the communication in synchronous, a request expects an 
> immediate reply, if it gets back a request (from the slave) in the 
> middle, it will fail.
>

Not really.
Adding the above two doesn't affect the existing synchronous read() 
messages (basically, those VHOST_USER_GET_xx messages). Like 
VHOST_USER_SET_FEATURES, the _SET_ messages don't need a reply. Here, we 
just make the slave capable of actively sending messages to the master.

> Currently all requests (including VHOST_USER_SET_FEATURES) are coming 
> from the Master. I don't understand yet the purpose of 
> VHOST_USER_SET_PEER_CONNECTION to propose an alternative, but I would 
> rather keep the unidirectional communication if possible.
>
>      There are several messages that the master sends with file
>     descriptors passed
>      in the ancillary data:
>
>     @@ -259,6 +284,7 @@ Protocol features
>      #define VHOST_USER_PROTOCOL_F_LOG_SHMFD      1
>      #define VHOST_USER_PROTOCOL_F_RARP           2
>      #define VHOST_USER_PROTOCOL_F_REPLY_ACK      3
>     +#define VHOST_USER_PROTOCOL_F_VHOST_PCI      4
>
>      Message types
>      -------------
>     @@ -470,6 +496,43 @@ Message types
>            The first 6 bytes of the payload contain the mac address of
>     the guest to
>            allow the vhost user backend to construct and broadcast the
>     fake RARP.
>
>     + * VHOST_USER_GET_CONN_ID
>     +
>     +      Id: 20
>     +      Equivalent ioctl: N/A
>     +      Master payload: u64
>     +
>     +      The client sends this message to the server to ask for its
>     connection id.
>
>
> Confusing, please keep the Master/Slave terminology
OK.
>
>     + The connection id is then put into the message header (the
>     conn_id field),
>     +      so that the server can always know who it is talking with.
>     +
>
>
> Could you explain what the connection id is for?

Explained above. Please let me know if I didn't make it clear.

>
> +This request should be sent only when VHOST_USER_PROTOCOL_F_VHOST_PCI 
> has...
>
>     +* VHOST_USER_SET_DEV_INFO
>     +
>     +      Id: 21
>     +      Equivalent ioctl: N/A
>     +      Master payload: dev info
>     +
>     +      The client sends the producer device info to the server.
>
>
> "Master sends producer device info to the Slave" works, no?

Yes, it works. The current dev info only contains a "virtio id" field 
(assume we'll take uuid out as a separate message), which tells the 
slave if it is a net, scsi, console or else. do you see any issue?
>
> Could we guarantee this message is sent before SET_VRING*?

Why do we need to guarantee this?

>
>     + This request should be sent only when
>     VHOST_USER_PROTOCOL_F_VHOST_PCI has
>     +      been negotiated.
>     +
>
>
> I think this message could be useful for other purposes than 
> vhost-pci, thus I would give it its own flag.

Could you please give an example of other usage? Thanks.

>
>     +* VHOST_USER_SET_PEER_CONNECTION
>     +
>     +      Id: 22
>     +      Equivalent ioctl: N/A
>     +      Master payload: u64
>     +
>     +      The producer device requests to connect or disconnect to
>     the consumer device.
>
>
> producer->Master, consummer->Slave
>
> How does it interact with SET_VRING_ENABLE?

It's independent of SET_VRING_ENABLE:
SET_VRING_ENABLE enables a virtq to be in "active".
SET_PEER_CONNECTION enables the peer (slave or master) device to be in 
"active". The driver shouldn't send packets if the device is inactive.

>
>     + The consumer device may request to disconnect to the producer
>     device. This
>     +      request should be sent only when
>     VHOST_USER_PROTOCOL_F_VHOST_PCI has been
>     +      negotiated.
>     +      Connection request: If the reply message indicates
>     "success", the vhost-pci based
>     +      inter-VM communication channel has been established.
>     +      Disconnection request: If the reply message indicates
>     "success", the vhost-pci based
>     +      inter-VM communication channel has been destroyed.
>     +      #define VHOST_USER_SET_PEER_CONNECTION_F_OFF 0
>     +      #define VHOST_USER_SET_PEER_CONNECTION_F_ON 1
>     +
>
I think it would be better to add one more command here:
#define VHOST_USER_SET_PEER_CONNECTION_F_INIT 2

The master uses this command to tell the slave it's ready to create the 
vhost-pci device. Regarding the implementation, it is put at the bottom 
of vhost_net_start() function (when all the vring info have been sent 
and enabled).

Best,
Wei

  reply	other threads:[~2016-11-08 11:35 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-10-24  7:10 [Qemu-devel] [PATCH v1] docs/vhost-user: extend the vhost-user protocol to support the vhost-pci based inter-vm communication Wei Wang
2016-10-30 18:02 ` Michael S. Tsirkin
2016-11-07  5:43 ` Wang, Wei W
2016-11-08  7:47 ` Marc-André Lureau
2016-11-08 11:35   ` Wei Wang [this message]
2016-11-08 11:59     ` [Qemu-devel] [virtio-comment] " Wang, Wei W
2016-11-08 12:17     ` Marc-André Lureau
2016-11-09  8:32       ` Wei Wang
2016-11-10 12:26         ` Marc-André Lureau
2016-11-11  8:24           ` Wei Wang
2016-11-11 17:12       ` Michael S. Tsirkin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5821B884.5080309@intel.com \
    --to=wei.w.wang@intel.com \
    --cc=marcandre.lureau@gmail.com \
    --cc=mst@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    --cc=virtio-comment@lists.oasis-open.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).