All of lore.kernel.org
 help / color / mirror / Atom feed
From: Wei Wang <wei.w.wang@intel.com>
To: "Marc-André Lureau" <marcandre.lureau@gmail.com>
Cc: virtio-comment@lists.oasis-open.org, qemu-devel@nongnu.org,
	mst@redhat.com, stefanha@redhat.com, pbonzini@redhat.com
Subject: Re: [Qemu-devel] [virtio-comment] Re: [PATCH v1] docs/vhost-user: extend the vhost-user protocol to support the vhost-pci based inter-vm communication
Date: Tue, 08 Nov 2016 19:35:32 +0800	[thread overview]
Message-ID: <5821B884.5080309@intel.com> (raw)
In-Reply-To: <CAJ+F1C+9k9+X7RCVQ5HMXUxRmdA7WecQ2eTZJ_87ZPHht9s=_g@mail.gmail.com>

On 11/08/2016 03:47 PM, Marc-André Lureau wrote:
> Hi
>
> I suggest you split this patch for the various "features" you propose.

OK. I'll make it several small patches.   <*v1-AR1*>
>
>      Master and slave can be either a client (i.e. connecting) or
>     server (listening)
>      in the socket communication.
>
"Client" and "Server" have already been used in the doc here.
>
>     +The current vhost-user protocol is extended to support the
>     vhost-pci based inter-VM
>     +communication. In this case, Slave is a QEMU which runs a
>     vhost-pci server, and
>     +Master is another QEMU which runs a vhost-pci client.
>     +
>
>
>
> Why introduce new terminology "server" and "client"? What does it 
> change? This is confusing with socket client/server configuration.

OK. I will try with "Slave" and "Master" in this doc when it's possible. 
<*v1-AR2*>

>
>      Message Specification
>      ---------------------
>
>      Note that all numbers are in the machine native byte order. A
>     vhost-user message
>     -consists of 3 header fields and a payload:
>     +consists of 4 header fields and a payload:
>
>     -------------------------------------
>     -| request | flags | size | payload |
>     -------------------------------------
>     +----------------------------------------------
>     +| request | flags | conn_id | size | payload |
>     +----------------------------------------------
>
>       * Request: 32-bit type of the request
>       * Flags: 32-bit bit field:
>         - Lower 2 bits are the version (currently 0x01)
>     -   - Bit 2 is the reply flag - needs to be sent on each reply
>     from the slave
>     +   - Bit 2 is the reply flag - needs to be sent on each reply
>         - Bit 3 is the need_reply flag - see
>     VHOST_USER_PROTOCOL_F_REPLY_ACK for
>           details.
>     + * Conn_id: 64-bit connection id to indentify a client socket
>     connection. It is
>     +            introduced in version 0x02 to support the
>     "1-server-N-client" model
>     +            and an asynchronous client read implementation. The
>     connection id,
>     +            0xFFFFFFFFFFFFFFFF, is used by an anonymous client
>     (e.g. a client who
>     +            has not got its connection id from the server in the
>     initial talk)
>
>
> I don't understand why you need a connection id, on each message. 
> What's the purpose? Since the communication is unicast, a single 
> message should be enough.

Sure, please let me explain more:
The QEMU socket is going to be upgraded to support 1 server socket being 
connected by multiple client sockets (I've made patches to achieve 
this). In other words, here, multiple masters will connect to one slave, 
and the slave creates a vhost-pci device for each master after receiving 
the necessary message info. The slave needs to know which master it is 
talking to when receiving a message, as it maintains multiple 
connections at the same time.
Also shed some light on the implementation:
The slave maintains a table for those masters. Each master has an entry 
in the table (indexed by a "conn_id"). When the slave receives a 
message, the payload is added to the corresponding table entry. When 
things are ready (i.e. it has received enough info to create a vhost-pci 
device for the master), the device creation code creates and initializes 
a vhost-pci device (e.g. initialize VirtioPCIProxy in virtio-pci.c) from 
the corresponding table entry.

>
>       * Size - 32-bit size of the payload
>
>
>     @@ -97,6 +106,13 @@ Depending on the request type, payload can be:
>         log offset: offset from start of supplied file descriptor
>             where logging starts (i.e. where guest address 0 would be
>     logged)
>
>     +* Device info
>     +   --------------------
>     +   | virito id | uuid |
>     +   --------------------
>     +   Virtio id: 16-bit virtio id of the device
>     +   UUID: 128-bit UUID to identify the QEMU instance that creates
>     the device
>     +
>
>
> I wonder if UUID should be a different message.
>
We can make uuid another message if it has other usages.
Do you see any other usages of uuid?

>
>      In QEMU the vhost-user message is implemented with the following
>     struct:
>
>      typedef struct VhostUserMsg {
>     @@ -109,6 +125,7 @@ typedef struct VhostUserMsg {
>              struct vhost_vring_addr addr;
>              VhostUserMemory memory;
>              VhostUserLog log;
>     +        DeviceInfo dev_info;
>          };
>      } QEMU_PACKED VhostUserMsg;
>
>     @@ -119,17 +136,25 @@ The protocol for vhost-user is based on the
>     existing implementation of vhost
>      for the Linux Kernel. Most messages that can be sent via the Unix
>     domain socket
>      implementing vhost-user have an equivalent ioctl to the kernel
>     implementation.
>
>     -The communication consists of master sending message requests and
>     slave sending
>     -message replies. Most of the requests don't require replies. Here
>     is a list of
>     -the ones that do:
>     +Traditionally, the communication consists of master sending
>     message requests
>     +and slave sending message replies. Most of the requests don't
>     require replies.
>     +Here is a list of the ones that do:
>
>       * VHOST_GET_FEATURES
>       * VHOST_GET_PROTOCOL_FEATURES
>       * VHOST_GET_VRING_BASE
>       * VHOST_SET_LOG_BASE (if VHOST_USER_PROTOCOL_F_LOG_SHMFD)
>     + * VHOST_USER_GET_CONN_ID
>     + * VHOST_USER_SET_PEER_CONNECTION
>
> Let's also fix  the VHOST_USER prefix of the above requests.
>
Sure, will do. <*v1-AR3*>

>      [ Also see the section on REPLY_ACK protocol extension. ]
>
>     +Currently, the communication also supports the Slave (server)
>     sending messages
>     +to the Master (client). Here is a list of them:
>     + * VHOST_USER_SET_FEATURES
>
>     + * VHOST_USER_SET_PEER_CONNECTION (the serve may actively request
>     to disconnect
>     +   with the client)
>
>
> Oh, you are making the communication bidirectional? This is a 
> fundamental change in the protocol. This may be difficult to implement 
> in qemu, since the communication in synchronous, a request expects an 
> immediate reply, if it gets back a request (from the slave) in the 
> middle, it will fail.
>

Not really.
Adding the above two doesn't affect the existing synchronous read() 
messages (basically, those VHOST_USER_GET_xx messages). Like 
VHOST_USER_SET_FEATURES, the _SET_ messages don't need a reply. Here, we 
just make the slave capable of actively sending messages to the master.

> Currently all requests (including VHOST_USER_SET_FEATURES) are coming 
> from the Master. I don't understand yet the purpose of 
> VHOST_USER_SET_PEER_CONNECTION to propose an alternative, but I would 
> rather keep the unidirectional communication if possible.
>
>      There are several messages that the master sends with file
>     descriptors passed
>      in the ancillary data:
>
>     @@ -259,6 +284,7 @@ Protocol features
>      #define VHOST_USER_PROTOCOL_F_LOG_SHMFD      1
>      #define VHOST_USER_PROTOCOL_F_RARP           2
>      #define VHOST_USER_PROTOCOL_F_REPLY_ACK      3
>     +#define VHOST_USER_PROTOCOL_F_VHOST_PCI      4
>
>      Message types
>      -------------
>     @@ -470,6 +496,43 @@ Message types
>            The first 6 bytes of the payload contain the mac address of
>     the guest to
>            allow the vhost user backend to construct and broadcast the
>     fake RARP.
>
>     + * VHOST_USER_GET_CONN_ID
>     +
>     +      Id: 20
>     +      Equivalent ioctl: N/A
>     +      Master payload: u64
>     +
>     +      The client sends this message to the server to ask for its
>     connection id.
>
>
> Confusing, please keep the Master/Slave terminology
OK.
>
>     + The connection id is then put into the message header (the
>     conn_id field),
>     +      so that the server can always know who it is talking with.
>     +
>
>
> Could you explain what the connection id is for?

Explained above. Please let me know if I didn't make it clear.

>
> +This request should be sent only when VHOST_USER_PROTOCOL_F_VHOST_PCI 
> has...
>
>     +* VHOST_USER_SET_DEV_INFO
>     +
>     +      Id: 21
>     +      Equivalent ioctl: N/A
>     +      Master payload: dev info
>     +
>     +      The client sends the producer device info to the server.
>
>
> "Master sends producer device info to the Slave" works, no?

Yes, it works. The current dev info only contains a "virtio id" field 
(assume we'll take uuid out as a separate message), which tells the 
slave if it is a net, scsi, console or else. do you see any issue?
>
> Could we guarantee this message is sent before SET_VRING*?

Why do we need to guarantee this?

>
>     + This request should be sent only when
>     VHOST_USER_PROTOCOL_F_VHOST_PCI has
>     +      been negotiated.
>     +
>
>
> I think this message could be useful for other purposes than 
> vhost-pci, thus I would give it its own flag.

Could you please give an example of other usage? Thanks.

>
>     +* VHOST_USER_SET_PEER_CONNECTION
>     +
>     +      Id: 22
>     +      Equivalent ioctl: N/A
>     +      Master payload: u64
>     +
>     +      The producer device requests to connect or disconnect to
>     the consumer device.
>
>
> producer->Master, consummer->Slave
>
> How does it interact with SET_VRING_ENABLE?

It's independent of SET_VRING_ENABLE:
SET_VRING_ENABLE enables a virtq to be in "active".
SET_PEER_CONNECTION enables the peer (slave or master) device to be in 
"active". The driver shouldn't send packets if the device is inactive.

>
>     + The consumer device may request to disconnect to the producer
>     device. This
>     +      request should be sent only when
>     VHOST_USER_PROTOCOL_F_VHOST_PCI has been
>     +      negotiated.
>     +      Connection request: If the reply message indicates
>     "success", the vhost-pci based
>     +      inter-VM communication channel has been established.
>     +      Disconnection request: If the reply message indicates
>     "success", the vhost-pci based
>     +      inter-VM communication channel has been destroyed.
>     +      #define VHOST_USER_SET_PEER_CONNECTION_F_OFF 0
>     +      #define VHOST_USER_SET_PEER_CONNECTION_F_ON 1
>     +
>
I think it would be better to add one more command here:
#define VHOST_USER_SET_PEER_CONNECTION_F_INIT 2

The master uses this command to tell the slave it's ready to create the 
vhost-pci device. Regarding the implementation, it is put at the bottom 
of vhost_net_start() function (when all the vring info have been sent 
and enabled).

Best,
Wei

  reply	other threads:[~2016-11-08 11:35 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-10-24  7:10 [Qemu-devel] [PATCH v1] docs/vhost-user: extend the vhost-user protocol to support the vhost-pci based inter-vm communication Wei Wang
2016-10-30 18:02 ` Michael S. Tsirkin
2016-11-07  5:43 ` Wang, Wei W
2016-11-08  7:47 ` Marc-André Lureau
2016-11-08 11:35   ` Wei Wang [this message]
2016-11-08 11:59     ` [Qemu-devel] [virtio-comment] " Wang, Wei W
2016-11-08 12:17     ` Marc-André Lureau
2016-11-09  8:32       ` Wei Wang
2016-11-10 12:26         ` Marc-André Lureau
2016-11-11  8:24           ` Wei Wang
2016-11-11 17:12       ` Michael S. Tsirkin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5821B884.5080309@intel.com \
    --to=wei.w.wang@intel.com \
    --cc=marcandre.lureau@gmail.com \
    --cc=mst@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    --cc=virtio-comment@lists.oasis-open.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.