From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Johannes Berg <johannes@sipsolutions.net>
Cc: qemu-devel@nongnu.org, Johannes Berg <johannes.berg@intel.com>,
"Michael S . Tsirkin" <mst@redhat.com>
Subject: Re: [Qemu-devel] [RFC v2 1/2] docs: vhost-user: add in-band kick/call messages
Date: Wed, 11 Sep 2019 20:15:14 +0100 [thread overview]
Message-ID: <20190911191514.GA2895@work-vm> (raw)
In-Reply-To: <20190911134539.25650-2-johannes@sipsolutions.net>
* Johannes Berg (johannes@sipsolutions.net) wrote:
> From: Johannes Berg <johannes.berg@intel.com>
>
> For good reason, vhost-user is currently built asynchronously, that
> way better performance can be obtained. However, for certain use
> cases such as simulation, this is problematic.
>
> Consider an event-based simulation in which both the device and CPU
> have scheduled according to a simulation "calendar". Now, consider
> the CPU sending I/O to the device, over a vring in the vhost-user
> protocol. In this case, the CPU must wait for the vring interrupt
> to have been processed by the device, so that the device is able to
> put an entry onto the simulation calendar to obtain time to handle
> the interrupt. Note that this doesn't mean the I/O is actually done
> at this time, it just means that the handling of it is scheduled
> before the CPU can continue running.
>
> This cannot be done with the asynchronous eventfd based vring kick
> and call design.
>
> Extend the protocol slightly, so that a message can be used for kick
> and call instead, if VHOST_USER_PROTOCOL_F_IN_BAND_NOTIFICATIONS is
> negotiated. This in itself doesn't guarantee synchronisation, but both
> sides can also negotiate VHOST_USER_PROTOCOL_F_REPLY_ACK and thus get
> a reply to this message by setting the need_reply flag, and ensure
> synchronisation this way.
I'm confused; if you've already got REPLY_ACK, why do we need anything
else? We already require the reply on set_mem_table as part of
postcopy.
Dave
> To really use it in both directions, VHOST_USER_PROTOCOL_F_SLAVE_REQ
> is also needed.
>
> Since it is used for simulation purposes and too many messages on
> the socket can lock up the virtual machine, document that this should
> only be used together with the mentioned features.
>
> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
> ---
> docs/interop/vhost-user.rst | 113 ++++++++++++++++++++++++++++++------
> 1 file changed, 96 insertions(+), 17 deletions(-)
>
> diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
> index 7827b710aa0a..c4396eabf9e9 100644
> --- a/docs/interop/vhost-user.rst
> +++ b/docs/interop/vhost-user.rst
> @@ -2,6 +2,7 @@
> Vhost-user Protocol
> ===================
> :Copyright: 2014 Virtual Open Systems Sarl.
> +:Copyright: 2019 Intel Corporation
> :Licence: This work is licensed under the terms of the GNU GPL,
> version 2 or later. See the COPYING file in the top-level
> directory.
> @@ -279,6 +280,9 @@ If *master* is unable to send the full message or receives a wrong
> reply it will close the connection. An optional reconnection mechanism
> can be implemented.
>
> +If *slave* detects some error such as incompatible features, it may also
> +close the connection. This should only happen in exceptional circumstances.
> +
> Any protocol extensions are gated by protocol feature bits, which
> allows full backwards compatibility on both master and slave. As
> older slaves don't support negotiating protocol features, a feature
> @@ -315,7 +319,8 @@ it until ring is started, or after it has been stopped.
>
> Client must start ring upon receiving a kick (that is, detecting that
> file descriptor is readable) on the descriptor specified by
> -``VHOST_USER_SET_VRING_KICK``, and stop ring upon receiving
> +``VHOST_USER_SET_VRING_KICK`` (or receiving the in-band message
> +``VHOST_USER_VRING_KICK`` if negotiated) and stop ring upon receiving
> ``VHOST_USER_GET_VRING_BASE``.
>
> While processing the rings (whether they are enabled or not), client
> @@ -767,24 +772,48 @@ When reconnecting:
> #. Resubmit inflight ``DescStatePacked`` entries in order of their
> counter value
>
> +In-band notifications
> +---------------------
> +
> +In some limited situations (e.g. for simulation) it is desirable to
> +have the kick, call and error (if used) signals done via in-band
> +messages instead of asynchronous eventfd notifications. This can be
> +done by negotiating the ``VHOST_USER_PROTOCOL_F_IN_BAND_NOTIFICATIONS``
> +protocol feature.
> +
> +Note that due to the fact that too many messages on the sockets can
> +cause the sending application(s) to block, it is not advised to use
> +this feature unless absolutely necessary. It is also considered an
> +error to negotiate this feature without also negotiating
> +``VHOST_USER_PROTOCOL_F_SLAVE_REQ`` and ``VHOST_USER_PROTOCOL_F_REPLY_ACK``,
> +the former is necessary for getting a message channel from the slave
> +to the master, while the latter needs to be used with the in-band
> +notification messages to block until they are processed, both to avoid
> +blocking later and for proper processing (at least in the simulation
> +use case.) As it has no other way of signalling this error, the slave
> +should close the connection as a response to a
> +``VHOST_USER_SET_PROTOCOL_FEATURES`` message that sets the in-band
> +notifications feature flag without the other two.
> +
> Protocol features
> -----------------
>
> .. code:: c
>
> - #define VHOST_USER_PROTOCOL_F_MQ 0
> - #define VHOST_USER_PROTOCOL_F_LOG_SHMFD 1
> - #define VHOST_USER_PROTOCOL_F_RARP 2
> - #define VHOST_USER_PROTOCOL_F_REPLY_ACK 3
> - #define VHOST_USER_PROTOCOL_F_MTU 4
> - #define VHOST_USER_PROTOCOL_F_SLAVE_REQ 5
> - #define VHOST_USER_PROTOCOL_F_CROSS_ENDIAN 6
> - #define VHOST_USER_PROTOCOL_F_CRYPTO_SESSION 7
> - #define VHOST_USER_PROTOCOL_F_PAGEFAULT 8
> - #define VHOST_USER_PROTOCOL_F_CONFIG 9
> - #define VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD 10
> - #define VHOST_USER_PROTOCOL_F_HOST_NOTIFIER 11
> - #define VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD 12
> + #define VHOST_USER_PROTOCOL_F_MQ 0
> + #define VHOST_USER_PROTOCOL_F_LOG_SHMFD 1
> + #define VHOST_USER_PROTOCOL_F_RARP 2
> + #define VHOST_USER_PROTOCOL_F_REPLY_ACK 3
> + #define VHOST_USER_PROTOCOL_F_MTU 4
> + #define VHOST_USER_PROTOCOL_F_SLAVE_REQ 5
> + #define VHOST_USER_PROTOCOL_F_CROSS_ENDIAN 6
> + #define VHOST_USER_PROTOCOL_F_CRYPTO_SESSION 7
> + #define VHOST_USER_PROTOCOL_F_PAGEFAULT 8
> + #define VHOST_USER_PROTOCOL_F_CONFIG 9
> + #define VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD 10
> + #define VHOST_USER_PROTOCOL_F_HOST_NOTIFIER 11
> + #define VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD 12
> + #define VHOST_USER_PROTOCOL_F_IN_BAND_NOTIFICATIONS 13
>
> Master message types
> --------------------
> @@ -946,7 +975,9 @@ Master message types
> Bits (0-7) of the payload contain the vring index. Bit 8 is the
> invalid FD flag. This flag is set when there is no file descriptor
> in the ancillary data. This signals that polling should be used
> - instead of waiting for a kick.
> + instead of waiting for the call. however, if the protocol feature
> + ``VHOST_USER_PROTOCOL_F_IN_BAND_NOTIFICATIONS`` has been negotiated
> + it instead means the updates should be done using the messages.
>
> ``VHOST_USER_SET_VRING_CALL``
> :id: 13
> @@ -959,7 +990,9 @@ Master message types
> Bits (0-7) of the payload contain the vring index. Bit 8 is the
> invalid FD flag. This flag is set when there is no file descriptor
> in the ancillary data. This signals that polling will be used
> - instead of waiting for the call.
> + instead of waiting for the call; however, if the protocol feature
> + ``VHOST_USER_PROTOCOL_F_IN_BAND_NOTIFICATIONS`` has been negotiated
> + it instead means the updates should be done using the messages.
>
> ``VHOST_USER_SET_VRING_ERR``
> :id: 14
> @@ -971,7 +1004,11 @@ Master message types
>
> Bits (0-7) of the payload contain the vring index. Bit 8 is the
> invalid FD flag. This flag is set when there is no file descriptor
> - in the ancillary data.
> + in the ancillary data. If the protocol features
> + ``VHOST_USER_PROTOCOL_F_IN_BAND_NOTIFICATIONS`` and
> + ``VHOST_USER_PROTOCOL_F_SLAVE_REQ`` have been negotiated the invalid
> + FD flag will be used to indicate that updates should be done using
> + the ``VHOST_USER_SLAVE_ message.
>
> ``VHOST_USER_GET_QUEUE_NUM``
> :id: 17
> @@ -1190,6 +1227,20 @@ Master message types
> ancillary data. The GPU protocol is used to inform the master of
> rendering state and updates. See vhost-user-gpu.rst for details.
>
> +``VHOST_USER_VRING_KICK``
> + :id: 34
> + :equivalent ioctl: N/A
> + :slave payload: vring state description
> + :master payload: N/A
> +
> + When the ``VHOST_USER_PROTOCOL_F_IN_BAND_NOTIFICATIONS`` protocol
> + feature has been successfully negotiated, this message may be
> + submitted by the master to indicate that a buffer was added to
> + the vring instead of signalling it using the vring's event FD or
> + having the slave rely on polling.
> +
> + The state.num field is currently reserved and must be set to 0.
> +
> Slave message types
> -------------------
>
> @@ -1246,6 +1297,34 @@ Slave message types
> ``VHOST_USER_PROTOCOL_F_HOST_NOTIFIER`` protocol feature has been
> successfully negotiated.
>
> +``VHOST_USER_SLAVE_VRING_CALL``
> + :id: 4
> + :equivalent ioctl: N/A
> + :slave payload: vring state description
> + :master payload: N/A
> +
> + When the ``VHOST_USER_PROTOCOL_F_IN_BAND_NOTIFICATIONS`` protocol
> + feature has been successfully negotiated, this message may be
> + submitted by the slave to indicate that a buffer was used from
> + the vring instead of signalling this using the vring's kick FD or
> + having the master relying on polling.
> +
> + The state.num field is currently reserved and must be set to 0.
> +
> +``VHOST_USER_SLAVE_VRING_ERR``
> + :id: 5
> + :equivalent ioctl: N/A
> + :slave payload: vring state description
> + :master payload: N/A
> +
> + When the ``VHOST_USER_PROTOCOL_F_IN_BAND_NOTIFICATIONS`` protocol
> + feature has been successfully negotiated, this message may be
> + submitted by the slave to indicate that an error occurred on the
> + specific vring, instead of signalling the error FD set by the
> + master via ``VHOST_USER_SET_VRING_ERR``.
> +
> + The state.num field is currently reserved and must be set to 0.
> +
> .. _reply_ack:
>
> VHOST_USER_PROTOCOL_F_REPLY_ACK
> --
> 2.20.1
>
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
next prev parent reply other threads:[~2019-09-11 19:16 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-09-11 13:45 [Qemu-devel] [RFC v2 0/2] vhost-user: in-band notifications Johannes Berg
2019-09-11 13:45 ` [Qemu-devel] [RFC v2 1/2] docs: vhost-user: add in-band kick/call messages Johannes Berg
2019-09-11 14:07 ` Michael S. Tsirkin
2019-09-11 15:09 ` Johannes Berg
2019-09-16 11:40 ` Johannes Berg
2019-09-16 15:30 ` Michael S. Tsirkin
2019-09-17 12:01 ` Johannes Berg
2019-09-11 19:15 ` Dr. David Alan Gilbert [this message]
2019-09-11 19:18 ` Johannes Berg
2019-09-12 8:09 ` Dr. David Alan Gilbert
2019-09-12 8:13 ` Johannes Berg
2019-09-11 13:45 ` [Qemu-devel] [RFC v2 2/2] libvhost-user: implement in-band notifications Johannes Berg
2019-09-11 15:01 ` [Qemu-devel] [RFC v2 0/2] vhost-user: " no-reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190911191514.GA2895@work-vm \
--to=dgilbert@redhat.com \
--cc=johannes.berg@intel.com \
--cc=johannes@sipsolutions.net \
--cc=mst@redhat.com \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).