From: "Michael S. Tsirkin" <mst@redhat.com>
To: Denis Plotnikov <den-plotnikov@yandex-team.ru>
Cc: qemu-devel@nongnu.org, yc-core@yandex-team.ru
Subject: Re: [PING][PING] [PATCH v4] vhost: make SET_VRING_ADDR, SET_FEATURES send replies
Date: Mon, 16 Aug 2021 07:50:30 -0400 [thread overview]
Message-ID: <20210816074947-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <b83030bd-26d5-b26e-742b-185e9120a1cc@yandex-team.ru>
On Mon, Aug 16, 2021 at 09:53:27AM +0300, Denis Plotnikov wrote:
>
> On 12.08.2021 11:04, Denis Plotnikov wrote:
> >
> > On 09.08.2021 13:48, Denis Plotnikov wrote:
> > > On vhost-user-blk migration, qemu normally sends a number of commands
> > > to enable logging if VHOST_USER_PROTOCOL_F_LOG_SHMFD is negotiated.
> > > Qemu sends VHOST_USER_SET_FEATURES to enable buffers logging and
> > > VHOST_USER_SET_VRING_ADDR per each started ring to enable "used ring"
> > > data logging.
> > > The issue is that qemu doesn't wait for reply from the vhost daemon
> > > for these commands which may result in races between qemu expectation
> > > of logging starting and actual login starting in vhost daemon.
> > >
> > > The race can appear as follows: on migration setup, qemu enables
> > > dirty page
> > > logging by sending VHOST_USER_SET_FEATURES. The command doesn't
> > > arrive to a
> > > vhost-user-blk daemon immediately and the daemon needs some time to
> > > turn the
> > > logging on internally. If qemu doesn't wait for reply, after sending the
> > > command, qemu may start migrateing memory pages to a destination. At
> > > this time,
> > > the logging may not be actually turned on in the daemon but some
> > > guest pages,
> > > which the daemon is about to write to, may have already been transferred
> > > without logging to the destination. Since the logging wasn't turned on,
> > > those pages won't be transferred again as dirty. So we may end up with
> > > corrupted data on the destination.
> > > The same scenario is applicable for "used ring" data logging, which is
> > > turned on with VHOST_USER_SET_VRING_ADDR command.
> > >
> > > To resolve this issue, this patch makes qemu wait for the command result
> > > explicitly if VHOST_USER_PROTOCOL_F_REPLY_ACK is negotiated and
> > > logging enabled.
> > >
> > > Signed-off-by: Denis Plotnikov <den-plotnikov@yandex-team.ru>
This looks reasonable. This change is too scary for 6.1 so
I think it should wait for 6.2.
Thanks!
> > > ---
> > > v3 -> v4:
> > > * join acked reply and get_features in enforce_reply [mst]
> > > * typos, rewording, cosmetic changes [mst]
> > >
> > > v2 -> v3:
> > > * send VHOST_USER_GET_FEATURES to flush out outstanding messages
> > > [mst]
> > >
> > > v1 -> v2:
> > > * send reply only when logging is enabled [mst]
> > >
> > > v0 -> v1:
> > > * send reply for SET_VRING_ADDR, SET_FEATURES only [mst]
> > > ---
> > > hw/virtio/vhost-user.c | 139 +++++++++++++++++++++++++++++------------
> > > 1 file changed, 98 insertions(+), 41 deletions(-)
> > >
> > > diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> > > index ee57abe04526..5bb9254acd21 100644
> > > --- a/hw/virtio/vhost-user.c
> > > +++ b/hw/virtio/vhost-user.c
> > > @@ -1095,23 +1095,6 @@ static int vhost_user_set_mem_table(struct
> > > vhost_dev *dev,
> > > return 0;
> > > }
> > > -static int vhost_user_set_vring_addr(struct vhost_dev *dev,
> > > - struct vhost_vring_addr *addr)
> > > -{
> > > - VhostUserMsg msg = {
> > > - .hdr.request = VHOST_USER_SET_VRING_ADDR,
> > > - .hdr.flags = VHOST_USER_VERSION,
> > > - .payload.addr = *addr,
> > > - .hdr.size = sizeof(msg.payload.addr),
> > > - };
> > > -
> > > - if (vhost_user_write(dev, &msg, NULL, 0) < 0) {
> > > - return -1;
> > > - }
> > > -
> > > - return 0;
> > > -}
> > > -
> > > static int vhost_user_set_vring_endian(struct vhost_dev *dev,
> > > struct vhost_vring_state *ring)
> > > {
> > > @@ -1288,72 +1271,146 @@ static int vhost_user_set_vring_call(struct
> > > vhost_dev *dev,
> > > return vhost_set_vring_file(dev, VHOST_USER_SET_VRING_CALL, file);
> > > }
> > > -static int vhost_user_set_u64(struct vhost_dev *dev, int request,
> > > uint64_t u64)
> > > +
> > > +static int vhost_user_get_u64(struct vhost_dev *dev, int request,
> > > uint64_t *u64)
> > > {
> > > VhostUserMsg msg = {
> > > .hdr.request = request,
> > > .hdr.flags = VHOST_USER_VERSION,
> > > - .payload.u64 = u64,
> > > - .hdr.size = sizeof(msg.payload.u64),
> > > };
> > > + if (vhost_user_one_time_request(request) && dev->vq_index != 0) {
> > > + return 0;
> > > + }
> > > +
> > > if (vhost_user_write(dev, &msg, NULL, 0) < 0) {
> > > return -1;
> > > }
> > > + if (vhost_user_read(dev, &msg) < 0) {
> > > + return -1;
> > > + }
> > > +
> > > + if (msg.hdr.request != request) {
> > > + error_report("Received unexpected msg type. Expected %d
> > > received %d",
> > > + request, msg.hdr.request);
> > > + return -1;
> > > + }
> > > +
> > > + if (msg.hdr.size != sizeof(msg.payload.u64)) {
> > > + error_report("Received bad msg size.");
> > > + return -1;
> > > + }
> > > +
> > > + *u64 = msg.payload.u64;
> > > +
> > > return 0;
> > > }
> > > -static int vhost_user_set_features(struct vhost_dev *dev,
> > > - uint64_t features)
> > > +static int vhost_user_get_features(struct vhost_dev *dev, uint64_t
> > > *features)
> > > {
> > > - return vhost_user_set_u64(dev, VHOST_USER_SET_FEATURES, features);
> > > + return vhost_user_get_u64(dev, VHOST_USER_GET_FEATURES, features);
> > > }
> > > -static int vhost_user_set_protocol_features(struct vhost_dev *dev,
> > > - uint64_t features)
> > > +static int enforce_reply(struct vhost_dev *dev,
> > > + const VhostUserMsg *msg)
> > > {
> > > - return vhost_user_set_u64(dev,
> > > VHOST_USER_SET_PROTOCOL_FEATURES, features);
> > > + uint64_t dummy;
> > > +
> > > + if (msg->hdr.flags & VHOST_USER_NEED_REPLY_MASK) {
> > > + return process_message_reply(dev, msg);
> > > + }
> > > +
> > > + /*
> > > + * We need to wait for a reply but the backend does not
> > > + * support replies for the command we just sent.
> > > + * Send VHOST_USER_GET_FEATURES which makes all backends
> > > + * send a reply.
> > > + */
> > > + return vhost_user_get_features(dev, &dummy);
> > > }
> > > -static int vhost_user_get_u64(struct vhost_dev *dev, int request,
> > > uint64_t *u64)
> > > +static int vhost_user_set_vring_addr(struct vhost_dev *dev,
> > > + struct vhost_vring_addr *addr)
> > > {
> > > VhostUserMsg msg = {
> > > - .hdr.request = request,
> > > + .hdr.request = VHOST_USER_SET_VRING_ADDR,
> > > .hdr.flags = VHOST_USER_VERSION,
> > > + .payload.addr = *addr,
> > > + .hdr.size = sizeof(msg.payload.addr),
> > > };
> > > - if (vhost_user_one_time_request(request) && dev->vq_index != 0) {
> > > - return 0;
> > > + bool reply_supported = virtio_has_feature(dev->protocol_features,
> > > + VHOST_USER_PROTOCOL_F_REPLY_ACK);
> > > +
> > > + /*
> > > + * wait for a reply if logging is enabled to make sure
> > > + * backend is actually logging changes
> > > + */
> > > + bool wait_for_reply = addr->flags & (1 << VHOST_VRING_F_LOG);
> > > +
> > > + if (reply_supported && wait_for_reply) {
> > > + msg.hdr.flags |= VHOST_USER_NEED_REPLY_MASK;
> > > }
> > > if (vhost_user_write(dev, &msg, NULL, 0) < 0) {
> > > return -1;
> > > }
> > > - if (vhost_user_read(dev, &msg) < 0) {
> > > - return -1;
> > > + if (wait_for_reply) {
> > > + return enforce_reply(dev, &msg);
> > > }
> > > - if (msg.hdr.request != request) {
> > > - error_report("Received unexpected msg type. Expected %d
> > > received %d",
> > > - request, msg.hdr.request);
> > > - return -1;
> > > + return 0;
> > > +}
> > > +
> > > +static int vhost_user_set_u64(struct vhost_dev *dev, int request,
> > > uint64_t u64,
> > > + bool wait_for_reply)
> > > +{
> > > + VhostUserMsg msg = {
> > > + .hdr.request = request,
> > > + .hdr.flags = VHOST_USER_VERSION,
> > > + .payload.u64 = u64,
> > > + .hdr.size = sizeof(msg.payload.u64),
> > > + };
> > > +
> > > + if (wait_for_reply) {
> > > + bool reply_supported =
> > > virtio_has_feature(dev->protocol_features,
> > > + VHOST_USER_PROTOCOL_F_REPLY_ACK);
> > > + if (reply_supported) {
> > > + msg.hdr.flags |= VHOST_USER_NEED_REPLY_MASK;
> > > + }
> > > }
> > > - if (msg.hdr.size != sizeof(msg.payload.u64)) {
> > > - error_report("Received bad msg size.");
> > > + if (vhost_user_write(dev, &msg, NULL, 0) < 0) {
> > > return -1;
> > > }
> > > - *u64 = msg.payload.u64;
> > > + if (wait_for_reply) {
> > > + return enforce_reply(dev, &msg);
> > > + }
> > > return 0;
> > > }
> > > -static int vhost_user_get_features(struct vhost_dev *dev,
> > > uint64_t *features)
> > > +static int vhost_user_set_features(struct vhost_dev *dev,
> > > + uint64_t features)
> > > {
> > > - return vhost_user_get_u64(dev, VHOST_USER_GET_FEATURES, features);
> > > + /*
> > > + * wait for a reply if logging is enabled to make sure
> > > + * backend is actually logging changes
> > > + */
> > > + bool log_enabled = features & (0x1ULL << VHOST_F_LOG_ALL);
> > > +
> > > + return vhost_user_set_u64(dev, VHOST_USER_SET_FEATURES, features,
> > > + log_enabled);
> > > +}
> > > +
> > > +static int vhost_user_set_protocol_features(struct vhost_dev *dev,
> > > + uint64_t features)
> > > +{
> > > + return vhost_user_set_u64(dev,
> > > VHOST_USER_SET_PROTOCOL_FEATURES, features,
> > > + false);
> > > }
> > > static int vhost_user_set_owner(struct vhost_dev *dev)
prev parent reply other threads:[~2021-08-16 11:51 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-08-09 10:48 [PATCH v4] vhost: make SET_VRING_ADDR, SET_FEATURES send replies Denis Plotnikov
2021-08-12 8:04 ` [PING] " Denis Plotnikov
2021-08-16 6:53 ` [PING][PING] " Denis Plotnikov
2021-08-16 11:50 ` Michael S. Tsirkin [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210816074947-mutt-send-email-mst@kernel.org \
--to=mst@redhat.com \
--cc=den-plotnikov@yandex-team.ru \
--cc=qemu-devel@nongnu.org \
--cc=yc-core@yandex-team.ru \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).