From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Robin Geuze <robing@transip.nl>,
jasowang@redhat.com, qemu-devel@nongnu.org,
Cornelia Huck <cornelia.huck@de.ibm.com>
Subject: Re: [Qemu-devel] Bug in virtio_net_load
Date: Thu, 30 Jun 2016 18:29:09 +0100 [thread overview]
Message-ID: <20160630172909.GF2683@work-vm> (raw)
In-Reply-To: <20160630200609-mutt-send-email-mst@redhat.com>
* Michael S. Tsirkin (mst@redhat.com) wrote:
> On Thu, Jun 30, 2016 at 10:34:51AM +0200, Robin Geuze wrote:
> > Hey,
> >
> > I work for TransIP and we host a VPS platform based on QEMU/KVM. We are
> > currently running qemu 2.4.0. A few days ago we noticed that live migrations
> > for some of our VM's would fail. Further investigation turned out it was
> > specific to windows server 2012, caused by the fact that the standard virtio
> > driver from RedHat was replaced in windows updates by a driver called
> > "Midfin eFabric" (this driver doesn't really seem to be meant for virtio, we
> > have a case running at MicroSoft about that). Once we knew how to reproduce
> > we tested this on QEMU 2.6.0 as well and it also seems to be affected
> > (later we found out that 2.4.0 to 2.6.0 migration does work probably due to
> > pure luck).
> >
> > We started investigating the problem in QEMU 2.4.0 and noticed it was caused
> > by the fact that virtio_net_device_load requires certain feature flags to be
> > set, specifically to load curr_guest_offloads which is only written and read
> > if the VIRTIO_NET_F_CTRL_GUEST_OFFLOADS flag is set, but those flags are set
> > in virtio_load after the call to virtio_net_device_load. Moving the code
> > setting the feature flags before the call to virtio_net_device_load fixes
> > it, however it introduces another problem. Virtio can have 64-bits feature
> > flags, however the standard save payload for virtio only has space for
> > 32-bits feature flags. This was solved by putting those in a subsection of
> > the vmstate_save_state stuff. Unfortunately this is called (and thus binary
> > offset located) after the virtio_net_device_load code.
> >
> > There was an attempt to fix this in QEMU 2.6.0. However, this seems to have
> > broken it worse. The write code (virtio_net_save, virtio_save and
> > virtio_net_save_device) still puts the curr_guest_offloads value before the
> > vmstate_save_state data. However the read code expects and tries to read it
> > after the vmstate_save_state data. Should we just also change the
> > virtio_net_save code to have it follow the same order as virtio_net_load? Or
> > will this potentially break more stuff.
> >
> > Regards,
> >
> > Robin Geuze
> >
> > TransIP BV
>
> After going over it several times, I think the change in 2.6
> was wrong
>
>
>
> commit 1f8828ef573c83365b4a87a776daf8bcef1caa21
> Author: Jason Wang <jasowang@redhat.com>
> Date: Fri Sep 11 16:01:56 2015 +0800
>
> virtio-net: unbreak self announcement and guest offloads after migration
>
> After commit 019a3edbb25f1571e876f8af1ce4c55412939e5d ("virtio: make
> features 64bit wide"). Device's guest_features was actually set after
> vdc->load(). This breaks the assumption that device specific load()
> function can check guest_features. For virtio-net, self announcement
> and guest offloads won't work after migration.
>
> Fixing this by defer them to virtio_net_load() where guest_features
> were guaranteed to be set. Other virtio devices looks fine.
>
> Fixes: 019a3edbb25f1571e876f8af1ce4c55412939e5d
> ("virtio: make features 64bit wide")
> Cc: qemu-stable@nongnu.org
> Cc: Gerd Hoffmann <kraxel@redhat.com>
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
>
>
> I'm not sure what was I thinking when I applied this:
> it changes load without changing save - how can this work?
>
>
> I am inclined to revert 1f8828ef573c83365b4a87a776daf8bcef1caa21 and
> apply this instead:
>
> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> index 7ed06ea..18153d5 100644
> --- a/hw/virtio/virtio.c
> +++ b/hw/virtio/virtio.c
> @@ -1499,6 +1499,16 @@ int virtio_load(VirtIODevice *vdev, QEMUFile *f, int version_id)
> }
> qemu_get_be32s(f, &features);
>
> + /*
> + * Temporarily set guest_features low bits - needed by
> + * virtio net load code testing for VIRTIO_NET_F_CTRL_GUEST_OFFLOADS
> + * VIRTIO_NET_F_GUEST_ANNOUNCE and VIRTIO_NET_F_CTRL_VQ.
> + *
> + * Note: devices should always test host features in future - don't create
> + * new dependencies like this.
> + */
> + vdev->guest_features = features;
> +
> config_len = qemu_get_be32(f);
>
> /*
>
> Could you please confirm whether this help?
> Jason, Cornelia - any comments?
>
> David, if this goes in I'm afraid your patchset reworking
> save/load will have to be rebased, but I think we want
> the bugfix first and new features/changes second.
> Do you agree?
Yes, bug fixes first. But actually the merge is trivial, the only
place I think they collide is in virtio_net_load and it's just the difference
of the return virtio_load vs the ret = virtio_load in the patch you revert.
Dave
> --
> MST
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
next prev parent reply other threads:[~2016-06-30 17:29 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-06-30 8:34 [Qemu-devel] Bug in virtio_net_load Robin Geuze
2016-06-30 17:23 ` Michael S. Tsirkin
2016-06-30 17:29 ` Dr. David Alan Gilbert [this message]
2016-07-01 2:35 ` Jason Wang
2016-07-01 8:48 ` Cornelia Huck
2016-07-01 8:54 ` Robin Geuze
2016-07-04 7:11 ` Robin Geuze
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160630172909.GF2683@work-vm \
--to=dgilbert@redhat.com \
--cc=cornelia.huck@de.ibm.com \
--cc=jasowang@redhat.com \
--cc=mst@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=robing@transip.nl \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.