From: Cornelia Huck <cornelia.huck@de.ibm.com>
To: Michael Roth <mdroth@linux.vnet.ibm.com>
Cc: Maxime Coquelin <maxime.coquelin@redhat.com>,
mst@redhat.com, qemu-devel@nongnu.org, marcel@redhat.com,
eblake@redhat.com, qemu-stable@nongnu.org, stefanha@redhat.com,
dgilbert@redhat.com
Subject: Re: [Qemu-devel] [Qemu-stable] [PATCH v2] virtio-bus: Plug devices after features are negotiated
Date: Wed, 14 Dec 2016 08:44:51 +0100 [thread overview]
Message-ID: <20161214084451.2a5236d2.cornelia.huck@de.ibm.com> (raw)
In-Reply-To: <20161213212745.1976.91005@loki>
On Tue, 13 Dec 2016 15:27:45 -0600
Michael Roth <mdroth@linux.vnet.ibm.com> wrote:
> Quoting Maxime Coquelin (2016-09-13 08:30:30)
> > Currently, devices are plugged before features are negotiated.
> > If the backend doesn't support VIRTIO_F_VERSION_1, the transport
> > needs to rewind some settings.
> >
> > This is the case for CCW, for which a post_plugged callback had
> > been introduced, where max_rev field is just updated if
> > VIRTIO_F_VERSION_1 is not supported by the backend.
> > For PCI, implementing post_plugged would be much more
> > complicated, so it needs to know whether the backend supports
> > VIRTIO_F_VERSION_1 at plug time.
> >
> > Currently, nothing is done for PCI. Modern capabilities get
> > exposed to the guest even if VIRTIO_F_VERSION_1 is not supported
> > by the backend, which confuses the guest.
> >
> > This patch replaces existing post_plugged solution with an
> > approach that fits with both transports.
> > Features negotiation is performed before ->device_plugged() call.
> > A pre_plugged callback is introduced so that the transports can
> > set their supported features.
> >
> > Cc: Michael S. Tsirkin <mst@redhat.com>
> > Cc: qemu-stable@nongnu.org
> > Tested-by: Cornelia Huck <cornelia.huck@de.ibm.com> [ccw]
> > Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
> > Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
> > Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>
>
> It looks like this patch breaks migration under certain circumstances.
> One such scenario is migrating 2.7 -> 2.8.0-rc3 as detailed below on a
> host that doesn't have support for virtio-1 in vhost (which was
> introduced via 41e3e42 in kernel 3.18. In my case, I'm using Ubuntu
> 14.04, kernel 3.13):
>
> source (2.7.0):
>
> sudo build/v2.7.0/x86_64-softmmu/qemu-system-x86_64 -enable-kvm -L
> build/v2.7.0-bios -M pc-i440fx-2.7 -m 512M -drive
> file=/home/mdroth/vm/win7_home_sp1_virtio_testing.qcow2,if=virtio -vga
> cirrus -smp 4 -device virtio-net-pci,netdev=net0 -netdev
> tap,id=net0,vhost=on,script=/etc/qemu-ifup -monitor
> unix:/tmp/vm-hmp.sock,server,nowait -qmp
> unix:/tmp/vm-qmp.sock,server,nowait -vnc :200
>
> target (2.8.0-rc3):
>
> sudo build/v2.8.0-rc3/x86_64-softmmu/qemu-system-x86_64 -enable-kvm
> -L build/v2.8.0-rc3-bios -M pc-i440fx-2.7 -m 512M -drive
> file=/home/mdroth/vm/win7_home_sp1_virtio_testing.qcow2,if=virtio -vga
> cirrus -smp 4 -device virtio-net-pci,netdev=net0 -netdev
> tap,id=net0,vhost=on,script=/etc/qemu-ifup -monitor
> unix:/tmp/vm-hmp-incoming.sock,server,nowait -qmp
> unix:/tmp/vm-qmp-incoming.sock,server,nowait -vnc :201
> -incoming unix:/tmp/migrate.sock
>
> error on target after migration:
>
> qemu-system-x86_64: get_pci_config_device: Bad config data: i=0x34
> read: 98 device: 40 cmask: ff wmask: 0 w1cmask:0
> qemu-system-x86_64: Failed to load PCIDevice:config
> qemu-system-x86_64: Failed to load virtio-net:virtio
> qemu-system-x86_64: error while loading state for instance 0x0 of
> device '0000:00:03.0/virtio-net'
> qemu-system-x86_64: load of migration failed: Invalid argument
>
Ugh. Let me double-check what happens for ccw. I could have sworn I
tested this...
>
> Based on discussion on IRC (excerpts below), I think the new handling needs
> to be controllable via a machine option that can be disabled by default
> for older machine types. This is being considered a release blocker for
> 2.8 since there are still pre-3.18 LTS kernels in the wild.
>
>
> root-cause:
>
> 14:35 < stefanha> v3.13 will not work
> 14:35 < stefanha> VHOST_FEATURES = (1ULL << VIRTIO_F_NOTIFY_ON_EMPTY) |
> 14:35 < stefanha> (1ULL << VIRTIO_RING_F_INDIRECT_DESC) |
> 14:35 < stefanha> (1ULL << VIRTIO_RING_F_EVENT_IDX) |
> 14:35 < stefanha> (1ULL << VHOST_F_LOG_ALL),
> 14:35 < stefanha> The kernel side only knows about these guys
> 14:35 < davidgiluk> our downstream 3.10.0-524 seems to work - but that's probably got all the vhost stuff backported
> 14:35 < stefanha> plus these guys:
> 14:35 < stefanha> F VHOST_NET_FEATURES = VHOST_FEATURES |
> 14:35 < stefanha> (1ULL << VHOST_NET_F_VIRTIO_NET_HDR) |
> 14:35 < stefanha> (1ULL << VIRTIO_NET_F_MRG_RXBUF),
> 14:36 < stefanha> mdroth: So QEMU is going to check a list of feature bits including VERSION_1
> 14:36 < stefanha> and it will see that vhost doesn't support them.
> 14:36 < stefanha> So we're kind of doing the right thing now.
> 14:36 < stefanha> Before userspace was overriding the fact that vhost cannot handle VERSION_1.
> 14:36 < stefanha> ...except we broke migration
> 14:36 < davidgiluk> stefanha: But shouldn't it refuse to startup ?
> 14:36 < stefanha> Everything is perfect*
> 14:36 < stefanha> * except we broke migration
> 14:36 < stefanha> :)
>
> suggestions on how to fix this:
>
> 14:40 < stefanha> My understanding is this bug is vhost-specific. If you have an old kernel that doesn't support VERSION_1 vhost then migration will break.
> 14:41 < stefanha> The actual bug is in QEMU though, not vhost
> 14:42 < stefanha> vhost is reporting the truth. It's QEMU that has changed behavior.
> 14:44 < stefanha> mdroth: I think a lame fix would be to make virtio_pci_device_plugged() dependent on a flag that says: use old broken behavior instead of reporting the truth to the guest.
> 14:44 < stefanha> The flag could be enabled for all old machine types
> 14:44 < stefanha> and new machine types will report the truth.
> 14:44 < stefanha> That way migration continues to work.
I'll check whether we would need something for ccw as well.
> 14:44 < stefanha> Not sure if anyone can think of a nicer solution.
> 14:45 < stefanha> But we're going to have to keep lying to the guest if we want to preserve migration compatibility
> 14:45 < stefanha> The key change in behavior with the patch you identified is:
> 14:46 < stefanha> if (!virtio_has_feature(vdev->host_features, VIRTIO_F_VERSION_1)) {
> 14:46 < stefanha> virtio_pci_disable_modern(proxy);
> 14:46 < stefanha> Previously it didn't care about vdev->host_features. It simply allowed VERSION_1 when proxy's disable_modern boolean was false.
> 14:47 < mdroth> stefanha: ok, that explains why it seems to work with disable-modern=true
> 14:48 < stefanha> mdroth: Your Ubuntu kernel old but 14.04 LTS is definitely still around so I don't think we can ship QEMU 2.8 like this.
> 14:49 < stefanha> mdroth: Let's summarize it on the mailing list and see what Michael Tsirkin and Maxime Coquelin think.
> 14:49 < mdroth> stefanha: i suppose a potential workaround would be to tell users to set disable-modern= to match their vhost capabilities, but it's hard for them to apply that retroactively if they're looking to migrate
Another thought: Maybe this bug only surfaced right now because older
qemus defaulted virtio-pci to legacy?
(I think modern virtio-pci with old vhost resulted in a config that was
rejected at least by Linux guests. Because pci defaulted to legacy, we
only had the post-plugged workaround for ccw before.)
> 14:49 < stefanha> We can delay the release in the meantime.
> 14:50 < stefanha> mdroth: I don't think a workaround is going to be smooth in this case
> 14:50 < stefanha> People will only notice once migration fails,
> 14:50 < stefanha> and that's when you lose your VM state!
next prev parent reply other threads:[~2016-12-14 7:45 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-09-13 13:30 [Qemu-devel] [PATCH v2] virtio-bus: Plug devices after features are negotiated Maxime Coquelin
2016-12-13 21:27 ` [Qemu-devel] [Qemu-stable] " Michael Roth
2016-12-14 7:44 ` Cornelia Huck [this message]
2016-12-14 8:28 ` Maxime Coquelin
2016-12-14 8:41 ` Stefan Hajnoczi
2016-12-14 8:59 ` Cornelia Huck
2016-12-14 9:44 ` Maxime Coquelin
2016-12-14 9:50 ` Dr. David Alan Gilbert
2016-12-14 10:02 ` Maxime Coquelin
2016-12-14 10:08 ` Cornelia Huck
2016-12-14 11:48 ` Marcel Apfelbaum
2016-12-14 8:20 ` Maxime Coquelin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20161214084451.2a5236d2.cornelia.huck@de.ibm.com \
--to=cornelia.huck@de.ibm.com \
--cc=dgilbert@redhat.com \
--cc=eblake@redhat.com \
--cc=marcel@redhat.com \
--cc=maxime.coquelin@redhat.com \
--cc=mdroth@linux.vnet.ibm.com \
--cc=mst@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=qemu-stable@nongnu.org \
--cc=stefanha@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).