From: "Michael S. Tsirkin" <mst-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
To: Maxime Coquelin
	<maxime.coquelin-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: "dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org"
	<dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org>,
	Flavio Leitner <fleitner-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	"Daniel P. Berrange"
	<berrange-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	"dev-VfR2kkLFssw@public.gmane.org"
	<dev-VfR2kkLFssw@public.gmane.org>,
	Daniele Di Proietto
	<diproiettod-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>,
	"libvir-list-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org"
	<libvir-list-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Subject: Re: [RFC] Vhost-user backends cross-version migration support
Date: Fri, 3 Feb 2017 17:34:07 +0200
Message-ID: <20170203172140-mutt-send-email-mst@kernel.org>
In-Reply-To: <4cad5796-7024-4a48-a73a-8dd780259968-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

On Fri, Feb 03, 2017 at 03:11:10PM +0100, Maxime Coquelin wrote:
> Hi,
> 
> On 02/01/2017 09:35 AM, Maxime Coquelin wrote:
> > Hi,
> > 
> >  A few months ago, Michael reported a problem with migrating VMs relying
> > on vhost-user between hosts supporting different backend versions:
> >  - Message-Id: <20161011173526-mutt-send-email-mst-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> >  - https://lists.gnu.org/archive/html/qemu-devel/2016-10/msg03026.html
> > 
> >  The goal of this thread is to draft a proposal based on the outcomes
> > of discussions with contributors of the different parties (DPDK/OVS
> > /libvirt/...).
> 
> Thanks for the first feedback. The discussion seems to converge on this
> being Nova's role, not libvirt's, to manage these versions from the
> management tool layer.


I think the conclusion is not that it should go up the stack.  I think
this will just get broken all the time.  No one understands versions and
stuff. Even QEMU developers get confused and break compatibility once in
a while.

My conclusion is that doing it from OVS side is wrong.  Migration is not
an OVS thing, it's a QEMU thing, and libvirt abstracts QEMU.  People
just want migration to work, ok? It's our job to do it, we do not really
need a "make things work" flag.

If libvirt does not want to use the vhost-user protocol (which sounds
reasonable, it's rather complex) how about qemu providing a small
utility to query the port?  We could output json or whatever.

This can help with MTU as well.

And maybe it will help with nowait support - if someone uses the utility
to dump backend config once, QEMU can later start the device without
feature queries.
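
To make the utility idea a bit more concrete, here is a rough sketch of
what dumping a backend config as JSON could look like. Everything in it
is an assumption for illustration: struct backend_config, its fields,
the JSON key names and backend_config_to_json() do not exist in QEMU.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical snapshot of what a vhost-user backend reports. */
struct backend_config {
    uint64_t features;   /* Virtio feature bits offered by the backend */
    uint32_t max_queues; /* maximum number of queues */
    uint32_t mtu;        /* MTU reported by the backend */
};

/*
 * Format cfg as a single JSON object into buf.
 * Returns the number of characters written (excluding the NUL),
 * or -1 if buf is too small or formatting failed.
 */
static int backend_config_to_json(const struct backend_config *cfg,
                                  char *buf, size_t len)
{
    int n = snprintf(buf, len,
                     "{\"features\": \"0x%" PRIx64 "\", "
                     "\"max-queues\": %" PRIu32 ", "
                     "\"mtu\": %" PRIu32 "}",
                     cfg->features, cfg->max_queues, cfg->mtu);

    return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

The feature mask is printed as a hex string rather than a JSON number
because many JSON parsers do not preserve full 64-bit integers.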


> This change has no impact from the OVS perspective; the same requirements
> apply. I am interested in OVS contributors' feedback on the design
> proposal below.
> 
> Especially, I would like to have your opinion on the best way for OVS to
> expose its supported versions:
> - Static file generated at build time from version table described below
> - Entries in the OVS DB
> - Dedicated tool listing strings from the version table described below
> 
> For selecting the right version of the vhost-user backend, do you agree
> it should be done via a new parameter of the ovs-vsctl add-port command
> for dpdkvhostuser ports?
> 
> 
> > Problem statement:
> > ==================
> > 
> >  When migrating a VM from one host to another, the interfaces exposed by
> > QEMU must stay unchanged in order to guarantee a successful migration.
> > In the case of vhost-user interface, parameters like supported Virtio
> > feature set, max number of queues, max vring sizes,... must remain
> > compatible. Indeed, the frontend not being re-initialized, no
> > renegotiation happens at migration time.
> > 
> >  For example, we have a VM that runs on host A, which has its vhost-user
> > backend advertising the VIRTIO_F_RING_INDIRECT_DESC feature. Since the guest
> > also supports this feature, it is successfully negotiated, and the guest
> > transmits packets using indirect descriptor tables, which the backend
> > knows how to handle.
> > At some point, the VM is migrated to host B, which runs an older
> > version of the backend that does not support VIRTIO_F_RING_INDIRECT_DESC.
> > The migration would break, because the guest still has the
> > VIRTIO_F_RING_INDIRECT_DESC bit set, and the virtqueue contains some
> > descriptors pointing to indirect tables, which backend B does not know
> > how to handle.
> >  This is just an example about Virtio features compatibility, but other
> > backend implementation details could cause other failures.
> > 
> >  What we need is to be able to query the destination host's backend to
> > ensure migration is possible. Also, we would need to query this
> > statically, even before the VM is started, to be sure it could be
> > migrated elsewhere for any reason.
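
The feature part of the check described above boils down to a subset
test: every bit the guest negotiated on the source must also be offered
by the destination backend. A minimal sketch, with hypothetical helper
names that belong to no existing library:

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 28 is the indirect descriptor feature in the Virtio spec. */
#define INDIRECT_DESC_BIT 28

/* Negotiated feature bits that the destination backend does not offer. */
static uint64_t vhost_features_missing(uint64_t negotiated,
                                       uint64_t dest_features)
{
    return negotiated & ~dest_features;
}

/* True when no negotiated feature is missing on the destination. */
static bool vhost_features_migratable(uint64_t negotiated,
                                      uint64_t dest_features)
{
    return vhost_features_missing(negotiated, dest_features) == 0;
}
```

With host A having negotiated bit 28 and host B not offering it,
vhost_features_migratable() returns false, which is the failing case
from the example. This covers feature bits only; queue counts and ring
sizes would need their own comparisons.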
> 
> ...
> 
> > 
> > Solution 3: Libvirt queries OVS for vhost backend version string: *OK*
> > ======================================================================
> > 
> > 
> >  The idea is to have a table of supported versions, each associated with
> > key/value pairs. Libvirt could query the list of supported version
> > strings for each host, and select the first common one among all hosts.
> > 
> >  Then, libvirt would ask OVS to probe the vhost-user interfaces in the
> > selected version (compatibility mode). For example host A runs OVS-2.7,
> > and host B OVS-2.6. Host A's OVS-2.7 has an OVS-2.6 compatibility mode
> > (e.g. with indirect descriptors disabled), which should be selected at
> > vhost-user interface probe time.
> > 
> >  Advantage of doing so is that libvirt does not need any update if new
> > keys are introduced (i.e. it does not need to know how the new keys have
> > to be handled), all these checks remain in OVS's vhost-user implementation.
> > 
> >  Ideally, we would support per vhost-user interface compatibility mode,
> > which may have an impact also on DPDK API, as the Virtio feature update
> > API is global, and not per port.
> > 
> > - Implementation:
> > -----------------
> > 
> >  The goal here is just to illustrate this proposal; I'm sure you will have
> > good suggestions to improve it.
> >  In OVS vhost-user library, we would introduce a new structure, for
> > example (neither compiled nor tested):
> > 
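> > struct vhostuser_compat {
> >  const char *version;       /* compatibility version string */
> >  uint64_t virtio_features;  /* Virtio feature set for this version */
> >  uint32_t max_rx_queue_sz;  /* example of a non-feature parameter */
> >  uint32_t max_nr_queues;    /* example of a non-feature parameter */
> > };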
> > 
> >  *version* field is the compatibility version string.
> >   It could be something like: "upstream.ovs-dpdk.v2.6"
> >   In case for example Fedora adds some more patches to its
> >   package that would break migration to upstream version, it could have
> >   a dedicated compatibility string: "fc26.ovs-dpdk.v2.6".
> >   In case OVS-v2.7 does not break compatibility with previous OVS-v2.6
> >   version, then no need to create a new compatibility entry, just keep
> >   v2.6 one.
> > 
> >  *virtio_features* field is the Virtio feature set for a given
> >   compatibility version. When an OVS tag is to be created, it would be
> >   associated with a DPDK version. The Virtio features for this version
> >   would be stored in this field. This would allow upgrading the DPDK
> >   package, for example from v16.07 to v16.11, without breaking migration.
> >   In case the distribution wants to benefit from the latest Virtio
> >   features, it would have to create a new entry to ensure migration
> >   won't be broken.
> > 
> >  *max_rx_queue_sz*
> >  *max_nr_queues* fields are just here for example, don't think this is
> >   needed today. I just want to illustrate that we have to anticipate
> >   other parameters than the Virtio feature set, even if not necessary
> >   at the moment.
> > 
> >  We create a table with different compatibility versions in OVS
> > vhost-user lib:
> > 
> > static struct vhostuser_compat vu_compat[] = {
> >  {
> >    .version = "upstream.ovs-dpdk.v2.7",
> >    .virtio_features = 0x12045694,
> >    .max_rx_queue_sz = 512,
> >  },
> >  {
> >    .version = "upstream.ovs-dpdk.v2.6",
> >    .virtio_features = 0x10045694,
> >    .max_rx_queue_sz = 1024,
> >  },
> > };
> > 
> >  At some time during installation, or system init, the table would be
> > parsed, and compatibility version strings would be stored into the OVS
> > database, or a new tool would be created to list these strings.
> > 
> >  Before launching the VM, libvirt will query the version strings for
> > each host using for example the JSON RPC API of OVS (maybe not the best
> > solution, looking forward to your comments on this). Libvirt would then
> > select the first common supported version, and insert this string into
> > the vhost-user interface parameters in the OVS DBs of each host.
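
The selection step could be sketched as follows, assuming each host
reports its version strings in preference order (newest first). The
struct and function names are hypothetical and taken from neither
libvirt nor OVS.

```c
#include <stddef.h>
#include <string.h>

/* Version strings one host supports, most preferred first. */
struct host_versions {
    const char *const *versions;
    size_t count;
};

/* Return 1 if host h lists version v, 0 otherwise. */
static int host_supports(const struct host_versions *h, const char *v)
{
    for (size_t i = 0; i < h->count; i++) {
        if (strcmp(h->versions[i], v) == 0) {
            return 1;
        }
    }
    return 0;
}

/*
 * Walk the first host's list in order and return the first version
 * that every other host also supports, or NULL if there is none.
 */
static const char *first_common_version(const struct host_versions *hosts,
                                        size_t nr_hosts)
{
    if (nr_hosts == 0) {
        return NULL;
    }
    for (size_t i = 0; i < hosts[0].count; i++) {
        const char *candidate = hosts[0].versions[i];
        size_t h;

        for (h = 1; h < nr_hosts; h++) {
            if (!host_supports(&hosts[h], candidate)) {
                break;
            }
        }
        if (h == nr_hosts) {
            return candidate;
        }
    }
    return NULL;
}
```

A NULL result means the hosts share no compatibility version, so the
VM could not be guaranteed to migrate between them.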
> > 
> >  When the vhost-user connection is initiated, OVS would know in which
> > compatibility mode to init the interface, for example by restricting
> > the supported Virtio features of the interface.
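
Restricting the features at connection time could look roughly like
this. The struct and table are copied from the proposal above;
vu_compat_lookup() and vu_compat_restrict() are hypothetical helpers,
not existing OVS or DPDK functions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct vhostuser_compat {
    const char *version;
    uint64_t virtio_features;
    uint32_t max_rx_queue_sz;
    uint32_t max_nr_queues;
};

/* Illustrative values taken from the proposal's table. */
static const struct vhostuser_compat vu_compat[] = {
    { .version = "upstream.ovs-dpdk.v2.7",
      .virtio_features = 0x12045694, .max_rx_queue_sz = 512 },
    { .version = "upstream.ovs-dpdk.v2.6",
      .virtio_features = 0x10045694, .max_rx_queue_sz = 1024 },
};

/* Find the table entry for a version string, or NULL if unknown. */
static const struct vhostuser_compat *vu_compat_lookup(const char *version)
{
    for (size_t i = 0; i < sizeof vu_compat / sizeof vu_compat[0]; i++) {
        if (strcmp(vu_compat[i].version, version) == 0) {
            return &vu_compat[i];
        }
    }
    return NULL;
}

/*
 * Mask the backend's native features down to the selected compat
 * version. Returns 0 and stores the mask in *out, or -1 if the
 * version string is not in the table.
 */
static int vu_compat_restrict(const char *version,
                              uint64_t backend_features, uint64_t *out)
{
    const struct vhostuser_compat *c = vu_compat_lookup(version);

    if (c == NULL) {
        return -1;
    }
    *out = backend_features & c->virtio_features;
    return 0;
}
```

An unknown version string is treated as an error here instead of
falling back to the full feature set, since a silent fallback would
reintroduce the migration problem.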
> > 
> >  Do you think this is reasonable? Or maybe you have alternative ideas
> > that would be a better fit to ensure successful migration?
> 
> Thanks,
> Maxime
