qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: "Daniel P. Berrangé" <berrange@redhat.com>
To: Juan Quintela <quintela@redhat.com>
Cc: qemu-devel@nongnu.org, "Lukas Straub" <lukasstraub2@web.de>,
	"Vladimir Sementsov-Ogievskiy" <vsementsov@yandex-team.ru>,
	"Avihai Horon" <avihaih@nvidia.com>,
	"Alex Bennée" <alex.bennee@linaro.org>,
	"Leonardo Bras" <leobras@redhat.com>,
	"Thomas Huth" <thuth@redhat.com>, "Peter Xu" <peterx@redhat.com>,
	"Markus Armbruster" <armbru@redhat.com>
Subject: Re: [PATCH] migration: Add documentation for backwards compatiblity
Date: Thu, 11 May 2023 09:10:23 +0100	[thread overview]
Message-ID: <ZFyi73w855EsDmEy@redhat.com> (raw)
In-Reply-To: <20230510195341.7591-1-quintela@redhat.com>

On Wed, May 10, 2023 at 09:53:41PM +0200, Juan Quintela wrote:
> State what are the requeriments to get migration working between qemu
> versions.  And once there explain how one is supposed to implement a
> new feature/default value and not break migration.
> 
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> 
> ---
> 
> Hi
> 
> I will really appreciate reviews:
> 
> - I don't speak natively .rst format, so let me what I have done
>   wrong.
> 
> - English is not my native language either (no points if had guessed
>   that).
> 
> - This is stuff is obvious to me, so let me when I have assumed
>   things, things that need to be claried, explained better, etc.
> 
> Thanks, Juan.
> ---
>  docs/devel/migration.rst | 212 +++++++++++++++++++++++++++++++++++++++
>  1 file changed, 212 insertions(+)
> 
> diff --git a/docs/devel/migration.rst b/docs/devel/migration.rst
> index 6f65c23b47..daa510da42 100644
> --- a/docs/devel/migration.rst
> +++ b/docs/devel/migration.rst
> @@ -142,6 +142,218 @@ General advice for device developers
>    may be different on the destination.  This can result in the
>    device state being loaded into the wrong device.
>  
> +How backwards compatibility work

s/work/works/

> +--------------------------------
> +
> +When we do migration, we have to qemus: source and target qemu.  There

'have two qemu process, the source and the target'

> +are two cases, they are the same version or they are a different
> +version.  The easy case is when they are the same version.  The
> +difficult one is when they are different versions.
> +
> +There are two things that are different, but they have very similar
> +names and sometimes get confused:
> +- qemu version
> +- machine version
> +
> +Let's start with a practical example, we start with:
> +
> +- qemu-system-x86_64 (v5.2), from now one qemu-5.2.
> +- qemu-system-x86_64 (v5.1), from now one qemu-5.1.

s/one/on/

> +Related to this are the "latest" machine types defined on each of
> +them:
> +
> +- pc-q35-5.2 (newer one in qemu-5.2) from now on pc-5.2
> +- pc-q35-5.1 (newer one qemu-5.1) from now on pc-5.1
> +
> +First of all, migration is only supposed to work if you use the same
> +machine type in both source and destination. The qemu configuration

s/configuration/hardware configuration/ - most aspects of the backend
configuration can be changed at will, except for a few cases where
the backend features influence frontend device feature exposure.

> +needs to be the same also on source and destination.  But that is not
> +relevant for this section.
> +
> +I am going to list the number of combinations that we can have.  Let's
> +start with the trivial ones, qemu is the same on source and
> +destination:
> +
> +1 - qemu-5.2 -M pc-5.2  -> migrates to -> qemu-5.2 -M pc-5.2
> +
> +  This is the latest qemu with the latest machine type.
> +  This have to work, and if it don't work it is a bug.
> +
> +2 - qemu-5.1 -M pc-5.1  -> migrates to -> qemu-5.1 -M pc-5.1
> +
> +  Exactly the same case than the previous one, but for 5.1.
> +  Nothing to see here either.
> +
> +This are the easiest ones, we will not talk more about them in this
> +section.
> +
> +Now we start with the more interesting cases.  Let start with the
> +same qemu but not the same machine type.
> +
> +3 - qemu-5.2 -M pc-5.1  -> migrates to -> qemu-5.2 -M pc-5.1
> +
> +  It needs to use the definition of pc-5.1 and the devices as they
> +  were configured on 5.1, but this should be easy in the sense that
> +  both sides are the same qemu and both sides have exactly the same
> +  idea of what the pc-5.1 machine is.
> +
> +4 - qemu-5.1 -M pc-5.2  -> migrates to -> qemu-5.1 -M pc-5.2
> +
> +  This combination is not possible as the qemu-5.1 don't understand
> +  pc-5.2 machine type.  So nothing to worry here.
> +
> +Now it comes the interesting ones, when both qemus are different.
> +Notice also that the machine type needs to be pc-5.1, because we have
> +the limitation than qemu-5.1 don't know pc-5.2.  So the possible cases
> +are:
> +
> +5 - qemu-5.2 -M pc-5.1  -> migrates to -> qemu-5.1 -M pc-5.1
> +
> +  This migration is known as newer to older.  We need to make sure
> +  when we are developing 5.2 we need to take care about not to break
> +  migration to qemu-5.1.  Notice that we can't make updates to
> +  qemu-5.1 to understand whatever qemu-5.2 decides to change, so it is
> +  in qemu-5.2 side to make the relevant changes.
> +
> +6 - qemu-5.1 -M pc-5.1  -> migrates to -> qemu-5.2 -M pc-5.1
> +
> +  This migration is known as older to newer.  We need to make sure
> +  than we are able to receive migrations from qemu-5.1. The problem is
> +  similar to the previous one.
> +
> +If qemu-5.1 and qemu-5.2 were the same, there will not be any
> +compatibility problems.  But the reason that we create qemu-5.2 is to
> +get new features, devices, defaults, etc.
> +
> +If we get a device that get a new feature, or change a default value,

s/get a new/has a new/

> +we have a problem when we try to migrate between different qemu
> +versions.
> +
> +So we need a way to tell qemu-5.2 than when we are using machine type

s/than when/that when/

> +pc-5.1, it needs to **not** use the feature, to be able to migrate to
> +read qemu-5.1.

s/read/real/

> +
> +And the equivalent part when migrating from qemu-5.1 to qemu-5.2.
> +qemu-5.2 have to expect that it is not going to get data for the new

s/have/has/

> +feature, because qemu-5.1 don't know about it.
> +
> +How do we tell qemu about this device feature changes?  In

s/this/these/

> +hw/core/machine.c:hw_compat_X_Y arrays.
> +
> +If we change a default value, we need to put back the old value on

s/on that array/in that array/

> +that array.  And the device, during initialization needs to look at
> +that array to see what value it needs to get for that feature.  And
> +what are we going to put on that array, the value of a property.
> +
> +To create a property for a device, we need to use one of the
> +DEFINE_PROP_*() macros. See include/hw/qdev-properties.h to find the
> +macros that exist.  With it, we set the default value for that
> +property, and that is what it is going to get in the latest released
> +version.  But if we want a different value for a previous version, we
> +can change that in the hw_compat_X_Y arrays.
> +
> +hw_compat_X_Y is an array of registers that have the format:
> +
> +- name_device
> +- name_property
> +- value
> +
> +Let's see a practical example.
> +
> +In qemu-5.2 virtio-blk-device got multi queue support.  This is a
> +change that is not backward compatible.  In qemu-5.1 it has one
> +queue. In qemu-5.2 it has the same number of queues than the number of

s/than the number/as the number/

> +cpus in the system.
> +
> +When we are doing migration, if we migrate from a device that has 4
> +queues to a device that have only one queue, we don't know where to
> +put the extra information for the other 3 queues, and we fail
> +migration.
> +
> +Similar problem when we migrate from qemu-5.1 that has only one queue
> +to qemu-5.2, we only sent information for one queue, but destination
> +has 4, and we have 3 queues that are not properly initialized and
> +anything can happen.
> +
> +So, how can we address this problem.  Easy, just convince qemu-5.2
> +that when it is running pc-5.1, it needs to set the number of queues
> +for virtio-blk-devices to 1.
> +
> +That way we fix the cases 5 and 6.
> +
> +5 - qemu-5.2 -M pc-5.1  -> migrates to -> qemu-5.1 -M pc-5.1
> +
> +    qemu-5.2 -M pc-5.1 sets number of queues to be 1.
> +    qemu-5.1 -M pc-5.1 expects number of queues to be 1.
> +
> +    correct.  migration works.
> +
> +6 - qemu-5.1 -M pc-5.1  -> migrates to -> qemu-5.2 -M pc-5.1
> +
> +    qemu-5.1 -M pc-5.1 sets number of queues to be 1.
> +    qemu-5.2 -M pc-5.1 expects number of queues to be 1.
> +
> +    correct.  migration works.
> +
> +And now the other interesting case, case 3.  In this case we have:
> +
> +3 - qemu-5.2 -M pc-5.1  -> migrates to -> qemu-5.2 -M pc-5.1
> +
> +    Here we have the same qemu in both sides.  So it don't matter a

s/don't/doesn't/

> +    lot if we have setup the number of queues to 1 or not, because

s/setup/set/

> +    they are the same.
> +
> +    WRONG!
> +
> +    Think what happens if we do one of this double migrations:
> +
> +    A -> migrates -> B -> migrates -> C
> +
> +    where:
> +
> +    A: qemu-5.1 -M pc-5.1
> +    B: qemu-5.2 -M pc-5.1
> +    C: qemu-5.2 -M pc-5.1
> +
> +    migration A -> B is case 6, so number of queues needs to be 1.
> +
> +    migration B -> C is case 3, so we don't care.  But actually we
> +    care because we haven't started the guest in qemu-5.2, it came
> +    migrated from qemu-5.1.  So to be in the safe place, we need to
> +    always use number of queues 1 when we are using pc-5.1.
> +
> +Now, how was this done in reality?  The following commit shows how it
> +was done.
> +
> +commit 9445e1e15e66c19e42bea942ba810db28052cd05
> +Author: Stefan Hajnoczi <stefanha@redhat.com>
> +Date:   Tue Aug 18 15:33:47 2020 +0100
> +
> +    virtio-blk-pci: default num_queues to -smp N
> +
> +The relevant parts for migration are:
> +
> +@@ -1281,7 +1284,8 @@ static Property virtio_blk_properties[] = {
> + #endif
> +     DEFINE_PROP_BIT("request-merging", VirtIOBlock, conf.request_merging, 0,
> +                     true),
> +-    DEFINE_PROP_UINT16("num-queues", VirtIOBlock, conf.num_queues, 1),
> ++    DEFINE_PROP_UINT16("num-queues", VirtIOBlock, conf.num_queues,
> ++                       VIRTIO_BLK_AUTO_NUM_QUEUES),
> +     DEFINE_PROP_UINT16("queue-size", VirtIOBlock, conf.queue_size, 256),
> +
> +It changes the default value of num_queues.  But it fishes it for old
> +machine types to have the right value:
> +
> +@@ -31,6 +31,7 @@
> + GlobalProperty hw_compat_5_1[] = {
> +     ...
> ++    { "virtio-blk-device", "num-queues", "1"},
> +     ...
> + };
> +
> +
>  VMState
>  -------
>  
> -- 
> 2.40.1
> 

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



  reply	other threads:[~2023-05-11  8:16 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-10 19:53 [PATCH] migration: Add documentation for backwards compatiblity Juan Quintela
2023-05-11  8:10 ` Daniel P. Berrangé [this message]
2023-05-11  8:23   ` Juan Quintela

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZFyi73w855EsDmEy@redhat.com \
    --to=berrange@redhat.com \
    --cc=alex.bennee@linaro.org \
    --cc=armbru@redhat.com \
    --cc=avihaih@nvidia.com \
    --cc=leobras@redhat.com \
    --cc=lukasstraub2@web.de \
    --cc=peterx@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    --cc=thuth@redhat.com \
    --cc=vsementsov@yandex-team.ru \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).