From: Juan Quintela <quintela@redhat.com>
To: qemu-devel@nongnu.org
Cc: Kevin Wolf <kwolf@redhat.com>,
libvir-list@redhat.com, Stefan Hajnoczi <stefanha@redhat.com>,
Fam Zheng <fam@euphon.net>, Hanna Reitz <hreitz@redhat.com>,
Li Zhijian <lizhijian@fujitsu.com>, Peter Xu <peterx@redhat.com>,
Leonardo Bras <leobras@redhat.com>,
Juan Quintela <quintela@redhat.com>,
Eric Blake <eblake@redhat.com>,
Markus Armbruster <armbru@redhat.com>,
Hailiang Zhang <zhanghailiang@xfusion.com>,
qemu-block@nongnu.org, Fabiano Rosas <farosas@suse.de>,
Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Subject: [PULL 02/38] migration/doc: Add documentation for backwards compatibility
Date: Tue, 31 Oct 2023 10:01:06 +0100
Message-ID: <20231031090142.13122-3-quintela@redhat.com>
In-Reply-To: <20231031090142.13122-1-quintela@redhat.com>
State the requirements for migration to work between QEMU versions.
Then explain how one is supposed to implement a new feature or change
a default value without breaking migration.
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Acked-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231018112827.1325-3-quintela@redhat.com>
---
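The hw_compat_X_Y mechanism documented in the patch below can be
sketched in standalone C. This is only an illustration under stated
assumptions, not QEMU code: CompatProp, compat_5_1, effective_value()
and num_queues_for() are hypothetical names, and the string "auto"
merely stands in for VIRTIO_BLK_AUTO_NUM_QUEUES. It shows a (device,
property, value) table overriding a new default when an older machine
type is selected.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one hw_compat_X_Y entry:
 * device name, property name, value (all strings). */
typedef struct {
    const char *driver;
    const char *property;
    const char *value;
} CompatProp;

/* Mirrors the single entry the patch shows for hw_compat_5_1. */
static const CompatProp compat_5_1[] = {
    { "virtio-blk-device", "num-queues", "1" },
};

/* Return the compat override for (driver, property) if one exists,
 * otherwise fall back to the new default of the current QEMU. */
const char *effective_value(const CompatProp *compat, size_t n,
                            const char *driver, const char *property,
                            const char *new_default)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(compat[i].driver, driver) == 0 &&
            strcmp(compat[i].property, property) == 0) {
            return compat[i].value;
        }
    }
    return new_default;
}

/* What an imaginary "qemu-5.2" would use for num-queues, depending on
 * the machine type. "auto" is a placeholder for the new default. */
const char *num_queues_for(const char *machine)
{
    if (strcmp(machine, "pc-5.1") == 0) {
        /* Old machine type: apply the 5.1 compat table. */
        return effective_value(compat_5_1,
                               sizeof(compat_5_1) / sizeof(compat_5_1[0]),
                               "virtio-blk-device", "num-queues", "auto");
    }
    /* Latest machine type: no compat entries apply. */
    return effective_value(NULL, 0,
                           "virtio-blk-device", "num-queues", "auto");
}
```

With this sketch, num_queues_for("pc-5.1") yields "1" and
num_queues_for("pc-5.2") yields "auto", which is the behaviour cases 3,
5 and 6 in the patch rely on.
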
docs/devel/migration.rst | 219 +++++++++++++++++++++++++++++++++++++++
1 file changed, 219 insertions(+)
diff --git a/docs/devel/migration.rst b/docs/devel/migration.rst
index 4d6a98ae58..6fe275b1ec 100644
--- a/docs/devel/migration.rst
+++ b/docs/devel/migration.rst
@@ -919,3 +919,222 @@ versioned machine types to cut down on the combinations that will need
support. This is also useful when newer versions of firmware outgrow
the padding.
+
+Backwards compatibility
+=======================
+
+How backwards compatibility works
+---------------------------------
+
+When we do migration, we have two QEMU processes: the source and the
+target. There are two cases: they are the same version or they are
+different versions. The easy case is when they are the same version.
+The difficult one is when they are different versions.
+
+There are two things that are different, but they have very similar
+names and sometimes get confused:
+
+- QEMU version
+- machine type version
+
+Let's start with a practical example. We start with:
+
+- qemu-system-x86_64 (v5.2), from now on qemu-5.2.
+- qemu-system-x86_64 (v5.1), from now on qemu-5.1.
+
+Related to this are the "latest" machine types defined on each of
+them:
+
+- pc-q35-5.2 (newest one in qemu-5.2), from now on pc-5.2
+- pc-q35-5.1 (newest one in qemu-5.1), from now on pc-5.1
+
+First of all, migration is only supposed to work if you use the same
+machine type on both source and destination. The QEMU hardware
+configuration also needs to be the same on source and destination.
+Most aspects of the backend configuration can be changed at will,
+except for a few cases where the backend features influence frontend
+device feature exposure. But that is not relevant for this section.
+
+Let's list the combinations that we can have, starting with the
+trivial ones, where QEMU is the same on source and destination:
+
+1 - qemu-5.2 -M pc-5.2 -> migrates to -> qemu-5.2 -M pc-5.2
+
+ This is the latest QEMU with the latest machine type.
+ This has to work, and if it doesn't work it is a bug.
+
+2 - qemu-5.1 -M pc-5.1 -> migrates to -> qemu-5.1 -M pc-5.1
+
+ Exactly the same case as the previous one, but for 5.1.
+ Nothing to see here either.
+
+These are the easiest ones; we will not talk more about them in this
+section.
+
+Now we start with the more interesting cases. Consider the case where
+we have the same QEMU version on both sides (qemu-5.2), but instead of
+the latest machine type for that version (pc-5.2) we are using one
+from an older QEMU version, in this case pc-5.1.
+
+3 - qemu-5.2 -M pc-5.1 -> migrates to -> qemu-5.2 -M pc-5.1
+
+ It needs to use the definition of pc-5.1 and the devices as they
+ were configured on 5.1, but this should be easy in the sense that
+ both sides are the same QEMU and both sides have exactly the same
+ idea of what the pc-5.1 machine is.
+
+4 - qemu-5.1 -M pc-5.2 -> migrates to -> qemu-5.1 -M pc-5.2
+
+ This combination is not possible, as qemu-5.1 doesn't understand
+ the pc-5.2 machine type. So there is nothing to worry about here.
+
+Now come the interesting ones, when both QEMU processes are
+different. Notice also that the machine type needs to be pc-5.1,
+because we have the limitation that qemu-5.1 doesn't know pc-5.2. So
+the possible cases are:
+
+5 - qemu-5.2 -M pc-5.1 -> migrates to -> qemu-5.1 -M pc-5.1
+
+ This migration is known as newer to older. When we are developing
+ 5.2, we need to take care not to break migration to qemu-5.1.
+ Notice that we can't update qemu-5.1 to understand whatever
+ qemu-5.2 decides to change, so it is up to the qemu-5.2 side to
+ make the relevant changes.
+
+6 - qemu-5.1 -M pc-5.1 -> migrates to -> qemu-5.2 -M pc-5.1
+
+ This migration is known as older to newer. We need to make sure
+ that we are able to receive migrations from qemu-5.1. The problem is
+ similar to the previous one.
+
+If qemu-5.1 and qemu-5.2 were the same, there would not be any
+compatibility problems. But the reason that we create qemu-5.2 is to
+get new features, devices, defaults, etc.
+
+If we get a device that has a new feature, or change a default value,
+we have a problem when we try to migrate between different QEMU
+versions.
+
+So we need a way to tell qemu-5.2 that when we are using machine type
+pc-5.1, it needs to **not** use the feature, to be able to migrate to
+real qemu-5.1.
+
+The same applies when migrating from qemu-5.1 to qemu-5.2:
+qemu-5.2 has to expect that it is not going to get data for the new
+feature, because qemu-5.1 doesn't know about it.
+
+How do we tell QEMU about these device feature changes? In the
+hw_compat_X_Y arrays in hw/core/machine.c.
+
+If we change a default value, we need to put the old value back in
+that array. During initialization, the device needs to look at that
+array to see what value it needs to use for that feature. What we
+put in that array is the value of a property.
+
+To create a property for a device, we need to use one of the
+DEFINE_PROP_*() macros. See include/hw/qdev-properties.h to find the
+macros that exist. With it, we set the default value for that
+property, and that is what it is going to get in the latest released
+version. But if we want a different value for a previous version, we
+can change that in the hw_compat_X_Y arrays.
+
+hw_compat_X_Y is an array of entries that have the format:
+
+- name_device
+- name_property
+- value
+
+Let's see a practical example.
+
+In qemu-5.2 virtio-blk-device got multi queue support. This is a
+change that is not backward compatible. In qemu-5.1 it has one
+queue. In qemu-5.2 it has the same number of queues as the number of
+cpus in the system.
+
+When we are doing migration, if we migrate from a device that has 4
+queues to a device that has only one queue, we don't know where to
+put the extra information for the other 3 queues, and we fail
+migration.
+
+There is a similar problem when we migrate from qemu-5.1, which has
+only one queue, to qemu-5.2: we only send information for one queue,
+but the destination has 4, so we have 3 queues that are not properly
+initialized and anything can happen.
+
+So, how can we address this problem? Easy, just convince qemu-5.2
+that when it is running pc-5.1, it needs to set the number of queues
+for virtio-blk-devices to 1.
+
+That way we fix cases 5 and 6.
+
+5 - qemu-5.2 -M pc-5.1 -> migrates to -> qemu-5.1 -M pc-5.1
+
+ qemu-5.2 -M pc-5.1 sets number of queues to be 1.
+ qemu-5.1 -M pc-5.1 expects number of queues to be 1.
+
+ Correct. Migration works.
+
+6 - qemu-5.1 -M pc-5.1 -> migrates to -> qemu-5.2 -M pc-5.1
+
+ qemu-5.1 -M pc-5.1 sets number of queues to be 1.
+ qemu-5.2 -M pc-5.1 expects number of queues to be 1.
+
+ Correct. Migration works.
+
+And now the other interesting case, case 3. In this case we have:
+
+3 - qemu-5.2 -M pc-5.1 -> migrates to -> qemu-5.2 -M pc-5.1
+
+ Here we have the same QEMU on both sides. So it doesn't matter a
+ lot if we have set the number of queues to 1 or not, because
+ they are the same.
+
+ WRONG!
+
+ Think about what happens if we do one of these double migrations:
+
+ A -> migrates -> B -> migrates -> C
+
+ where:
+
+ A: qemu-5.1 -M pc-5.1
+ B: qemu-5.2 -M pc-5.1
+ C: qemu-5.2 -M pc-5.1
+
+ Migration A -> B is case 6, so number of queues needs to be 1.
+
+ Migration B -> C is case 3, so we don't care. But actually we
+ do care, because we haven't started the guest in qemu-5.2; it came
+ migrated from qemu-5.1. So to be on the safe side, we need to
+ always use number of queues 1 when we are using pc-5.1.
+
+Now, how was this done in reality? The following commit shows how it
+was done::
+
+ commit 9445e1e15e66c19e42bea942ba810db28052cd05
+ Author: Stefan Hajnoczi <stefanha@redhat.com>
+ Date: Tue Aug 18 15:33:47 2020 +0100
+
+ virtio-blk-pci: default num_queues to -smp N
+
+The relevant parts for migration are::
+
+ @@ -1281,7 +1284,8 @@ static Property virtio_blk_properties[] = {
+ #endif
+ DEFINE_PROP_BIT("request-merging", VirtIOBlock, conf.request_merging, 0,
+ true),
+ - DEFINE_PROP_UINT16("num-queues", VirtIOBlock, conf.num_queues, 1),
+ + DEFINE_PROP_UINT16("num-queues", VirtIOBlock, conf.num_queues,
+ + VIRTIO_BLK_AUTO_NUM_QUEUES),
+ DEFINE_PROP_UINT16("queue-size", VirtIOBlock, conf.queue_size, 256),
+
+It changes the default value of num_queues. But it fixes it for old
+machine types to have the right value::
+
+ @@ -31,6 +31,7 @@
+ GlobalProperty hw_compat_5_1[] = {
+ ...
+ + { "virtio-blk-device", "num-queues", "1"},
+ ...
+ };
--
2.41.0