* [PATCH] migration: fix SEEK_CUR offset calculation in qio_channel_block_seek
@ 2025-03-26 16:22 Marco Cavenati
2025-03-26 16:48 ` Daniel P. Berrangé
2025-03-29 5:41 ` Michael Tokarev
0 siblings, 2 replies; 12+ messages in thread
From: Marco Cavenati @ 2025-03-26 16:22 UTC (permalink / raw)
To: Peter Xu, Fabiano Rosas; +Cc: qemu-devel, Marco Cavenati, qemu-trivial
The SEEK_CUR case in qio_channel_block_seek was incorrectly using the
'whence' parameter instead of the 'offset' parameter when calculating the
new position.
Fixes: 65cf200a51ddc6d0a28ecceac30dc892233cddd7 ("migration: introduce a QIOChannel impl for BlockDriverState VMState")
Signed-off-by: Marco Cavenati <Marco.Cavenati@eurecom.fr>
---
migration/channel-block.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/migration/channel-block.c b/migration/channel-block.c
index fff8d87094..b0477f5b6d 100644
--- a/migration/channel-block.c
+++ b/migration/channel-block.c
@@ -123,7 +123,7 @@ qio_channel_block_seek(QIOChannel *ioc,
bioc->offset = offset;
break;
case SEEK_CUR:
- bioc->offset += whence;
+ bioc->offset += offset;
break;
case SEEK_END:
error_setg(errp, "Size of VMstate region is unknown");
--
2.43.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH] migration: fix SEEK_CUR offset calculation in qio_channel_block_seek
2025-03-26 16:22 [PATCH] migration: fix SEEK_CUR offset calculation in qio_channel_block_seek Marco Cavenati
@ 2025-03-26 16:48 ` Daniel P. Berrangé
2025-03-29 5:41 ` Michael Tokarev
1 sibling, 0 replies; 12+ messages in thread
From: Daniel P. Berrangé @ 2025-03-26 16:48 UTC (permalink / raw)
To: Marco Cavenati; +Cc: Peter Xu, Fabiano Rosas, qemu-devel, qemu-trivial
On Wed, Mar 26, 2025 at 05:22:30PM +0100, Marco Cavenati wrote:
> The SEEK_CUR case in qio_channel_block_seek was incorrectly using the
> 'whence' parameter instead of the 'offset' parameter when calculating the
> new position.
>
> Fixes: 65cf200a51ddc6d0a28ecceac30dc892233cddd7 ("migration: introduce a QIOChannel impl for BlockDriverState VMState")
>
> Signed-off-by: Marco Cavenati <Marco.Cavenati@eurecom.fr>
> ---
> migration/channel-block.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] migration: fix SEEK_CUR offset calculation in qio_channel_block_seek
2025-03-26 16:22 [PATCH] migration: fix SEEK_CUR offset calculation in qio_channel_block_seek Marco Cavenati
2025-03-26 16:48 ` Daniel P. Berrangé
@ 2025-03-29 5:41 ` Michael Tokarev
2025-03-31 12:26 ` Fabiano Rosas
1 sibling, 1 reply; 12+ messages in thread
From: Michael Tokarev @ 2025-03-29 5:41 UTC (permalink / raw)
To: Marco Cavenati, Peter Xu, Fabiano Rosas; +Cc: qemu-devel, qemu-trivial
26.03.2025 19:22, Marco Cavenati wrote:
> The SEEK_CUR case in qio_channel_block_seek was incorrectly using the
> 'whence' parameter instead of the 'offset' parameter when calculating the
> new position.
>
> Fixes: 65cf200a51ddc6d0a28ecceac30dc892233cddd7 ("migration: introduce a QIOChannel impl for BlockDriverState VMState")
>
> Signed-off-by: Marco Cavenati <Marco.Cavenati@eurecom.fr>
> ---
> migration/channel-block.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/migration/channel-block.c b/migration/channel-block.c
> index fff8d87094..b0477f5b6d 100644
> --- a/migration/channel-block.c
> +++ b/migration/channel-block.c
> @@ -123,7 +123,7 @@ qio_channel_block_seek(QIOChannel *ioc,
> bioc->offset = offset;
> break;
> case SEEK_CUR:
> - bioc->offset += whence;
> + bioc->offset += offset;
> break;
> case SEEK_END:
> error_setg(errp, "Size of VMstate region is unknown");
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
This is a (trivial) bugfix, I'd say it should be in 10.0.
Will you guys send a pullreq for the block layer, or should
I make a single-patch pullreq from the trivial tree?
Thanks,
/mjt
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] migration: fix SEEK_CUR offset calculation in qio_channel_block_seek
2025-03-29 5:41 ` Michael Tokarev
@ 2025-03-31 12:26 ` Fabiano Rosas
2025-08-26 20:32 ` Michael Tokarev
0 siblings, 1 reply; 12+ messages in thread
From: Fabiano Rosas @ 2025-03-31 12:26 UTC (permalink / raw)
To: Michael Tokarev, Marco Cavenati, Peter Xu; +Cc: qemu-devel, qemu-trivial
Michael Tokarev <mjt@tls.msk.ru> writes:
> 26.03.2025 19:22, Marco Cavenati wrote:
>> The SEEK_CUR case in qio_channel_block_seek was incorrectly using the
>> 'whence' parameter instead of the 'offset' parameter when calculating the
>> new position.
>>
>> Fixes: 65cf200a51ddc6d0a28ecceac30dc892233cddd7 ("migration: introduce a QIOChannel impl for BlockDriverState VMState")
>>
>> Signed-off-by: Marco Cavenati <Marco.Cavenati@eurecom.fr>
>> ---
>> migration/channel-block.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/migration/channel-block.c b/migration/channel-block.c
>> index fff8d87094..b0477f5b6d 100644
>> --- a/migration/channel-block.c
>> +++ b/migration/channel-block.c
>> @@ -123,7 +123,7 @@ qio_channel_block_seek(QIOChannel *ioc,
>> bioc->offset = offset;
>> break;
>> case SEEK_CUR:
>> - bioc->offset += whence;
>> + bioc->offset += offset;
>> break;
>> case SEEK_END:
>> error_setg(errp, "Size of VMstate region is unknown");
>
> Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
>
> This is a (trivial) bugfix, I'd say it should be in 10.0.
> Will you guys send a pullreq for the block layer, or should
> I make a single-patch pullreq from the trivial tree?
I'll take it. It's not entirely trivial as it shifts a value by 1 in
mapped-ram migration. Fortunately, it's a value that doesn't need to be
the same between migration source and destination.
Thanks
>
> Thanks,
>
> /mjt
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] migration: fix SEEK_CUR offset calculation in qio_channel_block_seek
2025-03-31 12:26 ` Fabiano Rosas
@ 2025-08-26 20:32 ` Michael Tokarev
2025-08-26 21:27 ` Peter Xu
2025-08-26 21:52 ` Fabiano Rosas
0 siblings, 2 replies; 12+ messages in thread
From: Michael Tokarev @ 2025-08-26 20:32 UTC (permalink / raw)
To: farosas; +Cc: Marco.Cavenati, mjt, peterx, qemu-devel, qemu-trivial
Hi!
This is
commit c0b32426ce56182c1ce2a12904f3a702c2ecc460
Author: Marco Cavenati <Marco.Cavenati@eurecom.fr>
Date: Wed Mar 26 17:22:30 2025 +0100
migration: fix SEEK_CUR offset calculation in qio_channel_block_seek
which went to 10.0.0-rc2, and has been cherry-picked to
7.2 and 9.2 stable series.
Reportedly it breaks migration in 7.2.18 and up. Which is
kinda strange, as it shouldn't do any harm?
https://bugs.debian.org/1112044
any guess what's going on?
Thanks,
/mjt
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] migration: fix SEEK_CUR offset calculation in qio_channel_block_seek
2025-08-26 20:32 ` Michael Tokarev
@ 2025-08-26 21:27 ` Peter Xu
2025-08-26 21:52 ` Fabiano Rosas
1 sibling, 0 replies; 12+ messages in thread
From: Peter Xu @ 2025-08-26 21:27 UTC (permalink / raw)
To: Michael Tokarev; +Cc: farosas, Marco.Cavenati, qemu-devel, qemu-trivial
On Tue, Aug 26, 2025 at 11:32:20PM +0300, Michael Tokarev wrote:
> Hi!
>
> This is
>
> commit c0b32426ce56182c1ce2a12904f3a702c2ecc460
> Author: Marco Cavenati <Marco.Cavenati@eurecom.fr>
> Date: Wed Mar 26 17:22:30 2025 +0100
>
> migration: fix SEEK_CUR offset calculation in qio_channel_block_seek
>
> which went to 10.0.0-rc2, and has been cherry-picked to
> 7.2 and 9.2 stable series.
>
> Reportedly it breaks migration in 7.2.18 and up. Which is
> kinda strange, as it shouldn't do any harm?
>
> https://bugs.debian.org/1112044
>
> any guess what's going on?
The only thing I can think of is, when it used to save to some file /
snapshot, then if the old image was stored with some wrong offsets (due to
wrong seek()s) then a new QEMU with correct offsets will instead read wrong
data even if they started to do the right things..
The reporter says:
This occurs during live-migrating a guest onto a host with u15,
migrating it back fixes the softlocks. A reset is required to fix
it but is only required when the receiving host is on the latest
version.
So it's a host-to-host live migration. Is that using TCP as URI? The
problem is I don't even think TCP layer should use io_seek at all.
qio_channel_io_seek() is only used in below (except VFIO when used with
multifd, that doesn't look like what the reporter was using..):
*** migration/file.c:
file_start_outgoing_migration[121] if (offset && qio_channel_io_seek(ioc, offset, SEEK_SET, errp) < 0) {
file_start_incoming_migration[190] qio_channel_io_seek(QIO_CHANNEL(fioc), offset, SEEK_SET, errp) < 0) {
*** migration/qemu-file.c:
qemu_set_offset[611] ret = qio_channel_io_seek(f->ioc, off, whence, &err);
qemu_get_offset[624] ret = qio_channel_io_seek(f->ioc, 0, SEEK_CUR, &err);
All these references are about file migrations, not generic live
migrations..
--
Peter Xu
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] migration: fix SEEK_CUR offset calculation in qio_channel_block_seek
2025-08-26 20:32 ` Michael Tokarev
2025-08-26 21:27 ` Peter Xu
@ 2025-08-26 21:52 ` Fabiano Rosas
2025-08-27 6:59 ` [PATCH] virtio: Call set_features during reset (was: migration: fix SEEK_CUR offset calculation in qio_channel_block_seek) Michael Tokarev
2025-08-28 0:57 ` [PATCH] migration: fix SEEK_CUR offset calculation in qio_channel_block_seek Akihiko Odaki
1 sibling, 2 replies; 12+ messages in thread
From: Fabiano Rosas @ 2025-08-26 21:52 UTC (permalink / raw)
To: Michael Tokarev
Cc: Marco.Cavenati, mjt, peterx, qemu-devel, qemu-trivial,
Akihiko Odaki
Michael Tokarev <mjt@tls.msk.ru> writes:
+CC Akihiko
> Hi!
>
> This is
>
> commit c0b32426ce56182c1ce2a12904f3a702c2ecc460
> Author: Marco Cavenati <Marco.Cavenati@eurecom.fr>
> Date: Wed Mar 26 17:22:30 2025 +0100
>
> migration: fix SEEK_CUR offset calculation in qio_channel_block_seek
>
> which went to 10.0.0-rc2, and has been cherry-picked to
> 7.2 and 9.2 stable series.
>
> Reportedly it breaks migration in 7.2.18 and up. Which is
> kinda strange, as it shouldn't do any harm?
>
Yeah, this is not it. Unless you're using colo or mapped-ram.
> https://bugs.debian.org/1112044
>
> any guess what's going on?
>
The virtio changes are probably the issue. One of them touches
mhdr.num_buffers, under mergeable_rx_bufs, which is migrated state. The
flag in turn depends on VIRTIO_NET_F_MRG_RXBUF, which is set on the
cmdline with -device virtio-net-pci,mrg_rxbuf= but also reset by
virtio_set_features_nocheck, if I'm reading this right.
Let's ask Akihiko.
> Thanks,
>
> /mjt
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] virtio: Call set_features during reset (was: migration: fix SEEK_CUR offset calculation in qio_channel_block_seek)
2025-08-26 21:52 ` Fabiano Rosas
@ 2025-08-27 6:59 ` Michael Tokarev
2025-08-28 0:57 ` [PATCH] migration: fix SEEK_CUR offset calculation in qio_channel_block_seek Akihiko Odaki
1 sibling, 0 replies; 12+ messages in thread
From: Michael Tokarev @ 2025-08-27 6:59 UTC (permalink / raw)
To: Fabiano Rosas
Cc: Marco.Cavenati, peterx, qemu-devel, qemu-trivial, Akihiko Odaki
On 27.08.2025 00:52, Fabiano Rosas wrote:
> Michael Tokarev <mjt@tls.msk.ru> writes:
>
> +CC Akihiko
>
>> Hi!
>>
>> This is
>>
>> commit c0b32426ce56182c1ce2a12904f3a702c2ecc460
>> Author: Marco Cavenati <Marco.Cavenati@eurecom.fr>
>> Date: Wed Mar 26 17:22:30 2025 +0100
>>
>> migration: fix SEEK_CUR offset calculation in qio_channel_block_seek
>>
>> which went to 10.0.0-rc2, and has been cherry-picked to
>> 7.2 and 9.2 stable series.
>>
>> Reportedly it breaks migration in 7.2.18 and up. Which is
>> kinda strange, as it shouldn't do any harm?
>>
>
> Yeah, this is not it. Unless you're using colo or mapped-ram.
>
>> https://bugs.debian.org/1112044
>>
>> any guess what's going on?
>>
>
> The virtio changes are probably the issue. One of them touches
> mhdr.num_buffers, under mergeable_rx_bufs, which is migrated state. The
> flag in turn depends on VIRTIO_NET_F_MRG_RXBUF, which is set on the
> cmdline with -device virtio-net-pci,mrg_rxbuf= but also reset by
> virtio_set_features_nocheck, if I'm reading this right.
>
> Let's ask Akihiko.
The reporter just reported that they used the wrong commit id, --
actually, the issue is caused by ce1431615292dc735597db4062834b:
commit ce1431615292dc735597db4062834bfb271381bc
Author: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Date: Mon Apr 21 21:17:20 2025 +0900
virtio: Call set_features during reset
virtio-net expects set_features() will be called when the feature set
used by the guest changes to update the number of virtqueues but it is
not called during reset, which will clear all features, leaving the
queues added for VIRTIO_NET_F_MQ or VIRTIO_NET_F_RSS. Not only these
extra queues are visible to the guest, they will cause segmentation
fault during migration.
Call set_features() during reset to remove those queues for virtio-net
as we call set_status(). It will also prevent similar bugs for
virtio-net and other devices in the future.
Fixes: f9d6dbf0bf6e ("virtio-net: remove virtio queues if the guest
doesn't support multiqueue")
Buglink: https://issues.redhat.com/browse/RHEL-73842
Cc: qemu-stable@nongnu.org
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20250421-reset-v2-1-e4c1ead88ea1@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 0caed25cd171c611781589b5402161d27d57229c)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(so commit 0caed25cd171c611781589b5402161 in master).
So yes, this one looks much more real in this context.
Thanks,
/mjt
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] migration: fix SEEK_CUR offset calculation in qio_channel_block_seek
2025-08-26 21:52 ` Fabiano Rosas
2025-08-27 6:59 ` [PATCH] virtio: Call set_features during reset (was: migration: fix SEEK_CUR offset calculation in qio_channel_block_seek) Michael Tokarev
@ 2025-08-28 0:57 ` Akihiko Odaki
2025-08-29 8:34 ` [PATCH] virtio: Call set_features during reset Michael Tokarev
1 sibling, 1 reply; 12+ messages in thread
From: Akihiko Odaki @ 2025-08-28 0:57 UTC (permalink / raw)
To: Fabiano Rosas, Michael Tokarev
Cc: Marco.Cavenati, peterx, qemu-devel, qemu-trivial
On 2025/08/27 6:52, Fabiano Rosas wrote:
> Michael Tokarev <mjt@tls.msk.ru> writes:
>
> +CC Akihiko
>
>> Hi!
>>
>> This is
>>
>> commit c0b32426ce56182c1ce2a12904f3a702c2ecc460
>> Author: Marco Cavenati <Marco.Cavenati@eurecom.fr>
>> Date: Wed Mar 26 17:22:30 2025 +0100
>>
>> migration: fix SEEK_CUR offset calculation in qio_channel_block_seek
>>
>> which went to 10.0.0-rc2, and has been cherry-picked to
>> 7.2 and 9.2 stable series.
>>
>> Reportedly it breaks migration in 7.2.18 and up. Which is
>> kinda strange, as it shouldn't do any harm?
>>
>
> Yeah, this is not it. Unless you're using colo or mapped-ram.
>
>> https://bugs.debian.org/1112044
>>
>> any guess what's going on?
>>
>
> The virtio changes are probably the issue. One of them touches
> mhdr.num_buffers, under mergeable_rx_bufs, which is migrated state. The
> flag in turn depends on VIRTIO_NET_F_MRG_RXBUF, which is set on the
> cmdline with -device virtio-net-pci,mrg_rxbuf= but also reset by
> virtio_set_features_nocheck, if I'm reading this right.
I don't think commit cefd67f25430 ("virtio-net: Fix num_buffers for
version 1") is related to the issue. Commit ce1431615292 ("virtio: Call
set_features during reset") is more likely.
virtio_set_features_nocheck() shouldn't reset VIRTIO_NET_F_MRG_RXBUF. It
calls virtio_net_set_features(), which does not clear features.
virtio_net_get_features() clears features, but it is called before
migration.
The posted call trace indicates a lockup happens in the control path,
but commit cefd67f25430 ("virtio-net: Fix num_buffers for version 1")
changes the data path.
On the other hand, I can come up with a possible failure scenario with
commit ce1431615292 ("virtio: Call set_features during reset"). Perhaps
it changed the machine state before loading the migrated state, and
caused a mismatch between them.
I need more information to understand the issue. A command line to
reproduce the issue is especially helpful because options like
mrg_rxbuf=, which you mentioned, tell enabled features, which is
valuable information.
Regards,
Akihiko Odaki
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] virtio: Call set_features during reset
2025-08-28 0:57 ` [PATCH] migration: fix SEEK_CUR offset calculation in qio_channel_block_seek Akihiko Odaki
@ 2025-08-29 8:34 ` Michael Tokarev
2025-08-29 14:40 ` Akihiko Odaki
0 siblings, 1 reply; 12+ messages in thread
From: Michael Tokarev @ 2025-08-29 8:34 UTC (permalink / raw)
To: Akihiko Odaki, Fabiano Rosas; +Cc: peterx, qemu-devel
On 28.08.2025 03:57, Akihiko Odaki wrote:
> The posted call trace indicates a lockup happens in the control path,
> but commit cefd67f25430 ("virtio-net: Fix num_buffers for version 1")
> changes the data path.
>
> On the other hand, I can come up with a possible failure scenario with
> commit ce1431615292 ("virtio: Call set_features during reset"). Perhaps
> it changed the machine state before loading the migrated state, and
> caused a mismatch between them.
Yes, the problem commit is 0caed25cd171c6 "virtio: Call set_features
during reset", - the OP corrected himself in the next message (subject
line updated).
> I need more information to understand the issue. A command line to
> reproduce the issue is especially helpful because options like
> mrg_rxbuf=, which you mentioned, tell enabled features, which is
> valuable information.
See https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1112044#69 for
the command line. The guest is started by libvirtd.
Please note: this is stable-7.2 series, it is *not* master (I picked
up this commit to 7.2.x). It'd be interesting to check if master is
affected too. Unfortunately I never tried migration, and now I only
have my notebook, so can only migrate between two qemu instances on
the same box - which is probably okay too.
Thanks,
/mjt
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] virtio: Call set_features during reset
2025-08-29 8:34 ` [PATCH] virtio: Call set_features during reset Michael Tokarev
@ 2025-08-29 14:40 ` Akihiko Odaki
2025-08-29 15:14 ` Michael Tokarev
0 siblings, 1 reply; 12+ messages in thread
From: Akihiko Odaki @ 2025-08-29 14:40 UTC (permalink / raw)
To: Michael Tokarev, Fabiano Rosas; +Cc: peterx, qemu-devel
On 2025/08/29 17:34, Michael Tokarev wrote:
> On 28.08.2025 03:57, Akihiko Odaki wrote:
>
>> The posted call trace indicates a lockup happens in the control path,
>> but commit cefd67f25430 ("virtio-net: Fix num_buffers for version 1")
>> changes the data path.
>>
>> On the other hand, I can come up with a possible failure scenario with
>> commit ce1431615292 ("virtio: Call set_features during reset").
>> Perhaps it changed the machine state before loading the migrated
>> state, and caused a mismatch between them.
>
> Yes, the problem commit is 0caed25cd171c6 "virtio: Call set_features
> during reset", - the OP corrected himself in the next message (subject
> line updated).
>
>> I need more information to understand the issue. A command line to
>> reproduce the issue is especially helpful because options like
>> mrg_rxbuf=, which you mentioned, tell enabled features, which is
>> valuable information.
>
> See https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1112044#69 for
> the command line. The guest is started by libvirtd.
Thank you, now I think I understand the problem.
>
> Please note: this is stable-7.2 series, it is *not* master (I picked
> up this commit to 7.2.x). It'd be interesting to check if master is
> affected too. Unfortunately I never tried migration, and now I only
> have my notebook, so can only migrate between two qemu instances on
> the same box - which is probably okay too.
I think you need to backport commit 9379ea9db3c0 ("virtio-net: Add
queues before loading them") and adda0ad56bd2 ("virtio-net: Add queues
for RSS during migration"). Here is an explanation:
First, let me define two variables for conciseness:
N: the number of queue pairs
M: the maximum number of queue pairs, which is determined with
n->max_queue_pairs
The problem is that QEMU inconsistently chose N for virtio-net in the
past. Before commit 8c49756825da ("virtio-net: Add only one queue pair
when realizing"):
1) realize() chose M.
2) set_features() chose: 1 (when RSS and MQ are disabled)
M (otherwise)
This itself was a problem; both RSS and MQ were disabled when realize()
but N was M, which is inconsistent with 2) and this inconsistency was
guest-visible.
I wrote commit 8c49756825da ("virtio-net: Add only one queue pair when
realizing") to make QEMU implement the behavior in 2) also during
realization and fix the inconsistency, but it broke migration when the
migrated VM had enabled VIRTIO_NET_F_RSS and VIRTIO_NET_F_MQ because it
expected that N == M.
This is also why the backported commit also broke migration; it
accidentally fixed the inconsistency between the first reset state and
the state after set_features() and caused the same problem.
I wrote commit 9379ea9db3c0 ("virtio-net: Add queues before loading
them") to fix the issue and later complemented it with commit
adda0ad56bd2 ("virtio-net: Add queues for RSS during migration").
There are several relevant commits because I could not fix the
underlying problem at once, but hopefully this email clarifies how the
two commits fixed it in the end.
Regards,
Akihiko Odaki
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] virtio: Call set_features during reset
2025-08-29 14:40 ` Akihiko Odaki
@ 2025-08-29 15:14 ` Michael Tokarev
0 siblings, 0 replies; 12+ messages in thread
From: Michael Tokarev @ 2025-08-29 15:14 UTC (permalink / raw)
To: Akihiko Odaki; +Cc: qemu-devel
On 29.08.2025 17:40, Akihiko Odaki wrote:
> On 2025/08/29 17:34, Michael Tokarev wrote:
>> On 28.08.2025 03:57, Akihiko Odaki wrote:
...
>> Please note: this is stable-7.2 series, it is *not* master (I picked
>> up this commit to 7.2.x). It'd be interesting to check if master is
>> affected too. Unfortunately I never tried migration, and now I only
>> have my notebook, so can only migrate between two qemu instances on
>> the same box - which is probably okay too.
>
> I think you need to backport commit 9379ea9db3c0 ("virtio-net: Add
> queues before loading them") and adda0ad56bd2 ("virtio-net: Add queues
> for RSS during migration"). Here is an explanation:
"Add queues before loading them" is marked as fixing
Both the mentioned commits were tagged with Cc: qemu-stable, but both
has Fixes: the same commit 8c49756825da ("virtio-net: Add only one queue
pair when realizing"), which is in 9.1, but not in 7.2. I guess, in
this case, I also need 8c49756825da - let's first break things, and
next fix them :)
> First, let me define two variables for conciseness:
> N: the number of queue pairs
> M: the maximum number of queue pairs, which is determined with
> n->max_queue_pairs
>
> The problem is that QEMU inconsistently chose N for virtio-net in the
> past. Before commit 8c49756825da ("virtio-net: Add only one queue pair
> when realizing"):
> 1) realize() chose M.
> 2) set_features() chose: 1 (when RSS and MQ are disabled)
> M (otherwise)
>
> This itself was a problem; both RSS and MQ were disabled when realize()
> but N was M, which is inconsistent with 2) and this inconsistency was
> guest-visible.
>
> I wrote commit 8c49756825da ("virtio-net: Add only one queue pair when
> realizing") to make QEMU implement the behavior in 2) also during
> realization and fix the inconsistency, but it broke migration when the
> migrated VM had enabled VIRTIO_NET_F_RSS and VIRTIO_NET_F_MQ because it
> expected that N == M.
Yup, 8c49756825da it is, which is missing in 7.2, which broke things :))
> This is also why the backported commit also broke migration; it
> accidentally fixed the inconsistency between the first reset state and
> the state after set_features() and caused the same problem.
>
> I wrote commit 9379ea9db3c0 ("virtio-net: Add queues before loading
> them") to fix the issue and later complemented it with commit
> adda0ad56bd2 ("virtio-net: Add queues for RSS during migration").
>
> There are several relevant commits because I could not fix the
> underlying problem at once, but hopefully this email clarifies how the
> two commits fixed it in the end.
Now, with 3 commits picked up to 7.2, things are at least compiles :))
Let's test the result.. hopefully I didn't break something else in the
process of breaking+fixing stuff :))
Thank you very much for the detailed explanation and attention to
details. It makes sense.
An for the 7.2.x series which is getting old(ish) - I plan to end
support for it in the near future, so there will be less surprises
like this one.. if not only for 10.0.x :)
/mjt
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2025-08-30 15:34 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-03-26 16:22 [PATCH] migration: fix SEEK_CUR offset calculation in qio_channel_block_seek Marco Cavenati
2025-03-26 16:48 ` Daniel P. Berrangé
2025-03-29 5:41 ` Michael Tokarev
2025-03-31 12:26 ` Fabiano Rosas
2025-08-26 20:32 ` Michael Tokarev
2025-08-26 21:27 ` Peter Xu
2025-08-26 21:52 ` Fabiano Rosas
2025-08-27 6:59 ` [PATCH] virtio: Call set_features during reset (was: migration: fix SEEK_CUR offset calculation in qio_channel_block_seek) Michael Tokarev
2025-08-28 0:57 ` [PATCH] migration: fix SEEK_CUR offset calculation in qio_channel_block_seek Akihiko Odaki
2025-08-29 8:34 ` [PATCH] virtio: Call set_features during reset Michael Tokarev
2025-08-29 14:40 ` Akihiko Odaki
2025-08-29 15:14 ` Michael Tokarev
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).