* QEMU migration-test CI intermittent failure
@ 2023-09-13 19:23 Stefan Hajnoczi
2023-09-13 19:42 ` Fabiano Rosas
0 siblings, 1 reply; 16+ messages in thread
From: Stefan Hajnoczi @ 2023-09-13 19:23 UTC (permalink / raw)
To: Juan Quintela, Peter Xu, Leonardo Bras; +Cc: qemu-devel
[-- Attachment #1: Type: text/plain, Size: 1162 bytes --]
Hi,
The following intermittent failure occurred in the CI and I have filed
an Issue for it:
https://gitlab.com/qemu-project/qemu/-/issues/1886
Output:
>>> QTEST_QEMU_IMG=./qemu-img MALLOC_PERTURB_=116 QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon G_TEST_DBUS_DAEMON=/builds/qemu-project/qemu/tests/dbus-vmstate-daemon.sh QTEST_QEMU_BINARY=./qemu-system-x86_64 /builds/qemu-project/qemu/build/tests/qtest/migration-test --tap -k
――――――――――――――――――――――――――――――――――――― ✀ ―――――――――――――――――――――――――――――――――――――
stderr:
qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Memory content inconsistency at 5b43000 first_byte = bd last_byte = bc current = 4f hit_edge = 1
**
ERROR:../tests/qtest/migration-test.c:300:check_guests_ram: assertion failed: (bad == 0)
(test program exited with status code -6)
You can find the full output here:
https://gitlab.com/qemu-project/qemu/-/jobs/5080200417
Please take a look!
Thanks,
Stefan
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: QEMU migration-test CI intermittent failure
2023-09-13 19:23 QEMU migration-test CI intermittent failure Stefan Hajnoczi
@ 2023-09-13 19:42 ` Fabiano Rosas
2023-09-13 19:51 ` Stefan Hajnoczi
2023-09-14 14:56 ` Peter Xu
0 siblings, 2 replies; 16+ messages in thread
From: Fabiano Rosas @ 2023-09-13 19:42 UTC (permalink / raw)
To: Stefan Hajnoczi, Juan Quintela, Peter Xu, Leonardo Bras; +Cc: qemu-devel
Stefan Hajnoczi <stefanha@redhat.com> writes:
> Hi,
> The following intermittent failure occurred in the CI and I have filed
> an Issue for it:
> https://gitlab.com/qemu-project/qemu/-/issues/1886
>
> Output:
>
> >>> QTEST_QEMU_IMG=./qemu-img MALLOC_PERTURB_=116 QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon G_TEST_DBUS_DAEMON=/builds/qemu-project/qemu/tests/dbus-vmstate-daemon.sh QTEST_QEMU_BINARY=./qemu-system-x86_64 /builds/qemu-project/qemu/build/tests/qtest/migration-test --tap -k
> ――――――――――――――――――――――――――――――――――――― ✀ ―――――――――――――――――――――――――――――――――――――
> stderr:
> qemu-system-x86_64: Unable to read from socket: Connection reset by peer
> Memory content inconsistency at 5b43000 first_byte = bd last_byte = bc current = 4f hit_edge = 1
> **
> ERROR:../tests/qtest/migration-test.c:300:check_guests_ram: assertion failed: (bad == 0)
> (test program exited with status code -6)
>
> You can find the full output here:
> https://gitlab.com/qemu-project/qemu/-/jobs/5080200417
This is the postcopy return path issue that I'm addressing here:
https://lore.kernel.org/r/20230911171320.24372-1-farosas@suse.de
Subject: [PATCH v6 00/10] Fix segfault on migration return path
Message-ID: <20230911171320.24372-1-farosas@suse.de>
* Re: QEMU migration-test CI intermittent failure
2023-09-13 19:42 ` Fabiano Rosas
@ 2023-09-13 19:51 ` Stefan Hajnoczi
2023-09-14 14:56 ` Peter Xu
1 sibling, 0 replies; 16+ messages in thread
From: Stefan Hajnoczi @ 2023-09-13 19:51 UTC (permalink / raw)
To: Fabiano Rosas
Cc: Stefan Hajnoczi, Juan Quintela, Peter Xu, Leonardo Bras,
qemu-devel
On Wed, 13 Sept 2023 at 15:44, Fabiano Rosas <farosas@suse.de> wrote:
>
> Stefan Hajnoczi <stefanha@redhat.com> writes:
>
> > Hi,
> > The following intermittent failure occurred in the CI and I have filed
> > an Issue for it:
> > https://gitlab.com/qemu-project/qemu/-/issues/1886
> >
> > Output:
> >
> > >>> QTEST_QEMU_IMG=./qemu-img MALLOC_PERTURB_=116 QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon G_TEST_DBUS_DAEMON=/builds/qemu-project/qemu/tests/dbus-vmstate-daemon.sh QTEST_QEMU_BINARY=./qemu-system-x86_64 /builds/qemu-project/qemu/build/tests/qtest/migration-test --tap -k
> > ――――――――――――――――――――――――――――――――――――― ✀ ―――――――――――――――――――――――――――――――――――――
> > stderr:
> > qemu-system-x86_64: Unable to read from socket: Connection reset by peer
> > Memory content inconsistency at 5b43000 first_byte = bd last_byte = bc current = 4f hit_edge = 1
> > **
> > ERROR:../tests/qtest/migration-test.c:300:check_guests_ram: assertion failed: (bad == 0)
> > (test program exited with status code -6)
> >
> > You can find the full output here:
> > https://gitlab.com/qemu-project/qemu/-/jobs/5080200417
>
> This is the postcopy return path issue that I'm addressing here:
>
> https://lore.kernel.org/r/20230911171320.24372-1-farosas@suse.de
> Subject: [PATCH v6 00/10] Fix segfault on migration return path
> Message-ID: <20230911171320.24372-1-farosas@suse.de>
Awesome, thanks!
Stefan
* Re: QEMU migration-test CI intermittent failure
2023-09-13 19:42 ` Fabiano Rosas
2023-09-13 19:51 ` Stefan Hajnoczi
@ 2023-09-14 14:56 ` Peter Xu
2023-09-14 15:10 ` Fabiano Rosas
1 sibling, 1 reply; 16+ messages in thread
From: Peter Xu @ 2023-09-14 14:56 UTC (permalink / raw)
To: Fabiano Rosas; +Cc: Stefan Hajnoczi, Juan Quintela, Leonardo Bras, qemu-devel
On Wed, Sep 13, 2023 at 04:42:31PM -0300, Fabiano Rosas wrote:
> Stefan Hajnoczi <stefanha@redhat.com> writes:
>
> > Hi,
> > The following intermittent failure occurred in the CI and I have filed
> > an Issue for it:
> > https://gitlab.com/qemu-project/qemu/-/issues/1886
> >
> > Output:
> >
> > >>> QTEST_QEMU_IMG=./qemu-img MALLOC_PERTURB_=116 QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon G_TEST_DBUS_DAEMON=/builds/qemu-project/qemu/tests/dbus-vmstate-daemon.sh QTEST_QEMU_BINARY=./qemu-system-x86_64 /builds/qemu-project/qemu/build/tests/qtest/migration-test --tap -k
> > ――――――――――――――――――――――――――――――――――――― ✀ ―――――――――――――――――――――――――――――――――――――
> > stderr:
> > qemu-system-x86_64: Unable to read from socket: Connection reset by peer
> > Memory content inconsistency at 5b43000 first_byte = bd last_byte = bc current = 4f hit_edge = 1
> > **
> > ERROR:../tests/qtest/migration-test.c:300:check_guests_ram: assertion failed: (bad == 0)
> > (test program exited with status code -6)
> >
> > You can find the full output here:
> > https://gitlab.com/qemu-project/qemu/-/jobs/5080200417
>
> This is the postcopy return path issue that I'm addressing here:
>
> https://lore.kernel.org/r/20230911171320.24372-1-farosas@suse.de
> Subject: [PATCH v6 00/10] Fix segfault on migration return path
> Message-ID: <20230911171320.24372-1-farosas@suse.de>
Hmm, I just noticed one thing: Stefan's failure is a RAM check issue
only, which means QEMU won't crash?
Fabiano, are you sure it's the same issue on your return-path fix?
I'm also trying to reproduce either of them under some load. I think I
hit something, but it's very hard to reproduce reliably.
--
Peter Xu
* Re: QEMU migration-test CI intermittent failure
2023-09-14 14:56 ` Peter Xu
@ 2023-09-14 15:10 ` Fabiano Rosas
2023-09-14 15:35 ` Peter Xu
0 siblings, 1 reply; 16+ messages in thread
From: Fabiano Rosas @ 2023-09-14 15:10 UTC (permalink / raw)
To: Peter Xu; +Cc: Stefan Hajnoczi, Juan Quintela, Leonardo Bras, qemu-devel
Peter Xu <peterx@redhat.com> writes:
> On Wed, Sep 13, 2023 at 04:42:31PM -0300, Fabiano Rosas wrote:
>> Stefan Hajnoczi <stefanha@redhat.com> writes:
>>
>> > Hi,
>> > The following intermittent failure occurred in the CI and I have filed
>> > an Issue for it:
>> > https://gitlab.com/qemu-project/qemu/-/issues/1886
>> >
>> > Output:
>> >
>> > >>> QTEST_QEMU_IMG=./qemu-img MALLOC_PERTURB_=116 QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon G_TEST_DBUS_DAEMON=/builds/qemu-project/qemu/tests/dbus-vmstate-daemon.sh QTEST_QEMU_BINARY=./qemu-system-x86_64 /builds/qemu-project/qemu/build/tests/qtest/migration-test --tap -k
>> > ――――――――――――――――――――――――――――――――――――― ✀ ―――――――――――――――――――――――――――――――――――――
>> > stderr:
>> > qemu-system-x86_64: Unable to read from socket: Connection reset by peer
>> > Memory content inconsistency at 5b43000 first_byte = bd last_byte = bc current = 4f hit_edge = 1
>> > **
>> > ERROR:../tests/qtest/migration-test.c:300:check_guests_ram: assertion failed: (bad == 0)
>> > (test program exited with status code -6)
>> >
>> > You can find the full output here:
>> > https://gitlab.com/qemu-project/qemu/-/jobs/5080200417
>>
>> This is the postcopy return path issue that I'm addressing here:
>>
>> https://lore.kernel.org/r/20230911171320.24372-1-farosas@suse.de
>> Subject: [PATCH v6 00/10] Fix segfault on migration return path
>> Message-ID: <20230911171320.24372-1-farosas@suse.de>
>
> Hmm I just noticed one thing, that Stefan's failure is a ram check issue
> only, which means qemu won't crash?
>
The source could have crashed and left the migration in an inconsistent
state, and then the destination saw corrupted memory?
> Fabiano, are you sure it's the same issue on your return-path fix?
>
I've been running the preempt tests on my branch for thousands of
iterations and haven't seen any other errors. Since no code has gone
into the migration tree recently, I assume it's the same error.
I run the tests with GDB attached to QEMU, so I'll always see a crash
before any memory corruption.
> I'm also trying to reproduce either of them with some loads. I think I hit
> some but it's very hard to reproduce solidly.
Well, if you find anything else let me know and we'll fix it.
* Re: QEMU migration-test CI intermittent failure
2023-09-14 15:10 ` Fabiano Rosas
@ 2023-09-14 15:35 ` Peter Xu
2023-09-14 15:57 ` Fabiano Rosas
0 siblings, 1 reply; 16+ messages in thread
From: Peter Xu @ 2023-09-14 15:35 UTC (permalink / raw)
To: Fabiano Rosas; +Cc: Stefan Hajnoczi, Juan Quintela, Leonardo Bras, qemu-devel
On Thu, Sep 14, 2023 at 12:10:04PM -0300, Fabiano Rosas wrote:
> Peter Xu <peterx@redhat.com> writes:
>
> > On Wed, Sep 13, 2023 at 04:42:31PM -0300, Fabiano Rosas wrote:
> >> Stefan Hajnoczi <stefanha@redhat.com> writes:
> >>
> >> > Hi,
> >> > The following intermittent failure occurred in the CI and I have filed
> >> > an Issue for it:
> >> > https://gitlab.com/qemu-project/qemu/-/issues/1886
> >> >
> >> > Output:
> >> >
> >> > >>> QTEST_QEMU_IMG=./qemu-img MALLOC_PERTURB_=116 QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon G_TEST_DBUS_DAEMON=/builds/qemu-project/qemu/tests/dbus-vmstate-daemon.sh QTEST_QEMU_BINARY=./qemu-system-x86_64 /builds/qemu-project/qemu/build/tests/qtest/migration-test --tap -k
> >> > ――――――――――――――――――――――――――――――――――――― ✀ ―――――――――――――――――――――――――――――――――――――
> >> > stderr:
> >> > qemu-system-x86_64: Unable to read from socket: Connection reset by peer
> >> > Memory content inconsistency at 5b43000 first_byte = bd last_byte = bc current = 4f hit_edge = 1
> >> > **
> >> > ERROR:../tests/qtest/migration-test.c:300:check_guests_ram: assertion failed: (bad == 0)
> >> > (test program exited with status code -6)
> >> >
> >> > You can find the full output here:
> >> > https://gitlab.com/qemu-project/qemu/-/jobs/5080200417
> >>
> >> This is the postcopy return path issue that I'm addressing here:
> >>
> >> https://lore.kernel.org/r/20230911171320.24372-1-farosas@suse.de
> >> Subject: [PATCH v6 00/10] Fix segfault on migration return path
> >> Message-ID: <20230911171320.24372-1-farosas@suse.de>
> >
> > Hmm I just noticed one thing, that Stefan's failure is a ram check issue
> > only, which means qemu won't crash?
> >
>
> The source could have crashed and left the migration at an inconsistent
> state and then the destination saw corrupted memory?
>
> > Fabiano, are you sure it's the same issue on your return-path fix?
> >
>
> I've been running the preempt tests on my branch for thousands of
> iterations and didn't see any other errors. Since there's no code going
> into the migration tree recently I assume it's the same error.
>
> I run the tests with GDB attached to QEMU, so I'll always see a crash
> before any memory corruption.
Okay, maybe that stops you from seeing the above check_guests_ram() error?
It's worth checking whether it always fails differently if you just
don't attach gdb to it; I have a feeling it will always fail the other
way (I think migration-test will say something like "qemu killed" etc.
in most cases), which would help further identify the issues.
>
> > I'm also trying to reproduce either of them with some loads. I think I hit
> > some but it's very hard to reproduce solidly.
>
> Well, if you find anything else let me know and we'll fix it.
I think Stefan's issue is the one I triggered once, but only once; I did
see check_guests_ram() lines.
I ran 10 migration-tests concurrently (on 8 cores, just to make the
scheduler really work), each looping over preempt/plain 500 times, and
hit nothing. I'm trying again on a larger host with more instances; so
far I've run 200 loops over 40 instances running together and hit
nothing, but I'll keep trying.
--
Peter Xu
* Re: QEMU migration-test CI intermittent failure
2023-09-14 15:35 ` Peter Xu
@ 2023-09-14 15:57 ` Fabiano Rosas
2023-09-14 16:39 ` Peter Xu
0 siblings, 1 reply; 16+ messages in thread
From: Fabiano Rosas @ 2023-09-14 15:57 UTC (permalink / raw)
To: Peter Xu; +Cc: Stefan Hajnoczi, Juan Quintela, Leonardo Bras, qemu-devel
Peter Xu <peterx@redhat.com> writes:
> On Thu, Sep 14, 2023 at 12:10:04PM -0300, Fabiano Rosas wrote:
>> Peter Xu <peterx@redhat.com> writes:
>>
>> > On Wed, Sep 13, 2023 at 04:42:31PM -0300, Fabiano Rosas wrote:
>> >> Stefan Hajnoczi <stefanha@redhat.com> writes:
>> >>
>> >> > Hi,
>> >> > The following intermittent failure occurred in the CI and I have filed
>> >> > an Issue for it:
>> >> > https://gitlab.com/qemu-project/qemu/-/issues/1886
>> >> >
>> >> > Output:
>> >> >
>> >> > >>> QTEST_QEMU_IMG=./qemu-img MALLOC_PERTURB_=116 QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon G_TEST_DBUS_DAEMON=/builds/qemu-project/qemu/tests/dbus-vmstate-daemon.sh QTEST_QEMU_BINARY=./qemu-system-x86_64 /builds/qemu-project/qemu/build/tests/qtest/migration-test --tap -k
>> >> > ――――――――――――――――――――――――――――――――――――― ✀ ―――――――――――――――――――――――――――――――――――――
>> >> > stderr:
>> >> > qemu-system-x86_64: Unable to read from socket: Connection reset by peer
>> >> > Memory content inconsistency at 5b43000 first_byte = bd last_byte = bc current = 4f hit_edge = 1
>> >> > **
>> >> > ERROR:../tests/qtest/migration-test.c:300:check_guests_ram: assertion failed: (bad == 0)
>> >> > (test program exited with status code -6)
>> >> >
>> >> > You can find the full output here:
>> >> > https://gitlab.com/qemu-project/qemu/-/jobs/5080200417
>> >>
>> >> This is the postcopy return path issue that I'm addressing here:
>> >>
>> >> https://lore.kernel.org/r/20230911171320.24372-1-farosas@suse.de
>> >> Subject: [PATCH v6 00/10] Fix segfault on migration return path
>> >> Message-ID: <20230911171320.24372-1-farosas@suse.de>
>> >
>> > Hmm I just noticed one thing, that Stefan's failure is a ram check issue
>> > only, which means qemu won't crash?
>> >
>>
>> The source could have crashed and left the migration at an inconsistent
>> state and then the destination saw corrupted memory?
>>
>> > Fabiano, are you sure it's the same issue on your return-path fix?
>> >
>>
>> I've been running the preempt tests on my branch for thousands of
>> iterations and didn't see any other errors. Since there's no code going
>> into the migration tree recently I assume it's the same error.
>>
>> I run the tests with GDB attached to QEMU, so I'll always see a crash
>> before any memory corruption.
>
> Okay, maybe that stops you from seeing the above check_guests_ram() error?
> Worth checking whether it fails differently always if you just don't attach
> gdb to it; I had a feeling that it'll always fail in the other way (I think
> migration-test will say something like "qemu killed" etc. in most cases),
> further to identify the issues.
>
>>
>> > I'm also trying to reproduce either of them with some loads. I think I hit
>> > some but it's very hard to reproduce solidly.
>>
>> Well, if you find anything else let me know and we'll fix it.
>
> I think Stefan's issue is the one I triggered once, but only once; I did
> see check_guests_ram() lines.
>
> I ran concurrently 10 migration-tests (on 8 cores; just to make scheduler
> start to really work), each looping over preempt/plain for 500 times and
> hit nothing.. I'm trying again with a larger host with more instances, so
> far I've run 200 loops over 40 instances running together, I hit
> nothing.. but I'm keeping trying.
I managed to reproduce it. It's not the return path error. In hindsight
that's obvious because that error happens in the 'recovery' test and this
one in the 'plain' one. Sorry about the noise.
This one reproduced with just 4 iterations of preempt/plain. I'll
investigate.
* Re: QEMU migration-test CI intermittent failure
2023-09-14 15:57 ` Fabiano Rosas
@ 2023-09-14 16:39 ` Peter Xu
2023-09-14 21:13 ` Fabiano Rosas
0 siblings, 1 reply; 16+ messages in thread
From: Peter Xu @ 2023-09-14 16:39 UTC (permalink / raw)
To: Fabiano Rosas; +Cc: Stefan Hajnoczi, Juan Quintela, Leonardo Bras, qemu-devel
On Thu, Sep 14, 2023 at 12:57:08PM -0300, Fabiano Rosas wrote:
> I managed to reproduce it. It's not the return path error. In hindsight
> that's obvious because that error happens in the 'recovery' test and this
> one in the 'plain' one. Sorry about the noise.
No worry. It's good to finally identify that.
>
> This one reproduced with just 4 iterations of preempt/plain. I'll
> investigate.
Great, thank you!
--
Peter Xu
* Re: QEMU migration-test CI intermittent failure
2023-09-14 16:39 ` Peter Xu
@ 2023-09-14 21:13 ` Fabiano Rosas
2023-09-14 22:54 ` Fabiano Rosas
0 siblings, 1 reply; 16+ messages in thread
From: Fabiano Rosas @ 2023-09-14 21:13 UTC (permalink / raw)
To: Peter Xu; +Cc: Stefan Hajnoczi, Juan Quintela, Leonardo Bras, qemu-devel
Peter Xu <peterx@redhat.com> writes:
> On Thu, Sep 14, 2023 at 12:57:08PM -0300, Fabiano Rosas wrote:
>> I managed to reproduce it. It's not the return path error. In hindsight
>> that's obvious because that error happens in the 'recovery' test and this
>> one in the 'plain' one. Sorry about the noise.
>
> No worry. It's good to finally identify that.
>
>>
>> This one reproduced with just 4 iterations of preempt/plain. I'll
>> investigate.
It seems that we're getting a TCP disconnect (ECONNRESET) when doing
that shutdown() on postcopy_qemufile_src; the one from commit 6621883f93
("migration: Fix potential race on postcopy_qemufile_src").
I'm trying to determine why that happens, when at other times it just
returns 0 as expected.
Could this mean that we're kicking the dest too soon while it is still
receiving valid data?
* Re: QEMU migration-test CI intermittent failure
2023-09-14 21:13 ` Fabiano Rosas
@ 2023-09-14 22:54 ` Fabiano Rosas
2023-09-14 23:27 ` Peter Xu
0 siblings, 1 reply; 16+ messages in thread
From: Fabiano Rosas @ 2023-09-14 22:54 UTC (permalink / raw)
To: Peter Xu; +Cc: Stefan Hajnoczi, Juan Quintela, Leonardo Bras, qemu-devel
Fabiano Rosas <farosas@suse.de> writes:
> Peter Xu <peterx@redhat.com> writes:
>
>> On Thu, Sep 14, 2023 at 12:57:08PM -0300, Fabiano Rosas wrote:
>>> I managed to reproduce it. It's not the return path error. In hindsight
>>> that's obvious because that error happens in the 'recovery' test and this
>>> one in the 'plain' one. Sorry about the noise.
>>
>> No worry. It's good to finally identify that.
>>
>>>
>>> This one reproduced with just 4 iterations of preempt/plain. I'll
>>> investigate.
>
> It seems that we're getting a tcp disconnect (ECONNRESET) on when doing
> that shutdown() on postcopy_qemufile_src. The one from commit 6621883f93
> ("migration: Fix potential race on postcopy_qemufile_src").
>
> I'm trying to determine why that happens when other times it just
> returns 0 as expected.
>
> Could this mean that we're kicking the dest too soon while it is still
> receiving valid data?
Looking a bit more into this, what's happening is that
postcopy_ram_incoming_cleanup() is shutting down the
postcopy_qemufile_dst while ram_load_postcopy() is still running.
The postcopy_ram_listen_thread() function waits for the
main_thread_load_event, but that only works when not using preempt. With
the preempt thread, the event is set right away and we proceed to do the
cleanup without waiting.
So the assumption of commit 6621883f93 that the incoming side knows when
it has finished migrating is wrong IMO. Without the EOS we're relying on
the chance that the shutdown() happens after the last recvmsg has
returned and not during it.
Peter, what do you think?
* Re: QEMU migration-test CI intermittent failure
2023-09-14 22:54 ` Fabiano Rosas
@ 2023-09-14 23:27 ` Peter Xu
2023-09-15 1:56 ` Fabiano Rosas
0 siblings, 1 reply; 16+ messages in thread
From: Peter Xu @ 2023-09-14 23:27 UTC (permalink / raw)
To: Fabiano Rosas; +Cc: Stefan Hajnoczi, Juan Quintela, Leonardo Bras, qemu-devel
On Thu, Sep 14, 2023 at 07:54:17PM -0300, Fabiano Rosas wrote:
> Fabiano Rosas <farosas@suse.de> writes:
>
> > Peter Xu <peterx@redhat.com> writes:
> >
> >> On Thu, Sep 14, 2023 at 12:57:08PM -0300, Fabiano Rosas wrote:
> >>> I managed to reproduce it. It's not the return path error. In hindsight
> >>> that's obvious because that error happens in the 'recovery' test and this
> >>> one in the 'plain' one. Sorry about the noise.
> >>
> >> No worry. It's good to finally identify that.
> >>
> >>>
> >>> This one reproduced with just 4 iterations of preempt/plain. I'll
> >>> investigate.
> >
> > It seems that we're getting a tcp disconnect (ECONNRESET) on when doing
> > that shutdown() on postcopy_qemufile_src. The one from commit 6621883f93
> > ("migration: Fix potential race on postcopy_qemufile_src").
> >
> > I'm trying to determine why that happens when other times it just
> > returns 0 as expected.
> >
> > Could this mean that we're kicking the dest too soon while it is still
> > receiving valid data?
>
> Looking a bit more into this, what's happening is that
> postcopy_ram_incoming_cleanup() is shutting the postcopy_qemufile_dst
> while ram_load_postcopy() is still running.
>
> The postcopy_ram_listen_thread() function waits for the
> main_thread_load_event, but that only works when not using preempt. With
> the preempt thread, the event is set right away and we proceed to do the
> cleanup without waiting.
>
> So the assumption of commit 6621883f93 that the incoming side knows when
> it has finished migrating is wrong IMO. Without the EOS we're relying on
> the chance that the shutdown() happens after the last recvmsg has
> returned and not during it.
>
> Peter, what do you think?
That's a good point.
One way to verify that (sorry, I still cannot reproduce it myself, which
is weird... it seems load won't really help reproduce this) is to let
the main thread wait for all requested pages to arrive:
diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
index 29aea9456d..df055c51ea 100644
--- a/migration/postcopy-ram.c
+++ b/migration/postcopy-ram.c
@@ -597,6 +597,12 @@ int postcopy_ram_incoming_cleanup(MigrationIncomingState *mis)
trace_postcopy_ram_incoming_cleanup_entry();
if (mis->preempt_thread_status == PREEMPT_THREAD_CREATED) {
+ /*
+ * NOTE! it's possible that the preempt thread is still handling
+ * the last pages to arrive which were requested by faults. Making
+ * sure nothing is left behind.
+ */
+ while (qatomic_read(&mis->page_requested_count));
/* Notify the fast load thread to quit */
mis->preempt_thread_status = PREEMPT_THREAD_QUIT;
if (mis->postcopy_qemufile_dst) {
If that can work solidly, we can figure out a better way than a dead loop
here.
Thanks,
--
Peter Xu
* Re: QEMU migration-test CI intermittent failure
2023-09-14 23:27 ` Peter Xu
@ 2023-09-15 1:56 ` Fabiano Rosas
2023-09-15 16:28 ` Peter Xu
0 siblings, 1 reply; 16+ messages in thread
From: Fabiano Rosas @ 2023-09-15 1:56 UTC (permalink / raw)
To: Peter Xu; +Cc: Stefan Hajnoczi, Juan Quintela, Leonardo Bras, qemu-devel
Peter Xu <peterx@redhat.com> writes:
> On Thu, Sep 14, 2023 at 07:54:17PM -0300, Fabiano Rosas wrote:
>> Fabiano Rosas <farosas@suse.de> writes:
>>
>> > Peter Xu <peterx@redhat.com> writes:
>> >
>> >> On Thu, Sep 14, 2023 at 12:57:08PM -0300, Fabiano Rosas wrote:
>> >>> I managed to reproduce it. It's not the return path error. In hindsight
>> >>> that's obvious because that error happens in the 'recovery' test and this
>> >>> one in the 'plain' one. Sorry about the noise.
>> >>
>> >> No worry. It's good to finally identify that.
>> >>
>> >>>
>> >>> This one reproduced with just 4 iterations of preempt/plain. I'll
>> >>> investigate.
>> >
>> > It seems that we're getting a tcp disconnect (ECONNRESET) on when doing
>> > that shutdown() on postcopy_qemufile_src. The one from commit 6621883f93
>> > ("migration: Fix potential race on postcopy_qemufile_src").
>> >
>> > I'm trying to determine why that happens when other times it just
>> > returns 0 as expected.
>> >
>> > Could this mean that we're kicking the dest too soon while it is still
>> > receiving valid data?
>>
>> Looking a bit more into this, what's happening is that
>> postcopy_ram_incoming_cleanup() is shutting the postcopy_qemufile_dst
>> while ram_load_postcopy() is still running.
>>
>> The postcopy_ram_listen_thread() function waits for the
>> main_thread_load_event, but that only works when not using preempt. With
>> the preempt thread, the event is set right away and we proceed to do the
>> cleanup without waiting.
>>
>> So the assumption of commit 6621883f93 that the incoming side knows when
>> it has finished migrating is wrong IMO. Without the EOS we're relying on
>> the chance that the shutdown() happens after the last recvmsg has
>> returned and not during it.
>>
>> Peter, what do you think?
>
> That's a good point.
>
> One thing to verify that (sorry, I still cannot reproduce it myself, which
> is so weirdly... it seems loads won't really help reproducing this) is to
> let the main thread wait for all requested pages to arrive:
>
> diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
> index 29aea9456d..df055c51ea 100644
> --- a/migration/postcopy-ram.c
> +++ b/migration/postcopy-ram.c
> @@ -597,6 +597,12 @@ int postcopy_ram_incoming_cleanup(MigrationIncomingState *mis)
> trace_postcopy_ram_incoming_cleanup_entry();
>
> if (mis->preempt_thread_status == PREEMPT_THREAD_CREATED) {
> + /*
> + * NOTE! it's possible that the preempt thread is still handling
> + * the last pages to arrive which were requested by faults. Making
> + * sure nothing is left behind.
> + */
> + while (qatomic_read(&mis->page_requested_count));
> /* Notify the fast load thread to quit */
> mis->preempt_thread_status = PREEMPT_THREAD_QUIT;
> if (mis->postcopy_qemufile_dst) {
>
> If that can work solidly, we can figure out a better way than a dead loop
> here.
Yep, 2000+ iterations so far and no error.
Should we add something that makes ram_load_postcopy return once it's
finished? Then this code could just set PREEMPT_THREAD_QUIT and join the
preempt thread.
* Re: QEMU migration-test CI intermittent failure
2023-09-15 1:56 ` Fabiano Rosas
@ 2023-09-15 16:28 ` Peter Xu
2023-09-15 16:55 ` Peter Xu
0 siblings, 1 reply; 16+ messages in thread
From: Peter Xu @ 2023-09-15 16:28 UTC (permalink / raw)
To: Fabiano Rosas; +Cc: Stefan Hajnoczi, Juan Quintela, Leonardo Bras, qemu-devel
On Thu, Sep 14, 2023 at 10:56:23PM -0300, Fabiano Rosas wrote:
> Peter Xu <peterx@redhat.com> writes:
>
> > On Thu, Sep 14, 2023 at 07:54:17PM -0300, Fabiano Rosas wrote:
> >> Fabiano Rosas <farosas@suse.de> writes:
> >>
> >> > Peter Xu <peterx@redhat.com> writes:
> >> >
> >> >> On Thu, Sep 14, 2023 at 12:57:08PM -0300, Fabiano Rosas wrote:
> >> >>> I managed to reproduce it. It's not the return path error. In hindsight
> >> >>> that's obvious because that error happens in the 'recovery' test and this
> >> >>> one in the 'plain' one. Sorry about the noise.
> >> >>
> >> >> No worry. It's good to finally identify that.
> >> >>
> >> >>>
> >> >>> This one reproduced with just 4 iterations of preempt/plain. I'll
> >> >>> investigate.
> >> >
> >> > It seems that we're getting a tcp disconnect (ECONNRESET) on when doing
> >> > that shutdown() on postcopy_qemufile_src. The one from commit 6621883f93
> >> > ("migration: Fix potential race on postcopy_qemufile_src").
> >> >
> >> > I'm trying to determine why that happens when other times it just
> >> > returns 0 as expected.
> >> >
> >> > Could this mean that we're kicking the dest too soon while it is still
> >> > receiving valid data?
> >>
> >> Looking a bit more into this, what's happening is that
> >> postcopy_ram_incoming_cleanup() is shutting the postcopy_qemufile_dst
> >> while ram_load_postcopy() is still running.
> >>
> >> The postcopy_ram_listen_thread() function waits for the
> >> main_thread_load_event, but that only works when not using preempt. With
> >> the preempt thread, the event is set right away and we proceed to do the
> >> cleanup without waiting.
> >>
> >> So the assumption of commit 6621883f93 that the incoming side knows when
> >> it has finished migrating is wrong IMO. Without the EOS we're relying on
> >> the chance that the shutdown() happens after the last recvmsg has
> >> returned and not during it.
> >>
> >> Peter, what do you think?
> >
> > That's a good point.
> >
> > One thing to verify that (sorry, I still cannot reproduce it myself, which
> > is so weirdly... it seems loads won't really help reproducing this) is to
> > let the main thread wait for all requested pages to arrive:
> >
> > diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
> > index 29aea9456d..df055c51ea 100644
> > --- a/migration/postcopy-ram.c
> > +++ b/migration/postcopy-ram.c
> > @@ -597,6 +597,12 @@ int postcopy_ram_incoming_cleanup(MigrationIncomingState *mis)
> > trace_postcopy_ram_incoming_cleanup_entry();
> >
> > if (mis->preempt_thread_status == PREEMPT_THREAD_CREATED) {
> > + /*
> > + * NOTE! it's possible that the preempt thread is still handling
> > + * the last pages to arrive which were requested by faults. Making
> > + * sure nothing is left behind.
> > + */
> > + while (qatomic_read(&mis->page_requested_count));
> > /* Notify the fast load thread to quit */
> > mis->preempt_thread_status = PREEMPT_THREAD_QUIT;
> > if (mis->postcopy_qemufile_dst) {
> >
> > If that can work solidly, we can figure out a better way than a dead loop
> > here.
>
> Yep, 2000+ iterations so far and no error.
Definitely good news.. thanks.
>
> Should we add something that makes ram_load_postcopy return once it's
> finished? Then this code could just set PREEMPT_THREAD_QUIT and join the
> preempt thread.
Yeah, it would be nice to have that. We used to have an EOS message at
the end of the preempt channel, but that caused another race (which IIRC
I could easily reproduce at the time) and we removed it in
6621883f9398bc3 for 7.2+.
Now, instead of fiddling with the protocol again, we do have a way to
detect that, but it needs some tricks around counting
page_requested_count plus a signaling mechanism.
I've drafted one solution and attached it here as a complete patch; feel
free to try it. If you have anything better, that would be even nicer;
just let me know your thoughts.
---8<---
From 384aff2264650872e15916dcfeaec593e5e9b781 Mon Sep 17 00:00:00 2001
From: Peter Xu <peterx@redhat.com>
Date: Fri, 15 Sep 2023 12:11:34 -0400
Subject: [PATCH] migration: Fix race that dest preempt thread close too early
We hit an intermittent CI failure in migration-test, in the unit test
preempt/plain:
qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Memory content inconsistency at 5b43000 first_byte = bd last_byte = bc current = 4f hit_edge = 1
**
ERROR:../tests/qtest/migration-test.c:300:check_guests_ram: assertion failed: (bad == 0)
(test program exited with status code -6)
Fabiano debugged it and found that the preempt thread can quit even
without receiving all the pages, which can leave the guest without all of
its pages and corrupt the guest memory.
To make sure the preempt thread has finished receiving all the pages, we
can rely on page_requested_count reaching zero, because the preempt channel
only receives requested page faults. Note that not all faulted pages need
to be sent via the preempt channel/thread; consider the case where a
requested page was already queued into the background main channel for
migration: the src QEMU will still send it via the background channel.
Here, instead of spinning on reading the count, we add a condvar so the
main thread can wait on it if that unusual case happens, without burning
CPU for no good reason, even if the duration is short; spinning in this
rare case would probably be fine, but it's better not to.
The condvar is only used when that special case is triggered. Some memory
ordering is needed (between the count and the preempt thread status field)
so that the main thread will always get a kick when that case triggers.
Closes: https://gitlab.com/qemu-project/qemu/-/issues/1886
Debugged-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/migration.h | 13 ++++++++++++-
migration/migration.c | 1 +
migration/postcopy-ram.c | 35 +++++++++++++++++++++++++++++++++++
3 files changed, 48 insertions(+), 1 deletion(-)
diff --git a/migration/migration.h b/migration/migration.h
index c390500604..cdaa10d515 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -196,7 +196,10 @@ struct MigrationIncomingState {
/* A tree of pages that we requested to the source VM */
GTree *page_requested;
- /* For debugging purpose only, but would be nice to keep */
+ /*
+ * For postcopy only, count the number of requested page faults that
+ * still haven't been resolved.
+ */
int page_requested_count;
/*
* The mutex helps to maintain the requested pages that we sent to the
@@ -210,6 +213,14 @@ struct MigrationIncomingState {
* contains valid information.
*/
QemuMutex page_request_mutex;
+ /*
+ * If postcopy preempt is enabled, there is a chance that the main
+ * thread finished loading its data before the preempt channel has
+ * finished loading the urgent pages. If that happens, the two threads
+ * will use this condvar to synchronize, so the main thread will always
+ * wait until all pages received.
+ */
+ QemuCond page_request_cond;
/*
* Number of devices that have yet to approve switchover. When this reaches
diff --git a/migration/migration.c b/migration/migration.c
index d61e572742..e86b3f7368 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -153,6 +153,7 @@ void migration_object_init(void)
qemu_sem_init(&current_incoming->postcopy_qemufile_dst_done, 0);
qemu_mutex_init(&current_incoming->page_request_mutex);
+ qemu_cond_init(&current_incoming->page_request_cond);
current_incoming->page_requested = g_tree_new(page_request_addr_cmp);
migration_object_check(current_migration, &error_fatal);
diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
index 29aea9456d..dbf02cd3ed 100644
--- a/migration/postcopy-ram.c
+++ b/migration/postcopy-ram.c
@@ -599,6 +599,30 @@ int postcopy_ram_incoming_cleanup(MigrationIncomingState *mis)
if (mis->preempt_thread_status == PREEMPT_THREAD_CREATED) {
/* Notify the fast load thread to quit */
mis->preempt_thread_status = PREEMPT_THREAD_QUIT;
+ /*
+ * Update preempt_thread_status before reading the count. Note: the
+ * mutex lock only provides ACQUIRE semantics, so it doesn't stop this
+ * write from being reordered after the read of the count.
+ */
+ smp_mb();
+ /*
+ * It's possible that the preempt thread is still handling the last
+ * pages to arrive, which were requested by guest page faults. Make
+ * sure nothing is left behind by waiting on the condvar if that
+ * unlikely case happens.
+ */
+ WITH_QEMU_LOCK_GUARD(&mis->page_request_mutex) {
+ if (qatomic_read(&mis->page_requested_count)) {
+ /*
+ * We are guaranteed to receive a signal later, because the
+ * count is >0 now, so it's destined to be decreased to zero
+ * very soon by the preempt thread.
+ */
+ qemu_cond_wait(&mis->page_request_cond,
+ &mis->page_request_mutex);
+ }
+ }
+ /* Notify the fast load thread to quit */
if (mis->postcopy_qemufile_dst) {
qemu_file_shutdown(mis->postcopy_qemufile_dst);
}
@@ -1279,6 +1303,17 @@ static int qemu_ufd_copy_ioctl(MigrationIncomingState *mis, void *host_addr,
g_tree_remove(mis->page_requested, host_addr);
mis->page_requested_count--;
trace_postcopy_page_req_del(host_addr, mis->page_requested_count);
+ /* Order the update of count and read of preempt status */
+ smp_mb();
+ if (qatomic_read(&mis->preempt_thread_status) ==
+ PREEMPT_THREAD_QUIT) {
+ /*
+ * This probably means the main thread is waiting for us.
+ * Notify that we've finished receiving the last requested
+ * page.
+ */
+ qemu_cond_signal(&mis->page_request_cond);
+ }
}
qemu_mutex_unlock(&mis->page_request_mutex);
mark_postcopy_blocktime_end((uintptr_t)host_addr);
--
2.41.0
* Re: QEMU migration-test CI intermittent failure
2023-09-15 16:28 ` Peter Xu
@ 2023-09-15 16:55 ` Peter Xu
2023-09-18 14:15 ` Fabiano Rosas
0 siblings, 1 reply; 16+ messages in thread
From: Peter Xu @ 2023-09-15 16:55 UTC (permalink / raw)
To: Fabiano Rosas; +Cc: Stefan Hajnoczi, Juan Quintela, Leonardo Bras, qemu-devel
On Fri, Sep 15, 2023 at 12:28:36PM -0400, Peter Xu wrote:
> On Thu, Sep 14, 2023 at 10:56:23PM -0300, Fabiano Rosas wrote:
> > Peter Xu <peterx@redhat.com> writes:
> >
> > > On Thu, Sep 14, 2023 at 07:54:17PM -0300, Fabiano Rosas wrote:
> > >> Fabiano Rosas <farosas@suse.de> writes:
> > >>
> > >> > Peter Xu <peterx@redhat.com> writes:
> > >> >
> > >> >> On Thu, Sep 14, 2023 at 12:57:08PM -0300, Fabiano Rosas wrote:
> > >> >>> I managed to reproduce it. It's not the return path error. In hindsight
> > >> >>> that's obvious because that error happens in the 'recovery' test and this
> > >> >>> one in the 'plain' one. Sorry about the noise.
> > >> >>
> > >> >> No worry. It's good to finally identify that.
> > >> >>
> > >> >>>
> > >> >>> This one reproduced with just 4 iterations of preempt/plain. I'll
> > >> >>> investigate.
> > >> >
> > >> > It seems that we're getting a tcp disconnect (ECONNRESET) on when doing
> > >> > that shutdown() on postcopy_qemufile_src. The one from commit 6621883f93
> > >> > ("migration: Fix potential race on postcopy_qemufile_src").
> > >> >
> > >> > I'm trying to determine why that happens when other times it just
> > >> > returns 0 as expected.
> > >> >
> > >> > Could this mean that we're kicking the dest too soon while it is still
> > >> > receiving valid data?
> > >>
> > >> Looking a bit more into this, what's happening is that
> > >> postcopy_ram_incoming_cleanup() is shutting the postcopy_qemufile_dst
> > >> while ram_load_postcopy() is still running.
> > >>
> > >> The postcopy_ram_listen_thread() function waits for the
> > >> main_thread_load_event, but that only works when not using preempt. With
> > >> the preempt thread, the event is set right away and we proceed to do the
> > >> cleanup without waiting.
> > >>
> > >> So the assumption of commit 6621883f93 that the incoming side knows when
> > >> it has finished migrating is wrong IMO. Without the EOS we're relying on
> > >> the chance that the shutdown() happens after the last recvmsg has
> > >> returned and not during it.
> > >>
> > >> Peter, what do you think?
> > >
> > > That's a good point.
> > >
> > > One thing to verify that (sorry, I still cannot reproduce it myself, which
> > > is so weirdly... it seems loads won't really help reproducing this) is to
> > > let the main thread wait for all requested pages to arrive:
> > >
> > > diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
> > > index 29aea9456d..df055c51ea 100644
> > > --- a/migration/postcopy-ram.c
> > > +++ b/migration/postcopy-ram.c
> > > @@ -597,6 +597,12 @@ int postcopy_ram_incoming_cleanup(MigrationIncomingState *mis)
> > > trace_postcopy_ram_incoming_cleanup_entry();
> > >
> > > if (mis->preempt_thread_status == PREEMPT_THREAD_CREATED) {
> > > + /*
> > > + * NOTE! it's possible that the preempt thread is still handling
> > > + * the last pages to arrive which were requested by faults. Making
> > > + * sure nothing is left behind.
> > > + */
> > > + while (qatomic_read(&mis->page_requested_count));
> > > /* Notify the fast load thread to quit */
> > > mis->preempt_thread_status = PREEMPT_THREAD_QUIT;
> > > if (mis->postcopy_qemufile_dst) {
> > >
> > > If that can work solidly, we can figure out a better way than a dead loop
> > > here.
> >
> > Yep, 2000+ iterations so far and no error.
>
> Definitely good news.. thanks.
>
> >
> > Should we add something that makes ram_load_postcopy return once it's
> > finished? Then this code could just set PREEMPT_THREAD_QUIT and join the
> > preempt thread.
>
> Yeah it'll be nice to have that. We used to have an EOS message at the end
> for the preempt channel, but that used to cause another race (which IIRC I
> could easily reproduce at that time) and we removed it after
> 6621883f9398bc3 for 7.2+.
>
> Now instead of fiddling with the protocol again, we do have a way to detect
> that, but it may need some trick on counting on page_requested_count and
> also provide a signal mechanism.
>
> I've drafted one solution here and attached here as a complete patch, feel
> free to try. If you have anything better, that'll be even nicer; just let
> me know your thoughts.
>
> ---8<---
> From 384aff2264650872e15916dcfeaec593e5e9b781 Mon Sep 17 00:00:00 2001
> From: Peter Xu <peterx@redhat.com>
> Date: Fri, 15 Sep 2023 12:11:34 -0400
> Subject: [PATCH] migration: Fix race that dest preempt thread close too early
>
> We hit intermit CI issue on failing at migration-test over the unit test
> preempt/plain:
>
> qemu-system-x86_64: Unable to read from socket: Connection reset by peer
> Memory content inconsistency at 5b43000 first_byte = bd last_byte = bc current = 4f hit_edge = 1
> **
> ERROR:../tests/qtest/migration-test.c:300:check_guests_ram: assertion failed: (bad == 0)
> (test program exited with status code -6)
>
> Fabiano debugged into it and found that the preempt thread can quit even
> without receiving all the pages, which can cause guest not receiving all
> the pages and corrupt the guest memory.
>
> To make sure preempt thread finished receiving all the pages, we can rely
> on the page_requested_count being zero because preempt channel will only
> receive requested page faults. Note, not all the faulted pages are required
> to be sent via the preempt channel/thread; imagine the case when a
> requested page is just queued into the background main channel for
> migration, the src qemu will just still send it via the background channel.
>
> Here instead of spinning over reading the count, we add a condvar so the
> main thread can wait on it if that unusual case happened, without burning
> the cpu for no good reason, even if the duration is short; so even if we
> spin in this rare case is probably fine. It's just better to not do so.
>
> The condvar is only used when that special case is triggered. Some memory
> ordering trick is needed to guarantee it from happening (against the
> preempt thread status field), so the main thread will always get a kick
> when that triggers correctly.
>
> Closes: https://gitlab.com/qemu-project/qemu/-/issues/1886
> Debugged-by: Fabiano Rosas <farosas@suse.de>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
> migration/migration.h | 13 ++++++++++++-
> migration/migration.c | 1 +
> migration/postcopy-ram.c | 35 +++++++++++++++++++++++++++++++++++
> 3 files changed, 48 insertions(+), 1 deletion(-)
>
> diff --git a/migration/migration.h b/migration/migration.h
> index c390500604..cdaa10d515 100644
> --- a/migration/migration.h
> +++ b/migration/migration.h
> @@ -196,7 +196,10 @@ struct MigrationIncomingState {
>
> /* A tree of pages that we requested to the source VM */
> GTree *page_requested;
> - /* For debugging purpose only, but would be nice to keep */
> + /*
> + * For postcopy only, count the number of requested page faults that
> + * still haven't been resolved.
> + */
> int page_requested_count;
> /*
> * The mutex helps to maintain the requested pages that we sent to the
> @@ -210,6 +213,14 @@ struct MigrationIncomingState {
> * contains valid information.
> */
> QemuMutex page_request_mutex;
> + /*
> + * If postcopy preempt is enabled, there is a chance that the main
> + * thread finished loading its data before the preempt channel has
> + * finished loading the urgent pages. If that happens, the two threads
> + * will use this condvar to synchronize, so the main thread will always
> + * wait until all pages received.
> + */
> + QemuCond page_request_cond;
>
> /*
> * Number of devices that have yet to approve switchover. When this reaches
> diff --git a/migration/migration.c b/migration/migration.c
> index d61e572742..e86b3f7368 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -153,6 +153,7 @@ void migration_object_init(void)
> qemu_sem_init(&current_incoming->postcopy_qemufile_dst_done, 0);
>
> qemu_mutex_init(&current_incoming->page_request_mutex);
> + qemu_cond_init(&current_incoming->page_request_cond);
> current_incoming->page_requested = g_tree_new(page_request_addr_cmp);
>
> migration_object_check(current_migration, &error_fatal);
> diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
> index 29aea9456d..dbf02cd3ed 100644
> --- a/migration/postcopy-ram.c
> +++ b/migration/postcopy-ram.c
> @@ -599,6 +599,30 @@ int postcopy_ram_incoming_cleanup(MigrationIncomingState *mis)
> if (mis->preempt_thread_status == PREEMPT_THREAD_CREATED) {
> /* Notify the fast load thread to quit */
> mis->preempt_thread_status = PREEMPT_THREAD_QUIT;
> + /*
> + * Update preempt_thread_status before reading count. Note: mutex
> + * lock only provide ACQUIRE semantic, and it doesn't stops this
> + * write to be reordered after reading the count.
> + */
> + smp_mb();
> + /*
> + * It's possible that the preempt thread is still handling the last
> + * pages to arrive which were requested by guest page faults.
> + * Making sure nothing is left behind by waiting on the condvar if
> + * that unlikely case happened.
> + */
> + WITH_QEMU_LOCK_GUARD(&mis->page_request_mutex) {
> + if (qatomic_read(&mis->page_requested_count)) {
> + /*
> + * It is guaranteed to receive a signal later, because the
> + * count>0 now, so it's destined to be decreased to zero
> + * very soon by the preempt thread.
> + */
> + qemu_cond_wait(&mis->page_request_cond,
> + &mis->page_request_mutex);
> + }
> + }
> + /* Notify the fast load thread to quit */
> if (mis->postcopy_qemufile_dst) {
> qemu_file_shutdown(mis->postcopy_qemufile_dst);
> }
> @@ -1279,6 +1303,17 @@ static int qemu_ufd_copy_ioctl(MigrationIncomingState *mis, void *host_addr,
> g_tree_remove(mis->page_requested, host_addr);
> mis->page_requested_count--;
> trace_postcopy_page_req_del(host_addr, mis->page_requested_count);
> + /* Order the update of count and read of preempt status */
> + smp_mb();
> + if (qatomic_read(&mis->preempt_thread_status) ==
> + PREEMPT_THREAD_QUIT) {
> + /*
> + * This probably means the main thread is waiting for us.
> + * Notify that we've finished receiving the last requested
> + * page.
> + */
> + qemu_cond_signal(&mis->page_request_cond);
> + }
> }
> qemu_mutex_unlock(&mis->page_request_mutex);
> mark_postcopy_blocktime_end((uintptr_t)host_addr);
> --
> 2.41.0
I just made some changes on top of this patch..
One major thing is to double check that we only kick the main thread if it's
the last page, as I think there can be more than one page in flight.
While at it, I also cleaned up the accesses to either the preempt status
field or the counter, to keep them consistent (never use atomic ops for
the status as it's defined as volatile anyway; meanwhile always use
qatomic for the counter).
---8<---
From bfdd1f872fbeafe26ebad280a6d9e5a88de621de Mon Sep 17 00:00:00 2001
From: Peter Xu <peterx@redhat.com>
Date: Fri, 15 Sep 2023 12:45:01 -0400
Subject: [PATCH] fixup! migration: Fix race that dest preempt thread close too
early
Three more small changes:
- Always use qatomic_* for page_requested_count accesses
- Only kick the main thread when we're sure it's the last page
- Drop qatomic_read() when reading the preempt status, to match its uses
elsewhere; it's not needed since the field is declared volatile
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/migration.c | 2 +-
migration/postcopy-ram.c | 7 ++++---
2 files changed, 5 insertions(+), 4 deletions(-)
diff --git a/migration/migration.c b/migration/migration.c
index e86b3f7368..3ee1e6b0d6 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -368,7 +368,7 @@ int migrate_send_rp_req_pages(MigrationIncomingState *mis,
* things like g_tree_lookup() will return TRUE (1) when found.
*/
g_tree_insert(mis->page_requested, aligned, (gpointer)1);
- mis->page_requested_count++;
+ qatomic_inc(&mis->page_requested_count);
trace_postcopy_page_req_add(aligned, mis->page_requested_count);
}
}
diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
index dbf02cd3ed..5408e028c6 100644
--- a/migration/postcopy-ram.c
+++ b/migration/postcopy-ram.c
@@ -1301,12 +1301,13 @@ static int qemu_ufd_copy_ioctl(MigrationIncomingState *mis, void *host_addr,
*/
if (g_tree_lookup(mis->page_requested, host_addr)) {
g_tree_remove(mis->page_requested, host_addr);
- mis->page_requested_count--;
+ int left_pages = qatomic_dec_fetch(&mis->page_requested_count);
+
trace_postcopy_page_req_del(host_addr, mis->page_requested_count);
/* Order the update of count and read of preempt status */
smp_mb();
- if (qatomic_read(&mis->preempt_thread_status) ==
- PREEMPT_THREAD_QUIT) {
+ if (mis->preempt_thread_status == PREEMPT_THREAD_QUIT &&
+ left_pages == 0) {
/*
* This probably means the main thread is waiting for us.
* Notify that we've finished receiving the last requested
--
2.41.0
--
Peter Xu
* Re: QEMU migration-test CI intermittent failure
2023-09-15 16:55 ` Peter Xu
@ 2023-09-18 14:15 ` Fabiano Rosas
2023-09-18 15:35 ` Peter Xu
0 siblings, 1 reply; 16+ messages in thread
From: Fabiano Rosas @ 2023-09-18 14:15 UTC (permalink / raw)
To: Peter Xu; +Cc: Stefan Hajnoczi, Juan Quintela, Leonardo Bras, qemu-devel
Peter Xu <peterx@redhat.com> writes:
> On Fri, Sep 15, 2023 at 12:28:36PM -0400, Peter Xu wrote:
>> On Thu, Sep 14, 2023 at 10:56:23PM -0300, Fabiano Rosas wrote:
>> > Peter Xu <peterx@redhat.com> writes:
>> >
>> > > On Thu, Sep 14, 2023 at 07:54:17PM -0300, Fabiano Rosas wrote:
>> > >> Fabiano Rosas <farosas@suse.de> writes:
>> > >>
>> > >> > Peter Xu <peterx@redhat.com> writes:
>> > >> >
>> > >> >> On Thu, Sep 14, 2023 at 12:57:08PM -0300, Fabiano Rosas wrote:
>> > >> >>> I managed to reproduce it. It's not the return path error. In hindsight
>> > >> >>> that's obvious because that error happens in the 'recovery' test and this
>> > >> >>> one in the 'plain' one. Sorry about the noise.
>> > >> >>
>> > >> >> No worry. It's good to finally identify that.
>> > >> >>
>> > >> >>>
>> > >> >>> This one reproduced with just 4 iterations of preempt/plain. I'll
>> > >> >>> investigate.
>> > >> >
>> > >> > It seems that we're getting a tcp disconnect (ECONNRESET) on when doing
>> > >> > that shutdown() on postcopy_qemufile_src. The one from commit 6621883f93
>> > >> > ("migration: Fix potential race on postcopy_qemufile_src").
>> > >> >
>> > >> > I'm trying to determine why that happens when other times it just
>> > >> > returns 0 as expected.
>> > >> >
>> > >> > Could this mean that we're kicking the dest too soon while it is still
>> > >> > receiving valid data?
>> > >>
>> > >> Looking a bit more into this, what's happening is that
>> > >> postcopy_ram_incoming_cleanup() is shutting the postcopy_qemufile_dst
>> > >> while ram_load_postcopy() is still running.
>> > >>
>> > >> The postcopy_ram_listen_thread() function waits for the
>> > >> main_thread_load_event, but that only works when not using preempt. With
>> > >> the preempt thread, the event is set right away and we proceed to do the
>> > >> cleanup without waiting.
>> > >>
>> > >> So the assumption of commit 6621883f93 that the incoming side knows when
>> > >> it has finished migrating is wrong IMO. Without the EOS we're relying on
>> > >> the chance that the shutdown() happens after the last recvmsg has
>> > >> returned and not during it.
>> > >>
>> > >> Peter, what do you think?
>> > >
>> > > That's a good point.
>> > >
>> > > One thing to verify that (sorry, I still cannot reproduce it myself, which
>> > > is so weirdly... it seems loads won't really help reproducing this) is to
>> > > let the main thread wait for all requested pages to arrive:
>> > >
>> > > diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
>> > > index 29aea9456d..df055c51ea 100644
>> > > --- a/migration/postcopy-ram.c
>> > > +++ b/migration/postcopy-ram.c
>> > > @@ -597,6 +597,12 @@ int postcopy_ram_incoming_cleanup(MigrationIncomingState *mis)
>> > > trace_postcopy_ram_incoming_cleanup_entry();
>> > >
>> > > if (mis->preempt_thread_status == PREEMPT_THREAD_CREATED) {
>> > > + /*
>> > > + * NOTE! it's possible that the preempt thread is still handling
>> > > + * the last pages to arrive which were requested by faults. Making
>> > > + * sure nothing is left behind.
>> > > + */
>> > > + while (qatomic_read(&mis->page_requested_count));
>> > > /* Notify the fast load thread to quit */
>> > > mis->preempt_thread_status = PREEMPT_THREAD_QUIT;
>> > > if (mis->postcopy_qemufile_dst) {
>> > >
>> > > If that can work solidly, we can figure out a better way than a dead loop
>> > > here.
>> >
>> > Yep, 2000+ iterations so far and no error.
>>
>> Definitely good news.. thanks.
>>
>> >
>> > Should we add something that makes ram_load_postcopy return once it's
>> > finished? Then this code could just set PREEMPT_THREAD_QUIT and join the
>> > preempt thread.
>>
>> Yeah it'll be nice to have that. We used to have an EOS message at the end
>> for the preempt channel, but that used to cause another race (which IIRC I
>> could easily reproduce at that time) and we removed it after
>> 6621883f9398bc3 for 7.2+.
>>
>> Now instead of fiddling with the protocol again, we do have a way to detect
>> that, but it may need some trick on counting on page_requested_count and
>> also provide a signal mechanism.
>>
>> I've drafted one solution here and attached here as a complete patch, feel
>> free to try. If you have anything better, that'll be even nicer; just let
>> me know your thoughts.
>>
>> ---8<---
>> From 384aff2264650872e15916dcfeaec593e5e9b781 Mon Sep 17 00:00:00 2001
>> From: Peter Xu <peterx@redhat.com>
>> Date: Fri, 15 Sep 2023 12:11:34 -0400
>> Subject: [PATCH] migration: Fix race that dest preempt thread close too early
>>
>> We hit intermit CI issue on failing at migration-test over the unit test
>> preempt/plain:
>>
>> qemu-system-x86_64: Unable to read from socket: Connection reset by peer
>> Memory content inconsistency at 5b43000 first_byte = bd last_byte = bc current = 4f hit_edge = 1
>> **
>> ERROR:../tests/qtest/migration-test.c:300:check_guests_ram: assertion failed: (bad == 0)
>> (test program exited with status code -6)
>>
>> Fabiano debugged into it and found that the preempt thread can quit even
>> without receiving all the pages, which can cause guest not receiving all
>> the pages and corrupt the guest memory.
>>
>> To make sure preempt thread finished receiving all the pages, we can rely
>> on the page_requested_count being zero because preempt channel will only
>> receive requested page faults. Note, not all the faulted pages are required
>> to be sent via the preempt channel/thread; imagine the case when a
>> requested page is just queued into the background main channel for
>> migration, the src qemu will just still send it via the background channel.
>>
>> Here instead of spinning over reading the count, we add a condvar so the
>> main thread can wait on it if that unusual case happened, without burning
>> the cpu for no good reason, even if the duration is short; so even if we
>> spin in this rare case is probably fine. It's just better to not do so.
>>
>> The condvar is only used when that special case is triggered. Some memory
>> ordering trick is needed to guarantee it from happening (against the
>> preempt thread status field), so the main thread will always get a kick
>> when that triggers correctly.
>>
>> Closes: https://gitlab.com/qemu-project/qemu/-/issues/1886
>> Debugged-by: Fabiano Rosas <farosas@suse.de>
>> Signed-off-by: Peter Xu <peterx@redhat.com>
>> ---
>> migration/migration.h | 13 ++++++++++++-
>> migration/migration.c | 1 +
>> migration/postcopy-ram.c | 35 +++++++++++++++++++++++++++++++++++
>> 3 files changed, 48 insertions(+), 1 deletion(-)
>>
>> diff --git a/migration/migration.h b/migration/migration.h
>> index c390500604..cdaa10d515 100644
>> --- a/migration/migration.h
>> +++ b/migration/migration.h
>> @@ -196,7 +196,10 @@ struct MigrationIncomingState {
>>
>> /* A tree of pages that we requested to the source VM */
>> GTree *page_requested;
>> - /* For debugging purpose only, but would be nice to keep */
>> + /*
>> + * For postcopy only, count the number of requested page faults that
>> + * still haven't been resolved.
>> + */
>> int page_requested_count;
>> /*
>> * The mutex helps to maintain the requested pages that we sent to the
>> @@ -210,6 +213,14 @@ struct MigrationIncomingState {
>> * contains valid information.
>> */
>> QemuMutex page_request_mutex;
>> + /*
>> + * If postcopy preempt is enabled, there is a chance that the main
>> + * thread finished loading its data before the preempt channel has
>> + * finished loading the urgent pages. If that happens, the two threads
>> + * will use this condvar to synchronize, so the main thread will always
>> + * wait until all pages received.
>> + */
>> + QemuCond page_request_cond;
>>
>> /*
>> * Number of devices that have yet to approve switchover. When this reaches
>> diff --git a/migration/migration.c b/migration/migration.c
>> index d61e572742..e86b3f7368 100644
>> --- a/migration/migration.c
>> +++ b/migration/migration.c
>> @@ -153,6 +153,7 @@ void migration_object_init(void)
>> qemu_sem_init(&current_incoming->postcopy_qemufile_dst_done, 0);
>>
>> qemu_mutex_init(&current_incoming->page_request_mutex);
>> + qemu_cond_init(&current_incoming->page_request_cond);
>> current_incoming->page_requested = g_tree_new(page_request_addr_cmp);
>>
>> migration_object_check(current_migration, &error_fatal);
>> diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
>> index 29aea9456d..dbf02cd3ed 100644
>> --- a/migration/postcopy-ram.c
>> +++ b/migration/postcopy-ram.c
>> @@ -599,6 +599,30 @@ int postcopy_ram_incoming_cleanup(MigrationIncomingState *mis)
>> if (mis->preempt_thread_status == PREEMPT_THREAD_CREATED) {
>> /* Notify the fast load thread to quit */
>> mis->preempt_thread_status = PREEMPT_THREAD_QUIT;
>> + /*
>> + * Update preempt_thread_status before reading count. Note: mutex
>> + * lock only provide ACQUIRE semantic, and it doesn't stops this
>> + * write to be reordered after reading the count.
>> + */
>> + smp_mb();
>> + /*
>> + * It's possible that the preempt thread is still handling the last
>> + * pages to arrive which were requested by guest page faults.
>> + * Making sure nothing is left behind by waiting on the condvar if
>> + * that unlikely case happened.
>> + */
>> + WITH_QEMU_LOCK_GUARD(&mis->page_request_mutex) {
>> + if (qatomic_read(&mis->page_requested_count)) {
>> + /*
>> + * It is guaranteed to receive a signal later, because the
>> + * count>0 now, so it's destined to be decreased to zero
>> + * very soon by the preempt thread.
>> + */
>> + qemu_cond_wait(&mis->page_request_cond,
>> + &mis->page_request_mutex);
>> + }
>> + }
>> + /* Notify the fast load thread to quit */
>> if (mis->postcopy_qemufile_dst) {
>> qemu_file_shutdown(mis->postcopy_qemufile_dst);
>> }
>> @@ -1279,6 +1303,17 @@ static int qemu_ufd_copy_ioctl(MigrationIncomingState *mis, void *host_addr,
>> g_tree_remove(mis->page_requested, host_addr);
>> mis->page_requested_count--;
>> trace_postcopy_page_req_del(host_addr, mis->page_requested_count);
>> + /* Order the update of count and read of preempt status */
>> + smp_mb();
>> + if (qatomic_read(&mis->preempt_thread_status) ==
>> + PREEMPT_THREAD_QUIT) {
>> + /*
>> + * This probably means the main thread is waiting for us.
>> + * Notify that we've finished receiving the last requested
>> + * page.
>> + */
>> + qemu_cond_signal(&mis->page_request_cond);
>> + }
>> }
>> qemu_mutex_unlock(&mis->page_request_mutex);
>> mark_postcopy_blocktime_end((uintptr_t)host_addr);
>> --
>> 2.41.0
>
> I just made some changes on top of this patch..
>
> One major thing is to double check we only kick main thread only if it's
> the last page, as I think there can be >1 pages on the fly..
>
> Since at it, I cleaned up a bit on accessing either the preempt status
> field or the counter, to make it always consistent (always not use atomic
> ops for status as it's defined as volatile anyway; meanwhile always use
> qatomic for the counter).
>
> ---8<---
> From bfdd1f872fbeafe26ebad280a6d9e5a88de621de Mon Sep 17 00:00:00 2001
> From: Peter Xu <peterx@redhat.com>
> Date: Fri, 15 Sep 2023 12:45:01 -0400
> Subject: [PATCH] fixup! migration: Fix race that dest preempt thread close too
> early
>
> Three more small changes:
>
> - Always use qatomic_* for page_requested_count accesses
> - only kick main thread when we're sure it's the last page
> - drop qatomic_read() on reading preempt status, to make it match use
> cases elsewhere, not needed when defined as volatile already
>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
> migration/migration.c | 2 +-
> migration/postcopy-ram.c | 7 ++++---
> 2 files changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/migration/migration.c b/migration/migration.c
> index e86b3f7368..3ee1e6b0d6 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -368,7 +368,7 @@ int migrate_send_rp_req_pages(MigrationIncomingState *mis,
> * things like g_tree_lookup() will return TRUE (1) when found.
> */
> g_tree_insert(mis->page_requested, aligned, (gpointer)1);
> - mis->page_requested_count++;
> + qatomic_inc(&mis->page_requested_count);
> trace_postcopy_page_req_add(aligned, mis->page_requested_count);
> }
> }
> diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
> index dbf02cd3ed..5408e028c6 100644
> --- a/migration/postcopy-ram.c
> +++ b/migration/postcopy-ram.c
> @@ -1301,12 +1301,13 @@ static int qemu_ufd_copy_ioctl(MigrationIncomingState *mis, void *host_addr,
> */
> if (g_tree_lookup(mis->page_requested, host_addr)) {
> g_tree_remove(mis->page_requested, host_addr);
> - mis->page_requested_count--;
> + int left_pages = qatomic_dec_fetch(&mis->page_requested_count);
> +
> trace_postcopy_page_req_del(host_addr, mis->page_requested_count);
> /* Order the update of count and read of preempt status */
> smp_mb();
> - if (qatomic_read(&mis->preempt_thread_status) ==
> - PREEMPT_THREAD_QUIT) {
> + if (mis->preempt_thread_status == PREEMPT_THREAD_QUIT &&
> + left_pages == 0) {
> /*
> * This probably means the main thread is waiting for us.
> * Notify that we've finished receiving the last requested
> --
> 2.41.0
This fixes the issue. It looks ok to me. Do you want me to send a separate patch with it?
* Re: QEMU migration-test CI intermittent failure
2023-09-18 14:15 ` Fabiano Rosas
@ 2023-09-18 15:35 ` Peter Xu
0 siblings, 0 replies; 16+ messages in thread
From: Peter Xu @ 2023-09-18 15:35 UTC (permalink / raw)
To: Fabiano Rosas; +Cc: Stefan Hajnoczi, Juan Quintela, Leonardo Bras, qemu-devel
On Mon, Sep 18, 2023 at 11:15:57AM -0300, Fabiano Rosas wrote:
> This fixes the issue. It looks ok to me. Do you want me to send a separate patch with it?
I've consulted Stefan on how to proceed. I think we want these fixes merged first, even before Juan is back (note: Juan will be away for quite some days, unfortunately..).
Fabiano, would you propose a patchset containing your patches 1-8 and also
the patch I attached that you tested (please make sure the fixup is
squashed into the same patch), so that Stefan can merge it directly before
Juan is back? Please mention in the cover letter that it fixes all known
(two..) possible migration-test failures in CI, so Stefan will be completely
clear on what that patchset is.
Thank you!
--
Peter Xu
end of thread, other threads: [~2023-09-18 15:36 UTC | newest]
Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-09-13 19:23 QEMU migration-test CI intermittent failure Stefan Hajnoczi
2023-09-13 19:42 ` Fabiano Rosas
2023-09-13 19:51 ` Stefan Hajnoczi
2023-09-14 14:56 ` Peter Xu
2023-09-14 15:10 ` Fabiano Rosas
2023-09-14 15:35 ` Peter Xu
2023-09-14 15:57 ` Fabiano Rosas
2023-09-14 16:39 ` Peter Xu
2023-09-14 21:13 ` Fabiano Rosas
2023-09-14 22:54 ` Fabiano Rosas
2023-09-14 23:27 ` Peter Xu
2023-09-15 1:56 ` Fabiano Rosas
2023-09-15 16:28 ` Peter Xu
2023-09-15 16:55 ` Peter Xu
2023-09-18 14:15 ` Fabiano Rosas
2023-09-18 15:35 ` Peter Xu