From: Claudio Fontana <cfontana@suse.de>
To: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: "Jason J. Herne" <jjherne@linux.ibm.com>,
Fam Zheng <fam@euphon.net>, Thomas Huth <thuth@redhat.com>,
Peter Maydell <peter.maydell@linaro.org>,
Cornelia Huck <cohuck@redhat.com>,
qemu-devel <qemu-devel@nongnu.org>,
Juan Quintela <quintela@redhat.com>,
Stefan Hajnoczi <stefanha@redhat.com>,
Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: migration: broken snapshot saves appear on s390 when small fields in migration stream removed
Date: Mon, 13 Jul 2020 13:39:58 +0200 [thread overview]
Message-ID: <c5878d54-e8b5-b1e7-a664-2d2ab36015e6@suse.de> (raw)
In-Reply-To: <20200713110333.GE3122@work-vm>
Hello,
On 7/13/20 1:03 PM, Dr. David Alan Gilbert wrote:
> * Claudio Fontana (cfontana@suse.de) wrote:
>> Hello all,
>>
>> during unrelated work for splitting QTest from the TCG instruction counting module,
>>
>> I encountered what seems to be a migration stream issue, which is apparent only on s390, and only shows in block test 267.
>>
>> ./check -qcow2 267
>>
>> when it comes to snapshot save and load using backing file.
>>
>> Here is a minimal reproducer patch that causes the issue on s390 only.
>>
>> --------------------------------------------cut-------------------------------------------
>> diff --git a/cpus.c b/cpus.c
>> index 41d1c5099f..443b88697a 100644
>> --- a/cpus.c
>> +++ b/cpus.c
>> @@ -643,7 +643,7 @@ static void qemu_account_warp_timer(void)
>>
>> static bool icount_state_needed(void *opaque)
>> {
>> - return use_icount;
>> + return 0;
>> }
>
> That's weird; I mean that's just turning a subsection on and off;
> so you'd hope if this is a test that generates it's own snapshot and
> then uses it then it should be consistent.
Indeed.
>
>> static bool warp_timer_state_needed(void *opaque)
>> --------------------------------------------cut-------------------------------------------
>>
>> config.status configure line:
>> exec '/home/cfontana/qemu-build/../qemu/configure' '--enable-tcg' '--disable-kvm' '--disable-hax' '--target-list=s390x-softmmu' '--enable-debug'
>>
>> $ make check-block
>>
>> TEST iotest-qcow2: 267 [fail]
>> QEMU -- "/home/cfontana/qemu-build/tests/qemu-iotests/../../s390x-softmmu/qemu-system-s390x" -nodefaults -display none -accel qtest
>> QEMU_IMG -- "/home/cfontana/qemu-build/tests/qemu-iotests/../../qemu-img"
>> QEMU_IO -- "/home/cfontana/qemu-build/tests/qemu-iotests/../../qemu-io" --cache writeback --aio threads -f qcow2
>> QEMU_NBD -- "/home/cfontana/qemu-build/tests/qemu-iotests/../../qemu-nbd"
>> IMGFMT -- qcow2 (compat=1.1)
>> IMGPROTO -- file
>> PLATFORM -- Linux/s390x s390zp13 5.3.18-21-default
>> TEST_DIR -- /home/cfontana/qemu-build/tests/qemu-iotests/scratch
>> SOCK_DIR -- /tmp/tmp.bLJcJVtzk5
>> SOCKET_SCM_HELPER -- /home/cfontana/qemu-build/tests/qemu-iotests/socket_scm_helper
>>
>> --- /home/cfontana/qemu/tests/qemu-iotests/267.out 2020-07-12 05:10:07.948262675 -0400
>> +++ /home/cfontana/qemu-build/tests/qemu-iotests/267.out.bad 2020-07-12 05:27:03.358362781 -0400
>> @@ -137,6 +137,9 @@
>> ID TAG VM SIZE DATE VM CLOCK
>> -- snap0 SIZE yyyy-mm-dd hh:mm:ss 00:00:00.000
>> (qemu) loadvm snap0
>> +Unexpected storage key flag data: 0
>> +error while loading state for instance 0x0 of device 's390-skeys'
>> +Error: Error -22 while loading VM state
>> (qemu) quit
>>
>>
>> -----------
>>
>>
>> Not run: 172 186 192 259 287
>> Failures: 267
>> Failed 1 of 115 iotests
>> make: *** [/home/cfontana/qemu/tests/Makefile.include:880: check-tests/check-block.sh] Error 1
>>
>> -----------
>>
>> Note: only the === -blockdev with a backing file === part of test 267 fails. -blockdev with NBD is ok, like all the rest.
>>
>>
>> Interesting facts about s390 in particular: its save/load code includes the transfer of "storage keys",
>> which include a buffer of 32768 bytes of keydata in the stream.
>>
>> The code (hw/s390x/s390-skeys.c),
>> is modeled similarly to RAM transfer (like in migration/ram.c), with an EOS (end of stream) marker.
>>
>> Countrary to RAM transfer code though, after qemu_put_be64(f, EOS), the s390 code does not qemu_fflush(f).
>>
>> ----------
>> Observation: the migration/qemu-file.c shows an IO_BUF_SIZE of 32768.
>>
>> --
>>
>> The following workarounds hide the problem (make the test pass):
>>
>> 1) always including the icount field in the (unrelated) timers field that are sent before in the migration stream (ie not applying the reproducer patch).
>>
>> 2) increasing the IO_BUF_SIZE also hides the problem:
>>
>> ----------------------cut--------------------------
>> diff --git a/migration/qemu-file.c b/migration/qemu-file.c
>> index be21518c57..f81d1272eb 100644
>> --- a/migration/qemu-file.c
>> +++ b/migration/qemu-file.c
>> @@ -30,7 +30,7 @@
>> #include "trace.h"
>> #include "qapi/error.h"
>>
>> -#define IO_BUF_SIZE 32768
>> +#define IO_BUF_SIZE 65536
>> #define MAX_IOV_SIZE MIN_CONST(IOV_MAX, 64)
>>
>> struct QEMUFile {
>> ----------------------cut--------------------------
>>
>> 3) adding a qemu_fflush in hw/s390x/s390-skeys.c after EOS also "fixes" the problem:
>>
>> ----------------------cut--------------------------
>> diff --git a/hw/s390x/s390-skeys.c b/hw/s390x/s390-skeys.c
>> index 1e036cc602..47c9a015af 100644
>> --- a/hw/s390x/s390-skeys.c
>> +++ b/hw/s390x/s390-skeys.c
>> @@ -252,6 +252,8 @@ static const TypeInfo qemu_s390_skeys_info = {
>> .class_size = sizeof(S390SKeysClass),
>> };
>>
>> +extern void qemu_fflush(QEMUFile *f);
>> +
>> static void s390_storage_keys_save(QEMUFile *f, void *opaque)
>> {
>> S390SKeysState *ss = S390_SKEYS(opaque);
>> @@ -302,6 +304,7 @@ static void s390_storage_keys_save(QEMUFile *f, void *opaque)
>> g_free(buf);
>> end_stream:
>> qemu_put_be64(f, eos);
>> + qemu_fflush(f);
>> }
>>
>> static int s390_storage_keys_load(QEMUFile *f, void *opaque, int version_id)
>> ----------------------cut--------------------------
>>
>> Do any of you with better understanding of migration/, block and s390 have a suggestion on what could be the issue here,
>> and what could be the next step in the investigation?
>>
>> Is the fact that migration/ram.c always does fflush after writing the EOS have any relevance here? why does it do it,
>> and should s390 code also follow the same pattern?
>
> I didn't think it was required.
> And qemu_put_buffer loops if needed and calls qemu_fflush internally.
This detail probably does not help in this case, because the buffer itself is fine (32678 zero bytes is actually correct).
The End Of Stream marker written just after with qemu_put_be64 is not ok,
and anything past that point is just a whole bunch of zeroes in the stream when we hookup the calls to
qcow2_co_pwritev_task and qcow2_co_preadv_task
(see my response to Paolo in this thread).
> It's possible here that the storage key code is just the canary - the
> first thing that detects that the stream is invalid after it all goes
> wrong.
Seems the exact opposite to me, ie, the stream seems just fine up to the point where the EOS marker after the keydata in "s390-skeys" should be.
Thanks!
Claudio
>
> Dave
>
>
>> Thanks,
>>
>> Claudio
>>
>>
>> --
>> Claudio Fontana
>> Engineering Manager Virtualization, SUSE Labs Core
>>
>> SUSE Software Solutions Italy Srl
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
>
next prev parent reply other threads:[~2020-07-13 11:41 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-07-12 10:00 migration: broken snapshot saves appear on s390 when small fields in migration stream removed Claudio Fontana
2020-07-12 16:11 ` Paolo Bonzini
2020-07-13 9:11 ` Claudio Fontana
2020-07-14 14:29 ` Claudio Fontana
2020-07-14 14:35 ` Thomas Huth
2020-07-15 11:10 ` Claudio Fontana
2020-07-15 12:25 ` Claudio Fontana
2020-07-16 12:58 ` Claudio Fontana
2020-07-20 18:24 ` Claudio Fontana
2020-07-21 8:22 ` Claudio Fontana
2020-07-27 23:09 ` Bruce Rogers
2020-07-28 8:15 ` Vladimir Sementsov-Ogievskiy
2020-07-28 8:43 ` Vladimir Sementsov-Ogievskiy
2020-07-28 13:23 ` Bruce Rogers
2020-07-28 11:10 ` Max Reitz
2020-07-28 11:27 ` Vladimir Sementsov-Ogievskiy
2020-07-28 11:33 ` Vladimir Sementsov-Ogievskiy
2020-07-28 11:35 ` Paolo Bonzini
2020-07-28 11:45 ` Max Reitz
2020-07-28 12:09 ` Paolo Bonzini
2020-07-28 12:47 ` Claudio Fontana
2020-07-13 11:03 ` Dr. David Alan Gilbert
2020-07-13 11:39 ` Cornelia Huck
2020-07-13 11:39 ` Claudio Fontana [this message]
2020-07-13 11:45 ` Claudio Fontana
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=c5878d54-e8b5-b1e7-a664-2d2ab36015e6@suse.de \
--to=cfontana@suse.de \
--cc=cohuck@redhat.com \
--cc=dgilbert@redhat.com \
--cc=fam@euphon.net \
--cc=jjherne@linux.ibm.com \
--cc=pbonzini@redhat.com \
--cc=peter.maydell@linaro.org \
--cc=qemu-devel@nongnu.org \
--cc=quintela@redhat.com \
--cc=stefanha@redhat.com \
--cc=thuth@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).