From: Juan Quintela <quintela@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: qemu-devel@nongnu.org, "Paolo Bonzini" <pbonzini@redhat.com>,
	"Laurent Vivier" <lvivier@redhat.com>,
	"Thomas Huth" <thuth@redhat.com>,
	"Peter Maydell" <peter.maydell@linaro.org>,
	"Daniel P . Berrangé" <berrange@redhat.com>,
	"Leonardo Bras" <leobras@redhat.com>,
	"Yury Kotov" <yury-kotov@yandex-team.ru>
Subject: Re: [PATCH 12/42] migration-test: Enable back ignore-shared test
Date: Wed, 21 Jun 2023 23:53:41 +0200
Message-ID: <87legc8ot6.fsf@secure.mitica>
In-Reply-To: <ZJNVRQfPRxhbLpfZ@x1n> (Peter Xu's message of "Wed, 21 Jun 2023 15:53:41 -0400")

Peter Xu <peterx@redhat.com> wrote:
> On Wed, Jun 21, 2023 at 09:38:08PM +0200, Juan Quintela wrote:
>> Peter Xu <peterx@redhat.com> wrote:
>> > On Fri, Jun 09, 2023 at 12:49:13AM +0200, Juan Quintela wrote:
>> >> It failed on aarch64 tcg, let's see if that is still the case.
>> >> 
>> >> Signed-off-by: Juan Quintela <quintela@redhat.com>
>> >
>> > According to the history:
>> >
>> > https://lore.kernel.org/all/20190305180635.GA3803@work-vm/
>> >
>> > It's never enabled, and not sure whether Yury followed it up.  Juan: have
>> > you tried it out on aarch64 before enabling it again?  I assume we rely on
>> > the previous patch but that doesn't even sound like aarch64 specific.  I
>> > worry it'll just keep failing on aarch64.
>> 
>> Hi
>> 
>> I am resending this series.
>> 
>> I tested this hard this time, on an x86_64 host.
>> Two build directories:
>> - x86_64 (I just build qemu-system-x86_64, kvm)
>> - aarch64 (I just build qemu-system-aarch64, tcg)
>> 
>> Everything is run as:
>> 
>> while true; do $command || break; done
>> 
>> And run this:
>> - x86_64:
>>   * make check (nit: you can't run two make checks on the same
>>     directory)
>>   * 4 instances of ./tests/qtest/migration-test
>>   * 2 instances of ./tests/qtest/migration-test -p /x86_64/migration/multifd/tcp/plain/cancel
>>   * 2 instances of ./tests/qtest/migration-test -p /x86_64/migration/ignore_shared
>> 
>> - aarch64:
>>   The same with s/x86_64/aarch64/
>> 
>> And left it running for 6 hours.  No errors.
>> Machine has enough RAM for running this (128GB) and 18 cores (intel
>> i9900K).
>> Load of the machine while running these tests is around 50 (I really hope
>> that our CI hosts have less load).
>> 
>> I ran master with the same configuration.  In less than 10 minutes I get
>> the dreaded:
>> 
>> # starting QEMU: exec ./qemu-system-aarch64 -qtest unix:/tmp/qtest-3264370.sock -qtest-log /dev/null -chardev socket,path=/tmp/qtest-3264370.qmp,id=char0 -mon chardev=char0,mode=control -display none -accel kvm -accel tcg -machine virt,gic-version=max -name target,debug-threads=on -m 150M -serial file:/tmp/migration-test-1A1461/dest_serial -incoming defer -cpu max -kernel /tmp/migration-test-1A1461/bootsect    -accel qtest
>> Broken pipe
>> ../../../../../mnt/code/qemu/multifd/tests/qtest/libqtest.c:195: kill_qemu() detected QEMU death from signal 6 (Aborted) (core dumped)
>> Aborted (core dumped)
>> $
>> 
>> On multifd+cancel.
>> 
>> I have never been able to get ignore_shared to fail on my machine.
>> But I didn't test aarch64 TCG this hard in the past, and on x86_64 it
>> has always worked for me.
>
> Thanks a lot, Juan.
>
> Do you mean master is broken with QEMU_TEST_FLAKY_TESTS=1?

Yep.  I mean multifd+cancel.  That is the reason why we put it behind
the QEMU_TEST_FLAKY_TESTS check.
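
A minimal sketch of that kind of stress loop, for anyone who wants to
reproduce it.  run_until_failure and its -n cap are hypothetical
additions, not part of QEMU; the loop itself is just the
"while true; do $command || break; done" from above:

```shell
#!/bin/sh
# run_until_failure: hypothetical helper, not part of QEMU.
# Reruns a command until it fails, or until an optional cap is reached.
# Usage: run_until_failure [-n MAX] command [args...]
run_until_failure() {
    max=0                       # 0 means "no limit"
    if [ "$1" = "-n" ]; then
        max=$2
        shift 2
    fi
    count=0
    while [ "$max" -eq 0 ] || [ "$count" -lt "$max" ]; do
        # Stop at the first failing run and report how far we got.
        "$@" || { echo "failed after $count successful runs" >&2; return 1; }
        count=$((count + 1))
    done
    echo "completed $count runs without failure"
}
```

Assuming it is sourced from the aarch64 build directory, with
QTEST_QEMU_BINARY pointing at the binary built there, usage would look
something like:

    export QTEST_QEMU_BINARY=./qemu-system-aarch64
    export QEMU_TEST_FLAKY_TESTS=1
    run_until_failure ./tests/qtest/migration-test \
        -p /aarch64/migration/multifd/tcp/plain/cancel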

> And after the
> whole series is applied, we cannot trigger the issue in the few-hour test
> even with it?

Yep.

> Shall we wait for another 1-2 days to see whether Yury would comment
> (before you repost)?  Otherwise I agree if it survives your few-hours test
> we should give it a try - at least according to Dave's comment before it
> was failing easily, but it is not now on the test bed.

From the v2 series that I am about to post:

    migration-test: Re-enable multifd_cancel test

    Why?
    - migration/multifd: Protect accesses to migration_threads
      This patch fixed the memory corruption problem.
    - migration-test: Move serial to GuestState
      Now we use the guest name as the serial file name.  In the past
      there was a conflict because vm "to" and vm "to2" used the same
      file name.
    - migration-test: Wait for first target to finish
      Now we wait for vm "to" to finish before launching "to2", so we
      avoid similar problems in the future.

    Signed-off-by: Juan Quintela <quintela@redhat.com>

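The last two points can be illustrated with a small shell sketch.  It is
not the migration-test code (that is C); serial_path and fake_guest are
hypothetical stand-ins, and the "<name>_serial" naming is an assumption
made only for the example:

```shell
#!/bin/sh
# Hypothetical helper: derive the serial file path from the guest name,
# so that guests "to" and "to2" can never share a file.
serial_path() {
    printf '%s/%s_serial\n' "$1" "$2"    # $1 = tmp dir, $2 = guest name
}

# Stand-in for launching a guest; the real test would start QEMU here.
fake_guest() {
    echo "guest $2 running" > "$(serial_path "$1" "$2")"
}

tmpdir=$(mktemp -d)              # the caller removes this when done

fake_guest "$tmpdir" to &        # launch the first target in the background
first=$!
wait "$first"                    # wait for "to" to finish ...
fake_guest "$tmpdir" to2         # ... before launching "to2"

ls "$tmpdir"                     # one serial file per guest
```

With one shared file name, the second guest would write over the first
guest's serial output; with per-guest names plus the wait, neither the
files nor the processes overlap.
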

> Maybe it's still just hidden, but in that case I also agree enabling it in
> the repo is the simplest way to reproduce the failure again, if we still
> ever want to enable it one day..

We do.  If it still fails, we want to know why and fix it.

Later, Juan.



