From: Juan Quintela <quintela@redhat.com>
To: Peter Maydell <peter.maydell@linaro.org>
Cc: Richard Henderson <richard.henderson@linaro.org>,
qemu-devel@nongnu.org, Leonardo Bras <leobras@redhat.com>,
Thomas Huth <thuth@redhat.com>,
Laurent Vivier <lvivier@redhat.com>,
Paolo Bonzini <pbonzini@redhat.com>,
Peter Xu <peterx@redhat.com>,
Lukas Straub <lukasstraub2@web.de>,
"Daniel P. Berrange" <berrange@redhat.com>
Subject: Re: [PULL 00/21] Migration 20230428 patches
Date: Wed, 03 May 2023 15:29:34 +0200
Message-ID: <87ttwtr1a9.fsf@secure.mitica>
In-Reply-To: <CAFEAcA-gu1Xxp49wOdtpif-C04fFd3nFrC+qNa8NizmPq9HGLQ@mail.gmail.com> (Peter Maydell's message of "Wed, 3 May 2023 13:57:55 +0100")

Peter Maydell <peter.maydell@linaro.org> wrote:
> On Wed, 3 May 2023 at 10:17, Juan Quintela <quintela@redhat.com> wrote:
>>
>> Peter Maydell <peter.maydell@linaro.org> wrote:
>> > On Tue, 2 May 2023 at 11:39, Juan Quintela <quintela@redhat.com> wrote:
>> >> Richard, once that we are here, one of the problem that we are having is
>> >> that the test is exiting with an abort, so we have no clue what is
>> >> happening. Is there a way to get a backtrace, or at least the number
>> >
>> > This has been consistently an issue with the migration tests.
>> > As the owner of the tests, if they are not providing you with
>> > the level of detail that you need to diagnose failures, I
>> > think that is something that is in your court to address:
>> > the CI system is always going to only be able to provide
>> > you with what your tests are outputting to the logs.
>>
>> Right now I would be happy just to see what test it is failing at.
>>
>> I am doing something wrong, or from the links that I see on richard
>> email, I am not able to reach anywhere where I can see the full logs.
>>
>> > For the specific case of backtraces from assertion failures,
>> > I think Dan was looking at whether we could put something
>> > together for that. It won't help with segfaults and the like, though.
>>
>> I am waiting for that O:-)
>>
>> > You should be able to at least get the number of the subtest out of
>> > the logs (either directly in the logs of the job, or else
>> > from the more detailed log file that gets stored as a
>> > job artefact in most cases).
>>
>> Also note that the test is stopping in an abort, with no diagnostic
>> message that I can see. But I don't see where the abort cames from:
>
> So, as an example I took the check-system-opensuse log:
> https://gitlab.com/qemu-project/qemu/-/jobs/4201998342
>
> Use your browser's "search in web page" to look for "SIGABRT":
> it'll show you the two errors (as well as the summary at
> the bottom of the page which just says the tests aborted).
> Here's one:
>
> 5/351 qemu:qtest+qtest-x86_64 / qtest-x86_64/migration-test ERROR
> 246.12s killed by signal 6 SIGABRT
> >>> QTEST_QEMU_BINARY=./qemu-system-x86_64 QTEST_QEMU_IMG=./qemu-img
> MALLOC_PERTURB_=48
> QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon
> G_TEST_DBUS_DAEMON=/builds/qemu-project/qemu/tests/dbus-vmstate-daemon.sh
> /builds/qemu-project/qemu/build/tests/qtest/migration-test --tap -k
> ――――――――――――――――――――――――――――――――――――― ✀ ―――――――――――――――――――――――――――――――――――――
> stderr:
> Could not access KVM kernel module: No such file or directory
> Could not access KVM kernel module: No such file or directory
> Could not access KVM kernel module: No such file or directory
> Could not access KVM kernel module: No such file or directory
> Could not access KVM kernel module: No such file or directory
> Could not access KVM kernel module: No such file or directory
> Could not access KVM kernel module: No such file or directory
> Could not access KVM kernel module: No such file or directory
> Could not access KVM kernel module: No such file or directory
> Could not access KVM kernel module: No such file or directory
> **
> ERROR:../tests/qtest/migration-helpers.c:205:wait_for_migration_status:
> assertion failed: (g_test_timer_elapsed() <
> MIGRATION_STATUS_WAIT_TIMEOUT)
> (test program exited with status code -6)
> ――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――
> ▶ 6/351 ERROR:../tests/qtest/migration-helpers.c:205:wait_for_migration_status:
> assertion failed: (g_test_timer_elapsed() <
> MIGRATION_STATUS_WAIT_TIMEOUT) ERROR
> 6/351 qemu:qtest+qtest-aarch64 / qtest-aarch64/migration-test ERROR
> 221.18s killed by signal 6 SIGABRT
>
> Looks like it failed on a timeout in the test code.

Thanks.

> I think there ought to be artefacts from the job which have a
> copy of the full log, but I can't find them: not sure if this
> is just because the gitlab UI is terrible, or if they really
> didn't get generated.

So now we are between a rock and a hard place.

We slowed down the bandwidth for the migration test because on
unloaded machines migration was too fast to need more than one pass.
And we slowed it down so much that now we hit the timer, which was set
at 120 seconds.
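
To make the timeout problem concrete, here is a minimal sketch of a
polling wait that reports what it was waiting for before giving up,
instead of a bare assertion. It is not the code in migration-helpers.c:
wait_for_status(), status_fn, fake_source and fake_status() are
hypothetical names for illustration, and the status strings are just
examples.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Hypothetical callback standing in for the migration status query. */
typedef const char *(*status_fn)(void *opaque);

static double now_seconds(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/*
 * Poll query() until it returns goal or timeout_s seconds have passed.
 * On timeout, print the test name, the elapsed time and the last status
 * seen, then return false so the caller can decide how to fail.
 */
static bool wait_for_status(status_fn query, void *opaque, const char *goal,
                            double timeout_s, const char *test_name)
{
    const struct timespec pause = { .tv_sec = 0, .tv_nsec = 1000000 }; /* 1 ms */
    double start = now_seconds();

    assert(query != NULL && goal != NULL);
    for (;;) {
        const char *status = query(opaque);
        double elapsed = now_seconds() - start;

        if (strcmp(status, goal) == 0) {
            return true;
        }
        if (elapsed >= timeout_s) {
            fprintf(stderr, "%s: timed out after %.1fs waiting for '%s' "
                    "(last status '%s')\n", test_name, elapsed, goal, status);
            return false;
        }
        nanosleep(&pause, NULL);
    }
}

/* Fake status source for the demo: "active" for a few polls, then done. */
struct fake_source {
    int polls_left; /* negative means it never completes */
};

static const char *fake_status(void *opaque)
{
    struct fake_source *src = opaque;

    if (src->polls_left == 0) {
        return "completed";
    }
    if (src->polls_left > 0) {
        src->polls_left--;
    }
    return "active";
}
```

With something along these lines, a run that hits the timer would at
least say which test was waiting and what state it was stuck in.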

So .....

It is going to be interesting.

BTW, what processor speed do those aarch64 machines have?  Or are they
so loaded that they are effectively thrashing?

2 minutes for a pass looks a bit too much.

I will try to fix this test by changing when we detect that we are not
moving to the completion stage.

Thanks for the explanation on where to find the data.  The other issue
is that what I really want is to know which test failed.  I can't see
a way to get that info.  According to Daniel's answer, we don't upload
those files for tests that fail.
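
For when a full TAP log is available, a rough sketch of how the failing
subtest could be located: find the last "ok N ..." or "not ok N ..."
line, and the test that aborted is the one that would have come next.
This assumes the usual TAP line shape; last_reported_test() is a
hypothetical helper, not something that exists in the tree.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Scan TAP text and return the number of the last test reported with
 * "ok N ..." or "not ok N ...".  Returns 0 if no such line is found.
 * The text after the number on that line is copied into name.
 */
static long last_reported_test(const char *tap, char *name, size_t name_len)
{
    long last = 0;
    const char *line = tap;

    assert(tap != NULL && name != NULL && name_len > 0);
    name[0] = '\0';

    while (*line != '\0') {
        const char *end = strchr(line, '\n');
        size_t len = end ? (size_t)(end - line) : strlen(line);
        const char *p = NULL;

        if (len > 7 && strncmp(line, "not ok ", 7) == 0) {
            p = line + 7;
        } else if (len > 3 && strncmp(line, "ok ", 3) == 0) {
            p = line + 3;
        }
        /* Only accept a digit here so strtol() cannot run past the line. */
        if (p != NULL && *p >= '0' && *p <= '9') {
            char *rest;
            size_t copy;

            last = strtol(p, &rest, 10);
            while (*rest == ' ') {
                rest++;
            }
            copy = len - (size_t)(rest - line);
            if (copy > name_len - 1) {
                copy = name_len - 1;
            }
            memcpy(name, rest, copy);
            name[copy] = '\0';
        }
        line = end ? end + 1 : line + len;
    }
    return last;
}
```

If the last reported test is number N, the abort happened somewhere in
test N+1, which narrows things down even without a backtrace.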

Later, Juan.

Thread overview:
2023-04-28 19:11 [PULL 00/21] Migration 20230428 patches Juan Quintela
2023-04-28 19:11 ` [PULL 01/21] multifd: We already account for this packet on the multifd thread Juan Quintela
2023-04-28 19:11 ` [PULL 02/21] migration: Move ram_stats to its own file migration-stats.[ch] Juan Quintela
2023-04-28 19:11 ` [PULL 03/21] migration: Rename ram_counters to mig_stats Juan Quintela
2023-04-28 19:11 ` [PULL 04/21] migration: Rename RAMStats to MigrationAtomicStats Juan Quintela
2023-04-28 19:11 ` [PULL 05/21] migration/rdma: Split the zero page case from acct_update_position Juan Quintela
2023-04-28 19:11 ` [PULL 06/21] migration/rdma: Unfold last user of acct_update_position() Juan Quintela
2023-04-28 19:11 ` [PULL 07/21] migration: Drop unused parameter for migration_tls_get_creds() Juan Quintela
2023-04-28 19:11 ` [PULL 08/21] migration: Drop unused parameter for migration_tls_client_create() Juan Quintela
2023-04-28 19:11 ` [PULL 09/21] qtest/migration-test.c: Add tests with compress enabled Juan Quintela
2023-04-28 19:11 ` [PULL 10/21] qtest/migration-test.c: Add postcopy " Juan Quintela
2023-04-28 19:11 ` [PULL 11/21] ram.c: Let the compress threads return a CompressResult enum Juan Quintela
2023-04-28 19:11 ` [PULL 12/21] ram.c: Dont change param->block in the compress thread Juan Quintela
2023-04-28 19:11 ` [PULL 13/21] ram.c: Reset result after sending queued data Juan Quintela
2023-04-28 19:11 ` [PULL 14/21] ram.c: Do not call save_page_header() from compress threads Juan Quintela
2023-04-28 19:11 ` [PULL 15/21] ram.c: Call update_compress_thread_counts from compress_send_queued_data Juan Quintela
2023-04-28 19:11 ` [PULL 16/21] ram.c: Remove last ram.c dependency from the core compress code Juan Quintela
2023-04-28 19:11 ` [PULL 17/21] ram.c: Move core compression code into its own file Juan Quintela
2023-04-28 19:12 ` [PULL 18/21] ram.c: Move core decompression " Juan Quintela
2023-04-28 19:12 ` [PULL 19/21] ram compress: Assert that the file buffer matches the result Juan Quintela
2023-04-28 19:12 ` [PULL 20/21] ram-compress.c: Make target independent Juan Quintela
2023-04-28 19:12 ` [PULL 21/21] migration: Initialize and cleanup decompression in migration.c Juan Quintela
2023-04-29 18:45 ` [PULL 00/21] Migration 20230428 patches Richard Henderson
2023-04-29 20:14 ` Lukas Straub
2023-04-29 22:08 ` Richard Henderson
2023-05-02 10:39 ` Juan Quintela
2023-05-02 10:43 ` Peter Maydell
2023-05-03 9:17 ` Juan Quintela
2023-05-03 12:57 ` Daniel P. Berrangé
2023-05-03 12:57 ` Peter Maydell
2023-05-03 13:29 ` Juan Quintela [this message]
2023-05-03 13:58 ` Peter Maydell
2023-05-08 1:06 ` Lukas Straub
2023-05-08 8:12 ` Juan Quintela
2023-05-08 9:47 ` Lukas Straub