From: "Alex Bennée" <alex.bennee@linaro.org>
To: Thomas Huth <thuth@redhat.com>,
qemu-devel@nongnu.org,
Richard Henderson <richard.henderson@linaro.org>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Subject: Re: Failing avocado tests in CI (was: Re: [PULL 00/24] tcg + linux-user queue for 8.1-rc3)
Date: Thu, 24 Aug 2023 16:31:25 +0100
Message-ID: <87sf88fobd.fsf@linaro.org>
In-Reply-To: <59a970fb-ad7b-d30b-1290-7b167bec0226@linaro.org>
Richard Henderson <richard.henderson@linaro.org> writes:
> On 8/23/23 06:04, Thomas Huth wrote:
>> On 06/08/2023 05.36, Richard Henderson wrote:
>>> The following changes since commit 6db03ccc7f4ca33c99debaac290066f4500a2dfb:
>>>
>>> Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into
>>> staging (2023-08-04 14:47:00 -0700)
>>>
>>> are available in the Git repository at:
>>>
>>> https://gitlab.com/rth7680/qemu.git tags/pull-tcg-20230805
>>>
>>> for you to fetch changes up to 843246699425adfb6b81f927c16c9c6249b51e1d:
>>>
>>> linux-user/elfload: Set V in ELF_HWCAP for RISC-V (2023-08-05 18:17:20 +0000)
>>>
>>> ----------------------------------------------------------------
>>> accel/tcg: Do not issue misaligned i/o
>>> accel/tcg: Call save_iotlb_data from io_readx
>>> gdbstub: use 0 ("any process") on packets with no PID
>>> linux-user: Fixes for MAP_FIXED_NOREPLACE
>>> linux-user: Fixes for brk
>>> linux-user: Adjust task_unmapped_base for reserved_va
>>> linux-user: Use ELF_ET_DYN_BASE for ET_DYN with interpreter
>>> linux-user: Remove host != guest page size workarounds in brk and image load
>>> linux-user: Set V in ELF_HWCAP for RISC-V
>>> *-user: Remove last_brk as unused
>> Hi Richard,
>> I noticed that we currently have two failing Avocado jobs in our CI,
>> avocado-system-centos and avocado-system-opensuse, where the
>> boot_linux.py:BootLinuxX8664.test_pc_i440fx_tcg and the
>> boot_linux.py:BootLinuxX8664.test_pc_q35_tcg are now apparently
>> crashing. If I've got the history right, it started with your pull
>> request here; in the preceding one from Paolo, everything is still
>> green:
>> https://gitlab.com/qemu-project/qemu/-/pipelines/956543770
>> But here the jobs started failing:
>> https://gitlab.com/qemu-project/qemu/-/pipelines/957458385
>> Could you please have a look?
>
> It's some sort of timing issue, which sometimes goes away when re-run.
> I was re-running tests *a lot* in order to get them to go green while
> running the 8.1 release.
There is a definite regression point for the test_pc_q35 case:
./tests/venv/bin/avocado run ./tests/avocado/boot_linux.py:BootLinuxX8664.test_pc_q35_tcg
JOB ID : b8ea329d3353db7a47eb955fcad2f26b2dbe9f29
JOB LOG : /home/alex.bennee/avocado/job-results/job-2023-08-24T15.27-b8ea329/job.log
(1/1) ./tests/avocado/boot_linux.py:BootLinuxX8664.test_pc_q35_tcg: PASS (110.70 s)
RESULTS : PASS 1 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 0 | CANCEL 0
JOB TIME : 111.22 s
🕙15:29:06 alex.bennee@hackbox2:qemu.git/builds/bisect (190aba8) (BISECTING) [$!?] took 1m51s
➜ make -j30
[1/8] Generating qemu-version.h with a custom command (wrapped by meson to capture output)
[2/8] Compiling C object qga/qemu-ga.p/main.c.o
[3/8] Compiling C object libqmp.fa.p/monitor_qmp-cmds-control.c.o
[4/8] Compiling C object libqemu-x86_64-softmmu.fa.p/accel_tcg_cputlb.c.o
[5/8] Compiling C object libcommon.fa.p/softmmu_vl.c.o
[6/8] Linking static target libqmp.fa
[7/8] Linking target qga/qemu-ga
[8/8] Linking target qemu-system-x86_64
🕙15:30:12 alex.bennee@hackbox2:qemu.git/builds/bisect (f7eaf9d) (BISECTING) [$!?] took 5s
➜ ./tests/venv/bin/avocado run ./tests/avocado/boot_linux.py:BootLinuxX8664.test_pc_q35_tcg
JOB ID : 56768272dee373062792251ee3445cc81092634e
JOB LOG : /home/alex.bennee/avocado/job-results/job-2023-08-24T15.30-5676827/job.log
(1/1) ./tests/avocado/boot_linux.py:BootLinuxX8664.test_pc_q35_tcg: INTERRUPTED: Test interrupted by SIGTERM\nRunner error occurred: Timeout reached\nOriginal status: ERROR\n{'name': '1-./tests/avocado/boot_linux.py:BootLinuxX8664.test_pc_q35_tcg', 'logdir': '/home/alex.bennee/avocado/job-results/job-2023-08-24T15.30-5676827/test-results... (480.28 s)
RESULTS : PASS 0 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 1 | CANCEL 0
JOB TIME : 480.80 s
which bisects to:
commit f7eaf9d702efdd02481d5f1c25f7d8e0ffb64c6e (HEAD, refs/bisect/bad)
Author: Richard Henderson <richard.henderson@linaro.org>
Date: Tue Aug 1 10:46:03 2023 -0700
accel/tcg: Do not issue misaligned i/o
In the single-page case we were issuing misaligned i/o to
the memory subsystem, which does not handle it properly.
Split such accesses via do_{ld,st}_mmio_*.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1800
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
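
The manual good/bad runs above could be automated with `git bisect run`, which treats exit status 0 as good, 125 as "skip this commit", and any other status from 1 to 127 as bad. Below is a minimal sketch, not something from the thread, that turns the avocado `RESULTS` summary line into such a status. The helper names `parse_results` and `bisect_status` are hypothetical, and treating a missing summary as "skip" is an assumption rather than avocado or QEMU policy.

```python
import re

# Exit statuses understood by `git bisect run`.
GOOD, BAD, SKIP = 0, 1, 125

# Matches the summary line avocado prints at the end of a job.
RESULTS_RE = re.compile(r"^RESULTS\s*:\s*(.*)$", re.MULTILINE)


def parse_results(output):
    """Return the counters from avocado's RESULTS line as a dict."""
    match = RESULTS_RE.search(output)
    if match is None:
        return {}
    counters = {}
    for field in match.group(1).split("|"):
        name, _, value = field.strip().rpartition(" ")
        if name and value.isdigit():
            counters[name.strip()] = int(value)
    return counters


def bisect_status(output):
    """Map avocado output to an exit status for `git bisect run`."""
    counters = parse_results(output)
    if not counters:
        # Assumption: no summary means the build or runner broke,
        # so the commit cannot be judged either way.
        return SKIP
    failures = sum(counters.get(k, 0) for k in ("ERROR", "FAIL", "INTERRUPT"))
    if failures:
        return BAD
    return GOOD if counters.get("PASS", 0) > 0 else SKIP


# The two summary lines from the runs quoted above.
passing = "RESULTS : PASS 1 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 0 | CANCEL 0"
timed_out = "RESULTS : PASS 0 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 1 | CANCEL 0"

print(bisect_status(passing))    # 0
print(bisect_status(timed_out))  # 1
```

A wrapper script that rebuilds, runs the avocado test, and exits with this status could then be handed to `git bisect run`. Since the failure is reported as a timing issue that sometimes goes away on re-run, a single passing run may not be enough to mark a commit good.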
>
> For instance, with very little added except for your s390x pull, the
> same BootLinuxX8664.test_pc_i440fx_tcg test passes:
>
> https://gitlab.com/qemu-project/qemu/-/jobs/4931341744#L136
>
> In the failing i440fx_tcg test, you can even see it's a timing issue:
>
> https://qemu-project.gitlab.io/-/qemu/-/jobs/4813804725/artifacts/build/tests/results/latest/test-results/02-tests_avocado_boot_linux.py_BootLinuxX8664.test_pc_i440fx_tcg/debug.log
>
> 23:42:30 DEBUG| [ 61.003328] Sending NMI from CPU 0 to CPUs 1:
> 23:42:30 DEBUG| [ 61.007829] INFO: NMI handler
> (nmi_cpu_backtrace_handler) took too long to run: 2.622 msecs
> 23:42:30 DEBUG| [ 61.003328] NMI backtrace for cpu 1 skipped: idling
> at native_safe_halt+0xe/0x10
> 23:42:30 DEBUG| [ 61.003328] rcu: rcu_sched kthread starved for
> 60002 jiffies! g-963 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=1
> 23:42:30 DEBUG| [ 61.003328] rcu: RCU grace-period kthread stack dump:
> 23:42:30 DEBUG| [ 61.003328] rcu_sched I 0 10 2 0x80004000
> 23:42:30 DEBUG| [ 61.003328] Call Trace:
> 23:42:30 DEBUG| [ 61.003328] ? __schedule+0x29f/0x680
> ...
>
>
> r~
--
Alex Bennée
Virtualisation Tech Lead @ Linaro