* tests/avocado/riscv_opensbi.py does not work reliable
From: Thomas Huth @ 2024-08-30 15:34 UTC (permalink / raw)
To: QEMU Developers, qemu-riscv, Daniel Henrique Barboza
Cc: Bin Meng, Alistair Francis
Hi!
While running a lot of tests (i.e. with a very loaded machine), I noticed
that tests/avocado/riscv_opensbi.py is very flaky when the host machine is
slow. I can easily reproduce the problem by running a big compilation job
on all CPUs in the background and then running the riscv_opensbi.py avocado
test. One of test_riscv32_spike, test_riscv64_spike, test_riscv32_sifive_u
or test_riscv64_sifive_u is failing most of the time (but not the virt
machine tests).
Looking at the logs, it seems like the output sometimes stops somewhere at a
random place before the boot process reaches the spot that the test is
looking for. Looking at riscv_htif.c, there does not seem to be any flow
control implemented here, so I guess at least the spike test is currently
doomed to fail occasionally. Is there anything that can be done about this
(e.g. is flow control somehow possible here or does the interface not allow
this?)? Otherwise, I think it might be best to mark the spike and sifive_u
tests with QEMU_TEST_FLAKY_TESTS here to make it clear that these tests are
not reliable by default...?
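For illustration, here is a minimal sketch of what such a marking could look like, assuming the test is gated on the QEMU_TEST_FLAKY_TESTS environment variable through a skip decorator. The class and helper names are hypothetical, the test bodies are placeholders, and plain unittest stands in for the avocado test class:

```python
import os
import unittest

FLAKY_ENV = "QEMU_TEST_FLAKY_TESTS"


def make_test_case():
    """Build the (hypothetical) test class; the env var is read here."""

    class RiscvOpensbiSketch(unittest.TestCase):
        # Skipped unless QEMU_TEST_FLAKY_TESTS is set to a non-empty value.
        @unittest.skipUnless(os.getenv(FLAKY_ENV),
                             "console output can be truncated on a loaded host")
        def test_riscv64_spike(self):
            # Placeholder for booting the spike machine and waiting
            # for the expected console output.
            self.assertTrue(True)

        # Not marked: the virt machine tests were not seen failing.
        def test_riscv64_virt(self):
            self.assertTrue(True)

    return RiscvOpensbiSketch


def run(case):
    """Run all tests of the given class and return the TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(case)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

With the variable unset, the spike test is reported as skipped and the virt test still runs; exporting QEMU_TEST_FLAKY_TESTS=1 before the run enables both.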
Thomas
* Re: tests/avocado/riscv_opensbi.py does not work reliable
From: Alistair Francis @ 2024-09-02 23:54 UTC (permalink / raw)
To: Thomas Huth
Cc: QEMU Developers, qemu-riscv, Daniel Henrique Barboza, Bin Meng,
Alistair Francis
On Sat, Aug 31, 2024 at 1:35 AM Thomas Huth <thuth@redhat.com> wrote:
>
>
> Hi!
>
> While running a lot of tests (i.e. with a very loaded machine), I noticed
> that tests/avocado/riscv_opensbi.py is very flaky when the host machine is
> slow. I can easily reproduce the problem by running a big compilation job
> on all CPUs in the background and then running the riscv_opensbi.py avocado
> test. One of test_riscv32_spike, test_riscv64_spike, test_riscv32_sifive_u
> or test_riscv64_sifive_u is failing most of the time (but not the virt
> machine tests).
>
> Looking at the logs, it seems like the output sometimes stops somewhere at a
> random place before the boot process reaches the spot that the test is
> looking for. Looking at riscv_htif.c, there does not seem to be any flow

I suspect this is: https://gitlab.com/qemu-project/qemu/-/issues/2114

> control implemented here, so I guess at least the spike test is currently
> doomed to fail occasionally. Is there anything that can be done about this
> (e.g. is flow control somehow possible here or does the interface not allow
> this?)? Otherwise, I think it might be best to mark the spike and sifive_u

Patches have been sent to the list to hopefully fix this:

https://mail.gnu.org/archive/html/qemu-devel/2024-08/msg02743.html

Just waiting on reviews and then the merge window to open up again

Alistair

> tests with QEMU_TEST_FLAKY_TESTS here to make it clear that these tests are
> not reliable by default...?
>
> Thomas
>
>
* Re: tests/avocado/riscv_opensbi.py does not work reliable
From: Thomas Huth @ 2024-09-03 5:55 UTC (permalink / raw)
To: Alistair Francis
Cc: QEMU Developers, qemu-riscv, Daniel Henrique Barboza, Bin Meng,
Alistair Francis
On 03/09/2024 01.54, Alistair Francis wrote:
> On Sat, Aug 31, 2024 at 1:35 AM Thomas Huth <thuth@redhat.com> wrote:
>>
>>
>> Hi!
>>
>> While running a lot of tests (i.e. with a very loaded machine), I noticed
>> that tests/avocado/riscv_opensbi.py is very flaky when the host machine is
>> slow. I can easily reproduce the problem by running a big compilation job
>> on all CPUs in the background and then running the riscv_opensbi.py avocado
>> test. One of test_riscv32_spike, test_riscv64_spike, test_riscv32_sifive_u
>> or test_riscv64_sifive_u is failing most of the time (but not the virt
>> machine tests).
>>
>> Looking at the logs, it seems like the output sometimes stops somewhere at a
>> random place before the boot process reaches the spot that the test is
>> looking for. Looking at riscv_htif.c, there does not seem to be any flow
>
> I suspect this is: https://gitlab.com/qemu-project/qemu/-/issues/2114

Indeed, that looks like the same issue - I should have had a look at the bug
tracker first!

>> control implemented here, so I guess at least the spike test is currently
>> doomed to fail occasionally. Is there anything that can be done about this
>> (e.g. is flow control somehow possible here or does the interface not allow
>> this?)? Otherwise, I think it might be best to mark the spike and sifive_u
>
> Patches have been sent to the list to hopefully fix this:
>
> https://mail.gnu.org/archive/html/qemu-devel/2024-08/msg02743.html
>
> Just waiting on reviews and then the merge window to open up again

Thanks a lot, these patches are fixing the issues for me!

Thomas