qemu-devel.nongnu.org archive mirror
* tests/avocado/riscv_opensbi.py does not work reliable
@ 2024-08-30 15:34 Thomas Huth
  2024-09-02 23:54 ` Alistair Francis
  0 siblings, 1 reply; 3+ messages in thread
From: Thomas Huth @ 2024-08-30 15:34 UTC (permalink / raw)
  To: QEMU Developers, qemu-riscv, Daniel Henrique Barboza
  Cc: Bin Meng, Alistair Francis


  Hi!

While running a lot of tests (i.e. on a heavily loaded machine), I noticed
that tests/avocado/riscv_opensbi.py is very flaky when the host machine is
slow. I can easily reproduce the problem by running a big compilation job
on all CPUs in the background and then running the riscv_opensbi.py avocado
test. One of test_riscv32_spike, test_riscv64_spike, test_riscv32_sifive_u
or test_riscv64_sifive_u fails most of the time (but the virt machine
tests do not).

Looking at the logs, it seems like the output sometimes stops at a random
place before the boot process reaches the spot that the test is looking
for. Looking at riscv_htif.c, there does not seem to be any flow control
implemented there, so I guess at least the spike test is currently doomed
to fail occasionally. Is there anything that can be done about this (e.g.
is flow control somehow possible here, or does the interface not allow
it)? Otherwise, I think it might be best to mark the spike and sifive_u
tests with QEMU_TEST_FLAKY_TESTS to make it clear that these tests are
not reliable by default.

  Thomas
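
For illustration only, here is a minimal sketch of what gating a test on
QEMU_TEST_FLAKY_TESTS could look like. It uses plain unittest rather than
the avocado base class, and the class name, helper name and skip message
are hypothetical, not taken from riscv_opensbi.py:

```python
import os
import unittest


def flaky_tests_enabled(environ=os.environ):
    """Return True when QEMU_TEST_FLAKY_TESTS is set to a non-empty value."""
    return bool(environ.get("QEMU_TEST_FLAKY_TESTS"))


class OpenSBISketch(unittest.TestCase):
    # Hypothetical stand-in for the real avocado test class.

    @unittest.skipUnless(flaky_tests_enabled(),
                         "Test is unreliable on loaded hosts")
    def test_riscv64_spike(self):
        # The real test would boot OpenSBI and wait for console output here.
        pass
```

With this pattern the test is skipped by default and only runs when the
variable is exported in the environment before the test run starts.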




* Re: tests/avocado/riscv_opensbi.py does not work reliable
  2024-08-30 15:34 tests/avocado/riscv_opensbi.py does not work reliable Thomas Huth
@ 2024-09-02 23:54 ` Alistair Francis
  2024-09-03  5:55   ` Thomas Huth
  0 siblings, 1 reply; 3+ messages in thread
From: Alistair Francis @ 2024-09-02 23:54 UTC (permalink / raw)
  To: Thomas Huth
  Cc: QEMU Developers, qemu-riscv, Daniel Henrique Barboza, Bin Meng,
	Alistair Francis

On Sat, Aug 31, 2024 at 1:35 AM Thomas Huth <thuth@redhat.com> wrote:
>
>
>   Hi!
>
> While running a lot of tests (i.e. with a very loaded machine), I noticed
> that tests/avocado/riscv_opensbi.py is very flaky when the host machine is
> slow. I can easily reproduce the problem when running a big compilation job
> on all CPUs in the background and then run the riscv_opensbi.py avocado
> test. One of test_riscv32_spike, test_riscv64_spike, test_riscv32_sifive_u
> or test_riscv64_sifive_u is failing most of the time (but not the virt
> machine tests).
>
> Looking at the logs, it seems like the output sometimes stops somewhere at a
> random place before the boot process reaches the spot that the test is
> looking for. Looking at riscv_htif.c, there does not seem to be any flow

I suspect this is: https://gitlab.com/qemu-project/qemu/-/issues/2114

> control implemented here, so I guess at least the spike test is currently
> doomed to fail occasionally. Is there anything that can be done about this
> (e.g. is flow control somehow possible here or does the interface not allow
> this?)? Otherwise, I think it might be best to mark the spike and sifive_u

Patches have been sent to the list to hopefully fix this:

https://mail.gnu.org/archive/html/qemu-devel/2024-08/msg02743.html

Just waiting on reviews and then for the merge window to open up again.

Alistair
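
As a generic illustration of the flow-control concern quoted above (this
is not the content of the linked patches, and not QEMU code), the sketch
below shows how a writer that ignores a short write loses the tail of its
output, while a writer that retries delivers everything. SlowSink,
write_once and write_all are hypothetical names invented for the example:

```python
class SlowSink:
    """Hypothetical console backend accepting at most `capacity` bytes per call."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.received = bytearray()

    def write(self, data):
        # Accept only part of the data, as a busy backend might.
        accepted = data[:self.capacity]
        self.received += accepted
        return len(accepted)


def write_once(sink, data):
    """No flow control: the return value is ignored, so the tail is lost."""
    sink.write(data)


def write_all(sink, data):
    """Retry until every byte has been accepted (assumes capacity > 0)."""
    offset = 0
    while offset < len(data):
        offset += sink.write(data[offset:])
```

A test that waits for a specific string on the console can only pass
reliably if the emulated device behaves like the second writer.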

> tests with QEMU_TEST_FLAKY_TESTS here to make it clear that these tests are
> not reliable by default...?
>
>   Thomas
>
>



* Re: tests/avocado/riscv_opensbi.py does not work reliable
  2024-09-02 23:54 ` Alistair Francis
@ 2024-09-03  5:55   ` Thomas Huth
  0 siblings, 0 replies; 3+ messages in thread
From: Thomas Huth @ 2024-09-03  5:55 UTC (permalink / raw)
  To: Alistair Francis
  Cc: QEMU Developers, qemu-riscv, Daniel Henrique Barboza, Bin Meng,
	Alistair Francis

On 03/09/2024 01.54, Alistair Francis wrote:
> On Sat, Aug 31, 2024 at 1:35 AM Thomas Huth <thuth@redhat.com> wrote:
>>
>>
>>    Hi!
>>
>> While running a lot of tests (i.e. with a very loaded machine), I noticed
>> that tests/avocado/riscv_opensbi.py is very flaky when the host machine is
>> slow. I can easily reproduce the problem when running a big compilation job
>> on all CPUs in the background and then run the riscv_opensbi.py avocado
>> test. One of test_riscv32_spike, test_riscv64_spike, test_riscv32_sifive_u
>> or test_riscv64_sifive_u is failing most of the time (but not the virt
>> machine tests).
>>
>> Looking at the logs, it seems like the output sometimes stops somewhere at a
>> random place before the boot process reaches the spot that the test is
>> looking for. Looking at riscv_htif.c, there does not seem to be any flow
> 
> I suspect this is: https://gitlab.com/qemu-project/qemu/-/issues/2114

Indeed, that looks like the same issue. I should have had a look at the bug
tracker first!

>> control implemented here, so I guess at least the spike test is currently
>> doomed to fail occasionally. Is there anything that can be done about this
>> (e.g. is flow control somehow possible here or does the interface not allow
>> this?)? Otherwise, I think it might be best to mark the spike and sifive_u
> 
> Patches have been sent to the list to hopefully fix this:
> 
> https://mail.gnu.org/archive/html/qemu-devel/2024-08/msg02743.html
> 
> Just waiting on reviews and then the merge window to open up again

Thanks a lot, these patches fix the issue for me!

  Thomas


