qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* mips system emulation failure with virtio
@ 2023-09-05 13:19 Richard Purdie
  2023-09-05 13:59 ` Alex Bennée
  0 siblings, 1 reply; 10+ messages in thread
From: Richard Purdie @ 2023-09-05 13:19 UTC (permalink / raw)
  To: qemu-devel

With qemu 8.1.0 we see boot hangs fox x86-64 targets. 

These are fixed by 0d58c660689f6da1e3feff8a997014003d928b3b (softmmu:
Use async_run_on_cpu in tcg_commit) but if I add that commit, mips and
mips64 break, hanging at boot unable to find a rootfs. 

We use virtio for network and disk and both of those change in the
bootlog from messages like:

[    1.726118] virtio-pci 0000:00:13.0: enabling device (0000 -> 0003)
[    1.728864] virtio-pci 0000:00:14.0: enabling device (0000 -> 0003)
[    1.729948] virtio-pci 0000:00:15.0: enabling device (0000 -> 0003)
...
[    2.162148] virtio_blk virtio2: 1/0/0 default/read/poll queues
[    2.168311] virtio_blk virtio2: [vda] 1184242 512-byte logical 

to:

[    1.777051] virtio-pci 0000:00:13.0: enabling device (0000 -> 0003)
[    1.779822] virtio-pci 0000:00:14.0: enabling device (0000 -> 0003)
[    1.780926] virtio-pci 0000:00:15.0: enabling device (0000 -> 0003)
...
[    1.894852] virtio_rng: probe of virtio1 failed with error -28
...
[    2.063553] virtio_blk virtio2: 1/0/0 default/read/poll queues
[    2.064260] virtio_blk: probe of virtio2 failed with error -28
[    2.069080] virtio_net: probe of virtio0 failed with error -28


i.e. the virtio drivers no longer work.

I tested with current qemu master
(17780edd81d27fcfdb7a802efc870a99788bd2fc) and mips is still broken
there.

Is this issue known about?

Cheers,

Richard


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: mips system emulation failure with virtio
  2023-09-05 13:19 mips system emulation failure with virtio Richard Purdie
@ 2023-09-05 13:59 ` Alex Bennée
  2023-09-05 14:50   ` Richard Purdie
  0 siblings, 1 reply; 10+ messages in thread
From: Alex Bennée @ 2023-09-05 13:59 UTC (permalink / raw)
  To: Richard Purdie
  Cc: qemu-devel, Philippe Mathieu-Daudé, Aurelien Jarno,
	Jiaxun Yang, Aleksandar Rikalo, Richard Henderson


Richard Purdie <richard.purdie@linuxfoundation.org> writes:

> With qemu 8.1.0 we see boot hangs fox x86-64 targets. 
>
> These are fixed by 0d58c660689f6da1e3feff8a997014003d928b3b (softmmu:
> Use async_run_on_cpu in tcg_commit) but if I add that commit, mips and
> mips64 break, hanging at boot unable to find a rootfs. 

(Widen CC list)

>
> We use virtio for network and disk and both of those change in the
> bootlog from messages like:
>
> [    1.726118] virtio-pci 0000:00:13.0: enabling device (0000 -> 0003)
> [    1.728864] virtio-pci 0000:00:14.0: enabling device (0000 -> 0003)
> [    1.729948] virtio-pci 0000:00:15.0: enabling device (0000 -> 0003)
> ...
> [    2.162148] virtio_blk virtio2: 1/0/0 default/read/poll queues
> [    2.168311] virtio_blk virtio2: [vda] 1184242 512-byte logical 
>
> to:
>
> [    1.777051] virtio-pci 0000:00:13.0: enabling device (0000 -> 0003)
> [    1.779822] virtio-pci 0000:00:14.0: enabling device (0000 -> 0003)
> [    1.780926] virtio-pci 0000:00:15.0: enabling device (0000 -> 0003)
> ...
> [    1.894852] virtio_rng: probe of virtio1 failed with error -28
> ...
> [    2.063553] virtio_blk virtio2: 1/0/0 default/read/poll queues
> [    2.064260] virtio_blk: probe of virtio2 failed with error -28
> [    2.069080] virtio_net: probe of virtio0 failed with error -28
>
>
> i.e. the virtio drivers no longer work.

Interesting, as you say this seems to be VirtIO specific as the baseline
tests (using IDE) work fine:

  ➜  ./tests/venv/bin/avocado run ./tests/avocado/tuxrun_baselines.py:test_mips64
  JOB ID     : 71f3e3b7080164b78ef1c8c1bb6bc880932d8c9b
  JOB LOG    : /home/alex/avocado/job-results/job-2023-09-05T15.01-71f3e3b/job.log
   (1/2) ./tests/avocado/tuxrun_baselines.py:TuxRunBaselineTest.test_mips64: PASS (12.19 s)
   (2/2) ./tests/avocado/tuxrun_baselines.py:TuxRunBaselineTest.test_mips64el: PASS (11.78 s)
  RESULTS    : PASS 2 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 0 | CANCEL 0
  JOB TIME   : 24.79 s

> I tested with current qemu master
> (17780edd81d27fcfdb7a802efc870a99788bd2fc) and mips is still broken
> there.
>
> Is this issue known about?

Could you raise a bug at:

  https://gitlab.com/qemu-project/qemu/-/issues

I'm curious why MIPS VirtIO is affected but nothing else is...

>
> Cheers,
>
> Richard


-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: mips system emulation failure with virtio
  2023-09-05 13:59 ` Alex Bennée
@ 2023-09-05 14:50   ` Richard Purdie
  2023-09-05 15:12     ` Philippe Mathieu-Daudé
  2023-09-06 15:50     ` Philippe Mathieu-Daudé
  0 siblings, 2 replies; 10+ messages in thread
From: Richard Purdie @ 2023-09-05 14:50 UTC (permalink / raw)
  To: Alex Bennée
  Cc: qemu-devel, Philippe Mathieu-Daudé, Aurelien Jarno,
	Jiaxun Yang, Aleksandar Rikalo, Richard Henderson

On Tue, 2023-09-05 at 14:59 +0100, Alex Bennée wrote:
> Richard Purdie <richard.purdie@linuxfoundation.org> writes:
> 
> > With qemu 8.1.0 we see boot hangs fox x86-64 targets. 
> > 
> > These are fixed by 0d58c660689f6da1e3feff8a997014003d928b3b (softmmu:
> > Use async_run_on_cpu in tcg_commit) but if I add that commit, mips and
> > mips64 break, hanging at boot unable to find a rootfs. 
> 
> (Widen CC list)
> 
> > 
> > We use virtio for network and disk and both of those change in the
> > bootlog from messages like:
> > 
> > [    1.726118] virtio-pci 0000:00:13.0: enabling device (0000 -> 0003)
> > [    1.728864] virtio-pci 0000:00:14.0: enabling device (0000 -> 0003)
> > [    1.729948] virtio-pci 0000:00:15.0: enabling device (0000 -> 0003)
> > ...
> > [    2.162148] virtio_blk virtio2: 1/0/0 default/read/poll queues
> > [    2.168311] virtio_blk virtio2: [vda] 1184242 512-byte logical 
> > 
> > to:
> > 
> > [    1.777051] virtio-pci 0000:00:13.0: enabling device (0000 -> 0003)
> > [    1.779822] virtio-pci 0000:00:14.0: enabling device (0000 -> 0003)
> > [    1.780926] virtio-pci 0000:00:15.0: enabling device (0000 -> 0003)
> > ...
> > [    1.894852] virtio_rng: probe of virtio1 failed with error -28
> > ...
> > [    2.063553] virtio_blk virtio2: 1/0/0 default/read/poll queues
> > [    2.064260] virtio_blk: probe of virtio2 failed with error -28
> > [    2.069080] virtio_net: probe of virtio0 failed with error -28
> > 
> > 
> > i.e. the virtio drivers no longer work.
> 
> Interesting, as you say this seems to be VirtIO specific as the baseline
> tests (using IDE) work fine:
> 
>   ➜  ./tests/venv/bin/avocado run ./tests/avocado/tuxrun_baselines.py:test_mips64
>   JOB ID     : 71f3e3b7080164b78ef1c8c1bb6bc880932d8c9b
>   JOB LOG    : /home/alex/avocado/job-results/job-2023-09-05T15.01-71f3e3b/job.log
>    (1/2) ./tests/avocado/tuxrun_baselines.py:TuxRunBaselineTest.test_mips64: PASS (12.19 s)
>    (2/2) ./tests/avocado/tuxrun_baselines.py:TuxRunBaselineTest.test_mips64el: PASS (11.78 s)
>   RESULTS    : PASS 2 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 0 | CANCEL 0
>   JOB TIME   : 24.79 s
> 
> > I tested with current qemu master
> > (17780edd81d27fcfdb7a802efc870a99788bd2fc) and mips is still broken
> > there.
> > 
> > Is this issue known about?
> 
> Could you raise a bug at:
> 
>   https://gitlab.com/qemu-project/qemu/-/issues

Done, https://gitlab.com/qemu-project/qemu/-/issues/1866

> I'm curious why MIPS VirtIO is affected but nothing else is...

Me too, it seems there is a real code issue somewhere in this...

Cheers,

Richard





^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: mips system emulation failure with virtio
  2023-09-05 14:50   ` Richard Purdie
@ 2023-09-05 15:12     ` Philippe Mathieu-Daudé
  2023-09-05 15:53       ` Richard Purdie
  2023-09-06 15:50     ` Philippe Mathieu-Daudé
  1 sibling, 1 reply; 10+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-09-05 15:12 UTC (permalink / raw)
  To: Richard Purdie, Alex Bennée
  Cc: qemu-devel, Aurelien Jarno, Jiaxun Yang, Aleksandar Rikalo,
	Richard Henderson

Hi Richard,

On 5/9/23 16:50, Richard Purdie wrote:
> On Tue, 2023-09-05 at 14:59 +0100, Alex Bennée wrote:
>> Richard Purdie <richard.purdie@linuxfoundation.org> writes:
>>
>>> With qemu 8.1.0 we see boot hangs fox x86-64 targets.
>>>
>>> These are fixed by 0d58c660689f6da1e3feff8a997014003d928b3b (softmmu:
>>> Use async_run_on_cpu in tcg_commit) but if I add that commit, mips and
>>> mips64 break, hanging at boot unable to find a rootfs.

Are you testing mipsel / mips64el?

>>> We use virtio for network and disk and both of those change in the
>>> bootlog from messages like:
>>>
>>> [    1.726118] virtio-pci 0000:00:13.0: enabling device (0000 -> 0003)
>>> [    1.728864] virtio-pci 0000:00:14.0: enabling device (0000 -> 0003)
>>> [    1.729948] virtio-pci 0000:00:15.0: enabling device (0000 -> 0003)
>>> ...
>>> [    2.162148] virtio_blk virtio2: 1/0/0 default/read/poll queues
>>> [    2.168311] virtio_blk virtio2: [vda] 1184242 512-byte logical
>>>
>>> to:
>>>
>>> [    1.777051] virtio-pci 0000:00:13.0: enabling device (0000 -> 0003)
>>> [    1.779822] virtio-pci 0000:00:14.0: enabling device (0000 -> 0003)
>>> [    1.780926] virtio-pci 0000:00:15.0: enabling device (0000 -> 0003)
>>> ...
>>> [    1.894852] virtio_rng: probe of virtio1 failed with error -28
>>> ...
>>> [    2.063553] virtio_blk virtio2: 1/0/0 default/read/poll queues
>>> [    2.064260] virtio_blk: probe of virtio2 failed with error -28
>>> [    2.069080] virtio_net: probe of virtio0 failed with error -28
>>>
>>>
>>> i.e. the virtio drivers no longer work.
>>
>> Interesting, as you say this seems to be VirtIO specific as the baseline
>> tests (using IDE) work fine:
>>
>>    ➜  ./tests/venv/bin/avocado run ./tests/avocado/tuxrun_baselines.py:test_mips64
>>    JOB ID     : 71f3e3b7080164b78ef1c8c1bb6bc880932d8c9b
>>    JOB LOG    : /home/alex/avocado/job-results/job-2023-09-05T15.01-71f3e3b/job.log
>>     (1/2) ./tests/avocado/tuxrun_baselines.py:TuxRunBaselineTest.test_mips64: PASS (12.19 s)
>>     (2/2) ./tests/avocado/tuxrun_baselines.py:TuxRunBaselineTest.test_mips64el: PASS (11.78 s)
>>    RESULTS    : PASS 2 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 0 | CANCEL 0
>>    JOB TIME   : 24.79 s
>>
>>> I tested with current qemu master
>>> (17780edd81d27fcfdb7a802efc870a99788bd2fc) and mips is still broken
>>> there.



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: mips system emulation failure with virtio
  2023-09-05 15:12     ` Philippe Mathieu-Daudé
@ 2023-09-05 15:53       ` Richard Purdie
  2023-09-05 16:46         ` Philippe Mathieu-Daudé
  0 siblings, 1 reply; 10+ messages in thread
From: Richard Purdie @ 2023-09-05 15:53 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé, Alex Bennée
  Cc: qemu-devel, Aurelien Jarno, Jiaxun Yang, Aleksandar Rikalo,
	Richard Henderson

On Tue, 2023-09-05 at 17:12 +0200, Philippe Mathieu-Daudé wrote:
> Hi Richard,
> 
> On 5/9/23 16:50, Richard Purdie wrote:
> > On Tue, 2023-09-05 at 14:59 +0100, Alex Bennée wrote:
> > > Richard Purdie <richard.purdie@linuxfoundation.org> writes:
> > > 
> > > > With qemu 8.1.0 we see boot hangs fox x86-64 targets.
> > > > 
> > > > These are fixed by 0d58c660689f6da1e3feff8a997014003d928b3b (softmmu:
> > > > Use async_run_on_cpu in tcg_commit) but if I add that commit, mips and
> > > > mips64 break, hanging at boot unable to find a rootfs.
> 
> Are you testing mipsel / mips64el?

No, it was mips/mips64, i.e. big endian.

Cheers,

Richard


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: mips system emulation failure with virtio
  2023-09-05 15:53       ` Richard Purdie
@ 2023-09-05 16:46         ` Philippe Mathieu-Daudé
  2023-09-05 16:50           ` Richard Purdie
  0 siblings, 1 reply; 10+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-09-05 16:46 UTC (permalink / raw)
  To: Richard Purdie, Alex Bennée
  Cc: qemu-devel, Aurelien Jarno, Jiaxun Yang, Aleksandar Rikalo,
	Richard Henderson

On 5/9/23 17:53, Richard Purdie wrote:
> On Tue, 2023-09-05 at 17:12 +0200, Philippe Mathieu-Daudé wrote:
>> Hi Richard,
>>
>> On 5/9/23 16:50, Richard Purdie wrote:
>>> On Tue, 2023-09-05 at 14:59 +0100, Alex Bennée wrote:
>>>> Richard Purdie <richard.purdie@linuxfoundation.org> writes:
>>>>
>>>>> With qemu 8.1.0 we see boot hangs fox x86-64 targets.
>>>>>
>>>>> These are fixed by 0d58c660689f6da1e3feff8a997014003d928b3b (softmmu:
>>>>> Use async_run_on_cpu in tcg_commit) but if I add that commit, mips and
>>>>> mips64 break, hanging at boot unable to find a rootfs.
>>
>> Are you testing mipsel / mips64el?
> 
> No, it was mips/mips64, i.e. big endian.

Sorry my question was not clear. I meant: Do you also
test mipsel / mips64el guests, and if so, do they work?
(IOW, is this bug only big-endian guest specific?)

Thanks,

Phil.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: mips system emulation failure with virtio
  2023-09-05 16:46         ` Philippe Mathieu-Daudé
@ 2023-09-05 16:50           ` Richard Purdie
  0 siblings, 0 replies; 10+ messages in thread
From: Richard Purdie @ 2023-09-05 16:50 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé, Alex Bennée
  Cc: qemu-devel, Aurelien Jarno, Jiaxun Yang, Aleksandar Rikalo,
	Richard Henderson

On Tue, 2023-09-05 at 18:46 +0200, Philippe Mathieu-Daudé wrote:
> On 5/9/23 17:53, Richard Purdie wrote:
> > On Tue, 2023-09-05 at 17:12 +0200, Philippe Mathieu-Daudé wrote:
> > > Hi Richard,
> > > 
> > > On 5/9/23 16:50, Richard Purdie wrote:
> > > > On Tue, 2023-09-05 at 14:59 +0100, Alex Bennée wrote:
> > > > > Richard Purdie <richard.purdie@linuxfoundation.org> writes:
> > > > > 
> > > > > > With qemu 8.1.0 we see boot hangs fox x86-64 targets.
> > > > > > 
> > > > > > These are fixed by 0d58c660689f6da1e3feff8a997014003d928b3b (softmmu:
> > > > > > Use async_run_on_cpu in tcg_commit) but if I add that commit, mips and
> > > > > > mips64 break, hanging at boot unable to find a rootfs.
> > > 
> > > Are you testing mipsel / mips64el?
> > 
> > No, it was mips/mips64, i.e. big endian.
> 
> Sorry my question was not clear. I meant: Do you also
> test mipsel / mips64el guests, and if so, do they work?
> (IOW, is this bug only big-endian guest specific?)

Sorry, I misunderstood. We don't test mipsel/mips64el so I don't know
if that is working or not unfortunately.

Cheers,

Richard


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: mips system emulation failure with virtio
  2023-09-05 14:50   ` Richard Purdie
  2023-09-05 15:12     ` Philippe Mathieu-Daudé
@ 2023-09-06 15:50     ` Philippe Mathieu-Daudé
  2023-09-06 16:06       ` Philippe Mathieu-Daudé
  2023-09-06 16:46       ` Richard Purdie
  1 sibling, 2 replies; 10+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-09-06 15:50 UTC (permalink / raw)
  To: Richard Purdie, Alex Bennée, Paolo Bonzini, Peter Maydell,
	David Hildenbrand
  Cc: qemu-devel, Aurelien Jarno, Jiaxun Yang, Aleksandar Rikalo,
	Richard Henderson

+rth/pm215/dhildenb

On 5/9/23 16:50, Richard Purdie wrote:
> On Tue, 2023-09-05 at 14:59 +0100, Alex Bennée wrote:
>> Richard Purdie <richard.purdie@linuxfoundation.org> writes:
>>
>>> With qemu 8.1.0 we see boot hangs fox x86-64 targets.
>>>
>>> These are fixed by 0d58c660689f6da1e3feff8a997014003d928b3b (softmmu:
>>> Use async_run_on_cpu in tcg_commit) but if I add that commit, mips and
>>> mips64 break, hanging at boot unable to find a rootfs.
>>
>> (Widen CC list)
>>
>>>
>>> We use virtio for network and disk and both of those change in the
>>> bootlog from messages like:
>>>
>>> [    1.726118] virtio-pci 0000:00:13.0: enabling device (0000 -> 0003)
>>> [    1.728864] virtio-pci 0000:00:14.0: enabling device (0000 -> 0003)
>>> [    1.729948] virtio-pci 0000:00:15.0: enabling device (0000 -> 0003)
>>> ...
>>> [    2.162148] virtio_blk virtio2: 1/0/0 default/read/poll queues
>>> [    2.168311] virtio_blk virtio2: [vda] 1184242 512-byte logical
>>>
>>> to:
>>>
>>> [    1.777051] virtio-pci 0000:00:13.0: enabling device (0000 -> 0003)
>>> [    1.779822] virtio-pci 0000:00:14.0: enabling device (0000 -> 0003)
>>> [    1.780926] virtio-pci 0000:00:15.0: enabling device (0000 -> 0003)
>>> ...
>>> [    1.894852] virtio_rng: probe of virtio1 failed with error -28
>>> ...
>>> [    2.063553] virtio_blk virtio2: 1/0/0 default/read/poll queues
>>> [    2.064260] virtio_blk: probe of virtio2 failed with error -28
>>> [    2.069080] virtio_net: probe of virtio0 failed with error -28
>>>
>>>
>>> i.e. the virtio drivers no longer work.
>>
>> Interesting, as you say this seems to be VirtIO specific as the baseline
>> tests (using IDE) work fine:
>>
>>    ➜  ./tests/venv/bin/avocado run ./tests/avocado/tuxrun_baselines.py:test_mips64
>>    JOB ID     : 71f3e3b7080164b78ef1c8c1bb6bc880932d8c9b
>>    JOB LOG    : /home/alex/avocado/job-results/job-2023-09-05T15.01-71f3e3b/job.log
>>     (1/2) ./tests/avocado/tuxrun_baselines.py:TuxRunBaselineTest.test_mips64: PASS (12.19 s)
>>     (2/2) ./tests/avocado/tuxrun_baselines.py:TuxRunBaselineTest.test_mips64el: PASS (11.78 s)
>>    RESULTS    : PASS 2 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 0 | CANCEL 0
>>    JOB TIME   : 24.79 s
>>
>>> I tested with current qemu master
>>> (17780edd81d27fcfdb7a802efc870a99788bd2fc) and mips is still broken
>>> there.
>>>
>>> Is this issue known about?
>>
>> Could you raise a bug at:
>>
>>    https://gitlab.com/qemu-project/qemu/-/issues
> 
> Done, https://gitlab.com/qemu-project/qemu/-/issues/1866
> 
>> I'm curious why MIPS VirtIO is affected but nothing else is...
> 
> Me too, it seems there is a real code issue somewhere in this...

This seems to fix the issue for me, but I'm not really sure what
I'm doing after various hours debugging, so sharing here before
I take some rest:

-- >8 --
diff --git a/softmmu/physmem.c b/softmmu/physmem.c
index 18277ddd67..ec31ebcb56 100644
--- a/softmmu/physmem.c
+++ b/softmmu/physmem.c
@@ -2517,7 +2517,7 @@ static void tcg_commit(MemoryListener *listener)
       * That said, the listener is also called during realize, before
       * all of the tcg machinery for run-on is initialized: thus halt_cond.
       */
-    if (cpu->halt_cond) {
+    if (cpu->halt_cond && !qemu_cpu_is_self(cpu)) {
          async_run_on_cpu(cpu, tcg_commit_cpu, RUN_ON_CPU_HOST_PTR(cpuas));
      } else {
          tcg_commit_cpu(cpu, RUN_ON_CPU_HOST_PTR(cpuas));
---

That said, the same logic moved generically to async_run_on_cpu()
also works ...:

-- >8 --
diff --git a/cpus-common.c b/cpus-common.c
index 45c745ecf6..b0539c4fb8 100644
--- a/cpus-common.c
+++ b/cpus-common.c
@@ -167,6 +167,9 @@ void do_run_on_cpu(CPUState *cpu, run_on_cpu_func 
func, run_on_cpu_data data,

  void async_run_on_cpu(CPUState *cpu, run_on_cpu_func func, 
run_on_cpu_data data)
  {
+    if (qemu_cpu_is_self(cpu)) {
+        return func(cpu, data);
+    }
      struct qemu_work_item *wi;

      wi = g_new0(struct qemu_work_item, 1);
---

Regards,

Phil.


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: mips system emulation failure with virtio
  2023-09-06 15:50     ` Philippe Mathieu-Daudé
@ 2023-09-06 16:06       ` Philippe Mathieu-Daudé
  2023-09-06 16:46       ` Richard Purdie
  1 sibling, 0 replies; 10+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-09-06 16:06 UTC (permalink / raw)
  To: Richard Purdie, Alex Bennée, Paolo Bonzini, Peter Maydell,
	David Hildenbrand
  Cc: qemu-devel, Aurelien Jarno, Jiaxun Yang, Aleksandar Rikalo,
	Richard Henderson

On 6/9/23 17:50, Philippe Mathieu-Daudé wrote:
> +rth/pm215/dhildenb
> 
> On 5/9/23 16:50, Richard Purdie wrote:
>> On Tue, 2023-09-05 at 14:59 +0100, Alex Bennée wrote:
>>> Richard Purdie <richard.purdie@linuxfoundation.org> writes:
>>>
>>>> With qemu 8.1.0 we see boot hangs fox x86-64 targets.
>>>>
>>>> These are fixed by 0d58c660689f6da1e3feff8a997014003d928b3b (softmmu:
>>>> Use async_run_on_cpu in tcg_commit) but if I add that commit, mips and
>>>> mips64 break, hanging at boot unable to find a rootfs.
>>>
>>> (Widen CC list)
>>>
>>>>
>>>> We use virtio for network and disk and both of those change in the
>>>> bootlog from messages like:
>>>>
>>>> [    1.726118] virtio-pci 0000:00:13.0: enabling device (0000 -> 0003)
>>>> [    1.728864] virtio-pci 0000:00:14.0: enabling device (0000 -> 0003)
>>>> [    1.729948] virtio-pci 0000:00:15.0: enabling device (0000 -> 0003)
>>>> ...
>>>> [    2.162148] virtio_blk virtio2: 1/0/0 default/read/poll queues
>>>> [    2.168311] virtio_blk virtio2: [vda] 1184242 512-byte logical
>>>>
>>>> to:
>>>>
>>>> [    1.777051] virtio-pci 0000:00:13.0: enabling device (0000 -> 0003)
>>>> [    1.779822] virtio-pci 0000:00:14.0: enabling device (0000 -> 0003)
>>>> [    1.780926] virtio-pci 0000:00:15.0: enabling device (0000 -> 0003)
>>>> ...
>>>> [    1.894852] virtio_rng: probe of virtio1 failed with error -28
>>>> ...
>>>> [    2.063553] virtio_blk virtio2: 1/0/0 default/read/poll queues
>>>> [    2.064260] virtio_blk: probe of virtio2 failed with error -28
>>>> [    2.069080] virtio_net: probe of virtio0 failed with error -28
>>>>
>>>>
>>>> i.e. the virtio drivers no longer work.
>>>
>>> Interesting, as you say this seems to be VirtIO specific as the baseline
>>> tests (using IDE) work fine:
>>>
>>>    ➜  ./tests/venv/bin/avocado run 
>>> ./tests/avocado/tuxrun_baselines.py:test_mips64
>>>    JOB ID     : 71f3e3b7080164b78ef1c8c1bb6bc880932d8c9b
>>>    JOB LOG    : 
>>> /home/alex/avocado/job-results/job-2023-09-05T15.01-71f3e3b/job.log
>>>     (1/2) 
>>> ./tests/avocado/tuxrun_baselines.py:TuxRunBaselineTest.test_mips64: 
>>> PASS (12.19 s)
>>>     (2/2) 
>>> ./tests/avocado/tuxrun_baselines.py:TuxRunBaselineTest.test_mips64el: 
>>> PASS (11.78 s)
>>>    RESULTS    : PASS 2 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | 
>>> INTERRUPT 0 | CANCEL 0
>>>    JOB TIME   : 24.79 s
>>>
>>>> I tested with current qemu master
>>>> (17780edd81d27fcfdb7a802efc870a99788bd2fc) and mips is still broken
>>>> there.
>>>>
>>>> Is this issue known about?
>>>
>>> Could you raise a bug at:
>>>
>>>    https://gitlab.com/qemu-project/qemu/-/issues
>>
>> Done, https://gitlab.com/qemu-project/qemu/-/issues/1866
>>
>>> I'm curious why MIPS VirtIO is affected but nothing else is...
>>
>> Me too, it seems there is a real code issue somewhere in this...
> 
> This seems to fix the issue for me, but I'm not really sure what
> I'm doing after various hours debugging, so sharing here before
> I take some rest:
> 
> -- >8 --
> diff --git a/softmmu/physmem.c b/softmmu/physmem.c
> index 18277ddd67..ec31ebcb56 100644
> --- a/softmmu/physmem.c
> +++ b/softmmu/physmem.c
> @@ -2517,7 +2517,7 @@ static void tcg_commit(MemoryListener *listener)
>        * That said, the listener is also called during realize, before
>        * all of the tcg machinery for run-on is initialized: thus 
> halt_cond.
>        */
> -    if (cpu->halt_cond) {
> +    if (cpu->halt_cond && !qemu_cpu_is_self(cpu)) {
>           async_run_on_cpu(cpu, tcg_commit_cpu, 
> RUN_ON_CPU_HOST_PTR(cpuas));
>       } else {
>           tcg_commit_cpu(cpu, RUN_ON_CPU_HOST_PTR(cpuas));
> ---
> 
> That said, the same logic moved generically to async_run_on_cpu()
> also works ...:
> 
> -- >8 --
> diff --git a/cpus-common.c b/cpus-common.c
> index 45c745ecf6..b0539c4fb8 100644
> --- a/cpus-common.c
> +++ b/cpus-common.c
> @@ -167,6 +167,9 @@ void do_run_on_cpu(CPUState *cpu, run_on_cpu_func 
> func, run_on_cpu_data data,
> 
>   void async_run_on_cpu(CPUState *cpu, run_on_cpu_func func, 
> run_on_cpu_data data)
>   {
> +    if (qemu_cpu_is_self(cpu)) {
> +        return func(cpu, data);
> +    }
>       struct qemu_work_item *wi;
> 
>       wi = g_new0(struct qemu_work_item, 1);
> ---

Ah, this is what commit c978b31687 removed:

commit c978b3168727d3a76ffcb18462ea972f50b53634
Author: Paolo Bonzini <pbonzini@redhat.com>
Date:   Wed Aug 31 18:03:39 2016 +0200

     cpus-common: always defer async_run_on_cpu work items

     async_run_on_cpu is only called from the I/O thread, not from
     CPU threads, so it doesn't make any difference.  It will make
     a difference however for async_safe_run_on_cpu.

So adding:

-- >8 --
diff --git a/cpus-common.c b/cpus-common.c
index 45c745ecf6..8554e3f7a1 100644
--- a/cpus-common.c
+++ b/cpus-common.c
@@ -169,6 +169,8 @@ void async_run_on_cpu(CPUState *cpu, run_on_cpu_func 
func, run_on_cpu_data data)
  {
      struct qemu_work_item *wi;

+    g_assert(!qemu_cpu_is_self(cpu));
+
      wi = g_new0(struct qemu_work_item, 1);
      wi->func = func;
      wi->data = data;
---

We get:

* thread #4, stop reason = signal SIGABRT
   * frame #0: 0x00000001887d8764 libsystem_kernel.dylib`__pthread_kill + 8
     frame #1: 0x000000018880fc28 libsystem_pthread.dylib`pthread_kill + 288
     frame #2: 0x000000018871dae8 libsystem_c.dylib`abort + 180
     frame #3: 0x0000000101efdddc 
libglib-2.0.0.dylib`g_assertion_message + 464
     frame #4: 0x0000000101efde50 
libglib-2.0.0.dylib`g_assertion_message_expr + 116
     frame #5: 0x000000010000f6a8 
qemu-system-mips`async_run_on_cpu(cpu=0x0000000104815a00, 
func=(qemu-system-mips`tcg_commit_cpu at physmem.c:2490), data=(host_int 
= 61900768, host_ulong = 105553178167264, host_ptr = 0x0000600003b087e0, 
target_ptr = 105553178167264)) at cpus-common.c:172:5
     frame #6: 0x00000001004d469c 
qemu-system-mips`tcg_commit(listener=0x0000600003b087f8) at physmem.c:2521:9
     frame #7: 0x00000001004c4b70 
qemu-system-mips`memory_region_transaction_commit at memory.c:1126:13
     frame #8: 0x000000010020f620 
qemu-system-mips`gt64120_isd_mapping(s=0x0000000104875c00) at 
gt64120.c:318:5
     frame #9: 0x000000010020dbe4 
qemu-system-mips`gt64120_writel(opaque=0x0000000104875c00, addr=104, 
val=223, size=4) at gt64120.c:502:9
     frame #10: 0x00000001004c638c 
qemu-system-mips`memory_region_write_accessor(mr=0x0000000104877600, 
addr=104, value=0x000000016ff9e2f0, size=4, shift=0, mask=4294967295, 
attrs=MemTxAttrs @ 0x000000016ff9e21c) at memory.c:493:5
     frame #11: 0x00000001004c6160 
qemu-system-mips`access_with_adjusted_size(addr=104, 
value=0x000000016ff9e2f0, size=4, access_size_min=4, access_size_max=4, 
access_fn=(qemu-system-mips`memory_region_write_accessor at 
memory.c:483), mr=0x0000000104877600, attrs=MemTxAttrs @ 
0x000000016ff9e2a8) at memory.c:564:18
     frame #12: 0x00000001004c5cd0 
qemu-system-mips`memory_region_dispatch_write(mr=0x0000000104877600, 
addr=104, data=3741319168, op=MO_32, attrs=MemTxAttrs @ 
0x000000016ff9e2e8) at memory.c:1533:16
     frame #13: 0x00000001004decc0 
qemu-system-mips`flatview_write_continue(fv=0x00006000017151c0, 
addr=335544424, attrs=MemTxAttrs @ 0x000000016ff9e384, 
ptr=0x000000016ff9e450, len=4, addr1=104, l=4, mr=0x0000000104877600) at 
physmem.c:2677:23
     frame #14: 0x00000001004db090 
qemu-system-mips`flatview_write(fv=0x00006000017151c0, addr=335544424, 
attrs=MemTxAttrs @ 0x000000016ff9e3e0, buf=0x000000016ff9e450, len=4) at 
physmem.c:2719:12
     frame #15: 0x00000001004e2f54 
qemu-system-mips`subpage_write(opaque=0x0000000104907400, addr=104, 
value=3741319168, len=4, attrs=MemTxAttrs @ 0x000000016ff9e44c) at 
physmem.c:2328:12
     frame #16: 0x00000001004c64dc 
qemu-system-mips`memory_region_write_with_attrs_accessor(mr=0x0000000104907400, 
addr=104, value=0x000000016ff9e5a0, size=4, shift=0, mask=4294967295, 
attrs=MemTxAttrs @ 0x000000016ff9e4cc) at memory.c:514:12
     frame #17: 0x00000001004c6160 
qemu-system-mips`access_with_adjusted_size(addr=104, 
value=0x000000016ff9e5a0, size=4, access_size_min=1, access_size_max=8, 
access_fn=(qemu-system-mips`memory_region_write_with_attrs_accessor at 
memory.c:504), mr=0x0000000104907400, attrs=MemTxAttrs @ 
0x000000016ff9e558) at memory.c:564:18
     frame #18: 0x00000001004c5d14 
qemu-system-mips`memory_region_dispatch_write(mr=0x0000000104907400, 
addr=104, data=3741319168, op=MO_32, attrs=MemTxAttrs @ 
0x000000016ff9e598) at memory.c:1540:13
     frame #19: 0x000000010054f5c8 
qemu-system-mips`io_writex(env=0x0000000104818260, 
full=0x0000000104828000, mmu_idx=3, val=223, addr=3019898984, 
retaddr=10737419088, op=MO_32) at cputlb.c:1449:13
     frame #20: 0x000000010054f9a4 
qemu-system-mips`do_st_mmio_leN(env=0x0000000104818260, 
full=0x0000000104828000, val_le=223, addr=3019898984, size=4, mmu_idx=3, 
ra=10737419088) at cputlb.c:2756:13
     frame #21: 0x00000001005500bc 
qemu-system-mips`do_st_4(env=0x0000000104818260, p=0x000000016ff9e760, 
val=223, mmu_idx=3, memop=226, ra=10737419088) at cputlb.c:2922:9
     frame #22: 0x000000010053dff0 
qemu-system-mips`do_st4_mmu(env=0x0000000104818260, addr=3019898984, 
val=223, oi=3619, ra=10737419088) at cputlb.c:3007:9
     frame #23: 0x000000010053df44 
qemu-system-mips`helper_stl_mmu(env=0x0000000104818260, addr=3019898984, 
val=223, oi=3619, retaddr=10737419088) at cputlb.c:3023:5
     frame #24: 0x0000000280000624
     frame #25: 0x0000000100522e70 
qemu-system-mips`cpu_tb_exec(cpu=0x0000000104815a00, 
itb=0x0000000118000200, tb_exit=0x000000016ff9ee24) at cpu-exec.c:457:11
     frame #26: 0x0000000100525714 
qemu-system-mips`cpu_loop_exec_tb(cpu=0x0000000104815a00, 
tb=0x0000000118000200, pc=3217032576, last_tb=0x000000016ff9ee28, 
tb_exit=0x000000016ff9ee24) at cpu-exec.c:919:10
     frame #27: 0x0000000100524d3c 
qemu-system-mips`cpu_exec_loop(cpu=0x0000000104815a00, 
sc=0x000000016ff9eea0) at cpu-exec.c:1040:13
     frame #28: 0x000000010052366c 
qemu-system-mips`cpu_exec_setjmp(cpu=0x0000000104815a00, 
sc=0x000000016ff9eea0) at cpu-exec.c:1057:12
     frame #29: 0x0000000100523340 
qemu-system-mips`cpu_exec(cpu=0x0000000104815a00) at cpu-exec.c:1083:11
     frame #30: 0x00000001005569bc 
qemu-system-mips`tcg_cpus_exec(cpu=0x0000000104815a00) at 
tcg-accel-ops.c:75:11
     frame #31: 0x00000001005575c4 
qemu-system-mips`mttcg_cpu_thread_fn(arg=0x0000000104815a00) at 
tcg-accel-ops-mttcg.c:95:17
     frame #32: 0x00000001007d86c8 
qemu-system-mips`qemu_thread_start(args=0x000060000022e400) at 
qemu-thread-posix.c:541:9
     frame #33: 0x000000018880ffa8 
libsystem_pthread.dylib`_pthread_start + 148

(in case that helps catching other similar issues).


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: mips system emulation failure with virtio
  2023-09-06 15:50     ` Philippe Mathieu-Daudé
  2023-09-06 16:06       ` Philippe Mathieu-Daudé
@ 2023-09-06 16:46       ` Richard Purdie
  1 sibling, 0 replies; 10+ messages in thread
From: Richard Purdie @ 2023-09-06 16:46 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé, Alex Bennée, Paolo Bonzini,
	Peter Maydell, David Hildenbrand
  Cc: qemu-devel, Aurelien Jarno, Jiaxun Yang, Aleksandar Rikalo,
	Richard Henderson

On Wed, 2023-09-06 at 17:50 +0200, Philippe Mathieu-Daudé wrote:
> +rth/pm215/dhildenb
> 
> On 5/9/23 16:50, Richard Purdie wrote:
> > On Tue, 2023-09-05 at 14:59 +0100, Alex Bennée wrote:
> > > Richard Purdie <richard.purdie@linuxfoundation.org> writes:
> > > 
> > > > With qemu 8.1.0 we see boot hangs fox x86-64 targets.
> > > > 
> > > > These are fixed by 0d58c660689f6da1e3feff8a997014003d928b3b (softmmu:
> > > > Use async_run_on_cpu in tcg_commit) but if I add that commit, mips and
> > > > mips64 break, hanging at boot unable to find a rootfs.
> > > 
> > > (Widen CC list)
> > > 
> > > > 
> > > > We use virtio for network and disk and both of those change in the
> > > > bootlog from messages like:
> > > > 
> > > > [    1.726118] virtio-pci 0000:00:13.0: enabling device (0000 -> 0003)
> > > > [    1.728864] virtio-pci 0000:00:14.0: enabling device (0000 -> 0003)
> > > > [    1.729948] virtio-pci 0000:00:15.0: enabling device (0000 -> 0003)
> > > > ...
> > > > [    2.162148] virtio_blk virtio2: 1/0/0 default/read/poll queues
> > > > [    2.168311] virtio_blk virtio2: [vda] 1184242 512-byte logical
> > > > 
> > > > to:
> > > > 
> > > > [    1.777051] virtio-pci 0000:00:13.0: enabling device (0000 -> 0003)
> > > > [    1.779822] virtio-pci 0000:00:14.0: enabling device (0000 -> 0003)
> > > > [    1.780926] virtio-pci 0000:00:15.0: enabling device (0000 -> 0003)
> > > > ...
> > > > [    1.894852] virtio_rng: probe of virtio1 failed with error -28
> > > > ...
> > > > [    2.063553] virtio_blk virtio2: 1/0/0 default/read/poll queues
> > > > [    2.064260] virtio_blk: probe of virtio2 failed with error -28
> > > > [    2.069080] virtio_net: probe of virtio0 failed with error -28
> > > > 
> > > > 
> > > > i.e. the virtio drivers no longer work.
> > > 
> > > Interesting, as you say this seems to be VirtIO specific as the baseline
> > > tests (using IDE) work fine:
> > > 
> > >    ➜  ./tests/venv/bin/avocado run ./tests/avocado/tuxrun_baselines.py:test_mips64
> > >    JOB ID     : 71f3e3b7080164b78ef1c8c1bb6bc880932d8c9b
> > >    JOB LOG    : /home/alex/avocado/job-results/job-2023-09-05T15.01-71f3e3b/job.log
> > >     (1/2) ./tests/avocado/tuxrun_baselines.py:TuxRunBaselineTest.test_mips64: PASS (12.19 s)
> > >     (2/2) ./tests/avocado/tuxrun_baselines.py:TuxRunBaselineTest.test_mips64el: PASS (11.78 s)
> > >    RESULTS    : PASS 2 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 0 | CANCEL 0
> > >    JOB TIME   : 24.79 s
> > > 
> > > > I tested with current qemu master
> > > > (17780edd81d27fcfdb7a802efc870a99788bd2fc) and mips is still broken
> > > > there.
> > > > 
> > > > Is this issue known about?
> > > 
> > > Could you raise a bug at:
> > > 
> > >    https://gitlab.com/qemu-project/qemu/-/issues
> > 
> > Done, https://gitlab.com/qemu-project/qemu/-/issues/1866
> > 
> > > I'm curious why MIPS VirtIO is affected but nothing else is...
> > 
> > Me too, it seems there is a real code issue somewhere in this...
> 
> This seems to fix the issue for me, but I'm not really sure what
> I'm doing after various hours debugging, so sharing here before
> I take some rest:
> 
> -- >8 --
> diff --git a/softmmu/physmem.c b/softmmu/physmem.c
> index 18277ddd67..ec31ebcb56 100644
> --- a/softmmu/physmem.c
> +++ b/softmmu/physmem.c
> @@ -2517,7 +2517,7 @@ static void tcg_commit(MemoryListener *listener)
>        * That said, the listener is also called during realize, before
>        * all of the tcg machinery for run-on is initialized: thus halt_cond.
>        */
> -    if (cpu->halt_cond) {
> +    if (cpu->halt_cond && !qemu_cpu_is_self(cpu)) {
>           async_run_on_cpu(cpu, tcg_commit_cpu, RUN_ON_CPU_HOST_PTR(cpuas));
>       } else {
>           tcg_commit_cpu(cpu, RUN_ON_CPU_HOST_PTR(cpuas));

I tested with the above and confirmed it does fix 8.1.0 for the mips
test I was using.

Thanks!

Richard


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2023-09-06 16:47 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-09-05 13:19 mips system emulation failure with virtio Richard Purdie
2023-09-05 13:59 ` Alex Bennée
2023-09-05 14:50   ` Richard Purdie
2023-09-05 15:12     ` Philippe Mathieu-Daudé
2023-09-05 15:53       ` Richard Purdie
2023-09-05 16:46         ` Philippe Mathieu-Daudé
2023-09-05 16:50           ` Richard Purdie
2023-09-06 15:50     ` Philippe Mathieu-Daudé
2023-09-06 16:06       ` Philippe Mathieu-Daudé
2023-09-06 16:46       ` Richard Purdie

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).