* Re: [PATCH] target/i386: fix direction of "32-bit MMU" test
2024-04-05 17:30 ` Michael Tokarev
@ 2024-04-08 9:35 ` Philippe Mathieu-Daudé
2024-04-08 20:12 ` Paolo Bonzini
2024-04-09 11:13 ` Zhao Liu
2 siblings, 0 replies; 11+ messages in thread
From: Philippe Mathieu-Daudé @ 2024-04-08 9:35 UTC (permalink / raw)
To: Michael Tokarev, Paolo Bonzini, qemu-devel
Cc: Mark Cave-Ayland, qemu-stable, Richard Henderson,
Giuseppe Ghibò
On 5/4/24 19:30, Michael Tokarev wrote:
> 01.04.2024 09:02, Michael Tokarev:
>
>> Anyone can guess why this rather trivial and obviously correct patch
>> causes segfaults
>> in a few tests in staging-7.2 - when run in tcg mode, namely:
>>
>> pxe-test
>> migration-test
>> boot-serial-test
>> bios-tables-test
>> vmgenid-test
>> cdrom-test
>>
>> When reverting this single commit from staging-7.2, it all works fine
>> again.
>
> It sigsegvs in probe_access_internal():
>
> CPUTLBEntry *entry = tlb_entry(env, mmu_idx, addr); -- this one
> returns NULL,
>
> and next there's a call
>
> tlb_addr = tlb_read_ofs(entry, elt_ofs);
>
> which fails.
>
> #0 0x0000555555c5de8a in tlb_read_ofs (ofs=8, entry=0x0) at
> 7.2/accel/tcg/cputlb.c:1455
> #1 probe_access_internal
> (env=0x555556a862a0, addr=4294967280,
> fault_size=fault_size@entry=1,
> access_type=access_type@entry=MMU_INST_FETCH, mmu_idx=5,
> nonfault=nonfault@entry=false, phost=0x7fffea4d32a0,
> pfull=0x7fffea4d3298, retaddr=0)
> at 7.2/accel/tcg/cputlb.c:1555
> #2 0x0000555555c62aba in get_page_addr_code_hostp
> (env=<optimized out>, addr=addr@entry=4294967280,
> hostp=hostp@entry=0x0)
> at 7.2/accel/tcg/cputlb.c:1691
> #3 0x0000555555c52b54 in get_page_addr_code (addr=4294967280,
> env=<optimized out>)
> at 7.2/include/exec/exec-all.h:714
> #4 tb_htable_lookup
> (cpu=cpu@entry=0x555556a85530, pc=pc@entry=4294967280,
> cs_base=cs_base@entry=4294901760, flags=flags@entry=64,
> cflags=cflags@entry=4278190080) at 7.2/accel/tcg/cpu-exec.c:236
> #5 0x0000555555c53e8e in tb_lookup
> (cflags=4278190080, flags=64, cs_base=4294901760, pc=4294967280,
> cpu=0x555556a85530)
> at 7.2/accel/tcg/cpu-exec.c:270
> #6 cpu_exec (cpu=cpu@entry=0x555556a85530) at
> 7.2/accel/tcg/cpu-exec.c:1001
> #7 0x0000555555c75d2f in tcg_cpus_exec (cpu=cpu@entry=0x555556a85530)
> at 7.2/accel/tcg/tcg-accel-ops.c:69
> #8 0x0000555555c75e80 in mttcg_cpu_thread_fn
> (arg=arg@entry=0x555556a85530)
> at 7.2/accel/tcg/tcg-accel-ops-mttcg.c:95
> #9 0x0000555555ded098 in qemu_thread_start (args=0x555556adac40)
> at 7.2/util/qemu-thread-posix.c:505
> #10 0x00007ffff5793134 in start_thread (arg=<optimized out>)
> #11 0x00007ffff58137dc in clone3 ()
>
>
> I'm removing this whole set from 7.2 for now:
>
> 2cc68629a6fc target/i386: fix direction of "32-bit MMU" test
> 90f641531c78 target/i386: use separate MMU indexes for 32-bit accesses
> 5f97afe2543f target/i386: introduce function to query MMU indices
Cc'ing Giuseppe Ghibò for
https://gitlab.com/qemu-project/qemu/-/issues/2264
> This leaves us with
>
> b1661801c184 "target/i386: Fix physical address truncation"
>
> but without its fix, 2cc68629a6fc.
>
> It looks like I should revert b1661801c184 from 7.2 too, re-opening
> https://gitlab.com/qemu-project/qemu/-/issues/2040 - since to me it isn't
> clear if this change actually fixes this issue or not without the
> previous change, 90f641531c78, which is missing from 7.2.10.
>
> At the very least this will simplify possible another attempt to
> cherry-pick
> these changes to 7.2.
>
> Thanks,
>
> /mjt
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] target/i386: fix direction of "32-bit MMU" test
2024-04-05 17:30 ` Michael Tokarev
2024-04-08 9:35 ` Philippe Mathieu-Daudé
@ 2024-04-08 20:12 ` Paolo Bonzini
2024-04-08 20:18 ` Michael Tokarev
2024-04-09 11:13 ` Zhao Liu
2 siblings, 1 reply; 11+ messages in thread
From: Paolo Bonzini @ 2024-04-08 20:12 UTC (permalink / raw)
To: Michael Tokarev
Cc: qemu-devel, Mark Cave-Ayland, qemu-stable, Richard Henderson
[-- Attachment #1: Type: text/plain, Size: 3162 bytes --]
Il ven 5 apr 2024, 19:30 Michael Tokarev <mjt@tls.msk.ru> ha scritto:
> 01.04.2024 09:02, Michael Tokarev:
>
> > Anyone can guess why this rather trivial and obviously correct patch
> causes segfaults
> > in a few tests in staging-7.2 - when run in tcg mode, namely:
> >
> > pxe-test
> > migration-test
> > boot-serial-test
> > bios-tables-test
> > vmgenid-test
> > cdrom-test
> >
> > When reverting this single commit from staging-7.2, it all works fine
> again.
>
> It sigsegvs in probe_access_internal():
>
> CPUTLBEntry *entry = tlb_entry(env, mmu_idx, addr); -- this one returns
> NULL,
>
> and next there's a call
>
> tlb_addr = tlb_read_ofs(entry, elt_ofs);
>
> which fails.
>
I will take a look tomorrow.
Paolo
> #0 0x0000555555c5de8a in tlb_read_ofs (ofs=8, entry=0x0) at
> 7.2/accel/tcg/cputlb.c:1455
> #1 probe_access_internal
> (env=0x555556a862a0, addr=4294967280, fault_size=fault_size@entry=1,
> access_type=access_type@entry=MMU_INST_FETCH, mmu_idx=5,
> nonfault=nonfault@entry=false, phost=0x7fffea4d32a0,
> pfull=0x7fffea4d3298, retaddr=0)
> at 7.2/accel/tcg/cputlb.c:1555
> #2 0x0000555555c62aba in get_page_addr_code_hostp
> (env=<optimized out>, addr=addr@entry=4294967280, hostp=hostp@entry
> =0x0)
> at 7.2/accel/tcg/cputlb.c:1691
> #3 0x0000555555c52b54 in get_page_addr_code (addr=4294967280,
> env=<optimized out>)
> at 7.2/include/exec/exec-all.h:714
> #4 tb_htable_lookup
> (cpu=cpu@entry=0x555556a85530, pc=pc@entry=4294967280,
> cs_base=cs_base@entry=4294901760, flags=flags@entry=64,
> cflags=cflags@entry=4278190080) at
> 7.2/accel/tcg/cpu-exec.c:236
> #5 0x0000555555c53e8e in tb_lookup
> (cflags=4278190080, flags=64, cs_base=4294901760, pc=4294967280,
> cpu=0x555556a85530)
> at 7.2/accel/tcg/cpu-exec.c:270
> #6 cpu_exec (cpu=cpu@entry=0x555556a85530) at
> 7.2/accel/tcg/cpu-exec.c:1001
> #7 0x0000555555c75d2f in tcg_cpus_exec (cpu=cpu@entry=0x555556a85530)
> at 7.2/accel/tcg/tcg-accel-ops.c:69
> #8 0x0000555555c75e80 in mttcg_cpu_thread_fn (arg=arg@entry
> =0x555556a85530)
> at 7.2/accel/tcg/tcg-accel-ops-mttcg.c:95
> #9 0x0000555555ded098 in qemu_thread_start (args=0x555556adac40)
> at 7.2/util/qemu-thread-posix.c:505
> #10 0x00007ffff5793134 in start_thread (arg=<optimized out>)
> #11 0x00007ffff58137dc in clone3 ()
>
>
> I'm removing this whole set from 7.2 for now:
>
> 2cc68629a6fc target/i386: fix direction of "32-bit MMU" test
> 90f641531c78 target/i386: use separate MMU indexes for 32-bit accesses
> 5f97afe2543f target/i386: introduce function to query MMU indices
>
> This leaves us with
>
> b1661801c184 "target/i386: Fix physical address truncation"
>
> but without its fix, 2cc68629a6fc.
>
> It looks like I should revert b1661801c184 from 7.2 too, re-opening
> https://gitlab.com/qemu-project/qemu/-/issues/2040 - since to me it isn't
> clear if this change actually fixes this issue or not without the
> previous change, 90f641531c78, which is missing from 7.2.10.
>
> At the very least this will simplify possible another attempt to
> cherry-pick
> these changes to 7.2.
>
> Thanks,
>
> /mjt
>
>
[-- Attachment #2: Type: text/html, Size: 4191 bytes --]
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] target/i386: fix direction of "32-bit MMU" test
2024-04-08 20:12 ` Paolo Bonzini
@ 2024-04-08 20:18 ` Michael Tokarev
0 siblings, 0 replies; 11+ messages in thread
From: Michael Tokarev @ 2024-04-08 20:18 UTC (permalink / raw)
To: Paolo Bonzini
Cc: qemu-devel, Mark Cave-Ayland, qemu-stable, Richard Henderson
08.04.2024 23:12, Paolo Bonzini wrote:
> Il ven 5 apr 2024, 19:30 Michael Tokarev <mjt@tls.msk.ru <mailto:mjt@tls.msk.ru>> ha scritto:
> It sigsegvs in probe_access_internal():
>
> CPUTLBEntry *entry = tlb_entry(env, mmu_idx, addr); -- this one returns NULL,
>
> and next there's a call
>
> tlb_addr = tlb_read_ofs(entry, elt_ofs);
>
> which fails.
>
>
> I will take a look tomorrow.
The changes on top of 7.2.10 are available at https://gitlab.com/mjt0k/qemu/-/commits/7.2-i386-mmu-idx/ -
I'm still blaming myself for bad back-port, but I can't find where I failed.
Thanks,
/mjt
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] target/i386: fix direction of "32-bit MMU" test
2024-04-05 17:30 ` Michael Tokarev
2024-04-08 9:35 ` Philippe Mathieu-Daudé
2024-04-08 20:12 ` Paolo Bonzini
@ 2024-04-09 11:13 ` Zhao Liu
2024-04-09 11:02 ` Paolo Bonzini
2 siblings, 1 reply; 11+ messages in thread
From: Zhao Liu @ 2024-04-09 11:13 UTC (permalink / raw)
To: Michael Tokarev, Paolo Bonzini
Cc: qemu-devel, Mark Cave-Ayland, qemu-stable, Richard Henderson,
Philippe Mathieu-Daudé, Giuseppe Ghibò, Zhao Liu
Hi Michael & Paolo,
On Fri, Apr 05, 2024 at 08:30:43PM +0300, Michael Tokarev wrote:
> Date: Fri, 5 Apr 2024 20:30:43 +0300
> From: Michael Tokarev <mjt@tls.msk.ru>
> Subject: Re: [PATCH] target/i386: fix direction of "32-bit MMU" test
>
> 01.04.2024 09:02, Michael Tokarev:
>
> > Anyone can guess why this rather trivial and obviously correct patch causes segfaults
> > in a few tests in staging-7.2 - when run in tcg mode, namely:
> >
> > pxe-test
> > migration-test
> > boot-serial-test
> > bios-tables-test
> > vmgenid-test
> > cdrom-test
> >
> > When reverting this single commit from staging-7.2, it all works fine again.
>
> It sigsegvs in probe_access_internal():
>
> CPUTLBEntry *entry = tlb_entry(env, mmu_idx, addr); -- this one returns NULL,
>
> and next there's a call
>
> tlb_addr = tlb_read_ofs(entry, elt_ofs);
>
> which fails.
>
> #0 0x0000555555c5de8a in tlb_read_ofs (ofs=8, entry=0x0) at 7.2/accel/tcg/cputlb.c:1455
> #1 probe_access_internal
> (env=0x555556a862a0, addr=4294967280, fault_size=fault_size@entry=1,
> access_type=access_type@entry=MMU_INST_FETCH, mmu_idx=5,
> nonfault=nonfault@entry=false, phost=0x7fffea4d32a0, pfull=0x7fffea4d3298,
> retaddr=0)
> at 7.2/accel/tcg/cputlb.c:1555
> #2 0x0000555555c62aba in get_page_addr_code_hostp
> (env=<optimized out>, addr=addr@entry=4294967280, hostp=hostp@entry=0x0)
> at 7.2/accel/tcg/cputlb.c:1691
> #3 0x0000555555c52b54 in get_page_addr_code (addr=4294967280, env=<optimized out>)
> at 7.2/include/exec/exec-all.h:714
> #4 tb_htable_lookup
> (cpu=cpu@entry=0x555556a85530, pc=pc@entry=4294967280,
> cs_base=cs_base@entry=4294901760, flags=flags@entry=64,
> cflags=cflags@entry=4278190080) at 7.2/accel/tcg/cpu-exec.c:236
> #5 0x0000555555c53e8e in tb_lookup
> (cflags=4278190080, flags=64, cs_base=4294901760, pc=4294967280, cpu=0x555556a85530)
> at 7.2/accel/tcg/cpu-exec.c:270
> #6 cpu_exec (cpu=cpu@entry=0x555556a85530) at 7.2/accel/tcg/cpu-exec.c:1001
> #7 0x0000555555c75d2f in tcg_cpus_exec (cpu=cpu@entry=0x555556a85530)
> at 7.2/accel/tcg/tcg-accel-ops.c:69
> #8 0x0000555555c75e80 in mttcg_cpu_thread_fn (arg=arg@entry=0x555556a85530)
> at 7.2/accel/tcg/tcg-accel-ops-mttcg.c:95
> #9 0x0000555555ded098 in qemu_thread_start (args=0x555556adac40)
> at 7.2/util/qemu-thread-posix.c:505
> #10 0x00007ffff5793134 in start_thread (arg=<optimized out>)
> #11 0x00007ffff58137dc in clone3 ()
>
I debugged it manually, and found the problem occurs in tlb_index() with
mmu_idx=5.
For v7.2, the maximum mmu index supported by i386 is 4 (since
NB_MMU_MODES = 5 defined in target/i386/cpu-param.h).
On Michael's 7.2-i386-mmu-idx tree, the commit 9fc3a7828d25 ("target/i386:
use separate MMU indexes for 32-bit accesses") introduced more indexes
without relaxing the NB_MMU_MODES for i386.
Before this fix, probe_access_internal() just got the wrong mmu_idx as 4,
and it's not out of bounds. After this fix, the right mmu_idx=5 is truly
out of bounds.
On the master branch, there's no such issue since the commits ffd824f3f32d
("include/exec: Set default NB_MMU_MODES to 16") and 6787318a5d86
("target/i386: Remove NB_MMU_MODES define") relaxed upper limit of MMU
index for i386.
So these 2 commits should also be picked (with these 2 commits, tests in
"make check" passed).
However, the cleanup for NB_MMU_MODES is a big series (total 23 patches,
from the commit ffd824f3f32d ("include/exec: Set default NB_MMU_MODES to
16") to 00da6b49a227 ("include/exec: Remove guards around NB_MMU_MODES")),
maybe changes in other arches should be picked together?
In addition, I think maybe we could add assert() in tlb_index() and
tlb_entry() to ensure the mmu_idx not exceed the limit of NB_MMU_MODES.
But I'm a little unsure if this will hurt performance, because these 2
helpers look like the hotspot functions.
-Zhao
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] target/i386: fix direction of "32-bit MMU" test
2024-04-09 11:13 ` Zhao Liu
@ 2024-04-09 11:02 ` Paolo Bonzini
2024-04-09 18:21 ` Michael Tokarev
0 siblings, 1 reply; 11+ messages in thread
From: Paolo Bonzini @ 2024-04-09 11:02 UTC (permalink / raw)
To: Zhao Liu
Cc: Michael Tokarev, qemu-devel, Mark Cave-Ayland, qemu-stable,
Richard Henderson, Philippe Mathieu-Daudé,
Giuseppe Ghibò
On Tue, Apr 9, 2024 at 12:59 PM Zhao Liu <zhao1.liu@intel.com> wrote:
>
> Hi Michael & Paolo,
>
> On Fri, Apr 05, 2024 at 08:30:43PM +0300, Michael Tokarev wrote:
> > Date: Fri, 5 Apr 2024 20:30:43 +0300
> > From: Michael Tokarev <mjt@tls.msk.ru>
> > Subject: Re: [PATCH] target/i386: fix direction of "32-bit MMU" test
> >
> > 01.04.2024 09:02, Michael Tokarev:
> >
> > > Anyone can guess why this rather trivial and obviously correct patch causes segfaults
> > > in a few tests in staging-7.2 - when run in tcg mode, namely:
> > >
> > > pxe-test
> > > migration-test
> > > boot-serial-test
> > > bios-tables-test
> > > vmgenid-test
> > > cdrom-test
> > >
> > > When reverting this single commit from staging-7.2, it all works fine again.
> >
> > It sigsegvs in probe_access_internal():
> >
> > CPUTLBEntry *entry = tlb_entry(env, mmu_idx, addr); -- this one returns NULL,
> >
> > and next there's a call
> >
> > tlb_addr = tlb_read_ofs(entry, elt_ofs);
> >
> > which fails.
> >
> > #0 0x0000555555c5de8a in tlb_read_ofs (ofs=8, entry=0x0) at 7.2/accel/tcg/cputlb.c:1455
> > #1 probe_access_internal
> > (env=0x555556a862a0, addr=4294967280, fault_size=fault_size@entry=1,
> > access_type=access_type@entry=MMU_INST_FETCH, mmu_idx=5,
> > nonfault=nonfault@entry=false, phost=0x7fffea4d32a0, pfull=0x7fffea4d3298,
> > retaddr=0)
> > at 7.2/accel/tcg/cputlb.c:1555
> > #2 0x0000555555c62aba in get_page_addr_code_hostp
> > (env=<optimized out>, addr=addr@entry=4294967280, hostp=hostp@entry=0x0)
> > at 7.2/accel/tcg/cputlb.c:1691
> > #3 0x0000555555c52b54 in get_page_addr_code (addr=4294967280, env=<optimized out>)
> > at 7.2/include/exec/exec-all.h:714
> > #4 tb_htable_lookup
> > (cpu=cpu@entry=0x555556a85530, pc=pc@entry=4294967280,
> > cs_base=cs_base@entry=4294901760, flags=flags@entry=64,
> > cflags=cflags@entry=4278190080) at 7.2/accel/tcg/cpu-exec.c:236
> > #5 0x0000555555c53e8e in tb_lookup
> > (cflags=4278190080, flags=64, cs_base=4294901760, pc=4294967280, cpu=0x555556a85530)
> > at 7.2/accel/tcg/cpu-exec.c:270
> > #6 cpu_exec (cpu=cpu@entry=0x555556a85530) at 7.2/accel/tcg/cpu-exec.c:1001
> > #7 0x0000555555c75d2f in tcg_cpus_exec (cpu=cpu@entry=0x555556a85530)
> > at 7.2/accel/tcg/tcg-accel-ops.c:69
> > #8 0x0000555555c75e80 in mttcg_cpu_thread_fn (arg=arg@entry=0x555556a85530)
> > at 7.2/accel/tcg/tcg-accel-ops-mttcg.c:95
> > #9 0x0000555555ded098 in qemu_thread_start (args=0x555556adac40)
> > at 7.2/util/qemu-thread-posix.c:505
> > #10 0x00007ffff5793134 in start_thread (arg=<optimized out>)
> > #11 0x00007ffff58137dc in clone3 ()
> >
>
> I debugged it manually, and found the problem occurs in tlb_index() with
> mmu_idx=5.
>
> For v7.2, the maximum mmu index supported by i386 is 4 (since
> NB_MMU_MODES = 5 defined in target/i386/cpu-param.h).
>
> On Michael's 7.2-i386-mmu-idx tree, the commit 9fc3a7828d25 ("target/i386:
> use separate MMU indexes for 32-bit accesses") introduced more indexes
> without relaxing the NB_MMU_MODES for i386.
>
> Before this fix, probe_access_internal() just got the wrong mmu_idx as 4,
> and it's not out of bounds. After this fix, the right mmu_idx=5 is truly
> out of bounds.
>
> On the master branch, there's no such issue since the commits ffd824f3f32d
> ("include/exec: Set default NB_MMU_MODES to 16") and 6787318a5d86
> ("target/i386: Remove NB_MMU_MODES define") relaxed upper limit of MMU
> index for i386.
Thanks Zhao! Alternatively, it's enough to set NB_MMU_MODES to 8 in
commit 9fc3a7828d25.
Paolo
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] target/i386: fix direction of "32-bit MMU" test
2024-04-09 11:02 ` Paolo Bonzini
@ 2024-04-09 18:21 ` Michael Tokarev
0 siblings, 0 replies; 11+ messages in thread
From: Michael Tokarev @ 2024-04-09 18:21 UTC (permalink / raw)
To: Paolo Bonzini, Zhao Liu
Cc: qemu-devel, Mark Cave-Ayland, qemu-stable, Richard Henderson,
Philippe Mathieu-Daudé, Giuseppe Ghibò
09.04.2024 14:02, Paolo Bonzini wrote:
> On Tue, Apr 9, 2024 at 12:59 PM Zhao Liu <zhao1.liu@intel.com> wrote:
>>
>> Hi Michael & Paolo,
>> I debugged it manually, and found the problem occurs in tlb_index() with
>> mmu_idx=5.
>>
>> For v7.2, the maximum mmu index supported by i386 is 4 (since
>> NB_MMU_MODES = 5 defined in target/i386/cpu-param.h).
>> On Michael's 7.2-i386-mmu-idx tree, the commit 9fc3a7828d25 ("target/i386:
>> use separate MMU indexes for 32-bit accesses") introduced more indexes
>> without relaxing the NB_MMU_MODES for i386.
Oh well. I starred a few times at the new MMU_foo_IDXes in v8.2.0-1876-g90f641531c78
"target/i386: use separate MMU indexes for 32-bit accesses" a few times, it looked
somehow wrong, but it didn't occur to me what exactly was wrong there :)
>> Before this fix, probe_access_internal() just got the wrong mmu_idx as 4,
>> and it's not out of bounds. After this fix, the right mmu_idx=5 is truly
>> out of bounds.
>>
>> On the master branch, there's no such issue since the commits ffd824f3f32d
>> ("include/exec: Set default NB_MMU_MODES to 16") and 6787318a5d86
>> ("target/i386: Remove NB_MMU_MODES define") relaxed upper limit of MMU
>> index for i386.
>
> Thanks Zhao! Alternatively, it's enough to set NB_MMU_MODES to 8 in
> commit 9fc3a7828d25.
Thank you, both of you, this is exactly the prob. I fixed it by increasing
NB_MMU_NODES in "use separate MMU indexes for 32-bit accesses", since this
is the change which introduces the new indexes.
Now 7.2 work again, and appears to be under control, which is an excellent
news! :)
/mjt
^ permalink raw reply [flat|nested] 11+ messages in thread