* [Qemu-devel] How to debug crash in TCG code?
@ 2017-07-26 22:33 BALATON Zoltan
[not found] ` <alpine.BSF.2.21.1708192307590.44932@zero.eik.bme.hu>
0 siblings, 1 reply; 4+ messages in thread
From: BALATON Zoltan @ 2017-07-26 22:33 UTC (permalink / raw)
To: qemu-devel
Hello,
I'm getting a segfault in generated code that I don't know how to debug
further. The back trace shows:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffe87f7700 (LWP 24372)]
0x00005555557ee0a1 in io_readx (env=0x7fffe88002a0, iotlbentry=0x7fffe8811d60, addr=3623882752, retaddr=140737096497196, size=2)
at accel/tcg/cputlb.c:766
766 if (mr->global_locking) {
(gdb) bt
#0 0x00005555557ee0a1 in io_readx (env=0x7fffe88002a0, iotlbentry=0x7fffe8811d60, addr=3623882752, retaddr=140737096497196, size=2)
at accel/tcg/cputlb.c:766
#1 0x00005555557eede9 in io_readw (env=0x7fffe88002a0, mmu_idx=1, index=4, addr=3623882752, retaddr=140737096497196)
at softmmu_template.h:104
#2 0x00005555557ef1f0 in helper_be_lduw_mmu (env=0x7fffe88002a0, addr=3623882752, oi=145, retaddr=140737096497196)
at softmmu_template.h:208
#3 0x00007fffe8a4b8d3 in code_gen_buffer ()
#4 0x00005555557f69b8 in cpu_tb_exec (cpu=0x7fffe87f8010, itb=0x7fffe8a4b660 <code_gen_buffer+1242678>)
at accel/tcg/cpu-exec.c:166
#5 0x00005555557f769f in cpu_loop_exec_tb (cpu=0x7fffe87f8010, tb=0x7fffe8a4b660 <code_gen_buffer+1242678>,
last_tb=0x7fffe87f6af8, tb_exit=0x7fffe87f6af4) at accel/tcg/cpu-exec.c:578
#6 0x00005555557f7992 in cpu_exec (cpu=0x7fffe87f8010) at accel/tcg/cpu-exec.c:676
#7 0x00005555557c2955 in tcg_cpu_exec (cpu=0x7fffe87f8010) at cpus.c:1270
#8 0x00005555557c2b8c in qemu_tcg_rr_cpu_thread_fn (arg=0x7fffe87f8010) at cpus.c:1365
#9 0x00007ffff5d515bd in start_thread () from /lib64/libpthread.so.0
#10 0x00007ffff42d062d in clone () from /lib64/libc.so.6
(gdb) p mr
$1 = (MemoryRegion *) 0x0
This is happening while reading from an emulated ATAPI DVD and happens
after several successful reads from the same device with similar calls
succeeding without a problem until hitting the above error. The point
where this happens seems to depend on the ammount of guest code executed.
The more code is there, the sooner this happens. (This is running in
TCG ppc-softmmu on x86_64 host in case that's relevant but I can't make
an easy test case to reproduce it.)
First I thought it may be related to MTTCG or removing the iothread lock
but I could also get the same with 791158d9, where the back trace is:
#0 0x00005555557e1de5 in memory_region_access_valid (mr=0x0, addr=0, size=2, is_write=false) at memory.c:1204
#1 0x00005555557e200a in memory_region_dispatch_read (mr=0x0, addr=0, pval=0x7fffe4854488, size=2, attrs=...)
at memory.c:1268
#2 0x00005555557e7f9c in io_readx (env=0x7ffff7e232a0, iotlbentry=0x7ffff7e34d58, addr=3623882752,
retaddr=140737066697996, size=2) at cputlb.c:506
#3 0x00005555557e8a9e in io_readw (env=0x7ffff7e232a0, mmu_idx=1, index=4, addr=3623882752, retaddr=140737066697996)
at softmmu_template.h:104
#4 0x00005555557e8eb0 in helper_be_lduw_mmu (env=0x7ffff7e232a0, addr=3623882752, oi=145, retaddr=140737066697996)
at softmmu_template.h:208
#5 0x00007fffe6de05b3 in code_gen_buffer ()
#6 0x0000555555783fca in cpu_tb_exec (cpu=0x7ffff7e1b010, itb=0x7fffe49a2080) at cpu-exec.c:164
#7 0x0000555555784b97 in cpu_loop_exec_tb (cpu=0x7ffff7e1b010, tb=0x7fffe49a2080, last_tb=0x7fffe4854af8,
tb_exit=0x7fffe4854af4, sc=0x7fffe4854b10) at cpu-exec.c:550
#8 0x0000555555784ea0 in cpu_exec (cpu=0x7ffff7e1b010) at cpu-exec.c:655
#9 0x00005555557c8da3 in tcg_cpu_exec (cpu=0x7ffff7e1b010) at cpus.c:1253
#10 0x00005555557c900d in qemu_tcg_cpu_thread_fn (arg=0x7ffff7e1b010) at cpus.c:1345
#11 0x00007ffff45b65bd in start_thread () from /lib64/libpthread.so.0
#12 0x00007ffff42f262d in clone () from /lib64/libc.so.6
So it seems to be caused not by thread locking issues by recent changes
but maybe by somehow referencing an invalid iotlb entry in a TB. My theory
(without knowing anything about how this part of QEMU works) is that as
code is executed instruction and data exceptions are triggered which make
changes in TLB entries but this does not correctly invalidate a TB that
already references this entry and this causes the crash when this happens
(but it works until the TLB is not changed which explains why less code
works and more code which makes these exceptions more frequent triggers it
sooner). But I have no idea if this theory is correct or how to verify it
and where to look for the problem and fix.
Does anyone have any idea that could help or point me to the right
direction please?
Thank you,
BALATON Zoltan
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [Qemu-devel] [Qemu-ppc] How to debug crash in TCG code?
[not found] ` <d4ffee34-8a82-95ea-bedc-ed37ae5fae67@ilande.co.uk>
@ 2017-10-08 16:14 ` BALATON Zoltan
2017-10-15 11:30 ` BALATON Zoltan
0 siblings, 1 reply; 4+ messages in thread
From: BALATON Zoltan @ 2017-10-08 16:14 UTC (permalink / raw)
To: qemu-devel; +Cc: qemu-ppc
Hello,
I'm still trying to debug a crash that I don't understand why is
happening and how to debug it further. Any hints are apreciated. I've
described the problem previously in
http://lists.nongnu.org/archive/html/qemu-ppc/2017-08/msg00249.html
but I've dug deeper now, here's what I've found.
I'm running with -accel tcg,thread=single -singlestep so it does not seem
to be a multi threading problem. The seg fault happens when U-Boot running
on the emulated embedded PPC SoC accesses a PCI device but after several
successful iteration of the same access so it seems to be some internal
inconsistency that arrises while QEMU is running, not something generally
broken. It is also dependent on code run by the client or QEMU, so e.g. it
disappears when I enable -d in_asm or use a different U-Boot version. I'm
not sure what exactly causes the bug to happen or disappear but I can
reproduce it reliably with certain images at the same place. It always
happens during io_readx or io_writex. Here's the info I could get with a
debugger with some additional debug and trace messages enabled:
Here's the PCI device detected by U-Boot and BARs mapped:
The device is implemended here:
http://lists.nongnu.org/archive/html/qemu-ppc/2017-08/msg00246.html
4744@1507472117.454656:pci_cfg_read sii3112 01:0 @0xe -> 0x0
tlb_set_page_with_attrs: vaddr=0ffb0800 paddr=0x000000000ffb0800 prot=7 idx=1
tlb_set_page_with_attrs: vaddr=0ff73c00 paddr=0x000000000ff73c00 prot=7 idx=1
4744@1507472117.454676:pci_cfg_read sii3112 01:0 @0x0 -> 0x1095
4744@1507472117.454766:pci_cfg_read sii3112 01:0 @0x2 -> 0x3112
4744@1507472117.454775:pci_cfg_read sii3112 01:0 @0xa -> 0x104
tlb_set_page_with_attrs: vaddr=0ff74000 paddr=0x000000000ff74000 prot=7 idx=1
4744@1507472117.454794:pci_cfg_read sii3112 01:0 @0xa -> 0x104
4744@1507472117.454905:pci_cfg_read sii3112 01:0 @0x4 -> 0x0
4744@1507472117.454916:pci_cfg_write sii3112 01:0 @0x10 <- 0xffffffff
4744@1507472117.454927:pci_cfg_read sii3112 01:0 @0x10 -> 0xfffffff9
4744@1507472117.456524:pci_cfg_write sii3112 01:0 @0x10 <- 0x1000
tlb_set_page_with_attrs: vaddr=0ff74800 paddr=0x000000000ff74800 prot=7 idx=1
4744@1507472117.456823:pci_cfg_write sii3112 01:0 @0x14 <- 0xffffffff
4744@1507472117.456833:pci_cfg_read sii3112 01:0 @0x14 -> 0xfffffffd
4744@1507472117.456844:pci_cfg_write sii3112 01:0 @0x14 <- 0x1008
4744@1507472117.456852:pci_cfg_write sii3112 01:0 @0x18 <- 0xffffffff
4744@1507472117.456859:pci_cfg_read sii3112 01:0 @0x18 -> 0xfffffff9
4744@1507472117.456869:pci_cfg_write sii3112 01:0 @0x18 <- 0x1010
4744@1507472117.456878:pci_cfg_write sii3112 01:0 @0x1c <- 0xffffffff
4744@1507472117.456885:pci_cfg_read sii3112 01:0 @0x1c -> 0xfffffffd
4744@1507472117.456895:pci_cfg_write sii3112 01:0 @0x1c <- 0x1018
4744@1507472117.456903:pci_cfg_write sii3112 01:0 @0x20 <- 0xffffffff
4744@1507472117.456911:pci_cfg_read sii3112 01:0 @0x20 -> 0xfffffff1
4744@1507472117.456920:pci_cfg_write sii3112 01:0 @0x20 <- 0x1020
4744@1507472117.456929:pci_cfg_write sii3112 01:0 @0x24 <- 0xffffffff
4744@1507472117.456936:pci_cfg_read sii3112 01:0 @0x24 -> 0xfffffe00
4744@1507472117.457173:pci_cfg_write sii3112 01:0 @0x24 <- 0x80000000
4744@1507472117.457208:pci_cfg_write sii3112 01:0 @0x4 <- 0x7
4744@1507472117.457213:pci_update_mappings_add d=0x55aca57d3ce0 00:01.0 0,0x1000+0x8
tlb_flush_nocheck: (count: 23)
4744@1507472117.457488:pci_update_mappings_add d=0x55aca57d3ce0 00:01.0 1,0x1008+0x4
tlb_flush_nocheck: (count: 24)
4744@1507472117.457769:pci_update_mappings_add d=0x55aca57d3ce0 00:01.0 2,0x1010+0x8
tlb_flush_nocheck: (count: 25)
4744@1507472117.458048:pci_update_mappings_add d=0x55aca57d3ce0 00:01.0 3,0x1018+0x4
tlb_flush_nocheck: (count: 26)
4744@1507472117.458324:pci_update_mappings_add d=0x55aca57d3ce0 00:01.0 4,0x1020+0x10
tlb_flush_nocheck: (count: 27)
4744@1507472117.458604:pci_update_mappings_add d=0x55aca57d3ce0 00:01.0 5,0x80000000+0x200
(BAR0-4 are alias memory regions to BAR5 so all access should go to BAR5
at the end even if U-Boot accesses it through BAR0-4)
Here U-Boot starts using the device:
tlb_set_page_with_attrs: vaddr=dec00000 paddr=0x0000000c0ec00000 prot=3 idx=1
4744@1507472121.253966:pci_cfg_read ppc4xx-host-bridge 00:0 @0xe -> 0x0
4744@1507472121.253982:pci_cfg_read ppc4xx-host-bridge 00:0 @0x0 -> 0x1014
4744@1507472121.253992:pci_cfg_read ppc4xx-host-bridge 00:0 @0x2 -> 0x27f
4744@1507472121.254005:pci_cfg_read sii3112 01:0 @0xe -> 0x0
4744@1507472121.254015:pci_cfg_read sii3112 01:0 @0x0 -> 0x1095
4744@1507472121.254024:pci_cfg_read sii3112 01:0 @0x2 -> 0x3112
tlb_set_page_with_attrs: vaddr=00000800 paddr=0x0000000000000800 prot=7 idx=1
tlb_set_page_with_attrs: vaddr=0ff6f000 paddr=0x000000000ff6f000 prot=7 idx=1
tlb_set_page_with_attrs: vaddr=0ff74000 paddr=0x000000000ff74000 prot=7 idx=1
tlb_set_page_with_attrs: vaddr=0ff73c00 paddr=0x000000000ff73c00 prot=7 idx=1
tlb_set_page_with_attrs: vaddr=0ffb0c00 paddr=0x000000000ffb0c00 prot=7 idx=1
4744@1507472121.254748:pci_cfg_write sii3112 01:0 @0x4 <- 0x7
4744@1507472121.254846:pci_cfg_read sii3112 01:0 @0x8 -> 0x1040001
4744@1507472121.255085:pci_cfg_write sii3112 01:0 @0xc <- 0x1
4744@1507472121.255228:pci_cfg_read sii3112 01:0 @0x10 -> 0x1001
4744@1507472121.255417:pci_cfg_read sii3112 01:0 @0x18 -> 0x1011
tlb_set_page_with_attrs: vaddr=0ffa3800 paddr=0x000000000ffa3800 prot=7 idx=1
And can talk to the device emulation:
tlb_set_page_with_attrs: vaddr=0ff89000 paddr=0x000000000ff89000 prot=7 idx=1
tlb_set_page_with_attrs: vaddr=d8001000 paddr=0x0000000c08001000 prot=3 idx=1
4744@1507472121.402003:sii3112_write bmdma: write (size 1) 0x86 : 0xe0
Address 0x86 in BAR5 is accessed via vaddr d8001006 which is an alias via BAR0
tlb_set_page_with_attrs: vaddr=0ff8a400 paddr=0x000000000ff8a400 prot=7 idx=1
tlb_set_page_with_attrs: vaddr=0ffa3400 paddr=0x000000000ffa3400 prot=7 idx=1
tlb_set_page_with_attrs: vaddr=0ffc7800 paddr=0x000000000ffc7800 prot=7 idx=1
tlb_set_page_with_attrs: vaddr=d8001000 paddr=0x0000000c08001000 prot=3 idx=1
4744@1507472121.513050:sii3112_set_irq channel 0 level 0
4744@1507472121.513057:sii3112_read bmdma: read (size 1) 0x87 : 0x50
tlb_set_page_with_attrs: vaddr=0ff6f000 paddr=0x000000000ff6f000 prot=7 idx=1
tlb_set_page_with_attrs: vaddr=0df6d000 paddr=0x000000000df6d000 prot=7 idx=1
tlb_set_page_with_attrs: vaddr=0ffc0800 paddr=0x000000000ffc0800 prot=7 idx=1
tlb_set_page_with_attrs: vaddr=0ff87800 paddr=0x000000000ff87800 prot=7 idx=1
4744@1507472121.613784:sii3112_write bmdma: write (size 1) 0xc6 : 0xe0
tlb_set_page_with_attrs: vaddr=d8001000 paddr=0x0000000c08001000 prot=3 idx=1
4744@1507472121.724016:sii3112_set_irq channel 1 level 0
4744@1507472121.724020:sii3112_read bmdma: read (size 1) 0xc7 : 0x00
tlb_set_page_with_attrs: vaddr=0ff8a400 paddr=0x000000000ff8a400 prot=7 idx=1
Accessing the device seems to work, it identifies the connected ATAPI CD
emulation and can read from it, until it hits null pointer:
tlb_set_page_with_attrs: vaddr=0ff89800 paddr=0x000000000ff89800 prot=7 idx=1
4744@1507472121.949834:sii3112_read bmdma: read (size 1) 0x82 : 0x02
4744@1507472121.950222:sii3112_read bmdma: read (size 2) 0x80 : 0x8005
4744@1507472121.950283:sii3112_read bmdma: read (size 2) 0x80 : 0x2100
4744@1507472121.950289:sii3112_read bmdma: read (size 2) 0x80 : 0x1f
4744@1507472121.950292:sii3112_read bmdma: read (size 2) 0x80 : 0x00
4744@1507472121.950295:sii3112_read bmdma: read (size 2) 0x80 : 0x4551
4744@1507472121.950298:sii3112_read bmdma: read (size 2) 0x80 : 0x554d
4744@1507472121.950301:sii3112_read bmdma: read (size 2) 0x80 : 0x2020
4744@1507472121.950306:sii3112_read bmdma: read (size 2) 0x80 : 0x2020
4744@1507472121.950309:sii3112_read bmdma: read (size 2) 0x80 : 0x4551
4744@1507472121.950312:sii3112_read bmdma: read (size 2) 0x80 : 0x554d
4744@1507472121.950316:sii3112_read bmdma: read (size 2) 0x80 : 0x4420
4744@1507472121.950319:sii3112_read bmdma: read (size 2) 0x80 : 0x4456
4744@1507472121.950323:sii3112_read bmdma: read (size 2) 0x80 : 0x522d
4744@1507472121.950326:sii3112_read bmdma: read (size 2) 0x80 : 0x4d4f
4744@1507472121.950330:sii3112_read bmdma: read (size 2) 0x80 : 0x2020
4744@1507472121.950333:sii3112_read bmdma: read (size 2) 0x80 : 0x2020
4744@1507472121.950338:sii3112_read bmdma: read (size 2) 0x80 : 0x2e32
4744@1507472121.950342:sii3112_set_irq channel 0 level 1
4744@1507472121.950346:sii3112_set_irq channel 0 level 1
4744@1507472121.950348:sii3112_read bmdma: read (size 2) 0x80 : 0x2b35
4744@1507472121.955573:sii3112_set_irq channel 0 level 0
4744@1507472121.955590:sii3112_read bmdma: read (size 1) 0x87 : 0x50
tlb_set_page_with_attrs: vaddr=0ff89c00 paddr=0x000000000ff89c00 prot=7 idx=1
tlb_set_page_with_attrs: vaddr=0ff8a000 paddr=0x000000000ff8a000 prot=7 idx=1
tlb_set_page_with_attrs: vaddr=0ff89400 paddr=0x000000000ff89400 prot=7 idx=1
tlb_set_page_with_attrs: vaddr=d8001000 paddr=0x0000000c08001000 prot=3 idx=1
4744@1507472121.957147:sii3112_write bmdma: write (size 1) 0x86 : 0xe0
4744@1507472121.957158:sii3112_set_irq channel 0 level 0
4744@1507472121.957161:sii3112_read bmdma: read (size 1) 0x87 : 0x50
4744@1507472121.957168:sii3112_write bmdma: write (size 1) 0x81 : 0x00
4744@1507472121.957172:sii3112_write bmdma: write (size 1) 0x82 : 0x00
4744@1507472121.957177:sii3112_write bmdma: write (size 1) 0x83 : 0x00
4744@1507472121.957181:sii3112_write bmdma: write (size 1) 0x84 : 0x00
4744@1507472121.957188:sii3112_write bmdma: write (size 1) 0x85 : 0x00
4744@1507472121.957198:sii3112_write bmdma: write (size 1) 0x86 : 0xe0
4744@1507472121.957204:sii3112_write bmdma: write (size 1) 0x87 : 0xa0
4744@1507472121.957278:sii3112_set_irq channel 0 level 0
4744@1507472121.957283:sii3112_read bmdma: read (size 1) 0x87 : 0x58
4744@1507472121.957289:sii3112_write bmdma: write (size 2) 0x80 : 0x1b
4744@1507472121.957294:sii3112_write bmdma: write (size 2) 0x80 : 0x00
4744@1507472121.957298:sii3112_write bmdma: write (size 2) 0x80 : 0x03
4744@1507472121.957302:sii3112_write bmdma: write (size 2) 0x80 : 0x00
4744@1507472121.957306:sii3112_write bmdma: write (size 2) 0x80 : 0x00
4744@1507472121.957309:sii3112_write bmdma: write (size 2) 0x80 : 0x00
4744@1507472121.957315:sii3112_set_irq channel 0 level 1
4744@1507472121.962334:sii3112_set_irq channel 0 level 0
4744@1507472121.962340:sii3112_read bmdma: read (size 1) 0x87 : 0x50
4744@1507472121.962346:sii3112_read bmdma: read (size 1) 0x85 : 0x00
4744@1507472121.962351:sii3112_read bmdma: read (size 1) 0x84 : 0x00
4744@1507472121.967372:sii3112_set_irq channel 0 level 0
4744@1507472121.967378:sii3112_read bmdma: read (size 1) 0x87 : 0x50
tlb_set_page_with_attrs: vaddr=d8001000 paddr=0x0000000c08001000 prot=3 idx=1
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffe87f7700 (LWP 4761)]
0x00005555557f4bc4 in io_writex (env=0x7fffe88002a0, iotlbentry=0x7fffe8811d60, mmu_idx=1, val=224, addr=3623882758, retaddr=140737099010411,
size=1) at accel/tcg/cputlb.c:803
803 if (mr->global_locking && !qemu_mutex_iothread_locked()) {
(gdb) bt
#0 0x00005555557f4bc4 in io_writex (env=0x7fffe88002a0, iotlbentry=0x7fffe8811d60, mmu_idx=1, val=224, addr=3623882758, retaddr=140737099010411,
size=1) at accel/tcg/cputlb.c:803
#1 0x00005555557f5673 in io_writeb (env=0x7fffe88002a0, mmu_idx=1, index=4, val=224 '\340', addr=3623882758, retaddr=140737099010411)
at accel/tcg/softmmu_template.h:265
#2 0x00005555557f57ff in helper_ret_stb_mmu (env=0x7fffe88002a0, addr=3623882758, val=224 '\340', oi=1, retaddr=140737099010411)
at accel/tcg/softmmu_template.h:300
#3 0x00007fffe8cb116b in code_gen_buffer ()
#4 0x00005555557fdaf1 in cpu_tb_exec (cpu=0x7fffe87f8010, itb=0x7fffe8cb0780 <code_gen_buffer+3753814>)
at accel/tcg/cpu-exec.c:166
#5 0x00005555557fe936 in cpu_loop_exec_tb (cpu=0x7fffe87f8010, tb=0x7fffe8cb0780 <code_gen_buffer+3753814>, last_tb=0x7fffe87f6af8,
tb_exit=0x7fffe87f6af4) at accel/tcg/cpu-exec.c:613
#6 0x00005555557fec29 in cpu_exec (cpu=0x7fffe87f8010) at accel/tcg/cpu-exec.c:711
#7 0x00005555557c7a05 in tcg_cpu_exec (cpu=0x7fffe87f8010) at qemu/cpus.c:1270
#8 0x00005555557c7c3c in qemu_tcg_rr_cpu_thread_fn (arg=0x7fffe87f8010) at qemu/cpus.c:1365
#9 0x00007ffff78cd5bd in start_thread () from /lib64/libpthread.so.0
#10 0x00007ffff42d062d in clone () from /lib64/libc.so.6
(gdb) p mr
$1 = (MemoryRegion *) 0x0
(gdb) p/x addr
$2 = 0xd8001006
(gdb) p *iotlbentry
$3 = {addr = 18446744070085669013, attrs = {unspecified = 1, secure = 0, user = 0, requester_id = 0}}
(gdb) p/x iotlbentry->addr
$4 = 0xffffffff27fff095
(gdb) up
#1 0x00005555557f5673 in io_writeb (env=0x7fffe88002a0, mmu_idx=1, index=4, val=224 '\340', addr=3623882758, retaddr=140737099010411)
at accel/tcg/softmmu_template.h:265
265 return io_writex(env, iotlbentry, mmu_idx, val, addr, retaddr, DATA_SIZE);
(gdb) up
#2 0x00005555557f57ff in helper_ret_stb_mmu (env=0x7fffe88002a0, addr=3623882758, val=224 '\340', oi=1, retaddr=140737099010411)
at accel/tcg/softmmu_template.h:300
300 glue(io_write, SUFFIX)(env, mmu_idx, index, val, addr, retaddr);
(gdb) l 260
255# endif
256#endif
257
258static inline void glue(io_write, SUFFIX)(CPUArchState *env,
259 size_t mmu_idx, size_t index,
260 DATA_TYPE val,
261 target_ulong addr,
262 uintptr_t retaddr)
263{
264 CPUIOTLBEntry *iotlbentry = &env->iotlb[mmu_idx][index];
265 return io_writex(env, iotlbentry, mmu_idx, val, addr, retaddr, DATA_SIZE);
266}
267
268void helper_le_st_name(CPUArchState *env, target_ulong addr, DATA_TYPE val,
269 TCGMemOpIdx oi, uintptr_t retaddr)
270{
271 unsigned mmu_idx = get_mmuidx(oi);
272 int index = (addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
273 target_ulong tlb_addr = env->tlb_table[mmu_idx][index].addr_write;
274 unsigned a_bits = get_alignment_bits(get_memop(oi));
275 uintptr_t haddr;
276
277 if (addr & ((1 << a_bits) - 1)) {
278 cpu_unaligned_access(ENV_GET_CPU(env), addr, MMU_DATA_STORE,
279 mmu_idx, retaddr);
280 }
281
282 /* If the TLB entry is for a different page, reload and try again. */
283 if ((addr & TARGET_PAGE_MASK)
284 != (tlb_addr & (TARGET_PAGE_MASK | TLB_INVALID_MASK))) {
285 if (!VICTIM_TLB_HIT(addr_write, addr)) {
286 tlb_fill(ENV_GET_CPU(env), addr, MMU_DATA_STORE, mmu_idx, retaddr);
287 }
288 tlb_addr = env->tlb_table[mmu_idx][index].addr_write;
289 }
290
291 /* Handle an IO access. */
292 if (unlikely(tlb_addr & ~TARGET_PAGE_MASK)) {
293 if ((addr & (DATA_SIZE - 1)) != 0) {
294 goto do_unaligned_access;
295 }
296
297 /* ??? Note that the io helpers always read data in the target
298 byte ordering. We should push the LE/BE request down into io. */
299 val = TGT_LE(val);
300 glue(io_write, SUFFIX)(env, mmu_idx, index, val, addr, retaddr);
301 return;
302 }
303
304 /* Handle slow unaligned access (it spans two pages or IO). */
(gdb) p/x tlb_addr
$5 = 0xd8001080
(gdb) p/x env->tlb_table[mmu_idx][index]
$6 = {{{addr_read = 0xd8001080, addr_write = 0xd8001080, addr_code = 0xffffffff, addend = 0xffffffff27fff000}, dummy = {0x80, 0x10, 0x0, 0xd8,
0x80, 0x10, 0x0, 0xd8, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x0, 0xf0, 0xff, 0x27, 0xff <repeats 12 times>}}}
So as far as I understand the mmu_idx and index variables passed down to
io_write are from env->tlb_table but are used on the env->iotlb array to
get iotlbentry which seems to be invalid in this case (while apparently
normally these are in sync and this works because several successful
accesses precede this one causing the crash). Does anyone know if this is
correct and how are these arrays can get out of sync causing this crash or
what should be checked to find the problem here?
Thank you,
BALATON Zoltan
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [Qemu-devel] [Qemu-ppc] How to debug crash in TCG code?
2017-10-08 16:14 ` [Qemu-devel] [Qemu-ppc] " BALATON Zoltan
@ 2017-10-15 11:30 ` BALATON Zoltan
2017-11-14 18:08 ` Paolo Bonzini
0 siblings, 1 reply; 4+ messages in thread
From: BALATON Zoltan @ 2017-10-15 11:30 UTC (permalink / raw)
To: qemu-devel; +Cc: qemu-ppc, Peter Maydell
On Sun, 8 Oct 2017, BALATON Zoltan wrote:
> I'm still trying to debug a crash that I don't understand why is
> happening and how to debug it further. Any hints are apreciated. I've
> described the problem previously in
> http://lists.nongnu.org/archive/html/qemu-ppc/2017-08/msg00249.html
> but I've dug deeper now, here's what I've found.
I've got a bit further with this but still could use some hints to find
what is happening. Here are some more details I've found so far.
The memory map I have (see below) is a bit complex but the interesting
part is that I have sii3112.bar5 as an mmio region with sii3112.bar0-4 as
io region aliases into this. The crash is happening when the firmware is
accessing one of these aliased io regions when
tlb_set_page_with_attrs: vaddr=d8001000 paddr=0x0000000c08001000 prot=3 idx=1
is called in accel/tcg/cputlb.c:616 which then calls
635 section = address_space_translate_for_iotlb(cpu, asidx, paddr, &xlat, &sz);
this in turn calls address_space_translate_internal which calls
441 section = address_space_lookup_region(d, addr, resolve_subpage);
that eventually gets the cached section at exec.c:411
411 MemoryRegionSection *section = atomic_read(&d->mru_section);
When this is not a region covering the address as verifed by
415 if (section && section != &d->map.sections[PHYS_SECTION_UNASSIGNED] &&
416 section_covers_addr(section, addr)) {
417 update = false;
418 } else {
419 section = phys_page_find(d, addr);
420 update = true;
421 }
then the else branch is taken and phys_page_find is called and the correct
region is found and set in line 427 atomic_set(&d->mru_section, section);
In this case the iotlb value in tlb_set_page_with_attrs at
accel/tcg/cputlb.c:653
653 iotlb = memory_region_section_get_iotlb(cpu, section, vaddr, paddr, xlat,
will be 20 and access works. In this case section is
$6 = {mr = 0x7f62d03744f0, fv = 0x7f62d035c6d0, offset_within_region = 0x0, size = 0x00000000000000000000000000000400,
offset_within_address_space = 0xc08001000, readonly = 0x0}
But when the cached memory region in d->mru_section happens to be
sii3112.bar5 the update=false branch is taken and the section is
returned as is. In this case iotlb value will be 149 and the access
results in the crash described below. Section in this case is
$13 = {mr = 0x55dedbd88210, fv = 0x7f62d035c6d0, offset_within_region = 0x80, size = 0x0000000000000000000000000000000c,
offset_within_address_space = 0xc08001000, readonly = 0x0}
(note the different mr values and offset and size).
I don't yet understand why is it happening but seems to be related to
aliased memory regions or maybe wrong mapping in the memory map. The code
implementing these are in these patches:
http://lists.nongnu.org/archive/html/qemu-ppc/2017-08/msg00246.html
http://lists.nongnu.org/archive/html/qemu-ppc/2017-08/msg00245.html
http://lists.nongnu.org/archive/html/qemu-ppc/2017-08/msg00247.html
Here is the memory layout in case someone can spot a problem in it. Any
help is greatly appreciated.
(qemu) info mtree
address-space: memory
0000000000000000-ffffffffffffffff (prio 0, i/o): system
0000000000000000-000000001fffffff (prio 0, i/o): sdram-containers
0000000000000000-000000001fffffff (prio 0, i/o): alias ppc4xx.sdram0 @ppc4xx.sdram 0000000000000000-000000001fffffff
0000000400000000-000000040003ffff (prio 0, ram): ppc440.l2cache_ram
00000004bffd0000-00000004bffd00ff (prio 0, i/o): ohci
00000004bffd0400-00000004bffd13ff (prio 0, i/o): ehci
00000004bffd0400-00000004bffd040f (prio 0, i/o): capabilities
00000004bffd0410-00000004bffd0453 (prio 0, i/o): operational
00000004bffd0454-00000004bffd046b (prio 0, i/o): ports
00000004ef600300-00000004ef600307 (prio 0, i/o): serial
00000004ef600700-00000004ef600711 (prio 0, i/o): ppc4xx-i2c
00000004ef600800-00000004ef600811 (prio 0, i/o): ppc4xx-i2c
00000004fff00000-00000004ffffffff (prio 0, romd): sam460ex.flash
0000000c08000000-0000000c0800ffff (prio 0, i/o): alias isa_mmio @io 0000000000000000-000000000000ffff
0000000c0ec00000-0000000c0ec800fe (prio 0, i/o): pci-container
0000000c0ec00000-0000000c0ec00003 (prio 0, i/o): pci-conf-idx
0000000c0ec00004-0000000c0ec00007 (prio 0, i/o): pci-conf-data
0000000c0ec80000-0000000c0ec800fe (prio 0, i/o): pci.reg
0000000c80000000-0000000c8fffffff (prio 0, i/o): alias PCI Outbound Window 0 @pci bus memory 0000000080000000-000000008fffffff
0000000c90000000-0000000c9fffffff (prio 0, i/o): alias PCI Outbound Window 1 @pci bus memory 0000000090000000-000000009fffffff
0000000d20000000-0000000d3fffffff (prio 0, i/o): pcie-mmcfg-mmio
address-space: I/O
0000000000000000-000000000000ffff (prio 0, i/o): io
0000000000001000-0000000000001007 (prio 1, i/o): alias sii3112.bar0 @sii3112.bar5 0000000000000080-0000000000000087
0000000000001008-000000000000100b (prio 1, i/o): alias sii3112.bar1 @sii3112.bar5 0000000000000088-000000000000008b
0000000000001010-0000000000001017 (prio 1, i/o): alias sii3112.bar2 @sii3112.bar5 00000000000000c0-00000000000000c7
0000000000001018-000000000000101b (prio 1, i/o): alias sii3112.bar3 @sii3112.bar5 00000000000000c8-00000000000000cb
0000000000001020-000000000000102f (prio 1, i/o): alias sii3112.bar4 @sii3112.bar5 0000000000000000-000000000000000f
address-space: cpu-memory
0000000000000000-ffffffffffffffff (prio 0, i/o): system
0000000000000000-000000001fffffff (prio 0, i/o): sdram-containers
0000000000000000-000000001fffffff (prio 0, i/o): alias ppc4xx.sdram0 @ppc4xx.sdram 0000000000000000-000000001fffffff
0000000400000000-000000040003ffff (prio 0, ram): ppc440.l2cache_ram
00000004bffd0000-00000004bffd00ff (prio 0, i/o): ohci
00000004bffd0400-00000004bffd13ff (prio 0, i/o): ehci
00000004bffd0400-00000004bffd040f (prio 0, i/o): capabilities
00000004bffd0410-00000004bffd0453 (prio 0, i/o): operational
00000004bffd0454-00000004bffd046b (prio 0, i/o): ports
00000004ef600300-00000004ef600307 (prio 0, i/o): serial
00000004ef600700-00000004ef600711 (prio 0, i/o): ppc4xx-i2c
00000004ef600800-00000004ef600811 (prio 0, i/o): ppc4xx-i2c
00000004fff00000-00000004ffffffff (prio 0, romd): sam460ex.flash
0000000c08000000-0000000c0800ffff (prio 0, i/o): alias isa_mmio @io 0000000000000000-000000000000ffff
0000000c0ec00000-0000000c0ec800fe (prio 0, i/o): pci-container
0000000c0ec00000-0000000c0ec00003 (prio 0, i/o): pci-conf-idx
0000000c0ec00004-0000000c0ec00007 (prio 0, i/o): pci-conf-data
0000000c0ec80000-0000000c0ec800fe (prio 0, i/o): pci.reg
0000000c80000000-0000000c8fffffff (prio 0, i/o): alias PCI Outbound Window 0 @pci bus memory 0000000080000000-000000008fffffff
0000000c90000000-0000000c9fffffff (prio 0, i/o): alias PCI Outbound Window 1 @pci bus memory 0000000090000000-000000009fffffff
0000000d20000000-0000000d3fffffff (prio 0, i/o): pcie-mmcfg-mmio
address-space:
0000000000000000-ffffffffffffffff (prio 0, i/o): bus master container
0000000000000000-ffffffffffffffff (prio 0, i/o): alias bus master @bm-ppc440-pcix 0000000000000000-ffffffffffffffff
address-space: pci-bm
0000000000000000-ffffffffffffffff (prio 0, i/o): bm-ppc440-pcix
0000000000000000-ffffffffffffffff (prio 0, i/o): pci bus memory
0000000080000000-00000000800001ff (prio 1, i/o): sii3112.bar5
0000000084000000-0000000087ffffff (prio 1, ram): sm501.local
0000000088000000-00000000881fffff (prio 1, i/o): sm501.mmio
0000000088000000-000000008800006b (prio 0, i/o): sm501-system-config
0000000088080000-0000000088080fff (prio 0, i/o): sm501-disp-ctrl
0000000088100000-0000000088100053 (prio 0, i/o): sm501-2d-engine
0000000000000000-000000001fffffff (prio -1, i/o): alias PCI Inbound Window 0 @system 0000000000000000-000000001fffffff
address-space:
0000000000000000-ffffffffffffffff (prio 0, i/o): bus master container
0000000000000000-ffffffffffffffff (prio 0, i/o): alias bus master @bm-ppc440-pcix 0000000000000000-ffffffffffffffff
address-space:
0000000000000000-ffffffffffffffff (prio 0, i/o): bus master container
0000000000000000-ffffffffffffffff (prio 0, i/o): alias bus master @bm-ppc440-pcix 0000000000000000-ffffffffffffffff
memory-region: ppc4xx.sdram
0000000000000000-000000001fffffff (prio 0, ram): ppc4xx.sdram
memory-region: io
0000000000000000-000000000000ffff (prio 0, i/o): io
0000000000001000-0000000000001007 (prio 1, i/o): alias sii3112.bar0 @sii3112.bar5 0000000000000080-0000000000000087
0000000000001008-000000000000100b (prio 1, i/o): alias sii3112.bar1 @sii3112.bar5 0000000000000088-000000000000008b
0000000000001010-0000000000001017 (prio 1, i/o): alias sii3112.bar2 @sii3112.bar5 00000000000000c0-00000000000000c7
0000000000001018-000000000000101b (prio 1, i/o): alias sii3112.bar3 @sii3112.bar5 00000000000000c8-00000000000000cb
0000000000001020-000000000000102f (prio 1, i/o): alias sii3112.bar4 @sii3112.bar5 0000000000000000-000000000000000f
memory-region: pci bus memory
0000000000000000-ffffffffffffffff (prio 0, i/o): pci bus memory
0000000080000000-00000000800001ff (prio 1, i/o): sii3112.bar5
0000000084000000-0000000087ffffff (prio 1, ram): sm501.local
0000000088000000-00000000881fffff (prio 1, i/o): sm501.mmio
0000000088000000-000000008800006b (prio 0, i/o): sm501-system-config
0000000088080000-0000000088080fff (prio 0, i/o): sm501-disp-ctrl
0000000088100000-0000000088100053 (prio 0, i/o): sm501-2d-engine
memory-region: sii3112.bar5
0000000080000000-00000000800001ff (prio 1, i/o): sii3112.bar5
memory-region: bm-ppc440-pcix
0000000000000000-ffffffffffffffff (prio 0, i/o): bm-ppc440-pcix
0000000000000000-ffffffffffffffff (prio 0, i/o): pci bus memory
0000000080000000-00000000800001ff (prio 1, i/o): sii3112.bar5
0000000084000000-0000000087ffffff (prio 1, ram): sm501.local
0000000088000000-00000000881fffff (prio 1, i/o): sm501.mmio
0000000088000000-000000008800006b (prio 0, i/o): sm501-system-config
0000000088080000-0000000088080fff (prio 0, i/o): sm501-disp-ctrl
0000000088100000-0000000088100053 (prio 0, i/o): sm501-2d-engine
0000000000000000-000000001fffffff (prio -1, i/o): alias PCI Inbound Window 0 @system 0000000000000000-000000001fffffff
memory-region: system
0000000000000000-ffffffffffffffff (prio 0, i/o): system
0000000000000000-000000001fffffff (prio 0, i/o): sdram-containers
0000000000000000-000000001fffffff (prio 0, i/o): alias ppc4xx.sdram0 @ppc4xx.sdram 0000000000000000-000000001fffffff
0000000400000000-000000040003ffff (prio 0, ram): ppc440.l2cache_ram
00000004bffd0000-00000004bffd00ff (prio 0, i/o): ohci
00000004bffd0400-00000004bffd13ff (prio 0, i/o): ehci
00000004bffd0400-00000004bffd040f (prio 0, i/o): capabilities
00000004bffd0410-00000004bffd0453 (prio 0, i/o): operational
00000004bffd0454-00000004bffd046b (prio 0, i/o): ports
00000004ef600300-00000004ef600307 (prio 0, i/o): serial
00000004ef600700-00000004ef600711 (prio 0, i/o): ppc4xx-i2c
00000004ef600800-00000004ef600811 (prio 0, i/o): ppc4xx-i2c
00000004fff00000-00000004ffffffff (prio 0, romd): sam460ex.flash
0000000c08000000-0000000c0800ffff (prio 0, i/o): alias isa_mmio @io 0000000000000000-000000000000ffff
0000000c0ec00000-0000000c0ec800fe (prio 0, i/o): pci-container
0000000c0ec00000-0000000c0ec00003 (prio 0, i/o): pci-conf-idx
0000000c0ec00004-0000000c0ec00007 (prio 0, i/o): pci-conf-data
0000000c0ec80000-0000000c0ec800fe (prio 0, i/o): pci.reg
0000000c80000000-0000000c8fffffff (prio 0, i/o): alias PCI Outbound Window 0 @pci bus memory 0000000080000000-000000008fffffff
0000000c90000000-0000000c9fffffff (prio 0, i/o): alias PCI Outbound Window 1 @pci bus memory 0000000090000000-000000009fffffff
0000000d20000000-0000000d3fffffff (prio 0, i/o): pcie-mmcfg-mmio
> I'm running with -accel tcg,thread=single -singlestep so it does not seem to
> be a multi threading problem. The seg fault happens when U-Boot running on
> the emulated embedded PPC SoC accesses a PCI device but after several
> successful iteration of the same access so it seems to be some internal
> inconsistency that arrises while QEMU is running, not something generally
> broken. It is also dependent on code run by the client or QEMU, so e.g. it
> disappears when I enable -d in_asm or use a different U-Boot version. I'm not
> sure what exactly causes the bug to happen or disappear but I can reproduce
> it reliably with certain images at the same place. It always happens during
> io_readx or io_writex. Here's the info I could get with a debugger with some
> additional debug and trace messages enabled:
>
> Here's the PCI device detected by U-Boot and BARs mapped:
> The device is implemended here:
> http://lists.nongnu.org/archive/html/qemu-ppc/2017-08/msg00246.html
>
> 4744@1507472117.454656:pci_cfg_read sii3112 01:0 @0xe -> 0x0
> tlb_set_page_with_attrs: vaddr=0ffb0800 paddr=0x000000000ffb0800 prot=7 idx=1
> tlb_set_page_with_attrs: vaddr=0ff73c00 paddr=0x000000000ff73c00 prot=7 idx=1
> 4744@1507472117.454676:pci_cfg_read sii3112 01:0 @0x0 -> 0x1095
> 4744@1507472117.454766:pci_cfg_read sii3112 01:0 @0x2 -> 0x3112
> 4744@1507472117.454775:pci_cfg_read sii3112 01:0 @0xa -> 0x104
> tlb_set_page_with_attrs: vaddr=0ff74000 paddr=0x000000000ff74000 prot=7 idx=1
> 4744@1507472117.454794:pci_cfg_read sii3112 01:0 @0xa -> 0x104
> 4744@1507472117.454905:pci_cfg_read sii3112 01:0 @0x4 -> 0x0
> 4744@1507472117.454916:pci_cfg_write sii3112 01:0 @0x10 <- 0xffffffff
> 4744@1507472117.454927:pci_cfg_read sii3112 01:0 @0x10 -> 0xfffffff9
> 4744@1507472117.456524:pci_cfg_write sii3112 01:0 @0x10 <- 0x1000
> tlb_set_page_with_attrs: vaddr=0ff74800 paddr=0x000000000ff74800 prot=7 idx=1
> 4744@1507472117.456823:pci_cfg_write sii3112 01:0 @0x14 <- 0xffffffff
> 4744@1507472117.456833:pci_cfg_read sii3112 01:0 @0x14 -> 0xfffffffd
> 4744@1507472117.456844:pci_cfg_write sii3112 01:0 @0x14 <- 0x1008
> 4744@1507472117.456852:pci_cfg_write sii3112 01:0 @0x18 <- 0xffffffff
> 4744@1507472117.456859:pci_cfg_read sii3112 01:0 @0x18 -> 0xfffffff9
> 4744@1507472117.456869:pci_cfg_write sii3112 01:0 @0x18 <- 0x1010
> 4744@1507472117.456878:pci_cfg_write sii3112 01:0 @0x1c <- 0xffffffff
> 4744@1507472117.456885:pci_cfg_read sii3112 01:0 @0x1c -> 0xfffffffd
> 4744@1507472117.456895:pci_cfg_write sii3112 01:0 @0x1c <- 0x1018
> 4744@1507472117.456903:pci_cfg_write sii3112 01:0 @0x20 <- 0xffffffff
> 4744@1507472117.456911:pci_cfg_read sii3112 01:0 @0x20 -> 0xfffffff1
> 4744@1507472117.456920:pci_cfg_write sii3112 01:0 @0x20 <- 0x1020
> 4744@1507472117.456929:pci_cfg_write sii3112 01:0 @0x24 <- 0xffffffff
> 4744@1507472117.456936:pci_cfg_read sii3112 01:0 @0x24 -> 0xfffffe00
> 4744@1507472117.457173:pci_cfg_write sii3112 01:0 @0x24 <- 0x80000000
> 4744@1507472117.457208:pci_cfg_write sii3112 01:0 @0x4 <- 0x7
> 4744@1507472117.457213:pci_update_mappings_add d=0x55aca57d3ce0 00:01.0
> 0,0x1000+0x8
> tlb_flush_nocheck: (count: 23)
> 4744@1507472117.457488:pci_update_mappings_add d=0x55aca57d3ce0 00:01.0
> 1,0x1008+0x4
> tlb_flush_nocheck: (count: 24)
> 4744@1507472117.457769:pci_update_mappings_add d=0x55aca57d3ce0 00:01.0
> 2,0x1010+0x8
> tlb_flush_nocheck: (count: 25)
> 4744@1507472117.458048:pci_update_mappings_add d=0x55aca57d3ce0 00:01.0
> 3,0x1018+0x4
> tlb_flush_nocheck: (count: 26)
> 4744@1507472117.458324:pci_update_mappings_add d=0x55aca57d3ce0 00:01.0
> 4,0x1020+0x10
> tlb_flush_nocheck: (count: 27)
> 4744@1507472117.458604:pci_update_mappings_add d=0x55aca57d3ce0 00:01.0
> 5,0x80000000+0x200
>
> (BAR0-4 are alias memory regions to BAR5 so all access should go to BAR5 at
> the end even if U-Boot accesses it through BAR0-4)
> Here U-Boot starts using the device:
>
> tlb_set_page_with_attrs: vaddr=dec00000 paddr=0x0000000c0ec00000 prot=3 idx=1
> 4744@1507472121.253966:pci_cfg_read ppc4xx-host-bridge 00:0 @0xe -> 0x0
> 4744@1507472121.253982:pci_cfg_read ppc4xx-host-bridge 00:0 @0x0 -> 0x1014
> 4744@1507472121.253992:pci_cfg_read ppc4xx-host-bridge 00:0 @0x2 -> 0x27f
> 4744@1507472121.254005:pci_cfg_read sii3112 01:0 @0xe -> 0x0
> 4744@1507472121.254015:pci_cfg_read sii3112 01:0 @0x0 -> 0x1095
> 4744@1507472121.254024:pci_cfg_read sii3112 01:0 @0x2 -> 0x3112
> tlb_set_page_with_attrs: vaddr=00000800 paddr=0x0000000000000800 prot=7 idx=1
> tlb_set_page_with_attrs: vaddr=0ff6f000 paddr=0x000000000ff6f000 prot=7 idx=1
> tlb_set_page_with_attrs: vaddr=0ff74000 paddr=0x000000000ff74000 prot=7 idx=1
> tlb_set_page_with_attrs: vaddr=0ff73c00 paddr=0x000000000ff73c00 prot=7 idx=1
> tlb_set_page_with_attrs: vaddr=0ffb0c00 paddr=0x000000000ffb0c00 prot=7 idx=1
> 4744@1507472121.254748:pci_cfg_write sii3112 01:0 @0x4 <- 0x7
> 4744@1507472121.254846:pci_cfg_read sii3112 01:0 @0x8 -> 0x1040001
> 4744@1507472121.255085:pci_cfg_write sii3112 01:0 @0xc <- 0x1
> 4744@1507472121.255228:pci_cfg_read sii3112 01:0 @0x10 -> 0x1001
> 4744@1507472121.255417:pci_cfg_read sii3112 01:0 @0x18 -> 0x1011
> tlb_set_page_with_attrs: vaddr=0ffa3800 paddr=0x000000000ffa3800 prot=7 idx=1
>
> And can talk to the device emulation:
>
> tlb_set_page_with_attrs: vaddr=0ff89000 paddr=0x000000000ff89000 prot=7 idx=1
> tlb_set_page_with_attrs: vaddr=d8001000 paddr=0x0000000c08001000 prot=3 idx=1
> 4744@1507472121.402003:sii3112_write bmdma: write (size 1) 0x86 : 0xe0
>
> Address 0x86 in BAR5 is accessed via vaddr d8001006 which is an alias via
> BAR0
>
> tlb_set_page_with_attrs: vaddr=0ff8a400 paddr=0x000000000ff8a400 prot=7 idx=1
> tlb_set_page_with_attrs: vaddr=0ffa3400 paddr=0x000000000ffa3400 prot=7 idx=1
> tlb_set_page_with_attrs: vaddr=0ffc7800 paddr=0x000000000ffc7800 prot=7 idx=1
> tlb_set_page_with_attrs: vaddr=d8001000 paddr=0x0000000c08001000 prot=3 idx=1
> 4744@1507472121.513050:sii3112_set_irq channel 0 level 0
> 4744@1507472121.513057:sii3112_read bmdma: read (size 1) 0x87 : 0x50
> tlb_set_page_with_attrs: vaddr=0ff6f000 paddr=0x000000000ff6f000 prot=7 idx=1
> tlb_set_page_with_attrs: vaddr=0df6d000 paddr=0x000000000df6d000 prot=7 idx=1
> tlb_set_page_with_attrs: vaddr=0ffc0800 paddr=0x000000000ffc0800 prot=7 idx=1
> tlb_set_page_with_attrs: vaddr=0ff87800 paddr=0x000000000ff87800 prot=7 idx=1
> 4744@1507472121.613784:sii3112_write bmdma: write (size 1) 0xc6 : 0xe0
> tlb_set_page_with_attrs: vaddr=d8001000 paddr=0x0000000c08001000 prot=3 idx=1
> 4744@1507472121.724016:sii3112_set_irq channel 1 level 0
> 4744@1507472121.724020:sii3112_read bmdma: read (size 1) 0xc7 : 0x00
> tlb_set_page_with_attrs: vaddr=0ff8a400 paddr=0x000000000ff8a400 prot=7 idx=1
>
> Accessing the device seems to work, it identifies the connected ATAPI CD
> emulation and can read from it, until it hits null pointer:
>
> tlb_set_page_with_attrs: vaddr=0ff89800 paddr=0x000000000ff89800 prot=7 idx=1
> 4744@1507472121.949834:sii3112_read bmdma: read (size 1) 0x82 : 0x02
> 4744@1507472121.950222:sii3112_read bmdma: read (size 2) 0x80 : 0x8005
> 4744@1507472121.950283:sii3112_read bmdma: read (size 2) 0x80 : 0x2100
> 4744@1507472121.950289:sii3112_read bmdma: read (size 2) 0x80 : 0x1f
> 4744@1507472121.950292:sii3112_read bmdma: read (size 2) 0x80 : 0x00
> 4744@1507472121.950295:sii3112_read bmdma: read (size 2) 0x80 : 0x4551
> 4744@1507472121.950298:sii3112_read bmdma: read (size 2) 0x80 : 0x554d
> 4744@1507472121.950301:sii3112_read bmdma: read (size 2) 0x80 : 0x2020
> 4744@1507472121.950306:sii3112_read bmdma: read (size 2) 0x80 : 0x2020
> 4744@1507472121.950309:sii3112_read bmdma: read (size 2) 0x80 : 0x4551
> 4744@1507472121.950312:sii3112_read bmdma: read (size 2) 0x80 : 0x554d
> 4744@1507472121.950316:sii3112_read bmdma: read (size 2) 0x80 : 0x4420
> 4744@1507472121.950319:sii3112_read bmdma: read (size 2) 0x80 : 0x4456
> 4744@1507472121.950323:sii3112_read bmdma: read (size 2) 0x80 : 0x522d
> 4744@1507472121.950326:sii3112_read bmdma: read (size 2) 0x80 : 0x4d4f
> 4744@1507472121.950330:sii3112_read bmdma: read (size 2) 0x80 : 0x2020
> 4744@1507472121.950333:sii3112_read bmdma: read (size 2) 0x80 : 0x2020
> 4744@1507472121.950338:sii3112_read bmdma: read (size 2) 0x80 : 0x2e32
> 4744@1507472121.950342:sii3112_set_irq channel 0 level 1
> 4744@1507472121.950346:sii3112_set_irq channel 0 level 1
> 4744@1507472121.950348:sii3112_read bmdma: read (size 2) 0x80 : 0x2b35
> 4744@1507472121.955573:sii3112_set_irq channel 0 level 0
> 4744@1507472121.955590:sii3112_read bmdma: read (size 1) 0x87 : 0x50
> tlb_set_page_with_attrs: vaddr=0ff89c00 paddr=0x000000000ff89c00 prot=7 idx=1
> tlb_set_page_with_attrs: vaddr=0ff8a000 paddr=0x000000000ff8a000 prot=7 idx=1
> tlb_set_page_with_attrs: vaddr=0ff89400 paddr=0x000000000ff89400 prot=7 idx=1
> tlb_set_page_with_attrs: vaddr=d8001000 paddr=0x0000000c08001000 prot=3 idx=1
> 4744@1507472121.957147:sii3112_write bmdma: write (size 1) 0x86 : 0xe0
> 4744@1507472121.957158:sii3112_set_irq channel 0 level 0
> 4744@1507472121.957161:sii3112_read bmdma: read (size 1) 0x87 : 0x50
> 4744@1507472121.957168:sii3112_write bmdma: write (size 1) 0x81 : 0x00
> 4744@1507472121.957172:sii3112_write bmdma: write (size 1) 0x82 : 0x00
> 4744@1507472121.957177:sii3112_write bmdma: write (size 1) 0x83 : 0x00
> 4744@1507472121.957181:sii3112_write bmdma: write (size 1) 0x84 : 0x00
> 4744@1507472121.957188:sii3112_write bmdma: write (size 1) 0x85 : 0x00
> 4744@1507472121.957198:sii3112_write bmdma: write (size 1) 0x86 : 0xe0
> 4744@1507472121.957204:sii3112_write bmdma: write (size 1) 0x87 : 0xa0
> 4744@1507472121.957278:sii3112_set_irq channel 0 level 0
> 4744@1507472121.957283:sii3112_read bmdma: read (size 1) 0x87 : 0x58
> 4744@1507472121.957289:sii3112_write bmdma: write (size 2) 0x80 : 0x1b
> 4744@1507472121.957294:sii3112_write bmdma: write (size 2) 0x80 : 0x00
> 4744@1507472121.957298:sii3112_write bmdma: write (size 2) 0x80 : 0x03
> 4744@1507472121.957302:sii3112_write bmdma: write (size 2) 0x80 : 0x00
> 4744@1507472121.957306:sii3112_write bmdma: write (size 2) 0x80 : 0x00
> 4744@1507472121.957309:sii3112_write bmdma: write (size 2) 0x80 : 0x00
> 4744@1507472121.957315:sii3112_set_irq channel 0 level 1
> 4744@1507472121.962334:sii3112_set_irq channel 0 level 0
> 4744@1507472121.962340:sii3112_read bmdma: read (size 1) 0x87 : 0x50
> 4744@1507472121.962346:sii3112_read bmdma: read (size 1) 0x85 : 0x00
> 4744@1507472121.962351:sii3112_read bmdma: read (size 1) 0x84 : 0x00
> 4744@1507472121.967372:sii3112_set_irq channel 0 level 0
> 4744@1507472121.967378:sii3112_read bmdma: read (size 1) 0x87 : 0x50
> tlb_set_page_with_attrs: vaddr=d8001000 paddr=0x0000000c08001000 prot=3 idx=1
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 0x7fffe87f7700 (LWP 4761)]
> 0x00005555557f4bc4 in io_writex (env=0x7fffe88002a0,
> iotlbentry=0x7fffe8811d60, mmu_idx=1, val=224, addr=3623882758,
> retaddr=140737099010411,
> size=1) at accel/tcg/cputlb.c:803
> 803 if (mr->global_locking && !qemu_mutex_iothread_locked()) {
> (gdb) bt
> #0 0x00005555557f4bc4 in io_writex (env=0x7fffe88002a0,
> iotlbentry=0x7fffe8811d60, mmu_idx=1, val=224, addr=3623882758,
> retaddr=140737099010411,
> size=1) at accel/tcg/cputlb.c:803
> #1 0x00005555557f5673 in io_writeb (env=0x7fffe88002a0, mmu_idx=1, index=4,
> val=224 '\340', addr=3623882758, retaddr=140737099010411)
> at accel/tcg/softmmu_template.h:265
> #2 0x00005555557f57ff in helper_ret_stb_mmu (env=0x7fffe88002a0,
> addr=3623882758, val=224 '\340', oi=1, retaddr=140737099010411)
> at accel/tcg/softmmu_template.h:300
> #3 0x00007fffe8cb116b in code_gen_buffer ()
> #4 0x00005555557fdaf1 in cpu_tb_exec (cpu=0x7fffe87f8010, itb=0x7fffe8cb0780
> <code_gen_buffer+3753814>)
> at accel/tcg/cpu-exec.c:166
> #5 0x00005555557fe936 in cpu_loop_exec_tb (cpu=0x7fffe87f8010,
> tb=0x7fffe8cb0780 <code_gen_buffer+3753814>, last_tb=0x7fffe87f6af8,
> tb_exit=0x7fffe87f6af4) at accel/tcg/cpu-exec.c:613
> #6 0x00005555557fec29 in cpu_exec (cpu=0x7fffe87f8010) at
> accel/tcg/cpu-exec.c:711
> #7 0x00005555557c7a05 in tcg_cpu_exec (cpu=0x7fffe87f8010) at
> qemu/cpus.c:1270
> #8 0x00005555557c7c3c in qemu_tcg_rr_cpu_thread_fn (arg=0x7fffe87f8010) at
> qemu/cpus.c:1365
> #9 0x00007ffff78cd5bd in start_thread () from /lib64/libpthread.so.0
> #10 0x00007ffff42d062d in clone () from /lib64/libc.so.6
> (gdb) p mr
> $1 = (MemoryRegion *) 0x0
> (gdb) p/x addr
> $2 = 0xd8001006
> (gdb) p *iotlbentry
> $3 = {addr = 18446744070085669013, attrs = {unspecified = 1, secure = 0, user
> = 0, requester_id = 0}}
> (gdb) p/x iotlbentry->addr
> $4 = 0xffffffff27fff095
> (gdb) up
> #1 0x00005555557f5673 in io_writeb (env=0x7fffe88002a0, mmu_idx=1, index=4,
> val=224 '\340', addr=3623882758, retaddr=140737099010411)
> at accel/tcg/softmmu_template.h:265
> 265 return io_writex(env, iotlbentry, mmu_idx, val, addr, retaddr,
> DATA_SIZE);
> (gdb) up
> #2 0x00005555557f57ff in helper_ret_stb_mmu (env=0x7fffe88002a0,
> addr=3623882758, val=224 '\340', oi=1, retaddr=140737099010411)
> at accel/tcg/softmmu_template.h:300
> 300 glue(io_write, SUFFIX)(env, mmu_idx, index, val, addr, retaddr);
> (gdb) l 260
> 255# endif
> 256#endif
> 257
> 258static inline void glue(io_write, SUFFIX)(CPUArchState *env,
> 259 size_t mmu_idx, size_t index,
> 260 DATA_TYPE val,
> 261 target_ulong addr,
> 262 uintptr_t retaddr)
> 263{
> 264 CPUIOTLBEntry *iotlbentry = &env->iotlb[mmu_idx][index];
> 265 return io_writex(env, iotlbentry, mmu_idx, val, addr, retaddr,
> DATA_SIZE);
> 266}
> 267
> 268void helper_le_st_name(CPUArchState *env, target_ulong addr, DATA_TYPE
> val,
> 269 TCGMemOpIdx oi, uintptr_t retaddr)
> 270{
> 271 unsigned mmu_idx = get_mmuidx(oi);
> 272 int index = (addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
> 273 target_ulong tlb_addr = env->tlb_table[mmu_idx][index].addr_write;
> 274 unsigned a_bits = get_alignment_bits(get_memop(oi));
> 275 uintptr_t haddr;
> 276
> 277 if (addr & ((1 << a_bits) - 1)) {
> 278 cpu_unaligned_access(ENV_GET_CPU(env), addr, MMU_DATA_STORE,
> 279 mmu_idx, retaddr);
> 280 }
> 281
> 282 /* If the TLB entry is for a different page, reload and try again. */
> 283 if ((addr & TARGET_PAGE_MASK)
> 284 != (tlb_addr & (TARGET_PAGE_MASK | TLB_INVALID_MASK))) {
> 285 if (!VICTIM_TLB_HIT(addr_write, addr)) {
> 286 tlb_fill(ENV_GET_CPU(env), addr, MMU_DATA_STORE, mmu_idx,
> retaddr);
> 287 }
> 288 tlb_addr = env->tlb_table[mmu_idx][index].addr_write;
> 289 }
> 290
> 291 /* Handle an IO access. */
> 292 if (unlikely(tlb_addr & ~TARGET_PAGE_MASK)) {
> 293 if ((addr & (DATA_SIZE - 1)) != 0) {
> 294 goto do_unaligned_access;
> 295 }
> 296
> 297 /* ??? Note that the io helpers always read data in the target
> 298 byte ordering. We should push the LE/BE request down into io.
> */
> 299 val = TGT_LE(val);
> 300 glue(io_write, SUFFIX)(env, mmu_idx, index, val, addr, retaddr);
> 301 return;
> 302 }
> 303
> 304 /* Handle slow unaligned access (it spans two pages or IO). */
> (gdb) p/x tlb_addr
> $5 = 0xd8001080
> (gdb) p/x env->tlb_table[mmu_idx][index]
> $6 = {{{addr_read = 0xd8001080, addr_write = 0xd8001080, addr_code =
> 0xffffffff, addend = 0xffffffff27fff000}, dummy = {0x80, 0x10, 0x0, 0xd8,
> 0x80, 0x10, 0x0, 0xd8, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
> 0x0, 0xf0, 0xff, 0x27, 0xff <repeats 12 times>}}}
>
> So as far as I understand the mmu_idx and index variables passed down to
> io_write are from env->tlb_table but are used on the env->iotlb array to get
> iotlbentry which seems to be invalid in this case (while apparently normally
> these are in sync and this works because several successful accesses precede
> this one causing the crash). Does anyone know if this is correct and how are
> these arrays can get out of sync causing this crash or what should be checked
> to find the problem here?
>
> Thank you,
> BALATON Zoltan
>
>
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [Qemu-devel] [Qemu-ppc] How to debug crash in TCG code?
2017-10-15 11:30 ` BALATON Zoltan
@ 2017-11-14 18:08 ` Paolo Bonzini
0 siblings, 0 replies; 4+ messages in thread
From: Paolo Bonzini @ 2017-11-14 18:08 UTC (permalink / raw)
To: BALATON Zoltan, qemu-devel; +Cc: Peter Maydell, qemu-ppc
On 15/10/2017 13:30, BALATON Zoltan wrote:
> I've got a bit further with this but still could use some hints to find
> what is happening. Here are some more details I've found so far.
>
> The memory map I have (see below) is a bit complex but the interesting
> part is that I have sii3112.bar5 as an mmio region with sii3112.bar0-4
> as io region aliases into this. The crash is happening when the firmware
> is accessing one of these aliased io regions when
>
> tlb_set_page_with_attrs: vaddr=d8001000 paddr=0x0000000c08001000 prot=3
> idx=1
>
> is called in accel/tcg/cputlb.c:616 which then calls
>
> 635 section = address_space_translate_for_iotlb(cpu, asidx,
> paddr, &xlat, &sz);
>
> this in turn calls address_space_translate_internal which calls
>
> 441 section = address_space_lookup_region(d, addr, resolve_subpage);
>
> that eventually gets the cached section at exec.c:411
>
> 411 MemoryRegionSection *section = atomic_read(&d->mru_section);
>
> When this is not a region covering the address as verifed by
Could it be that the cached region is only for a small part of the page,
while phys_page_find returns a subpage (and resolve_subpage is false)?
Maybe it's enough to skip mru_section if resolve_subpage is false.
Thanks,
Paolo
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2017-11-14 18:08 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-07-26 22:33 [Qemu-devel] How to debug crash in TCG code? BALATON Zoltan
[not found] ` <alpine.BSF.2.21.1708192307590.44932@zero.eik.bme.hu>
[not found] ` <alpine.BSF.2.21.1708202246550.65843@zero.eik.bme.hu>
[not found] ` <d4ffee34-8a82-95ea-bedc-ed37ae5fae67@ilande.co.uk>
2017-10-08 16:14 ` [Qemu-devel] [Qemu-ppc] " BALATON Zoltan
2017-10-15 11:30 ` BALATON Zoltan
2017-11-14 18:08 ` Paolo Bonzini
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).