qemu-devel.nongnu.org archive mirror
* [Qemu-devel] qemu-softmmu aborted with "Bad ram pointer"
@ 2017-01-05 22:52 Max Filippov
  2017-01-06 10:23 ` Peter Maydell
  0 siblings, 1 reply; 4+ messages in thread
From: Max Filippov @ 2017-01-05 22:52 UTC
  To: qemu-devel

Hello,

While debugging an XIP kernel running directly from CFI FLASH, I've reached a point
where QEMU aborts with the message "Bad ram pointer 0xbb4".

This happens when QEMU tries to translate code from FLASH
immediately after the kernel has written to the FLASH address range.
Writing to the FLASH address range turns off romd_mode of its memory region:

#0  memory_region_rom_device_set_romd (mr=0x555556160900, romd_mode=false) at /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/memory.c:1758
#1  0x00005555556af3c2 in pflash_write (pfl=0x555556160560, offset=104608956, value=0, width=4, be=0) at /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/hw/block/pflash_cfi01.c:475
#2  0x00005555556afb40 in pflash_mem_write_with_attrs (opaque=0x555556160560, addr=104608956, value=0, len=4, attrs=...) at /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/hw/block/pflash_cfi01.c:690
#3  0x0000555555613fab in memory_region_write_with_attrs_accessor (mr=0x555556160900, addr=104608956, value=0x7fffd74da378, size=4, shift=0, mask=4294967295, attrs=...) at /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/memory.c:552
#4  0x00005555556140ce in access_with_adjusted_size (addr=104608956, value=0x7fffd74da378, size=4, access_size_min=1, access_size_max=4, access=0x555555613ebd <memory_region_write_with_attrs_accessor>, mr=0x555556160900,
    attrs=...) at /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/memory.c:592
#5  0x0000555555616839 in memory_region_dispatch_write (mr=0x555556160900, addr=104608956, data=0, size=4, attrs=...) at /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/memory.c:1329
#6  0x000055555561c3d5 in io_writex (env=0x55555607b7f0, iotlbentry=0x555556085898, val=0, addr=3862705340, retaddr=140737081961295, size=4) at /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/cputlb.c:535
#7  0x000055555561dfd2 in io_writel (env=0x55555607b7f0, mmu_idx=0, index=195, val=0, addr=3862705340, retaddr=140737081961295) at /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/softmmu_template.h:265
#8  0x000055555561e165 in helper_le_stl_mmu (env=0x55555607b7f0, addr=3862705340, val=0, oi=32, retaddr=140737081961295) at /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/softmmu_template.h:300
#9  0x00007fffe7c6eb4f in code_gen_buffer ()
#10 0x00005555555ce2f1 in cpu_tb_exec (cpu=0x555556073570, itb=0x7fffda16e9a0) at /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/cpu-exec.c:164
#11 0x00005555555ceeb0 in cpu_loop_exec_tb (cpu=0x555556073570, tb=0x7fffda16e9a0, last_tb=0x7fffd74daa38, tb_exit=0x7fffd74daa34, sc=0x7fffd74daa50) at /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/cpu-exec.c:544
#12 0x00005555555cf188 in cpu_exec (cpu=0x555556073570) at /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/cpu-exec.c:638
#13 0x00005555555fde13 in tcg_cpu_exec (cpu=0x555556073570) at /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/cpus.c:1117
#14 0x00005555555fe06d in qemu_tcg_cpu_thread_fn (arg=0x555556073570) at /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/cpus.c:1197

then the TLB gets flushed:

#1  0x000055555561b1e6 in tlb_flush (cpu=0x555556073570, flush_global=1) at /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/cputlb.c:81
#2  0x00005555555c63be in tcg_commit (listener=0x555556095cb8) at /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/exec.c:2396
#3  0x000055555561588d in memory_region_transaction_commit () at /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/memory.c:929
#4  0x0000555555617830 in memory_region_rom_device_set_romd (mr=0x555556160900, romd_mode=false) at /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/memory.c:1759

and then QEMU attempts code translation:

#0  tlb_set_page_with_attrs (cpu=0x555556073570, vaddr=3862794240, paddr=4131229696, attrs=..., prot=1028, mmu_idx=0, size=268435456) at /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/cputlb.c:393
#1  0x000055555561bf9b in tlb_set_page (cpu=0x555556073570, vaddr=3862794240, paddr=4131229696, prot=1028, mmu_idx=0, size=268435456) at /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/cputlb.c:428
#2  0x0000555555650a45 in tlb_fill (cs=0x555556073570, vaddr=3862797236, access_type=MMU_INST_FETCH, mmu_idx=0, retaddr=0) at /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/target/xtensa/op_helper.c:67
#3  0x0000555555622efe in helper_ret_ldb_cmmu (env=0x55555607b7f0, addr=3862797236, oi=0, retaddr=0) at /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/softmmu_template.h:127
#4  0x000055555561ade4 in cpu_ldub_code_ra (env=0x55555607b7f0, ptr=3862797236, retaddr=0) at /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/include/exec/cpu_ldst_template.h:102
#5  0x000055555561ae58 in cpu_ldub_code (env=0x55555607b7f0, ptr=3862797236) at /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/include/exec/cpu_ldst_template.h:114
#6  0x000055555561c0a5 in get_page_addr_code (env1=0x55555607b7f0, addr=3862797236) at /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/cputlb.c:482
#7  0x00005555555ce761 in tb_htable_lookup (cpu=0x555556073570, pc=3862797236, cs_base=0, flags=98320) at /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/cpu-exec.c:306
#8  0x00005555555ce8cf in tb_find (cpu=0x555556073570, last_tb=0x0, tb_exit=0) at /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/cpu-exec.c:329
#9  0x00005555555cf165 in cpu_exec (cpu=0x555556073570) at /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/cpu-exec.c:637
#10 0x00005555555fde13 in tcg_cpu_exec (cpu=0x555556073570) at /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/cpus.c:1117

but the condition (!memory_region_is_ram(section->mr) && !memory_region_is_romd(section->mr))
is true in tlb_set_page_with_attrs, so the TLB entry gets a bogus
addend (0 - vaddr, actually), which results in the following abort:

#0  0x00007ffff34c7067 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00007ffff34c8448 in __GI_abort () at abort.c:89
#2  0x000055555561b8cb in qemu_ram_addr_from_host_nofail (ptr=0xbb4) at /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/cputlb.c:253
#3  0x000055555561c1de in get_page_addr_code (env1=0x55555607c7f0, addr=3862797236) at /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/cputlb.c:498
#4  0x00005555555ce761 in tb_htable_lookup (cpu=0x555556074570, pc=3862797236, cs_base=0, flags=98320) at /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/cpu-exec.c:306
#5  0x00005555555ce8cf in tb_find (cpu=0x555556074570, last_tb=0x0, tb_exit=0) at /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/cpu-exec.c:329
#6  0x00005555555cf165 in cpu_exec (cpu=0x555556074570) at /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/cpu-exec.c:637
#7  0x00005555555fde13 in tcg_cpu_exec (cpu=0x555556074570) at /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/cpus.c:1117
#8  0x00005555555fe06d in qemu_tcg_cpu_thread_fn (arg=0x555556074570) at /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/cpus.c:1197
#9  0x00007ffff38450a4 in start_thread (arg=0x7fffd74db700) at pthread_create.c:309
#10 0x00007ffff357a62d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
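
The value in the abort message can be reproduced from the numbers in the backtraces above. The following is a minimal sketch, not QEMU code: ToyTLBEntry, toy_tlb_fill and toy_code_ptr are hypothetical names, and the 4 KiB page alignment check is an assumption. It only models the "addend = host_base - vaddr" arithmetic described above, with host_base = 0 for the non-RAM, non-ROMD case.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of a softmmu TLB entry; only the addend matters here. */
typedef struct {
    uintptr_t addend;   /* host_base - guest page vaddr (wraps modulo 2^N) */
} ToyTLBEntry;

/* Fill an entry for a page. host_base is 0 when the region is treated
 * as I/O, which is what yields the "0 - vaddr" addend. */
static ToyTLBEntry toy_tlb_fill(uintptr_t host_base, uint32_t page_vaddr)
{
    /* Assumption: 4 KiB target pages. */
    assert((page_vaddr & 0xfffu) == 0);
    ToyTLBEntry te = { host_base - (uintptr_t)page_vaddr };
    return te;
}

/* Host "pointer" that a code fetch would derive from the entry. */
static uintptr_t toy_code_ptr(ToyTLBEntry te, uint32_t fetch_vaddr)
{
    return (uintptr_t)fetch_vaddr + te.addend;
}
```

With page vaddr 3862794240 (from tlb_set_page) and fetch address 3862797236 (from get_page_addr_code), toy_code_ptr(toy_tlb_fill(0, 3862794240u), 3862797236u) evaluates to 0xbb4, i.e. just the offset within the page, which matches the "Bad ram pointer 0xbb4" message.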

With CFI flash debug output enabled, it looks like this:

PFLASH: pflash_write: writing offset 00000000063c34c0 value e60b1198 width 4 wcycle 0x0
PFLASH: pflash_write: CFI query
PFLASH: pflash_write: writing offset 00000000063c34c4 value 00000000 width 4 wcycle 0x1
PFLASH: pflash_write: leaving query mode
PFLASH: pflash_write: writing offset 00000000063c34c8 value 00000000 width 4 wcycle 0x1
PFLASH: pflash_write: leaving query mode
PFLASH: pflash_write: writing offset 00000000063c34dc value 00000000 width 4 wcycle 0x1
PFLASH: pflash_write: leaving query mode
PFLASH: pflash_write: writing offset 00000000063c34e0 value 00000000 width 4 wcycle 0x1
PFLASH: pflash_write: leaving query mode
PFLASH: pflash_write: writing offset 00000000063c34ec value 00000000 width 4 wcycle 0x1
PFLASH: pflash_write: leaving query mode
PFLASH: pflash_write: writing offset 00000000063c34f4 value 00000000 width 4 wcycle 0x1
PFLASH: pflash_write: leaving query mode
PFLASH: pflash_write: writing offset 00000000063c34fc value 00000000 width 4 wcycle 0x1
PFLASH: pflash_write: leaving query mode
Bad ram pointer 0xbb4
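
One way to read the first log line: 0x98 is the standard CFI query command code, and the low byte of the written value e60b1198 is 0x98. The sketch below assumes the flash model takes the command from the low byte of the written value; toy_flash_cmd is a hypothetical helper, not a function from pflash_cfi01.c.

```c
#include <stdint.h>

/* Standard CFI "query" command code. */
#define TOY_CFI_CMD_QUERY 0x98

/* Assumption: the command is the low byte of the value written to flash. */
static uint8_t toy_flash_cmd(uint32_t value)
{
    return (uint8_t)(value & 0xffu);
}
```

Under that assumption, toy_flash_cmd(0xe60b1198u) equals TOY_CFI_CMD_QUERY, which would be consistent with the "CFI query" line in the log: an ordinary data store from the kernel is interpreted as a flash command.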


AFAIU the FLASH is readable at that point, but instruction
fetching doesn't work, and it's not a FLASH model issue; it's
an issue somewhere in the memory region code.

Command line that I used:
qemu-system-xtensa -cpu dc233c -M kc705 -monitor null -nographic -serial mon:stdio -m 1G -pflash xip-kc705-dc233c-flash

FLASH image:
http://jcmvbkbc.spb.ru/~jcmvbkbc/tmp/201701051441/xip-kc705-dc233c-flash.gz

-- 
Thanks.
-- Max


* Re: [Qemu-devel] qemu-softmmu aborted with "Bad ram pointer"
  2017-01-05 22:52 [Qemu-devel] qemu-softmmu aborted with "Bad ram pointer" Max Filippov
@ 2017-01-06 10:23 ` Peter Maydell
  2017-01-06 15:24   ` Max Filippov
  0 siblings, 1 reply; 4+ messages in thread
From: Peter Maydell @ 2017-01-06 10:23 UTC
  To: Max Filippov; +Cc: QEMU Developers

On 5 January 2017 at 22:52, Max Filippov <jcmvbkbc@gmail.com> wrote:
> Hello,
>
> debugging XIP kernel running directly from CFI FLASH I've got to a point
> where QEMU aborts with the message "Bad ram pointer 0xbb4".
>
> It turns out that that happens when QEMU tries to translate code from FLASH
> immediately after the kernel has written to the FLASH address range:
> writing to FLASH address range turns off romd_mode of its memory region:

This sounds like
https://lists.nongnu.org/archive/html/qemu-devel/2016-08/msg03273.html

It's a bug that we fail with this unhelpful message and abort,
but the fix to the bug would only cause us to print the more
useful "can't execute from a device" instead. You can't
execute from a ROM that's not in ROMD mode, I'm afraid.

thanks
-- PMM


* Re: [Qemu-devel] qemu-softmmu aborted with "Bad ram pointer"
  2017-01-06 10:23 ` Peter Maydell
@ 2017-01-06 15:24   ` Max Filippov
  2017-01-06 16:15     ` Peter Maydell
  0 siblings, 1 reply; 4+ messages in thread
From: Max Filippov @ 2017-01-06 15:24 UTC
  To: Peter Maydell; +Cc: QEMU Developers

On Fri, Jan 6, 2017 at 2:23 AM, Peter Maydell <peter.maydell@linaro.org> wrote:
> On 5 January 2017 at 22:52, Max Filippov <jcmvbkbc@gmail.com> wrote:
>> Hello,
>>
>> debugging XIP kernel running directly from CFI FLASH I've got to a point
>> where QEMU aborts with the message "Bad ram pointer 0xbb4".
>>
>> It turns out that that happens when QEMU tries to translate code from FLASH
>> immediately after the kernel has written to the FLASH address range:
>> writing to FLASH address range turns off romd_mode of its memory region:
>
> This sounds like
> https://lists.nongnu.org/archive/html/qemu-devel/2016-08/msg03273.html

Right. Strange that I haven't found it...

> It's a bug that we fail with this unhelpful message and abort,
> but the fix to the bug would only cause us to print the more
> useful "can't execute from a device" instead. You can't
> execute from a ROM that's not in ROMD mode, I'm afraid.

Yes, aborting is my main concern.
Shouldn't we do something like the following?

diff --git a/exec.c b/exec.c
index 8d4bb0e..d3f1818 100644
--- a/exec.c
+++ b/exec.c
@@ -381,7 +381,8 @@ static MemoryRegionSection *phys_page_find(PhysPageEntry lp, hwaddr addr,

 bool memory_region_is_unassigned(MemoryRegion *mr)
 {
-    return mr != &io_mem_rom && mr != &io_mem_notdirty && !mr->rom_device
+    return mr != &io_mem_rom && mr != &io_mem_notdirty
+        && !(mr->rom_device && mr->romd_mode)
         && mr != &io_mem_watch;
 }
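
To make the effect of the change concrete, here is a small stand-alone model of the predicate before and after the patch. It is a sketch only: ToyRegion and both functions are hypothetical, and the is_special flag stands in for the comparisons against io_mem_rom, io_mem_notdirty and io_mem_watch.

```c
#include <stdbool.h>

/* Toy stand-in for the MemoryRegion fields the predicate looks at. */
typedef struct {
    bool is_special;   /* models io_mem_rom / io_mem_notdirty / io_mem_watch */
    bool rom_device;   /* region was created as a ROM device */
    bool romd_mode;    /* ROM device currently readable like RAM */
} ToyRegion;

/* Predicate as it stands before the patch. */
static bool toy_is_unassigned_old(const ToyRegion *mr)
{
    return !mr->is_special && !mr->rom_device;
}

/* Predicate with the proposed change applied. */
static bool toy_is_unassigned_new(const ToyRegion *mr)
{
    return !mr->is_special && !(mr->rom_device && mr->romd_mode);
}
```

For a ROM device with romd_mode == false, the old predicate returns false and the new one returns true; for a ROM device in ROMD mode, both return false. So the only behavioral difference is for a ROM device that has left ROMD mode.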

-- 
Thanks.
-- Max


* Re: [Qemu-devel] qemu-softmmu aborted with "Bad ram pointer"
  2017-01-06 15:24   ` Max Filippov
@ 2017-01-06 16:15     ` Peter Maydell
  0 siblings, 0 replies; 4+ messages in thread
From: Peter Maydell @ 2017-01-06 16:15 UTC
  To: Max Filippov; +Cc: QEMU Developers, Paolo Bonzini

On 6 January 2017 at 15:24, Max Filippov <jcmvbkbc@gmail.com> wrote:
> On Fri, Jan 6, 2017 at 2:23 AM, Peter Maydell <peter.maydell@linaro.org> wrote:
>> On 5 January 2017 at 22:52, Max Filippov <jcmvbkbc@gmail.com> wrote:
>>> Hello,
>>>
>>> debugging XIP kernel running directly from CFI FLASH I've got to a point
>>> where QEMU aborts with the message "Bad ram pointer 0xbb4".
>>>
>>> It turns out that that happens when QEMU tries to translate code from FLASH
>>> immediately after the kernel has written to the FLASH address range:
>>> writing to FLASH address range turns off romd_mode of its memory region:
>>
>> This sounds like
>> https://lists.nongnu.org/archive/html/qemu-devel/2016-08/msg03273.html
>
> Right. Strange that I haven't found it...
>
>> It's a bug that we fail with this unhelpful message and abort,
>> but the fix to the bug would only cause us to print the more
>> useful "can't execute from a device" instead. You can't
>> execute from a ROM that's not in ROMD mode, I'm afraid.
>
> Yes, aborting is my main concern.
> Shouldn't we do something like the following?
>
> diff --git a/exec.c b/exec.c
> index 8d4bb0e..d3f1818 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -381,7 +381,8 @@ static MemoryRegionSection *phys_page_find(PhysPageEntry lp, hwaddr addr,
>
>  bool memory_region_is_unassigned(MemoryRegion *mr)
>  {
> -    return mr != &io_mem_rom && mr != &io_mem_notdirty && !mr->rom_device
> +    return mr != &io_mem_rom && mr != &io_mem_notdirty
> +        && !(mr->rom_device && mr->romd_mode)
>          && mr != &io_mem_watch;
>  }

I think the problem here is that we're trying to put too
many things into the same get_page_addr_code() condition.
There are two cases here:
 (1) we tried to execute from a memory address where there
     is genuinely no assigned device: we should invoke the
     CPU's do_unassigned_access hook, which will probably
     cause an exception
 (2) we tried to execute from a memory address where there
     is something there, but it's a device (or in this
     case a ROM not in ROMD mode): real hardware would make
     some attempt at executing whatever rubbish the device
     returned, but we will report_bad_exec() and exit.

But we use the same "if (memory_region_is_unassigned())"
conditional for both things. The function name suggests
it checks for case (1) but the actual code is checking
for case (2) (but forgetting rom-not-in-romd), and the
calling code is using it for both (1) and (2) simultaneously...
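
The distinction between the two cases could be sketched as a three-way classification rather than a single boolean. This is an illustration only, not a proposed QEMU interface: ToyExecClass, ToyRegionInfo and toy_classify_exec are hypothetical names, and the "assigned" flag is an assumption about how case (1) would be detected.

```c
#include <stdbool.h>

/* Possible outcomes of trying to fetch code from a region. */
typedef enum {
    TOY_EXEC_OK,          /* RAM, or a ROM device in ROMD mode */
    TOY_EXEC_UNASSIGNED,  /* case (1): nothing mapped at the address */
    TOY_EXEC_DEVICE       /* case (2): a device, or a ROM not in ROMD mode */
} ToyExecClass;

/* Toy description of whatever is mapped at the fetch address. */
typedef struct {
    bool assigned;    /* something is mapped at this address */
    bool is_ram;
    bool rom_device;
    bool romd_mode;
} ToyRegionInfo;

static ToyExecClass toy_classify_exec(const ToyRegionInfo *mr)
{
    if (!mr->assigned) {
        return TOY_EXEC_UNASSIGNED;   /* would go to the unassigned-access hook */
    }
    if (mr->is_ram || (mr->rom_device && mr->romd_mode)) {
        return TOY_EXEC_OK;
    }
    return TOY_EXEC_DEVICE;           /* would go to the bad-exec report */
}
```

In this model, the FLASH region from the original report (a ROM device with romd_mode off) lands in TOY_EXEC_DEVICE, not TOY_EXEC_UNASSIGNED, so a caller could handle the two cases separately.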

I think the unassigned-access code is a bit conceptually
broken and in need of a redesign, which is one reason
I've never tried to implement the hook for ARM...

thanks
-- PMM

