* gdbstub and gbd segfaults on different instructions in user space emulation
From: Libo Zhou @ 2019-09-30 15:46 UTC (permalink / raw)
To: qemu-devel
Hi all,
I am encountering a segmentation fault while porting my custom ISA to QEMU. My custom ISA is VERY VERY simple: it only changes the [31:26] opcode field of the LW and SW instructions. My very simple implementation is at this link: https://lists.gnu.org/archive/html/qemu-devel/2019-09/msg06976.html
Below is the objdump of the main part of my simple ELF. The dots are the nops generated automatically by the compiler.
00400090 <main>:
400090: 23bdffe0 addi r29,r29,-32
400094: 7fbe001c sw r30,28(r29)
...
4000a0: 03a0f021 addu r30,r29,r0
4000a4: 20020001 li r2,1 # int a = 1;
4000a8: 7fc20010 sw r2,16(r30)
...
4000b4: 20020002 li r2,2 # int b = 2;
4000b8: 7fc2000c sw r2,12(r30)
...
4000c4: 5fc30010 lw r3,16(r30)
4000c8: 00000000 nop
4000cc: 5fc2000c lw r2,12(r30)
...
4000d8: 00621020 add r2,r3,r2 # int c = a + b;
4000dc: 7fc20008 sw r2,8(r30)
...
4000e8: 00001021 addu r2,r0,r0
4000ec: 03c0e821 addu r29,r30,r0
4000f0: 5fbe001c lw r30,28(r29)
4000f4: 23bd0020 addi r29,r29,32
4000f8: 03e00008 jr r31
4000fc: 00000000 nop
...
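To make the encoding change concrete, here is a minimal Python sketch that splits a few of the instruction words from the objdump above into their I-type fields. The helper name `decode_itype` is hypothetical and exists only for this illustration. For comparison, standard MIPS32 encodes LW with primary opcode 0x23 and SW with 0x2b.

```python
def decode_itype(word):
    """Split a 32-bit MIPS I-type instruction word into its fields."""
    opcode = (word >> 26) & 0x3F  # bits [31:26], the field the custom ISA changes
    rs = (word >> 21) & 0x1F      # bits [25:21], base register for loads/stores
    rt = (word >> 16) & 0x1F      # bits [20:16], register loaded or stored
    imm = word & 0xFFFF           # bits [15:0], 16-bit immediate
    if imm & 0x8000:              # sign-extend the immediate
        imm -= 0x10000
    return opcode, rs, rt, imm


# Words taken from the objdump: custom SW, custom LW, and an unmodified ADDI.
for word in (0x7FBE001C, 0x5FC30010, 0x23BDFFE0):
    opcode, rs, rt, imm = decode_itype(word)
    print(f"{word:08x}: opcode=0x{opcode:02x} rs=r{rs} rt=r{rt} imm={imm}")
```

Run on these words, the sketch reports opcode 0x1f for the custom SW, 0x17 for the custom LW, and 0x08 for the unmodified ADDI, with register and offset fields that match the objdump operands.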
The command below gives me a segfault:
$ ./qemu-mipsel -cpu mycpu testprogram
I have tried two ways of debugging it.
First, I connected gdb-multiarch to the gdbstub and single-stepped the instructions in my ELF. The segfault was raised immediately after the LW instruction. I inspected the memory location using the 'x' command and found that at least my SW instruction was implemented correctly.
Second, I used gdb to debug QEMU directly. I set a breakpoint at the function decode_opc in translate.c. Pressing 'c' should have the same effect as single-stepping an instruction in the gdbstub. However, the segmentation fault wasn't raised after LW. It was instead raised after the 'nop' after 'jr r31' in the objdump.
At this point, I am really stuck. I have spent a long time on this, but I just can't figure out what is going wrong here. If anyone can help me out I would really appreciate it.
Cheers,
Libo Zhou
* Re: gdbstub and gbd segfaults on different instructions in user space emulation
From: Peter Maydell @ 2019-09-30 16:23 UTC (permalink / raw)
To: Libo Zhou; +Cc: qemu-devel
On Mon, 30 Sep 2019 at 16:57, Libo Zhou <zhlb29@foxmail.com> wrote:
> I am encountering segmentation fault while porting my custom ISA to QEMU. My custom ISA is VERY VERY simple, it only changes the [31:26] opcode field of LW and SW instructions. The link has my very simple implementation: https://lists.gnu.org/archive/html/qemu-devel/2019-09/msg06976.html
> I have tried 2 ways of debugging it.
> Firstly, I connected gdb-multiarch to gdbstub, and I single-stepped the instructions in my ELF. Immediately after the LW instruction, the segfault was thrown. I observed the memory location using 'x' command and found that at least my SW instruction was implemented correctly.
> Secondly, I used gdb to directly debug QEMU. I set the breakpoint at function in translate.c:decode_opc. Pressing 'c' should have the same effect as single-stepping instruction in gdbstub. However, the segmentation fault wasn't thrown after LW. It was instead thrown after the 'nop' after 'jr r31' in the objdump.
(1) If you're debugging the QEMU JIT itself, then you're probably
better off using QEMU's logging facilities (under the -d option)
rather than the gdbstub. The gdbstub is good if you're sure that
QEMU is basically functional and want to debug your guest, but
if you suspect bugs in QEMU itself then it can confuse you.
The -d debug logging is at a much lower level, which makes it
a better guide to what QEMU is really doing, though it is also
trickier to interpret.
(2) No, breakpointing on decode_opc is not the same as singlestepping
an instruction in gdb. This is a really important concept in QEMU
(and JITs in general) and if you don't understand it you're going
to be very confused. A JIT has two phases:
(a) "translate time", when we take a block of guest instructions
and generate host machine code for them
(b) "execution time", when we execute one or more of the blocks
of host machine code that we wrote at translate time
QEMU calls the blocks it works with "translation blocks", and
usually it will put multiple guest instructions into each TB;
a TB usually stops after a guest branch instruction. (You can
ask QEMU to put just one guest instruction into a TB using
the -singlestep command line option -- this is sometimes useful
when debugging.)
So if you put a breakpoint on decode_opc you'll see it is hit
for every instruction in the TB, which for the TB starting at
"00400090 <main>" will be every instruction up to and including
the 'nop' in the delay slot of the 'jr'. Once the whole TB is
translated, *then* we will execute it. It's only at execute time
that we perform the actual operations on the guest CPU that
the instructions require. If the segfault is because we think
the guest has made a bad memory access, we'll generate it here.
If the segfault is an actual crash in QEMU itself, it will
happen here if the bug is one that happens at execution time.
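The two phases described above can be sketched as a toy model in Python. This is not QEMU code: the names `translate_block`, `execute_block`, and `GuestFault` are hypothetical, and the "guest program" is just a list of Python callables. It only illustrates the ordering: every instruction in the block is decoded first, and a fault can only surface afterwards, when the block runs.

```python
class GuestFault(Exception):
    """Stands in for a fault raised while running translated code."""


def li(state):
    state["r2"] = 1


def bad_lw(state):
    raise GuestFault("bad load")


def nop(state):
    pass


# (name, action, is_branch); the entry after the branch is its delay slot.
PROGRAM = [
    ("li", li, False),
    ("lw", bad_lw, False),
    ("jr", nop, True),
    ("nop", nop, False),
]


def translate_block(program, pc, log):
    """Translate time: decode instructions up to and including the delay slot."""
    ops = []
    end = len(program)
    while pc < end:
        name, action, is_branch = program[pc]
        log.append(f"translate {name}")  # where a decode breakpoint would hit
        ops.append((name, action))
        if is_branch:
            end = min(end, pc + 2)  # stop after the branch's delay slot
        pc += 1
    return ops


def execute_block(ops, state, log):
    """Execution time: only now do the guest operations actually happen."""
    for name, action in ops:
        log.append(f"execute {name}")
        action(state)


def run(program):
    log = []
    ops = translate_block(program, 0, log)
    try:
        execute_block(ops, {}, log)
    except GuestFault as exc:
        log.append(f"fault: {exc}")
    return log


print("\n".join(run(PROGRAM)))
```

In the printed log, all four "translate" lines, including the one for the delay-slot nop, appear before the first "execute" line, and the fault is reported while executing the lw. A breakpoint on the decode step would therefore be hit for the nop before any fault occurs, even though the faulting instruction comes earlier in the block.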
Note that the -d logging will distinguish between things that
happen at translate time (which is when the in_asm, op, out_asm etc
logging is printed) and things that happen at execution time
(which is when cpu, exec, int, etc logs are printed).
thanks
-- PMM
* Re: gdbstub and gbd segfaults on different instructions in user spaceemulation
From: Libo Zhou @ 2019-10-06 14:11 UTC (permalink / raw)
To: Peter Maydell; +Cc: qemu-devel
Hi Peter,
I have finally got the chance to reply. Thanks for your explanation; I have learned the important concept of how a JIT works.
I have been playing with the logging options in -d, but I found something weird that makes it tricky for me to figure out the cause of the segfault. As you mentioned, I need to know whether the segfault is caused by the guest program making a bad memory access or by QEMU itself crashing.
To recap, I just changed the [31:26] opcode field of the LW and SW instructions in translate.c, and I used the following command line to diagnose:
$ ./qemu-mipsel -cpu maotu -d in_asm,nochain -D debug.log -singlestep test
Below is my in_asm log. It looks very weird: the instructions are just not the ones I saw in my objdump. The dmult.g instruction, as you pointed out before, is a Loongson instruction. I have also noticed that in_asm should have given me a longer log, with some other parts besides main, but this log only has main in it.
Do you have any idea what else I can try? This segfault has bugged me for 2 weeks, but I still believe there is a solution, even if the logs are tricky to interpret. I just don't know how changing the 6 bits of the opcode field could lead to so many issues.
----------------
IN: main
0x00400090: bovc sp,sp,0x400014
----------------
IN: main
0x00400094: dmult.g zero,sp,s8
----------------
IN: main
0x00400098: nop
----------------
IN: main
0x0040009c: nop
----------------
IN: main
0x004000a0: move s8,sp
----------------
IN: main
0x004000a4: beqzalc zero,v0,0x4000ac
----------------
IN: main
0x004000a8: addu.qb zero,s8,v0
----------------
IN: main
0x004000ac: nop
----------------
IN: main
0x004000b0: nop
----------------
IN: main
0x004000b4: beqzalc zero,v0,0x4000c0
----------------
IN: main
0x004000b8: insv v0,s8
----------------
IN: main
0x004000bc: nop
----------------
IN: main
0x004000c0: nop
----------------
IN: main
0x004000c4: bltc s8,v1,0x400108
----------------
IN: main
0x004000c8: nop
----------------
IN: main
0x004000cc: bltc s8,v0,0x400100
----------------
IN: main
0x004000d0: nop
----------------
IN: main
0x004000d4: nop
----------------
IN: main
0x004000d8: add v0,v1,v0
----------------
IN: main
0x004000dc: fork zero,s8,v0
----------------
IN: main
0x004000e0: nop
----------------
IN: main
0x004000e4: nop
----------------
IN: main
0x004000e8: move v0,zero
----------------
IN: main
0x004000ec: move sp,s8
----------------
IN: main
0x004000f0: bltc sp,s8,0x400164
----------------
IN: main
0x004000f4: bovc sp,sp,0x400178
----------------
IN: main
0x004000f8: jr ra
----------------
IN: main
0x004000fc: nop
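One way to line this log up against the objdump is to parse the in_asm entries and group the printed mnemonics by the primary opcode of the word at each address. The Python sketch below does that for a few entries. The names `parse_in_asm` and `mnemonics_by_opcode` are hypothetical, the sample is an excerpt of the log above, and the words are copied from the objdump in the first message.

```python
import re
from collections import defaultdict

# One in_asm entry: "0x00400094: dmult.g zero,sp,s8"
ENTRY = re.compile(r"^(0x[0-9a-fA-F]+):\s+(\S+)\s*(.*)$")

# Instruction words copied from the objdump in the first message.
WORDS = {
    0x00400090: 0x23BDFFE0,  # addi
    0x00400094: 0x7FBE001C,  # custom sw
    0x004000A4: 0x20020001,  # li (encoded as addi)
    0x004000A8: 0x7FC20010,  # custom sw
    0x004000B8: 0x7FC2000C,  # custom sw
    0x004000C4: 0x5FC30010,  # custom lw
    0x004000DC: 0x7FC20008,  # custom sw
}

LOG = """\
----------------
IN: main
0x00400090: bovc sp,sp,0x400014
----------------
IN: main
0x00400094: dmult.g zero,sp,s8
----------------
IN: main
0x004000a4: beqzalc zero,v0,0x4000ac
----------------
IN: main
0x004000a8: addu.qb zero,s8,v0
----------------
IN: main
0x004000b8: insv v0,s8
----------------
IN: main
0x004000c4: bltc s8,v1,0x400108
----------------
IN: main
0x004000dc: fork zero,s8,v0
"""


def parse_in_asm(text):
    """Return (address, mnemonic, operands) for each instruction line."""
    entries = []
    for line in text.splitlines():
        match = ENTRY.match(line.strip())
        if match:
            entries.append((int(match.group(1), 16), match.group(2), match.group(3)))
    return entries


def mnemonics_by_opcode(entries, words):
    """Group logged mnemonics by bits [31:26] of the word at each address."""
    groups = defaultdict(set)
    for address, mnemonic, _ in entries:
        if address in words:
            groups[(words[address] >> 26) & 0x3F].add(mnemonic)
    return {opcode: sorted(names) for opcode, names in groups.items()}


for opcode, names in sorted(mnemonics_by_opcode(parse_in_asm(LOG), WORDS).items()):
    print(f"opcode 0x{opcode:02x}: {', '.join(names)}")
```

On this excerpt, the unexpected mnemonics fall on the words with the custom SW opcode 0x1f (addu.qb, dmult.g, fork, insv), the custom LW opcode 0x17 (bltc), and also on the unmodified opcode 0x08 words (beqzalc, bovc). That suggests the disassembler that produces the in_asm text is reading these words differently from objdump. Whether that is related to the segfault is not established in this thread.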