[arm64, debug] PTRACE_SINGLESTEP does not single-step a valid instruction

* [arm64, debug] PTRACE_SINGLESTEP does not single-step a valid instruction
@ 2019-11-12 23:22 Luis Machado
  2019-11-18 13:15 ` Will Deacon
  0 siblings, 1 reply; 17+ messages in thread
From: Luis Machado @ 2019-11-12 23:22 UTC (permalink / raw)
  To: linux-arm-kernel, will

Hi,

I've noticed, under very specific conditions, that a PTRACE_SINGLESTEP 
request by GDB won't execute the underlying instruction. As a 
consequence, the PC doesn't move, but we return a SIGTRAP just like we 
would for a regular successful PTRACE_SINGLESTEP request.

Since there are no software breakpoints inserted at PC (we are actually 
stepping over a breakpoint, so GDB removes the breakpoint at PC before 
issuing a PTRACE_SINGLESTEP request), this is an odd behavior.

Though not too harmful, i see this manifesting in the GDB testsuite 
(gdb.reverse/insn-reverse.exp), which throws the test off by making GDB 
think it is further in the instruction stream than it really is. In 
fact, we get lucky here and no FAIL's show up, only many more spurious 
PASSes.

Since the reproduction steps involve GDB and the testcase, I'll report 
my findings here for convenience. But it can be reproduced with a 
top-of-tree kernel (what i used) or an Ubuntu one (4.12.13), it doesn't 
make a difference. I've also reproduced this in real hardware and under 
QEMU.

I did some rudimentary debugging to confirm GDB wasn't doing anything 
wrong, and placed some debugging output on the arm64 ptrace-related 
functions in the kernel. I also added some debugging output to the 
function that handles software breakpoint traps, to make sure no 
breakpoints were being inadvertently left behind.

At the point where GDB issues PTRACE_SINGLESTEP, we see this:

<case 1>
<before execution>
[  524.329276] >>>> Start 
user_enable_single_step,/repos/linux/arch/arm64/kernel/debug-monitors.c:450 
<<<<
[  524.329314] >>>> PC is 400574 <<<<
[  524.329329] >>>> End 
user_enable_single_step,/repos/linux/arch/arm64/kernel/debug-monitors.c:459 
<<<<
<after execution>
[  524.329679] >>>> Start 
single_step_handler,/repos/linux/arch/arm64/kernel/debug-monitors.c:249 <<<<
[  524.329707] >>>> PC is 400574 <<<<
[  524.329725] >>>> Start 
send_user_sigtrap,/repos/linux/arch/arm64/kernel/debug-monitors.c:228 <<<<
[  524.329733] >>>> PC is 400574 <<<<
[  524.329783] >>>> End 
send_user_sigtrap,/repos/linux/arch/arm64/kernel/debug-monitors.c:241 <<<<
[  524.329794] >>>> End 
single_step_handler,/repos/linux/arch/arm64/kernel/debug-monitors.c:280 <<<<

A regular successful PTRACE_SINGLESTEP should look like this instead:

<case 2>
<before execution>
[  981.042942] >>>> Start 
user_enable_single_step,/repos/linux/arch/arm64/kernel/debug-monitors.c:450 
<<<<
[  981.042982] >>>> PC is 400574 <<<<
[  981.042997] >>>> End 
user_enable_single_step,/repos/linux/arch/arm64/kernel/debug-monitors.c:459 
<<<<
<after execution>
[  981.043411] >>>> Start 
single_step_handler,/repos/linux/arch/arm64/kernel/debug-monitors.c:249 <<<<
[  981.043453] >>>> PC is 400578 <<<<
[  981.043472] >>>> Start 
send_user_sigtrap,/repos/linux/arch/arm64/kernel/debug-monitors.c:228 <<<<
[  981.043481] >>>> PC is 400578 <<<<
[  981.043540] >>>> End 
send_user_sigtrap,/repos/linux/arch/arm64/kernel/debug-monitors.c:241 <<<<
[  981.043553] >>>> End 
single_step_handler,/repos/linux/arch/arm64/kernel/debug-monitors.c:280 <<<<

As a guess, i decided to revert commit 
3a402a709500c5a3faca2111668c33d96555e35a (arm64: debug: avoid resetting 
stepping state machine when TIF_SINGLESTEP) to see its effect on this 
particular case. Then the output looks like <case 2> above, which is 
correct.

So this is at least partially caused by commit 
3a402a709500c5a3faca2111668c33d96555e35a, but i don't understand the 
full picture (involving the kernel) here. I know said commit is needed 
for other problematic cases in GDB (fork/vfork for example), but it 
might be having undesirable side effects here.

Here's how to reproduce. Make sure you have a reasonably new GDB (I 
reproduced it with Ubuntu's GDB 7.11.1-0ubuntu1~16.5). You can also 
build GDB from the git tree if you want. A standard aarch64-linux-gnu 
GDB will do.

Grab both of these source files for the testcase:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=blob_plain;f=gdb/testsuite/gdb.reverse/insn-reverse.c;hb=HEAD
https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=blob_plain;f=gdb/testsuite/gdb.reverse/insn-reverse-aarch64.c;hb=HEAD

Build the testcase with: gcc -O0 -g3 -lm insn-reverse.c -o insn-reverse

Execute gdb like so:

gdb -ex "set displaced-stepping off" -ex "b load" -ex "run" -ex "record" 
-ex "si" -ex "rsi" -ex "record stop" insn-reverse

What the above does is put a breakpoint in "load", run to it, enable 
reversible debugging, step one instruction forward, step back one 
instruction (essentially coming back to the same PC) and then shutting 
down reversible debugging.

Now, giving gdb the "si" command will cause it to execute the 
PTRACE_SINGLESTEP i pointed out above, in my explanation of the bug.

display/x $pc
stepi

You'll see, if it reproduces, the PC has not changed and the instruction 
has not executed. GDB will indicate a breakpoint hit, but this is bogus. 
It is due to the fact the PC didn't move, and GDB still has a breakpoint 
listed in this PC.

Please let me know if i can help with any other information in case any 
of the steps is not clear.

Thanks,
Luis

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 17+ messages in thread