Re: [arm64, debug] PTRACE_SINGLESTEP does not single-step a valid instruction

From: Luis Machado <luis.machado@linaro.org>
To: Will Deacon <will@kernel.org>
Cc: mark.rutland@arm.com, linux-arm-kernel@lists.infradead.org
Subject: Re: [arm64, debug] PTRACE_SINGLESTEP does not single-step a valid instruction
Date: Tue, 26 Nov 2019 13:35:08 -0300	[thread overview]
Message-ID: <307ece3d-4e9d-21c4-0abf-9f4f3b313e74@linaro.org> (raw)
In-Reply-To: <b3a9ae7e-8a45-7c14-7bc6-1d3b62728a0c@linaro.org>

ping?

On 11/18/19 11:54 AM, Luis Machado wrote:
> Hi Will,
> 
> Thanks for the thorough explanation.
> 
> On 11/18/19 10:15 AM, Will Deacon wrote:
>> Hi Luis,
>>
>> [+Mark for the valid_user_regs() part]
>>
>> On Tue, Nov 12, 2019 at 08:22:10PM -0300, Luis Machado wrote:
>>> I've noticed, under very specific conditions, that a PTRACE_SINGLESTEP
>>> request by GDB won't execute the underlying instruction. As a 
>>> consequence,
>>> the PC doesn't move, but we return a SIGTRAP just like we would for a
>>> regular successful PTRACE_SINGLESTEP request.
>>>
>>> Since there are no software breakpoints inserted at PC (we are actually
>>> stepping over a breakpoint, so GDB removes the breakpoint at PC before
>>> issuing a PTRACE_SINGLESTEP request), this is an odd behavior.
>>>
>>> Though not too harmful, i see this manifesting in the GDB testsuite
>>> (gdb.reverse/insn-reverse.exp), which throws the test off by making GDB
>>> think it is further in the instruction stream than it really is. In 
>>> fact, we
>>> get lucky here and no FAIL's show up, only many more spurious PASSes.
>>
>> I managed to reproduce this locally and I think I've figured out what's
>> going on, although I'm not sure that the kernel is the best place to fix
>> it.
>>
>> Looking at the specific reproducer:
>>
>>> Execute gdb like so:
>>>
>>> gdb -ex "set displaced-stepping off" -ex "b load" -ex "run" -ex 
>>> "record" -ex
>>> "si" -ex "rsi" -ex "record stop" insn-reverse
>>
>> So we've got a couple of instructions as follows (it doesn't actually 
>> matter
>> what they are, so I've changed the LD1 in your binary for a NOP in 
>> order to
>> avoid confusion with the "load" label not actually pointing at a load):
>>
>>     0x7b8:        mov    // "load"
>>     0x7bc:        nop
>>
>> "b load" places a breakpoint at 0x7b8:
>>
>>     ptrace(PTRACE_POKEDATA, 662, 0xaaaaaaaaa7b8, 0xd503201fd4200000) = 0
>>
>> We run to a software breakpoint on "load" (the mov instruction). We take
>> the trap and try to execute the "si", which means we need to remove the
>> breakpoint while we step over it:
>>
>>     ptrace(PTRACE_POKEDATA, 662, 0xaaaaaaaaa7b8, 0xd503201f910003e0) = 0
>>     [...]
>>     ptrace(PTRACE_SINGLESTEP, 662, 0x1, 0)  = 0
>>
>> This causes the kernel to arm the single-step state machine so that
>> MDSCR_EL1.SS == SPSR_EL1.SS == 1 (known as "active-not-pending"). Running
>> an instruction in userspace will transition to MDSCR_EL1.SS ==1 and
>> SPSR_EL1.SS == 0 (known as "active-pending"), which will cause the 
>> trap to
>> trigger, at which point gdb puts the breakpoint instruction back since 
>> the
>> step is complete:
> 
> So, just to confirm my understanding, we have a couple bits controlling 
> single-stepping in the kernel, one in MDSCR_EL1 and another in SPSR_EL1. 
> GDB doesn't have direct access to any of those, correct?
> 
> Instead, GDB has access to a SS bit in the reserved 21~22 range of CPSR.
> 
> The transition from active-not-pending to active-pending takes place via 
> a single PTRACE_SINGLESTEP request? Is that correct?
> 
>>
>>     ptrace(PTRACE_POKEDATA, 662, 0xaaaaaaaaa7b8, 0xd503201fd4200000) = 0
>>
>> This is where things start to go wrong. The "rsi" command attempts to
>> perform a reverse step, which means restoring the old state when we were
>> previously executing at 0x7b8. It starts by removing the breakpoint 
>> again,
>> since we've already hit that:
>>
>>     ptrace(PTRACE_POKEDATA, 662, 0xaaaaaaaaa7b8, 0xd503201f910003e0) = 0
>>
>> and then resets the CPU registers to their old values:
>>
>>     (I don't know why it does this three times)
>>     ptrace(PTRACE_SETREGSET, 662, NT_PRSTATUS, 
>> [{iov_base=0xffffff64b3c8, iov_len=272}]) = 0
>>     ptrace(PTRACE_SETREGSET, 662, NT_PRSTATUS, 
>> [{iov_base=0xffffff64b3c8, iov_len=272}]) = 0
>>     ptrace(PTRACE_SETREGSET, 662, NT_PRSTATUS, 
>> [{iov_base=0xffffff64b3c8, iov_len=272}]) = 0
>>
>> The problem with this is that we have moved the PC back to 0x7b8 but 
>> we have
>> also cleared SPSR_EL1.SS to 0. Internally, the kernel hasn't seen 
>> stepping
>> get disabled (this usually happens by PTRACE_CONT calling
>> user_disable_single_step()) which means that MDSCR_EL1.SS remains set 
>> to 1
>> and we're in the active-pending state! Consequently, we immediately 
>> take a
>> step exception if a step operation is attempted >
> 
> While trying to reproduce this, i was paying attention to the SS bit 
> coming and going. But in the particular sequence of si/rsi, within the 
> record boundaries, i see GDB just restored the original CPSR value to 
> what it was before we processed the si command.
> 
>  From GDB's POV all state was restore to the way it was before and we're 
> good to go.
> 
> Is this not enough to restore state kernel-wise?
> 
>> Now, we *could* consider hacking the TIF_SINGLESTEP check in
>> valid_user_regs() so that SPSR_EL1.SS is preserved when stepping is 
>> active
>> but this is a user-visible change and may break things like stepping 
>> out of
>> signal handlers. I would prefer that GDB manages the SS bit explicitly in
>> this scenario, by setting it to 1 when restoring the old state in the
>> reverse step, a bit like when it disables the old breakpoint. You can
>> emulate this by doing:
> 
> I think we could let GDB control this when required, but I'm trying to 
> understand the ramifications of letting GDB do so.
> 
> For example, what if the user decides to alter the PC here and there, 
> for debugging purposes. That is a use case that happens often, in order 
> to go back or skip some parts of the code.
> 
> Would we need to pay attention to the SS bit in those cases as well?
> 
>>
>>     (gdb) set $cpsr |= (1<<21)
> 
> In particular, what does the switching of this bit accomplishes in the 
> kernel? Would we be better off forcing the SS bit every time we do a 
> single-step operation, for example?

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel