public inbox for linux-arm-kernel@lists.infradead.org
 help / color / mirror / Atom feed
From: Luis Machado <luis.machado@linaro.org>
Cc: mark.rutland@arm.com, linux-arm-kernel@lists.infradead.org
Subject: Re: [arm64, debug] PTRACE_SINGLESTEP does not single-step a valid instruction
Date: Mon, 13 Jan 2020 15:13:19 -0300	[thread overview]
Message-ID: <f39f3da6-e703-8ee3-3651-953e915e822b@linaro.org> (raw)
In-Reply-To: <307ece3d-4e9d-21c4-0abf-9f4f3b313e74@linaro.org>

Ping?

On 11/26/19 1:35 PM, Luis Machado wrote:
> ping?
> 
> On 11/18/19 11:54 AM, Luis Machado wrote:
>> Hi Will,
>>
>> Thanks for the thorough explanation.
>>
>> On 11/18/19 10:15 AM, Will Deacon wrote:
>>> Hi Luis,
>>>
>>> [+Mark for the valid_user_regs() part]
>>>
>>> On Tue, Nov 12, 2019 at 08:22:10PM -0300, Luis Machado wrote:
>>>> I've noticed, under very specific conditions, that a PTRACE_SINGLESTEP
>>>> request by GDB won't execute the underlying instruction. As a 
>>>> consequence,
>>>> the PC doesn't move, but we return a SIGTRAP just like we would for a
>>>> regular successful PTRACE_SINGLESTEP request.
>>>>
>>>> Since there are no software breakpoints inserted at PC (we are actually
>>>> stepping over a breakpoint, so GDB removes the breakpoint at PC before
>>>> issuing a PTRACE_SINGLESTEP request), this is an odd behavior.
>>>>
>>>> Though not too harmful, i see this manifesting in the GDB testsuite
>>>> (gdb.reverse/insn-reverse.exp), which throws the test off by making GDB
>>>> think it is further in the instruction stream than it really is. In 
>>>> fact, we
>>>> get lucky here and no FAIL's show up, only many more spurious PASSes.
>>>
>>> I managed to reproduce this locally and I think I've figured out what's
>>> going on, although I'm not sure that the kernel is the best place to fix
>>> it.
>>>
>>> Looking at the specific reproducer:
>>>
>>>> Execute gdb like so:
>>>>
>>>> gdb -ex "set displaced-stepping off" -ex "b load" -ex "run" -ex 
>>>> "record" -ex
>>>> "si" -ex "rsi" -ex "record stop" insn-reverse
>>>
>>> So we've got a couple of instructions as follows (it doesn't actually 
>>> matter
>>> what they are, so I've changed the LD1 in your binary for a NOP in 
>>> order to
>>> avoid confusion with the "load" label not actually pointing at a load):
>>>
>>>     0x7b8:        mov    // "load"
>>>     0x7bc:        nop
>>>
>>> "b load" places a breakpoint at 0x7b8:
>>>
>>>     ptrace(PTRACE_POKEDATA, 662, 0xaaaaaaaaa7b8, 0xd503201fd4200000) = 0
>>>
>>> We run to a software breakpoint on "load" (the mov instruction). We take
>>> the trap and try to execute the "si", which means we need to remove the
>>> breakpoint while we step over it:
>>>
>>>     ptrace(PTRACE_POKEDATA, 662, 0xaaaaaaaaa7b8, 0xd503201f910003e0) = 0
>>>     [...]
>>>     ptrace(PTRACE_SINGLESTEP, 662, 0x1, 0)  = 0
>>>
>>> This causes the kernel to arm the single-step state machine so that
>>> MDSCR_EL1.SS == SPSR_EL1.SS == 1 (known as "active-not-pending"). 
>>> Running
>>> an instruction in userspace will transition to MDSCR_EL1.SS ==1 and
>>> SPSR_EL1.SS == 0 (known as "active-pending"), which will cause the 
>>> trap to
>>> trigger, at which point gdb puts the breakpoint instruction back 
>>> since the
>>> step is complete:
>>
>> So, just to confirm my understanding, we have a couple bits 
>> controlling single-stepping in the kernel, one in MDSCR_EL1 and 
>> another in SPSR_EL1. GDB doesn't have direct access to any of those, 
>> correct?
>>
>> Instead, GDB has access to a SS bit in the reserved 21~22 range of CPSR.
>>
>> The transition from active-not-pending to active-pending takes place 
>> via a single PTRACE_SINGLESTEP request? Is that correct?
>>
>>>
>>>     ptrace(PTRACE_POKEDATA, 662, 0xaaaaaaaaa7b8, 0xd503201fd4200000) = 0
>>>
>>> This is where things start to go wrong. The "rsi" command attempts to
>>> perform a reverse step, which means restoring the old state when we were
>>> previously executing at 0x7b8. It starts by removing the breakpoint 
>>> again,
>>> since we've already hit that:
>>>
>>>     ptrace(PTRACE_POKEDATA, 662, 0xaaaaaaaaa7b8, 0xd503201f910003e0) = 0
>>>
>>> and then resets the CPU registers to their old values:
>>>
>>>     (I don't know why it does this three times)
>>>     ptrace(PTRACE_SETREGSET, 662, NT_PRSTATUS, 
>>> [{iov_base=0xffffff64b3c8, iov_len=272}]) = 0
>>>     ptrace(PTRACE_SETREGSET, 662, NT_PRSTATUS, 
>>> [{iov_base=0xffffff64b3c8, iov_len=272}]) = 0
>>>     ptrace(PTRACE_SETREGSET, 662, NT_PRSTATUS, 
>>> [{iov_base=0xffffff64b3c8, iov_len=272}]) = 0
>>>
>>> The problem with this is that we have moved the PC back to 0x7b8 but 
>>> we have
>>> also cleared SPSR_EL1.SS to 0. Internally, the kernel hasn't seen 
>>> stepping
>>> get disabled (this usually happens by PTRACE_CONT calling
>>> user_disable_single_step()) which means that MDSCR_EL1.SS remains set 
>>> to 1
>>> and we're in the active-pending state! Consequently, we immediately 
>>> take a
>>> step exception if a step operation is attempted >
>>
>> While trying to reproduce this, i was paying attention to the SS bit 
>> coming and going. But in the particular sequence of si/rsi, within the 
>> record boundaries, i see GDB just restored the original CPSR value to 
>> what it was before we processed the si command.
>>
>>  From GDB's POV all state was restore to the way it was before and 
>> we're good to go.
>>
>> Is this not enough to restore state kernel-wise?
>>
>>> Now, we *could* consider hacking the TIF_SINGLESTEP check in
>>> valid_user_regs() so that SPSR_EL1.SS is preserved when stepping is 
>>> active
>>> but this is a user-visible change and may break things like stepping 
>>> out of
>>> signal handlers. I would prefer that GDB manages the SS bit 
>>> explicitly in
>>> this scenario, by setting it to 1 when restoring the old state in the
>>> reverse step, a bit like when it disables the old breakpoint. You can
>>> emulate this by doing:
>>
>> I think we could let GDB control this when required, but I'm trying to 
>> understand the ramifications of letting GDB do so.
>>
>> For example, what if the user decides to alter the PC here and there, 
>> for debugging purposes. That is a use case that happens often, in 
>> order to go back or skip some parts of the code.
>>
>> Would we need to pay attention to the SS bit in those cases as well?
>>
>>>
>>>     (gdb) set $cpsr |= (1<<21)
>>
>> In particular, what does the switching of this bit accomplishes in the 
>> kernel? Would we be better off forcing the SS bit every time we do a 
>> single-step operation, for example?

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

      parent reply	other threads:[~2020-01-13 18:13 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-12 23:22 [arm64, debug] PTRACE_SINGLESTEP does not single-step a valid instruction Luis Machado
2019-11-18 13:15 ` Will Deacon
2019-11-18 14:54   ` Luis Machado
2019-11-26 16:35     ` Luis Machado
2019-12-10 20:00       ` Luis Machado
2020-02-13 12:01         ` Will Deacon
2020-02-13 17:07           ` Luis Machado
2020-02-14 15:45             ` Luis Machado
2020-02-18  8:44               ` Will Deacon
2020-02-18 10:33                 ` Luis Machado
2020-02-26 13:01                   ` Luis Machado
2020-02-20 13:02           ` Mark Rutland
2020-02-20 13:29             ` Will Deacon
2020-02-21 11:16               ` Mark Rutland
2020-05-27 14:39                 ` Luis Machado
2020-05-31  9:52                 ` Will Deacon
2020-01-13 18:13       ` Luis Machado [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f39f3da6-e703-8ee3-3651-953e915e822b@linaro.org \
    --to=luis.machado@linaro.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=mark.rutland@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox