From: Luis Machado <luis.machado@linaro.org>
To: Will Deacon <will@kernel.org>
Cc: mark.rutland@arm.com, linux-arm-kernel@lists.infradead.org
Subject: Re: [arm64, debug] PTRACE_SINGLESTEP does not single-step a valid instruction
Date: Tue, 26 Nov 2019 13:35:08 -0300 [thread overview]
Message-ID: <307ece3d-4e9d-21c4-0abf-9f4f3b313e74@linaro.org> (raw)
In-Reply-To: <b3a9ae7e-8a45-7c14-7bc6-1d3b62728a0c@linaro.org>
ping?
On 11/18/19 11:54 AM, Luis Machado wrote:
> Hi Will,
>
> Thanks for the thorough explanation.
>
> On 11/18/19 10:15 AM, Will Deacon wrote:
>> Hi Luis,
>>
>> [+Mark for the valid_user_regs() part]
>>
>> On Tue, Nov 12, 2019 at 08:22:10PM -0300, Luis Machado wrote:
>>> I've noticed, under very specific conditions, that a PTRACE_SINGLESTEP
>>> request by GDB won't execute the underlying instruction. As a
>>> consequence,
>>> the PC doesn't move, but we return a SIGTRAP just like we would for a
>>> regular successful PTRACE_SINGLESTEP request.
>>>
>>> Since there are no software breakpoints inserted at PC (we are actually
>>> stepping over a breakpoint, so GDB removes the breakpoint at PC before
>>> issuing a PTRACE_SINGLESTEP request), this is an odd behavior.
>>>
>>> Though not too harmful, i see this manifesting in the GDB testsuite
>>> (gdb.reverse/insn-reverse.exp), which throws the test off by making GDB
>>> think it is further in the instruction stream than it really is. In
>>> fact, we
>>> get lucky here and no FAIL's show up, only many more spurious PASSes.
>>
>> I managed to reproduce this locally and I think I've figured out what's
>> going on, although I'm not sure that the kernel is the best place to fix
>> it.
>>
>> Looking at the specific reproducer:
>>
>>> Execute gdb like so:
>>>
>>> gdb -ex "set displaced-stepping off" -ex "b load" -ex "run" -ex
>>> "record" -ex
>>> "si" -ex "rsi" -ex "record stop" insn-reverse
>>
>> So we've got a couple of instructions as follows (it doesn't actually
>> matter
>> what they are, so I've changed the LD1 in your binary for a NOP in
>> order to
>> avoid confusion with the "load" label not actually pointing at a load):
>>
>> 0x7b8: mov // "load"
>> 0x7bc: nop
>>
>> "b load" places a breakpoint at 0x7b8:
>>
>> ptrace(PTRACE_POKEDATA, 662, 0xaaaaaaaaa7b8, 0xd503201fd4200000) = 0
>>
>> We run to a software breakpoint on "load" (the mov instruction). We take
>> the trap and try to execute the "si", which means we need to remove the
>> breakpoint while we step over it:
>>
>> ptrace(PTRACE_POKEDATA, 662, 0xaaaaaaaaa7b8, 0xd503201f910003e0) = 0
>> [...]
>> ptrace(PTRACE_SINGLESTEP, 662, 0x1, 0) = 0
>>
>> This causes the kernel to arm the single-step state machine so that
>> MDSCR_EL1.SS == SPSR_EL1.SS == 1 (known as "active-not-pending"). Running
>> an instruction in userspace will transition to MDSCR_EL1.SS ==1 and
>> SPSR_EL1.SS == 0 (known as "active-pending"), which will cause the
>> trap to
>> trigger, at which point gdb puts the breakpoint instruction back since
>> the
>> step is complete:
>
> So, just to confirm my understanding, we have a couple bits controlling
> single-stepping in the kernel, one in MDSCR_EL1 and another in SPSR_EL1.
> GDB doesn't have direct access to any of those, correct?
>
> Instead, GDB has access to a SS bit in the reserved 21~22 range of CPSR.
>
> The transition from active-not-pending to active-pending takes place via
> a single PTRACE_SINGLESTEP request? Is that correct?
>
>>
>> ptrace(PTRACE_POKEDATA, 662, 0xaaaaaaaaa7b8, 0xd503201fd4200000) = 0
>>
>> This is where things start to go wrong. The "rsi" command attempts to
>> perform a reverse step, which means restoring the old state when we were
>> previously executing at 0x7b8. It starts by removing the breakpoint
>> again,
>> since we've already hit that:
>>
>> ptrace(PTRACE_POKEDATA, 662, 0xaaaaaaaaa7b8, 0xd503201f910003e0) = 0
>>
>> and then resets the CPU registers to their old values:
>>
>> (I don't know why it does this three times)
>> ptrace(PTRACE_SETREGSET, 662, NT_PRSTATUS,
>> [{iov_base=0xffffff64b3c8, iov_len=272}]) = 0
>> ptrace(PTRACE_SETREGSET, 662, NT_PRSTATUS,
>> [{iov_base=0xffffff64b3c8, iov_len=272}]) = 0
>> ptrace(PTRACE_SETREGSET, 662, NT_PRSTATUS,
>> [{iov_base=0xffffff64b3c8, iov_len=272}]) = 0
>>
>> The problem with this is that we have moved the PC back to 0x7b8 but
>> we have
>> also cleared SPSR_EL1.SS to 0. Internally, the kernel hasn't seen
>> stepping
>> get disabled (this usually happens by PTRACE_CONT calling
>> user_disable_single_step()) which means that MDSCR_EL1.SS remains set
>> to 1
>> and we're in the active-pending state! Consequently, we immediately
>> take a
>> step exception if a step operation is attempted >
>
> While trying to reproduce this, i was paying attention to the SS bit
> coming and going. But in the particular sequence of si/rsi, within the
> record boundaries, i see GDB just restored the original CPSR value to
> what it was before we processed the si command.
>
> From GDB's POV all state was restore to the way it was before and we're
> good to go.
>
> Is this not enough to restore state kernel-wise?
>
>> Now, we *could* consider hacking the TIF_SINGLESTEP check in
>> valid_user_regs() so that SPSR_EL1.SS is preserved when stepping is
>> active
>> but this is a user-visible change and may break things like stepping
>> out of
>> signal handlers. I would prefer that GDB manages the SS bit explicitly in
>> this scenario, by setting it to 1 when restoring the old state in the
>> reverse step, a bit like when it disables the old breakpoint. You can
>> emulate this by doing:
>
> I think we could let GDB control this when required, but I'm trying to
> understand the ramifications of letting GDB do so.
>
> For example, what if the user decides to alter the PC here and there,
> for debugging purposes. That is a use case that happens often, in order
> to go back or skip some parts of the code.
>
> Would we need to pay attention to the SS bit in those cases as well?
>
>>
>> (gdb) set $cpsr |= (1<<21)
>
> In particular, what does the switching of this bit accomplishes in the
> kernel? Would we be better off forcing the SS bit every time we do a
> single-step operation, for example?
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
next prev parent reply other threads:[~2019-11-26 16:35 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-11-12 23:22 [arm64, debug] PTRACE_SINGLESTEP does not single-step a valid instruction Luis Machado
2019-11-18 13:15 ` Will Deacon
2019-11-18 14:54 ` Luis Machado
2019-11-26 16:35 ` Luis Machado [this message]
2019-12-10 20:00 ` Luis Machado
2020-02-13 12:01 ` Will Deacon
2020-02-13 17:07 ` Luis Machado
2020-02-14 15:45 ` Luis Machado
2020-02-18 8:44 ` Will Deacon
2020-02-18 10:33 ` Luis Machado
2020-02-26 13:01 ` Luis Machado
2020-02-20 13:02 ` Mark Rutland
2020-02-20 13:29 ` Will Deacon
2020-02-21 11:16 ` Mark Rutland
2020-05-27 14:39 ` Luis Machado
2020-05-31 9:52 ` Will Deacon
2020-01-13 18:13 ` Luis Machado
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=307ece3d-4e9d-21c4-0abf-9f4f3b313e74@linaro.org \
--to=luis.machado@linaro.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=mark.rutland@arm.com \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox