From: Will Deacon <will@kernel.org>
To: minyard@acm.org
Cc: Catalin Marinas <catalin.marinas@arm.com>,
linux-kernel@vger.kernel.org,
linux-arm-kernel@lists.infradead.org,
Corey Minyard <cminyard@mvista.com>
Subject: Re: [PATCH v2] arm64:kgdb: Fix kernel single-stepping
Date: Thu, 20 Feb 2020 14:22:14 +0000 [thread overview]
Message-ID: <20200220142214.GC14459@willie-the-truck> (raw)
In-Reply-To: <20200219152403.3495-1-minyard@acm.org>
On Wed, Feb 19, 2020 at 09:24:03AM -0600, minyard@acm.org wrote:
> From: Corey Minyard <cminyard@mvista.com>
>
> I was working on a single-step bug on kgdb on an ARM64 system, and I saw
> this scenario:
>
> * A single step is setup to return to el1
> * The ERET return to el1
> * An interrupt is pending and runs before the instruction
> * As soon as PSTATE.D (the debug disable bit) is cleared, the single
> step happens in that location, not where it should have.
>
> This appears to be due to PSTATE.SS not being cleared when the exception
> happens. Per section D.2.12.5 of the ARMv8 reference manual, that
> appears to be incorrect, it says "As part of exception entry, the PE
> does all of the following: ... Sets PSTATE.SS to 0."
Sorry, but I don't follow you here. If PSTATE.SS is not cleared, why would
you take the step exception?
> However, I appear to not be the first person who has noticed this. In
> the el0-only portion of the kernel_entry macro in entry.S, I found the
> following comment: "Ensure MDSCR_EL1.SS is clear, since we can unmask
> debug exceptions when scheduling." Exactly the same scenario, except
> coming from a userland single step, not a kernel one.
No, I think you might be conflating PSTATE.SS and MDSCR_EL1.SS.
> As I was studying this, though, I realized that the following scenario
> had an issue:
>
> * Kernel enables MDSCR.SS, MDSCR.KDE, MDSCR.MDE (unnecessary), and
> PSTATE.SS to enable a single step in el1, for kgdb or kprobes,
> on the current CPU's MDSCR register and the process' PSTATE.SS
> register.
> * Kernel returns from the exception with ERET.
> * An interrupt or page fault happens on the instruction, causing the
> instruction to not be run, but the exception handler runs.
> * The exception causes the task to migrate to a new core.
> * The return from the exception runs on a different processor now,
> where the MDSCR values are not set up for a single step.
> * The single step fails to happen.
>
> This is bad for kgdb, of course, but it seems really bad for kprobes if
> this happens.
I don't see how this can happen for kprobes. Have you managed to reproduce
the failure?
> To fix both these problems, rework the handling of single steps to clear
> things out upon entry to the kernel from el1, and then to set up single
> step when returning to el1, and not do the setup in debug-monitors.c.
> This means that single stepping does not use
> enable/disable_debug_monitors(); it is no longer necessary to track
> those flags for single stepping. This is much like single stepping is
> handled for el0. A new flag is added in pt_regs to enable single
> stepping from el1. Unfortunately, the old value of PSTATE.SS cannot be
> used for this because of the hardware bug mentioned earlier.
I don't think there's a hardware bug.
It sound like you're trying to make kernel debugging per-task instead
of per-cpu, but I don't think that's the right thing to do. What if I /want/
to debug an interrupt handler? For example, I might have a watchpoint on
something accessed by timer interrupt.
> As part of this, there is an interaction between single stepping and the
> other users of debug monitors with the MDSCR.KDE bit. That bit has to
> be set for both hardware breakpoints at el1 and single stepping at el1.
> A new variable was created to store the cpu-wide value of MDSCR.KDE; the
> single stepping code makes sure not to clear that bit on kernel entry if
> it's set in the per-cpu variable.
>
> After fixing this and doing some more testing, I ran into another issue:
>
> * Kernel enables the pt_regs single step
> * Kernel returns from the exception with ERET.
> * An interrupt or page fault happens on the instruction, causing the
> instruction to not be run, but the exception handler runs.
This sounds like you've broken debug; we should take the step exception
in the exception handler. That's the way this is supposed to work.
> There's no easy way to find the pt_regs that has the single step flag
> set. So a thread info flag was added so that the single step could be
> disabled in this case. Both that flag and the flag in pt_regs must be
> set to enable a single step.
Honestly, I get the feeling that you don't really understand the code
you're changing here and it's a tonne of effort to try to untangle what
you're doing. That's not necessarily your fault because the debug
architecture is a nightmare to comprehend, but I'm not keen to change it
unless we have a really good justification. I'm sure kgdb is riddled with
bugs but, as I said before, the fixes should be in kgdb, not by tearing
up the low-level debug code (which has the potential to break other users).
Maybe it would be easier if you tried to fix one problem per patch,
preferably with a way to reproduce the issue you're seeing each time?
Will
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
next prev parent reply other threads:[~2020-02-20 14:22 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-02-19 15:24 [PATCH v2] arm64:kgdb: Fix kernel single-stepping minyard
2020-02-20 14:06 ` Daniel Thompson
2020-02-20 14:52 ` Corey Minyard
2020-02-20 14:21 ` Marc Zyngier
2020-02-20 14:50 ` Corey Minyard
2020-02-20 15:06 ` Marc Zyngier
2020-02-20 14:22 ` Will Deacon [this message]
2020-02-20 16:30 ` Corey Minyard
2020-02-20 21:30 ` Corey Minyard
2020-02-24 18:07 ` James Morse
2020-02-25 15:38 ` Corey Minyard
2020-02-25 17:55 ` James Morse
2020-02-26 2:58 ` Corey Minyard
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200220142214.GC14459@willie-the-truck \
--to=will@kernel.org \
--cc=catalin.marinas@arm.com \
--cc=cminyard@mvista.com \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=minyard@acm.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox