public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: "H. Peter Anvin" <hpa@zytor.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: linux-kernel@vger.kernel.org, Ingo Molnar <mingo@elte.hu>,
	Andrew Morton <akpm@linux-foundation.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>,
	Dave Jones <davej@redhat.com>, Andi Kleen <ak@linux.intel.com>
Subject: Re: [PATCH 4/5] x86: Allow nesting of the debug stack IDT setting
Date: Thu, 31 May 2012 13:17:59 -0700	[thread overview]
Message-ID: <4FC7D1F7.8090405@zytor.com> (raw)
In-Reply-To: <1338494431.13348.410.camel@gandalf.stny.rr.com>

On 05/31/2012 01:00 PM, Steven Rostedt wrote:
> 
> It doesn't ;-)
> 
> But we don't know if it is what it needs to be. Just because the counter
> is set to 1, doesn't mean that it was already set. Because we are
> dealing with NMIs, that are totally asynchronous, and the race of
> setting the counter and setting the idt.
> 
> Basically, (as you already know) the IST=0 means to use the same stack
> if we hit a breakpoint. But usually it's set to '4' which is an index
> into the TSS to find its per CPU stack.
> 
> If the IST is set to 4, and a breakpoint is hit, then the stack is set
> to a fixed address, even if you are currently using the same stack!
> 
> We need to prevent this from happening. There's two places that this can
> cause issues. The first is in the NMI handler, which is where this code
> was first developed. The idea was to allow NMIs to hit breakpoints. But
> if it were to do that after preempting a debug int3 handler, the
> breakpoint it hit would reset the stack and the NMI would start
> clobbering the stack of the code it preempted.
> 
> The NMI code on entering (and before it is allowed to hit any
> breakpoints) looks at the stack it preempted. If it is a debug stack,
> then it updates the IDT to have the int3 vector have a IST of 0 (keep
> the same stack when triggered). Then if the NMI hits a breakpoint, it
> just continues to use the NMI stack.
> 
> The int3 handler has a little trick that lets the int3 code reuse the
> stack. It modifies the TSS to point to another stack before calling the
> debug handler. If a NMI triggers now, or a breakpoint is hit again, it
> wont corrupt the stack.
> 
> 	subq $EXCEPTION_STKSZ, INIT_TSS_IST(\ist)
> 	call \do_sym
> 	addq $EXCEPTION_STKSZ, INIT_TSS_IST(\ist)
> 
> The debug stack is EXCEPTION_STKSZ * 2 size. Before entering the \do_sym
> (do_int3 in this case), it moves the TSS to point to another stack for
> the int3 handler.
> 
> The problem this patch set is trying to fix is the case for lockdep. The
> macro TRACE_IRQS_ON/OFF calls into lockdep code. And these are outside
> that add/sub TSS trick. If lockdep code hits a breakpoint than we reset
> back to the original stack address, and start clobbering the stack. This
> is the bug that Dave Jones was triggering.
> 
> Now we could do this add/sub TSS trick instead of loading idt for all
> the cases before calling lockdep in the debug handler. But this means
> the paranoid_exit will need to be turned into a macro and we would
> require it for each ist set.
> 
> Now, back to your original question. Why set it if it is already set.
> Well, it doesn't hurt to set it (except for the performance hit we
> take), but it does hurt if we should set it but don't, as that means we
> can reset the stack and clobber what we preempted.
> 
> I'm sure there's room to make this more efficient. But I'm currently
> trying to solve a kernel crash first, and then work on cleaning it up
> later.
> 

Ouch.  This is really way more complex than it has any excuse for being,
and it's the complexity that concerns me, not the performance.

I'd like a chart, or list, of the alternate stack environments we can be
in and what can transfer to what.  I think there might be an easier
solution that is more robust.

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.


  reply	other threads:[~2012-05-31 20:18 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-05-31  1:28 [PATCH 0/5] [GIT PULL] ftrace: Fix bug with function tracing and lockdep Steven Rostedt
2012-05-31  1:28 ` [PATCH 1/5] ftrace: Synchronize variable setting with breakpoints Steven Rostedt
2012-05-31 11:06   ` Peter Zijlstra
2012-05-31 14:08     ` Steven Rostedt
2012-05-31 15:16       ` Masami Hiramatsu
2012-05-31 15:29         ` Steven Rostedt
2012-05-31 17:28       ` Peter Zijlstra
2012-05-31 18:42         ` Steven Rostedt
2012-05-31 17:40       ` Peter Zijlstra
2012-05-31 17:44         ` Peter Zijlstra
2012-05-31 17:53         ` Steven Rostedt
2012-05-31 18:03           ` Peter Zijlstra
2012-05-31 18:50             ` Steven Rostedt
2012-05-31 19:06               ` Peter Zijlstra
2012-05-31 19:55                 ` Mathieu Desnoyers
2012-05-31 20:10                   ` Steven Rostedt
2012-05-31 20:26                     ` Peter Zijlstra
2012-05-31 20:37                       ` Steven Rostedt
2012-05-31 20:40                         ` Steven Rostedt
2012-05-31 20:40                         ` Peter Zijlstra
2012-05-31 20:49                           ` Steven Rostedt
2012-06-01  4:53                             ` Masami Hiramatsu
2012-06-01 11:37                               ` Steven Rostedt
2012-06-01 12:52                                 ` Masami Hiramatsu
2012-06-01  0:45                       ` Arnaldo Carvalho de Melo
2012-05-31  1:28 ` [PATCH 2/5] ftrace: Use breakpoint method to update ftrace caller Steven Rostedt
2012-05-31 11:18   ` Peter Zijlstra
2012-05-31 14:19     ` Steven Rostedt
2012-05-31  1:28 ` [PATCH 3/5] x86: Reset the debug_stack update counter Steven Rostedt
2012-05-31 19:19   ` H. Peter Anvin
2012-05-31 19:26     ` Steven Rostedt
2012-05-31  1:28 ` [PATCH 4/5] x86: Allow nesting of the debug stack IDT setting Steven Rostedt
2012-05-31 18:44   ` H. Peter Anvin
2012-05-31 18:58   ` H. Peter Anvin
2012-05-31 19:25     ` Steven Rostedt
2012-05-31 19:27       ` H. Peter Anvin
2012-05-31 20:00         ` Steven Rostedt
2012-05-31 20:17           ` H. Peter Anvin [this message]
2012-05-31 20:35             ` Steven Rostedt
2012-05-31 20:39               ` H. Peter Anvin
2012-05-31 20:56                 ` Steven Rostedt
2012-05-31 21:09                   ` H. Peter Anvin
2012-05-31 21:37                     ` Steven Rostedt
2012-05-31 21:38                       ` Steven Rostedt
2012-06-01  2:30       ` Steven Rostedt
2012-05-31  1:28 ` [PATCH 5/5] ftrace/x86: Do not change stacks in DEBUG when calling lockdep Steven Rostedt
2012-05-31  2:13 ` [PATCH 0/5] (for 3.5)[GIT PULL] ftrace: Fix bug with function tracing and lockdep Steven Rostedt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4FC7D1F7.8090405@zytor.com \
    --to=hpa@zytor.com \
    --cc=ak@linux.intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=davej@redhat.com \
    --cc=fweisbec@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=masami.hiramatsu.pt@hitachi.com \
    --cc=mingo@elte.hu \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox