All of lore.kernel.org
 help / color / mirror / Atom feed
From: "H. Peter Anvin" <hpa@zytor.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: linux-kernel@vger.kernel.org, Ingo Molnar <mingo@elte.hu>,
	Andrew Morton <akpm@linux-foundation.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>,
	Dave Jones <davej@redhat.com>, Andi Kleen <ak@linux.intel.com>
Subject: Re: [PATCH 4/5] x86: Allow nesting of the debug stack IDT setting
Date: Thu, 31 May 2012 13:17:59 -0700	[thread overview]
Message-ID: <4FC7D1F7.8090405@zytor.com> (raw)
In-Reply-To: <1338494431.13348.410.camel@gandalf.stny.rr.com>

On 05/31/2012 01:00 PM, Steven Rostedt wrote:
> 
> It doesn't ;-)
> 
> But we don't know if it is what it needs to be. Just because the counter
> is set to 1, doesn't mean that it was already set. Because we are
> dealing with NMIs, that are totally asynchronous, and the race of
> setting the counter and setting the idt.
> 
> Basically, (as you already know) the IST=0 means to use the same stack
> if we hit a breakpoint. But usually it's set to '4' which is an index
> into the TSS to find its per CPU stack.
> 
> If the IST is set to 4, and a breakpoint is hit, then the stack is set
> to a fixed address, even if you are currently using the same stack!
> 
> We need to prevent this from happening. There's two places that this can
> cause issues. The first is in the NMI handler, which is where this code
> was first developed. The idea was to allow NMIs to hit breakpoints. But
> if it were to do that after preempting a debug int3 handler, the
> breakpoint it hit would reset the stack and the NMI would start
> clobbering the stack of the code it preempted.
> 
> The NMI code on entering (and before it is allowed to hit any
> breakpoints) looks at the stack it preempted. If it is a debug stack,
> then it updates the IDT to have the int3 vector have a IST of 0 (keep
> the same stack when triggered). Then if the NMI hits a breakpoint, it
> just continues to use the NMI stack.
> 
> The int3 handler has a little trick that lets the int3 code reuse the
> stack. It modifies the TSS to point to another stack before calling the
> debug handler. If a NMI triggers now, or a breakpoint is hit again, it
> wont corrupt the stack.
> 
> 	subq $EXCEPTION_STKSZ, INIT_TSS_IST(\ist)
> 	call \do_sym
> 	addq $EXCEPTION_STKSZ, INIT_TSS_IST(\ist)
> 
> The debug stack is EXCEPTION_STKSZ * 2 size. Before entering the \do_sym
> (do_int3 in this case), it moves the TSS to point to another stack for
> the int3 handler.
> 
> The problem this patch set is trying to fix is the case for lockdep. The
> macro TRACE_IRQS_ON/OFF calls into lockdep code. And these are outside
> that add/sub TSS trick. If lockdep code hits a breakpoint than we reset
> back to the original stack address, and start clobbering the stack. This
> is the bug that Dave Jones was triggering.
> 
> Now we could do this add/sub TSS trick instead of loading idt for all
> the cases before calling lockdep in the debug handler. But this means
> the paranoid_exit will need to be turned into a macro and we would
> require it for each ist set.
> 
> Now, back to your original question. Why set it if it is already set.
> Well, it doesn't hurt to set it (except for the performance hit we
> take), but it does hurt if we should set it but don't, as that means we
> can reset the stack and clobber what we preempted.
> 
> I'm sure there's room to make this more efficient. But I'm currently
> trying to solve a kernel crash first, and then work on cleaning it up
> later.
> 

Ouch.  This is really way more complex than it has any excuse for being,
and it's the complexity that concerns me, not the performance.

I'd like a chart, or list, of the alternate stack environments we can be
in and what can transfer to what.  I think there might be an easier
solution that is more robust.

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.


  reply	other threads:[~2012-05-31 20:18 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-05-31  1:28 [PATCH 0/5] [GIT PULL] ftrace: Fix bug with function tracing and lockdep Steven Rostedt
2012-05-31  1:28 ` [PATCH 1/5] ftrace: Synchronize variable setting with breakpoints Steven Rostedt
2012-05-31 11:06   ` Peter Zijlstra
2012-05-31 14:08     ` Steven Rostedt
2012-05-31 15:16       ` Masami Hiramatsu
2012-05-31 15:29         ` Steven Rostedt
2012-05-31 17:28       ` Peter Zijlstra
2012-05-31 18:42         ` Steven Rostedt
2012-05-31 17:40       ` Peter Zijlstra
2012-05-31 17:44         ` Peter Zijlstra
2012-05-31 17:53         ` Steven Rostedt
2012-05-31 18:03           ` Peter Zijlstra
2012-05-31 18:50             ` Steven Rostedt
2012-05-31 19:06               ` Peter Zijlstra
2012-05-31 19:55                 ` Mathieu Desnoyers
2012-05-31 20:10                   ` Steven Rostedt
2012-05-31 20:26                     ` Peter Zijlstra
2012-05-31 20:37                       ` Steven Rostedt
2012-05-31 20:40                         ` Steven Rostedt
2012-05-31 20:40                         ` Peter Zijlstra
2012-05-31 20:49                           ` Steven Rostedt
2012-06-01  4:53                             ` Masami Hiramatsu
2012-06-01 11:37                               ` Steven Rostedt
2012-06-01 12:52                                 ` Masami Hiramatsu
2012-06-01  0:45                       ` Arnaldo Carvalho de Melo
2012-05-31  1:28 ` [PATCH 2/5] ftrace: Use breakpoint method to update ftrace caller Steven Rostedt
2012-05-31 11:18   ` Peter Zijlstra
2012-05-31 14:19     ` Steven Rostedt
2012-05-31  1:28 ` [PATCH 3/5] x86: Reset the debug_stack update counter Steven Rostedt
2012-05-31 19:19   ` H. Peter Anvin
2012-05-31 19:26     ` Steven Rostedt
2012-05-31  1:28 ` [PATCH 4/5] x86: Allow nesting of the debug stack IDT setting Steven Rostedt
2012-05-31 18:44   ` H. Peter Anvin
2012-05-31 18:58   ` H. Peter Anvin
2012-05-31 19:25     ` Steven Rostedt
2012-05-31 19:27       ` H. Peter Anvin
2012-05-31 20:00         ` Steven Rostedt
2012-05-31 20:17           ` H. Peter Anvin [this message]
2012-05-31 20:35             ` Steven Rostedt
2012-05-31 20:39               ` H. Peter Anvin
2012-05-31 20:56                 ` Steven Rostedt
2012-05-31 21:09                   ` H. Peter Anvin
2012-05-31 21:37                     ` Steven Rostedt
2012-05-31 21:38                       ` Steven Rostedt
2012-06-01  2:30       ` Steven Rostedt
2012-05-31  1:28 ` [PATCH 5/5] ftrace/x86: Do not change stacks in DEBUG when calling lockdep Steven Rostedt
2012-05-31  2:13 ` [PATCH 0/5] (for 3.5)[GIT PULL] ftrace: Fix bug with function tracing and lockdep Steven Rostedt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4FC7D1F7.8090405@zytor.com \
    --to=hpa@zytor.com \
    --cc=ak@linux.intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=davej@redhat.com \
    --cc=fweisbec@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=masami.hiramatsu.pt@hitachi.com \
    --cc=mingo@elte.hu \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.