From: Willy Tarreau <w@1wt.eu>
To: Andy Lutomirski <luto@amacapital.net>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Steven Rostedt <rostedt@goodmis.org>, X86 ML <x86@kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Borislav Petkov <bp@alien8.de>,
	Thomas Gleixner <tglx@linutronix.de>,
	Brian Gerst <brgerst@gmail.com>
Subject: Re: Dealing with the NMI mess
Date: Fri, 24 Jul 2015 19:10:18 +0200
Message-ID: <20150724171018.GH3612@1wt.eu>
In-Reply-To: <CALCETrWjzU79ASDK+0RJQyCy6qTdM3FPTa4ZM0d5sVW66yhcug@mail.gmail.com>

On Fri, Jul 24, 2015 at 08:48:57AM -0700, Andy Lutomirski wrote:
> So by the time we detect that we've hit a watchpoint, the instruction
> that tripped it is done and we don't need RF.  Furthermore, after
> reading 17.3.1.1: I *think* that regs->flags will have RF *clear* if
> we hit a watchpoint.  So this might be as simple as:
> 
> if ((dr6 & (0xf * DR_TRAP0)) &&
>     (regs->flags & (X86_EFLAGS_RF | X86_EFLAGS_IF)) == X86_EFLAGS_RF &&
>     !user_mode(regs))
>   for (i = 0; i < 4; i++)
>     if (dr6 & (DR_TRAP0<<i)) {
>       /* hit a kernel breakpoint with IF clear */
>       dr7 &= ~(DR_GLOBAL_ENABLE << (i * DR_ENABLE_SHIFT));
>     }
> 
> I'm not saying that your code is wrong, but I think this is simpler
> and avoids poking at yet more per-cpu state from NMI context, which is
> kind of nice.
> 
> If you don't like the RF games above, it would also be straightforward
> to parse dr0..dr3 for each DR_TRAP bit that's set and see if it's a
> breakpoint.
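
For reference, the per-slot check could look roughly like the sketch below.
It assumes that "is it a breakpoint" is decided by the R/W field of each
slot in DR7 (R/W == 00 means break on instruction execution), with DR6 bits
0..3 saying which slots fired. The SK_* constants and sk_* helpers are
made-up names for illustration, not the kernel's definitions, and this is
a userspace model rather than real handler code:

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the architectural layout; not the kernel headers. */
#define SK_DR_TRAP0         0x1u /* DR6 bit 0: slot 0 condition was met */
#define SK_DR_CONTROL_SHIFT 16   /* DR7: R/W0 starts at bit 16 */
#define SK_DR_CONTROL_SIZE  4    /* DR7: 4 bits (R/W + LEN) per slot */
#define SK_DR_RW_MASK       0x3u
#define SK_DR_RW_EXECUTE    0x0u /* R/W == 00: break on instruction fetch */

/* True if slot i (0..3) is configured as an instruction breakpoint. */
static bool sk_slot_is_insn_bp(uint64_t dr7, int i)
{
    uint64_t rw = (dr7 >> (SK_DR_CONTROL_SHIFT + i * SK_DR_CONTROL_SIZE))
                  & SK_DR_RW_MASK;
    return rw == SK_DR_RW_EXECUTE;
}

/*
 * Returns a 4-bit mask of the slots that both fired (per DR6) and are
 * instruction breakpoints (per DR7). Watchpoint hits are left out.
 */
static unsigned sk_fired_insn_bps(uint64_t dr6, uint64_t dr7)
{
    unsigned mask = 0;

    for (int i = 0; i < 4; i++)
        if ((dr6 & (SK_DR_TRAP0 << i)) && sk_slot_is_insn_bp(dr7, i))
            mask |= 1u << i;
    return mask;
}
```

The returned mask would then select which enable bits to clear in dr7,
in place of the RF test.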

Andy, section 5.8 of the SDM makes me think we could possibly abuse SYSRET
to emulate IRET, and then possibly simplify the flags processing. It says
that it takes the CPL3 code segment, but nowhere does it say that the
target is validated as actually being userland, and it further suggests
that it doesn't validate anything:

  "It is the responsibility of the OS to ensure the descriptors in
   the GDT/LDT correspond to the selectors loaded by SYSCALL/SYSRET
   (consistent with the base, limit, and attribute values forced by
   the instructions)."

The OS has to set the RSP by itself before doing SYSRET, which opens a
race between "mov rsp" and "sysret", but if we only take that path once
we figure we come from NMI (using just IF+RSP), we know that IRQs and
NMIs are still disabled and cannot strike at this instant. Maybe MCEs
can, but they would execute within the NMI's stack just as if they were
triggered inside the NMI as well so I don't see a problem here.
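
To make the "IF+RSP" test concrete, here is a rough sketch of the decision
as I picture it. It assumes "coming from NMI" means that the interrupted
context had IF clear and its saved RSP lies inside the NMI stack. The
struct, the sk_* names and the half-open stack range are all hypothetical
and only model the check in plain C; this is not kernel code:

```c
#include <stdbool.h>
#include <stdint.h>

#define SK_EFLAGS_IF 0x200u /* EFLAGS.IF is bit 9 */

/* Hypothetical view of the two saved values the test looks at. */
struct sk_frame {
    uint64_t flags; /* saved RFLAGS of the interrupted context */
    uint64_t sp;    /* saved RSP of the interrupted context */
};

/*
 * True if the frame looks like it interrupted NMI context: interrupts
 * were off and the stack pointer was inside [nmi_lo, nmi_hi).
 * The exact bounds convention is an assumption of this sketch.
 */
static bool sk_returns_into_nmi(const struct sk_frame *f,
                                uint64_t nmi_lo, uint64_t nmi_hi)
{
    bool irqs_off = !(f->flags & SK_EFLAGS_IF);
    bool on_nmi_stack = f->sp >= nmi_lo && f->sp < nmi_hi;

    return irqs_off && on_nmi_stack;
}
```

Only when this returns true would the return path set RSP by hand and
avoid IRET; every other case keeps using IRET as today.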

I tried to imagine a case where the kernel page faults, then an NMI comes
in, then debug strikes and we have to return from debug to the NMI and then
to the fault handler, and I don't think we break the chain. Of course there
are many subtleties I can't grasp because I don't understand all the details.

Do you think that could simplify things, or is it another stupid idea?

Willy

