linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@elte.hu>, Jason Baron <jbaron@redhat.com>,
	linux-kernel@vger.kernel.org, tglx@linutronix.de, ak@suse.de,
	roland@redhat.com, rth@redhat.com, mhiramat@redhat.com
Subject: Re: [PATCH 1/4] jump label - make init_kernel_text() global
Date: Wed, 7 Oct 2009 09:35:01 -0400	[thread overview]
Message-ID: <20091007133501.GD29632@Krystal> (raw)
In-Reply-To: <1254920198.1696.125.camel@gandalf.stny.rr.com>

* Steven Rostedt (rostedt@goodmis.org) wrote:
> On Tue, 2009-10-06 at 22:32 -0400, Mathieu Desnoyers wrote:
> 
> > 
> > Hi Steven,
> > 
> > OK, I'll make the explanation as straightforward as possible. I'll use a
> > race example to illustrate what we try to avoid by using the
> > breakpoint+ipi scheme. After that, I present the same scenario with the
> > breakpoint+ipi in place.
> > 
> > Each step shows what is executed, and what is the memory values seen by
> > the CPU. CPU A is doing the code patching, CPU B executing the code.
> > I intentionally left out some sfence required on CPU A for simplicity.)
> > 
> > Initially, let's say we have:
> > (1)  (2)
> > 0xeb 0xe5    (jmp to offset 0xe5)
> > 
> > And we want to change this to:
> > (1)  (2)
> > 0xeb 0xf0    (jmp to offset 0xf0)
> > 
> > (scenario "buggy")
> > 
> > CPU A       |       CPU B  (this is about as far as my ascii-art skills go)
> > -------------------------    ;)
> > 0xeb 0xe5     0xeb 0xe5
> > 0:            CPU B instruction pointer is earlier than (1)
> >               CPU B pipeline speculatively predicts branches,
> >               prefetches data, calculates speculated values.
> > 1:            CPU B loads 0xeb
> > 2:            CPU B loads 0xe5
> > 3:
> > Write to (2)
> > 0xeb 0xf0     0xeb 0xf0
> > 4:            CPU B instruction pointer gets to (1), needs to validate
> >               all the pipeline speculation.
> >               But ! The CPU does not expect code to change underneath.
> >               General protection fault (or any other fault.. random..)
> > 
> > 
> > Now with the breakpoint+ipi/mb() scheme:
> > (scenario A: CPU B does not hit the breakpoint)
> > 
> > CPU A       |       CPU B
> > -------------------------
> > 0xeb 0xe5     0xeb 0xe5
> > 0:            CPU B instruction pointer is earlier than (1)
> >               CPU B pipeline speculatively predicts branches,
> >               prefetches data, calculates speculated values.
> > 1:            CPU B loads 0xeb
> > 2:            CPU B loads 0xe5
> > 3:
> > Write to (1)
> > 0xcc 0xe5     0xcc 0xe5  # breakpoint inserted
> > 4: send IPI
> > 5:            mfence     # serializing instruction. Flushes CPU B's
> >                          # pipeline
> > 6:
> > Write to (2)
> > 0xcc 0xf0     0xcc 0xf0
> > 7:
> > Write to (1)
> > 0xeb 0xf0     0xeb 0xf0
> > 8:            CPU B instruction pointer gets to (1), needs to validate
> >               all the pipeline speculation. Because we flushed any
> >               speculation prior to the mfence, we're ok.
> > 
> > 
> > Now, I'll show why just using the breakpoint, without IPI, is
> > problematic:
> > 
> > CPU A       |       CPU B
> > -------------------------
> > 0xeb 0xe5     0xeb 0xe5
> > 0:            CPU B instruction pointer is earlier than (1)
> >               CPU B pipeline speculatively predicts branches,
> >               prefetches data, calculates speculated values.
> > 1:            CPU B loads 0xeb
> > 2:            CPU B loads 0xe5
> > 3:
> > Write to (1)
> > 0xcc 0xe5     0xcc 0xf0  # breakpoint inserted
> > 4:
> > Write to (2)
> > 0xcc 0xf0     0xeb 0xf0  # Silly CPU B. Did not see nor use the breakpoint.
> >                          # Same problem as scenario "buggy".
> > 5:
> > Write to (1)
> > 0xeb 0xf0     0xeb 0xf0
> > 4:            CPU B instruction pointer gets to (1), needs to validate
> >               all the pipeline speculation.
> >               But ! The CPU does not expect code to change underneath.
> >               General protection fault (or any other fault.. random..)
> > 
> > So, basically, we ensure that the only transitions CPU B will see are
> > either:
> > 
> > 0xeb 0xe5 -> 0xcc 0xe5 : OK, adding breakpoint
> > 0xcc 0xe5 -> 0xcc 0xf0 : OK, not using the operand anyway, it's a
> >                              breakpoint!
> > 0xcc 0xf0 -> 0xeb 0xf0 : OK, removing breakpoint
> > 
> > *but*, the transition we guarantee that CPU B will *never* see without
> > having a mfence executed between the old and the new version is:
> > 
> > 0xeb 0xe5 -> 0xeb 0xf0  <----- buggy.
> > 
> > Hope the explanation helps,
> 
> Thanks Mathieu,
> 
> This does help explain a lot.
> 
> So, basically the IPI is to make sure the int3 is seen by other CPUS

- I might add: and that the other CPU's instruction trace caches are
  flushed with a core serializing instruction -

> before you modify the jump. Otherwise you risk setting up the int3 and
> the other CPU does not see it but still executes the change to the jmp
> destination.

Yep.

> 
> I'm assuming that the int3 handler will make the process on CPU B jump
> to the next op (one not being modified).

Indeed.

> 
> Now we must get from Intel and AMD that it is OK to remove the int3.

Yep, that's what hpa is trying to get them to tell us.

Thanks,

Mathieu

> 
> -- Steve
> 
> 

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

  reply	other threads:[~2009-10-07 13:35 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-09-24 23:17 [PATCH 0/4] jump label patches Jason Baron
2009-09-24 23:17 ` [PATCH 1/4] jump label - make init_kernel_text() global Jason Baron
2009-10-01 11:20   ` Ingo Molnar
2009-10-01 12:58     ` Mathieu Desnoyers
2009-10-01 20:39     ` Jason Baron
2009-10-03 10:43       ` Ingo Molnar
2009-10-03 12:39         ` Mathieu Desnoyers
2009-10-07  1:54           ` Steven Rostedt
2009-10-07  2:32             ` Mathieu Desnoyers
2009-10-07  3:10               ` Masami Hiramatsu
2009-10-07  3:23                 ` Mathieu Desnoyers
2009-10-07  3:29               ` Mathieu Desnoyers
2009-10-07 12:56               ` Steven Rostedt
2009-10-07 13:35                 ` Mathieu Desnoyers [this message]
2009-09-24 23:17 ` [PATCH 2/4] jump label - base patch Jason Baron
2009-09-25  0:49   ` Roland McGrath
2009-09-26 10:21     ` Steven Rostedt
2009-10-01 11:36   ` Ingo Molnar
2009-09-24 23:17 ` [PATCH 3/4] jump label - add module support Jason Baron
2009-09-24 23:18 ` [PATCH 4/4] jump label - tracepoint implementation Jason Baron
2009-10-06  5:39 ` [PATCH 0/4] jump label patches Roland McGrath
2009-10-06 14:07   ` Jason Baron
2009-10-06 23:24   ` Richard Henderson
2009-10-07  0:14     ` Roland McGrath
2009-10-07 15:35       ` Richard Henderson
2009-10-06  6:04 ` Roland McGrath
2009-10-06 14:09   ` Steven Rostedt
2009-10-06 14:13   ` Masami Hiramatsu
2009-10-06 14:30     ` Mathieu Desnoyers

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20091007133501.GD29632@Krystal \
    --to=mathieu.desnoyers@polymtl.ca \
    --cc=ak@suse.de \
    --cc=jbaron@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mhiramat@redhat.com \
    --cc=mingo@elte.hu \
    --cc=roland@redhat.com \
    --cc=rostedt@goodmis.org \
    --cc=rth@redhat.com \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).