All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@elte.hu>, Jason Baron <jbaron@redhat.com>,
	linux-kernel@vger.kernel.org, tglx@linutronix.de, ak@suse.de,
	roland@redhat.com, rth@redhat.com, mhiramat@redhat.com
Subject: Re: [PATCH 1/4] jump label - make init_kernel_text() global
Date: Wed, 7 Oct 2009 09:35:01 -0400	[thread overview]
Message-ID: <20091007133501.GD29632@Krystal> (raw)
In-Reply-To: <1254920198.1696.125.camel@gandalf.stny.rr.com>

* Steven Rostedt (rostedt@goodmis.org) wrote:
> On Tue, 2009-10-06 at 22:32 -0400, Mathieu Desnoyers wrote:
> 
> > 
> > Hi Steven,
> > 
> > OK, I'll make the explanation as straightforward as possible. I'll use a
> > race example to illustrate what we try to avoid by using the
> > breakpoint+ipi scheme. After that, I present the same scenario with the
> > breakpoint+ipi in place.
> > 
> > Each step shows what is executed, and what is the memory values seen by
> > the CPU. CPU A is doing the code patching, CPU B executing the code.
> > I intentionally left out some sfence required on CPU A for simplicity.)
> > 
> > Initially, let's say we have:
> > (1)  (2)
> > 0xeb 0xe5    (jmp to offset 0xe5)
> > 
> > And we want to change this to:
> > (1)  (2)
> > 0xeb 0xf0    (jmp to offset 0xf0)
> > 
> > (scenario "buggy")
> > 
> > CPU A       |       CPU B  (this is about as far as my ascii-art skills go)
> > -------------------------    ;)
> > 0xeb 0xe5     0xeb 0xe5
> > 0:            CPU B instruction pointer is earlier than (1)
> >               CPU B pipeline speculatively predicts branches,
> >               prefetches data, calculates speculated values.
> > 1:            CPU B loads 0xeb
> > 2:            CPU B loads 0xe5
> > 3:
> > Write to (2)
> > 0xeb 0xf0     0xeb 0xf0
> > 4:            CPU B instruction pointer gets to (1), needs to validate
> >               all the pipeline speculation.
> >               But ! The CPU does not expect code to change underneath.
> >               General protection fault (or any other fault.. random..)
> > 
> > 
> > Now with the breakpoint+ipi/mb() scheme:
> > (scenario A: CPU B does not hit the breakpoint)
> > 
> > CPU A       |       CPU B
> > -------------------------
> > 0xeb 0xe5     0xeb 0xe5
> > 0:            CPU B instruction pointer is earlier than (1)
> >               CPU B pipeline speculatively predicts branches,
> >               prefetches data, calculates speculated values.
> > 1:            CPU B loads 0xeb
> > 2:            CPU B loads 0xe5
> > 3:
> > Write to (1)
> > 0xcc 0xe5     0xcc 0xe5  # breakpoint inserted
> > 4: send IPI
> > 5:            mfence     # serializing instruction. Flushes CPU B's
> >                          # pipeline
> > 6:
> > Write to (2)
> > 0xcc 0xf0     0xcc 0xf0
> > 7:
> > Write to (1)
> > 0xeb 0xf0     0xeb 0xf0
> > 8:            CPU B instruction pointer gets to (1), needs to validate
> >               all the pipeline speculation. Because we flushed any
> >               speculation prior to the mfence, we're ok.
> > 
> > 
> > Now, I'll show why just using the breakpoint, without IPI, is
> > problematic:
> > 
> > CPU A       |       CPU B
> > -------------------------
> > 0xeb 0xe5     0xeb 0xe5
> > 0:            CPU B instruction pointer is earlier than (1)
> >               CPU B pipeline speculatively predicts branches,
> >               prefetches data, calculates speculated values.
> > 1:            CPU B loads 0xeb
> > 2:            CPU B loads 0xe5
> > 3:
> > Write to (1)
> > 0xcc 0xe5     0xcc 0xf0  # breakpoint inserted
> > 4:
> > Write to (2)
> > 0xcc 0xf0     0xeb 0xf0  # Silly CPU B. Did not see nor use the breakpoint.
> >                          # Same problem as scenario "buggy".
> > 5:
> > Write to (1)
> > 0xeb 0xf0     0xeb 0xf0
> > 4:            CPU B instruction pointer gets to (1), needs to validate
> >               all the pipeline speculation.
> >               But ! The CPU does not expect code to change underneath.
> >               General protection fault (or any other fault.. random..)
> > 
> > So, basically, we ensure that the only transitions CPU B will see are
> > either:
> > 
> > 0xeb 0xe5 -> 0xcc 0xe5 : OK, adding breakpoint
> > 0xcc 0xe5 -> 0xcc 0xf0 : OK, not using the operand anyway, it's a
> >                              breakpoint!
> > 0xcc 0xf0 -> 0xeb 0xf0 : OK, removing breakpoint
> > 
> > *but*, the transition we guarantee that CPU B will *never* see without
> > having a mfence executed between the old and the new version is:
> > 
> > 0xeb 0xe5 -> 0xeb 0xf0  <----- buggy.
> > 
> > Hope the explanation helps,
> 
> Thanks Mathieu,
> 
> This does help explain a lot.
> 
> So, basically the IPI is to make sure the int3 is seen by other CPUS

- I might add: and that the other CPU's instruction trace caches are
  flushed with a core serializing instruction -

> before you modify the jump. Otherwise you risk setting up the int3 and
> the other CPU does not see it but still executes the change to the jmp
> destination.

Yep.

> 
> I'm assuming that the int3 handler will make the process on CPU B jump
> to the next op (one not being modified).

Indeed.

> 
> Now we must get from Intel and AMD that it is OK to remove the int3.

Yep, that's what hpa is trying to get them to tell us.

Thanks,

Mathieu

> 
> -- Steve
> 
> 

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

  reply	other threads:[~2009-10-07 13:35 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-09-24 23:17 [PATCH 0/4] jump label patches Jason Baron
2009-09-24 23:17 ` [PATCH 1/4] jump label - make init_kernel_text() global Jason Baron
2009-10-01 11:20   ` Ingo Molnar
2009-10-01 12:58     ` Mathieu Desnoyers
2009-10-01 20:39     ` Jason Baron
2009-10-03 10:43       ` Ingo Molnar
2009-10-03 12:39         ` Mathieu Desnoyers
2009-10-07  1:54           ` Steven Rostedt
2009-10-07  2:32             ` Mathieu Desnoyers
2009-10-07  3:10               ` Masami Hiramatsu
2009-10-07  3:23                 ` Mathieu Desnoyers
2009-10-07  3:29               ` Mathieu Desnoyers
2009-10-07 12:56               ` Steven Rostedt
2009-10-07 13:35                 ` Mathieu Desnoyers [this message]
2009-09-24 23:17 ` [PATCH 2/4] jump label - base patch Jason Baron
2009-09-25  0:49   ` Roland McGrath
2009-09-26 10:21     ` Steven Rostedt
2009-10-01 11:36   ` Ingo Molnar
2009-09-24 23:17 ` [PATCH 3/4] jump label - add module support Jason Baron
2009-09-24 23:18 ` [PATCH 4/4] jump label - tracepoint implementation Jason Baron
2009-10-06  5:39 ` [PATCH 0/4] jump label patches Roland McGrath
2009-10-06 14:07   ` Jason Baron
2009-10-06 23:24   ` Richard Henderson
2009-10-07  0:14     ` Roland McGrath
2009-10-07 15:35       ` Richard Henderson
2009-10-06  6:04 ` Roland McGrath
2009-10-06 14:09   ` Steven Rostedt
2009-10-06 14:13   ` Masami Hiramatsu
2009-10-06 14:30     ` Mathieu Desnoyers

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20091007133501.GD29632@Krystal \
    --to=mathieu.desnoyers@polymtl.ca \
    --cc=ak@suse.de \
    --cc=jbaron@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mhiramat@redhat.com \
    --cc=mingo@elte.hu \
    --cc=roland@redhat.com \
    --cc=rostedt@goodmis.org \
    --cc=rth@redhat.com \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.