Re: [RFC PATCH 2/8] jump label v4 - x86: Introduce generic jump patching without stop_machine

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
To: "H. Peter Anvin" <hpa@zytor.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>,
	Arjan van de Ven <arjan@infradead.org>,
	rostedt@goodmis.org, Jason Baron <jbaron@redhat.com>,
	linux-kernel@vger.kernel.org, mingo@elte.hu, tglx@linutronix.de,
	andi@firstfloor.org, roland@redhat.com, rth@redhat.com
Subject: Re: [RFC PATCH 2/8] jump label v4 - x86: Introduce generic jump patching without stop_machine
Date: Mon, 18 Jan 2010 16:32:54 -0500	[thread overview]
Message-ID: <20100118213254.GA26355@Krystal> (raw)
In-Reply-To: <4B54AD7C.9000505@zytor.com>

* H. Peter Anvin (hpa@zytor.com) wrote:
> On 01/18/2010 08:52 AM, Mathieu Desnoyers wrote:
> >>
> >> This really doesn't make much sense to me.  The whole basis for the int3
> >> scheme itself is that single-byte updates are atomic, so if single-byte
> >> updates can't work -- and as I stated, we at Intel OTC currently believe
> >> it safe -- then int3 can't work either.
> > 
> > The additional characteristic of the int3 instruction (compared to the
> > general case of a single-byte instruction) is that, when executed, it
> > will trigger a trap, run a trap handler and return to the original code,
> > typically with iret. This therefore implies that a serializing
> > instruction is executed before returning to the instructions following
> > the modification site when the breakpoint is hit.
> > 
> > So I hand out to Intel's expertise the question of whether single-byte
> > instruction modification is safe or not in the general case. I'm just
> > pointing out that I can very well imagine an aggressive superscalar
> > architecture for which pipeline structure would support single-byte int3
> > patching without any problem due to the implied serialization, but would
> > not support the general-case single-byte modification due to its lack of
> > serialization.
> > 
> 
> This is utter and complete nonsense.   You seem to think that everything
> is guaranteed to hit the breakpoint, which is obviously false.

What I discuss above is: what actually happens when the breakpoint is
hit.

I'm doing no assumption about whether it is hit or not. In the int3+IPI
broadcast scheme, every cpu receive an IPI between seeing the old and
new instructions. Only *some* cpus *may* hit the breakpoint that is put
there temporarily.

> Furthermore, until you have done the serialization, you're not
> guaranteed the *breakpoint* is seen,

Agreed,

> so you have the same condition.

Hrm ? Same as what exactly ? We have either the old instruction in place
or the breakpoint (before the serialization). After the serialization,
we have either the breakpoint or the new instruction.

What I am pointing out is that specifically turning a 1-byte instruction
into a breakpoint can be safer than turning it into another 1-byte
instruction directly, because *if* cpus hit the breakpoint, they *will*
issue a synchronizing instruction at that point (implied by the
breakpoint). This is not the case if you just modify the 1-byte
instruction in place.

> 
> > As we might have to port this algorithm to Itanium in a near future, I
> > prefer to stay on the safe side. Intel's "by the book" recommendation is
> > more or less that a serializing instruction must be executed on all CPUs
> > before new code is executed, without mention of single-vs-multi byte
> > instructions. The int3-based bypass follows this requirement, but the
> > single-byte code patching does not.
> > 
> > Unless there is a visible performance gain to special-case the
> > single-byte instruction, I would recommend to stick to the safest
> > solution, which follows Intel "official" guide-lines too.
> 
> No, it doesn't.  The only thing that follows the "official" guidelines
> is stop_machine.
> 
> As far as other architectures are concerned, other architectures can
> have very different and much stricter rules for I/D coherence.  Trying
> to extrapolate from the x86 rules is aggravated insanity.

I agree that official Intel guidelines for XMC only discuss the
stop_machine() scheme. OK then, let's see how patching single-byte
instructions deals with the official _uniprocessor_ self-modifying code
guidelines.

(ref. http://www.intel.com/Assets/PDF/specupdate/318586.pdf
7.1.3 Handling Self- and Cross-Modifying Code)

(* OPTION 1 *)
Store modified code (as data) into code segment;
Jump to new code or an intermediate location;
Execute new code;

(* OPTION 2 *)
Store modified code (as data) into code segment;
Execute a serializing instruction; (* For example, CPUID instruction *)
Execute new code;

As you can see, if we self-modify the code on a single cpu machine with
text_poke directly, even for a single-byte instruction, we _have_ to
guarantee that either a jump or a serializing instruction is issued
before the new code is executed.

What I discussed above was that int3 is a special-case, because it
generates a trap, and therefore jumps to a different location.

So, back to the case where we could "simply patch-in any single-byte
instruction in a SMP system", I argue that this is against the
uniprocessor part of the errata, which clearly also applies to SMP.

By the way, I've looked at the Itanium documents a few years ago, and
I have not seen any reason at that time why the breakpoint+IPI scheme
would not work if we additionally perform the appropriate I and D cache
flushes. The rest of the requirements are _very_ similar.

Thanks,

Mathieu

> 
> 	-hpa

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

next prev parent reply	other threads:[~2010-01-18 21:38 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-01-12 16:26 [RFC PATCH 0/8] jump label v4 Jason Baron
2010-01-12 16:26 ` [RFC PATCH 1/8] jump label v4 - kprobes/x86: Cleanup RELATIVEJUMP_INSTRUCTION to RELATIVEJUMP_OPCODE Jason Baron
2010-01-12 16:26 ` [RFC PATCH 2/8] jump label v4 - x86: Introduce generic jump patching without stop_machine Jason Baron
2010-01-12 23:16   ` H. Peter Anvin
2010-01-13  2:06     ` Mathieu Desnoyers
2010-01-13  4:55       ` H. Peter Anvin
2010-01-13 14:30         ` Mathieu Desnoyers
2010-01-14  6:57           ` Masami Hiramatsu
2010-01-14 18:45           ` Masami Hiramatsu
2010-04-13 17:16             ` Mathieu Desnoyers
2010-01-13  5:38     ` Masami Hiramatsu
2010-01-14 15:32   ` Steven Rostedt
2010-01-14 15:36     ` H. Peter Anvin
2010-01-17 18:55       ` Mathieu Desnoyers
2010-01-17 19:16         ` Arjan van de Ven
2010-01-18 15:59           ` Masami Hiramatsu
2010-01-18 16:23             ` H. Peter Anvin
2010-01-18 16:52               ` Mathieu Desnoyers
2010-01-18 18:50                 ` H. Peter Anvin
2010-01-18 20:53                   ` Masami Hiramatsu
2010-01-18 21:18                     ` H. Peter Anvin
2010-01-18 21:32                   ` Mathieu Desnoyers [this message]
2010-01-18 16:31             ` Arjan van de Ven
2010-01-18 16:54               ` Mathieu Desnoyers
2010-01-18 18:21                 ` Masami Hiramatsu
2010-01-18 18:33                   ` Mathieu Desnoyers
2010-01-14 15:39     ` Mathieu Desnoyers
2010-01-14 16:23       ` Masami Hiramatsu
2010-01-14 16:42         ` Jason Baron
2010-01-12 16:26 ` [RFC PATCH 3/8] jump label v4 - move opcode definitions Jason Baron
2010-01-12 16:26 ` [RFC PATCH 4/8] jump label v4 - notifier atomic call chain notrace Jason Baron
2010-01-12 16:26 ` [RFC PATCH 5/8] jump label v4 - base patch Jason Baron
2010-01-12 16:26 ` [RFC PATCH 6/8] jump label v4 - x86 support Jason Baron
2010-01-12 16:26 ` [RFC PATCH 7/8] jump label v4 - tracepoint support Jason Baron
2010-01-12 16:26 ` [RFC PATCH 8/8] jump label v4 - add module support Jason Baron
  -- strict thread matches above, loose matches on Subject: below --
2010-01-17 22:56 [RFC PATCH 2/8] jump label v4 - x86: Introduce generic jump patching without stop_machine H. Peter Anvin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20100118213254.GA26355@Krystal \
    --to=mathieu.desnoyers@polymtl.ca \
    --cc=andi@firstfloor.org \
    --cc=arjan@infradead.org \
    --cc=hpa@zytor.com \
    --cc=jbaron@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mhiramat@redhat.com \
    --cc=mingo@elte.hu \
    --cc=roland@redhat.com \
    --cc=rostedt@goodmis.org \
    --cc=rth@redhat.com \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.