Re: [RFC PATCH 2/8] jump label v4 - x86: Introduce generic jump patching without stop_machine

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: "H. Peter Anvin" <hpa@zytor.com>
To: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Jason Baron <jbaron@redhat.com>,
	linux-kernel@vger.kernel.org, mingo@elte.hu, tglx@linutronix.de,
	rostedt@goodmis.org, andi@firstfloor.org, roland@redhat.com,
	rth@redhat.com, mhiramat@redhat.com
Subject: Re: [RFC PATCH 2/8] jump label v4 - x86: Introduce generic jump	patching without stop_machine
Date: Tue, 12 Jan 2010 20:55:54 -0800	[thread overview]
Message-ID: <4B4D525A.5020101@zytor.com> (raw)
In-Reply-To: <20100113020610.GB29314@Krystal>

On 01/12/2010 06:06 PM, Mathieu Desnoyers wrote:
> * H. Peter Anvin (hpa@zytor.com) wrote:
>> On 01/12/2010 08:26 AM, Jason Baron wrote:
>>> Add text_poke_fixup() which takes a fixup address to where a processor
>>> jumps if it hits the modifying address while code modifying.
>>> text_poke_fixup() does following steps for this purpose.
>>>
>>>  1. Setup int3 handler for fixup.
>>>  2. Put a breakpoint (int3) on the first byte of modifying region,
>>>     and synchronize code on all CPUs.
>>>  3. Modify other bytes of modifying region, and synchronize code on all CPUs.
>>>  4. Modify the first byte of modifying region, and synchronize code
>>>     on all CPUs.
>>>  5. Clear int3 handler.
>>>
>>
>> We (Intel OTC) have been able to get an *unofficial* answer as to the
>> validity of this procedure; specifically as it applies to Intel hardware
>> (obviously).  We are working on getting an officially approved answer,
>> but as far as we currently know, the procedure as outlined above should
>> work on all Intel hardware.  In fact, we believe the synchronization in
>> step 3 is in fact unnecessary (as the synchronization in step 4 provides
>> sufficient guard.)
> 
> Hi Peter,
> 
> This is great news! Thanks to Intel OTC and yourself for looking into
> this. In the immediate values patches, I am doing the synchronization at
> the end of step (3) to ensure that all remote CPUs issue read memory
> barriers, so the stores to the instruction are done in this order:
> 
> spin lock
> store int3 to 1st byte
> smp_wmb()
> sync all cores
> store new instruction in all but 1st byte
> smp_wmb()
> issue smp_rmb() on all cores (a sync all cores has this effect)
> store new instruction to 1st byte
> send IPI to all cores (or call synchronize_sched()) to wait for all
>   breakpoint handlers to complete.
> spin unlock
> 
> So the question is: are these wmb/rmb pairs actually needed ?  As the
> instruction fetch is not performed by instructions per se, I doubt a
> rmb() will have any effect on them. I always prefer to stay on the safe
> side, but it wouldn't hurt to know.
> 

I don't think the smp_rmb() has any function.

However, you're being quite inconsistent in your terminology here.  The
assumption above is that the "synchronize code on all CPU" step is
sending an IPI to all cores and waiting for it to return, so that each
core has executed IPI/IRET before continuation.

It is *not* necessary to wait for the breakpoint handlers to return, as
long as they will get to IRET eventually, since IRET is a jump and a
serializing instruction.

> Hrm. Assuming we have a spinlock protecting all this, given that we
> synchronize all cores at step (4) _after_ removing the breakpoint, and
> given that the breakpoint handler is an interrupt gate (thus executes
> with interrupts off), I am inclined to think that sending the IPIs at
> the end of step (4) (and waiting for them to complete) should be enough
> to ensure that all in-flight breakpoint handlers for this site have
> completed their execution. This would mean that we only have to keep
> track of a single site at a time. Or am I missing something ?

Yes: the whole point was that you can omit the synchronization in step 4
if you leave the breakpoint handler in place (I said "omit step 5", but
that wasn't really what I meant.)

That means that at the cost of two compares in the standard #BP handler,
we can get away with only one IPI per atomic instruction poke.

	-hpa



-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.

next prev parent reply	other threads:[~2010-01-13  4:56 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-01-12 16:26 [RFC PATCH 0/8] jump label v4 Jason Baron
2010-01-12 16:26 ` [RFC PATCH 1/8] jump label v4 - kprobes/x86: Cleanup RELATIVEJUMP_INSTRUCTION to RELATIVEJUMP_OPCODE Jason Baron
2010-01-12 16:26 ` [RFC PATCH 2/8] jump label v4 - x86: Introduce generic jump patching without stop_machine Jason Baron
2010-01-12 23:16   ` H. Peter Anvin
2010-01-13  2:06     ` Mathieu Desnoyers
2010-01-13  4:55       ` H. Peter Anvin [this message]
2010-01-13 14:30         ` Mathieu Desnoyers
2010-01-14  6:57           ` Masami Hiramatsu
2010-01-14 18:45           ` Masami Hiramatsu
2010-04-13 17:16             ` Mathieu Desnoyers
2010-01-13  5:38     ` Masami Hiramatsu
2010-01-14 15:32   ` Steven Rostedt
2010-01-14 15:36     ` H. Peter Anvin
2010-01-17 18:55       ` Mathieu Desnoyers
2010-01-17 19:16         ` Arjan van de Ven
2010-01-18 15:59           ` Masami Hiramatsu
2010-01-18 16:23             ` H. Peter Anvin
2010-01-18 16:52               ` Mathieu Desnoyers
2010-01-18 18:50                 ` H. Peter Anvin
2010-01-18 20:53                   ` Masami Hiramatsu
2010-01-18 21:18                     ` H. Peter Anvin
2010-01-18 21:32                   ` Mathieu Desnoyers
2010-01-18 16:31             ` Arjan van de Ven
2010-01-18 16:54               ` Mathieu Desnoyers
2010-01-18 18:21                 ` Masami Hiramatsu
2010-01-18 18:33                   ` Mathieu Desnoyers
2010-01-14 15:39     ` Mathieu Desnoyers
2010-01-14 16:23       ` Masami Hiramatsu
2010-01-14 16:42         ` Jason Baron
2010-01-12 16:26 ` [RFC PATCH 3/8] jump label v4 - move opcode definitions Jason Baron
2010-01-12 16:26 ` [RFC PATCH 4/8] jump label v4 - notifier atomic call chain notrace Jason Baron
2010-01-12 16:26 ` [RFC PATCH 5/8] jump label v4 - base patch Jason Baron
2010-01-12 16:26 ` [RFC PATCH 6/8] jump label v4 - x86 support Jason Baron
2010-01-12 16:26 ` [RFC PATCH 7/8] jump label v4 - tracepoint support Jason Baron
2010-01-12 16:26 ` [RFC PATCH 8/8] jump label v4 - add module support Jason Baron
  -- strict thread matches above, loose matches on Subject: below --
2010-01-17 22:56 [RFC PATCH 2/8] jump label v4 - x86: Introduce generic jump patching without stop_machine H. Peter Anvin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4B4D525A.5020101@zytor.com \
    --to=hpa@zytor.com \
    --cc=andi@firstfloor.org \
    --cc=jbaron@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@polymtl.ca \
    --cc=mhiramat@redhat.com \
    --cc=mingo@elte.hu \
    --cc=roland@redhat.com \
    --cc=rostedt@goodmis.org \
    --cc=rth@redhat.com \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox