Re: [RFC PATCH 2/8] jump label v4 - x86: Introduce generic jump patching without stop_machine

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: Masami Hiramatsu <mhiramat@redhat.com>
To: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: "H. Peter Anvin" <hpa@zytor.com>, Jason Baron <jbaron@redhat.com>,
	linux-kernel@vger.kernel.org, mingo@elte.hu, tglx@linutronix.de,
	rostedt@goodmis.org, andi@firstfloor.org, roland@redhat.com,
	rth@redhat.com
Subject: Re: [RFC PATCH 2/8] jump label v4 - x86: Introduce generic jump	patching without stop_machine
Date: Thu, 14 Jan 2010 01:57:12 -0500	[thread overview]
Message-ID: <4B4EC048.4090107@redhat.com> (raw)
In-Reply-To: <20100113143050.GB30875@Krystal>

Mathieu Desnoyers wrote:
> * H. Peter Anvin (hpa@zytor.com) wrote:
>> On 01/12/2010 06:06 PM, Mathieu Desnoyers wrote:
>>> * H. Peter Anvin (hpa@zytor.com) wrote:
>>>> On 01/12/2010 08:26 AM, Jason Baron wrote:
>>>>> Add text_poke_fixup() which takes a fixup address to where a processor
>>>>> jumps if it hits the modifying address while code modifying.
>>>>> text_poke_fixup() does following steps for this purpose.
>>>>>
>>>>>  1. Setup int3 handler for fixup.
>>>>>  2. Put a breakpoint (int3) on the first byte of modifying region,
>>>>>     and synchronize code on all CPUs.
>>>>>  3. Modify other bytes of modifying region, and synchronize code on all CPUs.
>>>>>  4. Modify the first byte of modifying region, and synchronize code
>>>>>     on all CPUs.
>>>>>  5. Clear int3 handler.
>>>>>
>>>>
>>>> We (Intel OTC) have been able to get an *unofficial* answer as to the
>>>> validity of this procedure; specifically as it applies to Intel hardware
>>>> (obviously).  We are working on getting an officially approved answer,
>>>> but as far as we currently know, the procedure as outlined above should
>>>> work on all Intel hardware.  In fact, we believe the synchronization in
>>>> step 3 is in fact unnecessary (as the synchronization in step 4 provides
>>>> sufficient guard.)
>>>
>>> Hi Peter,
>>>
>>> This is great news! Thanks to Intel OTC and yourself for looking into
>>> this. In the immediate values patches, I am doing the synchronization at
>>> the end of step (3) to ensure that all remote CPUs issue read memory
>>> barriers, so the stores to the instruction are done in this order:
>>>
>>> spin lock
>>> store int3 to 1st byte
>>> smp_wmb()
>>> sync all cores
>>> store new instruction in all but 1st byte
>>> smp_wmb()
>>> issue smp_rmb() on all cores (a sync all cores has this effect)
>>> store new instruction to 1st byte
>>> send IPI to all cores (or call synchronize_sched()) to wait for all
>>>   breakpoint handlers to complete.
>>> spin unlock
>>>
>>> So the question is: are these wmb/rmb pairs actually needed ?  As the
>>> instruction fetch is not performed by instructions per se, I doubt a
>>> rmb() will have any effect on them. I always prefer to stay on the safe
>>> side, but it wouldn't hurt to know.
>>>
>>
>> I don't think the smp_rmb() has any function.
> 
> OK, that's good to know.
> 
>>
>> However, you're being quite inconsistent in your terminology here.  The
>> assumption above is that the "synchronize code on all CPU" step is
>> sending an IPI to all cores and waiting for it to return, so that each
>> core has executed IPI/IRET before continuation.
> 
> To be strictly correct, we cannot assume that the IPI handler issues IRET
> before signaling its completion. It's rather the other way around.
> This is why I add a smp_mb() in the IPI handler for the "synchronize
> code on all CPUs" step.
> 
>>
>> It is *not* necessary to wait for the breakpoint handlers to return, as
>> long as they will get to IRET eventually, since IRET is a jump and a
>> serializing instruction.
> 
> Ah, I see. So the added smp_mb() would not be needed then, as long as we
> know that the other CPUs either are currently running the IPI handler or
> have executed it. IOW: they will execute IRET very soon or they just
> executed it since the int3 have been written. I am a bit concerned about
> NMIs coming in this race window, but as they need to have started after
> we have put the breakpoint, that should be OK. (note: entry_*.S
> modifications are needed to support nesting breakpoint handlers in NMIs)
> 
>>
>>> Hrm. Assuming we have a spinlock protecting all this, given that we
>>> synchronize all cores at step (4) _after_ removing the breakpoint, and
>>> given that the breakpoint handler is an interrupt gate (thus executes
>>> with interrupts off), I am inclined to think that sending the IPIs at
>>> the end of step (4) (and waiting for them to complete) should be enough
>>> to ensure that all in-flight breakpoint handlers for this site have
>>> completed their execution. This would mean that we only have to keep
>>> track of a single site at a time. Or am I missing something ?
>>
>> Yes: the whole point was that you can omit the synchronization in step 4
>> if you leave the breakpoint handler in place (I said "omit step 5", but
>> that wasn't really what I meant.)

Hmm, in that case, how can we reuse the breakpoint handler for another
text poke site? Even if we leave the handler, I think we need to clear
fixup information for next poking...

>>
>> That means that at the cost of two compares in the standard #BP handler,
>> we can get away with only one IPI per atomic instruction poke.
> 
> OK. That makes sense now.

So, let me check the actual replacement steps.

(1) lock text_mutex
(2) setup breakpoint fixup addresses (source and destination)
(3) store int3 to 1st byte, and smp_wmb()
(4) send IPI and issue smp_mb() (or cpuid) with for sync all cores.
(5) store new instruction except 1st byte, and smp_wmb()
(6) store 1st byte of new instruction
(7) send IPI to all cores for waiting for all running breakpoint handlers.
(8) clear fixup addresses
(9) unlock text_mutex

Is this correct?

Thank you,

-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com

next prev parent reply	other threads:[~2010-01-14  6:54 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-01-12 16:26 [RFC PATCH 0/8] jump label v4 Jason Baron
2010-01-12 16:26 ` [RFC PATCH 1/8] jump label v4 - kprobes/x86: Cleanup RELATIVEJUMP_INSTRUCTION to RELATIVEJUMP_OPCODE Jason Baron
2010-01-12 16:26 ` [RFC PATCH 2/8] jump label v4 - x86: Introduce generic jump patching without stop_machine Jason Baron
2010-01-12 23:16   ` H. Peter Anvin
2010-01-13  2:06     ` Mathieu Desnoyers
2010-01-13  4:55       ` H. Peter Anvin
2010-01-13 14:30         ` Mathieu Desnoyers
2010-01-14  6:57           ` Masami Hiramatsu [this message]
2010-01-14 18:45           ` Masami Hiramatsu
2010-04-13 17:16             ` Mathieu Desnoyers
2010-01-13  5:38     ` Masami Hiramatsu
2010-01-14 15:32   ` Steven Rostedt
2010-01-14 15:36     ` H. Peter Anvin
2010-01-17 18:55       ` Mathieu Desnoyers
2010-01-17 19:16         ` Arjan van de Ven
2010-01-18 15:59           ` Masami Hiramatsu
2010-01-18 16:23             ` H. Peter Anvin
2010-01-18 16:52               ` Mathieu Desnoyers
2010-01-18 18:50                 ` H. Peter Anvin
2010-01-18 20:53                   ` Masami Hiramatsu
2010-01-18 21:18                     ` H. Peter Anvin
2010-01-18 21:32                   ` Mathieu Desnoyers
2010-01-18 16:31             ` Arjan van de Ven
2010-01-18 16:54               ` Mathieu Desnoyers
2010-01-18 18:21                 ` Masami Hiramatsu
2010-01-18 18:33                   ` Mathieu Desnoyers
2010-01-14 15:39     ` Mathieu Desnoyers
2010-01-14 16:23       ` Masami Hiramatsu
2010-01-14 16:42         ` Jason Baron
2010-01-12 16:26 ` [RFC PATCH 3/8] jump label v4 - move opcode definitions Jason Baron
2010-01-12 16:26 ` [RFC PATCH 4/8] jump label v4 - notifier atomic call chain notrace Jason Baron
2010-01-12 16:26 ` [RFC PATCH 5/8] jump label v4 - base patch Jason Baron
2010-01-12 16:26 ` [RFC PATCH 6/8] jump label v4 - x86 support Jason Baron
2010-01-12 16:26 ` [RFC PATCH 7/8] jump label v4 - tracepoint support Jason Baron
2010-01-12 16:26 ` [RFC PATCH 8/8] jump label v4 - add module support Jason Baron
  -- strict thread matches above, loose matches on Subject: below --
2010-01-17 22:56 [RFC PATCH 2/8] jump label v4 - x86: Introduce generic jump patching without stop_machine H. Peter Anvin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4B4EC048.4090107@redhat.com \
    --to=mhiramat@redhat.com \
    --cc=andi@firstfloor.org \
    --cc=hpa@zytor.com \
    --cc=jbaron@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@polymtl.ca \
    --cc=mingo@elte.hu \
    --cc=roland@redhat.com \
    --cc=rostedt@goodmis.org \
    --cc=rth@redhat.com \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox