From: Masami Hiramatsu <mhiramat@redhat.com>
To: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: "H. Peter Anvin" <hpa@zytor.com>, Jason Baron <jbaron@redhat.com>,
linux-kernel@vger.kernel.org, mingo@elte.hu, tglx@linutronix.de,
rostedt@goodmis.org, andi@firstfloor.org, roland@redhat.com,
rth@redhat.com
Subject: Re: [RFC PATCH 2/8] jump label v4 - x86: Introduce generic jump patching without stop_machine
Date: Thu, 14 Jan 2010 01:57:12 -0500 [thread overview]
Message-ID: <4B4EC048.4090107@redhat.com> (raw)
In-Reply-To: <20100113143050.GB30875@Krystal>
Mathieu Desnoyers wrote:
> * H. Peter Anvin (hpa@zytor.com) wrote:
>> On 01/12/2010 06:06 PM, Mathieu Desnoyers wrote:
>>> * H. Peter Anvin (hpa@zytor.com) wrote:
>>>> On 01/12/2010 08:26 AM, Jason Baron wrote:
>>>>> Add text_poke_fixup() which takes a fixup address to where a processor
>>>>> jumps if it hits the modifying address while code modifying.
>>>>> text_poke_fixup() does following steps for this purpose.
>>>>>
>>>>> 1. Setup int3 handler for fixup.
>>>>> 2. Put a breakpoint (int3) on the first byte of modifying region,
>>>>> and synchronize code on all CPUs.
>>>>> 3. Modify other bytes of modifying region, and synchronize code on all CPUs.
>>>>> 4. Modify the first byte of modifying region, and synchronize code
>>>>> on all CPUs.
>>>>> 5. Clear int3 handler.
>>>>>
>>>>
>>>> We (Intel OTC) have been able to get an *unofficial* answer as to the
>>>> validity of this procedure; specifically as it applies to Intel hardware
>>>> (obviously). We are working on getting an officially approved answer,
>>>> but as far as we currently know, the procedure as outlined above should
>>>> work on all Intel hardware. In fact, we believe the synchronization in
>>>> step 3 is in fact unnecessary (as the synchronization in step 4 provides
>>>> sufficient guard.)
>>>
>>> Hi Peter,
>>>
>>> This is great news! Thanks to Intel OTC and yourself for looking into
>>> this. In the immediate values patches, I am doing the synchronization at
>>> the end of step (3) to ensure that all remote CPUs issue read memory
>>> barriers, so the stores to the instruction are done in this order:
>>>
>>> spin lock
>>> store int3 to 1st byte
>>> smp_wmb()
>>> sync all cores
>>> store new instruction in all but 1st byte
>>> smp_wmb()
>>> issue smp_rmb() on all cores (a sync all cores has this effect)
>>> store new instruction to 1st byte
>>> send IPI to all cores (or call synchronize_sched()) to wait for all
>>> breakpoint handlers to complete.
>>> spin unlock
>>>
>>> So the question is: are these wmb/rmb pairs actually needed ? As the
>>> instruction fetch is not performed by instructions per se, I doubt a
>>> rmb() will have any effect on them. I always prefer to stay on the safe
>>> side, but it wouldn't hurt to know.
>>>
>>
>> I don't think the smp_rmb() has any function.
>
> OK, that's good to know.
>
>>
>> However, you're being quite inconsistent in your terminology here. The
>> assumption above is that the "synchronize code on all CPU" step is
>> sending an IPI to all cores and waiting for it to return, so that each
>> core has executed IPI/IRET before continuation.
>
> To be strictly correct, we cannot assume that the IPI handler issues IRET
> before signaling its completion. It's rather the other way around.
> This is why I add a smp_mb() in the IPI handler for the "synchronize
> code on all CPUs" step.
>
>>
>> It is *not* necessary to wait for the breakpoint handlers to return, as
>> long as they will get to IRET eventually, since IRET is a jump and a
>> serializing instruction.
>
> Ah, I see. So the added smp_mb() would not be needed then, as long as we
> know that the other CPUs either are currently running the IPI handler or
> have executed it. IOW: they will execute IRET very soon or they just
> executed it since the int3 have been written. I am a bit concerned about
> NMIs coming in this race window, but as they need to have started after
> we have put the breakpoint, that should be OK. (note: entry_*.S
> modifications are needed to support nesting breakpoint handlers in NMIs)
>
>>
>>> Hrm. Assuming we have a spinlock protecting all this, given that we
>>> synchronize all cores at step (4) _after_ removing the breakpoint, and
>>> given that the breakpoint handler is an interrupt gate (thus executes
>>> with interrupts off), I am inclined to think that sending the IPIs at
>>> the end of step (4) (and waiting for them to complete) should be enough
>>> to ensure that all in-flight breakpoint handlers for this site have
>>> completed their execution. This would mean that we only have to keep
>>> track of a single site at a time. Or am I missing something ?
>>
>> Yes: the whole point was that you can omit the synchronization in step 4
>> if you leave the breakpoint handler in place (I said "omit step 5", but
>> that wasn't really what I meant.)
Hmm, in that case, how can we reuse the breakpoint handler for another
text poke site? Even if we leave the handler, I think we need to clear
fixup information for next poking...
>>
>> That means that at the cost of two compares in the standard #BP handler,
>> we can get away with only one IPI per atomic instruction poke.
>
> OK. That makes sense now.
So, let me check the actual replacement steps.
(1) lock text_mutex
(2) setup breakpoint fixup addresses (source and destination)
(3) store int3 to 1st byte, and smp_wmb()
(4) send IPI and issue smp_mb() (or cpuid) with for sync all cores.
(5) store new instruction except 1st byte, and smp_wmb()
(6) store 1st byte of new instruction
(7) send IPI to all cores for waiting for all running breakpoint handlers.
(8) clear fixup addresses
(9) unlock text_mutex
Is this correct?
Thank you,
--
Masami Hiramatsu
Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division
e-mail: mhiramat@redhat.com
next prev parent reply other threads:[~2010-01-14 6:54 UTC|newest]
Thread overview: 36+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-01-12 16:26 [RFC PATCH 0/8] jump label v4 Jason Baron
2010-01-12 16:26 ` [RFC PATCH 1/8] jump label v4 - kprobes/x86: Cleanup RELATIVEJUMP_INSTRUCTION to RELATIVEJUMP_OPCODE Jason Baron
2010-01-12 16:26 ` [RFC PATCH 2/8] jump label v4 - x86: Introduce generic jump patching without stop_machine Jason Baron
2010-01-12 23:16 ` H. Peter Anvin
2010-01-13 2:06 ` Mathieu Desnoyers
2010-01-13 4:55 ` H. Peter Anvin
2010-01-13 14:30 ` Mathieu Desnoyers
2010-01-14 6:57 ` Masami Hiramatsu [this message]
2010-01-14 18:45 ` Masami Hiramatsu
2010-04-13 17:16 ` Mathieu Desnoyers
2010-01-13 5:38 ` Masami Hiramatsu
2010-01-14 15:32 ` Steven Rostedt
2010-01-14 15:36 ` H. Peter Anvin
2010-01-17 18:55 ` Mathieu Desnoyers
2010-01-17 19:16 ` Arjan van de Ven
2010-01-18 15:59 ` Masami Hiramatsu
2010-01-18 16:23 ` H. Peter Anvin
2010-01-18 16:52 ` Mathieu Desnoyers
2010-01-18 18:50 ` H. Peter Anvin
2010-01-18 20:53 ` Masami Hiramatsu
2010-01-18 21:18 ` H. Peter Anvin
2010-01-18 21:32 ` Mathieu Desnoyers
2010-01-18 16:31 ` Arjan van de Ven
2010-01-18 16:54 ` Mathieu Desnoyers
2010-01-18 18:21 ` Masami Hiramatsu
2010-01-18 18:33 ` Mathieu Desnoyers
2010-01-14 15:39 ` Mathieu Desnoyers
2010-01-14 16:23 ` Masami Hiramatsu
2010-01-14 16:42 ` Jason Baron
2010-01-12 16:26 ` [RFC PATCH 3/8] jump label v4 - move opcode definitions Jason Baron
2010-01-12 16:26 ` [RFC PATCH 4/8] jump label v4 - notifier atomic call chain notrace Jason Baron
2010-01-12 16:26 ` [RFC PATCH 5/8] jump label v4 - base patch Jason Baron
2010-01-12 16:26 ` [RFC PATCH 6/8] jump label v4 - x86 support Jason Baron
2010-01-12 16:26 ` [RFC PATCH 7/8] jump label v4 - tracepoint support Jason Baron
2010-01-12 16:26 ` [RFC PATCH 8/8] jump label v4 - add module support Jason Baron
-- strict thread matches above, loose matches on Subject: below --
2010-01-17 22:56 [RFC PATCH 2/8] jump label v4 - x86: Introduce generic jump patching without stop_machine H. Peter Anvin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4B4EC048.4090107@redhat.com \
--to=mhiramat@redhat.com \
--cc=andi@firstfloor.org \
--cc=hpa@zytor.com \
--cc=jbaron@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mathieu.desnoyers@polymtl.ca \
--cc=mingo@elte.hu \
--cc=roland@redhat.com \
--cc=rostedt@goodmis.org \
--cc=rth@redhat.com \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.