From: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
To: Andi Kleen <andi@firstfloor.org>
Cc: jbeulich@novell.com, "S. P. Prasanna" <prasanna@in.ibm.com>,
linux-kernel@vger.kernel.org, patches@x86-64.org,
Jeremy Fitzhardinge <jeremy@goop.org>
Subject: Re: new text patching for review
Date: Fri, 20 Jul 2007 10:36:33 -0400 [thread overview]
Message-ID: <20070720143633.GB29979@Krystal> (raw)
In-Reply-To: <20070720082833.GC19833@one.firstfloor.org>
* Andi Kleen (andi@firstfloor.org) wrote:
> On Thu, Jul 19, 2007 at 07:49:12PM -0400, Mathieu Desnoyers wrote:
> > * Andi Kleen (andi@firstfloor.org) wrote:
> > > Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> writes:
> > >
> > > > * Andi Kleen (ak@suse.de) wrote:
> > > > >
> > > > > > Ewwwwwwwwwww.... you plan to run this in SMP ? So you actually go byte
> > > > > > by byte changing pieces of instructions non atomically and doing
> > > > > > non-Intel's errata friendly XMC. You are really looking for trouble
> > > > > > there :) Two distinct errors can occur:
> > > > >
> > > > > In this case it is ok because this only happens when transitioning
> > > > > from 1 CPU to 2 CPUs or vice versa and in both cases the other CPUs
> > > > > are essentially stopped.
> > > > >
> > > >
> > > > I agree that it's ok with SMP, but another problem arises: it's not only
> > > > a matter of being protected from SMP access, but also a matter of
> > > > reentrancy wrt interrupt handlers.
> > > >
> > > > i.e.: if, as we are patching nops non atomically, we have a non maskable
> > > > interrupt coming which calls get_cycles_sync() which uses the
> > >
> > > Hmm, i didn't think NMI handlers called that. e.g. nmi watchdog just
> > > uses jiffies.
> > >
> > > get_cycles_sync patching happens only relatively early at boot, so oprofile
> > > cannot be running yet.
> >
> > Actually, the nmi handler does use the get_cycles(), and also uses the
> >
> > spinlock code:
> >
> > arch/i386/kernel/nmi.c:
> > __kprobes int nmi_watchdog_tick(struct pt_regs * regs, unsigned reason)
> > ...
> > static DEFINE_SPINLOCK(lock); /* Serialise the printks */
> > spin_lock(&lock);
> > printk("NMI backtrace for cpu %d\n", cpu);
> > ...
> > spin_unlock(&lock);
> >
> > If A - we change the spinlock code non atomically it would break.
>
> It only has its lock prefixes twiggled, which should be ok.
>
Yes, this is a special case where both the lock prefixed instructions
and the non lock prefixed instructions are valid, so it does not matter
if a thread is preempted right after executing the NOP turned into a
0xf0. However, if such case happens when passing from UP to SMP, a
thread could be scheduled in and try to access to a spinlock with the
non-locked instruction. There should be some kind of "teardown" to make
sure that no such case can happen.
> > B - printk reads the TSC to get a timestamp, it breaks:
> > it calls:
> > printk_clock(void) -> sched_clock(); -> get_cycles_sync() on x86_64.
>
> Are we reading the same source? sched_clock has never used get_cycles_sync(),
> just ordinary get_cycles() which is not patched. In fact it mostly
> used rdtscll() directly.
>
Yes, you are right.. I am thinking more about other clients, such as a
tracer, which could want the precision given by get_cycles_sync() and
may execute in NMI context. It does not apply to the current kernel
source. It's just that reading a timestamp counter is an operation so
common that it should not come with restrictions about which context it
could be called from due to the alternatives mechanism.
> The main problem is alternative() nopify, e.g. for prefetches which
> could hide in every list_for_each; but from a quick look the current
> early NMI code doesn't do that.
Yup.. well.. my tracer will ;) I use a list_for_each_rcu() to iterate on
active traces. That's another example of a very basic piece of
infrastructure for which we don't want to bother about alternatives
patching when using it.
>
> > Yeah, that's a mess. That's why I always consider patching the code
> > in a way that will let the NMI handler run through it in a sane manner
> > _while_ the code is being patched. It implies _at least_ to do the
> > updates atomically with atomic aligned memory writes that keeps the site
> > being patched in a coherent state. Using a int3-based bypass is also
> > required on Intel because of the erratum regarding instruction cache.
>
> That's only for cross modifying code, no?
>
No. It also applies to UP modification. Since it is hard to insure that
no unmaskable interrupt handler will run on top of you, it can help to
leave the code in a valid state at every moment.
> > > This cannot happen for the current code:
> > > - full alternative patching happen only at boot when the other CPUs
> > > are not running
> >
> > Should be checked if NMIs and MCEs are active at that moment.
>
> They are probably both.
>
> I guess we could disable them again. I will cook up a patch.
>
I guess we could, although I wouldn't recommend doing it on a live
system, only at boot time.
> > I see the mb()/rmb()/wmb() also uses alternatives, they should be
> > checked for boot-time racing against NMIs and MCEs.
>
> Patch above would take care of it.
>
> >
> > init/main.c:start_kernel()
> >
> > parse_args() (where the nmi watchdog is enabled it seems) would probably
> > execute the smp-alt-boot and nmi_watchdog arguments in the order in which
> > they are given as kernel arguments. So I guess it could race.
>
> Not sure I see your point here. How can arguments race?
>
I thought parse_args() started the NMIs, but it seems to just take the
arguments and saves them for later.
> >
> > the "mce" kernel argument is also parsed in parse_args(), which leads to
> > the same problem.
>
> ?
>
Same as above.
>
> >
> > > For the immediate value patching it also cannot happen because
> > > you'll never modify multiple instructions and all immediate values
> > > can be changed atomically.
> > >
> >
> > Exactly, I always make sure that the immediate value within the
> > instruction is aligned (so a 5 bytes movl must have an offset of +3
> > compared to a 4 bytes alignment).
>
> The x86 architecture doesn't require alignment for atomic updates.
>
You mean for atomicity wrt the local SMP or cross-cpus ?
> > Make sure this API is used only to modify code meeting these
> > requirements (those are the ones I remember from the top of my head):
>
> Umm, that's far too complicated. Nobody will understand it anyways.
> I'll cook up something simpler.
>
ok :)
Mathieu
> -Andi
>
--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
next prev parent reply other threads:[~2007-07-20 14:41 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-07-19 9:05 new text patching for review Andi Kleen
2007-07-19 13:38 ` Mathieu Desnoyers
2007-07-19 13:46 ` Andi Kleen
2007-07-19 17:35 ` Mathieu Desnoyers
2007-07-19 21:14 ` Andi Kleen
2007-07-19 20:30 ` Jeremy Fitzhardinge
2007-07-19 20:46 ` [patches] " Andi Kleen
2007-07-19 20:51 ` Jeremy Fitzhardinge
2007-07-19 21:06 ` Andi Kleen
2007-07-19 21:08 ` Jeremy Fitzhardinge
2007-07-19 23:53 ` Mathieu Desnoyers
2007-08-20 0:55 ` Non atomic unaligned writes Mathieu Desnoyers
2007-08-20 5:03 ` Arjan van de Ven
2007-08-20 10:23 ` Andi Kleen
2007-07-19 23:51 ` new text patching for review Mathieu Desnoyers
2007-07-19 23:49 ` Mathieu Desnoyers
2007-07-20 1:15 ` Zachary Amsden
2007-07-20 7:37 ` Andi Kleen
2007-07-20 15:17 ` Mathieu Desnoyers
2007-07-21 6:19 ` Andi Kleen
2007-07-20 8:28 ` Andi Kleen
2007-07-20 14:36 ` Mathieu Desnoyers [this message]
2007-07-20 0:37 ` Zachary Amsden
2007-07-20 8:23 ` Andi Kleen
2007-08-10 19:00 ` Mathieu Desnoyers
-- strict thread matches above, loose matches on Subject: below --
2007-07-19 12:25 Jan Beulich
2007-07-19 12:41 ` Andi Kleen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20070720143633.GB29979@Krystal \
--to=mathieu.desnoyers@polymtl.ca \
--cc=andi@firstfloor.org \
--cc=jbeulich@novell.com \
--cc=jeremy@goop.org \
--cc=linux-kernel@vger.kernel.org \
--cc=patches@x86-64.org \
--cc=prasanna@in.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.