From: Mathieu Desnoyers <compudj@krystal.dyndns.org>
To: Ingo Molnar <mingo@elte.hu>
Cc: Steven Rostedt <rostedt@goodmis.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Peter Zijlstra <peterz@infradead.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Arjan van de Ven <arjan@infradead.org>
Subject: Re: [PATCH 0/3] ring-buffer: less locking and only disable preemption
Date: Sat, 4 Oct 2008 18:27:13 -0400
Message-ID: <20081004222713.GA1813@Krystal>
In-Reply-To: <20081004174121.GA1337@elte.hu>

* Ingo Molnar (mingo@elte.hu) wrote:
> 
> * Ingo Molnar <mingo@elte.hu> wrote:
> 
> > * Steven Rostedt <rostedt@goodmis.org> wrote:
> > 
> > > The dynamic function tracer is another issue. The problem with NMIs 
> > > has nothing to do with locking, or corrupting the buffers. It has to 
> > > do with the dynamic code modification.  Whenever we modify code, we 
> > > must guarantee that it will not be executed on another CPU.
> > > 
> > > Kstop_machine serves this purpose rather well. We can modify code 
> > > without worrying it will be executed on another CPU, except for NMIs. 
> > > The problem now comes where an NMI can come in and execute the code 
> > > being modified. That's why I put in all the notrace, lines. But it 
> > > gets difficult because of nmi_notifier can call all over the kernel.  
> > > Perhaps, we can simply disable the nmi-notifier when we are doing the 
> > > kstop_machine call?
> > 
> > that would definitely be one way to reduce the cross section, but not 
> > enough i'm afraid. For example in the nmi_watchdog=2 case we call into 
> > various lapic functions and paravirt lapic handlers which makes it all 
> > spread to 3-4 paravirtualization flavors ...
> > 
> > sched_clock()'s notrace aspects were pretty manageable, but this in 
> > its current form is not.
> 
> there's a relatively simple method that would solve all these 
> impact-size problems.
> 
> We cannot stop NMIs (and MCEs, etc.), but we can make kernel code 
> modifications atomic, by adding the following thin layer ontop of it:
> 
>    #define MAX_CODE_SIZE 10
> 
>    int redo_len;
>    u8 *redo_vaddr;
> 
>    u8 redo_buffer[MAX_CODE_SIZE];
> 
>    atomic_t __read_mostly redo_pending;
> 
> and use it in do_nmi():
> 
>    if (unlikely(atomic_read(&redo_pending)))
> 	modify_code_redo();
> 
> i.e. when we modify code, we first fill in the redo_buffer[], redo_vaddr 
> and redo_len[], then we set redo_pending flag. Then we modify the kernel 
> code, and clear the redo_pending flag.
> 
> If an NMI (or MCE) handler intervenes, it will notice the pending 
> 'transaction' and will copy redo_buffer[] to the (redo_vaddr,len) 
> location and will continue.
> 
> So as far as non-maskable contexts are concerned, kernel code patching 
> becomes an atomic operation. do_nmi() has to be marked notrace but 
> that's all and easy to maintain.
> 
> Hm?
> 

The comment at the beginning of
http://git.kernel.org/?p=linux/kernel/git/compudj/linux-2.6-lttng.git;a=blob;f=arch/x86/kernel/immediate.c;h=87a25db0efbd8f73d3d575e48541f2a179915da5;hb=b6148ea934f42e730571f41aa5a1a081a93995b5

explains that code modification on x86 SMP systems is not only a matter
of atomicity. The code must also not change underneath a running CPU,
which assumes that the instructions it is executing stay the same unless
a synchronizing instruction is issued before the new code is used. The
scheme you propose here takes care of atomicity, but it does not take
care of this synchronization problem. A sync_core() would probably be
required when such a modification is detected.
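
To make the point concrete, here is a rough userspace sketch of the redo
scheme quoted above with a serialization step added on the NMI path. It
is only an illustration under stated assumptions: fake_text is a plain
byte array standing in for kernel text, serialize_core() is a
hypothetical stand-in for sync_core() that merely counts calls, and
simulated_nmi() is invoked by hand in the middle of the write loop. Only
the redo_* names and MAX_CODE_SIZE come from the proposal.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

#define MAX_CODE_SIZE 10

static uint8_t fake_text[MAX_CODE_SIZE];   /* assumption: stands in for kernel text */
static uint8_t redo_buffer[MAX_CODE_SIZE];
static uint8_t *redo_vaddr;
static int redo_len;
static atomic_int redo_pending;
static int serialize_calls;                /* how often the stand-in was called */

/* Hypothetical stand-in for sync_core(); real code would execute a
 * serializing instruction here. */
static void serialize_core(void)
{
	serialize_calls++;
}

/* Finish the pending modification from the redo data, then serialize
 * before any of the new code can be executed. */
static void modify_code_redo(void)
{
	memcpy(redo_vaddr, redo_buffer, (size_t)redo_len);
	serialize_core();
}

/* What the do_nmi() hook from the proposal would do. */
static void simulated_nmi(void)
{
	if (atomic_load(&redo_pending))
		modify_code_redo();
}

/* Patch len bytes at vaddr. A simulated NMI on the same CPU arrives
 * after nmi_after bytes have been written (negative value: no NMI). */
static void patch_code(uint8_t *vaddr, const uint8_t *new_code, int len,
		       int nmi_after)
{
	assert(len > 0 && len <= MAX_CODE_SIZE);

	memcpy(redo_buffer, new_code, (size_t)len);
	redo_vaddr = vaddr;
	redo_len = len;
	atomic_store(&redo_pending, 1);

	for (int i = 0; i < len; i++) {
		if (i == nmi_after)
			simulated_nmi();
		vaddr[i] = new_code[i];   /* deliberately non-atomic */
	}

	atomic_store(&redo_pending, 0);
}
```

Calling patch_code(fake_text, new_code, 5, 2) interrupts the writer after
two bytes; the simulated NMI completes the copy and serializes once
before it would go on to execute the patched site.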

Also, speaking of plain atomicity, your scheme does not seem to protect
against NMIs running on a different CPU, because the non-atomic change
could race with such an NMI.
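
One possible interleaving, replayed single-threaded so that it is
deterministic, is sketched below. This is my guess at a representative
bad case, not the only one: the NMI on the other CPU checks the flag
just before the writer sets it, then reaches the site while it is half
written. struct patch_state, remote_nmi_check(), text_is_torn() and
replay_remote_nmi_race() are all hypothetical names made up for the
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define CODE_LEN 5

struct patch_state {
	uint8_t text[CODE_LEN];         /* stands in for the patched site */
	uint8_t redo_buffer[CODE_LEN];
	bool redo_pending;
};

/* NMI entry on the other CPU: the flag check from the proposal.
 * Returns true if a pending redo was seen and replayed. */
static bool remote_nmi_check(struct patch_state *s)
{
	if (!s->redo_pending)
		return false;
	memcpy(s->text, s->redo_buffer, CODE_LEN);
	return true;
}

/* True if the site is neither the old nor the new instruction. */
static bool text_is_torn(const struct patch_state *s,
			 const uint8_t *old_code, const uint8_t *new_code)
{
	return memcmp(s->text, old_code, CODE_LEN) != 0 &&
	       memcmp(s->text, new_code, CODE_LEN) != 0;
}

/* Replays one unlucky ordering of CPU 0 (writer) and CPU 1 (NMI).
 * Returns true if the NMI would have run a half-written site. */
static bool replay_remote_nmi_race(void)
{
	const uint8_t old_code[CODE_LEN] = {0x90, 0x90, 0x90, 0x90, 0x90};
	const uint8_t new_code[CODE_LEN] = {0xe8, 0x11, 0x22, 0x33, 0x44};
	struct patch_state s = { .redo_pending = false };

	memcpy(s.text, old_code, CODE_LEN);

	/* CPU 1: NMI enters and checks the flag; nothing is pending yet. */
	bool replayed = remote_nmi_check(&s);
	assert(!replayed);

	/* CPU 0: publish the redo data, set the flag, write two bytes. */
	memcpy(s.redo_buffer, new_code, CODE_LEN);
	s.redo_pending = true;
	s.text[0] = new_code[0];
	s.text[1] = new_code[1];

	/* CPU 1: the handler, already past its check, reaches the site. */
	bool torn = text_is_torn(&s, old_code, new_code);

	/* CPU 0: finish the write and clear the flag. */
	memcpy(s.text + 2, new_code + 2, CODE_LEN - 2);
	s.redo_pending = false;

	return torn;
}
```

In this ordering the flag check does not help, since the check on CPU 1
and the write on CPU 0 are not ordered against each other.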

Mathieu

> 	Ingo
> 

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
