public inbox for linux-kernel@vger.kernel.org
From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Ananth N Mavinakayanahalli <ananth@in.ibm.com>,
	lkml <linux-kernel@vger.kernel.org>,
	systemtap <systemtap@sources.redhat.com>,
	DLE <dle-develop@lists.sourceforge.net>,
	Jim Keniston <jkenisto@us.ibm.com>,
	Srikar Dronamraju <srikar@linux.vnet.ibm.com>,
	Christoph Hellwig <hch@infradead.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Anders Kaseorg <andersk@ksplice.com>,
	Tim Abbott <tabbott@ksplice.com>,
	Andi Kleen <andi@firstfloor.org>, Jason Baron <jbaron@redhat.com>
Subject: Re: [PATCH -tip v3&10 07/18] x86: Add text_poke_smp for SMP cross modifying code
Date: Thu, 25 Feb 2010 10:33:05 -0500
Message-ID: <20100225153305.GC12635@Krystal>
In-Reply-To: <20100225133438.6725.80273.stgit@localhost6.localdomain6>

* Masami Hiramatsu (mhiramat@redhat.com) wrote:
> Add generic text_poke_smp for SMP which uses stop_machine()
> to synchronize modifying code.
> This stop_machine() method is officially described at "7.1.3
> Handling Self- and Cross-Modifying Code" on the intel's
> software developer's manual 3A.
> 
> Since stop_machine() can't protect code against NMI/MCE, this
> function can not modify those handlers. And also, this function
> is basically for modifying multibyte-single-instruction. For
> modifying multibyte-multi-instructions, we need another special
> trap & detour code.
> 
> This code originally comes from immediate values with stop_machine()
> version. Thanks Jason and Mathieu!
> 
> Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
> Cc: Mathieu Desnoyers <compudj@krystal.dyndns.org>
> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
> Cc: Ingo Molnar <mingo@elte.hu>
> Cc: Jim Keniston <jkenisto@us.ibm.com>
> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
> Cc: Christoph Hellwig <hch@infradead.org>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> Cc: H. Peter Anvin <hpa@zytor.com>
> Cc: Anders Kaseorg <andersk@ksplice.com>
> Cc: Tim Abbott <tabbott@ksplice.com>
> Cc: Andi Kleen <andi@firstfloor.org>
> Cc: Jason Baron <jbaron@redhat.com>
> ---
> 
>  arch/x86/include/asm/alternative.h |    4 ++
>  arch/x86/kernel/alternative.c      |   60 ++++++++++++++++++++++++++++++++++++
>  2 files changed, 63 insertions(+), 1 deletions(-)
> 
> diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
> index f1e253c..b09ec55 100644
> --- a/arch/x86/include/asm/alternative.h
> +++ b/arch/x86/include/asm/alternative.h
> @@ -165,10 +165,12 @@ static inline void apply_paravirt(struct paravirt_patch_site *start,
>   * invalid instruction possible) or if the instructions are changed from a
>   * consistent state to another consistent state atomically.
>   * More care must be taken when modifying code in the SMP case because of
> - * Intel's errata.
> + * Intel's errata. text_poke_smp() takes care that errata, but still
> + * doesn't support NMI/MCE handler code modifying.
>   * On the local CPU you need to be protected again NMI or MCE handlers seeing an
>   * inconsistent instruction while you patch.
>   */
>  extern void *text_poke(void *addr, const void *opcode, size_t len);
> +extern void *text_poke_smp(void *addr, const void *opcode, size_t len);
>  
>  #endif /* _ASM_X86_ALTERNATIVE_H */
> diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
> index e6ea034..635e4f4 100644
> --- a/arch/x86/kernel/alternative.c
> +++ b/arch/x86/kernel/alternative.c
> @@ -7,6 +7,7 @@
>  #include <linux/mm.h>
>  #include <linux/vmalloc.h>
>  #include <linux/memory.h>
> +#include <linux/stop_machine.h>
>  #include <asm/alternative.h>
>  #include <asm/sections.h>
>  #include <asm/pgtable.h>
> @@ -572,3 +573,62 @@ void *__kprobes text_poke(void *addr, const void *opcode, size_t len)
>  	local_irq_restore(flags);
>  	return addr;
>  }
> +
> +/*
> + * Cross-modifying kernel text with stop_machine().
> + * This code originally comes from immediate value.
> + */
> +static atomic_t stop_machine_first;
> +static int wrote_text;
> +
> +struct text_poke_params {
> +	void *addr;
> +	const void *opcode;
> +	size_t len;
> +};
> +
> +static int __kprobes stop_machine_text_poke(void *data)
> +{
> +	struct text_poke_params *tpp = data;
> +
> +	if (atomic_dec_and_test(&stop_machine_first)) {
> +		text_poke(tpp->addr, tpp->opcode, tpp->len);
> +		smp_wmb();	/* Make sure other cpus see that this has run */
> +		wrote_text = 1;
> +	} else {
> +		while (!wrote_text)
> +			smp_rmb();
> +		sync_core();

Hrm, there is a problem here. The smp_rmb() sits inside the loop body, so
the final iteration, the one that sees wrote_text become true, is not
followed by any smp_mb(). If wrote_text is already set on the first check,
cpus in the "else" branch never issue any memory barrier at all. I'd
rather do:

+static volatile int wrote_text;

...

+static int __kprobes stop_machine_text_poke(void *data)
+{
+	struct text_poke_params *tpp = data;
+
+	if (atomic_dec_and_test(&stop_machine_first)) {
+		text_poke(tpp->addr, tpp->opcode, tpp->len);
+		smp_wmb();	/* order text_poke stores before store to wrote_text */
+		wrote_text = 1;
+	} else {
+		while (!wrote_text)
+			cpu_relax();
+		smp_mb();	/* order wrote_text load before following execution */
+	}

If you don't like the "volatile int" definition of wrote_text, then we
should probably use the ACCESS_ONCE() macro instead.
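
For illustration only, here is a user-space sketch of the same handshake
(one writer publishes, everybody else spins on the flag and issues a single
barrier after the loop). It is not kernel code: C11 atomics and pthreads
stand in for atomic_dec_and_test(), smp_wmb(), smp_mb() and stop_machine(),
and the names run_poke_rendezvous(), poke_worker(), text[] and new_insn[]
are made up for the example. It does not model instruction-cache or
sync_core() behaviour at all, only the data ordering.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <string.h>

#define MAX_THREADS 16
#define PATCH_LEN   5

static atomic_int first;       /* stands in for stop_machine_first */
static atomic_int wrote_text;  /* stands in for wrote_text */
static unsigned char text[PATCH_LEN];  /* stand-in for the patched text */
static const unsigned char new_insn[PATCH_LEN] = { 0xe9, 1, 2, 3, 4 };

static void *poke_worker(void *arg)
{
	int *saw_new = arg;

	/* Like atomic_dec_and_test(): only one thread sees the 1 -> 0 step. */
	if (atomic_fetch_sub(&first, 1) == 1) {
		memcpy(text, new_insn, PATCH_LEN);
		/* Release store: text stores are ordered before the flag store. */
		atomic_store_explicit(&wrote_text, 1, memory_order_release);
		*saw_new = 1;
	} else {
		/* Atomic load each iteration, so the read cannot be hoisted. */
		while (!atomic_load_explicit(&wrote_text, memory_order_relaxed))
			;	/* spin */
		/* One full fence after the loop: flag load before later reads. */
		atomic_thread_fence(memory_order_seq_cst);
		*saw_new = (memcmp(text, new_insn, PATCH_LEN) == 0);
	}
	return NULL;
}

/* Returns how many threads observed the fully written new text. */
int run_poke_rendezvous(int nthreads)
{
	pthread_t tid[MAX_THREADS];
	int saw_new[MAX_THREADS] = { 0 };
	int i, rc, ok = 0;

	assert(nthreads >= 1 && nthreads <= MAX_THREADS);
	memset(text, 0x90, PATCH_LEN);	/* "old" contents */
	atomic_store(&first, 1);
	atomic_store(&wrote_text, 0);

	for (i = 0; i < nthreads; i++) {
		rc = pthread_create(&tid[i], NULL, poke_worker, &saw_new[i]);
		assert(rc == 0);
		(void)rc;
	}
	for (i = 0; i < nthreads; i++) {
		pthread_join(tid[i], NULL);
		ok += saw_new[i];
	}
	return ok;
}
```

Calling run_poke_rendezvous(4) should return 4: every waiter that leaves
the loop goes through the fence before it looks at text[], so it cannot
observe the flag without also observing the complete write. The fence is
placed after the loop, not inside it, which is the point of the change
proposed above.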

Thanks,

Mathieu

> +	}
> +
> +	flush_icache_range((unsigned long)tpp->addr,
> +			   (unsigned long)tpp->addr + tpp->len);
> +	return 0;
> +}
> +
> +/**
> + * text_poke_smp - Update instructions on a live kernel on SMP
> + * @addr: address to modify
> + * @opcode: source of the copy
> + * @len: length to copy
> + *
> + * Modify multi-byte instruction by using stop_machine() on SMP. This allows
> + * user to poke/set multi-byte text on SMP. Only non-NMI/MCE code modifying
> + * should be allowed, since stop_machine() does _not_ protect code against
> + * NMI and MCE.
> + *
> + * Note: Must be called under get_online_cpus() and text_mutex.
> + */
> +void *__kprobes text_poke_smp(void *addr, const void *opcode, size_t len)
> +{
> +	struct text_poke_params tpp;
> +
> +	tpp.addr = addr;
> +	tpp.opcode = opcode;
> +	tpp.len = len;
> +	atomic_set(&stop_machine_first, 1);
> +	wrote_text = 0;
> +	stop_machine(stop_machine_text_poke, (void *)&tpp, NULL);
> +	return addr;
> +}
> +
> 
> 
> -- 
> Masami Hiramatsu
> 
> Software Engineer
> Hitachi Computer Products (America), Inc.
> Software Solutions Division
> 
> e-mail: mhiramat@redhat.com
> 

-- 
Mathieu Desnoyers
Operating System Efficiency Consultant
EfficiOS Inc.
http://www.efficios.com
