From: Masami Hiramatsu <mhiramat@redhat.com>
To: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>,
	Nick Piggin <npiggin@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>,
	akpm <akpm@linux-foundation.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Ingo Molnar <mingo@elte.hu>,
	Ananth N Mavinakayanahalli <ananth@in.ibm.com>,
	Jim Keniston <jkenisto@us.ibm.com>
Subject: Re: irq-disabled vs vmap vs text_poke
Date: Mon, 16 Feb 2009 21:00:35 -0500
Message-ID: <499A1A43.6000009@redhat.com>
In-Reply-To: <20090216172441.GA12576@Krystal>

Mathieu Desnoyers wrote:
> * Nick Piggin (npiggin@suse.de) wrote:
>> On Mon, Feb 16, 2009 at 10:04:43AM -0500, Masami Hiramatsu wrote:
>>>>>>>> BTW, what about using map_vm_area() in text_poke() instead of vmap()?
>>>>>>>> Since text_poke() just maps text pages to alias pages temporarily,
>>>>>>>> I think we don't need to use delayed vunmap().
>> [...] 
>>
>>> Here is the patch which replace v(un)map with (un)map_vm_area.
>> I don't quite understand the point of this... delayed vunmap() is
>> just an implementation detail of vmap subsystem. Callers should not
>> have to care.
>>
> 
> AFAIK, map_vm_area/unmap_vm_area is faster than vmap/vunmap.  This is
> the point of this patch. Masami, could you provide a quick benchmark of
> text_poke()/seconds before and after this optimization is applied to
> confirm this ?

Sure, here is the result of calling text_poke() 2^14 times.

<Without this patch>
Total: 3634133356(cycles), 221809(cycles/text_poke)
Total: 3699532690(cycles), 225801(cycles/text_poke)
Total: 3249855588(cycles), 198355(cycles/text_poke)

<With this patch>
Total: 483467579(cycles), 29508(cycles/text_poke)
Total: 497441301(cycles), 30361(cycles/text_poke)
Total: 497604548(cycles), 30371(cycles/text_poke)
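
For anyone who wants to re-check the numbers above: each per-call figure is
the cycle total divided by 2^14 calls, truncated to an integer. Below is a
minimal userspace sketch of that arithmetic. The helper names
(cycles_per_call, mean3) are made up for illustration and are not part of the
patch or the benchmark code. Averaging the three runs gives about 215321
cycles/text_poke without the patch and 30080 with it, i.e. roughly 7x fewer
cycles per call.

```c
#include <assert.h>
#include <stdint.h>

/* Number of text_poke() calls per run, as stated in the mail: 2^14. */
#define POKE_CALLS (UINT64_C(1) << 14)

/*
 * Hypothetical helper: average cycles per call, truncated toward zero
 * in the same way as the figures reported above.
 */
static uint64_t cycles_per_call(uint64_t total_cycles, uint64_t calls)
{
	assert(calls > 0);
	return total_cycles / calls;
}

/* Hypothetical helper: integer mean of three per-call figures. */
static uint64_t mean3(uint64_t a, uint64_t b, uint64_t c)
{
	return (a + b + c) / 3;
}
```

Calling cycles_per_call(3634133356, POKE_CALLS) reproduces the first
"without" figure, and cycles_per_call(483467579, POKE_CALLS) the first "with"
figure.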

BTW, this is not only about performance, but also about simplicity and
doing only what text_poke() actually needs.
vmap() may allocate a new vm_area. However, since text_poke() only needs to
map pages temporarily (for a very short time), we don't want to call
kmalloc() or any other memory allocator there.
And since text_poke() creates WRITABLE aliases of READ-ONLY pages, we
want to purge these aliases ASAP.
So I think just reserving a small vm_area for text_poke() once and
reusing it is enough.
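
To illustrate the idea outside the kernel, here is a rough userspace analogy
using mmap(). It is only a sketch, not the patch itself. An address range is
reserved once, the backing page is mapped writable at that fixed address just
long enough to copy the bytes, and the alias is then replaced by an
inaccessible mapping so the range stays reserved for the next call. All names
(struct poke_area, poke_area_init, poke, backing_create) are hypothetical, a
temporary file stands in for the text page, and the locking done by the patch
is left out because the sketch is single-threaded.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical: an address range reserved once and reused for every poke. */
struct poke_area {
	void *addr;
	size_t size;
};

/* Create a zero-filled, file-backed region standing in for a text page. */
static int backing_create(size_t size)
{
	FILE *tmp = tmpfile();
	int fd;

	if (!tmp)
		return -1;
	fd = dup(fileno(tmp));
	fclose(tmp);	/* the dup'ed descriptor keeps the file alive */
	if (fd >= 0 && ftruncate(fd, (off_t)size) != 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/* Reserve address space with no access rights (loosely like get_vm_area()). */
static int poke_area_init(struct poke_area *area, size_t size)
{
	area->addr = mmap(NULL, size, PROT_NONE,
			  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	area->size = size;
	return area->addr == MAP_FAILED ? -1 : 0;
}

/* Map a writable alias at the reserved address, copy, then drop the alias. */
static int poke(struct poke_area *area, int fd, size_t offset,
		const void *src, size_t len)
{
	unsigned char *alias;

	assert(offset <= area->size && len <= area->size - offset);

	alias = mmap(area->addr, area->size, PROT_READ | PROT_WRITE,
		     MAP_SHARED | MAP_FIXED, fd, 0);
	if (alias == MAP_FAILED)
		return -1;

	memcpy(alias + offset, src, len);

	/* Replace the alias with an inaccessible mapping; the range stays reserved. */
	if (mmap(area->addr, area->size, PROT_NONE,
		 MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0) == MAP_FAILED)
		return -1;
	return 0;
}
```

A caller would create the backing file with backing_create(), keep its own
PROT_READ, MAP_SHARED view of it, call poke_area_init() once, and then call
poke() as often as needed. Each write shows up through the read-only view,
and no new address range is set up per call.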

Thanks,

> 
> Thanks,
> 
> Mathieu
> 
>>  
>>> Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
>>> ---
>>>  arch/x86/include/asm/alternative.h |    1 +
>>>  arch/x86/kernel/alternative.c      |   25 ++++++++++++++++++++-----
>>>  init/main.c                        |    3 +++
>>>  3 files changed, 24 insertions(+), 5 deletions(-)
>>>
>>> Index: linux-2.6/arch/x86/include/asm/alternative.h
>>> ===================================================================
>>> --- linux-2.6.orig/arch/x86/include/asm/alternative.h
>>> +++ linux-2.6/arch/x86/include/asm/alternative.h
>>> @@ -177,6 +177,7 @@ extern void add_nops(void *insns, unsign
>>>   * The _early version expects the memory to already be RW.
>>>   */
>>>
>>> +extern void text_poke_init(void);
>>>  extern void *text_poke(void *addr, const void *opcode, size_t len);
>>>  extern void *text_poke_early(void *addr, const void *opcode, size_t len);
>>>
>>> Index: linux-2.6/arch/x86/kernel/alternative.c
>>> ===================================================================
>>> --- linux-2.6.orig/arch/x86/kernel/alternative.c
>>> +++ linux-2.6/arch/x86/kernel/alternative.c
>>> @@ -485,6 +485,16 @@ void *text_poke_early(void *addr, const
>>>  	return addr;
>>>  }
>>>
>>> +static struct vm_struct *text_poke_area[2];
>>> +static DEFINE_SPINLOCK(text_poke_lock);
>>> +
>>> +void __init text_poke_init(void)
>>> +{
>>> +	text_poke_area[0] = get_vm_area(PAGE_SIZE, VM_ALLOC);
>>> +	text_poke_area[1] = get_vm_area(2 * PAGE_SIZE, VM_ALLOC);
>>> +	BUG_ON(!text_poke_area[0] || !text_poke_area[1]);
>>> +}
>>> +
>>>  /**
>>>   * text_poke - Update instructions on a live kernel
>>>   * @addr: address to modify
>>> @@ -501,8 +511,9 @@ void *__kprobes text_poke(void *addr, co
>>>  	unsigned long flags;
>>>  	char *vaddr;
>>>  	int nr_pages = 2;
>>> -	struct page *pages[2];
>>> -	int i;
>>> +	struct page *pages[2], **pgp = pages;
>>> +	int i, ret;
>>> +	struct vm_struct *vma;
>>>
>>>  	if (!core_kernel_text((unsigned long)addr)) {
>>>  		pages[0] = vmalloc_to_page(addr);
>>> @@ -515,12 +526,16 @@ void *__kprobes text_poke(void *addr, co
>>>  	BUG_ON(!pages[0]);
>>>  	if (!pages[1])
>>>  		nr_pages = 1;
>>> -	vaddr = vmap(pages, nr_pages, VM_MAP, PAGE_KERNEL);
>>> -	BUG_ON(!vaddr);
>>> +	spin_lock(&text_poke_lock);
>>> +	vma = text_poke_area[nr_pages-1];
>>> +	ret = map_vm_area(vma, PAGE_KERNEL, &pgp);
>>> +	BUG_ON(ret);
>>> +	vaddr = vma->addr;
>>>  	local_irq_save(flags);
>>>  	memcpy(&vaddr[(unsigned long)addr & ~PAGE_MASK], opcode, len);
>>>  	local_irq_restore(flags);
>>> -	vunmap(vaddr);
>>> +	unmap_kernel_range((unsigned long)vma->addr, (unsigned long)vma->size);
>>> +	spin_unlock(&text_poke_lock);
>>>  	sync_core();
>>>  	/* Could also do a CLFLUSH here to speed up CPU recovery; but
>>>  	   that causes hangs on some VIA CPUs. */
>>> Index: linux-2.6/init/main.c
>>> ===================================================================
>>> --- linux-2.6.orig/init/main.c
>>> +++ linux-2.6/init/main.c
>>> @@ -676,6 +676,9 @@ asmlinkage void __init start_kernel(void
>>>  	taskstats_init_early();
>>>  	delayacct_init();
>>>
>>> +#ifdef CONFIG_X86
>>> +	text_poke_init();
>>> +#endif
>>>  	check_bugs();
>>>
>>>  	acpi_early_init(); /* before LAPIC and SMP init */
>>>
>>> -- 
>>> Masami Hiramatsu
>>>
>>> Software Engineer
>>> Hitachi Computer Products (America) Inc.
>>> Software Solutions Division
>>>
>>> e-mail: mhiramat@redhat.com
> 

-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America) Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com

