From: Miao Xie <miaox@cn.fujitsu.com>
To: Andi Kleen <andi@firstfloor.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Ingo Molnar <mingo@elte.hu>, "Theodore Ts'o" <tytso@mit.edu>,
	Chris Mason <chris.mason@oracle.com>,
	Linux Kernel <linux-kernel@vger.kernel.org>,
	Linux Btrfs <linux-btrfs@vger.kernel.org>,
	Linux Ext4 <linux-ext4@vger.kernel.org>
Subject: Re: [PATCH] x86_64/lib: improve the performance of memmove
Date: Thu, 16 Sep 2010 17:29:32 +0800
Message-ID: <4C91E37C.2060309@cn.fujitsu.com>
In-Reply-To: <20100916104008.3e1e34b2@basil.nowhere.org>

On Thu, 16 Sep 2010 10:40:08 +0200, Andi Kleen wrote:
> On Thu, 16 Sep 2010 15:16:31 +0800
> Miao Xie <miaox@cn.fujitsu.com> wrote:
>
>> On Thu, 16 Sep 2010 08:48:25 +0200 (cest), Andi Kleen wrote:
>>>> When the dest and the src do overlap and the memory area is large,
>>>> memmove of x86_64 is very inefficient, and it led to bad performance,
>>>> such as btrfs's file deletion performance. This patch improved the
>>>> performance of memmove on x86_64 by using __memcpy_bwd() instead of
>>>> byte copy when doing large memory area copy (len > 64).
>>>
>>>
>>> I still don't understand why you don't simply use a backwards
>>> string copy (with std) ? That should be much simpler and
>>> hopefully be as optimized for kernel copies on recent CPUs.
>>
>> But according to the comment of memcpy, some CPUs don't support "REP"
>> instruction,
>
> I think you misread the comment. REP prefixes are in all x86 CPUs.
> On some very old systems it wasn't optimized very well,
> but it probably doesn't make too much sense to optimize for those.
> On newer CPUs in fact REP should be usually faster than
> an unrolled loop.
>
> I'm not sure how optimized the backwards copy is, but most likely
> it is optimized too.
>
> Here's an untested patch that implements backwards copy with string
> instructions. Could you run it through your test harness?

Ok, I'll do it.

> +
> +/*
> + * Copy memory backwards (for memmove)
> + * rdi target
> + * rsi source
> + * rdx count
> + */
> +
> +ENTRY(memcpy_backwards):

s/://

> +	CFI_STARTPROC
> +	std
> +	movq %rdi, %rax
> +	movl %edx, %ecx
> +	add  %rdx, %rdi
> +	add  %rdx, %rsi

-	add  %rdx, %rdi
-	add  %rdx, %rsi
+	addq  %rdx, %rdi
+	addq  %rdx, %rsi

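Besides that, after the additions %rdi/%rsi point one past the end of
the memory area that is going to be copied. With the direction flag set,
movsq copies the quadword at the pointer and then decrements it, so we
must adjust both pointers to the start of the last quadword.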
+	leaq -8(%rdi), %rdi
+	leaq -8(%rsi), %rsi

> +	shrl $3, %ecx
> +	andl $7, %edx
> +	rep movsq

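The same as above: after the quadword copy, the pointers must be moved
to the last remaining byte before the byte copy.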
+	leaq 8(%rdi), %rdi
+	leaq 8(%rsi), %rsi
+	decq %rsi
+	decq %rdi
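
To make the two pointer adjustments above easier to check, here is a
minimal userspace sketch in C that models the corrected routine step by
step. It is only an illustration, not kernel code: the name
model_memcpy_backwards() and the use of a signed offset in place of
%rdi/%rsi are assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical userspace model of the backward copy with DF set.
 * "off" stands in for the distance of %rdi/%rsi from the buffer start.
 */
static void *model_memcpy_backwards(void *dst, const void *src, size_t count)
{
	unsigned char *d = dst;
	const unsigned char *s = src;
	ptrdiff_t off = (ptrdiff_t)count;	/* addq %rdx: one past the end */
	size_t qwords = count >> 3;		/* shrl $3, %ecx */
	size_t tail = count & 7;		/* andl $7, %edx */
	uint64_t q;

	off -= 8;			/* leaq -8(...): start of the last quadword */
	while (qwords--) {		/* rep movsq with DF=1 */
		memcpy(&q, s + off, 8);	/* load the whole quadword first */
		memcpy(d + off, &q, 8);	/* then store it */
		off -= 8;
	}

	off += 7;			/* leaq 8(...) then decq: last remaining byte */
	while (tail--) {		/* rep movsb with DF=1 */
		d[off] = s[off];
		off--;
	}
	return dst;
}
```

After the quadword loop the offset is (count & 7) - 8, so adding 8 and
subtracting 1 lands on the last of the remaining bytes.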

> +	movl %edx, %ecx
> +	rep movsb
> +	cld
> +	ret
> +	CFI_ENDPROC
> +ENDPROC(memcpy_backwards)
> +	
> diff --git a/arch/x86/lib/memmove_64.c b/arch/x86/lib/memmove_64.c
> index 0a33909..6c00304 100644
> --- a/arch/x86/lib/memmove_64.c
> +++ b/arch/x86/lib/memmove_64.c
> @@ -5,16 +5,16 @@
>   #include <linux/string.h>
>   #include <linux/module.h>
>
> +extern void asmlinkage memcpy_backwards(void *dst, const void *src,
> +				        size_t count);

The type of the return value must be "void *".
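
The reason is that memmove() returns the result of the call directly,
and the assembly already leaves the destination in %rax
(movq %rdi, %rax). A rough userspace sketch of the intended shape
follows. The *_model names are hypothetical stand-ins for this example,
and the plain byte loops only take the place of the real copy routines.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the assembly routine: copies from the end, returns dst. */
static void *memcpy_backwards_model(void *dst, const void *src, size_t count)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	while (count--)
		d[count] = s[count];
	return dst;
}

/* Stand-in for memcpy(): copies from the start, returns dst. */
static void *memcpy_forwards_model(void *dst, const void *src, size_t count)
{
	unsigned char *d = dst;
	const unsigned char *s = src;
	size_t i;

	for (i = 0; i < count; i++)
		d[i] = s[i];
	return dst;
}

/* Same structure as the patched memmove(): both branches return a void *. */
static void *memmove_model(void *dest, const void *src, size_t count)
{
	if ((uintptr_t)dest < (uintptr_t)src)
		return memcpy_forwards_model(dest, src, count);
	return memcpy_backwards_model(dest, src, count);
}
```

With a "void" prototype, the "return memcpy_backwards(...)" line in
memmove() would not yield a usable pointer.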

Thanks
Miao

> +
>   #undef memmove
>   void *memmove(void *dest, const void *src, size_t count)
>   {
>   	if (dest < src) {
>   		return memcpy(dest, src, count);
>   	} else {
> -		char *p = dest + count;
> -		const char *s = src + count;
> -		while (count--)
> -			*--p = *--s;
> +		return memcpy_backwards(dest, src, count);
>   	}
>   	return dest;
>   }
>
