public inbox for linux-kernel@vger.kernel.org
From: Miao Xie <miaox@cn.fujitsu.com>
To: "Ma, Ling" <ling.ma@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>, Ingo Molnar <mingo@redhat.com>,
	Andi Kleen <andi@firstfloor.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	"Zhao, Yakui" <yakui.zhao@intel.com>,
	Linux Kernel <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH V2 -tip] lib,x86_64: improve the performance of memcpy() for unaligned copy
Date: Tue, 19 Oct 2010 10:53:36 +0800
Message-ID: <4CBD0830.4030301@cn.fujitsu.com>
In-Reply-To: <C10D3FB0CD45994C8A51FEC1227CE22F15D77722D0@shsmsx502.ccr.corp.intel.com>

On Mon, 18 Oct 2010 16:01:13 +0800, Ma, Ling wrote:
>>> rep_good will cause memcpy to jump to memcpy_c, so this patch will not run;
>>> we may continue to do further optimization on it later.
>
>> Yes, but in fact the performance of memcpy_c is not better on some micro-architectures (such as
>> Wolfdale-3M), especially in the unaligned cases, so we need to optimize it, and I think
>> the first step of optimization is optimizing the original code of memcpy().
>
> As mentioned above, we will optimize memcpy_c further soon.
> Two reasons:
>    1. the movs instruction needs a long latency to start up
>    2. the movs instruction is not good for the unaligned case.
>
>>> BTW the improvement is only from the core2 shift register optimization,
>>> but for most previous cpus the shift register is very sensitive because of the decode stage.
>>> I have tested Atom, Opteron, and Nocona, new patch is still better.
>
>> I think we can add a flag to make this improvement valid only for Core2 or other CPUs like it,
>> just like X86_FEATURE_REP_GOOD.
>
> We should optimize core2 in the memcpy_c function in the future, I think.

But there is a problem: if we use alternatives, the replacement instructions must be no longer
than the original instructions, and it seems the Core2-optimized sequence may be longer than
the original. So I think we can't put the Core2 optimization in the memcpy_c function, only in
the memcpy function.
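
To make the length constraint concrete, here is a minimal user-space sketch of
the idea. It is not the kernel code (the real logic is in
arch/x86/kernel/alternative.c); patch_site() and NOP_BYTE are hypothetical
names used only for illustration. The replacement is copied over the original
site and the leftover bytes are padded with NOPs, so a replacement longer than
the site has to be rejected.

```c
#include <stddef.h>
#include <string.h>

#define NOP_BYTE 0x90 /* single-byte x86 NOP, used here as padding */

/*
 * Hypothetical stand-in for alternatives patching: overwrite the original
 * instruction bytes at 'site' with 'repl' and pad the remainder with NOPs.
 * Returns 0 on success, or -1 if the replacement does not fit, in which
 * case 'site' is left untouched.
 */
static int patch_site(unsigned char *site, size_t site_len,
                      const unsigned char *repl, size_t repl_len)
{
	if (repl_len > site_len)
		return -1; /* a longer sequence would overwrite the following code */

	memcpy(site, repl, repl_len);
	memset(site + repl_len, NOP_BYTE, site_len - repl_len);
	return 0;
}
```

With an 8-byte site, a 3-byte "rep movsq" (f3 48 a5) fits and is followed by
five NOPs, while a 12-byte sequence is refused. That is the case I am worried
about for the Core2-optimized code.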

Regards
Miao
