public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Andi Kleen <ak@muc.de>
To: dean gaudet <dean-list-linux-kernel@arctic.org>
Cc: Benjamin LaHaise <bcrl@kvack.org>, linux-kernel@vger.kernel.org
Subject: Re: [RFC] x86-64: Use SSE for copy_page and clear_page
Date: 30 May 2005 21:32:25 +0200
Date: Mon, 30 May 2005 21:32:25 +0200	[thread overview]
Message-ID: <20050530193225.GC25794@muc.de> (raw)
In-Reply-To: <Pine.LNX.4.62.0505301209010.25345@twinlark.arctic.org>

On Mon, May 30, 2005 at 12:11:23PM -0700, dean gaudet wrote:
> On Mon, 30 May 2005, dean gaudet wrote:
> 
> > On Mon, 30 May 2005, Benjamin LaHaise wrote:
> > 
> > > Below is a patch that uses 128 bit SSE instructions for copy_page and 
> > > clear_page.  This is an improvement on P4 systems as can be seen by 
> > > running the test program at http://www.kvack.org/~bcrl/xmm64.c to get 
> > > results like:
> > 
> > it looks like the patch uses SSE2 instructions (pxor, movdqa, movntdq)... 
> > if you use xorps, movaps, movntps then it works on SSE processors as well.
> 
> oh and btw... on x86-64 you might want to look at using movnti with 64-bit 
> registers... the memory datapath on these processors is actually 64-bits 
> wide, and the 128-bit stores are broken into two 64-bit pieces internally 
> anyhow.  the advantage of using movnti over movntdq/movntps is that you 
> don't have to save/restore the xmm register set.

Any use of write combining for copy_page/clear_page is a bad idea.
The problem is that write combining always forces the destination
out of cache.  While it gives you better microbenchmarks your real workloads
suffer because they eat lot more additional cache misses when
accessing the fresh pages.

Don't go down that path please.

At least on Opteron I did quite some tests and the existing setup
with just rep ; movsq for C stepping or later or the unrolled loop 
for earlier CPUs worked best overall. On P4 I haven't do any benchmarks;
however it might be a good idea to check if rep ; movsq would be 
a win there too (if yes it could be enabled there)

-Andi

  reply	other threads:[~2005-05-30 19:32 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2005-05-30 18:16 [RFC] x86-64: Use SSE for copy_page and clear_page Benjamin LaHaise
2005-05-30 18:45 ` Jeff Garzik
2005-05-30 19:06 ` dean gaudet
2005-05-30 19:11   ` dean gaudet
2005-05-30 19:32     ` Andi Kleen [this message]
2005-05-31  8:37       ` Denis Vlasenko
2005-05-31  9:15         ` Denis Vlasenko
2005-05-31  9:23           ` Andi Kleen
2005-05-31 13:59             ` Benjamin LaHaise
2005-06-01  6:22               ` Denis Vlasenko
2005-06-01  6:47                 ` Denis Vlasenko
2005-06-01  7:22             ` michael
2005-06-01  7:48               ` Andi Kleen
2005-06-01  7:48               ` Denis Vlasenko
2005-06-01 21:46                 ` dean gaudet
2005-06-01  8:01               ` Nick Piggin
2005-05-30 19:38 ` Andi Kleen
2005-05-30 20:05   ` Michael Thonke
2005-05-30 20:14     ` Benjamin LaHaise
2005-05-30 20:42       ` Michael Thonke
2005-05-31  7:11     ` Andi Kleen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20050530193225.GC25794@muc.de \
    --to=ak@muc.de \
    --cc=bcrl@kvack.org \
    --cc=dean-list-linux-kernel@arctic.org \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox