From: Borislav Petkov <bp@amd64.org>
To: "Valdis.Kletnieks@vt.edu" <Valdis.Kletnieks@vt.edu>
Cc: Borislav Petkov <bp@alien8.de>, Ingo Molnar <mingo@elte.hu>,
	melwyn lobo <linux.melwyn@gmail.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>
Subject: Re: x86 memcpy performance
Date: Tue, 16 Aug 2011 14:16:04 +0200
Message-ID: <20110816121604.GA29251@aftab>
In-Reply-To: <6296.1313462075@turing-police.cc.vt.edu>

On Mon, Aug 15, 2011 at 10:34:35PM -0400, Valdis.Kletnieks@vt.edu wrote:
> On Sun, 14 Aug 2011 11:59:10 +0200, Borislav Petkov said:
> 
> > Benchmarking with 10000 iterations, average results:
> > size    XM              MM              speedup
> > 119     540.58          449.491         0.8314969419
> 
> > 12273   2307.86         4042.88         1.751787902
> > 13924   2431.8          4224.48         1.737184756
> > 14335   2469.4          4218.82         1.708440514
> > 15018   2675.67         1904.07         0.711622886
> > 16374   2989.75         5296.26         1.771470902
> > 24564   4262.15         7696.86         1.805863077
> > 27852   4362.53         3347.72         0.7673805572
> > 28672   5122.8          7113.14         1.388524413
> > 30033   4874.62         8740.04         1.792967931
> 
> The numbers for 15018 and 27852 are *way* odd for the MM case. I don't feel
> really good about this till we understand what happened for those two cases.

Yep.

> Also, anytime I see "10000 iterations", I ask myself if the benchmark
> rigging took proper note of hot/cold cache issues. That *may* explain
> the two oddball results we see above - but not knowing more about how
> it was benched, it's hard to say.

Yeah, the more scrutiny this gets the better. So I've cleaned up my
setup and have attached it.

xm_mem.c does the benchmarking. In bench_memcpy() there's the sse_memcpy
call, which is the SSE memcpy implementation using inline asm. It looks
like gcc produces pretty crappy code here: if I replace the sse_memcpy
call with xm_memcpy() from xm_memcpy.S (the same function, but in pure
asm), I get much better numbers, sometimes even over 2x. It all depends
on the alignment of the buffers, though. Also, those numbers don't
include the context saving/restoring which the kernel does for us.

size    XM              MM              speedup
7491    1509.89         2346.94         1.554378381
8170    2166.81         2857.78         1.318890326
12277   2659.03         4179.31         1.571744176
13907   2571.24         4125.7          1.604558427
14319   2638.74         5799.67         2.19789466	<----
14993   2752.42         4413.85         1.603625603
16371   3479.11         5562.65         1.59887055

So please take a look and let me know what you think.

Thanks.

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551

[-- Attachment #2: sse_memcpy.tar.bz2 --]
[-- Type: application/octet-stream, Size: 3508 bytes --]
