From: Willy TARREAU <willy@w.ods.org>
To: Manfred Spraul <manfred@colorfullife.com>
Cc: linux-kernel@vger.kernel.org, arjanv@redhat.com
Subject: Re: [CFT] faster athlon/duron memory copy implementation
Date: Thu, 24 Oct 2002 22:44:04 +0200
Message-ID: <20021024204404.GA486@pcw.home.local>
In-Reply-To: <3DB82ABF.8030706@colorfullife.com>

On Thu, Oct 24, 2002 at 07:15:43PM +0200, Manfred Spraul wrote:
> AMD recommends to perform memory copies with backward read operations 
> instead of prefetch.
> 
> http://208.15.46.63/events/gdc2002.htm
> 
> Attached is a test app that compares several memory copy implementations.
> Could you run it and report the results to me, together with cpu, 
> chipset and memory type?
> 
> Please run 2 or 3 times.

Dual Athlon XP 1800+ on ASUS A7M266-D (760MPX), 512 MB of PC2100 in two identical banks.
A few minutes after the first three runs (once I had typed this mail), the
results were noticeably slower; see below.

willy@pcw:c$ ./athlon
Athlon test program $Id: fast.c,v 1.6 2000/09/23 09:05:45 arjan Exp $

copy_page() tests
copy_page function 'warm up run'         took 16402 cycles per page
copy_page function '2.4 non MMX'         took 17886 cycles per page
copy_page function '2.4 MMX fallback'    took 17956 cycles per page
copy_page function '2.4 MMX version'     took 16382 cycles per page
copy_page function 'faster_copy'         took 9807 cycles per page
copy_page function 'even_faster'         took 10205 cycles per page
copy_page function 'no_prefetch'         took 8457 cycles per page
willy@pcw:c$ ./athlon
Athlon test program $Id: fast.c,v 1.6 2000/09/23 09:05:45 arjan Exp $

copy_page() tests
copy_page function 'warm up run'         took 16552 cycles per page
copy_page function '2.4 non MMX'         took 17744 cycles per page
copy_page function '2.4 MMX fallback'    took 17713 cycles per page
copy_page function '2.4 MMX version'     took 16427 cycles per page
copy_page function 'faster_copy'         took 9823 cycles per page
copy_page function 'even_faster'         took 10266 cycles per page
copy_page function 'no_prefetch'         took 8451 cycles per page
willy@pcw:c$ ./athlon
Athlon test program $Id: fast.c,v 1.6 2000/09/23 09:05:45 arjan Exp $

copy_page() tests
copy_page function 'warm up run'         took 16409 cycles per page
copy_page function '2.4 non MMX'         took 17547 cycles per page
copy_page function '2.4 MMX fallback'    took 17516 cycles per page
copy_page function '2.4 MMX version'     took 16354 cycles per page
copy_page function 'faster_copy'         took 9807 cycles per page
copy_page function 'even_faster'         took 10219 cycles per page
copy_page function 'no_prefetch'         took 8442 cycles per page

--- several minutes later ---

willy@pcw:c$ ./athlon
Athlon test program $Id: fast.c,v 1.6 2000/09/23 09:05:45 arjan Exp $

copy_page() tests
copy_page function 'warm up run'         took 18140 cycles per page
copy_page function '2.4 non MMX'         took 20370 cycles per page
copy_page function '2.4 MMX fallback'    took 20361 cycles per page
copy_page function '2.4 MMX version'     took 18086 cycles per page
copy_page function 'faster_copy'         took 10231 cycles per page
copy_page function 'even_faster'         took 10457 cycles per page
copy_page function 'no_prefetch'         took 8456 cycles per page

=> It seems that the memory areas in use have changed, and copies are a bit
slower now. But as you can see, no_prefetch is stable. Only the "common"
functions get slower.

So I tried allocating hundreds of MB of RAM, enough to make the machine swap
a bit, and then freeing it. The results look better again:

willy@pcw:c$ ./athlon
Athlon test program $Id: fast.c,v 1.6 2000/09/23 09:05:45 arjan Exp $

copy_page() tests
copy_page function 'warm up run'         took 16135 cycles per page
copy_page function '2.4 non MMX'         took 17863 cycles per page
copy_page function '2.4 MMX fallback'    took 17866 cycles per page
copy_page function '2.4 MMX version'     took 16057 cycles per page
copy_page function 'faster_copy'         took 9669 cycles per page
copy_page function 'even_faster'         took 10176 cycles per page
copy_page function 'no_prefetch'         took 8433 cycles per page

=> "common" implementations seem to really suffer from physical location.

Other data :
------------

willy@pcw:c$ cat /proc/pci
  Bus  0, device   0, function  0:
    Host bridge: Advanced Micro Devices [AMD] AMD-760 MP [IGD4-2P] System Controller (rev 17).
      Master Capable.  Latency=32.
      Prefetchable 32 bit memory at 0xfc000000 [0xfdffffff].
      Prefetchable 32 bit memory at 0xfb800000 [0xfb800fff].
      I/O at 0xe800 [0xe803].

willy@pcw:c$ cat /proc/cpuinfo
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 6
model           : 6
model name      : AMD Athlon(TM) MP 1800+
stepping        : 2
cpu MHz         : 1546.000
cache size      : 256 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse syscall mmxext 3dnowext 3dnow
bogomips        : 3080.19

processor       : 1
vendor_id       : AuthenticAMD
cpu family      : 6
model           : 6
model name      : AMD Athlon(TM) MP 1800+
stepping        : 2
cpu MHz         : 1546.000
cache size      : 256 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse syscall mmxext 3dnowext 3dnow
bogomips        : 3086.74


Cheers,
Willy

