From: Manfred Spraul <manfred@colorfullife.com>
To: erich@uruk.org
Cc: Matthias Welk <matthias.welk@fokus.gmd.de>,
	arjanv@redhat.com, linux-kernel@vger.kernel.org
Subject: Re: [CFT] faster athlon/duron memory copy implementation
Date: Thu, 24 Oct 2002 21:38:38 +0200
Message-ID: <3DB84C3E.1070709@colorfullife.com>
In-Reply-To: E184nEw-00071m-00@trillium-hollow.org

erich@uruk.org wrote:

>>copy_page() tests
>>copy_page function 'warm up run'         took 18081 cycles per page
>>copy_page function '2.4 non MMX'         took 19487 cycles per page
>>copy_page function '2.4 MMX fallback'    took 19403 cycles per page
>>copy_page function '2.4 MMX version'     took 18086 cycles per page
>>copy_page function 'faster_copy'         took 11372 cycles per page
>>copy_page function 'even_faster'         took 11183 cycles per page
>>copy_page function 'no_prefetch'         took 7815 cycles per page
>>1020 [maw] (buruk) /tmp/athlon # athlon_test
>
>Whoa!  Hmm.
>
>If I'm reading this right, with a processor speed of 1.666 GHz,
>you're getting:
>
>    (4096 bytes / 7815 clocks) * 1.666 GHz  =  873 MB/sec
>
>The perfect peak performance of your setup, if the cache implements
>standard write-allocate behavior (the target cache line is read before it
>is written because the write logic doesn't know you're going to overwrite
>the whole line in cases like this), should be:
>  
>
There is no write allocate.
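
For comparing the other numbers in the quoted table, the conversion erich used can be written as a small helper. This is only a sketch: the function name is made up for illustration, the 1.666 GHz clock is the value assumed in the quoted mail, and 1 MB is taken as 10^6 bytes.

```c
/* Convert a copy_page() measurement in cycles per page into MB/s.
 * Hypothetical helper, not part of the test program.
 * Assumes 4096-byte pages and 1 MB = 10^6 bytes. */
static double page_copy_mb_per_sec(double cycles_per_page, double cpu_hz)
{
    const double page_bytes = 4096.0;

    /* bytes per cycle, times cycles per second, scaled to MB */
    return page_bytes / cycles_per_page * cpu_hz / 1e6;
}
```

With cpu_hz = 1.666e9, the 'no_prefetch' result of 7815 cycles per page gives roughly 873 MB/s, matching the figure quoted above.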

There are two optimizations for bulk memory copy:
- Avoid the write allocate. This is possible with the MMX or SSE
  non-temporal cache hints.
    * Already in the kernel. This is the difference between MMX and
      faster_copy.
- Avoid DRAM page misses, and stream from the memory chips with maximum
  efficiency.
    * New optimization. "prefetch" is only a hint to the CPU that the
      program might need the memory. If I understand the AMD document
      correctly, that is not what's needed for bulk memory copy: we know
      that we'll need that cacheline. Thus we do a real read, which
      forces the CPU to fetch the cacheline even if all read buffers are
      occupied.
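
The two points above can be sketched in portable C roughly as follows. This is an illustration under stated assumptions, not the code from the test program: the function names are hypothetical, the 64-byte cache line size is assumed, and SSE2 intrinsics stand in for the kernel's MMX non-temporal stores (with a plain memcpy() fallback when SSE2 is unavailable or the buffers are unsuitable).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

#define CACHE_LINE  64     /* assumed cache line size */
#define BLOCK_BYTES 4096   /* one page */

/* Step 1: issue a real load from every cache line of the source.
 * Unlike a prefetch hint, a load cannot be dropped by the CPU. */
static void touch_block(const unsigned char *src, size_t len)
{
    const volatile unsigned char *p = src;

    for (size_t off = 0; off < len; off += CACHE_LINE)
        (void)p[off];
}

/* Step 2: copy with non-temporal stores so the destination lines are
 * not read into the cache before being overwritten. */
static void stream_copy(void *dst, const void *src, size_t len)
{
#if defined(__SSE2__)
    if ((uintptr_t)dst % 16 == 0 && len % 16 == 0) {
        __m128i *d = dst;
        const __m128i *s = src;

        for (size_t i = 0; i < len / 16; i++)
            _mm_stream_si128(&d[i], _mm_loadu_si128(&s[i]));
        _mm_sfence();   /* make the streamed stores globally visible */
        return;
    }
#endif
    memcpy(dst, src, len);  /* fallback: ordinary cached copy */
}

/* Hypothetical page copy combining both steps. */
void copy_page_sketch(void *dst, const void *src)
{
    touch_block(src, BLOCK_BYTES);
    stream_copy(dst, src, BLOCK_BYTES);
}
```

Whether touching the whole page first or working in smaller blocks is faster depends on the CPU and memory controller, so this shows the shape of the idea rather than a tuned implementation.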

--
    Manfred


