From: Seth Goldberg <bergsoft@home.com>
To: Rogier Wolff <R.E.Wolff@BitWizard.nl>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>, linux-kernel@vger.kernel.org
Subject: Re: Athlon possible fixes
Date: Sat, 05 May 2001 15:44:31 -0700
Message-ID: <3AF4824F.8964E53B@home.com>
In-Reply-To: <200105051626.SAA16651@cave.bitwizard.nl>
>
> As all this is trying to avoid bus turnarounds (i.e. switching from
> reading to writing), wouldn't it be fastest to just trust that the CPU
> has at least 4k worth of cache? (and hope for the best that we don't
> get interrupted in the meanwhile).
>
> void copy_page (char *dest, char *source)
> {
>     long *dst = (long *)dest,
>          *src = (long *)source,
>          *end = (long *)(source + PAGE_SIZE);
> #if 1
>     register int i;
>     long t = 0;
>     static long tt;
>
>     for (i = 0; i < PAGE_SIZE / sizeof(long); i += cache_line_size() / sizeof(long))
>         /* Actually the innards of this loop should be:
>            (void) src[i];
>            however, the compiler will probably optimize that away. */
>         t += src[i];
>
>     tt = t;
> #endif
>     while (src < end)
>         *dst++ = *src++;
>
> }
>
> So, this is 15 lines of C, and it'd be interesting to benchmark this
> against the assembly.
>
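The quoted routine sums the touched words into a static variable so that the compiler cannot drop the reads. A minimal user-space sketch of the same idea using a volatile read instead is shown here. The name copy_page_pretouch and the PAGE_SIZE and CACHE_LINE values are assumptions for illustration, not taken from the kernel or from the quoted code:

```c
#include <stddef.h>

#define PAGE_SIZE  4096  /* assumed page size in bytes */
#define CACHE_LINE 64    /* assumed cache line size in bytes */

/* Touch one word per cache line, then copy the page word by word. */
static void copy_page_pretouch(void *dest, const void *source)
{
    long *dst = dest;
    const long *src = source;
    const long *end = src + PAGE_SIZE / sizeof(long);
    const volatile long *touch = source;
    size_t i;

    /* Volatile reads must be emitted, so no dummy sum is needed. */
    for (i = 0; i < PAGE_SIZE / sizeof(long); i += CACHE_LINE / sizeof(long))
        (void)touch[i];

    while (src < end)
        *dst++ = *src++;
}
```

This only pulls the source lines into cache before the copy loop runs; whether that beats the assembly versions is exactly what a benchmark has to settle.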
Well you asked for it :) :
clear_page by 'normal_clear_page' took 12196 cycles (318.1 MB/s)
clear_page by 'slow_zero_page' took 12207 cycles (317.9 MB/s)
clear_page by 'fast_clear_page' took 29272 cycles (132.6 MB/s)
clear_page by 'faster_clear_page' took 4831 cycles (803.1 MB/s)
copy_page by 'normal_copy_page' took 12607 cycles (307.8 MB/s)
copy_page by 'slow_copy_page' took 13617 cycles (285.0 MB/s)
copy_page by 'fast_copy_page' took 9531 cycles (407.1 MB/s)
copy_page by 'faster_copy' took 5585 cycles (694.7 MB/s)
copy_page by 'even_faster' took 5621 cycles (690.3 MB/s)
copy_page by 'even_faster_nopre' took 5837 cycles (664.8 MB/s)
copy_page by 'c_source' took 17296 cycles (224.3 MB/s)
The last one is yours :). I'd assume this is because the compiler is
not using mmx instructions for this. (The nopre is a routine I added to
check the speed with only a single prefetch instruction.) When I tried
adding the routine with the single prefetch instruction to mmx.c,
recompiled, and rebooted, the system stayed up a lot longer, but it
still crashed after around 3 minutes of work in X. (I was in X Windows
and the crash was partially written to the log file.)
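
For anyone wanting to relate the cycle counts to the MB/s column, here is a small sketch of the conversion. It assumes a 4096-byte page and 1 MB = 10^6 bytes. The clock frequency is not stated in this message; a value near 947.2 MHz is only what the posted numbers imply under those assumptions. The function name cycles_to_mb_per_s is hypothetical:

```c
#include <stdint.h>

#define PAGE_SIZE 4096  /* assumed page size in bytes */

/* Convert a per-page cycle count to MB/s (1 MB = 10^6 bytes).
   cpu_hz is the CPU clock in Hz; it is an input, not a known value. */
static double cycles_to_mb_per_s(uint64_t cycles, double cpu_hz)
{
    double seconds = (double)cycles / cpu_hz;
    return (double)PAGE_SIZE / seconds / 1e6;
}
```

With cpu_hz set to 947.2e6, a count of 12196 cycles comes out at roughly 318 MB/s, in line with the 'normal_clear_page' row.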