From: Jens Maurer <Jens.Maurer@gmx.net>
To: "David S. Miller" <davem@redhat.com>
Cc: linux-kernel@vger.kernel.org, arjanv@redhat.com
Subject: Re: [PATCH] Use x86 SSE instructions for clear_page, copy_page
Date: Sun, 22 Aug 2004 22:49:37 +0200 [thread overview]
Message-ID: <412906E1.4070003@gmx.net> (raw)
In-Reply-To: <20040817193338.16f57197.davem@redhat.com>
David S. Miller wrote:
> Time kernel builds on an otherwise totally idle machine.
> Do multiple runs so that the cache gets loaded and you're
> testing the memory accesses rather than disk reads.
I've applied Denis Vlasenko's patch that allows runtime
switching between "rep stosl" and SSE instructions.
I'm using this script to perform the kernel build:
rm mm/*.o kernel/*.o
make SUBDIRS="mm kernel"
Hardware: Pentium III 850 MHz with 256 MB RAM
Results of "time" using "rep stosl":
real 1m19.038s 1m18.099s 1m18.612s 1m18.630s
user 1m07.258s 1m07.056s 1m07.198s 1m07.052s
sys 0m10.062s 0m10.205s 0m10.141s 0m10.209s
oprofile result for "rep stosl" (obtained on a separate run):
samples % app name symbol name
497537 71.9851 cc1 (no symbols)
37358 5.4051 vmlinux-2.6-sse zero_page_std
6933 1.0031 vmlinux-2.6-sse __copy_to_user_ll
6010 0.8695 libc-2.3.3.so _int_malloc
5940 0.8594 vmlinux-2.6-sse copy_page_std
5262 0.7613 vmlinux-2.6-sse mark_offset_tsc
Results using SSE instructions (writes do not use
the caches):
real 1m16.724s 1m16.281s 1m16.681s 1m16.681s 1m16.207s
user 1m07.938s 1m07.752s 1m07.905s 1m07.916s 1m07.846s
sys 0m07.517s 0m07.866s 0m07.715s 0m07.753s 0m07.684s
oprofile result for SSE (obtained on a separate run):
501243 74.5966 cc1 (no symbols)
21533 3.2046 vmlinux-2.6-sse zero_page_sse
8166 1.2153 vmlinux-2.6-sse __copy_to_user_ll
5797 0.8627 libc-2.3.3.so _int_malloc
5344 0.7953 vmlinux-2.6-sse copy_page_sse
4757 0.7080 libc-2.3.3.so __GI_memset
4490 0.6682 vmlinux-2.6-sse mark_offset_tsc
Interpretation:
Real time is about 2 sec lower, user CPU is about 0.7-0.9 sec
higher, system CPU is about 2.3 sec lower.
Results from test runs fluctuate by about 0.2 sec, thus the
measured differences appear to be significant, and in favor
of using SSE instructions for clear_page(). copy_page()
was not used enough (compared to clear_page) to give a clear
picture.
It looks like a future patch should allow for independent
"rep stosl"/SSE configuration for clear_page() and copy_page().
Jens Maurer
next prev parent reply other threads:[~2004-08-22 20:50 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2004-08-17 6:13 [PATCH] Use x86 SSE instructions for clear_page, copy_page Jens Maurer
2004-08-17 7:27 ` Arjan van de Ven
2004-08-17 8:10 ` Andrey Panin
2004-08-17 8:11 ` Arjan van de Ven
2004-08-17 22:40 ` Jens Maurer
2004-08-18 2:33 ` David S. Miller
2004-08-22 20:49 ` Jens Maurer [this message]
2004-08-18 7:00 ` Ingo Molnar
2004-08-18 7:11 ` Ingo Molnar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=412906E1.4070003@gmx.net \
--to=jens.maurer@gmx.net \
--cc=arjanv@redhat.com \
--cc=davem@redhat.com \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox