From: Andrew Morton <akpm@zip.com.au>
To: Hanna Linder <hannal@us.ibm.com>
Cc: "Martin J. Bligh" <Martin.Bligh@us.ibm.com>,
Keith Mannthey <mannthey@us.ibm.com>,
haveblue@us.ibm.com, lse-tech@lists.sourceforge.net,
linux-kernel@vger.kernel.org
Subject: Re: scalable kmap (was Re: vm lock contention reduction) (fwd)
Date: Wed, 10 Jul 2002 20:06:22 -0700 [thread overview]
Message-ID: <3D2CF62E.949F20B4@zip.com.au> (raw)
In-Reply-To: 40740000.1026339488@w-hlinder
Hanna Linder wrote:
>
> ...
> Andrew and Martin,
>
> I ran this updated patch on 2.5.25 with dbench on
> the 8-way with 4 Gb of memory compared to clean 2.5.25.
> I saw a significant improvement in throughput about 15%
> (averaged over 5 runs each).
Thanks, Hanna.
The kernel compile test isn't a particularly heavy user of
copy_to_user(), whereas with RAM-only dbench, copy_*_user()
is almost the only thing it does. So that makes sense.
Tried dbench on the 2.5G 4xPIII Xeon: no improvement at all.
This thing seems to have quite poor memory bandwidth - maybe
250 megabyte/sec downhill with the wind at its tail.
> Included is the pretty picture (akpm-2525.png) the
> data that picture came from (akpm-2525.data) and the raw
> results of the runs with min/max and timing results
> (2525akpmkmaphi and 2525clnhi).
> I believe the drop at around 64 clients is caused by
> memory swapping leading to increased disk accesses since the
> time increased by 200% in direct correlation with the decreased
> throughput.
Yes. The test went to disk. There are two reasons why
it will do this:
1: Some dirty data was in memory for more than 30-35 seconds or
2: More than 40% of memory is dirty.
In your case, the 64-client run was taking 32 seconds. After that
the disks lit up. Once that happens, dbench isn't a very good
benchmark. It's an excellent benchmark when it's RAM-only
though. Very repeatable and hits lots of code paths which matter.
You can run more clients before the disk I/O cuts in by
increasing /proc/sys/vm/dirty_expire_centisecs and
/proc/sys/vm/dirty_*_ratio.
The patch you tested only uses the atomic kmap across generic_file_read.
It is reasonable to hope that another 15% or morecan be gained by holding
an atomic kmap across writes as well. On your machine ;)
Here's what oprofile says about `dbench 40' with that patch:
c0140f1c 402 0.609543 __block_commit_write
c013dfd4 413 0.626222 vfs_write
c01402cc 431 0.653515 __find_get_block
c013a895 472 0.715683 .text.lock.highmem
c017fe30 494 0.749041 ext2_get_block
c012cef0 564 0.85518 unlock_page
c013ee80 564 0.85518 fget
c01079f4 571 0.865794 apic_timer_interrupt
c01e8ecc 594 0.900669 radix_tree_lookup
c013da90 597 0.905218 generic_file_llseek
c01514b4 607 0.92038 __d_lookup
c0106ff8 687 1.04168 system_call
c013a02c 874 1.32523 kunmap_high
c0148388 922 1.39801 link_path_walk
c0140b00 1097 1.66336 __block_prepare_write
c01346d0 1138 1.72552 rmqueue
c01127ac 1243 1.88473 smp_apic_timer_interrupt
c0139eb8 1514 2.29564 kmap_high
c0105368 6188 9.38272 poll_idle
c012d8a8 9564 14.5017 file_read_actor
c012ea70 21326 32.3361 generic_file_write
Not taking a kmap in generic_file_write is a biggish patch - it
means changing the prepare_write/commit_write API and visiting
all filesystems. The API change would be: core kernel no longer
holds a kmap across prepare/commit. If the filesystem wants one
for its own purposes then it gets to do it for itself, possibly in
its prepare_write().
I think I'd prefer to get some additional testing and understanding
before undertaking that work. It arguably makes sense as a small
cleanup/speedup anyway, but that's not a burning issue.
hmm. I'll do just ext2, and we can take another look then.
-
next prev parent reply other threads:[~2002-07-11 2:56 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <237170000.1026317715@flay>
2002-07-10 22:18 ` scalable kmap (was Re: vm lock contention reduction) (fwd) Hanna Linder
2002-07-11 3:06 ` Andrew Morton [this message]
2002-07-11 5:19 ` Andrew Morton
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=3D2CF62E.949F20B4@zip.com.au \
--to=akpm@zip.com.au \
--cc=Martin.Bligh@us.ibm.com \
--cc=hannal@us.ibm.com \
--cc=haveblue@us.ibm.com \
--cc=linux-kernel@vger.kernel.org \
--cc=lse-tech@lists.sourceforge.net \
--cc=mannthey@us.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.