From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: Benny Halevy <bhalevy@panasas.com>
Cc: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>,
LKML <linux-kernel@vger.kernel.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
aneesh.kumar@linux.vnet.ibm.com
Subject: Re: iozone write 50% regression in kernel 2.6.24-rc1
Date: Mon, 12 Nov 2007 17:48:20 +0100
Message-ID: <1194886100.9713.13.camel@twins>
In-Reply-To: <47386BC4.3050403@panasas.com>
On Mon, 2007-11-12 at 17:05 +0200, Benny Halevy wrote:
> On Nov. 12, 2007, 15:26 +0200, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
> > Single socket, dual core opteron, 2GB memory
> > Single SATA disk, ext3
> >
> > x86_64 kernel and userland
> >
> > (dirty_background_ratio, dirty_ratio) tunables
> >
> > ---- (5,10) - default
> >
> > 2.6.23.1-42.fc8 #1 SMP
> >
> > 524288 4 59580 60356
> > 524288 4 59247 61101
> > 524288 4 61030 62831
> >
> > 2.6.24-rc2 #28 SMP PREEMPT
> >
> > 524288 4 49277 56582
> > 524288 4 50728 61056
> > 524288 4 52027 59758
> > 524288 4 51520 62426
> >
> >
> > ---- (20,40) - similar to your 8GB
> >
> > 2.6.23.1-42.fc8 #1 SMP
> >
> > 524288 4 225977 447461
> > 524288 4 232595 496848
> > 524288 4 220608 478076
> > 524288 4 203080 445230
> >
> > 2.6.24-rc2 #28 SMP PREEMPT
> >
> > 524288 4 54043 83585
> > 524288 4 69949 516253
> > 524288 4 72343 491416
> > 524288 4 71775 492653
> >
> > ---- (60,80) - overkill
> >
> > 2.6.23.1-42.fc8 #1 SMP
> >
> > 524288 4 208450 491892
> > 524288 4 216262 481135
> > 524288 4 221892 543608
> > 524288 4 202209 574725
> > 524288 4 231730 452482
> >
> > 2.6.24-rc2 #28 SMP PREEMPT
> >
> > 524288 4 49091 86471
> > 524288 4 65071 217566
> > 524288 4 72238 492172
> > 524288 4 71818 492433
> > 524288 4 71327 493954
> >
> >
> > While I see that the write speed as reported under .24 ~70MB/s is much
> > lower than the one reported under .23 ~200MB/s, I find it very hard to
> > believe my poor single SATA disk could actually do the 200MB/s for
> > longer than its cache 8/16 MB (not sure).
> >
> > vmstat shows that actual IO is done, even though the whole 512MB could
> > fit in cache, hence my suspicion that the ~70MB/s is the most realistic
> > of the two.
>
> Even 70 MB/s seems too high. What throughput do you see for the
> raw disk partition?
>
> Also, are the numbers above for successive runs?
> It seems like you're seeing some caching effects so
> I'd recommend using a file larger than your cache size and
> the -e and -c options (to include fsync and close in timings)
> to try to eliminate them.
------ iozone -i 0 -r 4k -s 512m -e -c
.23 (20,40)
524288 4 31750 33560
524288 4 29786 32114
524288 4 29115 31476
.24 (20,40)
524288 4 25022 32411
524288 4 25375 31662
524288 4 26407 33871
------ iozone -i 0 -r 4k -s 4g -e -c
.23 (20,40)
4194304 4 39699 35550
4194304 4 40225 36099
.24 (20,40)
4194304 4 39961 41656
4194304 4 39244 39673
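
To make the fsync-inclusive runs above easier to compare, here is a minimal sketch that averages the write column per kernel and prints the relative change. It assumes the usual iozone column order (file size in KB, record size in KB, write, rewrite, both in KB/s); the names RUNS, parse, mean_write and pct_change are made up for this illustration and are not part of iozone.

```python
from statistics import mean

# Result lines copied from the -e -c runs above.
# Assumed columns: file size (KB), record size (KB), write KB/s, rewrite KB/s.
RUNS = {
    ("512m", ".23"): "524288 4 31750 33560\n524288 4 29786 32114\n524288 4 29115 31476",
    ("512m", ".24"): "524288 4 25022 32411\n524288 4 25375 31662\n524288 4 26407 33871",
    ("4g", ".23"): "4194304 4 39699 35550\n4194304 4 40225 36099",
    ("4g", ".24"): "4194304 4 39961 41656\n4194304 4 39244 39673",
}


def parse(block):
    """Turn raw result lines into a list of {'write', 'rewrite'} dicts."""
    rows = []
    for line in block.splitlines():
        _size_kb, _reclen_kb, write, rewrite = (int(x) for x in line.split())
        rows.append({"write": write, "rewrite": rewrite})
    return rows


def mean_write(block):
    """Mean of the write column, in KB/s."""
    return mean(row["write"] for row in parse(block))


def pct_change(old, new):
    """Relative change from old to new, in percent."""
    return 100.0 * (new - old) / old


for size in ("512m", "4g"):
    old = mean_write(RUNS[(size, ".23")])
    new = mean_write(RUNS[(size, ".24")])
    print(f"{size}: .23 {old / 1024:.1f} MB/s, .24 {new / 1024:.1f} MB/s "
          f"({pct_change(old, new):+.1f}%)")
```

On these numbers the mean write rate for .24 comes out roughly 15% below .23 for the 512m file and under 1% below for the 4g file, with only two or three runs per configuration.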

Yanmin, for that benchmark you ran, what was it meant to measure?
From what I can make of it, it's just write cache benching.

One thing I don't understand is how the write numbers are so much lower
than the rewrite numbers. The iozone code (which gives me headaches,
damn what a mess) seems to suggest that the only thing that is different
is the lack of block allocation.

Linus posted a patch yesterday fixing up a regression in the ext3 bitmap
block allocator, /me goes apply that patch and rerun the tests.

> > ---- (20,40) - similar to your 8GB
> >
> > 2.6.23.1-42.fc8 #1 SMP
> >
> > 524288 4 225977 447461
> > 524288 4 232595 496848
> > 524288 4 220608 478076
> > 524288 4 203080 445230
> >
> > 2.6.24-rc2 #28 SMP PREEMPT
> >
> > 524288 4 54043 83585
> > 524288 4 69949 516253
> > 524288 4 72343 491416
> > 524288 4 71775 492653
2.6.24-rc2 +
patches/wu-reiser.patch
patches/writeback-early.patch
patches/bdi-task-dirty.patch
patches/bdi-sysfs.patch
patches/sched-hrtick.patch
patches/sched-rt-entity.patch
patches/sched-watchdog.patch
patches/linus-ext3-blockalloc.patch
524288 4 179657 487676
524288 4 173989 465682
524288 4 175842 489800
Linus' patch is the one that makes the difference here. So I'm unsure
how you bisected it down to:
04fbfdc14e5f48463820d6b9807daa5e9c92c51f
These results seem to point to
7c9e69faa28027913ee059c285a5ea8382e24b5d
as being the offending patch.
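
For a rough sense of how much of the (20,40) write gap the patched kernel closes, the following sketch compares mean write throughput across the three kernels. The values are the write column (assumed to be KB/s) from the runs quoted above; WRITE_KBPS, mean_kb and ratio are illustrative names only.

```python
# Write column (assumed KB/s) from the (20,40), 512MB runs quoted above.
WRITE_KBPS = {
    "2.6.23.1": [225977, 232595, 220608, 203080],
    "2.6.24-rc2": [54043, 69949, 72343, 71775],
    "2.6.24-rc2+patches": [179657, 173989, 175842],
}

# Mean write throughput per kernel.
mean_kb = {kernel: sum(vals) / len(vals) for kernel, vals in WRITE_KBPS.items()}


def ratio(a, b):
    """Mean write throughput of kernel a relative to kernel b."""
    return mean_kb[a] / mean_kb[b]


for kernel, value in mean_kb.items():
    print(f"{kernel:20s} {value:10.1f} KB/s")

print(f"patched vs 2.6.24-rc2: {ratio('2.6.24-rc2+patches', '2.6.24-rc2'):.2f}x")
print(f"patched vs 2.6.23.1:   {ratio('2.6.24-rc2+patches', '2.6.23.1'):.2f}x")
```

With these figures the patched kernel averages about 2.6 times the unpatched .24-rc2 write rate and about 80% of the .23 rate. This is a small sample, and the first unpatched run (54043) looks like an outlier.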