From: Andrea Arcangeli <andrea@suse.de>
To: Jeff Garzik <jgarzik@pobox.com>
Cc: Bartlomiej Zolnierkiewicz <B.Zolnierkiewicz@elka.pw.edu.pl>,
Jens Axboe <axboe@suse.de>,
William Lee Irwin III <wli@holomorphy.com>,
Nick Piggin <nickpiggin@yahoo.com.au>,
linux-ide@vger.kernel.org,
Linux Kernel <linux-kernel@vger.kernel.org>,
Andrew Morton <akpm@osdl.org>
Subject: Re: [PATCH] speed up SATA
Date: Mon, 29 Mar 2004 02:55:02 +0200
Message-ID: <20040329005502.GG3039@dualathlon.random>
In-Reply-To: <406720A7.1050501@pobox.com>
On Sun, Mar 28, 2004 at 01:59:51PM -0500, Jeff Garzik wrote:
> Bartlomiej Zolnierkiewicz wrote:
> >On Sunday 28 of March 2004 20:30, Jens Axboe wrote:
> >>Making something user tunable is usually not the best idea, if you can
> >>deduct these things automagically instead. So whether this is the best
> >>idea, depends on which way you want to go.
> >
> >
> >I think it's the best idea for now, long-term we are better with automagic.
>
>
> Mostly agreed:
>
> Like I mentioned in the last message, the IO scheduler and the VM should
This is not an I/O scheduler or VM issue.
The max size of a request is something that should be set internally by
the blkdev layer (at a lower level than the I/O scheduler or the VM
layer).
The point is that if you read contiguously from disk with a 1M or 32M
request size, the wall time speed difference will be maybe 0.01% or so.
Running 100 irqs per second or 3 irqs per second doesn't make any
measurable difference. Same goes for keeping the I/O pipeline full: 1M
is more than enough to go at the speed of the storage with minimal cpu
overhead. We already waste 900 irqs per second just in the timer irq and
another 900 irqs per second per-cpu in the per-cpu local interrupts in
smp.
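To put rough numbers on the interrupt argument: the figures of 100 and 3 irqs per second match a disk streaming at about 100 MB/s, which is an assumption made here for illustration and not a measurement. The sketch below also assumes one completion interrupt per request; the helper name is hypothetical.

```python
MIB = 1024 * 1024

def completion_irqs_per_second(throughput_bytes_per_s, request_bytes):
    """Interrupt rate, assuming exactly one completion irq per request."""
    return throughput_bytes_per_s / request_bytes

# Assumed streaming rate (illustrative only, not from the thread).
throughput = 100 * MIB

for size in (128 * 1024, 512 * 1024, 1 * MIB, 32 * MIB):
    rate = completion_irqs_per_second(throughput, size)
    print(f"{size // 1024:>6}k requests: {rate:8.3f} irqs/s")
```

Under these assumptions every request size from 512k upward produces an interrupt rate well below the roughly 900 timer irqs per second mentioned above.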
In 2.4, reaching 512k DMA units helped a lot, but going past 512k
didn't help in my measurements. Maybe 1M is needed these days (as Jens
suggested), but >1M still sounds like overkill and I completely agree
with Jens about that.
If one day things change and the harddisk requires 32M large
DMA transactions to keep up with the speed of the disk, the thing should
still be solved during disk discovery inside the blkdev layer. The
"automagic" suggestions discussed by Jamie and Jens should just be
benchmarks internal to the blkdev layer, trying to read contiguously
first with 1M, then 2M, then 4M, etc. until the speed difference goes
below 1%, or some similar "autotune" algorithm.
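The doubling loop described above can be sketched in userspace Python. This is only an illustration of the idea, not kernel code and not anything proposed in the thread: `autotune_request_size`, `measure` and the toy device model are all hypothetical names, and the model's 64 MiB/s peak rate and 0.1 ms per-request overhead are invented numbers.

```python
MIB = 1024 * 1024

def autotune_request_size(measure, start=1 * MIB, limit=32 * MIB, min_gain=0.01):
    """Double the request size until doubling gains less than min_gain.

    measure(size) is a caller-supplied benchmark that returns the
    contiguous-read throughput observed with that request size.
    Returns the smallest size tried whose doubling did not pay off.
    """
    size = start
    speed = measure(size)
    while size * 2 <= limit:
        next_speed = measure(size * 2)
        if next_speed < speed * (1.0 + min_gain):
            break  # gain below threshold: keep the smaller size
        size, speed = size * 2, next_speed
    return size

def toy_device(size, peak=64 * MIB, overhead_s=0.0001):
    """Invented model: fixed per-request overhead plus transfer time."""
    return size / (overhead_s + size / peak)

if __name__ == "__main__":
    chosen = autotune_request_size(toy_device, start=128 * 1024)
    print(f"chosen request size: {chosen // 1024}k")
```

A real implementation would replace `toy_device` with timed contiguous reads during disk discovery, and would have to deal with measurement noise, which this sketch ignores.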
But this is definitely not an I/O scheduler or VM issue; it's all about
discovering the minimal DMA transaction size that provides peak bulk I/O
performance for a certain device. The smaller the size, the better the
latencies and the less ram will be pinned at the same time (e.g. think of
a 64M machine writing in 32M chunks at a time).
Of course if we ever deal with hardware where 32M requests make a
difference, then we may have to add overrides to the I/O scheduler to
lower the max_requests (i.e. like my obsolete max_bomb_segments did).
But I expect that by default contiguous I/O will use the max_sectors
chosen by the blkdev layer (not chosen by the VM or I/O scheduler) to
guarantee the best bulk I/O performance as usual (the I/O scheduler
option would be just an optional override). max_sectors is just
about using a sane DMA transaction size, big enough to run at
disk speed without measurable cpu overhead, but not so big that it
hurts latencies. Overkill huge DMA transactions might
even stall the cpu when accessing the mem bus (though I'm not a
hardware guru so this is just a guess).
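The latency and pinned-memory side of the tradeoff can be estimated with simple arithmetic. The sketch below assumes a disk streaming at 50 MiB/s (an invented figure) and a single request in flight; both helper names are hypothetical.

```python
MIB = 1024 * 1024

def request_service_time(request_bytes, throughput_bytes_per_s):
    """Seconds the device is busy with one request at streaming speed."""
    return request_bytes / throughput_bytes_per_s

def pinned_fraction(request_bytes, ram_bytes, in_flight=1):
    """Share of RAM pinned by in-flight requests of the given size."""
    return request_bytes * in_flight / ram_bytes

# Assumed values (illustrative only): 50 MiB/s disk, 64M machine.
throughput = 50 * MIB
ram = 64 * MIB

for size in (512 * 1024, 1 * MIB, 32 * MIB):
    t_ms = request_service_time(size, throughput) * 1000
    pinned = pinned_fraction(size, ram) * 100
    print(f"{size // 1024:>6}k: {t_ms:7.1f} ms per request, "
          f"{pinned:5.1f}% of RAM pinned")
```

Under these assumptions a request queued behind a single in-flight 32M transfer waits hundreds of milliseconds, and that one transfer pins half of a 64M machine, while a 512k transfer costs about 10 ms and under 1% of RAM.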
So far there was no need to autotune it, and settings like 512k were
optimal.
Don't get me wrong, I find it extremely great that you can now raise the
IDE request size to a value like 512k; the 128k limit was the ugliest
thing about IDE ever. But you provided zero evidence that going past 512k
is beneficial at all, and your bootup log showing 32M is anything but
exciting. I'd be a lot more excited to see 512k there.
I expect that the boost from 128k to 512k is very significant, but I
expect that going from 512k to 32M will be just a total waste of latency
with zero gain in throughput. So unless you measure a speed difference
from 512k to 32M, I recommend setting it to 512k for the short term,
like most other drivers do for the same reasons.