From: Jeff Garzik <jgarzik@pobox.com>
To: Andrea Arcangeli <andrea@suse.de>
Cc: Bartlomiej Zolnierkiewicz <B.Zolnierkiewicz@elka.pw.edu.pl>,
Jens Axboe <axboe@suse.de>,
William Lee Irwin III <wli@holomorphy.com>,
Nick Piggin <nickpiggin@yahoo.com.au>,
linux-ide@vger.kernel.org,
Linux Kernel <linux-kernel@vger.kernel.org>,
Andrew Morton <akpm@osdl.org>
Subject: Re: [PATCH] speed up SATA
Date: Sun, 28 Mar 2004 23:02:43 -0500
Message-ID: <40679FE3.3080007@pobox.com>
In-Reply-To: <20040329005502.GG3039@dualathlon.random>
Andrea Arcangeli wrote:
> On Sun, Mar 28, 2004 at 01:59:51PM -0500, Jeff Garzik wrote:
>
>>Bartlomiej Zolnierkiewicz wrote:
>>
>>>On Sunday 28 of March 2004 20:30, Jens Axboe wrote:
>>>
>>>>Making something user tunable is usually not the best idea, if you can
>>>>deduce these things automagically instead. So whether this is the best
>>>>idea depends on which way you want to go.
>>>
>>>
>>>I think it's the best idea for now, long-term we are better with automagic.
>>
>>
>>Mostly agreed:
>>
>>Like I mentioned in the last message, the IO scheduler and the VM should
>
>
> this is not an I/O scheduler or VM issue.
This involves the interaction of three layers: the blkdev layer, the IO scheduler, and the VM.
VM: initiates most of the writeback, and is often the main initiator of
large requests. The VM thresholds also serve to keep request size
manageable. See e.g.
http://marc.theaimsgroup.com/?l=linux-kernel&m=108043321326801&w=2
IO scheduler: the place to make the decision about whether the request
latency is meeting expectations, etc. It should be straightforward to
use a windowing algorithm to slowly increase the request size until (a)
latency limits are reached, (b) hardware limits are reached, or (c) VM
thresholds are reached.
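To make the windowing idea concrete, here is a rough userspace sketch, not kernel code. The struct, the function name, the 4k growth step and the halve-on-miss backoff are all assumptions for illustration; only the three stopping conditions (a), (b), (c) come from the text above.

```c
#include <assert.h>

#define WINDOW_STEP_SECTORS 8u	/* assumed growth step: 8 * 512 bytes = 4k */
#define WINDOW_MIN_SECTORS  8u	/* assumed floor for the window */

/* Hypothetical per-queue state; none of these names are kernel API. */
struct req_window {
	unsigned int cur_sectors;	/* current request size cap */
	unsigned int hw_max_sectors;	/* (b) hardware limit */
	unsigned int vm_max_sectors;	/* (c) VM threshold */
	unsigned int latency_limit_us;	/* (a) latency budget */
};

/*
 * Call once per completed request.  Grows the window by one step while
 * latency stays within budget, and halves it when the budget is missed
 * (the backoff policy is an assumption, the mail only says "until").
 * Returns the new cap in sectors.
 */
static unsigned int window_update(struct req_window *w,
				  unsigned int observed_latency_us)
{
	unsigned int ceiling = w->hw_max_sectors < w->vm_max_sectors ?
			       w->hw_max_sectors : w->vm_max_sectors;

	assert(ceiling >= WINDOW_MIN_SECTORS);

	if (observed_latency_us > w->latency_limit_us) {
		/* (a) latency limit reached: back off */
		w->cur_sectors /= 2;
		if (w->cur_sectors < WINDOW_MIN_SECTORS)
			w->cur_sectors = WINDOW_MIN_SECTORS;
	} else if (w->cur_sectors < ceiling) {
		/* still within budget: grow slowly */
		w->cur_sectors += WINDOW_STEP_SECTORS;
	}

	/* (b) and (c): never exceed the hardware or VM ceiling */
	if (w->cur_sectors > ceiling)
		w->cur_sectors = ceiling;
	return w->cur_sectors;
}
```

The point of the sketch is only that the policy lives in one place above the driver; a real IO scheduler would presumably average latency over many requests rather than react to a single sample.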
Ultimately there must be some -global- management of I/O, otherwise VM
cannot survive, e.g. 128k requests on 1000 disks :)
> the max size of a request is something that should be set internally to
> the blkdev layer (at a lower level than the I/O scheduler or the VM
> layer).
Yes, I agree.
My point is there are two maximums:
1) the hardware limit
2) the limit that "makes sense", e.g. 512k or 1M for most
The driver should only care about #1, and should be "told" #2.
A very, very, very minimal implementation could be this:
--- 1.138/include/linux/blkdev.h Fri Mar 12 04:33:07 2004
+++ edited/include/linux/blkdev.h Sun Mar 28 22:44:15 2004
@@ -607,6 +607,24 @@
extern void drive_stat_acct(struct request *, int, int);
+#define BLK_DISK_MAX_SECTORS 2048
+#define BLK_FLOPPY_MAX_SECTORS 64
Hardcoding such a maximum in the driver is inflexible and IMO incorrect.
> If one day things will change and the harddisk will require 32M large
> DMA transactions to keep up with the speed of the disk, the thing should
> be still solved during disk discovery inside the blkdev layer. The
32M is probably too large, but 1M is probably too small for:
* a RAID array with 33 disks, that presents itself as a single SATA disk.
* solid-state storage: battery-backed RAM.
These things like bigger requests, and were designed to solve a lot of
the latency problems in hardware.
> "automagic" suggestions discussed by Jamie and Jens should be just
> benchmarks internal to the blkdev layer, trying to read contiguously
> first with 1M then 2M then 4M etc.. until the speed difference goes
> below 1% or whatever similar "autotune" algorithm.
Yes, agreed.
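For illustration, the doubling autotune described in the quoted text could look roughly like the userspace sketch below. The `throughput_fn` callback, `autotune_request_size()` and the toy throughput model are hypothetical; a real version would time actual contiguous reads inside the blkdev layer.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical benchmark hook: returns throughput (any consistent unit)
 * for contiguous reads issued with requests of req_bytes.
 */
typedef double (*throughput_fn)(size_t req_bytes, void *ctx);

/*
 * Double the request size starting at start_bytes until doubling again
 * would gain less than min_gain (0.01 == 1%) or would exceed max_bytes.
 * Returns the chosen request size in bytes.
 */
static size_t autotune_request_size(throughput_fn measure, void *ctx,
				    size_t start_bytes, size_t max_bytes,
				    double min_gain)
{
	size_t best = start_bytes;
	double best_rate;

	assert(measure && start_bytes > 0 && start_bytes <= max_bytes);
	best_rate = measure(best, ctx);

	while (best <= max_bytes / 2) {
		size_t next = best * 2;
		double rate = measure(next, ctx);

		/* Stop once the speed difference drops below min_gain. */
		if (rate < best_rate * (1.0 + min_gain))
			break;
		best = next;
		best_rate = rate;
	}
	return best;
}

/*
 * Toy model for demonstration only: a fixed per-request overhead,
 * expressed in bytes of transfer time, is amortized as requests grow.
 */
static double toy_rate(size_t req_bytes, void *ctx)
{
	double overhead_bytes = *(double *)ctx;

	return (double)req_bytes / ((double)req_bytes + overhead_bytes);
}
```

With the toy model and a 64k overhead, starting at 1M, the loop accepts 2M and 4M and stops before 8M, because that last doubling gains under 1%.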
My main goal is to -not- worry about this in the low-level driver. If
you and Jens think 1M requests are maximum for disks, then put that in
the _blkdev_ layer not my driver :)
Long term, I would like to see something like
--- 1.138/include/linux/blkdev.h Fri Mar 12 04:33:07 2004
+++ edited/include/linux/blkdev.h Sun Mar 28 23:01:42 2004
@@ -337,7 +337,8 @@
*/
unsigned long nr_requests; /* Max # of requests */
- unsigned short max_sectors;
+ unsigned short max_sectors; /* blk layer-chosen */
+ unsigned short max_hw_sectors; /* hardware limit */
unsigned short max_phys_segments;
unsigned short max_hw_segments;
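As a sketch of how the two fields might be used together (userspace C, not part of the patch): the driver reports only its hardware limit, and the block layer derives max_sectors from it. The struct and helper names here are made up for illustration; BLK_DISK_MAX_SECTORS takes the value from the first patch above.

```c
#include <assert.h>

#define BLK_DISK_MAX_SECTORS 2048	/* from the first patch: 2048 * 512 bytes = 1M */

/* Userspace stand-in for the two fields in the diff; not struct request_queue. */
struct queue_sketch {
	unsigned short max_sectors;	/* blk layer-chosen */
	unsigned short max_hw_sectors;	/* hardware limit */
};

/*
 * Hypothetical helper: the driver calls this with limit #1 only.
 * The block layer records it and picks limit #2, never above #1.
 */
static void sketch_set_max_hw_sectors(struct queue_sketch *q,
				      unsigned short hw_sectors)
{
	assert(hw_sectors > 0);

	q->max_hw_sectors = hw_sectors;
	q->max_sectors = hw_sectors < BLK_DISK_MAX_SECTORS ?
			 hw_sectors : BLK_DISK_MAX_SECTORS;
}
```

A driver for hardware that can take 65535 sectors would then end up with max_sectors of 2048, while an older controller limited to 256 sectors keeps 256 for both, and neither driver hardcodes the "makes sense" value.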