From: Andrew Morton <akpm@linux-foundation.org>
To: Wu Fengguang <fengguang.wu@intel.com>
Cc: Linux Memory Management List <linux-mm@kvack.org>,
<linux-fsdevel@vger.kernel.org>,
Li Shaohua <shaohua.li@intel.com>,
Clemens Ladisch <clemens@ladisch.de>,
Jens Axboe <jens.axboe@oracle.com>,
Rik van Riel <riel@redhat.com>,
LKML <linux-kernel@vger.kernel.org>,
Andi Kleen <andi@firstfloor.org>
Subject: Re: [PATCH 1/8] block: limit default readahead size for small devices
Date: Mon, 21 Nov 2011 14:52:47 -0800 [thread overview]
Message-ID: <20111121145247.0e37dc36.akpm@linux-foundation.org> (raw)
In-Reply-To: <20111121093846.121502745@intel.com>
On Mon, 21 Nov 2011 17:18:20 +0800
Wu Fengguang <fengguang.wu@intel.com> wrote:
> Linus reports a _really_ small & slow (505kB, 15kB/s) USB device,
> on which blkid runs unpleasantly slow. He manages to optimize the blkid
> reads down to 1kB+16kB, but still kernel read-ahead turns it into 48kB.
>
> lseek 0, read 1024 => readahead 4 pages (start of file)
I'm disturbed that the code did a 4 page (16kbyte?) readahead after an
lseek. Given the high probability that the next read will occur after
a second lseek, that's a mistake.
Was an lseek to offset 0 special-cased?
> lseek 1536, read 16384 => readahead 8 pages (page contiguous)
>
> The readahead heuristics involved here are reasonable ones in general.
> So it's good to fix blkid with fadvise(RANDOM), as Linus already did.
>
> For the kernel part, Linus suggests:
> So maybe we could be less aggressive about read-ahead when the size of
> the device is small? Turning a 16kB read into a 64kB one is a big deal,
> when it's about 15% of the whole device!
>
> This looks reasonable: smaller device tend to be slower (USB sticks as
> well as micro/mobile/old hard disks).
Spose so. Obviously there are other characteristics which should be
considered when choosing a readaahead size, but one of them can be disk
size and that's what this change does.
In a better world, userspace would run a
work-out-what-readahead-size-to-use script each time a distro is
installed and when new storage devices are added/detected. Userspace
would then remember that readahead size for subsequent bootups.
In the real world, we shovel guaranteed-to-be-wrong guesswork into the
kernel and everyone just uses the results. Sigh.
> --- linux-next.orig/block/genhd.c 2011-10-31 00:13:51.000000000 +0800
> +++ linux-next/block/genhd.c 2011-11-18 11:27:08.000000000 +0800
> @@ -623,6 +623,26 @@ void add_disk(struct gendisk *disk)
> WARN_ON(retval);
>
> disk_add_events(disk);
> +
> + /*
> + * Limit default readahead size for small devices.
> + * disk size readahead size
> + * 1M 8k
> + * 4M 16k
> + * 16M 32k
> + * 64M 64k
> + * 256M 128k
> + * 1G 256k
> + * 4G 512k
> + * 16G 1024k
> + * 64G 2048k
> + * 256G 4096k
> + */
> + if (get_capacity(disk)) {
> + unsigned long size = get_capacity(disk) >> 9;
get_capacity() returns sector_t. This expression will overflow with a
2T disk. I'm not sure if we successfully support 2T disks on 32-bit
machines, but changes like this will guarantee that we don't :)
> + size = 1UL << (ilog2(size) / 2);
I think there's a rounddown_pow_of_two() hiding in that expression?
> + bdi->ra_pages = min(bdi->ra_pages, size);
I don't have a clue why that min() is in there. It needs a comment,
please.
> + }
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2011-11-21 22:52 UTC|newest]
Thread overview: 49+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-11-21 9:18 [PATCH 0/8] readahead stats/tracing, backwards prefetching and more Wu Fengguang
2011-11-21 9:18 ` [PATCH 1/8] block: limit default readahead size for small devices Wu Fengguang
2011-11-21 10:00 ` Christoph Hellwig
2011-11-21 11:24 ` Wu Fengguang
2011-11-21 12:47 ` Andi Kleen
2011-11-21 14:46 ` Jeff Moyer
2011-11-21 22:52 ` Andrew Morton [this message]
2011-11-22 14:23 ` Jeff Moyer
2011-11-23 12:18 ` Wu Fengguang
2011-11-21 9:18 ` [PATCH 2/8] readahead: make default readahead size a kernel parameter Wu Fengguang
2011-11-21 10:01 ` Christoph Hellwig
2011-11-21 11:35 ` Wu Fengguang
2011-11-24 22:28 ` Jan Kara
2011-11-25 0:36 ` Dave Chinner
2011-11-28 2:39 ` Wu Fengguang
2011-11-30 13:04 ` Christian Ehrhardt
2011-11-30 13:29 ` Wu Fengguang
2011-11-30 16:09 ` Jan Kara
2011-11-21 13:16 ` Namhyung Kim
2011-11-21 13:24 ` Wu Fengguang
2011-11-21 9:18 ` [PATCH 3/8] readahead: replace ra->mmap_miss with ra->ra_flags Wu Fengguang
2011-11-21 11:04 ` Steven Whitehouse
2011-11-21 11:42 ` Wu Fengguang
2011-11-21 23:01 ` Andrew Morton
2011-11-23 12:47 ` Wu Fengguang
2011-11-23 20:31 ` Andrew Morton
2011-11-29 3:42 ` Wu Fengguang
2011-11-21 9:18 ` [PATCH 4/8] readahead: record readahead patterns Wu Fengguang
2011-11-21 23:19 ` Andrew Morton
2011-11-29 2:40 ` Wu Fengguang
2011-11-21 9:18 ` [PATCH 5/8] readahead: add /debug/readahead/stats Wu Fengguang
2011-11-21 14:17 ` Andi Kleen
2011-11-22 14:14 ` Wu Fengguang
2011-11-21 23:29 ` Andrew Morton
2011-11-21 23:32 ` Andi Kleen
2011-11-29 3:23 ` Wu Fengguang
2011-11-29 4:49 ` Andrew Morton
2011-11-29 6:41 ` Wu Fengguang
2011-11-29 12:29 ` Wu Fengguang
2011-11-21 9:18 ` [PATCH 6/8] readahead: add debug tracing event Wu Fengguang
2011-11-21 14:01 ` Steven Rostedt
2011-11-21 9:18 ` [PATCH 7/8] readahead: basic support for backwards prefetching Wu Fengguang
2011-11-21 23:33 ` Andrew Morton
2011-11-29 3:08 ` Wu Fengguang
2011-11-21 9:18 ` [PATCH 8/8] readahead: dont do start-of-file readahead after lseek() Wu Fengguang
2011-11-21 23:36 ` Andrew Morton
2011-11-22 14:18 ` Wu Fengguang
2011-11-21 9:56 ` [PATCH 0/8] readahead stats/tracing, backwards prefetching and more Christoph Hellwig
2011-11-21 12:00 ` Wu Fengguang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20111121145247.0e37dc36.akpm@linux-foundation.org \
--to=akpm@linux-foundation.org \
--cc=andi@firstfloor.org \
--cc=clemens@ladisch.de \
--cc=fengguang.wu@intel.com \
--cc=jens.axboe@oracle.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=riel@redhat.com \
--cc=shaohua.li@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).