From: Jens Axboe <jens.axboe@oracle.com>
To: Wu Fengguang <fengguang.wu@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Linux Memory Management List <linux-mm@kvack.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 01/11] readahead: limit readahead size for small devices
Date: Wed, 3 Feb 2010 09:23:00 +0100
Message-ID: <20100203082259.GO5733@kernel.dk>
In-Reply-To: <20100203061350.GA22890@localhost>

On Wed, Feb 03 2010, Wu Fengguang wrote:
> On Wed, Feb 03, 2010 at 03:38:26AM +0800, Jens Axboe wrote:
> > On Tue, Feb 02 2010, Wu Fengguang wrote:
> > > Linus reports a _really_ small & slow (505kB, 15kB/s) USB device,
> > > on which blkid runs unpleasantly slow. He manages to optimize the blkid
> > > reads down to 1kB+16kB, but still kernel read-ahead turns it into 48kB.
> > > 
> > >      lseek 0,    read 1024   => readahead 4 pages (start of file)
> > >      lseek 1536, read 16384  => readahead 8 pages (page contiguous)
> > > 
> > > The readahead heuristics involved here are reasonable ones in general.
> > > So it's good to fix blkid with fadvise(RANDOM), as Linus already did.
> > > 
> > > For the kernel part, Linus suggests:
> > >   So maybe we could be less aggressive about read-ahead when the size of
> > >   the device is small? Turning a 16kB read into a 64kB one is a big deal,
> > >   when it's about 15% of the whole device!
> > > 
> > > This looks reasonable: smaller devices tend to be slower (USB sticks as
> > > well as micro/mobile/old hard disks).
> > > 
> > > Given that the non-rotational attribute is not always reported, we can
> > > take disk size as a max readahead size hint. We use a formula that
> > > generates the following concrete limits:
> > > 
> > >         disk size    readahead size
> > >      (scale by 4)      (scale by 2)
> > >                2M                4k
> > >                8M                8k
> > >               32M               16k
> > >              128M               32k
> > >              512M               64k
> > >                2G              128k
> > >                8G              256k
> > >               32G              512k
> > >              128G             1024k
> > 
> > I'm not sure the size part makes a ton of sense. You can have really
> > fast small devices, and large slow devices. One real-world example is
> > the Sun FMod SSD devices, which are only 22GB in size but are faster
> > than the Intel X25-E SLC disks.
> > 
> > What makes it even worse for these devices is that they are often
> > attached to fatter controllers than ahci, where command overhead is
> > larger.
> 
> Ah, good to know about this fast 22GB SSD.
> 
> > Running your script on such a device yields (I enlarged the read-count
> > by 2, makes it more reproducible):
> > 
> > MARVELL SD88SA02 MP1F
> > 
> > rasize	1st             2nd
> > ------------------------------------------------------------------
> >   4k	 41 MB/s	 41 MB/s
> >  16k	 85 MB/s	 81 MB/s
> >  32k	102 MB/s	109 MB/s
> >  64k	125 MB/s	144 MB/s
> > 128k	183 MB/s	185 MB/s
> > 256k	216 MB/s	216 MB/s
> > 512k	216 MB/s	236 MB/s
> > 1024k	251 MB/s	252 MB/s
> >   2M	258 MB/s	258 MB/s
> >   4M	266 MB/s	266 MB/s
> >   8M	266 MB/s	266 MB/s
> > 
> > So for that device, 1M-2M looks like the sweet spot, though it takes
> > 4-8M to reach full throughput.
> 
> Thanks for the data! I updated the formula to (16GB device => 1MB
> readahead). However the limit in this patch is only true for <4GB
> devices, since the default readahead size is merely 512KB.

Thanks Wu, you can add my acked-by.

-- 
Jens Axboe
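
The size table quoted above (disk size scaling by 4, readahead size
scaling by 2, starting from 2M => 4k) is consistent with a square-root
relationship: limit = sqrt(8 * disk_size), both in bytes. The sketch
below only reproduces that table; it is not the code from the patch,
and the name readahead_limit() is made up for illustration. It also
does not model the updated formula (16GB => 1MB) mentioned later in
the thread.

```python
import math

KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def readahead_limit(disk_bytes: int) -> int:
    """Readahead cap in bytes implied by the quoted table (hypothetical helper).

    The table starts at 2 MiB -> 4 KiB and doubles the cap for every
    4x increase in disk size, which is cap = sqrt(8 * size).
    """
    return math.isqrt(8 * disk_bytes)


if __name__ == "__main__":
    # Walk the same disk sizes as the quoted table: 2M, 8M, ..., 128G.
    size = 2 * MIB
    while size <= 128 * GIB:
        print(f"{size // MIB:>8} MiB -> {readahead_limit(size) // KIB:>5} KiB")
        size *= 4
```

Under this reading of the table, a 22 GiB device would be capped
somewhat below 512k, which is well under the 1M-2M range that the
benchmark above suggests for the fast SSD.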

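The userspace side of the fix mentioned above, fadvise(RANDOM), can be
sketched as follows. With 4k pages, the two quoted reads trigger 4 + 8
pages of readahead, which is the 48kB figure in the report. This is
only an illustration of the system call, not the blkid change itself:
probe_reads() and make_demo_file() are hypothetical helpers, and
pread() is used here in place of the lseek()+read() pairs shown in the
report.

```python
import os
import tempfile


def probe_reads(path):
    """Read 1 KiB at offset 0 and 16 KiB at offset 1536, hinting random access."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # POSIX_FADV_RANDOM tells the kernel to expect random access;
        # Linux documents this as disabling readahead for the open file.
        # The call is skipped on platforms that do not provide it.
        if hasattr(os, "posix_fadvise"):
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_RANDOM)
        head = os.pread(fd, 1024, 0)
        body = os.pread(fd, 16384, 1536)
        return head, body
    finally:
        os.close(fd)


def make_demo_file(size=64 * 1024):
    """Create a zero-filled scratch file standing in for a small device."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"\0" * size)
        return f.name


if __name__ == "__main__":
    demo = make_demo_file()
    try:
        head, body = probe_reads(demo)
        print(len(head), len(body))
    finally:
        os.remove(demo)
```

The hint applies per open file description, so it has to be issued
after each open() and before the first read.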