From: Shaohua Li <shaohua.li@intel.com>
To: Jens Axboe <jens.axboe@oracle.com>
Cc: Chris Mason <chris.mason@oracle.com>,
"linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>,
"Wu, Fengguang" <fengguang.wu@intel.com>
Subject: Re: btrfs: why default 4M readahead size?
Date: Fri, 19 Mar 2010 17:29:00 +0800
Message-ID: <20100319092900.GA28071@sli10-desk.sh.intel.com>
In-Reply-To: <20100319082210.GO5768@kernel.dk>

On Fri, Mar 19, 2010 at 04:22:11PM +0800, Jens Axboe wrote:
> On Fri, Mar 19 2010, Shaohua Li wrote:
> > On Fri, Mar 19, 2010 at 08:59:48AM +0800, Shaohua Li wrote:
> > > On Thu, Mar 18, 2010 at 08:53:13PM +0800, Chris Mason wrote:
> > > > On Thu, Mar 18, 2010 at 09:42:57AM +0800, Shaohua Li wrote:
> > > > > > Btrfs uses the equation below to calculate ra_pages:
> > > > > >     fs_info->bdi.ra_pages = max(fs_info->bdi.ra_pages,
> > > > > >                                 4 * 1024 * 1024 / PAGE_CACHE_SIZE);
> > > > > > Is the max() a typo for min()? This makes the readahead size 4M by default,
> > > > > > which is too big.
> > > >
> > > > Looks like things have changed since I tuned that number. Fengguang has
> > > > been busy ;)
> > > >
> > > > > > I have a system with 16 CPUs, 6G of memory and 12 SATA disks. I created a btrfs
> > > > > > on each disk, so this isn't a RAID setup. The test is fio, which has 12 tasks to
> > > > > > access 12 files for each disk. The fio test is an mmap sequential read. I measured
> > > > > > the performance with different readahead sizes:
> > > > > >   ra size   io throughput
> > > > > >   4M        268288 k/s
> > > > > >   2M        367616 k/s
> > > > > >   1M        431104 k/s
> > > > > >   512K      474112 k/s
> > > > > >   256K      512000 k/s
> > > > > >   128K      538624 k/s
> > > > > > The 4M default readahead size has poor performance.
> > > > > > I also did a sync sequential read test; the difference isn't that big, but
> > > > > > the 4M case still has about a 10% drop compared to the 512k case.
> > > >
> > > > I'm surprised the 4M is so much slower. At any rate, the larger size
> > > > was selected because btrfs checksumming means we need a bigger buffer to
> > > > keep the disks saturated. Were you on a fancy intel box with hardware
> > > > crc32c enabled?
> > > > Yes, this machine supports the SSE4.2 instructions. Let me check the result with
> > > > checksums disabled.
> > Looks like there is no big difference with checksums disabled. I reformatted the
> > disks and redid the test:
> >   128k ra: 539648 k/s
> >   4m ra:   285696 k/s
>
> 4MB is definitely a huge read-ahead size, but I do wonder why it would
> perform that much worse than a 128KB window. If you narrow your test
> down to a single disk (or something simpler, at least), how does 4MB
> compare to 128KB? With 6GB of memory, you should not run into read-ahead
> memory thrashing.
Test data for a single disk (just one run so far):
  128k ra: 88513 k/s
  4m ra:   87630 k/s
So no big difference.
Thanks,
Shaohua