From mboxrd@z Thu Jan 1 00:00:00 1970
From: Chris Mason
Subject: Re: btrfs: why default 4M readahead size?
Date: Thu, 18 Mar 2010 08:53:13 -0400
Message-ID: <20100318125313.GA14074@think>
References: <20100318014257.GA30963@sli10-desk.sh.intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: linux-btrfs@vger.kernel.org, jens.axboe@oracle.com, fengguang.wu@intel.com
To: Shaohua Li
Return-path:
In-Reply-To: <20100318014257.GA30963@sli10-desk.sh.intel.com>
List-ID:

On Thu, Mar 18, 2010 at 09:42:57AM +0800, Shaohua Li wrote:
> Btrfs uses below equation to calculate ra_pages:
> 	fs_info->bdi.ra_pages = max(fs_info->bdi.ra_pages,
> 				4 * 1024 * 1024 / PAGE_CACHE_SIZE);
> is the max() a typo of min()? This makes the readahead size is 4M by default,
> which is too big.

Looks like things have changed since I tuned that number. Fengguang has
been busy ;)

> I have a system with 16 CPU, 6G memory and 12 sata disks. I create a btrfs for
> each disk, so this isn't a raid setup. The test is fio, which has 12 tasks to
> access 12 files for each disk. The fio test is mmap sequential read. I measure
> the performance with different readahead size:
> ra size	io throughput
> 4M		268288 k/s
> 2M		367616 k/s
> 1M		431104 k/s
> 512K		474112 k/s
> 256K		512000 k/s
> 128K		538624 k/s
> The 4M default readahead size has poor performance.
> I also does sync sequential read test, the test difference in't that big. But
> the 4M case still has about 10% drop compared to the 512k case.

I'm surprised the 4M is so much slower. At any rate, the larger size
was selected because btrfs checksumming means we need a bigger buffer to
keep the disks saturated. Were you on a fancy intel box with hardware
crc32c enabled?

>
> One might argue how about the case memory isn't tight. I tried only run a
> one-disk setup with only one task. The 4M ra almost has no difference with the
> 128K ra. I guess the 128k default ra size for backing dev is carefuly choosed
> to work with popular disks.
> So my question is why we have a default 4M readahead size even with noraid case?

I'm happy to tune it down if lower numbers are more appropriate now,
thanks for trying this!

-chris
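
For anyone following the arithmetic in the quoted equation, here is a
minimal standalone sketch of what max() versus min() does to ra_pages.
It is not the btrfs source: the helper names are made up for
illustration, and it assumes 4K pages (PAGE_CACHE_SIZE = 4096) and the
128K backing-dev default mentioned in the thread.

```c
/* Assumption: 4 KiB pages; PAGE_CACHE_SIZE depends on the architecture. */
#define PAGE_CACHE_SIZE 4096UL

/* The 4M figure from the quoted equation, expressed in pages. */
#define BTRFS_RA_PAGES (4UL * 1024 * 1024 / PAGE_CACHE_SIZE)

/* Hypothetical helper: convert a readahead size in bytes to pages. */
static unsigned long ra_bytes_to_pages(unsigned long bytes)
{
	return bytes / PAGE_CACHE_SIZE;
}

/* What the quoted code does: 4M acts as a lower bound on ra_pages. */
static unsigned long ra_pages_with_max(unsigned long bdi_ra_pages)
{
	return bdi_ra_pages > BTRFS_RA_PAGES ? bdi_ra_pages : BTRFS_RA_PAGES;
}

/* The min() alternative asked about: 4M acts as an upper bound. */
static unsigned long ra_pages_with_min(unsigned long bdi_ra_pages)
{
	return bdi_ra_pages < BTRFS_RA_PAGES ? bdi_ra_pages : BTRFS_RA_PAGES;
}
```

Under those assumptions the 128K default is 32 pages, so max() raises
it to 1024 pages (4M) while min() would leave it at 32 pages (128K).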