From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: Block Device Caching
From: FabF
To: Markus Schaber
Cc: Helge Hafting, Linux Kernel Mailing List
In-Reply-To: <20040702140144.43088dc2@kingfisher.intern.logi-track.com>
References: <20040630002014.4970b82d@kingfisher.intern.logi-track.com>
	<40E1FDEC.6020606@techsource.com> <40E279F4.4090708@hist.no>
	<20040630123828.7d48c6e6@kingfisher.intern.logi-track.com>
	<40E543D7.9030303@hist.no>
	<20040702140144.43088dc2@kingfisher.intern.logi-track.com>
Message-Id: <1088772969.3643.11.camel@localhost.localdomain>
Date: Fri, 02 Jul 2004 14:56:09 +0200

On Fri, 2004-07-02 at 14:01, Markus Schaber wrote:
> Hi, Helge,
>
> On Fri, 02 Jul 2004 13:15:35 +0200
> Helge Hafting wrote:
>
> > [Caching problems]
> > >Any Ideas?
> > >
> > Not much. "bear" may have decided to drop cache in one case but not
> > in the other - that is valid, although strange. Take a look at what
> > block size the device uses. A mounted device uses the fs blocksize; a
> > partition (or entire disk) may be set to some different blocksize. (I
> > believe it might be the classical 512 bytes.) Caching blocks
> > with a size different from the page size (usually 4k) has extra
> > overhead and could lead to extra memory pressure and different caching
> > decisions.
> >
> > So I recommend retesting bear with a good blocksize, such as 4k, on
> > the device. That also gives somewhat better performance in general
> > than 512-byte blocks. There is a syscall for changing the block size
> > of a device.
> >
> > (Block size may also be changed by the following hack: mount a fs
> > like ext2 on the device, then umount it. Of course you need to run
> > mkfs first, so this can't be done on a device that contains data.
> > "mount" sets the device block size to the size used by the fs.
> > After umount, the device is free to be opened directly again, but it
> > retains the new blocksize.)
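As an aside, the syscall Helge mentions is the BLKBSZSET ioctl (I believe
this is also what "blockdev --setbsz" wraps). A minimal sketch of setting
a 4k soft block size, reusing the device path from the tests below;
untested on this setup:

/*
 * Sketch: set the soft block size of a block device to 4k via the
 * BLKBSZSET ioctl.  The device path is the one from the tests below;
 * adjust as needed.
 */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/fs.h>		/* BLKBSZSET */

int main(void)
{
	int blocksize = 4096;	/* match the page size */
	int fd = open("/dev/daten/testing", O_RDWR);

	if (fd < 0) {
		perror("open");
		return 1;
	}
	if (ioctl(fd, BLKBSZSET, &blocksize) < 0) {
		perror("BLKBSZSET");
		return 1;
	}
	close(fd);
	return 0;
}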
>
> This sounds reasonable, and so I did some tests w.r.t. mounting and
> memory pressure:
>
>
> bear:~/readcachetest# top | head -n 8 | tail -n 2
> Mem: 3953624k total, 775776k used, 3177848k free, 458060k buffers
> Swap: 1048568k total, 0k used, 1048568k free, 31812k cached
>
> bear:~/readcachetest# ./readcache /dev/daten/testing 1000
> first run needed 12.374819 seconds, this is 80.809263 MBytes/Sec
> second run needed 12.004670 seconds, this is 83.300915 MBytes/Sec
> third run needed 11.898035 seconds, this is 84.047492 MBytes/Sec
>
> bear:~/readcachetest# top | head -n 8 | tail -n 2
> Mem: 3953624k total, 775648k used, 3177976k free, 457904k buffers
> Swap: 1048568k total, 0k used, 1048568k free, 31900k cached
>
> bear:~/readcachetest# mount /dev/daten/testing /testinmount/ -t ext2
>
> bear:~/readcachetest# top | head -n 8 | tail -n 2
> Mem: 3953624k total, 775720k used, 3177904k free, 457944k buffers
> Swap: 1048568k total, 0k used, 1048568k free, 31860k cached
>
> bear:~/readcachetest# ./readcache /dev/daten/testing 1000
> first run needed 12.199237 seconds, this is 81.972340 MBytes/Sec
> second run needed 12.028633 seconds, this is 83.134966 MBytes/Sec
> third run needed 12.454502 seconds, this is 80.292251 MBytes/Sec
>
> bear:~/readcachetest# top | head -n 8 | tail -n 2
> Mem: 3953624k total, 1125432k used, 2828192k free, 808008k buffers
> Swap: 1048568k total, 0k used, 1048568k free, 31792k cached
>
> bear:~/readcachetest# ./readcache /testinmount/test.dump 1000
> first run needed 11.985626 seconds, this is 83.433272 MBytes/Sec
> second run needed 3.682361 seconds, this is 271.564901 MBytes/Sec
> third run needed 3.638563 seconds, this is 274.833774 MBytes/Sec
>
> bear:~/readcachetest# top | head -n 8 | tail -n 2
> Mem: 3953624k total, 2150128k used, 1803496k free, 808736k buffers
> Swap: 1048568k total, 0k used, 1048568k free, 1055688k cached
>
> bear:~/readcachetest# ./readcache /dev/daten/testing 1000
> first run needed 12.418232 seconds, this is 80.526761 MBytes/Sec
> second run needed 12.381597 seconds, this is 80.765026 MBytes/Sec
> third run needed 12.185159 seconds, this is 82.067046 MBytes/Sec
>
> bear:~/readcachetest# top | head -n 8 | tail -n 2
> Mem: 3953624k total, 2149232k used, 1804392k free, 807784k buffers
> Swap: 1048568k total, 0k used, 1048568k free, 1055756k cached
>
> bear:~/readcachetest# umount /testinmount/
>
> bear:~/readcachetest# top | head -n 8 | tail -n 2
> Mem: 3953624k total, 775728k used, 3177896k free, 457872k buffers
> Swap: 1048568k total, 0k used, 1048568k free, 31728k cached
>
> bear:~/readcachetest# ./readcache /dev/daten/testing 1000
> first run needed 12.171772 seconds, this is 82.157306 MBytes/Sec
> second run needed 11.842943 seconds, this is 84.438471 MBytes/Sec
> third run needed 12.167059 seconds, this is 82.189131 MBytes/Sec
>
> bear:~/readcachetest# top | head -n 8 | tail -n 2
> Mem: 3953624k total, 775656k used, 3177968k free, 457876k buffers
> Swap: 1048568k total, 0k used, 1048568k free, 31860k cached
>
> So files from the mounted fs get cached, but direct reads from the
> block device do not, although the LVM volume is mounted and enough
> memory is available.
>
> I currently suspect this to be a problem with the cciss controller;
> see my other mail to the list...
>
> > You may also want to test with a smaller count. Reading directly from
> > pagecache should be noticeably faster than reading from the cache on
> > the raid controller, although either obviously is way faster than
> > reading from actual disks.
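(The readcache tool itself was never posted to the list; for anyone who
wants to reproduce these numbers, here is my guess at what it does,
reconstructed from its output above: three timed sequential-read passes
over the given number of megabytes. The 1 MB chunk size and the argument
handling are my assumptions, not the actual tool:)

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/time.h>

#define CHUNK (1024 * 1024)	/* read in 1 MB chunks (assumed) */

/* Read `mbytes` megabytes from `path` and return the elapsed seconds. */
static double timed_read(const char *path, long mbytes)
{
	static char buf[CHUNK];
	struct timeval start, end;
	long i;
	int fd = open(path, O_RDONLY);

	if (fd < 0) {
		perror("open");
		exit(1);
	}
	gettimeofday(&start, NULL);
	for (i = 0; i < mbytes; i++)
		if (read(fd, buf, CHUNK) != CHUNK)
			break;	/* short read near end of file */
	gettimeofday(&end, NULL);
	close(fd);
	return (end.tv_sec - start.tv_sec)
		+ (end.tv_usec - start.tv_usec) / 1e6;
}

int main(int argc, char **argv)
{
	const char *run[] = { "first", "second", "third" };
	long mb;
	int i;

	if (argc < 3) {
		fprintf(stderr, "usage: %s <path> <mbytes>\n", argv[0]);
		return 1;
	}
	mb = atol(argv[2]);
	for (i = 0; i < 3; i++) {
		double t = timed_read(argv[1], mb);
		printf("%s run needed %f seconds, this is %f MBytes/Sec\n",
		       run[i], t, mb / t);
	}
	return 0;
}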
>
> bear:~/readcachetest# ./readcache /dev/daten/testing 100
> first run needed 1.202330 seconds, this is 83.171841 MBytes/Sec
> second run needed 0.349686 seconds, this is 285.970842 MBytes/Sec
> third run needed 0.346644 seconds, this is 288.480401 MBytes/Sec
>
> bear:~/readcachetest# mount /dev/daten/testing /testinmount/ -t ext2
>
> bear:~/readcachetest# ./readcache /testinmount/test.dump 100
> first run needed 1.205797 seconds, this is 82.932699 MBytes/Sec
> second run needed 0.367004 seconds, this is 272.476594 MBytes/Sec
> third run needed 0.362783 seconds, this is 275.646874 MBytes/Sec
>
> bear:~/readcachetest# ./readcache /testinmount/test.dump 100
> first run needed 0.363097 seconds, this is 275.408500 MBytes/Sec
> second run needed 0.363002 seconds, this is 275.480576 MBytes/Sec
> third run needed 0.362852 seconds, this is 275.594457 MBytes/Sec
>
> bear:~/readcachetest# umount /testinmount/
>
> bear:~/readcachetest# mount /dev/daten/testing /testinmount/ -t ext2
>
> bear:~/readcachetest# ./readcache /testinmount/test.dump 100
> first run needed 1.189227 seconds, this is 84.088235 MBytes/Sec
> second run needed 0.367456 seconds, this is 272.141426 MBytes/Sec
> third run needed 0.363139 seconds, this is 275.376646 MBytes/Sec
>
> Hmm, I'm somewhat puzzled now, as 275 MB/Sec seems rather low for
> reads from the page cache on a 2.8 GHz Xeon machine (my laptop gets
> about 400 MB/Sec with this, and Skate even reaches 3486.385664
> MBytes/Sec.)
>
>
> No idea...
> Markus.

Did you play with hdparm -m?

Regards,
FabF
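PS: hdparm -m queries and sets the IDE multiple-sector count via the
HDIO_GET_MULTCOUNT and HDIO_SET_MULTCOUNT ioctls. A minimal query sketch
follows; note these are IDE-layer ioctls, so I would expect them to
simply fail on a cciss RAID device like bear's:

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/hdreg.h>	/* HDIO_GET_MULTCOUNT */

int main(int argc, char **argv)
{
	long mult = 0;
	int fd;

	if (argc < 2) {
		fprintf(stderr, "usage: %s <device>\n", argv[0]);
		return 1;
	}
	fd = open(argv[1], O_RDONLY | O_NONBLOCK);
	if (fd < 0) {
		perror("open");
		return 1;
	}
	/* The kernel hands back the count as a long for this ioctl. */
	if (ioctl(fd, HDIO_GET_MULTCOUNT, &mult) < 0)
		perror("HDIO_GET_MULTCOUNT");
	else
		printf("multcount = %ld sectors\n", mult);
	close(fd);
	return 0;
}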