* clearing filesystem cache for I/O benchmarks
@ 2004-07-23 22:54 Benjamin Rutt
2004-07-24 5:21 ` Chris Wedgwood
` (3 more replies)
0 siblings, 4 replies; 20+ messages in thread
From: Benjamin Rutt @ 2004-07-23 22:54 UTC (permalink / raw)
To: linux-kernel
How can I purge all of the kernel's filesystem caches, so I can trust
that my I/O (read) requests I'm trying to benchmark bypass the kernel
filesystem cache?
Unfortunately, I cannot:
1) reboot the system
2) re-mount the filesystem where the reads are occurring
So I propose that I am left with the following options:
3) Reading through a file sufficiently larger than the RAM installed
on the system? e.g. read through a 10GB file on a machine with 8GB
of RAM
4) Since I can create the files fresh every time, I would write() them
out using O_DIRECT flag to open(), then the immediately following
read of that file would be guaranteed to avoid pulling it from
cache.
So, can someone evaluate whether options 3 and 4 would
work, or offer other suggestions? And I wouldn't object if the issue
of clearing disk and controller cache entered into the discussion (I'm
thinking #3 would do a better job at clearing disk/controller caches).
In case it is relevant, here are the two relevant kernel versions I'm
using, both under the distribution "Red Hat Enterprise Linux AS
release 3 (Taroon)":
Linux xio11 2.6.6 #2 SMP Wed Jun 9 10:37:24 EDT 2004 i686 i686 i386 GNU/Linux
Linux xio06 2.4.21-9.ELhugemem #1 SMP Tue Apr 27 13:52:32 EDT 2004 i686 i686 i386 GNU/Linux
Thank you,
--
Benjamin Rutt
* Re: clearing filesystem cache for I/O benchmarks
2004-07-23 22:54 clearing filesystem cache for I/O benchmarks Benjamin Rutt
@ 2004-07-24 5:21 ` Chris Wedgwood
2004-07-24 5:31 ` Tim Wright
` (2 subsequent siblings)
3 siblings, 0 replies; 20+ messages in thread
From: Chris Wedgwood @ 2004-07-24 5:21 UTC (permalink / raw)
To: linux-kernel
On Fri, Jul 23, 2004 at 06:54:54PM -0400, Benjamin Rutt wrote:
> How can I purge all of the kernel's filesystem caches, so I can
> trust that my I/O (read) requests I'm trying to benchmark bypass the
> kernel filesystem cache?
does "ioctl(fd, BLKFLSBUF,0)" suffice?
--cw
* Re: clearing filesystem cache for I/O benchmarks
2004-07-23 22:54 clearing filesystem cache for I/O benchmarks Benjamin Rutt
2004-07-24 5:21 ` Chris Wedgwood
@ 2004-07-24 5:31 ` Tim Wright
2004-07-26 0:07 ` Benjamin Rutt
2004-07-25 8:11 ` Andreas Haumer
2004-07-26 7:25 ` Andrew Morton
3 siblings, 1 reply; 20+ messages in thread
From: Tim Wright @ 2004-07-24 5:31 UTC (permalink / raw)
To: Benjamin Rutt; +Cc: linux-kernel
Take a look at the code in hdparm tool that handles the '-f' option.
Basically calling ioctl(fd, BLKFLSBUF, 0) where fd is a file descriptor
opened on the block device on which your filesystem resides should be
enough to clear the cache.
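
A minimal sketch of that call, not taken from hdparm itself. The helper name flush_block_device is made up for illustration, and the caller has to supply the path of the block device that backs the filesystem. The ioctl normally needs root privileges.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/fs.h>     /* BLKFLSBUF */
#include <sys/ioctl.h>
#include <unistd.h>

/* Ask the kernel to flush and drop its buffer cache for one block device.
 * flush_block_device is a hypothetical helper name.
 * Returns 0 on success, -1 on failure with errno set. */
int flush_block_device(const char *device_path)
{
    assert(device_path != NULL);

    /* Open the block device itself, not a file on the filesystem. */
    int fd = open(device_path, O_RDONLY);
    if (fd < 0)
        return -1;

    int rc = ioctl(fd, BLKFLSBUF, 0);

    /* Preserve the ioctl's errno across close(). */
    int saved = errno;
    close(fd);
    errno = saved;

    return rc == 0 ? 0 : -1;
}
```

It would be called as flush_block_device("/dev/hda"), where /dev/hda is only a placeholder device name.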
Regards,
Tim
On Fri, 2004-07-23 at 15:54, Benjamin Rutt wrote:
> How can I purge all of the kernel's filesystem caches, so I can trust
> that my I/O (read) requests I'm trying to benchmark bypass the kernel
> filesystem cache?
>
> Unfortunately, I cannot:
>
> 1) reboot the system
>
> 2) re-mount the filesystem where the reads are occurring
>
> So I propose that I am left with the following options:
>
> 3) Reading through a file sufficiently larger than the RAM installed
> on the system? e.g. read through a 10GB file on a machine with 8GB
> of RAM
>
> 4) Since I can create the files fresh every time, I would write() them
> out using O_DIRECT flag to open(), then the immediately following
> read of that file would be guaranteed to avoid pulling it from
> cache.
>
> So, can someone evaluate whether options 3 and 4 would
> work, or offer other suggestions? And I wouldn't object if the issue
> of clearing disk and controller cache entered into the discussion (I'm
> thinking #3 would do a better job at clearing disk/controller caches).
>
> In case it is relevant, here are the two relevant kernel versions I'm
> using, both under the distribution "Red Hat Enterprise Linux AS
> release 3 (Taroon)":
>
> Linux xio11 2.6.6 #2 SMP Wed Jun 9 10:37:24 EDT 2004 i686 i686 i386 GNU/Linux
>
> Linux xio06 2.4.21-9.ELhugemem #1 SMP Tue Apr 27 13:52:32 EDT 2004 i686 i686 i386 GNU/Linux
>
> Thank you,
--
Tim Wright <timw@splhi.com>
Splhi
* Re: clearing filesystem cache for I/O benchmarks
2004-07-23 22:54 clearing filesystem cache for I/O benchmarks Benjamin Rutt
2004-07-24 5:21 ` Chris Wedgwood
2004-07-24 5:31 ` Tim Wright
@ 2004-07-25 8:11 ` Andreas Haumer
2004-07-26 7:25 ` Andrew Morton
3 siblings, 0 replies; 20+ messages in thread
From: Andreas Haumer @ 2004-07-25 8:11 UTC (permalink / raw)
To: Benjamin Rutt; +Cc: linux-kernel
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hi!
Benjamin Rutt wrote:
> How can I purge all of the kernel's filesystem caches, so I can trust
> that my I/O (read) requests I'm trying to benchmark bypass the kernel
> filesystem cache?
Some time ago I was looking for that, too, and found "cfree". Have a
look at <http://gizmolabs.org/~andrew/andrewweb/project.php?pid=3&tab=0>
It's a small utility and kernel module for linux-2.4 written by
Andrew de los Reyes. It allows clearing portions of the buffer cache
(e.g. for a complete sub-directory). I haven't analyzed it so I can't
say if it does things correctly, though.
HTH
- - andreas
- --
Andreas Haumer | mailto:andreas@xss.co.at
*x Software + Systeme | http://www.xss.co.at/
Karmarschgasse 51/2/20 | Tel: +43-1-6060114-0
A-1100 Vienna, Austria | Fax: +43-1-6060114-71
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iD8DBQFBA2s2xJmyeGcXPhERAuaTAKCaxNRjhbzf3G5uL1lsXYg41eF+jQCeP808
DNcut1YDptMCNsvAeXrt+d8=
=xPh9
-----END PGP SIGNATURE-----
* Re: clearing filesystem cache for I/O benchmarks
2004-07-24 5:31 ` Tim Wright
@ 2004-07-26 0:07 ` Benjamin Rutt
2004-07-26 1:40 ` Bernd Eckenfels
0 siblings, 1 reply; 20+ messages in thread
From: Benjamin Rutt @ 2004-07-26 0:07 UTC (permalink / raw)
To: linux-kernel
Tim Wright <timw@splhi.com> writes:
> Take a look at the code in hdparm tool that handles the '-f' option.
>
> Basically calling ioctl(fd, BLKFLSBUF, 0) where fd is a file descriptor
> opened on the block device on which your filesystem resides should be
> enough to clear the cache.
Thanks, that looks pretty useful, at least to force the I/O to make it
outside the kernel. I'm still getting cache hits for some read tests
though, no doubt due to cache near the physical disks and/or
controllers. Correct me if I'm wrong, but this ioctl doesn't appear
to go out and tell disks to clear their caches.
I think I'll use the BLKFLSBUF in any case in my tests though, as it
doesn't seem to take very long to execute. It can't hurt, and should
complement the act of reading through a large dummy file, which should
take care of the disk/controller caches.
--
Benjamin Rutt
* Re: clearing filesystem cache for I/O benchmarks
2004-07-26 0:07 ` Benjamin Rutt
@ 2004-07-26 1:40 ` Bernd Eckenfels
2004-07-26 12:47 ` Benjamin Rutt
0 siblings, 1 reply; 20+ messages in thread
From: Bernd Eckenfels @ 2004-07-26 1:40 UTC (permalink / raw)
To: linux-kernel
In article <87smbfr5qe.fsf@osu.edu> you wrote:
> Thanks, that looks pretty useful, at least to force the I/O to make it
> outside the kernel. I'm still getting cache hits for some read tests
> though
This might be due to read ahead... how do you check the cache hits, what
read patterns do you have?
Greetings
Bernd
--
eckes privat - http://www.eckes.org/
Project Freefire - http://www.freefire.org/
* Re: clearing filesystem cache for I/O benchmarks
2004-07-23 22:54 clearing filesystem cache for I/O benchmarks Benjamin Rutt
` (2 preceding siblings ...)
2004-07-25 8:11 ` Andreas Haumer
@ 2004-07-26 7:25 ` Andrew Morton
2004-07-26 13:02 ` Benjamin Rutt
3 siblings, 1 reply; 20+ messages in thread
From: Andrew Morton @ 2004-07-26 7:25 UTC (permalink / raw)
To: Benjamin Rutt; +Cc: linux-kernel
Benjamin Rutt <rutt.4+news@osu.edu> wrote:
>
> How can I purge all of the kernel's filesystem caches, so I can trust
> that my I/O (read) requests I'm trying to benchmark bypass the kernel
> filesystem cache?
Either delete the benchmark test files or, in 2.6, use
fsync+posix_fadvise(POSIX_FADV_DONTNEED);
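
A minimal sketch of the fsync + posix_fadvise sequence for a single file, assuming a 2.6 kernel. The helper name drop_file_pagecache is hypothetical. POSIX_FADV_DONTNEED is advice, so treat a zero return as "the request was accepted" rather than proof that every page is gone.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Write back a file's dirty pages, then advise the kernel to drop its
 * cached pages. drop_file_pagecache is a hypothetical helper name.
 * Returns 0 on success, -1 on failure with errno set. */
int drop_file_pagecache(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    int rc = 0;
    /* Dirty pages cannot be dropped, so flush them to disk first. */
    if (fsync(fd) != 0) {
        rc = -1;
    } else {
        /* offset 0 with len 0 covers the whole file. posix_fadvise()
         * returns the error number directly instead of setting errno. */
        int err = posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
        if (err != 0) {
            errno = err;
            rc = -1;
        }
    }

    int saved = errno;
    close(fd);
    errno = saved;
    return rc;
}
```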
* Re: clearing filesystem cache for I/O benchmarks
2004-07-26 1:40 ` Bernd Eckenfels
@ 2004-07-26 12:47 ` Benjamin Rutt
0 siblings, 0 replies; 20+ messages in thread
From: Benjamin Rutt @ 2004-07-26 12:47 UTC (permalink / raw)
To: linux-kernel
Bernd Eckenfels <ecki-news2004-05@lina.inka.de> writes:
> In article <87smbfr5qe.fsf@osu.edu> you wrote:
>> Thanks, that looks pretty useful, at least to force the I/O to make it
>> outside the kernel. I'm still getting cache hits for some read tests
>> though
>
> This might be due to read ahead... how do you check the cache hits,
There must be cache hits since my poor old IDE disk from 1998 can only
perform at around 13 MB/sec for sustained sequential reads. The
performance I was getting was 150 MB/sec for sustained sequential
reads, which led me to think it was cache hits for certain, since
there is no way my disk can be that fast.
> what read patterns do you have?
The test setup is simply this:
1) create a target file for benchmarking, say 32 Megabytes (my system
RAM is 256MB, enough to cache all of that 32MB file)
2) run hdparm -f <device> to clear cache
3) read the target file from beginning to end into a user memory
buffer (e.g. 32k in size), ignoring the read data (i.e. there are
no data operations I make on the read data, this is a pure I/O test)
Because of #3, I'm not doing anything to the read data (I'm just
overwriting it with the next read) so I wouldn't imagine there is much
time from one read to the next to leverage any readahead.
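
For scale: at 13 MB/sec the 32MB file should take about 2.5 seconds to read from disk, while at 150 MB/sec it takes about 0.2 seconds. Step 3 can be sketched roughly as below. This is an illustration and not the actual benchmark code; the function name timed_sequential_read and the use of clock_gettime() for timing are assumptions.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

/* Read `path` from start to end in `bufsize`-byte chunks, overwriting the
 * same user buffer each time and never looking at the data.
 * timed_sequential_read is a hypothetical helper name.
 * Returns the number of bytes read, or -1 on error; on success the elapsed
 * wall-clock time is stored in *seconds. */
long long timed_sequential_read(const char *path, size_t bufsize, double *seconds)
{
    assert(path != NULL && bufsize > 0 && seconds != NULL);

    char *buf = malloc(bufsize);
    if (buf == NULL)
        return -1;

    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        free(buf);
        return -1;
    }

    struct timespec t0, t1;
    long long total = 0;
    ssize_t n;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    while ((n = read(fd, buf, bufsize)) > 0)
        total += n;                 /* data is discarded: pure I/O test */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    close(fd);
    free(buf);
    if (n < 0)
        return -1;

    *seconds = (double)(t1.tv_sec - t0.tv_sec)
             + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    return total;
}
```

Dividing the returned byte count by *seconds gives the throughput figure; a result far above what the disk can sustain points at a cache somewhere in the path.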
--
Benjamin Rutt
* Re: clearing filesystem cache for I/O benchmarks
2004-07-26 7:25 ` Andrew Morton
@ 2004-07-26 13:02 ` Benjamin Rutt
2004-07-27 6:40 ` Andrew Morton
0 siblings, 1 reply; 20+ messages in thread
From: Benjamin Rutt @ 2004-07-26 13:02 UTC (permalink / raw)
To: linux-kernel
Andrew Morton <akpm@osdl.org> writes:
> Benjamin Rutt <rutt.4+news@osu.edu> wrote:
>>
>> How can I purge all of the kernel's filesystem caches, so I can trust
>> that my I/O (read) requests I'm trying to benchmark bypass the kernel
>> filesystem cache?
>
> Either delete the benchmark test files or
I'm not sure I follow. If I delete the benchmark files, I'll only
need to create them again later in order to do a read test, and I'll
have the same problem then, of how to eliminate the just-written-data
from cache. Unless you're suggesting I write using some special mode
that won't enter the written data into cache? (e.g. O_DIRECT?)
> , in 2.6, use fsync+posix_fadvise(POSIX_FADV_DONTNEED);
Thanks for the reference, I wasn't aware of that one. We are running
some 2.4 kernels in our storage cluster unfortunately so that won't be
usable for us everywhere. I take it POSIX_FADV_DONTNEED is ignored
under 2.4.
A related question...if no posix_fadvise() advice has been given, does
reading sequentially every byte of an 8GB file on a machine with <=
8GB of RAM guarantee that any page cache data that existed on the
machine prior to the start of the 8GB read is now gone?
--
Benjamin Rutt
* Re: clearing filesystem cache for I/O benchmarks
2004-07-26 13:02 ` Benjamin Rutt
@ 2004-07-27 6:40 ` Andrew Morton
2004-07-27 7:16 ` Hans Reiser
` (2 more replies)
0 siblings, 3 replies; 20+ messages in thread
From: Andrew Morton @ 2004-07-27 6:40 UTC (permalink / raw)
To: Benjamin Rutt; +Cc: linux-kernel
(Please don't remove people from the email recipient list when doing kernel
work.)
Benjamin Rutt <rutt.4+news@osu.edu> wrote:
>
> Andrew Morton <akpm@osdl.org> writes:
>
> > Benjamin Rutt <rutt.4+news@osu.edu> wrote:
> >>
> >> How can I purge all of the kernel's filesystem caches, so I can trust
> >> that my I/O (read) requests I'm trying to benchmark bypass the kernel
> >> filesystem cache?
> >
> > Either delete the benchmark test files or
>
> I'm not sure I follow. If I delete the benchmark files, I'll only
> need to create them again later in order to do a read test, and I'll
> have the same problem then, of how to eliminate the just-written-data
> from cache.
OK.
> Thanks for the reference, I wasn't aware of that one. We are running
> some 2.4 kernels in our storage cluster unfortunately so that won't be
> usable for us everywhere. I take it POSIX_FADV_DONTNEED is ignored
> under 2.4.
posix_fadvise() will return -ENOSYS under 2.4.
However... If you write any amount of data to a file with O_DIRECT, that
will, as a side-effect, remove _all_ of that file's pagecache. In 2.4 as
well as 2.6. So you could scrub the pagecache by reading the first 4k then
writing it back with O_DIRECT.
However O_DIRECT is supported on very few filesystems in 2.4. ext2 and
reiserfs have it.
XFS in 2.4 has O_DIRECT, I think, but I don't know if the invalidation
side-effect works on XFS.
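
A minimal sketch of that scrub, under a few assumptions: the file is at least 4096 bytes long, 4096 bytes satisfies the filesystem's O_DIRECT alignment rules, and the helper name scrub_pagecache_odirect is made up for illustration.

```c
#define _GNU_SOURCE            /* exposes O_DIRECT in <fcntl.h> */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

#define SCRUB_BLOCK 4096       /* assumed transfer size and alignment */

/* Read the first 4k of a file and write the same bytes back through
 * O_DIRECT, relying on the invalidation side effect described above.
 * scrub_pagecache_odirect is a hypothetical helper name.
 * Returns 0 on success, -1 on failure. */
int scrub_pagecache_odirect(const char *path)
{
    assert(path != NULL);

    /* Fails (typically with EINVAL) if the filesystem lacks O_DIRECT. */
    int fd = open(path, O_RDWR | O_DIRECT);
    if (fd < 0)
        return -1;

    /* O_DIRECT transfers need an aligned user buffer. */
    void *buf = NULL;
    int err = posix_memalign(&buf, SCRUB_BLOCK, SCRUB_BLOCK);
    if (err != 0) {
        close(fd);
        errno = err;
        return -1;
    }

    int rc = -1;
    /* The file contents are unchanged: the block is rewritten in place. */
    if (pread(fd, buf, SCRUB_BLOCK, 0) == SCRUB_BLOCK &&
        pwrite(fd, buf, SCRUB_BLOCK, 0) == SCRUB_BLOCK)
        rc = 0;

    int saved = errno;
    free(buf);
    close(fd);
    errno = saved;
    return rc;
}
```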
> A related question...if no posix_fadvise() advice has been given, does
> reading sequentially every byte of an 8GB file on a machine with <=
> 8GB of RAM guarantee that any page cache data that existed on the
> machine prior to the start of the 8GB read is now gone?
It's not guaranteed that this will work - if the pages which you're trying
to evict were accessed multiple times then it may take more page
replacement to reliably shoot them down. But writing a 2xmemory file and
then deleting it will be a reasonably effective way of evicting most of
the other pagecache.
* Re: clearing filesystem cache for I/O benchmarks
2004-07-27 6:40 ` Andrew Morton
@ 2004-07-27 7:16 ` Hans Reiser
2004-07-27 17:31 ` Benjamin Rutt
2004-07-27 17:25 ` Benjamin Rutt
2004-07-29 1:05 ` Nathan Scott
2 siblings, 1 reply; 20+ messages in thread
From: Hans Reiser @ 2004-07-27 7:16 UTC (permalink / raw)
To: Andrew Morton; +Cc: Benjamin Rutt, linux-kernel
Andrew Morton wrote:
>(Please don't remove people from the email recipient list when doing kernel
>work.)
>
>Benjamin Rutt <rutt.4+news@osu.edu> wrote:
>
>
>>Andrew Morton <akpm@osdl.org> writes:
>>
>>
>>
>>>Benjamin Rutt <rutt.4+news@osu.edu> wrote:
>>>
>>>
>>>> How can I purge all of the kernel's filesystem caches, so I can trust
>>>> that my I/O (read) requests I'm trying to benchmark bypass the kernel
>>>> filesystem cache?
>>>>
>>>>
>>>Either delete the benchmark test files or
>>>
>>>
>>I'm not sure I follow. If I delete the benchmark files, I'll only
>>need to create them again later in order to do a read test, and I'll
>>have the same problem then, of how to eliminate the just-written-data
>>from cache.
>>
>>
when benchmarking, please be careful that you don't end up benchmarking
umount/mount, or sync, or..... it can be remarkably hard to avoid such
mistakes.....
I tend to try to use large enough filesets that small things like cache
flush happenstance or bitmap loading overhead do not sway the benchmark.
Rebooting tends to work for resetting the OS thoroughly, though I would
be curious to hear comments on whether one ought to power down the disk
drive so that its cache flushes......;-)
* Re: clearing filesystem cache for I/O benchmarks
2004-07-27 6:40 ` Andrew Morton
2004-07-27 7:16 ` Hans Reiser
@ 2004-07-27 17:25 ` Benjamin Rutt
2004-07-27 20:00 ` Timothy Miller
2004-07-29 1:05 ` Nathan Scott
2 siblings, 1 reply; 20+ messages in thread
From: Benjamin Rutt @ 2004-07-27 17:25 UTC (permalink / raw)
To: linux-kernel; +Cc: Andrew Morton
Andrew Morton <akpm@osdl.org> writes:
> (Please don't remove people from the email recipient list when doing kernel
> work.)
Sorry, I'm reading via gmane and my newsreader doesn't make it
straightforward to do so. But I'll do it manually for you.
> However... If you write any amount of data to a file with O_DIRECT, that
> will, as a side-effect, remove _all_ of that file's pagecache. In 2.4 as
> well as 2.6. So you could scrub the pagecache by reading the first 4k then
> writing it back with O_DIRECT.
Thanks, that does work for ext3, very well. It's obvious that it
clears kernel page cache and not controller/disk cache.
>> A related question...if no posix_fadvise() advice has been given, does
>> reading sequentially every byte of an 8GB file on a machine with <=
>> 8GB of RAM guarantee that any page cache data that existed on the
>> machine prior to the start of the 8GB read is now gone?
>
> It's not guaranteed that this will work - if the pages which you're trying
> to evict were accessed multiple times then it may take more page
> replacement to reliably shoot them down. But writing a 2xmemory file and
> then deleting it will be a reasonably effective way of evicting most of
> the other pagecache.
OK thanks, I'll take on good faith that this is the best scheme in
general. I was actually doing a somewhat different approach, reading
through a 2x memory "dummy" file before accessing the real file, but
based on your advice, I'll instead just create a 2x "dummy" file,
fsync it, and then delete it.
Thanks for the tips,
--
Benjamin Rutt
* Re: clearing filesystem cache for I/O benchmarks
2004-07-27 7:16 ` Hans Reiser
@ 2004-07-27 17:31 ` Benjamin Rutt
2004-07-27 18:03 ` Hans Reiser
0 siblings, 1 reply; 20+ messages in thread
From: Benjamin Rutt @ 2004-07-27 17:31 UTC (permalink / raw)
To: linux-kernel
Hans Reiser <reiser@namesys.com> writes:
> when benchmarking, please be careful that you don't end up
> benchmarking umount/mount, or sync, or..... it can be remarkably hard
> to avoid such mistakes.....
I agree, I've made some blunders like that in the past. However for
write tests, we are including fsync() time, once, at the end of a file
write, since I feel it's unfair to trim that time. Not including
fsync() time would only test the ability of the various parts of the
I/O systems to do write buffering. It's easy to do lots of write
buffering, if you buy enough memory. Forcing the disks to write is
the only fair way to compare writes between I/O systems.
> I tend to try to use large enough filesets that small things like
> cache flush happenstance or bitmap loading overhead do not sway the
> benchmark.
Sounds familiar...we cycle among file sizes at every power of 2 point
from 8MB to 64GB. So by the time we access the 64GB file, all the
previous accesses for 8M..32GB will probably have pushed all of the
64GB file out of cache.
> Rebooting tends to work for resetting the OS thoroughly, though I
> would be curious to hear comments on whether one ought to power down
> the disk drive so that its cache flushes......;-)
With the mass storage environment I'm working in, you'd need to power
down the whole storage cluster, then remove the batteries that back
the controller cache...yes, clearing kernel cache is often just the
beginning. :)
--
Benjamin Rutt
* Re: clearing filesystem cache for I/O benchmarks
2004-07-27 17:31 ` Benjamin Rutt
@ 2004-07-27 18:03 ` Hans Reiser
2004-07-28 12:38 ` Benjamin Rutt
0 siblings, 1 reply; 20+ messages in thread
From: Hans Reiser @ 2004-07-27 18:03 UTC (permalink / raw)
To: Benjamin Rutt; +Cc: linux-kernel
Benjamin Rutt wrote:
>Hans Reiser <reiser@namesys.com> writes:
>
>
>
>>when benchmarking, please be careful that you don't end up
>>benchmarking umount/mount, or sync, or..... it can be remarkably hard
>>to avoid such mistakes.....
>>
>>
>
>I agree, I've made some blunders like that in the past. However for
>write tests, we are including fsync() time, once, at the end of a file
>write, since I feel it's unfair to trim that time.
>
fsync performance gives you different performance. Better to write more
stuff to flush the cache.
> Not including
>fsync() time would only test the ability of the various parts of the
>I/O systems to do write buffering. It's easy to do lots of write
>buffering, if you buy enough memory. Forcing the disks to write is
>the only fair way to compare writes between I/O systems.
>
>
It isn't fair. fsync is a different code path, and may be less
efficient. Or more, depending on the fs. reiser4 is currently not well
optimized for fsync, maybe next year I will change that but not this
week....
Benchmarking well is hard.....
Hans
* Re: clearing filesystem cache for I/O benchmarks
2004-07-27 17:25 ` Benjamin Rutt
@ 2004-07-27 20:00 ` Timothy Miller
2004-07-28 12:51 ` Benjamin Rutt
0 siblings, 1 reply; 20+ messages in thread
From: Timothy Miller @ 2004-07-27 20:00 UTC (permalink / raw)
To: Benjamin Rutt; +Cc: linux-kernel, Andrew Morton
Benjamin Rutt wrote:
> Andrew Morton <akpm@osdl.org> writes:
>
>
>>(Please don't remove people from the email recipient list when doing kernel
>>work.)
>
>
> Sorry, I'm reading via gmane and my newsreader doesn't make it
> straightforward to do so. But I'll do it manually for you.
I haven't been paying attention, and I don't know if anyone's already
suggested this, but going on the title, have you considered running the
same benchmark more than once and just throwing away the first result?
* Re: clearing filesystem cache for I/O benchmarks
2004-07-27 18:03 ` Hans Reiser
@ 2004-07-28 12:38 ` Benjamin Rutt
2004-07-28 17:03 ` Hans Reiser
0 siblings, 1 reply; 20+ messages in thread
From: Benjamin Rutt @ 2004-07-28 12:38 UTC (permalink / raw)
To: linux-kernel
Hans Reiser <reiser@namesys.com> writes:
> fsync performance gives you different performance. Better to write
> more stuff to flush the cache.
I'm trying to understand how that would work. Let's take an example
of a 64GB file that I'm writing out from scratch. I start a timer
before writing. With my fsync() way of testing, I expect to stop the
timer the moment the last byte has been written and fsync() has been
called.
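
Roughly, the fsync() way of timing looks like the sketch below. It is an illustration, not the real benchmark; the function name timed_write_with_fsync, the 1MB chunk size, and the filler byte are all arbitrary choices.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Write `bytes` of filler to `path`, call fsync() once at the end, and
 * time the whole thing. timed_write_with_fsync is a hypothetical name.
 * Returns elapsed seconds, or -1.0 on error. */
double timed_write_with_fsync(const char *path, unsigned long long bytes)
{
    assert(path != NULL);

    enum { CHUNK = 1 << 20 };                 /* 1MB per write() call */
    char *buf = malloc(CHUNK);
    if (buf == NULL)
        return -1.0;
    memset(buf, 0x5A, CHUNK);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) {
        free(buf);
        return -1.0;
    }

    struct timespec t0, t1;
    int ok = 1;

    clock_gettime(CLOCK_MONOTONIC, &t0);      /* timer starts before writing */
    while (bytes > 0 && ok) {
        size_t want = bytes < CHUNK ? (size_t)bytes : CHUNK;
        ssize_t n = write(fd, buf, want);
        if (n <= 0)
            ok = 0;
        else
            bytes -= (unsigned long long)n;
    }
    if (ok && fsync(fd) != 0)                 /* fsync is inside the timed region */
        ok = 0;
    clock_gettime(CLOCK_MONOTONIC, &t1);      /* timer stops once fsync returns */

    close(fd);
    free(buf);
    if (!ok)
        return -1.0;
    return (double)(t1.tv_sec - t0.tv_sec)
         + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```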
I gather you're saying that continuing writing past the 64GB mark,
causing LRU expiration of the last bytes of the 64GB from write
buffers is a more fair way to test, versus just calling fsync() once
at the end. I'm happy to write my benchmarks this way too, except I
need to know two configuration values now:
1) when to stop the timer?
2) how much more to write past 64GB?
>> Not including
>>fsync() time would only test the ability of the various parts of the
>>I/O systems to do write buffering. It's easy to do lots of write
>>buffering, if you buy enough memory. Forcing the disks to write is
>>the only fair way to compare writes between I/O systems.
>>
>>
> It isn't fair. fsync is a different code path, and may be less
> efficient. Or more, depending on the fs. reiser4 is currently not
> well optimized for fsync, maybe next year I will change that but not
> this week....
I think we agree that forcing the disks to write all of the data
before the timer stops is a fair way to compare between filesystems.
Otherwise we're "almost" measuring disk throughput, except for what
has been write-buffered...a real gray area. But I think you're
pointing out that the results could be different depending on whether
the fsync() method or your "write past the intended amount" method for
flushing is used. I'd be happy to run these benchmarks both ways, as
long as I knew how. If you can help me answer my above questions,
I'll run them both ways.
Thanks,
--
Benjamin Rutt
* Re: clearing filesystem cache for I/O benchmarks
2004-07-27 20:00 ` Timothy Miller
@ 2004-07-28 12:51 ` Benjamin Rutt
0 siblings, 0 replies; 20+ messages in thread
From: Benjamin Rutt @ 2004-07-28 12:51 UTC (permalink / raw)
To: linux-kernel
Timothy Miller <miller@techsource.com> writes:
> I haven't been paying attention, and I don't know if anyone's already
> suggested this, but going on the title, have you considered running
> the same benchmark more than once and just throwing away the first
> result?
I was gathering from upthread comments that data blocks that are read
more than once will be given a priority to be retained in cache. So I
think reading the same data twice could lead to unwanted cache hits.
And besides, some of our file sizes are quite small (e.g. 8MB) such
that reading through them the second time would almost guarantee cache
hits. I see your point, though, for reading through a 64GB file on a
system with 8GB of RAM. If such a system would retain in cache
anything except the last ~8GB, I'd be very surprised.
Based on comments from Andrew Morton, I'm going to take the following
approach to clear cache for read tests:
1) figure out the available RAM on the test system
2) write out a throwaway file twice that big, and fsync() it
3) delete that file
I gather this is optimistically the best way, that would work for all
filesystem types.
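
The three steps above can be sketched as follows. This is a rough illustration: it takes "available RAM" to mean installed physical memory as reported by sysconf(), and the names physical_ram_bytes and write_and_delete_filler are made up.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Step 1: installed RAM in bytes, or 0 if it cannot be determined.
 * _SC_PHYS_PAGES is a glibc extension rather than plain POSIX. */
unsigned long long physical_ram_bytes(void)
{
    long pages = sysconf(_SC_PHYS_PAGES);
    long page_size = sysconf(_SC_PAGESIZE);
    if (pages <= 0 || page_size <= 0)
        return 0;
    return (unsigned long long)pages * (unsigned long long)page_size;
}

/* Steps 2 and 3: write `bytes` of filler to `path`, fsync() it, then
 * delete it. Returns 0 on success, -1 on failure. The file is removed
 * in both cases. */
int write_and_delete_filler(const char *path, unsigned long long bytes)
{
    assert(path != NULL);

    enum { CHUNK = 1 << 20 };                 /* 1MB per write() call */
    char *buf = malloc(CHUNK);
    if (buf == NULL)
        return -1;
    memset(buf, 0xA5, CHUNK);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) {
        free(buf);
        return -1;
    }

    int rc = 0;
    while (bytes > 0 && rc == 0) {
        size_t want = bytes < CHUNK ? (size_t)bytes : CHUNK;
        ssize_t n = write(fd, buf, want);
        if (n <= 0)
            rc = -1;
        else
            bytes -= (unsigned long long)n;
    }
    if (rc == 0 && fsync(fd) != 0)
        rc = -1;

    close(fd);
    unlink(path);
    free(buf);
    return rc;
}
```

A call would look like write_and_delete_filler("/mnt/test/filler.tmp", 2 * physical_ram_bytes()), where the path is a placeholder on the filesystem under test and needs at least that much free space.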
As far as clearing disk/controller cache, I have a plan of (after the
above has been done) reading through a 2GB "dummy" file that I create
once before running the test battery. The 2GB figure comes from the
fact that we have controllers with a 1GB cache. Plus, there are
around 36 disks on the backend, all raided together into one raid
device. So each disk brings 8MB of cache that we have to worry about
as well. If it is totally obvious that Andrew Morton's above recipe
will clear the disk/controller caches as well, then please point that
out, but it isn't obvious to me.
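
As a rough tally of those figures, assuming a single controller with a 1GB cache (the count of controllers is an assumption here) and taking 1GB as 1024MB:

```latex
\begin{align*}
C_{\text{disks}} &= 36 \times 8\,\text{MB} = 288\,\text{MB} \\
C_{\text{total}} &= 1024\,\text{MB} + 288\,\text{MB} = 1312\,\text{MB} < 2048\,\text{MB}
\end{align*}
```

Under that assumption the 2GB dummy file is larger than the combined controller and disk cache. Each additional 1GB controller would add 1024MB to the total, and whether one pass over the file actually displaces everything depends on how those caches choose what to keep.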
--
Benjamin Rutt
* Re: clearing filesystem cache for I/O benchmarks
2004-07-28 12:38 ` Benjamin Rutt
@ 2004-07-28 17:03 ` Hans Reiser
2004-07-28 18:19 ` Benjamin Rutt
0 siblings, 1 reply; 20+ messages in thread
From: Hans Reiser @ 2004-07-28 17:03 UTC (permalink / raw)
To: Benjamin Rutt; +Cc: linux-kernel
Benjamin Rutt wrote:
>Hans Reiser <reiser@namesys.com> writes:
>
>
>
>>fsync performance gives you different performance. Better to write
>>more stuff to flush the cache.
>>
>>
>
>I'm trying to understand how that would work. Let's take an example
>of a 64GB file that I'm writing out from scratch. I start a timer
>before writing. With my fsync() way of testing, I expect to stop the
>timer the moment the last byte has been written and fsync() has been
>called.
>
>I gather you're saying that continuing writing past the 64GB mark,
>causing LRU expiration of the last bytes of the 64GB from write
>buffers is a more fair way to test, versus just calling fsync() once
>at the end. I'm happy to write my benchmarks this way too, except I
>need to know two configuration values now:
>
>1) when to stop the timer?
>2) how much more to write past 64GB?
>
>
Probably the best you can do is write enough in the course of the test
that fsync at the end (or data remaining in cache) is insignificant
noise. Benchmarks that make fsync or the cache significant
unintentionally are common and bad.
>
>
>>> Not including
>>>fsync() time would only test the ability of the various parts of the
>>>I/O systems to do write buffering. It's easy to do lots of write
>>>buffering, if you buy enough memory. Forcing the disks to write is
>>>the only fair way to compare writes between I/O systems.
>>>
>>>
>>>
>>>
>>It isn't fair. fsync is a different code path, and may be less
>>efficient. Or more, depending on the fs. reiser4 is currently not
>>well optimized for fsync, maybe next year I will change that but not
>>this week....
>>
>>
>
>I think we agree that forcing the disks to write all of the data
>before the timer stops is a fair way to compare between filesystems.
>Otherwise we're "almost" measuring disk throughput, except for what
>has been write-buffered...a real gray area. But I think you're
>pointing out that the results could be different depending on whether
>the fsync() method or your "write past the intended amount" method for
>flushing is used. I'd be happy to run these benchmarks both ways, as
>long as I knew how. If you can help me answer my above questions,
>I'll run them both ways.
>
>Thanks,
>
>
* Re: clearing filesystem cache for I/O benchmarks
2004-07-28 17:03 ` Hans Reiser
@ 2004-07-28 18:19 ` Benjamin Rutt
0 siblings, 0 replies; 20+ messages in thread
From: Benjamin Rutt @ 2004-07-28 18:19 UTC (permalink / raw)
To: linux-kernel
Hans Reiser <reiser@namesys.com> writes:
> Probably the best you can do is write enough in the course of the test
> that fsync at the end (or data remaining in cache) is insignificant
> noise. Benchmarks that make fsync or the cache significant
> unintentionally are common and bad.
I assure you, our benchmarks will avoid these common pitfalls.
--
Benjamin Rutt
* Re: clearing filesystem cache for I/O benchmarks
2004-07-27 6:40 ` Andrew Morton
2004-07-27 7:16 ` Hans Reiser
2004-07-27 17:25 ` Benjamin Rutt
@ 2004-07-29 1:05 ` Nathan Scott
2 siblings, 0 replies; 20+ messages in thread
From: Nathan Scott @ 2004-07-29 1:05 UTC (permalink / raw)
To: Andrew Morton; +Cc: Benjamin Rutt, linux-kernel
On Mon, Jul 26, 2004 at 11:40:05PM -0700, Andrew Morton wrote:
> ...
> However... If you write any amount of data to a file with O_DIRECT, that
> will, as a side-effect, remove _all_ of that file's pagecache. In 2.4 as
> well as 2.6. So you could scrub the pagecache by reading the first 4k then
> writing it back with O_DIRECT.
>
> However O_DIRECT is supported on very few filesystems in 2.4. ext2 and
> reiserfs have it.
>
> XFS in 2.4 has O_DIRECT, I think, but I don't know if the invalidation
> side-effect works on XFS.
Yep, it does.
cheers.
--
Nathan