* block cache replacement strategy?
@ 2010-09-07 13:34 Johannes Stezenbach
2010-09-09 12:00 ` Johannes Stezenbach
2010-09-30 23:27 ` Jan Kara
0 siblings, 2 replies; 9+ messages in thread
From: Johannes Stezenbach @ 2010-09-07 13:34 UTC (permalink / raw)
To: linux-fsdevel; +Cc: linux-mm, linux-kernel
Hi,
during some simple disk read throughput testing I observed
caching behaviour that doesn't seem right. The machine
has 2G of RAM and AMD Athlon 4850e, x86_64 kernel but 32bit
userspace, Linux 2.6.35.4. It seems that contents of the
block cache are not evicted to make room for other blocks.
(Or something like that, I have no real clue about this.)
Since this is a rather artificial test I'm not too worried,
but it looks strange to me so I thought I better report it.
zzz:~# echo 3 >/proc/sys/vm/drop_caches
zzz:~# dd if=/dev/sda2 of=/dev/null bs=1M count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 13.9454 s, 75.2 MB/s
zzz:~# dd if=/dev/sda2 of=/dev/null bs=1M count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 0.92799 s, 1.1 GB/s
OK, seems like the blocks are cached. But:
zzz:~# dd if=/dev/sda2 of=/dev/null bs=1M count=1000 skip=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 13.8375 s, 75.8 MB/s
zzz:~# dd if=/dev/sda2 of=/dev/null bs=1M count=1000 skip=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 13.8429 s, 75.7 MB/s
Even if I let 15min pass and repeat the dd command
several times, I cannot see any caching effects, it
stays at ~75 MB/s.
zzz:~# cat /proc/meminfo
MemTotal: 1793272 kB
MemFree: 15216 kB
Buffers: 1378820 kB
Cached: 20080 kB
SwapCached: 0 kB
Active: 792720 kB
Inactive: 758832 kB
Active(anon): 91716 kB
Inactive(anon): 64652 kB
Active(file): 701004 kB
Inactive(file): 694180 kB
But then:
zzz:~# dd if=/dev/sda2 of=/dev/null bs=1M count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 5.23983 s, 200 MB/s
zzz:~# dd if=/dev/sda2 of=/dev/null bs=1M count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 0.908284 s, 1.2 GB/s
zzz:~# cat /proc/meminfo
MemTotal: 1793272 kB
MemFree: 16168 kB
Buffers: 1377308 kB
Cached: 20660 kB
SwapCached: 0 kB
Active: 1140384 kB
Inactive: 410236 kB
Active(anon): 91716 kB
Inactive(anon): 64652 kB
Active(file): 1048668 kB
Inactive(file): 345584 kB
And finally:
zzz:~# echo 3 >/proc/sys/vm/drop_caches
zzz:~# dd if=/dev/sda2 of=/dev/null bs=1M count=1000 skip=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 13.948 s, 75.2 MB/s
zzz:~# dd if=/dev/sda2 of=/dev/null bs=1M count=1000 skip=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 0.985031 s, 1.1 GB/s
Now these blocks get cached but then the others don't:
zzz:~# dd if=/dev/sda2 of=/dev/null bs=1M count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 13.9394 s, 75.2 MB/s
zzz:~# dd if=/dev/sda2 of=/dev/null bs=1M count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 13.9403 s, 75.2 MB/s
Best Regards,
Johannes
* Re: block cache replacement strategy?
2010-09-07 13:34 block cache replacement strategy? Johannes Stezenbach
@ 2010-09-09 12:00 ` Johannes Stezenbach
2010-09-10 10:02 ` Florian Mickler
2010-09-30 23:27 ` Jan Kara
1 sibling, 1 reply; 9+ messages in thread
From: Johannes Stezenbach @ 2010-09-09 12:00 UTC (permalink / raw)
To: linux-fsdevel; +Cc: linux-mm, linux-kernel
On Tue, Sep 07, 2010 at 03:34:29PM +0200, Johannes Stezenbach wrote:
>
> during some simple disk read throughput testing I observed
> caching behaviour that doesn't seem right. The machine
> has 2G of RAM and AMD Athlon 4850e, x86_64 kernel but 32bit
> userspace, Linux 2.6.35.4. It seems that contents of the
> block cache are not evicted to make room for other blocks.
> (Or something like that, I have no real clue about this.)
>
> Since this is a rather artificial test I'm not too worried,
> but it looks strange to me so I thought I better report it.
C'mon guys, please comment. Is this a bug or not?
Or is my question too silly?
Johannes
> zzz:~# echo 3 >/proc/sys/vm/drop_caches
> zzz:~# dd if=/dev/sda2 of=/dev/null bs=1M count=1000
> 1000+0 records in
> 1000+0 records out
> 1048576000 bytes (1.0 GB) copied, 13.9454 s, 75.2 MB/s
> zzz:~# dd if=/dev/sda2 of=/dev/null bs=1M count=1000
> 1000+0 records in
> 1000+0 records out
> 1048576000 bytes (1.0 GB) copied, 0.92799 s, 1.1 GB/s
>
> OK, seems like the blocks are cached. But:
>
> zzz:~# dd if=/dev/sda2 of=/dev/null bs=1M count=1000 skip=1000
> 1000+0 records in
> 1000+0 records out
> 1048576000 bytes (1.0 GB) copied, 13.8375 s, 75.8 MB/s
> zzz:~# dd if=/dev/sda2 of=/dev/null bs=1M count=1000 skip=1000
> 1000+0 records in
> 1000+0 records out
> 1048576000 bytes (1.0 GB) copied, 13.8429 s, 75.7 MB/s
>
> Even if I let 15min pass and repeat the dd command
> several times, I cannot see any caching effects, it
> stays at ~75 MB/s.
>
> zzz:~# cat /proc/meminfo
> MemTotal: 1793272 kB
> MemFree: 15216 kB
> Buffers: 1378820 kB
> Cached: 20080 kB
> SwapCached: 0 kB
> Active: 792720 kB
> Inactive: 758832 kB
> Active(anon): 91716 kB
> Inactive(anon): 64652 kB
> Active(file): 701004 kB
> Inactive(file): 694180 kB
>
> But then:
>
> zzz:~# dd if=/dev/sda2 of=/dev/null bs=1M count=1000
> 1000+0 records in
> 1000+0 records out
> 1048576000 bytes (1.0 GB) copied, 5.23983 s, 200 MB/s
> zzz:~# dd if=/dev/sda2 of=/dev/null bs=1M count=1000
> 1000+0 records in
> 1000+0 records out
> 1048576000 bytes (1.0 GB) copied, 0.908284 s, 1.2 GB/s
>
> zzz:~# cat /proc/meminfo
> MemTotal: 1793272 kB
> MemFree: 16168 kB
> Buffers: 1377308 kB
> Cached: 20660 kB
> SwapCached: 0 kB
> Active: 1140384 kB
> Inactive: 410236 kB
> Active(anon): 91716 kB
> Inactive(anon): 64652 kB
> Active(file): 1048668 kB
> Inactive(file): 345584 kB
>
>
> And finally:
>
> zzz:~# echo 3 >/proc/sys/vm/drop_caches
> zzz:~# dd if=/dev/sda2 of=/dev/null bs=1M count=1000 skip=1000
> 1000+0 records in
> 1000+0 records out
> 1048576000 bytes (1.0 GB) copied, 13.948 s, 75.2 MB/s
> zzz:~# dd if=/dev/sda2 of=/dev/null bs=1M count=1000 skip=1000
> 1000+0 records in
> 1000+0 records out
> 1048576000 bytes (1.0 GB) copied, 0.985031 s, 1.1 GB/s
>
>
> Now these blocks get cached but then the others don't:
>
> zzz:~# dd if=/dev/sda2 of=/dev/null bs=1M count=1000
> 1000+0 records in
> 1000+0 records out
> 1048576000 bytes (1.0 GB) copied, 13.9394 s, 75.2 MB/s
> zzz:~# dd if=/dev/sda2 of=/dev/null bs=1M count=1000
> 1000+0 records in
> 1000+0 records out
> 1048576000 bytes (1.0 GB) copied, 13.9403 s, 75.2 MB/s
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <dont@kvack.org>
* Re: block cache replacement strategy?
2010-09-09 12:00 ` Johannes Stezenbach
@ 2010-09-10 10:02 ` Florian Mickler
2010-09-10 16:02 ` Johannes Stezenbach
0 siblings, 1 reply; 9+ messages in thread
From: Florian Mickler @ 2010-09-10 10:02 UTC (permalink / raw)
To: linux-kernel; +Cc: linux-mm, linux-fsdevel
On Thu, 9 Sep 2010 14:00:44 +0200
Johannes Stezenbach <js@sig21.net> wrote:
> On Tue, Sep 07, 2010 at 03:34:29PM +0200, Johannes Stezenbach wrote:
> >
> > during some simple disk read throughput testing I observed
> > caching behaviour that doesn't seem right. The machine
> > has 2G of RAM and AMD Athlon 4850e, x86_64 kernel but 32bit
> > userspace, Linux 2.6.35.4. It seems that contents of the
> > block cache are not evicted to make room for other blocks.
> > (Or something like that, I have no real clue about this.)
> >
> > Since this is a rather artificial test I'm not too worried,
> > but it looks strange to me so I thought I better report it.
>
> C'mon guys, please comment. Is this a bug or not?
> Or is my question too silly?
>
>
> Johannes
Well, I personally have no clue about the block caching, but perhaps
that is a heuristic to prevent the cache from fluctuating too much?
Some minimum time a block is held... in a big linear read the cache is
useless most of the time anyway, so it could make some sense...
You could try accessing random files after filling up the cache and
check whether those evict the cache. That should rule out any
linear-read-detection heuristic.
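Such a random-access test could be sketched like this (a minimal sketch, not from the thread: the scratch file, its size, and the read count are assumptions; on the real system you would point it at /dev/sda2 or a large file after filling the cache):

```shell
#!/bin/sh
# Sketch: issue random 1 MiB reads so no sequential-read heuristic can apply.
# A small scratch file stands in for the real device/file (an assumption).
FILE=$(mktemp)
dd if=/dev/zero of="$FILE" bs=1M count=64 2>/dev/null   # 64 MiB scratch file

i=0
while [ "$i" -lt 20 ]; do
    # pick a random 1 MiB-aligned offset inside the file
    off=$(( $(od -An -N2 -tu2 /dev/urandom) % 64 ))
    dd if="$FILE" of=/dev/null bs=1M count=1 skip="$off" 2>/dev/null
    i=$((i + 1))
done
rm -f "$FILE"
```

If the reads at random offsets do evict previously cached data while repeated linear reads do not, that would point to a readahead/streaming heuristic rather than plain LRU.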
Cheers,
Flo
* Re: block cache replacement strategy?
2010-09-10 10:02 ` Florian Mickler
@ 2010-09-10 16:02 ` Johannes Stezenbach
2010-09-13 15:21 ` Johannes Stezenbach
0 siblings, 1 reply; 9+ messages in thread
From: Johannes Stezenbach @ 2010-09-10 16:02 UTC (permalink / raw)
To: Florian Mickler; +Cc: linux-kernel, linux-mm, linux-fsdevel
On Fri, Sep 10, 2010 at 12:02:35PM +0200, Florian Mickler wrote:
> > On Tue, Sep 07, 2010 at 03:34:29PM +0200, Johannes Stezenbach wrote:
> > >
> > > during some simple disk read throughput testing I observed
> > > caching behaviour that doesn't seem right. The machine
> > > has 2G of RAM and AMD Athlon 4850e, x86_64 kernel but 32bit
> > > userspace, Linux 2.6.35.4. It seems that contents of the
> > > block cache are not evicted to make room for other blocks.
> > > (Or something like that, I have no real clue about this.)
> > >
> > > Since this is a rather artificial test I'm not too worried,
> > > but it looks strange to me so I thought I better report it.
>
> Well, I personally have no clue about the block caching, but perhaps
> that is a heuristic to prevent the cache from fluctuating too much?
> Some minimum time a block is held... in a big linear read the cache is
> useless most of the time anyway, so it could make some sense...
>
> You could try accessing random files after filling up the cache and
> check whether those evict the cache. That should rule out any
> linear-read-detection heuristic.
OK, here is another run with simple files (using two kvm images
I had lying around).
Note how the cache used by /dev/sda2 apparently prevents the
kvm image from being cached, but also how the cache used by test.img
prevents test2.img from being cached.
Linear read heuristic might be a good guess, but it would
be nice to hear a comment from a vm/fs expert who
confirms this works as intended.
Thanks
Johannes
zzz:~# echo 3 >/proc/sys/vm/drop_caches
zzz:~# dd if=/dev/sda2 of=/dev/null bs=1M count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 13.9516 s, 75.2 MB/s
zzz:~# dd if=/dev/sda2 of=/dev/null bs=1M count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 0.957778 s, 1.1 GB/s
zzz:~# dd if=~js/qemu/test.img of=/dev/null bs=1M count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 18.4247 s, 56.9 MB/s
zzz:~# dd if=~js/qemu/test.img of=/dev/null bs=1M count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 18.3675 s, 57.1 MB/s
zzz:~# dd if=~js/qemu/test.img of=/dev/null bs=1M count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 18.3925 s, 57.0 MB/s
zzz:~# echo 3 >/proc/sys/vm/drop_caches
zzz:~# dd if=~js/qemu/test.img of=/dev/null bs=1M count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 18.5455 s, 56.5 MB/s
zzz:~# dd if=~js/qemu/test.img of=/dev/null bs=1M count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 0.950387 s, 1.1 GB/s
zzz:~# dd if=~js/qemu/test2.img of=/dev/null bs=1M count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 18.085 s, 58.0 MB/s
zzz:~# dd if=~js/qemu/test2.img of=/dev/null bs=1M count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 17.7351 s, 59.1 MB/s
* Re: block cache replacement strategy?
2010-09-10 16:02 ` Johannes Stezenbach
@ 2010-09-13 15:21 ` Johannes Stezenbach
2010-09-13 19:09 ` dave b
0 siblings, 1 reply; 9+ messages in thread
From: Johannes Stezenbach @ 2010-09-13 15:21 UTC (permalink / raw)
To: Florian Mickler; +Cc: linux-kernel, linux-mm, linux-fsdevel
On Fri, Sep 10, 2010 at 06:02:48PM +0200, Johannes Stezenbach wrote:
>
> Linear read heuristic might be a good guess, but it would
> be nice to hear a comment from a vm/fs expert who
> confirms this works as intended.
Apparently I'm unworthy to get a response from someone knowledgeable :-(
Anyway I found lmdd (from lmbench) can do random reads,
and indeed causes the data to enter the block (page?) cache,
replacing the previous data.
Johannes
zzz:~# echo 3 >/proc/sys/vm/drop_caches
zzz:~# ./lmdd if=~js/qemu/test.img bs=1M count=1000
1000.0000 MB in 17.7554 secs, 56.3210 MB/sec
zzz:~# ./lmdd if=~js/qemu/test.img bs=1M count=1000
1000.0000 MB in 0.9112 secs, 1097.4178 MB/sec
zzz:~# ./lmdd if=~js/qemu/test2.img bs=1M count=1000 rand=1G norepeat=
norepeat on 238035072
norepeat on 724579648
1000.0000 MB in 21.4419 secs, 46.6376 MB/sec
zzz:~# ./lmdd if=~js/qemu/test2.img bs=1M count=1000 rand=1G norepeat=
norepeat on 238035072
norepeat on 724579648
1000.0000 MB in 14.3859 secs, 69.5125 MB/sec
zzz:~# ./lmdd if=~js/qemu/test2.img bs=1M count=1000 rand=1G norepeat=
norepeat on 238035072
norepeat on 724579648
1000.0000 MB in 0.8764 secs, 1141.0810 MB/sec
* Re: block cache replacement strategy?
2010-09-13 15:21 ` Johannes Stezenbach
@ 2010-09-13 19:09 ` dave b
2010-09-13 19:26 ` Johannes Stezenbach
0 siblings, 1 reply; 9+ messages in thread
From: dave b @ 2010-09-13 19:09 UTC (permalink / raw)
To: Johannes Stezenbach
Cc: Florian Mickler, linux-kernel, linux-mm, linux-fsdevel
On 14 September 2010 01:21, Johannes Stezenbach <js@sig21.net> wrote:
> On Fri, Sep 10, 2010 at 06:02:48PM +0200, Johannes Stezenbach wrote:
>>
>> Linear read heuristic might be a good guess, but it would
>> be nice to hear a comment from a vm/fs expert who
>> confirms this works as intended.
>
> Apparently I'm unworthy to get a response from someone knowledgeable :-(
>
> Anyway I found lmdd (from lmbench) can do random reads,
> and indeed causes the data to enter the block (page?) cache,
> replacing the previous data.
I am no expert, but what did you think would happen if you did dd
twice from /dev/zero?
But... honestly, what do you think will be cached?
If you want 'COW', use btrfs.
--
Conscience doth make cowards of us all. -- Shakespeare
* Re: block cache replacement strategy?
2010-09-13 19:09 ` dave b
@ 2010-09-13 19:26 ` Johannes Stezenbach
0 siblings, 0 replies; 9+ messages in thread
From: Johannes Stezenbach @ 2010-09-13 19:26 UTC (permalink / raw)
To: dave b; +Cc: Florian Mickler, linux-kernel, linux-mm, linux-fsdevel
On Tue, Sep 14, 2010 at 05:09:31AM +1000, dave b wrote:
> On 14 September 2010 01:21, Johannes Stezenbach <js@sig21.net> wrote:
> > On Fri, Sep 10, 2010 at 06:02:48PM +0200, Johannes Stezenbach wrote:
> >>
> >> Linear read heuristic might be a good guess, but it would
> >> be nice to hear a comment from a vm/fs expert who
> >> confirms this works as intended.
> >
> > Anyway I found lmdd (from lmbench) can do random reads,
> > and indeed causes the data to enter the block (page?) cache,
> > replacing the previous data.
>
> I am no expert, but what did you think would happen if you did dd
> twice from /dev/zero?
> but... Honestly what do you think will be cached?
It's not reading from /dev/zero; it reads from a file (or block device) to /dev/null.
It all started with me wanting to compare raw disk read bandwidth
vs. the read bandwidth of my root partition via dm-crypt + LVM,
and then wondering why dd from the raw disk seemed to be
cached while dd from the encrypted root partition didn't.
Johannes
* Re: block cache replacement strategy?
2010-09-07 13:34 block cache replacement strategy? Johannes Stezenbach
2010-09-09 12:00 ` Johannes Stezenbach
@ 2010-09-30 23:27 ` Jan Kara
2010-10-01 13:05 ` Johannes Stezenbach
1 sibling, 1 reply; 9+ messages in thread
From: Jan Kara @ 2010-09-30 23:27 UTC (permalink / raw)
To: Johannes Stezenbach; +Cc: linux-fsdevel, linux-mm, linux-kernel
Hi,
On Tue 07-09-10 15:34:29, Johannes Stezenbach wrote:
> during some simple disk read throughput testing I observed
> caching behaviour that doesn't seem right. The machine
> has 2G of RAM and AMD Athlon 4850e, x86_64 kernel but 32bit
> userspace, Linux 2.6.35.4. It seems that contents of the
> block cache are not evicted to make room for other blocks.
> (Or something like that, I have no real clue about this.)
>
> Since this is a rather artificial test I'm not too worried,
> but it looks strange to me so I thought I better report it.
>
>
> zzz:~# echo 3 >/proc/sys/vm/drop_caches
> zzz:~# dd if=/dev/sda2 of=/dev/null bs=1M count=1000
> 1000+0 records in
> 1000+0 records out
> 1048576000 bytes (1.0 GB) copied, 13.9454 s, 75.2 MB/s
> zzz:~# dd if=/dev/sda2 of=/dev/null bs=1M count=1000
> 1000+0 records in
> 1000+0 records out
> 1048576000 bytes (1.0 GB) copied, 0.92799 s, 1.1 GB/s
>
> OK, seems like the blocks are cached. But:
>
> zzz:~# dd if=/dev/sda2 of=/dev/null bs=1M count=1000 skip=1000
> 1000+0 records in
> 1000+0 records out
> 1048576000 bytes (1.0 GB) copied, 13.8375 s, 75.8 MB/s
> zzz:~# dd if=/dev/sda2 of=/dev/null bs=1M count=1000 skip=1000
> 1000+0 records in
> 1000+0 records out
> 1048576000 bytes (1.0 GB) copied, 13.8429 s, 75.7 MB/s
I took a look at this because it looked strange to me at first sight.
After some code reading, the result is that everything is working as
designed.
The first dd fills up memory with 1 GB of data. Pages with data just freshly
read from disk are in the "Inactive" state. When these pages are read again by
the second dd, they move into the "Active" state - caching has proved
useful and thus we value the data more. When the third dd is run, it
eventually needs to reclaim some pages to cache new data. The system
preferentially reclaims "Inactive" pages, and since it has plenty of them -
all the data the third dd has read so far - it succeeds. Thus when the
third dd finishes, only a small part of the whole 1 GB chunk is in memory,
since we continually reclaimed pages from it.
Active pages would start becoming inactive only when there were too
many of them (e.g. when there were more active pages than inactive
pages). But that does not happen with your workload... I guess this
explains it.
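This promotion from the inactive to the active list can be watched directly in /proc/meminfo. A rough sketch (the scratch file and sizes are assumptions; the exact numbers will vary, and since writing the file already populates the page cache, a real test would `echo 3 >/proc/sys/vm/drop_caches` first, which needs root):

```shell
#!/bin/sh
# Sketch: watch file pages move between the Inactive and Active LRU lists.
# Scratch file size is an assumption; the thread's tests used /dev/sda2.
FILE=$(mktemp)
dd if=/dev/zero of="$FILE" bs=1M count=128 2>/dev/null

grep -E '^(Active|Inactive)\(file\)' /proc/meminfo    # baseline
dd if="$FILE" of=/dev/null bs=1M 2>/dev/null          # first read of the data
grep -E '^(Active|Inactive)\(file\)' /proc/meminfo
dd if="$FILE" of=/dev/null bs=1M 2>/dev/null          # repeated read promotes pages to Active
grep -E '^(Active|Inactive)\(file\)' /proc/meminfo
rm -f "$FILE"
```

After the second read, Active(file) should have grown at the expense of Inactive(file), matching the meminfo snapshots quoted earlier in the thread.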
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
* Re: block cache replacement strategy?
2010-09-30 23:27 ` Jan Kara
@ 2010-10-01 13:05 ` Johannes Stezenbach
0 siblings, 0 replies; 9+ messages in thread
From: Johannes Stezenbach @ 2010-10-01 13:05 UTC (permalink / raw)
To: Jan Kara; +Cc: linux-fsdevel, linux-mm, linux-kernel
Hi,
On Fri, Oct 01, 2010 at 01:27:59AM +0200, Jan Kara wrote:
> On Tue 07-09-10 15:34:29, Johannes Stezenbach wrote:
> >
> > zzz:~# echo 3 >/proc/sys/vm/drop_caches
> > zzz:~# dd if=/dev/sda2 of=/dev/null bs=1M count=1000
> > 1000+0 records in
> > 1000+0 records out
> > 1048576000 bytes (1.0 GB) copied, 13.9454 s, 75.2 MB/s
> > zzz:~# dd if=/dev/sda2 of=/dev/null bs=1M count=1000
> > 1000+0 records in
> > 1000+0 records out
> > 1048576000 bytes (1.0 GB) copied, 0.92799 s, 1.1 GB/s
> >
> > OK, seems like the blocks are cached. But:
> >
> > zzz:~# dd if=/dev/sda2 of=/dev/null bs=1M count=1000 skip=1000
> > 1000+0 records in
> > 1000+0 records out
> > 1048576000 bytes (1.0 GB) copied, 13.8375 s, 75.8 MB/s
> > zzz:~# dd if=/dev/sda2 of=/dev/null bs=1M count=1000 skip=1000
> > 1000+0 records in
> > 1000+0 records out
> > 1048576000 bytes (1.0 GB) copied, 13.8429 s, 75.7 MB/s
> I took a look at this because it looked strange to me at first sight.
> After some code reading, the result is that everything is working as
> designed.
> The first dd fills up memory with 1 GB of data. Pages with data just freshly
> read from disk are in the "Inactive" state. When these pages are read again by
> the second dd, they move into the "Active" state - caching has proved
> useful and thus we value the data more. When the third dd is run, it
> eventually needs to reclaim some pages to cache new data. The system
> preferentially reclaims "Inactive" pages, and since it has plenty of them -
> all the data the third dd has read so far - it succeeds. Thus when the
> third dd finishes, only a small part of the whole 1 GB chunk is in memory,
> since we continually reclaimed pages from it.
> Active pages would start becoming inactive only when there were too
> many of them (e.g. when there were more active pages than inactive
> pages). But that does not happen with your workload... I guess this
> explains it.
Thank you for your comments, I see now how it works.
What you snipped from my post:
> > Even if I let 15min pass and repeat the dd command
> > several times, I cannot see any caching effects, it
> > stays at ~75 MB/s.
...
> > Active: 792720 kB
> > Inactive: 758832 kB
So with my new knowledge I tried to run dd with a smaller data set
to get new data on the Active pages list:
zzz:~# dd if=/dev/sda2 of=/dev/null bs=1M count=680 skip=1000
680+0 records in
680+0 records out
713031680 bytes (713 MB) copied, 9.8105 s, 72.7 MB/s
zzz:~# dd if=/dev/sda2 of=/dev/null bs=1M count=680 skip=1000
680+0 records in
680+0 records out
713031680 bytes (713 MB) copied, 0.676862 s, 1.1 GB/s
zzz:~# cat /proc/meminfo
MemTotal: 1793272 kB
MemFree: 15788 kB
Buffers: 1379332 kB
Cached: 14084 kB
SwapCached: 19516 kB
Active: 1493748 kB
Inactive: 45928 kB
Active(anon): 106416 kB
Inactive(anon): 42456 kB
Active(file): 1387332 kB
Inactive(file): 3472 kB
zzz:~# dd if=/dev/sda2 of=/dev/null bs=1M count=1000 skip=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 5.09198 s, 206 MB/s
zzz:~# dd if=/dev/sda2 of=/dev/null bs=1M count=1000 skip=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 1.63369 s, 642 MB/s
zzz:~# dd if=/dev/sda2 of=/dev/null bs=1M count=1000 skip=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 0.892916 s, 1.2 GB/s
Yippie!
BTW, it seems this has nothing to do with sequential reads, and my
earlier testing with lmdd was flawed since lmdd uses 1M = 1000000 bytes
but 1m = 1048576 bytes, thus my test read overlapping blocks and the
resulting data set was smaller than the number of inactive pages.
A correct test with lmdd would use
lmdd if=some_large_file_or_blockdev bs=1m count=1024 rand=5g norepeat=
lmdd if=some_large_file_or_blockdev bs=1m count=1024 rand=5g norepeat= start=5g
and it shows the same caching behaviour (on a machine with 2G RAM).
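The unit discrepancy is easy to verify with shell arithmetic (GNU dd's 1M is 1048576 bytes; the lmdd figures are those stated in the post):

```shell
# bytes transferred by "count=1000" at each block size
echo $((1000 * 1000000))   # lmdd bs=1M: 1000000000 bytes (~0.93 GiB)
echo $((1000 * 1048576))   # dd bs=1M / lmdd bs=1m: 1048576000 bytes
# shortfall of the flawed test relative to the intended 1000 MiB
echo $((1000 * 1048576 - 1000 * 1000000))   # 48576000 bytes (~46 MiB)
```

So a "1000 MB" lmdd run with bs=1M touches about 46 MiB less data than the equivalent dd run, which is why the flawed test fit alongside the inactive list instead of forcing eviction.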
Thanks
Johannes