public inbox for linux-kernel@vger.kernel.org
* Idea for improving linux buffer cache behaviour
@ 2003-10-04 15:34 David Ashley
  2003-10-04 19:14 ` Rik van Riel
  0 siblings, 1 reply; 5+ messages in thread
From: David Ashley @ 2003-10-04 15:34 UTC (permalink / raw)
  To: linux-kernel

Forgive me if this has already been thought of, or is obsolete, or is just
plain a bad idea, but here it is:

When I am doing large block device operations, such as ripping a DVD, the
cache fills up with data I don't care about (the contents of the DVD
itself). The cached data I care about is glibc, other shared libraries, and
executables like xterm that I am constantly using. After the large block
operations, all that important data is no longer in the cache, so it has to
be reloaded from disk. The machine is then annoyingly slow to use for a
while, especially whenever a new executable has to be loaded. Performance is
so much better when that data is sitting in the RAM cache.

Here's the idea: For each cache item, keep a count of how many times it has
been accessed (read). Also keep a count of how old the cache entry is. When
looking for cache data to free up to make space for a new cache entry, throw
out the data based on
1) Lowest access count looked at first to toss
2) If access counts equal, throw out oldest first

Actually you could have a "keep" rating on each cache entry. The higher the
rating the more you want to keep it in the cache. It could be this:
A * (access_count) - B * (age)
where A and B are positive numbers. Every time you go to cache something new
you increment a counter and store that in with the cache entry. The age of
the cache entry is the current value of the counter minus the cache entry's
value. A could be much larger than B.
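
As a rough illustration only (this is not kernel code), the keep rating
could be sketched in Python like this. The class name KeepRatedCache, the
helper names, and the weights A = 1000 and B = 1 are assumptions made for
the sketch; the only thing taken from the text above is the formula
A * access_count - B * age and the insertion counter used to derive age.

```python
from dataclasses import dataclass

A = 1000  # weight on access count (assumed value; "A could be much larger than B")
B = 1     # weight on age (assumed value)


@dataclass
class Entry:
    key: str
    access_count: int
    stamp: int  # value of the insertion counter when this entry was cached


class KeepRatedCache:
    """Toy cache that evicts the entry with the lowest keep rating."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.counter = 0   # incremented every time something new is cached
        self.entries = {}  # key -> Entry

    def keep_rating(self, entry):
        # Age is the current counter minus the counter value stored in the entry.
        age = self.counter - entry.stamp
        return A * entry.access_count - B * age

    def access(self, key):
        entry = self.entries.get(key)
        if entry is not None:
            entry.access_count += 1  # cache hit: bump the access count
            return
        if len(self.entries) >= self.capacity:
            # Cache full: throw out the entry with the lowest keep rating.
            victim = min(self.entries.values(), key=self.keep_rating)
            del self.entries[victim.key]
        self.counter += 1
        self.entries[key] = Entry(key, 1, self.counter)
```

With a capacity of 3, touching "glibc" and "xterm" several times and then
streaming many single-use keys through the cache leaves the two frequently
used keys in place, because each single-use key has the lowest rating when
the next one arrives.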

The net result is commonly used items you very much want to remain in cache
always quickly get rated very highly as the system is used.

I'm using 2.4.20. Maybe 2.[5|6] does much more intelligent cache handling.

-Dave
PS cc me on replies, I don't read this group normally.


* Re: Idea for improving linux buffer cache behaviour
  2003-10-04 15:34 Idea for improving linux buffer cache behaviour David Ashley
@ 2003-10-04 19:14 ` Rik van Riel
  2003-10-05  5:34   ` Mike Fedyk
  0 siblings, 1 reply; 5+ messages in thread
From: Rik van Riel @ 2003-10-04 19:14 UTC (permalink / raw)
  To: David Ashley; +Cc: linux-kernel

On Sat, 4 Oct 2003, David Ashley wrote:

> Forgive me if this has already been thought of, or is obsolete, or is
> just plain a bad idea, but here it is:

Do you also want an answer if the kernel already does
exactly what you are suggesting ? ;)

> 1) Lowest access count looked at first to toss
> 2) If access counts equal, throw out oldest first

> The net result is commonly used items you very much want to remain in
> cache always quickly get rated very highly as the system is used.

Which results in exactly the behaviour you're complaining
about ;))

-- 
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan



* Re: Idea for improving linux buffer cache behaviour
  2003-10-04 19:14 ` Rik van Riel
@ 2003-10-05  5:34   ` Mike Fedyk
  2003-10-05 17:26     ` Helge Hafting
  0 siblings, 1 reply; 5+ messages in thread
From: Mike Fedyk @ 2003-10-05  5:34 UTC (permalink / raw)
  To: Rik van Riel; +Cc: David Ashley, linux-kernel

On Sat, Oct 04, 2003 at 03:14:14PM -0400, Rik van Riel wrote:
> On Sat, 4 Oct 2003, David Ashley wrote:
> 
> > Forgive me if this has already been thought of, or is obsolete, or is
> > just plain a bad idea, but here it is:
> 
> Do you also want an answer if the kernel already does
> exactly what you are suggesting ? ;)
> 

Then why doesn't it work better?

> > 1) Lowest access count looked at first to toss
> > 2) If access counts equal, throw out oldest first
> 
> > The net result is commonly used items you very much want to remain in
> > cache always quickly get rated very highly as the system is used.
> 
> Which results in exactly the behaviour you're complaining
> about ;))

So, you use the system, have glibc loaded, and then play a dvd, and now
glibc needs to be re-read because it's not in cache.

Why wasn't glibc (one example) kept in cache with the streaming read from
the dvd?



* Re: Idea for improving linux buffer cache behaviour
  2003-10-05  5:34   ` Mike Fedyk
@ 2003-10-05 17:26     ` Helge Hafting
  2003-10-05 17:56       ` CJ
  0 siblings, 1 reply; 5+ messages in thread
From: Helge Hafting @ 2003-10-05 17:26 UTC (permalink / raw)
  To: Rik van Riel, David Ashley, linux-kernel

On Sat, Oct 04, 2003 at 10:34:58PM -0700, Mike Fedyk wrote:
> On Sat, Oct 04, 2003 at 03:14:14PM -0400, Rik van Riel wrote:
> > On Sat, 4 Oct 2003, David Ashley wrote:
> > 
> > > Forgive me if this has already been thought of, or is obsolete, or is
> > > just plain a bad idea, but here it is:
> > 
> > Do you also want an answer if the kernel already does
> > exactly what you are suggesting ? ;)
> > 
> 
> Then why doesn't it work better?
> 
> > > 1) Lowest access count looked at first to toss
> > > 2) If access counts equal, throw out oldest first
> > 
> > > The net result is commonly used items you very much want to remain in
> > > cache always quickly get rated very highly as the system is used.
> > 
> > Which results in exactly the behaviour you're complaining
> > about ;))
> 
> So, you use the system, have glibc loaded, and then play a dvd, and now
> glibc needs to be re-read because it's not in cache.
> 
> Why wasn't glibc (one example) kept in cache with the streaming read from
> the dvd?

There may be many reasons here. Take a look at how many times the
DVD contents were used; you may get a surprise there.
The number ought to be 1, right?  But the burner program may read
smaller chunks or something, causing many references to the same block.
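
A small Python sketch of that effect, under simplifying assumptions: every
read() call counts as one reference to each cache block it touches, and the
4096-byte block size and 512-byte chunk size are made-up example values.
The function name block_reference_counts is hypothetical, and this is not a
claim about how the kernel actually accounts for references.

```python
from collections import Counter


def block_reference_counts(total_bytes, chunk_size, block_size):
    """Count how many sequential reads touch each cache block."""
    counts = Counter()
    for offset in range(0, total_bytes, chunk_size):
        end = min(offset + chunk_size, total_bytes)
        first_block = offset // block_size
        last_block = (end - 1) // block_size
        # A single read may span more than one block; count each block once.
        for block in range(first_block, last_block + 1):
            counts[block] += 1
    return counts


# Reading two 4096-byte blocks in 512-byte chunks references each block
# eight times, even though the data is only streamed through once.
small_chunks = block_reference_counts(8192, 512, 4096)
whole_blocks = block_reference_counts(8192, 4096, 4096)
```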

Also, the number-of-references approach has its own problems.
Something that is used a lot for a while will stay in cache for
a long while when no longer used, taking up space.  That can be
a problem too - e.g. run some large simulation which fills up
memory for a while, and nothing else stays in cache afterwards.
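
To make that concrete, here is a toy Python model of a cache that evicts
purely by lowest access count, with the oldest entry losing ties. The class
name CountOnlyCache, the capacity of 4, and the key names are all invented
for the example; it is a sketch of the policy being discussed, not of any
kernel code.

```python
class CountOnlyCache:
    """Evict the lowest access count first; on a tie, evict the oldest."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.clock = 0
        self.entries = {}  # key -> [access_count, insertion_stamp]

    def access(self, key):
        """Return True on a cache hit, False on a miss."""
        if key in self.entries:
            self.entries[key][0] += 1
            return True
        if len(self.entries) >= self.capacity:
            # Smallest (count, stamp) pair: lowest count, then oldest entry.
            victim = min(self.entries, key=lambda k: tuple(self.entries[k]))
            del self.entries[victim]
        self.clock += 1
        self.entries[key] = [1, self.clock]
        return False


cache = CountOnlyCache(capacity=4)

# Phase 1: a "simulation" hammers three pages, then stops using them.
for _ in range(50):
    for key in ("sim0", "sim1", "sim2"):
        cache.access(key)

# Phase 2: normal use cycles over three other pages.
hits = 0
for _ in range(10):
    for key in ("app0", "app1", "app2"):
        hits += cache.access(key)
```

In this run the three stale simulation pages keep their high counts and are
never evicted, so the three application pages fight over the one remaining
slot and every access in phase 2 is a miss.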

Helge Hafting


* Re: Idea for improving linux buffer cache behaviour
  2003-10-05 17:26     ` Helge Hafting
@ 2003-10-05 17:56       ` CJ
  0 siblings, 0 replies; 5+ messages in thread
From: CJ @ 2003-10-05 17:56 UTC (permalink / raw)
  To: linux-kernel

1. One problem is that O_DIRECT is broken and/or confused with
    file systems.  There is a misguided micro-optimization
    that requires page alignment as well as sector alignment
    and sector-multiple sizes.  Even if broken DMA or controllers
    require these, O_DIRECT need not.  O_DIRECT is about the cache.

2. Even when O_DIRECT requires a bounce buffer, it need
    not wipe memory; it could easily confine itself to 1-4
    buffers and even support read-ahead.  Then DVDs could
    be mounted O_DIRECT by default.

3. Buffer management has become a denial of service (DoS) on
    Linux, leaving disk-bound programs with the disk light off
    for ten seconds at a crack.  Writing is worst of all.
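
As a user-space analogy for point 2 only, the following Python sketch
streams data through a small, fixed pool of reusable buffers, so memory use
stays bounded no matter how large the source is. It does not use O_DIRECT
and does not bypass the page cache; the function name
copy_with_fixed_buffers and the default sizes are assumptions chosen for
illustration.

```python
def copy_with_fixed_buffers(src, dst, buffer_size=64 * 1024, buffer_count=2):
    """Copy src to dst using only buffer_count preallocated buffers."""
    buffers = [bytearray(buffer_size) for _ in range(buffer_count)]
    total = 0
    index = 0
    while True:
        buf = buffers[index]
        n = src.readinto(buf)  # fill the reused buffer in place
        if not n:              # 0 (or None) means nothing more to read
            break
        dst.write(memoryview(buf)[:n])  # write only the bytes just read
        total += n
        index = (index + 1) % buffer_count  # rotate through the pool
    return total
```

Any binary file object that supports readinto() works as src, for example a
file opened with open(path, "rb") or an io.BytesIO.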







Thread overview: 5+ messages
2003-10-04 15:34 Idea for improving linux buffer cache behaviour David Ashley
2003-10-04 19:14 ` Rik van Riel
2003-10-05  5:34   ` Mike Fedyk
2003-10-05 17:26     ` Helge Hafting
2003-10-05 17:56       ` CJ
