public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* replacing the page replacement algo.
@ 2001-11-19  0:17 Shaya Potter
  2001-11-19  1:44 ` Rik van Riel
  0 siblings, 1 reply; 8+ messages in thread
From: Shaya Potter @ 2001-11-19  0:17 UTC (permalink / raw)
  To: linux-kernel

If I wanted to experiment with different algorithms that chose which
page to replace (say on a page fault) what functions would I have to
replace?  or in other words, is there any good place that explains the
code path of how pages are chosen to be swapped out.  For example, by
scheduling, its easy to replace the scheduler because all you need to
deal with is schedule() and possibly add_to/del_from runqueue, with
schedule() being the important function, is there an equivalent function
in 2.4 for choosing pages to swap out?

thanks,

shaya potter

-- 
spotter@{cs.columbia.edu,yucs.org}


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: replacing the page replacement algo.
  2001-11-19  0:17 replacing the page replacement algo Shaya Potter
@ 2001-11-19  1:44 ` Rik van Riel
  2001-11-19  2:31   ` Shaya Potter
  0 siblings, 1 reply; 8+ messages in thread
From: Rik van Riel @ 2001-11-19  1:44 UTC (permalink / raw)
  To: Shaya Potter; +Cc: linux-kernel

On 18 Nov 2001, Shaya Potter wrote:

> If I wanted to experiment with different algorithms that chose which
> page to replace (say on a page fault) what functions would I have to
> replace?

try_to_free_pages() and all the functions it calls.

Rik
-- 
Shortwave goes a long way:  irc.starchat.net  #swl

http://www.surriel.com/		http://distro.conectiva.com/


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: replacing the page replacement algo.
  2001-11-19  1:44 ` Rik van Riel
@ 2001-11-19  2:31   ` Shaya Potter
  2001-11-19  2:38     ` Rik van Riel
  0 siblings, 1 reply; 8+ messages in thread
From: Shaya Potter @ 2001-11-19  2:31 UTC (permalink / raw)
  To: Rik van Riel; +Cc: Shaya Potter, linux-kernel

On Sun, 2001-11-18 at 20:44, Rik van Riel wrote:
> On 18 Nov 2001, Shaya Potter wrote:
> 
> > If I wanted to experiment with different algorithms that chose which
> > page to replace (say on a page fault) what functions would I have to
> > replace?
> 
> try_to_free_pages() and all the functions it calls.

I was looking at vmscan.c and it appears that swap_out() is what I
want.  If instead of having it step through the mmlist, I give it the
explicit mm of the processes that I want a page swapped out from? so I
could implement my algorithm either inside that func or as function
calls from it and have it pass onto swap_out_mm() the mm of the
processes I choose to swap out.

or am I totally misunderstanding something here? (likely, as this is my
first time digging into the vm and trying to learn about it)

thanks,

shaya potter


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: replacing the page replacement algo.
  2001-11-19  2:31   ` Shaya Potter
@ 2001-11-19  2:38     ` Rik van Riel
  2001-11-19  2:51       ` Shaya Potter
  0 siblings, 1 reply; 8+ messages in thread
From: Rik van Riel @ 2001-11-19  2:38 UTC (permalink / raw)
  To: Shaya Potter; +Cc: Shaya Potter, linux-kernel

On 18 Nov 2001, Shaya Potter wrote:
> On Sun, 2001-11-18 at 20:44, Rik van Riel wrote:
> > On 18 Nov 2001, Shaya Potter wrote:
> >
> > > If I wanted to experiment with different algorithms that chose which
> > > page to replace (say on a page fault) what functions would I have to
> > > replace?
> >
> > try_to_free_pages() and all the functions it calls.
>
> I was looking at vmscan.c and it appears that swap_out() is what I
> want.

You're missing the fact here that swap_out() only unmaps
pages from processes' page tables, actual reclaiming of
the pages is done elsewhere.

Rik
-- 
Shortwave goes a long way:  irc.starchat.net  #swl

http://www.surriel.com/		http://distro.conectiva.com/


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: replacing the page replacement algo.
  2001-11-19  2:38     ` Rik van Riel
@ 2001-11-19  2:51       ` Shaya Potter
  2001-11-20  3:49         ` Eric W. Biederman
  2001-12-28  8:23         ` Daniel Phillips
  0 siblings, 2 replies; 8+ messages in thread
From: Shaya Potter @ 2001-11-19  2:51 UTC (permalink / raw)
  To: Rik van Riel; +Cc: linux-kernel

On Sun, 2001-11-18 at 21:38, Rik van Riel wrote:
> On 18 Nov 2001, Shaya Potter wrote:
> > On Sun, 2001-11-18 at 20:44, Rik van Riel wrote:
> > > On 18 Nov 2001, Shaya Potter wrote:
> > >
> > > > If I wanted to experiment with different algorithms that chose which
> > > > page to replace (say on a page fault) what functions would I have to
> > > > replace?
> > >
> > > try_to_free_pages() and all the functions it calls.
> >
> > I was looking at vmscan.c and it appears that swap_out() is what I
> > want.
> 
> You're missing the fact here that swap_out() only unmaps
> pages from processes' page tables, actual reclaiming of
> the pages is done elsewhere.

ok, but if what I'm interested in playing with right now is playing
around with which pages get swapped out, and not with the actual
reclamation procedure, is it ok to just play with swap_out and having it
do the thing it does, and let the rest of the kernel behave as is, or
will this cause problems?

thanks again,

shaya potter


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: replacing the page replacement algo.
  2001-11-19  2:51       ` Shaya Potter
@ 2001-11-20  3:49         ` Eric W. Biederman
  2001-12-28  8:23         ` Daniel Phillips
  1 sibling, 0 replies; 8+ messages in thread
From: Eric W. Biederman @ 2001-11-20  3:49 UTC (permalink / raw)
  To: Shaya Potter; +Cc: Rik van Riel, linux-kernel

Shaya Potter <spotter@cs.columbia.edu> writes:

> ok, but if what I'm interested in playing with right now is playing
> around with which pages get swapped out, and not with the actual
> reclamation procedure, is it ok to just play with swap_out and having it
> do the thing it does, and let the rest of the kernel behave as is, or
> will this cause problems?

The name of swap_out is historical.  It currently does no writing to
disk it just removes the page from the page tables.  But the page is
still in RAM.  The other routines are what decide which page should
actually be reused, or which pages get written.

The current code is both simple, auto-tuning and does a fairly good
job so look very closely at it before trying to randomly tune it. 

Eric

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: replacing the page replacement algo.
  2001-11-19  2:51       ` Shaya Potter
  2001-11-20  3:49         ` Eric W. Biederman
@ 2001-12-28  8:23         ` Daniel Phillips
  2001-12-28 12:22           ` Rik van Riel
  1 sibling, 1 reply; 8+ messages in thread
From: Daniel Phillips @ 2001-12-28  8:23 UTC (permalink / raw)
  To: Shaya Potter, Rik van Riel; +Cc: linux-kernel

On November 19, 2001 03:51 am, Shaya Potter wrote:
> ok, but if what I'm interested in playing with right now is playing
> around with which pages get swapped out, and not with the actual
> reclamation procedure, is it ok to just play with swap_out and having it
> do the thing it does, and let the rest of the kernel behave as is, or
> will this cause problems?

No, it's quite a bit more complex than you imagine.  I'll do a *very quick* 
trip through it.

Swap_out and shrink_caches work together to decide which pages to evict.  The 
interaction is quite subtle.  Basically, swap_out scans through virtual 
memory doing one of two things to each page:

  - If the page was referenced, set the reference bit in the struct page
  - Otherwise unmap the page so that it can eventually be evicted

The rest of the work is done by shrink_caches.  It is concerned with two 
lists: an 'active' LRU list and a 'inactive' FIFO list, each of which is a 
list of struct page, i.e., descriptors of physical pages.  It also has to 
worry about some caches that aren't simple, replaceable pages.  All the 
following things are being done, more or less at the same time:

  - Move referenced pages from the tail of the active list to the head of the 
    active list (implements 'LRU' policy)

  - Move unreferenced pages from the tail of the active list to the head of 
    the inactive list (queue for eviction)

  - Move referenced pages from the tail of the inactive list to the head of
    the inactive list

  - Schedule dirty pages at the tail of the inactive list for writeout

  - Recover clean pages from the tail of the inactive list

  - Explicitly shrink the dcache, icache and dqcache as necessary by
    evicting objects from those caches in the hope of recovering pages
    from them.

You can think of the whole process as being roughly divided into two parts: 
virtual scanning (swap_out) and physical scanning (refill_inactive, 
shrink_cache).  The reason for dividing it this way is simple: the only way 
we can associate the hardware page-referenced bit with a physical page is by 
scanning all the page tables (the virtual scan).  I.e, there is no way to 
find that hardware bit, starting from a struct page, and we do need the 
information in that bit to decide which process pages should be evicted.

Some non-scanning events can occur that affect the page replacement process:

  - A page may be explicitly touched by file IO, which is another way for
    a page to move from the inactive to active list.

  - A process page that has been unmapped may be faulted back in before
    shrink_caches gets around to evicting it

These are two different kinds of 'rescue'.  The efficient operation of the 
replacement policy relies on such rescuing.

I hope you can see now that tweaking the page replacement policy is not a 
simple matter of playing with any single function.  All the parts I described 
work together in a complex dance.  I glossed over many important details as 
well, particularly the fact that you can't consider page eviction in 
isolation from page allocation.  I didn't even touch on zone balancing.

Good luck, have fun reading the code.  If you're ready to do serious work on 
the page replacement policy in less than 6 months you'll be doing very well.
You might want to head on over to www.kernelnewbies.org and have a read 
through some of the excellent background material there.

--
Daniel

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: replacing the page replacement algo.
  2001-12-28  8:23         ` Daniel Phillips
@ 2001-12-28 12:22           ` Rik van Riel
  0 siblings, 0 replies; 8+ messages in thread
From: Rik van Riel @ 2001-12-28 12:22 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: Shaya Potter, linux-kernel

On Fri, 28 Dec 2001, Daniel Phillips wrote:
> On November 19, 2001 03:51 am, Shaya Potter wrote:
> > ok, but if what I'm interested in playing with right now is playing
> > around with which pages get swapped out, and not with the actual
> > reclamation procedure, is it ok to just play with swap_out and having it
> > do the thing it does, and let the rest of the kernel behave as is, or
> > will this cause problems?
>
> No, it's quite a bit more complex than you imagine.  I'll do a *very
> quick* trip through it.

[snip complex interaction in standard kernel]

Shaya, if you want a VM where you can easily change the page
replacement algorithm, you probably want to work on top of my
'rmap' patch. The VM in that patch uses reverse mappings to
determine which process uses a page, so you could put normal
clock, lru, ... algorithms on top of it.

Things like swap_out() are completely gone in my patch, it's
just one selection based on physical page.

regards,

Rik
-- 
Shortwave goes a long way:  irc.starchat.net  #swl

http://www.surriel.com/		http://distro.conectiva.com/


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2001-12-28 12:22 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2001-11-19  0:17 replacing the page replacement algo Shaya Potter
2001-11-19  1:44 ` Rik van Riel
2001-11-19  2:31   ` Shaya Potter
2001-11-19  2:38     ` Rik van Riel
2001-11-19  2:51       ` Shaya Potter
2001-11-20  3:49         ` Eric W. Biederman
2001-12-28  8:23         ` Daniel Phillips
2001-12-28 12:22           ` Rik van Riel

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox