From: Nikita Danilov <nikita@clusterfs.com>
To: Marcelo Tosatti <marcelo.tosatti@cyclades.com>
Cc: Andrew Morton <akpm@osdl.org>,
	76306.1226@compuserve.com, linux-kernel@vger.kernel.org,
	nickpiggin@yahoo.com.au
Subject: Re: balance_pgdat(): where is total_scanned ever updated?
Date: Wed, 10 Nov 2004 16:24:45 +0300
Message-ID: <16786.5789.465433.655127@thebsh.namesys.com>
In-Reply-To: <20041109185221.GA8414@logos.cnet>

Marcelo Tosatti writes:

[...]

 > 
 > Another related thing I noted this afternoon is that right now kswapd will
 > always block on full queues:
 > 
 > static int may_write_to_queue(struct backing_dev_info *bdi)
 > {
 >         if (current_is_kswapd())
 >                 return 1;
 >         if (current_is_pdflush())       /* This is unlikely, but why not... */
 >                 return 1;
 >         if (!bdi_write_congested(bdi))
 >                 return 1;
 >         if (bdi == current->backing_dev_info)
 >                 return 1;
 >         return 0;
 > }
 > 
 > We should make kswapd use the "bdi_write_congested" information and avoid
 > blocking on full queues. It should improve performance on multi-device 
 > systems with intense VM loads.

This will have the following undesirable side effect: if
may_write_to_queue() returns false, the page is not paged out. Instead it
is moved back to the head of the inactive list, which destroys "LRU
ordering": shrink_list() will dive deeper into the inactive list and
reclaim hotter pages.
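
To make the effect concrete, here is a toy userspace model, not kernel
code. The struct, the function name toy_scan_depth() and the flat array
standing in for the inactive list are all made up for illustration; the
real shrink_list() is considerably more involved. Index 0 is the coldest
page. The function reports how deep the scan has to go to reclaim a given
number of pages, with and without skipping dirty pages on congested queues.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy page descriptor; index 0 of the array is the coldest page. */
struct toy_page {
	bool dirty;      /* must be written out before it can be freed */
	bool congested;  /* its backing device queue is currently full */
};

/*
 * Scan from the cold end until `want` pages have been reclaimed.
 * Returns the index of the hottest page reclaimed, or -1 if nothing
 * could be reclaimed.
 *
 * With skip_congested == true, a dirty page on a congested queue is
 * skipped (in the kernel it would go back to the head of the inactive
 * list). With skip_congested == false, the scanner is assumed to block
 * on the queue and reclaim the page in order.
 */
static int toy_scan_depth(const struct toy_page *list, size_t n,
			  size_t want, bool skip_congested)
{
	int deepest = -1;
	size_t reclaimed = 0;

	assert(list != NULL || n == 0);

	for (size_t i = 0; i < n && reclaimed < want; i++) {
		if (skip_congested && list[i].dirty && list[i].congested)
			continue;	/* skipped: LRU order is lost here */
		reclaimed++;
		deepest = (int)i;
	}
	return deepest;
}
```

With the four coldest pages dirty and congested and two pages wanted, the
blocking variant stops at index 1, while the skipping variant has to go to
index 5, i.e. it frees pages that were hotter than the ones it passed over.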

It's OK to accidentally skip pageout in the direct reclaim path, because

 - we hope most pageout is done by kswapd, and

 - we don't want __alloc_pages() to stall

but _something_ in the kernel should take the pain of actually writing
pages out in LRU order.

 > 
 > Maybe something along the lines 
 > 
 > "if the reclaim ratio is high, do not writepage"
 > "if the reclaim ratio is below high, writepage but not block"
 > "if the reclaim ratio is low, writepage and block"

If kswapd blocking is a concern, inactive list scanning should be
decoupled from the actual page-out (a la Solaris): kswapd queues pages to
yet another kernel thread that calls pageout().
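
A minimal sketch of that hand-off, again as a userspace toy rather than
kernel code: the scanner side puts page ids into a bounded FIFO without
ever blocking, and a separate writer drains it in the order the pages were
queued. The names (toy_pageout_queue, toy_queue_page(), toy_drain_pages())
and the fixed queue size are assumptions for illustration only; locking
and the actual threads are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_QUEUE_SLOTS 4	/* arbitrary bound, for illustration */

/* Bounded FIFO of page ids waiting to be written out. */
struct toy_pageout_queue {
	int pages[TOY_QUEUE_SLOTS];
	size_t head;	/* next slot the writer will consume */
	size_t count;	/* number of pages currently queued */
};

/*
 * Scanner side: hand a page to the writer without blocking.
 * Returns false if the queue is full, in which case the caller has to
 * decide what to do with the page (the hard part in a real design).
 */
static bool toy_queue_page(struct toy_pageout_queue *q, int page_id)
{
	assert(q != NULL);

	if (q->count == TOY_QUEUE_SLOTS)
		return false;
	q->pages[(q->head + q->count) % TOY_QUEUE_SLOTS] = page_id;
	q->count++;
	return true;
}

/*
 * Writer side: take up to `max` pages in FIFO order, i.e. in the order
 * the scanner met them. A real writer thread would call pageout() on
 * each one and could block on the device queue.
 */
static size_t toy_drain_pages(struct toy_pageout_queue *q, int *out,
			      size_t max)
{
	size_t n = 0;

	assert(q != NULL && out != NULL);

	while (q->count > 0 && n < max) {
		out[n++] = q->pages[q->head];
		q->head = (q->head + 1) % TOY_QUEUE_SLOTS;
		q->count--;
	}
	return n;
}
```

Only the writer ever waits for I/O here; the scanner either queues the
page or learns immediately that the queue is full.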

I played with this idea (see
http://nikita.w3.to/code/patches/2-6-10-rc1/async-writepage.txt; note
that async_writepage() has to be adjusted to work for kswapd), but while
in some cases (large concurrent builds) it does provide a benefit, in
other cases (heavy writes through mmap) it makes throughput slightly
worse.

Besides, this doesn't completely avoid the problem of destroying LRU
ordering, as kswapd still proceeds further through the inactive list while
pages are sent out asynchronously.

Nikita.

