From: Mel Gorman <mel@csn.ul.ie>
To: Nick Piggin <npiggin@suse.de>
Cc: linux-mm@kvack.org,
	Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>,
	Chris Mason <chris.mason@oracle.com>,
	Jens Axboe <jens.axboe@oracle.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 3/3] vmscan: Put kswapd to sleep on its own waitqueue, not congestion
Date: Tue, 9 Mar 2010 10:21:46 +0000
Message-ID: <20100309102145.GB4883@csn.ul.ie>
In-Reply-To: <20100309100044.GE8653@laptop>

On Tue, Mar 09, 2010 at 09:00:44PM +1100, Nick Piggin wrote:
> On Mon, Mar 08, 2010 at 11:48:23AM +0000, Mel Gorman wrote:
> > If kswapd is raising its priority to get the zone over the high
> > watermark, it may call congestion_wait() ostensibly to allow congestion
> > to clear. However, there is no guarantee that the queue is congested at
> > this point because it depends on kswapds previous actions as well as the
> > rest of the system. Kswapd could simply be working hard because there is
> > a lot of SYNC traffic in which case it shouldn't be sleeping.
> > 
> > Rather than waiting on congestion and potentially sleeping for longer
> > than it should, this patch puts kswapd back to sleep on the kswapd_wait
> > queue for the timeout. If direct reclaimers are in trouble, kswapd will
> > be rewoken as it should instead of sleeping when there is work to be
> > done.
> 
> Well but it is quite possible that many allocators are coming in to
> wake it up. So with your patch, I think we'd need to consider the case
> where the timeout approaches 0 here (if it's always being woken).
> 

True. This is similar to how zonepressure_wait() rechecks the watermarks if
there is still some timeout left and then decides whether to sleep again or
not.
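
As a rough illustration of that recheck, here is a userspace simulation in
plain C. It is not kernel code: struct sim, sim_schedule_timeout(),
sim_under_pressure() and sleep_with_recheck() are hypothetical names made up
for this sketch, and time is just a tick counter. The only behaviour borrowed
from the kernel is that a timed sleep reports how much of the timeout was
left when it was woken early.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical simulation state; none of these names exist in the kernel. */
struct sim {
	long now;            /* current time in ticks */
	const long *wakeups; /* sorted absolute times at which an allocator wakes us */
	int nr_wakeups;
	int next;            /* index of the next pending wakeup */
	long pressure_at;    /* tick from which a zone is below its watermark, -1 = never */
};

/* Sleep for up to 'timeout' ticks; return the ticks left if woken early, else 0. */
static long sim_schedule_timeout(struct sim *s, long timeout)
{
	long deadline = s->now + timeout;

	/* Wakeups that are not strictly in the future were already consumed. */
	while (s->next < s->nr_wakeups && s->wakeups[s->next] <= s->now)
		s->next++;

	if (s->next < s->nr_wakeups && s->wakeups[s->next] < deadline) {
		s->now = s->wakeups[s->next++];
		return deadline - s->now;
	}
	s->now = deadline;
	return 0;
}

/* Stand-in for a watermark check: is there real reclaim work to do right now? */
static bool sim_under_pressure(const struct sim *s)
{
	return s->pressure_at >= 0 && s->now >= s->pressure_at;
}

/*
 * Sleep for 'timeout' ticks in total. An early wakeup only ends the sleep
 * if the watermark check says there is work; otherwise go back to sleep
 * for whatever is left of the timeout. Returns the ticks actually slept.
 */
static long sleep_with_recheck(struct sim *s, long timeout)
{
	long start = s->now;
	long remaining = timeout;

	assert(timeout >= 0);
	while (remaining > 0) {
		remaining = sim_schedule_timeout(s, remaining);
		if (remaining && sim_under_pressure(s))
			break; /* genuinely needed: stay awake */
	}
	return s->now - start;
}
```

With wakeups at ticks 10, 20 and 30 and a 100-tick timeout,
sleep_with_recheck() sleeps the full 100 ticks when there is never any
pressure, and 20 ticks when pressure starts at tick 15. Without the recheck,
the first wakeup alone would cut the sleep to 10 ticks in both cases, which
is the "timeout approaches 0" concern.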

> Direct reclaimers need not be involved because the pages might be
> hovering around the asynchronous reclaim watermarks (which would be
> the ideal case of system operation).
> 
> In which case, can you explain how this change makes sense? Why is
> it a good thing not to wait when we previously did wait?
> 

Well, it makes sense from the perspective that it's better for kswapd to be
doing the work than direct reclaim. If processes are hitting the watermarks,
then why should kswapd be asleep?

That said, if the remaining timeout is non-zero, kswapd should be able to
make some decision on whether it really should be awake. Putting the page
allocator and kswapd patches into the same series was a mistake because it
conflates two different problems. I'm going to drop this one for the moment
and treat the page allocator patch in isolation.

> > 
> > Signed-off-by: Mel Gorman <mel@csn.ul.ie>
> > ---
> >  mm/vmscan.c |   11 +++++++----
> >  1 files changed, 7 insertions(+), 4 deletions(-)
> > 
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index 4f92a48..894d366 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -1955,7 +1955,7 @@ static int sleeping_prematurely(pg_data_t *pgdat, int order, long remaining)
> >   * interoperates with the page allocator fallback scheme to ensure that aging
> >   * of pages is balanced across the zones.
> >   */
> > -static unsigned long balance_pgdat(pg_data_t *pgdat, int order)
> > +static unsigned long balance_pgdat(pg_data_t *pgdat, wait_queue_t *wait, int order)
> >  {
> >  	int all_zones_ok;
> >  	int priority;
> > @@ -2122,8 +2122,11 @@ loop_again:
> >  		if (total_scanned && (priority < DEF_PRIORITY - 2)) {
> >  			if (has_under_min_watermark_zone)
> >  				count_vm_event(KSWAPD_SKIP_CONGESTION_WAIT);
> > -			else
> > -				congestion_wait(BLK_RW_ASYNC, HZ/10);
> > +			else {
> > +				prepare_to_wait(&pgdat->kswapd_wait, wait, TASK_INTERRUPTIBLE);
> > +				schedule_timeout(HZ/10);
> > +				finish_wait(&pgdat->kswapd_wait, wait);
> > +			}
> >  		}
> >  
> >  		/*
> > @@ -2272,7 +2275,7 @@ static int kswapd(void *p)
> >  		 * after returning from the refrigerator
> >  		 */
> >  		if (!ret)
> > -			balance_pgdat(pgdat, order);
> > +			balance_pgdat(pgdat, &wait, order);
> >  	}
> >  	return 0;
> >  }
> > -- 
> > 1.6.5
> 

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

