From: Andrew Morton <akpm@osdl.org>
To: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: chrisw@osdl.org, linux-kernel@vger.kernel.org, piggin@cyberone.com.au
Subject: Re: kswapd in tight loop 2.6.9-rc3-bk-recent
Date: Thu, 7 Oct 2004 17:42:42 -0700
Message-ID: <20041007174242.3dd6facd.akpm@osdl.org>
In-Reply-To: <4165E0A7.7080305@yahoo.com.au>


OK, after backing out the `goto spaghetti;' patch and cleaning up a few
things I'll test the below.  It'll make kswapd much less aggressive.



diff -puN mm/vmscan.c~no-wild-kswapd-2 mm/vmscan.c
--- 25/mm/vmscan.c~no-wild-kswapd-2	2004-10-07 17:38:20.342906376 -0700
+++ 25-akpm/mm/vmscan.c	2004-10-07 17:38:20.348905464 -0700
@@ -964,6 +964,17 @@ out:
  * of the number of free pages in the lower zones.  This interoperates with
  * the page allocator fallback scheme to ensure that aging of pages is balanced
  * across the zones.
+ *
+ * kswapd can be semi-livelocked if some other process is allocating pages
+ * while kswapd is simultaneously trying to balance the same zone.  That's OK,
+ * because we _want_ kswapd to work continuously in this situation.  But a
+ * side-effect of kswapd's ongoing work is that the pageout priority keeps on
+ * winding up so we bogusly start doing swapout.
+ *
+ * To fix this we take a snapshot of the number of pages which need to be
+ * reclaimed from each zone in zone->pages_to_reclaim and never reclaim more
+ * pages than that.  Once the required number of pages have been reclaimed from
+ * each zone, we're done.  kswapd will go back to sleep until someone wakes it.
  */
 static int balance_pgdat(pg_data_t *pgdat, int nr_pages)
 {
@@ -984,6 +995,7 @@ static int balance_pgdat(pg_data_t *pgda
 		struct zone *zone = pgdat->node_zones + i;
 
 		zone->temp_priority = DEF_PRIORITY;
+		zone->pages_to_reclaim = zone->pages_high - zone->free_pages;
 	}
 
 	for (priority = DEF_PRIORITY; priority >= 0; priority--) {
@@ -1003,7 +1015,7 @@ static int balance_pgdat(pg_data_t *pgda
 						priority != DEF_PRIORITY)
 					continue;
 
-				if (zone->free_pages <= zone->pages_high) {
+				if (zone->pages_to_reclaim > 0) {
 					end_zone = i;
 					break;
 				}
@@ -1036,10 +1048,11 @@ static int balance_pgdat(pg_data_t *pgda
 			if (zone->all_unreclaimable && priority != DEF_PRIORITY)
 				continue;
 
-			if (nr_pages == 0) {	/* Not software suspend */
-				if (zone->free_pages <= zone->pages_high)
-					all_zones_ok = 0;
-			}
+			if (zone->pages_to_reclaim <= 0)
+				continue;
+
+			if (nr_pages == 0)	/* Not software suspend */
+				all_zones_ok = 0;
 			zone->temp_priority = priority;
 			if (zone->prev_priority > priority)
 				zone->prev_priority = priority;
@@ -1049,6 +1062,10 @@ static int balance_pgdat(pg_data_t *pgda
 			shrink_zone(zone, &sc);
 			reclaim_state->reclaimed_slab = 0;
 			shrink_slab(sc.nr_scanned, GFP_KERNEL, lru_pages);
+
+			/* This fails to account for slab reclaim */
+			zone->pages_to_reclaim -= sc.nr_reclaimed;
+
 			sc.nr_reclaimed += reclaim_state->reclaimed_slab;
 			total_reclaimed += sc.nr_reclaimed;
 			total_scanned += sc.nr_scanned;
diff -puN include/linux/mmzone.h~no-wild-kswapd-2 include/linux/mmzone.h
--- 25/include/linux/mmzone.h~no-wild-kswapd-2	2004-10-07 17:38:20.343906224 -0700
+++ 25-akpm/include/linux/mmzone.h	2004-10-07 17:40:20.847586880 -0700
@@ -137,8 +137,9 @@ struct zone {
 	unsigned long		nr_scan_inactive;
 	unsigned long		nr_active;
 	unsigned long		nr_inactive;
-	int			all_unreclaimable; /* All pages pinned */
-	unsigned long		pages_scanned;	   /* since last reclaim */
+	long			pages_to_reclaim;	/* kswapd usage */
+	int			all_unreclaimable;	/* All pages pinned */
+	unsigned long		pages_scanned;		/* since last reclaim */
 
 	ZONE_PADDING(_pad2_)
 
_
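
For anyone who wants to poke at the idea outside the kernel, here is a
rough userspace sketch of the budget scheme the patch describes.  It is
not the kernel code: struct zone_sketch, balance_sketch() and
DEF_PRIORITY_SKETCH are hypothetical names made up for illustration, the
value 12 is an assumption, and "reclaim" is faked by adding a fixed number
of pages per pass while a pretend allocator takes alloc_per_pass pages
straight back out.  The budget is snapshotted once, so the loop stops when
the budget is spent even though the zone never reaches pages_high.

```c
#include <stddef.h>

/* Hypothetical stand-in for the few struct zone fields the patch touches. */
struct zone_sketch {
	unsigned long free_pages;	/* pages currently free */
	unsigned long pages_high;	/* high watermark */
	long pages_to_reclaim;		/* signed: zero or negative means "done" */
};

#define DEF_PRIORITY_SKETCH 12		/* assumed starting priority */

/* Snapshot each zone's reclaim budget once, before the priority loop. */
static void snapshot_budgets(struct zone_sketch *zones, size_t nr)
{
	for (size_t i = 0; i < nr; i++)
		zones[i].pages_to_reclaim =
			(long)zones[i].pages_high - (long)zones[i].free_pages;
}

/*
 * Simulated balance loop.  On each pass every zone with budget left gains
 * reclaim_per_pass free pages and is charged the same amount against its
 * budget; a competing allocator then removes up to alloc_per_pass pages.
 * Returns the number of passes that did any work.
 */
static int balance_sketch(struct zone_sketch *zones, size_t nr,
			  unsigned long reclaim_per_pass,
			  unsigned long alloc_per_pass)
{
	int passes = 0;

	snapshot_budgets(zones, nr);

	for (int priority = DEF_PRIORITY_SKETCH; priority >= 0; priority--) {
		int all_zones_ok = 1;

		for (size_t i = 0; i < nr; i++) {
			struct zone_sketch *z = &zones[i];
			unsigned long taken;

			if (z->pages_to_reclaim <= 0)
				continue;	/* budget spent: skip this zone */

			all_zones_ok = 0;
			z->free_pages += reclaim_per_pass;
			z->pages_to_reclaim -= (long)reclaim_per_pass;

			/* The competing allocator eats pages back. */
			taken = alloc_per_pass < z->free_pages ?
					alloc_per_pass : z->free_pages;
			z->free_pages -= taken;
		}

		if (all_zones_ok)
			break;		/* every budget spent: go back to sleep */
		passes++;
	}
	return passes;
}
```

Calling balance_sketch() on a zone that starts 100 pages below pages_high,
with reclaim and allocation both at 50 pages per pass, ends once the
100-page budget is used up.  A loop that re-checked free_pages against
pages_high on every pass would instead keep lowering priority all the way
to zero.  A zone that starts above the watermark gets a negative budget
and is never touched, which is why the field is a signed long.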

