From: Andrew Morton <akpm@osdl.org>
To: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: chrisw@osdl.org, linux-kernel@vger.kernel.org, piggin@cyberone.com.au
Subject: Re: kswapd in tight loop 2.6.9-rc3-bk-recent
Date: Thu, 7 Oct 2004 17:42:42 -0700
Message-ID: <20041007174242.3dd6facd.akpm@osdl.org>
In-Reply-To: <4165E0A7.7080305@yahoo.com.au>
OK, after backing out the `goto spaghetti;' patch and cleaning up a few
things I'll test the below. It'll make kswapd much less aggressive.
diff -puN mm/vmscan.c~no-wild-kswapd-2 mm/vmscan.c
--- 25/mm/vmscan.c~no-wild-kswapd-2 2004-10-07 17:38:20.342906376 -0700
+++ 25-akpm/mm/vmscan.c 2004-10-07 17:38:20.348905464 -0700
@@ -964,6 +964,17 @@ out:
* of the number of free pages in the lower zones. This interoperates with
* the page allocator fallback scheme to ensure that aging of pages is balanced
* across the zones.
+ *
+ * kswapd can be semi-livelocked if some other process is allocating pages
+ * while kswapd is simultaneously trying to balance the same zone. That's OK,
+ * because we _want_ kswapd to work continuously in this situation. But a
+ * side-effect of kswapd's ongoing work is that the pageout priority keeps on
+ * winding up so we bogusly start doing swapout.
+ *
+ * To fix this we take a snapshot of the number of pages which need to be
+ * reclaimed from each zone in zone->pages_to_reclaim and never reclaim more
+ * pages than that. Once the required number of pages have been reclaimed from
+ * each zone, we're done. kswapd will go back to sleep until someone wakes it.
*/
static int balance_pgdat(pg_data_t *pgdat, int nr_pages)
{
@@ -984,6 +995,7 @@ static int balance_pgdat(pg_data_t *pgda
struct zone *zone = pgdat->node_zones + i;
zone->temp_priority = DEF_PRIORITY;
+ zone->pages_to_reclaim = zone->pages_high - zone->free_pages;
}
for (priority = DEF_PRIORITY; priority >= 0; priority--) {
@@ -1003,7 +1015,7 @@ static int balance_pgdat(pg_data_t *pgda
priority != DEF_PRIORITY)
continue;
- if (zone->free_pages <= zone->pages_high) {
+ if (zone->pages_to_reclaim > 0) {
end_zone = i;
break;
}
@@ -1036,10 +1048,11 @@ static int balance_pgdat(pg_data_t *pgda
if (zone->all_unreclaimable && priority != DEF_PRIORITY)
continue;
- if (nr_pages == 0) { /* Not software suspend */
- if (zone->free_pages <= zone->pages_high)
- all_zones_ok = 0;
- }
+ if (zone->pages_to_reclaim <= 0)
+ continue;
+
+ if (nr_pages == 0) /* Not software suspend */
+ all_zones_ok = 0;
zone->temp_priority = priority;
if (zone->prev_priority > priority)
zone->prev_priority = priority;
@@ -1049,6 +1062,10 @@ static int balance_pgdat(pg_data_t *pgda
shrink_zone(zone, &sc);
reclaim_state->reclaimed_slab = 0;
shrink_slab(sc.nr_scanned, GFP_KERNEL, lru_pages);
+
+ /* This fails to account for slab reclaim */
+ zone->pages_to_reclaim -= sc.nr_reclaimed;
+
sc.nr_reclaimed += reclaim_state->reclaimed_slab;
total_reclaimed += sc.nr_reclaimed;
total_scanned += sc.nr_scanned;
diff -puN include/linux/mmzone.h~no-wild-kswapd-2 include/linux/mmzone.h
--- 25/include/linux/mmzone.h~no-wild-kswapd-2 2004-10-07 17:38:20.343906224 -0700
+++ 25-akpm/include/linux/mmzone.h 2004-10-07 17:40:20.847586880 -0700
@@ -137,8 +137,9 @@ struct zone {
unsigned long nr_scan_inactive;
unsigned long nr_active;
unsigned long nr_inactive;
- int all_unreclaimable; /* All pages pinned */
- unsigned long pages_scanned; /* since last reclaim */
+ long pages_to_reclaim; /* kswapd usage */
+ int all_unreclaimable; /* All pages pinned */
+ unsigned long pages_scanned; /* since last reclaim */
ZONE_PADDING(_pad2_)
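
The snapshot scheme described in the balance_pgdat() comment above can be
illustrated with a small userspace sketch. This is not kernel code and not
part of the patch: struct toy_zone, the fixed per-pass `batch' and the
competing allocator's `stolen' pages are hypothetical stand-ins. It only
shows why a fixed per-zone budget terminates even when free_pages never
climbs above pages_high.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the few struct zone fields the patch touches. */
struct toy_zone {
	unsigned long free_pages;	/* pages currently free */
	unsigned long pages_high;	/* high watermark */
	long pages_to_reclaim;		/* snapshot budget; may go negative */
};

/* Take the snapshot once, before the priority loop would start. */
static void snapshot_targets(struct toy_zone *zones, size_t n)
{
	for (size_t i = 0; i < n; i++)
		zones[i].pages_to_reclaim =
			(long)zones[i].pages_high - (long)zones[i].free_pages;
}

/*
 * One balancing pass. Each zone with budget left reclaims `batch' pages,
 * then an assumed competing allocator takes `stolen' pages straight back.
 * Returns 1 when no zone had any budget left (the all_zones_ok case).
 */
static int balance_pass(struct toy_zone *zones, size_t n,
			unsigned long batch, unsigned long stolen)
{
	int all_zones_ok = 1;

	for (size_t i = 0; i < n; i++) {
		struct toy_zone *z = &zones[i];

		if (z->pages_to_reclaim <= 0)
			continue;	/* this zone's budget is spent */

		all_zones_ok = 0;
		z->free_pages += batch;
		z->pages_to_reclaim -= (long)batch;

		/* The allocator undoes the progress on free_pages. */
		z->free_pages -= (stolen < z->free_pages) ? stolen : z->free_pages;
	}
	return all_zones_ok;
}

/* Count the passes that did work before every budget was exhausted. */
static int passes_until_done(struct toy_zone *zones, size_t n,
			     unsigned long batch, unsigned long stolen,
			     int max_passes)
{
	int passes = 0;

	assert(batch > 0);
	snapshot_targets(zones, n);
	while (passes < max_passes && !balance_pass(zones, n, batch, stolen))
		passes++;
	return passes;
}
```

With a single zone at free_pages = 100 and pages_high = 164, batch = 32 and
stolen = 32, the budget of 64 pages is spent after two working passes even
though free_pages is back at 100 each time. A loop keyed on
free_pages <= pages_high would keep going under the same load.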