From: Anton Blanchard <anton@samba.org>
To: Mel Gorman <mel@csn.ul.ie>
Cc: cl@linux-foundation.org, linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH] powerpc: Set a smaller value for RECLAIM_DISTANCE to enable zone reclaim
Date: Tue, 23 Feb 2010 12:55:51 +1100
Message-ID: <20100223015551.GG31681@kryten>
In-Reply-To: <20100219145523.GN30258@csn.ul.ie>
[-- Attachment #1: Type: text/plain, Size: 1755 bytes --]
Hi Mel,
> You're pretty much on the button here. Only one thread at a time enters
> zone_reclaim. The others back off and try the next zone in the zonelist
> instead. I'm not sure what the original intention was but most likely it
> was to prevent too many parallel reclaimers in the same zone potentially
> dumping out way more data than necessary.
>
> > I'm not sure if there is an easy way to fix this without penalising other
> > workloads though.
> >
>
> You could experiment with waiting on the bit if the GFP flags allow it? The
> expectation would be that the reclaim operation does not take long. Wait
> on the bit, if you are making forward progress, recheck the
> watermarks before continuing.
Thanks to you and Christoph for some suggestions to try. Attached is a
chart showing the results of the following tests:
baseline.txt
The current ppc64 default of zone_reclaim_mode = 0. As expected we see
no change in remote node memory usage even after 10 iterations.
zone_reclaim_mode.txt
Now we set zone_reclaim_mode = 1. Each iteration improves on the last,
but even after 10 runs of stream we still have > 10% remote node memory usage.
reclaim_4096_pages.txt
Instead of reclaiming 32 pages at a time, we try for a much larger batch
of 4096. The slope is much steeper but it still takes around 6 iterations
to get almost all local node memory.
wait_on_busy_flag.txt
Here we busy wait if the ZONE_RECLAIM_LOCKED flag is set. As you suggest,
we would still need to check the GFP flags etc, but so far this looks the
most promising. We only get a few percent of remote node memory on the first
iteration and get all local node memory by the second.
Perhaps a combination of larger batch size and waiting on the busy
flag is the way to go?
Anton
[-- Attachment #2: stream_test:_percentage_off_node_memory.png --]
[-- Type: image/png, Size: 34767 bytes --]
[-- Attachment #3: reclaim_4096_pages.patch --]
[-- Type: text/x-diff, Size: 376 bytes --]
--- mm/vmscan.c~	2010-02-21 23:47:14.000000000 -0600
+++ mm/vmscan.c	2010-02-22 03:22:01.000000000 -0600
@@ -2534,7 +2534,7 @@
 		.may_unmap = !!(zone_reclaim_mode & RECLAIM_SWAP),
 		.may_swap = 1,
 		.nr_to_reclaim = max_t(unsigned long, nr_pages,
-				       SWAP_CLUSTER_MAX),
+				       4096),
 		.gfp_mask = gfp_mask,
 		.swappiness = vm_swappiness,
 		.order = order,
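
To make the effect of this hunk concrete, here is a small user-space sketch of the expression it changes. It is not kernel code: reclaim_target() is a hypothetical helper, and it assumes nr_pages is 1 << order for the allocation that triggered reclaim. The value 32 for SWAP_CLUSTER_MAX is the batch size mentioned in the mail.

```c
/* Default batch size; the mail describes it as 32 pages at a time. */
#define SWAP_CLUSTER_MAX 32UL

/*
 * Hypothetical helper mirroring the max_t() expression in the hunk:
 * reclaim at least min_batch pages, or the full allocation size if
 * that is larger. Assumes nr_pages == 1 << order.
 */
static unsigned long reclaim_target(unsigned int order, unsigned long min_batch)
{
	unsigned long nr_pages = 1UL << order;

	return nr_pages > min_batch ? nr_pages : min_batch;
}
```

With the default floor an order-0 allocation asks for 32 pages per zone_reclaim pass; with the patched floor the same allocation asks for 4096. Only requests larger than the floor (order 13 and up for 4096) are unaffected.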
[-- Attachment #4: wait_on_ZONE_RECLAIM_LOCKED.patch --]
[-- Type: text/x-diff, Size: 482 bytes --]
--- mm/vmscan.c~	2010-02-21 23:47:14.000000000 -0600
+++ mm/vmscan.c	2010-02-21 23:47:31.000000000 -0600
@@ -2634,8 +2634,8 @@
 	if (node_state(node_id, N_CPU) && node_id != numa_node_id())
 		return ZONE_RECLAIM_NOSCAN;
-	if (zone_test_and_set_flag(zone, ZONE_RECLAIM_LOCKED))
-		return ZONE_RECLAIM_NOSCAN;
+	while (zone_test_and_set_flag(zone, ZONE_RECLAIM_LOCKED))
+		cpu_relax();
 	ret = __zone_reclaim(zone, gfp_mask, order);
 	zone_clear_flag(zone, ZONE_RECLAIM_LOCKED);
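
The hunk above waits unconditionally. A rough user-space model of the variant discussed in the mail, where a caller only spins when it is allowed to wait, might look like the following. Everything here is an assumption made for illustration: struct zone_model, zone_reclaim_model() and the may_wait parameter are invented names, may_wait stands in for whatever GFP flag check the kernel would actually use, and a C11 atomic_flag stands in for the ZONE_RECLAIM_LOCKED bit.

```c
#include <stdatomic.h>
#include <stdbool.h>

enum { RECLAIM_NOSCAN = 0, RECLAIM_DONE = 1 };

/* Illustrative stand-in for the per-zone state; not the kernel's struct zone. */
struct zone_model {
	atomic_flag reclaim_locked;	/* models the ZONE_RECLAIM_LOCKED bit */
	unsigned long reclaim_runs;	/* counts simulated reclaim passes */
};

/*
 * If another reclaimer holds the flag, back off when the caller may not
 * wait (the unpatched behaviour), otherwise spin until the flag is free
 * (the behaviour of the hunk above).
 */
static int zone_reclaim_model(struct zone_model *z, bool may_wait)
{
	while (atomic_flag_test_and_set_explicit(&z->reclaim_locked,
						 memory_order_acquire)) {
		if (!may_wait)
			return RECLAIM_NOSCAN;
		/* busy wait; the kernel patch calls cpu_relax() here */
	}

	z->reclaim_runs++;	/* placeholder for the real reclaim work */
	atomic_flag_clear_explicit(&z->reclaim_locked, memory_order_release);
	return RECLAIM_DONE;
}
```

This only models the locking decision. It does not recheck watermarks after the wait, which the suggestion quoted earlier would also require.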