From: Anton Blanchard <anton@samba.org>
To: Mel Gorman <mel@csn.ul.ie>
Cc: cl@linux-foundation.org, linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH] powerpc: Set a smaller value for RECLAIM_DISTANCE to enable zone reclaim
Date: Tue, 23 Feb 2010 12:55:51 +1100
Message-ID: <20100223015551.GG31681@kryten>
In-Reply-To: <20100219145523.GN30258@csn.ul.ie>
Hi Mel,
> You're pretty much on the button here. Only one thread at a time enters
> zone_reclaim. The others back off and try the next zone in the zonelist
> instead. I'm not sure what the original intention was but most likely it
> was to prevent too many parallel reclaimers in the same zone potentially
> dumping out way more data than necessary.
>
> > I'm not sure if there is an easy way to fix this without penalising other
> > workloads though.
> >
>
> You could experiment with waiting on the bit if the GFP flags allow it? The
> expectation would be that the reclaim operation does not take long. Wait
> on the bit, if you are making forward progress, recheck the
> watermarks before continuing.
Thanks to you and Christoph for some suggestions to try. Attached is a
chart showing the results of the following tests:
baseline.txt
The current ppc64 default of zone_reclaim_mode = 0. As expected we see
no change in remote node memory usage even after 10 iterations.

zone_reclaim_mode.txt
Now we set zone_reclaim_mode = 1. On each iteration we continue to improve,
but even after 10 runs of stream we have > 10% remote node memory usage.

reclaim_4096_pages.txt
Instead of reclaiming 32 pages at a time, we try for a much larger batch
of 4096. The slope is much steeper but it still takes around 6 iterations
to get almost all local node memory.

wait_on_busy_flag.txt
Here we busy wait if the ZONE_RECLAIM_LOCKED flag is set. As you suggest
we would need to check the GFP flags etc, but so far it looks the most
promising. We only get a few percent of remote node memory on the first
iteration and get all local node by the second.
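To make the "check the GFP flags" caveat concrete, here is a rough userspace model of what a gated wait could look like. It is only a sketch under stated assumptions, not the attached patch and not kernel code: every model_* and MODEL_* name is hypothetical, a C11 atomic_flag stands in for ZONE_RECLAIM_LOCKED, the spin is bounded by an arbitrary limit, and adding the batch to free_pages stands in for the real reclaim. The watermark is rechecked after the wait, as suggested in the quoted text.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of these are kernel names. */
#define MODEL_GFP_WAIT  0x1u      /* caller is allowed to wait for the flag */
#define MODEL_MAX_SPINS 100000L   /* arbitrary bound on the busy wait */

enum model_result { MODEL_NOSCAN, MODEL_LOCAL_OK };

struct model_zone {
	atomic_flag reclaim_locked;   /* models ZONE_RECLAIM_LOCKED */
	long free_pages;
	long watermark;
};

static bool model_watermark_ok(const struct model_zone *z)
{
	return z->free_pages >= z->watermark;
}

static enum model_result model_zone_reclaim(struct model_zone *z,
					     unsigned int gfp, long batch)
{
	long spins = 0;

	assert(batch > 0);

	/* Spin on the flag only when the caller may wait; otherwise back off
	 * immediately, which models the current fall-through behaviour. */
	while (atomic_flag_test_and_set(&z->reclaim_locked)) {
		if (!(gfp & MODEL_GFP_WAIT) || ++spins >= MODEL_MAX_SPINS)
			return MODEL_NOSCAN;
	}

	/* Whoever held the flag may already have freed enough pages, so
	 * recheck the watermark before reclaiming anything ourselves. */
	if (!model_watermark_ok(z))
		z->free_pages += batch;   /* stand-in for the real reclaim */

	atomic_flag_clear(&z->reclaim_locked);
	return MODEL_LOCAL_OK;
}
```

A caller that cannot wait still falls through to the next zone as before, so only allocations that are allowed to wait would pay for the spin. Whether a bounded spin or a proper sleep on the bit is the better choice is left open here.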
Perhaps a combination of larger batch size and waiting on the busy
flag is the way to go?
Anton
[-- Attachment #2: stream_test:_percentage_off_node_memory.png --]
[-- Type: image/png, Size: 34767 bytes --]
[-- Attachment #3: reclaim_4096_pages.patch --]
[-- Type: text/x-diff, Size: 376 bytes --]
--- mm/vmscan.c~ 2010-02-21 23:47:14.000000000 -0600
+++ mm/vmscan.c 2010-02-22 03:22:01.000000000 -0600
@@ -2534,7 +2534,7 @@
 		.may_unmap = !!(zone_reclaim_mode & RECLAIM_SWAP),
 		.may_swap = 1,
 		.nr_to_reclaim = max_t(unsigned long, nr_pages,
-				       SWAP_CLUSTER_MAX),
+				       4096),
 		.gfp_mask = gfp_mask,
 		.swappiness = vm_swappiness,
 		.order = order,
[-- Attachment #4: wait_on_ZONE_RECLAIM_LOCKED.patch --]
[-- Type: text/x-diff, Size: 482 bytes --]
--- mm/vmscan.c~ 2010-02-21 23:47:14.000000000 -0600
+++ mm/vmscan.c 2010-02-21 23:47:31.000000000 -0600
@@ -2634,8 +2634,8 @@
 	if (node_state(node_id, N_CPU) && node_id != numa_node_id())
 		return ZONE_RECLAIM_NOSCAN;
 
-	if (zone_test_and_set_flag(zone, ZONE_RECLAIM_LOCKED))
-		return ZONE_RECLAIM_NOSCAN;
+	while (zone_test_and_set_flag(zone, ZONE_RECLAIM_LOCKED))
+		cpu_relax();
 
 	ret = __zone_reclaim(zone, gfp_mask, order);
 	zone_clear_flag(zone, ZONE_RECLAIM_LOCKED);