From: Mel Gorman <mgorman@suse.de>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Dave Hansen <dave.hansen@intel.com>,
Rik van Riel <riel@redhat.com>, Linux-MM <linux-mm@kvack.org>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: [RFC PATCH 0/6] Configurable fair allocation zone policy v3
Date: Thu, 19 Dec 2013 11:20:51 +0000
Message-ID: <20131219112051.GH11295@suse.de>
In-Reply-To: <20131218194813.GB20038@cmpxchg.org>
On Wed, Dec 18, 2013 at 02:48:13PM -0500, Johannes Weiner wrote:
> > <SNIP>
> >
> > Sure about the name?
> >
> > This is a boolean and "mode" implies it might be a bitmask. That said, I
> > recognise that my own naming also sucked: in complaining about yours,
> > I can see that mine sucks too.
>
> Is it because of how we use zone_reclaim_mode? I don't see anything
> wrong with a "mode" toggle that switches between only two modes of
> operation instead of three or more. But English being a second
> language and all...
>
It's not just zone_reclaim_mode. Most references to mode in the VM (but
not all because who needs consistency) refer to either a mask or multiple
potential values. isolate_mode_t, gfp masks referred to as mode, memory
policies described as mode, migration modes etc.

Intuitively, I expect "mode" to not be a binary value.
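
To make the distinction concrete, here is a minimal userspace sketch, not kernel code. The RECLAIM_* values are the zone_reclaim_mode bits as I recall them from the vm sysctl documentation; struct vm_knobs, pagecache_mempolicy and both helpers are hypothetical names used only for illustration.

```c
#include <stdbool.h>

/* zone_reclaim_mode bits, as recalled from the vm sysctl docs (assumption). */
#define RECLAIM_ZONE	(1 << 0)
#define RECLAIM_WRITE	(1 << 1)
#define RECLAIM_SWAP	(1 << 2)

/* Hypothetical container for the two tunables being compared. */
struct vm_knobs {
	int  zone_reclaim_mode;		/* bitmask: several bits may be set */
	bool pagecache_mempolicy;	/* hypothetical name for an on/off knob */
};

/* A mask needs a bit test to answer one specific question. */
static bool reclaim_may_write(const struct vm_knobs *k)
{
	return (k->zone_reclaim_mode & RECLAIM_WRITE) != 0;
}

/* Same shape as the (zone_reclaim_mode || pagecache_mempolicy_mode) check. */
static bool stay_on_preferred_node(const struct vm_knobs *k)
{
	return k->zone_reclaim_mode != 0 || k->pagecache_mempolicy;
}
```

A reader seeing "mode" on the first field expects to write the bit test; on the second there is nothing to mask, which is why the suffix reads oddly to me there.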
> > > @@ -1816,7 +1833,7 @@ static void zlc_clear_zones_full(struct zonelist *zonelist)
> > >
> > >  static bool zone_local(struct zone *local_zone, struct zone *zone)
> > >  {
> > > -	return node_distance(local_zone->node, zone->node) == LOCAL_DISTANCE;
> > > +	return local_zone->node == zone->node;
> > >  }
> >
> > Does that not break on !CONFIG_NUMA?
> >
> > It's why I used zone_to_nid
>
> There is a separate definition for !CONFIG_NUMA, it fit nicely next to
> the zlc stuff.
>
Ah, fair enough.
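
For anyone following along, the arrangement being described looks roughly like the sketch below. It is a standalone model, assuming the !CONFIG_NUMA variant simply treats every zone as local; the cut-down struct zone here is a stand-in and not the kernel's definition.

```c
#include <stdbool.h>

/* Stand-in for the kernel structure: the node field only exists on NUMA. */
struct zone {
#ifdef CONFIG_NUMA
	int node;
#endif
	const char *name;
};

#ifdef CONFIG_NUMA
/* NUMA build: two zones are local to each other if they share a node. */
static bool zone_local(struct zone *local_zone, struct zone *zone)
{
	return local_zone->node == zone->node;
}
#else
/* Non-NUMA build: there is a single node, so every zone is local. */
static bool zone_local(struct zone *local_zone, struct zone *zone)
{
	(void)local_zone;
	(void)zone;
	return true;
}
#endif
```

Because the node dereference is confined to the CONFIG_NUMA variant, the direct zone->node comparison never gets compiled on a non-NUMA build.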
> > >  static bool zone_allows_reclaim(struct zone *local_zone, struct zone *zone)
> > > @@ -1908,22 +1925,25 @@ zonelist_scan:
> > >  		if (unlikely(alloc_flags & ALLOC_NO_WATERMARKS))
> > >  			goto try_this_zone;
> > >  		/*
> > > -		 * Distribute pages in proportion to the individual
> > > -		 * zone size to ensure fair page aging. The zone a
> > > -		 * page was allocated in should have no effect on the
> > > -		 * time the page has in memory before being reclaimed.
> > > +		 * Distribute pagecache pages in proportion to the
> > > +		 * individual zone size to ensure fair page aging.
> > > +		 * The zone a page was allocated in should have no
> > > +		 * effect on the time the page has in memory before
> > > +		 * being reclaimed.
> > >  		 *
> > > -		 * When zone_reclaim_mode is enabled, try to stay in
> > > -		 * local zones in the fastpath. If that fails, the
> > > +		 * When pagecache_mempolicy_mode or zone_reclaim_mode
> > > +		 * is enabled, try to allocate from zones within the
> > > +		 * preferred node in the fastpath. If that fails, the
> > >  		 * slowpath is entered, which will do another pass
> > >  		 * starting with the local zones, but ultimately fall
> > >  		 * back to remote zones that do not partake in the
> > >  		 * fairness round-robin cycle of this zonelist.
> > >  		 */
> > > -		if (alloc_flags & ALLOC_WMARK_LOW) {
> > > +		if ((alloc_flags & ALLOC_WMARK_LOW) &&
> > > +		    (gfp_mask & __GFP_PAGECACHE)) {
> > >  			if (zone_page_state(zone, NR_ALLOC_BATCH) <= 0)
> > >  				continue;
> >
> > NR_ALLOC_BATCH is updated regardless of zone_reclaim_mode or
> > pagecache_mempolicy_mode. We only reset batch in the prepare_slowpath in
> > some cases. Looks a bit fishy even though I can't quite put my finger on it.
> >
> > I also got details wrong here in the v3 of the series. In an unreleased
> > v4 of the series I had corrected the treatment of slab pages in line
> > with your wishes and reused the broken out helper in prepare_slowpath to
> > keep the decision in sync.
> >
> > It's still in development but even if it gets rejected it'll act as a
> > comparison point to yours.
> >
> > > -			if (zone_reclaim_mode &&
> > > +			if ((zone_reclaim_mode || pagecache_mempolicy_mode) &&
> > >  			    !zone_local(preferred_zone, zone))
> > >  				continue;
> > >  		}
> >
> > Documentation says "enabling pagecache_mempolicy_mode, in which case page cache
> > allocations will be placed according to the configured memory policy". Should
> > that be !pagecache_mempolicy_mode? I'm getting confused with the double nots.
>
> Yes, it's a bit weird.
>
> We want to consider the round-robin batches for local zones but at the
> same time avoid exhausted batches from pushing the allocation off-node
> when either of those modes is enabled. So in the fastpath we filter
> for both and in the slowpath, once kswapd has been woken at the same
> time that the batches have been reset to launch the new aging cycle,
> we try in order of zonelist preference.
>
> However, to answer your question above, if the slowpath still has to
> fall back to a remote zone, we don't want to reset its batch because
> we didn't verify it was actually exhausted in the fastpath and we
> could risk cutting short the aging cycle for that particular zone.
Understood, thanks.
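
To check my own reading of that, here is a small userspace model of the two passes. It is a sketch under assumptions, not the patch: struct zone_model, fastpath_pick() and reset_batches() are hypothetical names, alloc_batch stands in for NR_ALLOC_BATCH, and the batch is decremented by one page per allocation for simplicity.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a zone and its fairness batch. */
struct zone_model {
	int  node;		/* NUMA node the zone belongs to */
	long alloc_batch;	/* stand-in for NR_ALLOC_BATCH */
	long batch_size;	/* value the batch is reset to (assumed) */
};

/*
 * Fastpath: take the first zone that still has batch left and, when
 * local_only is set, sits on the preferred node. Returns NULL if no
 * zone qualifies, which is where the slowpath would be entered.
 */
static struct zone_model *fastpath_pick(struct zone_model *zones, size_t n,
					int preferred_node, bool local_only)
{
	for (size_t i = 0; i < n; i++) {
		if (zones[i].alloc_batch <= 0)
			continue;	/* batch exhausted this cycle */
		if (local_only && zones[i].node != preferred_node)
			continue;	/* do not spill off-node here */
		zones[i].alloc_batch--;
		return &zones[i];
	}
	return NULL;
}

/*
 * Slowpath preparation: start a new aging cycle. With local_only set,
 * remote zones keep their batch, because the fastpath never verified
 * that those batches were actually exhausted.
 */
static void reset_batches(struct zone_model *zones, size_t n,
			  int preferred_node, bool local_only)
{
	for (size_t i = 0; i < n; i++) {
		if (local_only && zones[i].node != preferred_node)
			continue;
		zones[i].alloc_batch = zones[i].batch_size;
	}
}
```

With one local zone holding a single page of batch and one remote zone, the second fastpath_pick() call returns NULL rather than the remote zone, and reset_batches() then refills only the local zone.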
--
Mel Gorman
SUSE Labs