From: Andrea Arcangeli <aarcange@redhat.com>
To: Rik van Riel <riel@redhat.com>
Cc: Minchan Kim <minchan@kernel.org>,
Johannes Weiner <hannes@cmpxchg.org>,
Andrew Morton <akpm@linux-foundation.org>,
Mel Gorman <mgorman@suse.de>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [patch 3/3] mm: page_alloc: fair zone allocator policy
Date: Thu, 1 Aug 2013 17:51:11 +0200 [thread overview]
Message-ID: <20130801155111.GO25926@redhat.com> (raw)
In-Reply-To: <51F9E4A6.2090909@redhat.com>
On Thu, Aug 01, 2013 at 12:31:34AM -0400, Rik van Riel wrote:
> On 07/31/2013 10:56 PM, Minchan Kim wrote:
>
> > Yes, it's not really slow path because it could return to normal status
> > without calling significant slow functions by reset batchcount of
> > prepare_slowpath.
> >
> > I think it's tradeoff and I am biased your approach although we would
> > lose a little performance because fair aging would recover the loss by
> > fastpath's overhead. But who knows? Someone has a concern.
> >
> > So we should mention about such problems.
>
> If the atomic operation in the fast path turns out to be a problem,
> I suspect we may be able to fix it by using per-cpu counters, and
> consolidating those every once in a while.
>
> However, it may be good to see whether there is a problem in the
> first place, before adding complexity.
prepare_slowpath is racy anyway, so I don't see the point in wanting
to do atomic ops in the first place.
Much better to do the increment in assembly in a single insn to reduce
the window, but there's no point in using the lock prefix when
atomic_set will screw it over while somebody does atomic_dec under it
anyway. atomic_set cannot be atomic, so it'll screw over any
perfectly accurate accounted atomic_dec anyway.
I think it should be done in C without atomic_dec or by introducing an
__atomic_dec that isn't using the lock prefix and just tries to do the
dec in a single (or as few as possible) asm insn.
And because the whole thing is racy, after prepare_slowpath (which I
think is a too generic name and should be renamed
reset_zonelist_alloc_batch), this last attempt before invoking
reclaim:
/* This is the last chance, in general, before the goto nopage. */
page = get_page_from_freelist(gfp_mask, nodemask, order, zonelist,
high_zoneidx, alloc_flags & ~ALLOC_NO_WATERMARKS,
preferred_zone, migratetype);
which is trying again with the MIN wmark, should be also altered so
that we do one pass totally disregarding alloc_batch, just before
declaring MIN-wmark failure in the allocation, and invoking direct
reclaim.
Either we add a (alloc_flags | ALLOC_NO_ALLOC_BATCH) to the above
alloc_flags in the above call, or we run a second
get_page_from_freelist if the above one fails with
ALLOC_NO_ALLOC_BATCH set.
Otherwise because of the inherent racy behavior of the alloc_batch
logic, what will happen is that we'll eventually have enough hugepage
allocations in enough CPUs to hit the alloc_batch limit again in
between the prepare_slowpath and the above get_page_from_freelist,
that will lead us to call spurious direct_reclaim invocations, when it
was just a false negative of alloc_batch.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2013-08-01 15:51 UTC|newest]
Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-07-19 20:55 [patch 0/3] mm: improve page aging fairness between zones/nodes Johannes Weiner
2013-07-19 20:55 ` [patch 1/3] mm: vmscan: fix numa reclaim balance problem in kswapd Johannes Weiner
2013-07-22 19:47 ` Rik van Riel
2013-07-22 20:14 ` Johannes Weiner
2013-07-26 22:53 ` Andrew Morton
2013-07-30 17:45 ` Johannes Weiner
2013-07-31 12:43 ` Johannes Weiner
2013-07-19 20:55 ` [patch 2/3] mm: page_alloc: rearrange watermark checking in get_page_from_freelist Johannes Weiner
2013-07-22 19:51 ` Rik van Riel
2013-07-19 20:55 ` [patch 3/3] mm: page_alloc: fair zone allocator policy Johannes Weiner
2013-07-22 20:21 ` Rik van Riel
2013-07-22 21:04 ` Johannes Weiner
2013-07-22 22:48 ` Rik van Riel
2013-07-25 6:50 ` Paul Bolle
2013-07-25 15:10 ` Johannes Weiner
2013-07-25 15:20 ` Paul Bolle
2013-07-29 17:48 ` Andrea Arcangeli
2013-07-29 22:24 ` Johannes Weiner
2013-08-01 2:56 ` Minchan Kim
2013-08-01 4:31 ` Rik van Riel
2013-08-01 15:51 ` Andrea Arcangeli [this message]
2013-08-01 19:58 ` Johannes Weiner
2013-08-01 22:16 ` Andrea Arcangeli
2013-08-02 6:22 ` Johannes Weiner
2013-08-02 7:32 ` Minchan Kim
2013-07-22 16:48 ` [patch 0/3] mm: improve page aging fairness between zones/nodes Zlatko Calusic
2013-07-22 17:01 ` Johannes Weiner
2013-07-22 17:14 ` Zlatko Calusic
2013-07-24 11:18 ` Zlatko Calusic
2013-07-24 12:46 ` Hush Bensen
2013-07-24 13:59 ` Zlatko Calusic
2013-07-31 9:33 ` Zlatko Calusic
2013-07-26 22:45 ` Andrew Morton
2013-07-26 23:14 ` Johannes Weiner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130801155111.GO25926@redhat.com \
--to=aarcange@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=hannes@cmpxchg.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mgorman@suse.de \
--cc=minchan@kernel.org \
--cc=riel@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).