From: "Kirill A. Shutemov" <kirill@shutemov.name>
To: Alex Thorlton <athorlton@sgi.com>
Cc: linux-kernel@vger.kernel.org,
Andrew Morton <akpm@linux-foundation.org>,
Mel Gorman <mgorman@suse.de>, Rik van Riel <riel@redhat.com>,
Ingo Molnar <mingo@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Hugh Dickins <hughd@google.com>, Bob Liu <lliubbo@gmail.com>,
Johannes Weiner <hannes@cmpxchg.org>,
linux-mm@kvack.org
Subject: Re: [BUG] mm, thp: khugepaged can't allocate on requested node when confined to a cpuset
Date: Tue, 14 Oct 2014 14:48:28 +0300
Message-ID: <20141014114828.GA6524@node.dhcp.inet.fi>
In-Reply-To: <20141008191050.GK3778@sgi.com>
On Wed, Oct 08, 2014 at 02:10:50PM -0500, Alex Thorlton wrote:
> Hey everyone,
>
> I've run into some frustrating behavior from the khugepaged thread,
> that I'm hoping to get sorted out. It appears that if you pin
> khugepaged to a cpuset (i.e. node 0),

Why would you want to pin khugepaged? Is there a valid use case?
It looks like userspace is shooting itself in the foot.

> and it begins scanning/collapsing pages for a process on a cpuset that
> doesn't have any memory nodes in common with khugepaged (i.e. node 1),
> then the collapsed pages will all be allocated on khugepaged's node (in
> this case node 0), clearly breaking the cpuset boundary set up for the
> process in question.
>
> I'm aware that there are some known issues with khugepaged performing
> off-node allocations in certain situations, but I believe this is a bit
> of a special circumstance since, in this situation, there's no way for
> khugepaged to perform an allocation on the desired node.
>
> The problem really stems from the way that we determine the allowed
> memory nodes in get_page_from_freelist. When we call down to
> cpuset_zone_allowed_softwall, we check current->mems_allowed to
> determine what nodes we're allowed on. In the case of khugepaged, we'll
> be making allocations for the mm of the process we're collapsing for,
> but we'll be checking the mems_allowed of khugepaged, which can
> obviously cause some problems.

Is there a reason why we should respect cpuset limits for kernel
threads?

Should we bypass cpusets for PF_KTHREAD completely?

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 736d8e1b6381..03a74878ad46 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1960,6 +1960,9 @@ get_page_from_freelist(gfp_t gfp_mask, nodemask_t *nodemask, unsigned int order,
 zonelist_scan:
 	zonelist_rescan = false;
+	/* Bypass cpuset limitation if allocate from kernel thread context */
+	if (current->flags & PF_KTHREAD)
+		alloc_flags &= ~ALLOC_CPUSET;
 	/*
 	 * Scan zonelist, looking for a zone with enough free.
 	 * See also __cpuset_node_allowed_softwall() comment in kernel/cpuset.c.
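
To make the intended effect easier to reason about, here is a rough
user-space model of the check. It is only a sketch, not kernel code:
struct model_task, model_node_allowed(), model_apply_patch() and the
MODEL_* flag values are hypothetical stand-ins for current, the cpuset
check, the hunk above and the real PF_KTHREAD/ALLOC_CPUSET bits. It
assumes that the cpuset check is consulted only when ALLOC_CPUSET is
set, and that mems_allowed can be treated as a plain bitmask of nodes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel flags; the values are illustrative. */
#define MODEL_PF_KTHREAD   0x1u	/* task is a kernel thread */
#define MODEL_ALLOC_CPUSET 0x2u	/* allocator should enforce cpuset limits */

/* Minimal model of the fields of the allocating task that matter here. */
struct model_task {
	unsigned int flags;	/* MODEL_PF_* bits */
	uint64_t mems_allowed;	/* bit n set: node n is allowed (n < 64) */
};

/*
 * Models the cpuset check: it looks at the allocating task's mask,
 * not at the mask of the process the memory is allocated for, and it
 * is skipped entirely when ALLOC_CPUSET is clear.
 */
static bool model_node_allowed(const struct model_task *current_task,
			       unsigned int alloc_flags, int node)
{
	if (!(alloc_flags & MODEL_ALLOC_CPUSET))
		return true;
	return (current_task->mems_allowed >> node) & 1u;
}

/* Models the proposed hunk: kernel threads drop ALLOC_CPUSET. */
static unsigned int model_apply_patch(const struct model_task *current_task,
				      unsigned int alloc_flags)
{
	if (current_task->flags & MODEL_PF_KTHREAD)
		alloc_flags &= ~MODEL_ALLOC_CPUSET;
	return alloc_flags;
}
```

In this model, a kernel thread whose mask contains only node 0 is
refused node 1 while ALLOC_CPUSET is set, and is allowed node 1 once
the flags have gone through model_apply_patch(). A task without
MODEL_PF_KTHREAD keeps its restriction. Note that the model only shows
the restriction being lifted; it says nothing about which node the
allocator would then pick.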
--
Kirill A. Shutemov