linux-mm.kvack.org archive mirror
From: Alex Thorlton <athorlton@sgi.com>
To: linux-kernel@vger.kernel.org
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Mel Gorman <mgorman@suse.de>, Rik van Riel <riel@redhat.com>,
	Ingo Molnar <mingo@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Hugh Dickins <hughd@google.com>, Bob Liu <lliubbo@gmail.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	linux-mm@kvack.org
Subject: [BUG] mm, thp: khugepaged can't allocate on requested node when confined to a cpuset
Date: Wed, 8 Oct 2014 14:10:50 -0500
Message-ID: <20141008191050.GK3778@sgi.com>

Hey everyone,

I've run into some frustrating behavior from the khugepaged thread
that I'm hoping to get sorted out.  It appears that if you pin
khugepaged to a cpuset (e.g. node 0), and it begins scanning/collapsing
pages for a process in a cpuset that doesn't have any memory nodes in
common with khugepaged (e.g. node 1), then the collapsed pages will all
be allocated on khugepaged's node (in this case node 0), clearly breaking
the cpuset boundary set up for the process in question.

I'm aware that there are some known issues with khugepaged performing
off-node allocations in certain situations, but I believe this is a bit
of a special circumstance since, in this situation, there's no way for
khugepaged to perform an allocation on the desired node.

The problem really stems from the way that we determine the allowed
memory nodes in get_page_from_freelist.  When we call down to
cpuset_zone_allowed_softwall, we check current->mems_allowed to
determine what nodes we're allowed on.  In the case of khugepaged, we'll
be making allocations for the mm of the process we're collapsing for,
but we'll be checking the mems_allowed of khugepaged, which can
obviously cause some problems.
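
A minimal userspace sketch of the mismatch described above may help.  It
is a toy model, not kernel code: struct task, current_task,
node_allowed(), alloc_on_node() and collapse_as() are all hypothetical
names, the cpuset is reduced to a bitmask, and the round-robin fallback
order is a simplification of the real zonelist walk.  The only point it
illustrates is that the check consults the calling task's mask, never
the mask of the process being collapsed for.

```c
#include <assert.h>

#define MAX_NODES 4

/* Hypothetical stand-in for task_struct: only the cpuset memory mask. */
struct task {
    const char *name;
    unsigned int mems_allowed;   /* bit n set => node n is allowed */
};

/* Stand-in for the kernel's "current" pointer. */
static struct task *current_task;

/* Model of the cpuset check: it only ever looks at the calling task. */
static int node_allowed(int node)
{
    return (current_task->mems_allowed >> node) & 1u;
}

/*
 * Model of the allocator: try the preferred node first, then fall back
 * to the other nodes in order.  Returns the chosen node, or -1.
 */
static int alloc_on_node(int preferred)
{
    assert(preferred >= 0 && preferred < MAX_NODES);
    for (int i = 0; i < MAX_NODES; i++) {
        int node = (preferred + i) % MAX_NODES;
        if (node_allowed(node))
            return node;
    }
    return -1;
}

/* Allocate a collapsed page while running as "runner". */
static int collapse_as(struct task *runner, int preferred)
{
    current_task = runner;
    return alloc_on_node(preferred);
}
```

With khugepaged confined to node 0 (mems_allowed = 1u << 0) and the
target process confined to node 1, collapse_as(&khugepaged, 1) ends up
on node 0 in this model, even though node 1 was requested.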

Is this particular bug a known issue?  I've been trying to come up with
a simple way to fix the bug, but it's a bit difficult since we no longer
have a way to trace back to the task_struct that we're collapsing for
once we've reached get_page_from_freelist.  I'm wondering if we might
want to make the cpuset check higher up in the call-chain and then pass
that nodemask down instead of sending a NULL nodemask, as we end up
doing in many (most?) situations.  I can think of several problems with
that approach as well, but it's all I've come up with so far.
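
The idea of passing a nodemask down can be sketched in the same toy
style.  This is only an illustration under stated assumptions:
alloc_pages_model(), struct task and current_task are hypothetical
names, and the sketch assumes that a caller-supplied mask would replace
the check against the calling task's mems_allowed, which is not
necessarily how the real allocator would have to behave.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 4

/* Hypothetical stand-in for task_struct: only the cpuset memory mask. */
struct task {
    unsigned int mems_allowed;   /* bit n set => node n is allowed */
};

/* Stand-in for the kernel's "current" pointer. */
static struct task *current_task;

/*
 * Model allocator taking an optional nodemask.  A NULL mask falls back
 * to the calling task's mems_allowed (the behavior described in the
 * report); a non-NULL mask is used instead (assumption of this sketch).
 * Returns the chosen node, or -1 if no node is allowed.
 */
static int alloc_pages_model(int preferred, const unsigned int *nodemask)
{
    unsigned int allowed = nodemask ? *nodemask
                                    : current_task->mems_allowed;

    assert(preferred >= 0 && preferred < MAX_NODES);
    for (int i = 0; i < MAX_NODES; i++) {
        int node = (preferred + i) % MAX_NODES;
        if ((allowed >> node) & 1u)
            return node;
    }
    return -1;
}
```

In this model, a caller running as khugepaged (node 0 only) that passes
the target process's mask (node 1 only) gets node 1 back, while passing
NULL still lands on node 0.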

The obvious workaround is to not isolate khugepaged to a cpuset, but
since we're allowed to do so, I think the thread should probably behave
appropriately when pinned to a cpuset.

Any input on this issue is greatly appreciated.  Thanks, guys!

- Alex


