Re: hugetlb page migration vs. overcommit


From: Michal Hocko <mhocko@kernel.org>
To: linux-mm@kvack.org
Cc: Mike Kravetz <mike.kravetz@oracle.com>,
	Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: hugetlb page migration vs. overcommit
Date: Tue, 28 Nov 2017 15:12:09 +0100
Message-ID: <20171128141211.11117-1-mhocko@kernel.org>
In-Reply-To: <20171128101907.jtjthykeuefxu7gl@dhcp22.suse.cz>

On Tue 28-11-17 11:19:07, Michal Hocko wrote:
> On Wed 22-11-17 16:28:32, Michal Hocko wrote:
> > Hi,
> > is there any reason why we enforce the overcommit limit during hugetlb
> > pages migration? It's in alloc_huge_page_node->__alloc_buddy_huge_page
> > path. I am wondering whether this is really an intentional behavior.
> > The page migration allocates a page just temporarily so we should be
> > able to go over the overcommit limit for the migration duration. The
> > reason I am asking is that hugetlb pages tend to be utilized usually
> > (otherwise the memory would be just wasted and pool shrunk) but then
> > the migration simply fails which breaks memory hotplug and other
> > migration dependent functionality which is quite suboptimal. You can
> > work around that by increasing the overcommit limit.
> > 
> > Why don't we simply migrate as long as we are able to allocate the
> > target hugetlb page? I have a half baked patch to remove this
> > restriction, would there be an opposition to do something like that?
> 
> So I finally got to think about this some more and looked at how we
> actually account things more thoroughly. And it is, as both of you
> expected, quite subtle and not easy to get around. Per NUMA node pools make
> things quite complicated. Why? Migration can really increase the overall
> pool size. Say we are migrating from Node1 to Node2. Node2 doesn't have
> any pre-allocated pages but assume that the overcommit allows us to move
> on. All good. Except that the original page will return to the pool
> because free_huge_page will see Node1 without any surplus pages and
> therefore moves back the page to the pool. Node2 will release the
> surplus page only after it is freed which can be an unbounded amount of
> time.
> 
> While we are still effectively under the overcommit limit the semantics
> are kind of strange and I am not sure the behavior is really intended.
> I see why per node surplus counter is used here. We simply want to
> maintain per node counts after regular page free. So I was thinking
> to add a temporary/migrate state to the huge page for migration pages
> (start with new page, state transferred to the old page on success) and
> free such a page to the allocator regardless of the surplus counters.
> 
> This would mean that the page migration might change inter node pool
> sizes but I guess that should be acceptable. What do you guys think?
> I can send a draft patch if that helps you to understand the idea.

This is what I have currently and it seems to work (or at least it
doesn't blow up immediately). The first patch is a cleanup
and patch2 implements the temporary page idea.
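
To make the accounting issue quoted above concrete, here is a toy
userspace model. It is only a sketch and not the patch itself: the
names Node, HugePage, alloc_target, migrate, run and total_held are
made up for illustration, the overcommit check is assumed to pass, and
free_huge_page is reduced to the surplus test described above. It
migrates a single in-use page from Node1 to Node2, once with the
target allocated as a surplus page and once with a temporary marker
that is handed over to the old page on success.

```python
from dataclasses import dataclass


@dataclass
class Node:
    nr_huge_pages: int = 0       # huge pages hugetlb currently holds on this node
    free_huge_pages: int = 0     # held pages sitting unused in the pool
    surplus_huge_pages: int = 0  # held pages beyond the persistent pool


@dataclass
class HugePage:
    nid: int
    temporary: bool = False      # hypothetical "migration target" marker


def alloc_target(nodes, nid, use_temporary):
    """Allocate the migration target; overcommit is assumed to allow it."""
    nodes[nid].nr_huge_pages += 1
    if not use_temporary:
        # Current behaviour in this model: the target is a surplus page.
        nodes[nid].surplus_huge_pages += 1
    return HugePage(nid, temporary=use_temporary)


def free_huge_page(nodes, page):
    """Simplified free path: only the surplus/temporary decision is modelled."""
    node = nodes[page.nid]
    if page.temporary:
        # Temporary pages go back to the allocator regardless of surplus.
        node.nr_huge_pages -= 1
    elif node.surplus_huge_pages > 0:
        node.surplus_huge_pages -= 1
        node.nr_huge_pages -= 1
    else:
        # No surplus on this node, so the page returns to the pool.
        node.free_huge_pages += 1


def migrate(nodes, old, target_nid, use_temporary):
    new = alloc_target(nodes, target_nid, use_temporary)
    if new.temporary:
        # On success the temporary state moves from the new page to the old one.
        new.temporary, old.temporary = False, True
    free_huge_page(nodes, old)
    return new


def run(use_temporary):
    # Node1 holds one persistent, in-use huge page; Node2 holds nothing.
    nodes = {1: Node(nr_huge_pages=1), 2: Node()}
    migrate(nodes, HugePage(nid=1), 2, use_temporary)
    return nodes


def total_held(nodes):
    return sum(n.nr_huge_pages for n in nodes.values())


if __name__ == "__main__":
    for use_temporary in (False, True):
        nodes = run(use_temporary)
        print(use_temporary, total_held(nodes), nodes)
```

In this model the surplus variant ends with two pages held (a free
page back in the Node1 pool plus a surplus page in use on Node2),
while the temporary variant ends with one page held and the pool page
moved from Node1 to Node2, i.e. the inter node pool sizes change.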

Does this make any sense to you at all?
-- 
Michal Hocko


Thread overview: 23+ messages
2017-11-22 15:28 hugetlb page migration vs. overcommit Michal Hocko
2017-11-22 19:11 ` Mike Kravetz
2017-11-23  9:21   ` Michal Hocko
2017-11-27  6:27   ` Naoya Horiguchi
2017-11-28 10:19 ` Michal Hocko
2017-11-28 14:12   ` Michal Hocko [this message]
2017-11-28 14:12     ` [PATCH RFC 1/2] mm, hugetlb: unify core page allocation accounting and initialization Michal Hocko
2017-11-28 21:34       ` Mike Kravetz
2017-11-29  6:57         ` Michal Hocko
2017-11-29 19:09           ` Mike Kravetz
2017-11-28 14:12     ` [PATCH RFC 2/2] mm, hugetlb: do not rely on overcommit limit during migration Michal Hocko
2017-11-29  1:39       ` Mike Kravetz
2017-11-29  7:17         ` Michal Hocko
2017-11-29  9:22       ` Michal Hocko
2017-11-29  9:40         ` Michal Hocko
2017-11-29 11:23         ` Michal Hocko
2017-11-29 19:52         ` Mike Kravetz
2017-11-30  7:57           ` Michal Hocko
2017-11-30 19:35             ` Mike Kravetz
2017-11-30 19:57               ` Michal Hocko
2017-11-30 20:06                 ` Michal Hocko
2017-11-29  9:51       ` Michal Hocko
2017-11-29 11:33       ` [PATCH RFC v2 " Michal Hocko
