linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: gerald.schaefer@de.ibm.com (Gerald Schaefer)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH v2 2/2] mm: hugetlb: support gigantic surplus pages
Date: Wed, 9 Nov 2016 16:55:49 +0100	[thread overview]
Message-ID: <20161109165549.1cf320c5@thinkpad> (raw)
In-Reply-To: <1478675294-2507-1-git-send-email-shijie.huang@arm.com>

On Wed, 9 Nov 2016 15:08:14 +0800
Huang Shijie <shijie.huang@arm.com> wrote:

> When testing the gigantic page whose order is too large for the buddy
> allocator, the libhugetlbfs test case "counter.sh" will fail.
> 
> The failure is caused by:
>  1) kernel fails to allocate a gigantic page for the surplus case.
>     And the gather_surplus_pages() will return NULL in the end.
> 
>  2) The condition checks for "over-commit" is wrong.
> 
> This patch adds code to allocate the gigantic page in the
> __alloc_huge_page(). After this patch, gather_surplus_pages()
> can return a gigantic page for the surplus case.
> 
> This patch changes the condition checks for:
>      return_unused_surplus_pages()
>      nr_overcommit_hugepages_store()
>      hugetlb_overcommit_handler()
> 
> This patch also set @nid with proper value when NUMA_NO_NODE is
> passed to alloc_gigantic_page().
> 
> After this patch, the counter.sh can pass for the gigantic page.
> 
> Acked-by: Steve Capper <steve.capper@arm.com>
> Signed-off-by: Huang Shijie <shijie.huang@arm.com>
> ---
>   1. fix the wrong check in hugetlb_overcommit_handler();
>   2. try to fix the s390 issue.
> ---
>  mm/hugetlb.c | 20 ++++++++++++++------
>  1 file changed, 14 insertions(+), 6 deletions(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 9fdfc24..5dbfd62 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1095,6 +1095,9 @@ static struct page *alloc_gigantic_page(int nid, unsigned int order)
>  	unsigned long ret, pfn, flags;
>  	struct zone *z;
> 
> +	if (nid == NUMA_NO_NODE)
> +		nid = numa_mem_id();
> +

Now counter.sh works (on s390) w/o the lockdep warning. However, it looks
like this change will now result in inconsistent behavior compared to the
normal sized hugepages, regarding surplus page allocation. Setting nid to
numa_mem_id() means that only the node of the current CPU will be considered
for allocating a gigantic page, as opposed to just "preferring" the current
node in the normal size case (__hugetlb_alloc_buddy_huge_page() ->
alloc_pages_node()) with a fallback to using other nodes.

I am not really familiar with NUMA, and I might be wrong here, but if
this is true then gigantic pages, which may be hard allocate at runtime
in general, will be even harder to find (as surplus pages) because you
only look on the current node.

I honestly do not understand why alloc_gigantic_page() needs a nid
parameter at all, since it looks like it will only be called from
alloc_fresh_gigantic_page_node(), which in turn is only called
from alloc_fresh_gigantic_page() in a "for_each_node" loop (at least
before your patch).

Now it could be an option to also use alloc_fresh_gigantic_page()
in your patch, instead of directly calling alloc_gigantic_page(),
in __alloc_huge_page(). This would fix the "local node only" issue,
but I am not sure how to handle the required nodes_allowed parameter.

Maybe someone with more NUMA insight could have a look at this.
The patch as it is also seems to work, with the "local node only"
restriction, so it may be an option to just accept this restriction.

  reply	other threads:[~2016-11-09 15:55 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-11-03  2:51 [PATCH 0/2] mm: fix the "counter.sh" failure for libhugetlbfs Huang Shijie
2016-11-03  2:51 ` [PATCH 1/2] mm: hugetlb: rename some allocation functions Huang Shijie
2016-11-04  3:11   ` [PATCH] " Huang Shijie
2016-11-03  2:51 ` [PATCH 2/2] mm: hugetlb: support gigantic surplus pages Huang Shijie
2016-11-03  3:13   ` kbuild test robot
2016-11-07 15:25   ` Gerald Schaefer
2016-11-08  2:19     ` Huang Shijie
2016-11-08  7:08       ` Huang Shijie
2016-11-08  9:17         ` Huang Shijie
2016-11-08 19:27           ` Gerald Schaefer
2016-11-09  7:12             ` Huang Shijie
2016-11-09  7:08   ` [PATCH v2 " Huang Shijie
2016-11-09 15:55     ` Gerald Schaefer [this message]
2016-11-10  7:03       ` Huang Shijie
2016-11-03 17:22 ` [PATCH 0/2] mm: fix the "counter.sh" failure for libhugetlbfs Randy Dunlap
2016-11-04  1:59   ` Huang Shijie

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20161109165549.1cf320c5@thinkpad \
    --to=gerald.schaefer@de.ibm.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).