From: "Li Xinhai" <lixinhai.lxh@gmail.com>
To: mhocko <mhocko@suse.com>,  "Mike Kravetz" <mike.kravetz@oracle.com>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>,
	 akpm <akpm@linux-foundation.org>,  guro <guro@fb.com>
Subject: Re: [PATCH] mm/hugetlb: try preferred node first when alloc gigantic page from cma
Date: Tue, 1 Sep 2020 22:20:44 +0800
Message-ID: <202009012220421669005@gmail.com>
In-Reply-To: 20200901134119.GE16650@dhcp22.suse.cz

On 2020-09-01 at 21:41 Michal Hocko wrote:
>On Mon 31-08-20 14:44:40, Mike Kravetz wrote:
>> On 8/30/20 7:04 AM, Li Xinhai wrote:
>> > Since commit cf11e85fc08cc6a4 ("mm: hugetlb: optionally allocate gigantic
>> > hugepages using cma"), the gigantic page would be allocated from node
>> > which is not the preferred node, although there are pages available from
>> > that node. The reason is that the nid parameter has been ignored in
>> > alloc_gigantic_page().
>> >
>> > After this patch, the preferred node is tried first before other allowed
>> > nodes.
>>
>> Thank you!
>> This is an issue that needs to be fixed.
>>
>> > Fixes: cf11e85fc08cc6a4 ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
>> > Cc: Roman Gushchin <guro@fb.com>
>> > Cc: Mike Kravetz <mike.kravetz@oracle.com>
>> > Cc: Michal Hocko <mhocko@kernel.org>
>> > Signed-off-by: Li Xinhai <lixinhai.lxh@gmail.com>
>> > ---
>> >  mm/hugetlb.c | 9 ++++++++-
>> >  1 file changed, 8 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>> > index a301c2d672bf..4a28b8853d47 100644
>> > --- a/mm/hugetlb.c
>> > +++ b/mm/hugetlb.c
>> > @@ -1256,8 +1256,15 @@ static struct page *alloc_gigantic_page(struct hstate *h, gfp_t gfp_mask,
>> >  	struct page *page;
>> >  	int node;
>> > 
>> > +	if (hugetlb_cma[nid]) {
>> > +		page = cma_alloc(hugetlb_cma[nid], nr_pages,
>> > +				 huge_page_order(h), true);
>> > +		if (page)
>> > +			return page;
>> > +	}
>> > +
>>
>> When looking at your changes, I noticed that this code for allocation
>> from CMA does not take gfp_mask into account.  The 'normal' use case
>> is to allocate pool pages with something similar to:
>>
>> echo 16 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
>>
>> The routine alloc_pool_huge_page will try to interleave pages among nodes:
>>
>> ...
>>         gfp_t gfp_mask = htlb_alloc_mask(h) | __GFP_THISNODE;
>>
>>         for_each_node_mask_to_alloc(h, nr_nodes, node, nodes_allowed) {
>> ...
>>
>> which will eventually call alloc_gigantic_page.  If __GFP_THISNODE is
>> set we really do not want to execute the below for loop in alloc_gigantic_page.
>
>Yes, this is the case indeed.
>
>> I think the convention in the mm code is that only the lowest level
>> allocation routines should interpret the GFP flags.  We may need to make
>> an exception here and check for __GFP_THISNODE.
>
>Yes, this is true. But alloc_gigantic_page is in fact a low-level
>allocation routine.
> 
Thanks for the review. We do need to consider the __GFP_THISNODE flag.

>I would go with the following
>diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>index a301c2d672bf..124754240b56 100644
>--- a/mm/hugetlb.c
>+++ b/mm/hugetlb.c
>@@ -1256,6 +1256,16 @@ static struct page *alloc_gigantic_page(struct hstate *h, gfp_t gfp_mask,
> 	struct page *page;
> 	int node;
>
>+	if (nid != NUMA_NO_NODE && hugetlb_cma[nid]) {
>+		page = cma_alloc(hugetlb_cma[nid], nr_pages,
>+				 huge_page_order(h), true);
>+		if (page)
>+			return page;
>+	}
>+
>+	if (gfp_mask & __GFP_THISNODE)
>+		return NULL;
>+
I think that when the CMA allocation fails with __GFP_THISNODE set, we still
need to call alloc_contig_pages() below, so that we have one more chance to
allocate successfully on the preferred node.
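
The ordering argued for here can be sketched as a small userspace model. This
is not kernel code: every name below (model_node, model_alloc_gigantic,
MODEL_THISNODE, and so on) is hypothetical, the nodemask is ignored, and the
per-node counters only stand in for what cma_alloc() and alloc_contig_pages()
would find. It only illustrates the intended order: CMA on the preferred node,
CMA on other nodes unless THISNODE is set, then the non-CMA contiguous path,
which is still tried on the preferred node even with THISNODE.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NODES    4
#define MODEL_NO_NODE  (-1)
#define MODEL_THISNODE 0x1u	/* stands in for __GFP_THISNODE */

/* Hypothetical per-node state: free gigantic pages per allocation source. */
struct model_node {
	bool has_cma;		/* models hugetlb_cma[nid] != NULL */
	int cma_free;		/* pages the CMA area could still provide */
	int contig_free;	/* pages the non-CMA contiguous path could provide */
};

enum model_source { SRC_NONE, SRC_CMA, SRC_CONTIG };

struct model_result {
	int node;
	enum model_source source;
};

static bool model_cma_alloc(struct model_node *n)
{
	if (!n->has_cma || n->cma_free <= 0)
		return false;
	n->cma_free--;
	return true;
}

static bool model_contig_alloc(struct model_node *n)
{
	if (n->contig_free <= 0)
		return false;
	n->contig_free--;
	return true;
}

struct model_result model_alloc_gigantic(struct model_node nodes[MODEL_NODES],
					 int nid, unsigned int flags)
{
	struct model_result none = { MODEL_NO_NODE, SRC_NONE };
	int node;

	assert(nid == MODEL_NO_NODE || (nid >= 0 && nid < MODEL_NODES));

	/* 1. CMA area of the preferred node first. */
	if (nid != MODEL_NO_NODE && model_cma_alloc(&nodes[nid]))
		return (struct model_result){ nid, SRC_CMA };

	/* 2. CMA areas of other nodes, skipped when THISNODE is requested. */
	if (!(flags & MODEL_THISNODE)) {
		for (node = 0; node < MODEL_NODES; node++) {
			if (node == nid)
				continue;
			if (model_cma_alloc(&nodes[node]))
				return (struct model_result){ node, SRC_CMA };
		}
	}

	/* 3. Non-CMA path: the preferred node gets one more chance. */
	if (nid != MODEL_NO_NODE && model_contig_alloc(&nodes[nid]))
		return (struct model_result){ nid, SRC_CONTIG };
	if (flags & MODEL_THISNODE)
		return none;

	for (node = 0; node < MODEL_NODES; node++) {
		if (node == nid)
			continue;
		if (model_contig_alloc(&nodes[node]))
			return (struct model_result){ node, SRC_CONTIG };
	}
	return none;
}
```

In this model, an early "return NULL" on THISNODE right after the CMA attempt
would correspond to returning before step 3, which is the case the paragraph
above is concerned about.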

> 	for_each_node_mask(node, *nodemask) {
> 		if (!hugetlb_cma[node])
> 			continue;
> 
>I do not think we actually have an explicit NUMA_NO_NODE user, but it
>is safer not to assume that here.
>--
>Michal Hocko
>SUSE Labs
