linux-mm.kvack.org archive mirror
From: Mike Kravetz <mike.kravetz@oracle.com>
To: Xueshi Hu <xueshi.hu@smartx.com>
Cc: muchun.song@linux.dev, akpm@linux-foundation.org, linux-mm@kvack.org
Subject: Re: [PATCH 3/3] mm/hugeltb: fix nodes huge page allocation when there are surplus pages
Date: Mon, 31 Jul 2023 15:56:40 -0700
Message-ID: <20230731225640.GB3351@monkey>
In-Reply-To: <20230730125156.207301-4-xueshi.hu@smartx.com>

On 07/30/23 20:51, Xueshi Hu wrote:
> In set_nr_huge_pages(), the local variable "count" is used to record
> persistent_huge_pages(), but when it comes to per-node huge page
> allocation, the semantics change to nr_huge_pages. When surplus huge
> pages exist and the interface under
> /sys/devices/system/node/node*/hugepages is used to change the huge page
> pool size, this difference can result in the allocation of an unexpected
> number of huge pages.
> 
> Steps to reproduce the bug:
> 
> Starting with:
> 
> 				  Node 0          Node 1    Total
> 	HugePages_Total             0.00            0.00     0.00
> 	HugePages_Free              0.00            0.00     0.00
> 	HugePages_Surp              0.00            0.00     0.00
> 
> create 100 huge pages in Node 0 and consume them, then set Node 0's
> nr_hugepages to 0.
> 
> yields:
> 
> 				  Node 0          Node 1    Total
> 	HugePages_Total           200.00            0.00   200.00
> 	HugePages_Free              0.00            0.00     0.00
> 	HugePages_Surp            200.00            0.00   200.00
> 
> write 100 to Node 1's nr_hugepages
> 
> 	echo 100 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages
> 
> gets:
> 
> 				  Node 0          Node 1    Total
> 	HugePages_Total           200.00          400.00   600.00
> 	HugePages_Free              0.00          400.00   400.00
> 	HugePages_Surp            200.00            0.00   200.00
> 
> The kernel is expected to create only 100 huge pages, but it creates 200.
> 
> Signed-off-by: Xueshi Hu <xueshi.hu@smartx.com>
> ---
>  mm/hugetlb.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)

Good catch!

I added the code modified in this patch with commit fd875dca7c717.  However,
my commit only moved the specific line that is the root cause of the bug.  That
specific line was added with commit 9a30523066cde, which added hugetlb
node specific support way back in 2009 (2.6.32 timeframe).

Fix looks good, but waiting on resolution of max_huge_pages usage.
-- 
Mike Kravetz
  
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 56647235ab21..8ed4fffdebda 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -3490,7 +3490,9 @@ static int set_nr_huge_pages(struct hstate *h, unsigned long count, int nid,
>  	if (nid != NUMA_NO_NODE) {
>  		unsigned long old_count = count;
>  
> -		count += h->nr_huge_pages - h->nr_huge_pages_node[nid];
> +		count += persistent_huge_pages(h) -
> +			 (h->nr_huge_pages_node[nid] -
> +			  h->surplus_huge_pages_node[nid]);
>  		/*
>  		 * User may have specified a large count value which caused the
>  		 * above calculation to overflow.  In this case, they wanted
> -- 
> 2.40.1
> 
> 
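For anyone following the arithmetic, here is a rough userspace sketch of the
old and new "count" adjustment applied to the state in the report.  It is not
kernel code: struct hstate_model, persistent_pages(), global_target_old(),
global_target_new(), pages_to_allocate() and reported_state() are hypothetical
names made up for illustration.  It assumes the tables above are in megabytes
with 2048 kB pages (so 200.00 means 100 pages), that the pool is grown until
the persistent page count reaches the adjusted count, and it leaves out the
overflow check that follows the adjustment in the real function.

```c
#include <assert.h>

#define MAX_NODES 2	/* assumption: two NUMA nodes, as in the report */

/* Simplified stand-in for struct hstate: only the counters the patch reads. */
struct hstate_model {
	unsigned long nr_huge_pages;
	unsigned long surplus_huge_pages;
	unsigned long nr_huge_pages_node[MAX_NODES];
	unsigned long surplus_huge_pages_node[MAX_NODES];
};

/* Persistent pages are the total minus the surplus pages. */
static unsigned long persistent_pages(const struct hstate_model *h)
{
	return h->nr_huge_pages - h->surplus_huge_pages;
}

/* Pre-patch adjustment: adds the other nodes' total pages, surplus included. */
static unsigned long global_target_old(const struct hstate_model *h,
				       unsigned long count, int nid)
{
	assert(nid >= 0 && nid < MAX_NODES);
	return count + h->nr_huge_pages - h->nr_huge_pages_node[nid];
}

/* Patched adjustment: adds only the other nodes' persistent pages. */
static unsigned long global_target_new(const struct hstate_model *h,
				       unsigned long count, int nid)
{
	assert(nid >= 0 && nid < MAX_NODES);
	return count + persistent_pages(h) -
	       (h->nr_huge_pages_node[nid] - h->surplus_huge_pages_node[nid]);
}

/* Assumed growth rule: allocate until persistent pages reach the target. */
static unsigned long pages_to_allocate(const struct hstate_model *h,
				       unsigned long target)
{
	unsigned long persistent = persistent_pages(h);

	return target > persistent ? target - persistent : 0;
}

/* State before the write to node1: 100 pages on node 0, all surplus. */
static struct hstate_model reported_state(void)
{
	struct hstate_model h = {
		.nr_huge_pages = 100,
		.surplus_huge_pages = 100,
		.nr_huge_pages_node = { 100, 0 },
		.surplus_huge_pages_node = { 100, 0 },
	};

	return h;
}
```

Under those assumptions, writing 100 to node 1 gives an adjusted count of 200
with the old expression and 100 with the patched one, which matches the 200
pages (400.00 in the table) seen in the report versus the 100 expected.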



Thread overview: 10+ messages
2023-07-30 12:51 [PATCH 0/3] mm/hugetlb: fix /sys and /proc fs dealing with persistent hugepages Xueshi Hu
2023-07-30 12:51 ` [PATCH 1/3] mm/hugetlb: fix the inconsistency of /proc/sys/vm/nr_huge_pages Xueshi Hu
2023-07-31 22:17   ` Mike Kravetz
2023-08-01 12:22     ` Xueshi Hu
2023-08-01 18:49       ` Mike Kravetz
2023-08-02  7:31         ` Xueshi Hu
2023-08-02 18:20           ` Mike Kravetz
2023-07-30 12:51 ` [PATCH 2/3] mm/hugeltb: clean up hstate::max_huge_pages Xueshi Hu
2023-07-30 12:51 ` [PATCH 3/3] mm/hugeltb: fix nodes huge page allocation when there are surplus pages Xueshi Hu
2023-07-31 22:56   ` Mike Kravetz [this message]
