From: Mike Kravetz <mike.kravetz@oracle.com>
To: Muchun Song <muchun.song@linux.dev>
Cc: Yuan Can <yuancan@huawei.com>,
Andrew Morton <akpm@linux-foundation.org>,
Linux-MM <linux-mm@kvack.org>,
wangkefeng.wang@huawei.com, David Hildenbrand <david@redhat.com>,
Michal Hocko <mhocko@suse.com>
Subject: Re: [PATCH 1/2] mm: hugetlb_vmemmap: fix hugetlb page number decrease failed on movable nodes
Date: Tue, 5 Sep 2023 17:28:14 -0700 [thread overview]
Message-ID: <20230906002814.GA3740@monkey> (raw)
In-Reply-To: <EB154303-CFE3-436A-B601-51B4F3002D73@linux.dev>
On 09/05/23 17:06, Muchun Song wrote:
>
>
> > On Sep 5, 2023, at 11:13, Yuan Can <yuancan@huawei.com> wrote:
> >
> > The decreasing of hugetlb pages number failed with the following message
> > given:
> >
> > sh: page allocation failure: order:0, mode:0x204cc0(GFP_KERNEL|__GFP_RETRY_MAYFAIL|__GFP_THISNODE)
> > CPU: 1 PID: 112 Comm: sh Not tainted 6.5.0-rc7-... #45
> > Hardware name: linux,dummy-virt (DT)
> > Call trace:
> > dump_backtrace.part.6+0x84/0xe4
> > show_stack+0x18/0x24
> > dump_stack_lvl+0x48/0x60
> > dump_stack+0x18/0x24
> > warn_alloc+0x100/0x1bc
> > __alloc_pages_slowpath.constprop.107+0xa40/0xad8
> > __alloc_pages+0x244/0x2d0
> > hugetlb_vmemmap_restore+0x104/0x1e4
> > __update_and_free_hugetlb_folio+0x44/0x1f4
> > update_and_free_hugetlb_folio+0x20/0x68
> > update_and_free_pages_bulk+0x4c/0xac
> > set_max_huge_pages+0x198/0x334
> > nr_hugepages_store_common+0x118/0x178
> > nr_hugepages_store+0x18/0x24
> > kobj_attr_store+0x18/0x2c
> > sysfs_kf_write+0x40/0x54
> > kernfs_fop_write_iter+0x164/0x1dc
> > vfs_write+0x3a8/0x460
> > ksys_write+0x6c/0x100
> > __arm64_sys_write+0x1c/0x28
> > invoke_syscall+0x44/0x100
> > el0_svc_common.constprop.1+0x6c/0xe4
> > do_el0_svc+0x38/0x94
> > el0_svc+0x28/0x74
> > el0t_64_sync_handler+0xa0/0xc4
> > el0t_64_sync+0x174/0x178
> > Mem-Info:
> > ...
> >
> > The reason is that the hugetlb pages being released are allocated from
> > movable nodes, and with hugetlb_optimize_vmemmap enabled, vmemmap pages
> > need to be allocated from the same node during the hugetlb pages
>
> Thanks for your fix. I think this is a real-world issue, so it would be
> better to add a Fixes tag so the change can be backported. Thanks.
>
I thought we might get the same error (unable to allocate on a movable
node) when creating a hugetlb page, because we replace the head vmemmap
page there as well. However, failure to allocate in that path is not
fatal; we fall back to the currently mapped page. We also pass
__GFP_NOWARN to that allocation attempt, so the failure is never
reported.
We might want to change this as well?
> > releasing. With GFP_KERNEL and __GFP_THISNODE set, allocating from a
> > movable node always fails. Fix this problem by removing __GFP_THISNODE.
> >
> > Signed-off-by: Yuan Can <yuancan@huawei.com>
> > ---
> > mm/hugetlb_vmemmap.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
> > index c2007ef5e9b0..0485e471d224 100644
> > --- a/mm/hugetlb_vmemmap.c
> > +++ b/mm/hugetlb_vmemmap.c
> > @@ -386,7 +386,7 @@ static int vmemmap_remap_free(unsigned long start, unsigned long end,
> > static int alloc_vmemmap_page_list(unsigned long start, unsigned long end,
> > struct list_head *list)
> > {
> > - gfp_t gfp_mask = GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_THISNODE;
> > + gfp_t gfp_mask = GFP_KERNEL | __GFP_RETRY_MAYFAIL;
>
> There is a small change in behavior for the non-movable case: we first try
> to allocate memory from the preferred node (same as before), but if that
> fails, we now fall back to other nodes. That makes sense to me. At least
> those huge pages can be freed once another node can satisfy the allocation
> of the vmemmap pages.
>
> Reviewed-by: Muchun Song <songmuchun@bytedance.com>
This looks reasonable to me as well.
Cc'ing David and Michal, as they are experts in hotplug.
--
Mike Kravetz
>
> Thanks.
>
> > unsigned long nr_pages = (end - start) >> PAGE_SHIFT;
> > int nid = page_to_nid((struct page *)start);
> > struct page *page, *next;
> > --
> > 2.17.1
Thread overview: 10+ messages
2023-09-05 3:13 [PATCH 1/2] mm: hugetlb_vmemmap: fix hugetlb page number decrease failed on movable nodes Yuan Can
2023-09-05 3:13 ` [PATCH 2/2] mm: hugetlb_vmemmap: allow alloc_vmemmap_page_list() ignore watermarks Yuan Can
2023-09-05 6:59 ` Muchun Song
2023-09-05 9:06 ` [PATCH 1/2] mm: hugetlb_vmemmap: fix hugetlb page number decrease failed on movable nodes Muchun Song
2023-09-05 10:43 ` Kefeng Wang
2023-09-05 12:41 ` Yuan Can
2023-09-06 0:28 ` Mike Kravetz [this message]
2023-09-06 2:32 ` Muchun Song
2023-09-06 2:59 ` Yuan Can
2023-09-06 7:25 ` David Hildenbrand