All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mike Kravetz <mike.kravetz@oracle.com>
To: "HORIGUCHI NAOYA(堀口 直也)" <naoya.horiguchi@nec.com>,
	"Andrew Morton" <akpm@linux-foundation.org>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Muchun Song <songmuchun@bytedance.com>,
	Joao Martins <joao.m.martins@oracle.com>,
	Matthew Wilcox <willy@infradead.org>,
	Michal Hocko <mhocko@suse.com>, Peter Xu <peterx@redhat.com>,
	Miaohe Lin <linmiaohe@huawei.com>,
	Vlastimil Babka <vbabka@kernel.org>
Subject: Re: [PATCH v2] hugetlb: freeze allocated pages before creating hugetlb pages
Date: Tue, 20 Sep 2022 20:17:31 -0700	[thread overview]
Message-ID: <YyqCS6+OXAgoqI8T@monkey> (raw)
In-Reply-To: <20220921014810.GA2053328@hori.linux.bs1.fc.nec.co.jp>

On 09/21/22 01:48, HORIGUCHI NAOYA(堀口 直也) wrote:
> On Fri, Sep 16, 2022 at 02:46:38PM -0700, Mike Kravetz wrote:
> > v1 -> v2
> > - Fixed up head page in error path of __prep_compound_gigantic_page as
> >   discovered by Miaohe Lin.
> > - Updated link to Matthew's Allocate and free frozen pages series.
> > - Rebased on next-20220916
> > 
> >  mm/hugetlb.c | 102 +++++++++++++++++++--------------------------------
> >  1 file changed, 38 insertions(+), 64 deletions(-)
> 
> Hello Mike,
> 
> I accidentally found a NULL pointer dereference when testing the latest
> mm-unstable, which seems to be caused (or exposed?) by this patch
> (I confirmed that it disappeared by reverting this patch).
> 
> It's reproduced by doing like `sysctl vm.nr_hugepages=1000000` to allocate
> hugepages as much as possible.
> 
> Could you check that this patch is related to the issue?
> 
> Thanks,
> Naoya Horiguchi
> 
> ---
> [   25.634476] BUG: kernel NULL pointer dereference, address: 0000000000000034
> [   25.635980] #PF: supervisor write access in kernel mode
> [   25.637283] #PF: error_code(0x0002) - not-present page
> [   25.638365] PGD 0 P4D 0
> [   25.638906] Oops: 0002 [#1] PREEMPT SMP PTI
> [   25.639779] CPU: 4 PID: 819 Comm: sysctl Tainted: G            E    N 6.0.0-rc3-v6.0-rc1-220920-1758-1398-g2b3f5+ #12
> [   25.641928] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1.fc35 04/01/2014
> [   25.643727] RIP: 0010:alloc_buddy_huge_page.isra.0+0x8c/0x140
> [   25.645071] Code: fe ff 41 83 fc 01 0f 84 54 94 8b 00 41 bc 01 00 00 00 44 89 f7 4c 89 f9 44 89 ea 89 de e8 7c b9 fe ff 48 89 c7 b8 01 00 00 00 <f0> 0f b1 6f 34 66 90 83 f8 01 75 c5 48 85 ff 74 52 65 48 ff 05 03
> [   25.649006] RSP: 0018:ffffaa7181fffc18 EFLAGS: 00010286
> [   25.650215] RAX: 0000000000000001 RBX: 0000000000000009 RCX: 0000000000000009
> [   25.651672] RDX: ffffffffae3b6df0 RSI: ffffffffae8f7ce0 RDI: 0000000000000000
> [   25.653115] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000c01
> [   25.654579] R10: 0000000000000f90 R11: 0000000000000000 R12: 0000000000000002
> [   25.656176] R13: 0000000000000000 R14: 0000000000346cca R15: ffffffffae8f7ce0
> [   25.657637] FS:  00007f9252f2a740(0000) GS:ffff98cebbc00000(0000) knlGS:0000000000000000
> [   25.659292] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   25.660469] CR2: 0000000000000034 CR3: 000000014924c004 CR4: 0000000000170ee0
> [   25.661928] Call Trace:
> [   25.662469]  <TASK>
> [   25.662927]  alloc_fresh_huge_page+0x16f/0x1d0
> [   25.663859]  alloc_pool_huge_page+0x6d/0xb0
> [   25.664734]  __nr_hugepages_store_common+0x189/0x3e0
> [   25.665764]  ? __do_proc_doulongvec_minmax+0x31f/0x340
> [   25.666832]  hugetlb_sysctl_handler_common+0xbf/0xd0
> [   25.667861]  ? hugetlb_register_node+0xe0/0xe0
> [   25.668786]  proc_sys_call_handler+0x196/0x2b0
> [   25.669724]  vfs_write+0x29b/0x3a0
> [   25.670454]  ksys_write+0x4f/0xd0
> [   25.671153]  do_syscall_64+0x3b/0x90
> [   25.671909]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
> [   25.672958] RIP: 0033:0x7f9252d3e727
> [   25.673712] Code: 0b 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
> [   25.677470] RSP: 002b:00007ffcdf9904a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
> [   25.679002] RAX: ffffffffffffffda RBX: 000055c6ae683210 RCX: 00007f9252d3e727
> [   25.680456] RDX: 0000000000000006 RSI: 000055c6ae683250 RDI: 0000000000000003
> [   25.681910] RBP: 000055c6ae685380 R08: 0000000000000003 R09: 0000000000000077
> [   25.683373] R10: 000000000000006b R11: 0000000000000246 R12: 0000000000000006
> [   25.684824] R13: 0000000000000006 R14: 0000000000000006 R15: 00007f9252df59e0
> [   25.686293]  </TASK>

Hello Naoya,

My bad for an obvious mistake!  Note this change in the patch,

> @@ -1951,7 +1953,21 @@ static struct page *alloc_buddy_huge_page(struct hstate *h,
>  		gfp_mask |= __GFP_RETRY_MAYFAIL;
>  	if (nid == NUMA_NO_NODE)
>  		nid = numa_mem_id();
> +retry:
>  	page = __alloc_pages(gfp_mask, order, nid, nmask);
> +
> +	/* Freeze head page */
> +	if (!page_ref_freeze(page, 1)) {
> +		__free_pages(page, order);
> +		if (retry) {	/* retry once */
> +			retry = false;
> +			goto retry;
> +		}
> +		/* WOW!  twice in a row. */
> +		pr_warn("HugeTLB head page unexpected inflated ref count\n");
> +		page = NULL;
> +	}
> +
>  	if (page)
>  		__count_vm_event(HTLB_BUDDY_PGALLOC);
>  	else

It does not check for a successful return from __alloc_pages before trying to
freeze 'page'.  It should obviously check for page == NULL before trying to
freeze with something like this.

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index db90658db171..a092dd639687 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1961,7 +1961,7 @@ static struct page *alloc_buddy_huge_page(struct hstate *h,
 	page = __alloc_pages(gfp_mask, order, nid, nmask);
 
 	/* Freeze head page */
-	if (!page_ref_freeze(page, 1)) {
+	if (page && !page_ref_freeze(page, 1)) {
 		__free_pages(page, order);
 		if (retry) {	/* retry once */
 			retry = false;

I will send out a v3 (my) tomorrow.
-- 
Mike Kravetz


      reply	other threads:[~2022-09-21  3:18 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-16 21:46 [PATCH v2] hugetlb: freeze allocated pages before creating hugetlb pages Mike Kravetz
2022-09-20  4:02 ` Oscar Salvador
2022-09-20 17:10   ` Mike Kravetz
2022-09-21  3:54     ` Oscar Salvador
2022-09-21  1:48 ` HORIGUCHI NAOYA(堀口 直也)
2022-09-21  3:17   ` Mike Kravetz [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YyqCS6+OXAgoqI8T@monkey \
    --to=mike.kravetz@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=joao.m.martins@oracle.com \
    --cc=linmiaohe@huawei.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=naoya.horiguchi@nec.com \
    --cc=peterx@redhat.com \
    --cc=songmuchun@bytedance.com \
    --cc=vbabka@kernel.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.