linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Michal Hocko <mhocko@suse.com>
To: Gao Xiang <hsiangkao@linux.alibaba.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Mel Gorman <mgorman@techsingularity.net>,
	Vlastimil Babka <vbabka@suse.cz>, Baoquan He <bhe@redhat.com>,
	Christoph Hellwig <hch@lst.de>,
	Uladzislau Rezki <urezki@gmail.com>
Subject: Re: [PATCH] mm/page_alloc: avoid high-order page allocation warn with __GFP_NOFAIL
Date: Mon, 6 Mar 2023 08:51:40 +0100	[thread overview]
Message-ID: <ZAWbjIJCarmxGa8k@dhcp22.suse.cz> (raw)
In-Reply-To: <20230305053035.1911-1-hsiangkao@linux.alibaba.com>

[Cc couple of more people recently involved with vmalloc code]

On Sun 05-03-23 13:30:35, Gao Xiang wrote:
> My knowledge of this is somewhat limited, however, since vmalloc already
> supported __GFP_NOFAIL in commit 9376130c390a ("mm/vmalloc: add
> support for __GFP_NOFAIL").  __GFP_NOFAIL could trigger the following
> stack and allocate high-order pages when CONFIG_HAVE_ARCH_HUGE_VMALLOC
> is enabled:
> 
>  __alloc_pages+0x1cb/0x5b0 mm/page_alloc.c:5549
>  alloc_pages+0x1aa/0x270 mm/mempolicy.c:2286
>  vm_area_alloc_pages mm/vmalloc.c:2989 [inline]
>
>  __vmalloc_area_node mm/vmalloc.c:3057 [inline]
>  __vmalloc_node_range+0x978/0x13c0 mm/vmalloc.c:3227
>  kvmalloc_node+0x156/0x1a0 mm/util.c:606
>  kvmalloc include/linux/slab.h:737 [inline]
>  kvmalloc_array include/linux/slab.h:755 [inline]
>  kvcalloc include/linux/slab.h:760 [inline]
>  (codebase: Linux 6.2-rc2)
> 
> Don't warn such cases since high-order pages with __GFP_NOFAIL is
> somewhat legel.

OK, this is definitely a bug and it seems my 9376130c390a was
incomplete because it hasn't covered the high order case. Not sure how
that happened but removing the warning is not the right thing to do
here. The higher order allocation is an optimization rather than a must.
So it is perfectly fine to fail that allocation and retry rather than
go into a very expensive and potentially impossible higher order
allocation that must not fail.

The proper fix should look like this unless I am missing something. I
would appreciate another pair of eyes on this because I am not fully
familiar with the high order optimization part much.

Thanks!
--- 
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index ef910bf349e1..a8aa2765618a 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2883,6 +2883,8 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
 		unsigned int order, unsigned int nr_pages, struct page **pages)
 {
 	unsigned int nr_allocated = 0;
+	gfp_t alloc_gfp = gfp;
+	bool nofail = false;
 	struct page *page;
 	int i;
 
@@ -2931,20 +2933,30 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
 			if (nr != nr_pages_request)
 				break;
 		}
+	} else {
+		alloc_gfp &= ~__GFP_NOFAIL;
+		nofail = true;
 	}
 
 	/* High-order pages or fallback path if "bulk" fails. */
-
 	while (nr_allocated < nr_pages) {
 		if (fatal_signal_pending(current))
 			break;
 
 		if (nid == NUMA_NO_NODE)
-			page = alloc_pages(gfp, order);
+			page = alloc_pages(alloc_gfp, order);
 		else
-			page = alloc_pages_node(nid, gfp, order);
-		if (unlikely(!page))
-			break;
+			page = alloc_pages_node(nid, alloc_gfp, order);
+		if (unlikely(!page)) {
+			if (!nofail)
+				break;
+
+			/* fall back to the zero order allocations */
+			alloc_gfp |= __GFP_NOFAIL;
+			order = 0;
+			continue;
+		}
+
 		/*
 		 * Higher order allocations must be able to be treated as
 		 * indepdenent small pages by callers (as they can with
-- 
Michal Hocko
SUSE Labs


  reply	other threads:[~2023-03-06  7:51 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-05  5:30 [PATCH] mm/page_alloc: avoid high-order page allocation warn with __GFP_NOFAIL Gao Xiang
2023-03-06  7:51 ` Michal Hocko [this message]
2023-03-06  8:03   ` Gao Xiang
2023-03-06 12:14   ` Uladzislau Rezki
2023-03-06 14:03     ` [PATCH] mm, vmalloc: fix high order __GFP_NOFAIL allocations Michal Hocko
2023-03-06 16:37       ` Uladzislau Rezki
2023-03-06 17:29       ` Vlastimil Babka
2023-03-06 17:38         ` Michal Hocko
2023-03-07  0:58       ` Baoquan He

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZAWbjIJCarmxGa8k@dhcp22.suse.cz \
    --to=mhocko@suse.com \
    --cc=akpm@linux-foundation.org \
    --cc=bhe@redhat.com \
    --cc=hch@lst.de \
    --cc=hsiangkao@linux.alibaba.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@techsingularity.net \
    --cc=urezki@gmail.com \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).