linux-mm.kvack.org archive mirror
* [PATCH v4 0/3] mm/vdpa: correct misuse of non-direct-reclaim __GFP_NOFAIL and improve related doc and warn
@ 2024-08-30 20:28 Barry Song
  2024-08-30 20:28 ` [PATCH v4 1/3] vduse: avoid using __GFP_NOFAIL Barry Song
                   ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Barry Song @ 2024-08-30 20:28 UTC (permalink / raw)
  To: akpm, linux-mm, virtualization
  Cc: david, 42.hyeyoo, cl, hailong.liu, hch, iamjoonsoo.kim, mhocko,
	penberg, rientjes, roman.gushchin, torvalds, urezki, v-songbaohua,
	vbabka, laoar.shao

From: Barry Song <v-songbaohua@oppo.com>


-v4:
 * drop all BUG_ON, Linus, David, Yafang;
 * warn in a better place, Vlastimil;
 * drop patch 3/4, the maximum supported size for __GFP_NOFAIL will be
   discussed separately, Michal;
 
-v3:
 https://lore.kernel.org/linux-mm/20240817062449.21164-1-21cnbao@gmail.com/
 * collect reviewed-by, acked-by etc. Michal, Christoph, Vlastimil, Davidlohr,
   thanks!
 * use Jason's patch[1] to fix vdpa and refine his changelog.
 * refine changelogs
[1] https://lore.kernel.org/all/20240808054320.10017-1-jasowang@redhat.com/

-v2:
 https://lore.kernel.org/linux-mm/20240731000155.109583-1-21cnbao@gmail.com/

 * adjust the vdpa fix according to Jason's and Michal's feedback; I
   would expect Jason to test it, thanks!
 * split BUG_ON of unavoidable failure and the case GFP_ATOMIC |
   __GFP_NOFAIL into two patches according to Vlastimil and Michal.
 * collect Michal's acked-by for patch 2 - the doc;
 * remove the new GFP_NOFAIL from this series; that one will be a
   separate enhancement patchset later on.

-v1:
 https://lore.kernel.org/linux-mm/20240724085544.299090-1-21cnbao@gmail.com/

__GFP_NOFAIL carries the semantics of never failing, so its callers
do not check the return value:
  %__GFP_NOFAIL: The VM implementation _must_ retry infinitely: the caller
  cannot handle allocation failures. The allocation could block
  indefinitely but will never return with failure. Testing for
  failure is pointless.

However, __GFP_NOFAIL can still fail if it exceeds size limits
or is used with GFP_ATOMIC/GFP_NOWAIT in a non-sleepable context.
This patchset handles the illegal combination of __GFP_NOFAIL with
GFP_ATOMIC, which lacks __GFP_DIRECT_RECLAIM (without direct reclaim,
we can't do anything to reclaim memory to satisfy the nofail
requirement), and improves the related documentation and warnings.
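
For illustration only (a minimal sketch, not taken from the patches
themselves), the difference between a supported and an unsupported
request looks like this:

	/* supported: GFP_KERNEL includes __GFP_DIRECT_RECLAIM, so the
	 * allocator may retry forever and the result needs no NULL check */
	page = alloc_page(GFP_KERNEL | __GFP_NOFAIL);

	/* unsupported: GFP_ATOMIC lacks __GFP_DIRECT_RECLAIM, so the
	 * allocator cannot reclaim anything and may return NULL despite
	 * __GFP_NOFAIL */
	page = alloc_page(GFP_ATOMIC | __GFP_NOFAIL);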

The proper size limits for __GFP_NOFAIL will be handled separately
after more discussions.

* The discussion started from this topic:
 [PATCH RFC] mm: warn potential return NULL for kmalloc_array and
             kvmalloc_array with __GFP_NOFAIL

 https://lore.kernel.org/linux-mm/20240717230025.77361-1-21cnbao@gmail.com/
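
For context, kmalloc_array() checks for multiplication overflow before
allocating, so it can return NULL even when __GFP_NOFAIL is set;
roughly (simplified sketch of the helper in include/linux/slab.h):

static inline void *kmalloc_array(size_t n, size_t size, gfp_t flags)
{
	size_t bytes;

	if (unlikely(check_mul_overflow(n, size, &bytes)))
		return NULL;	/* returned even with __GFP_NOFAIL */

	return kmalloc(bytes, flags);
}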


Barry Song (2):
  mm: document __GFP_NOFAIL must be blockable
  mm: warn about illegal __GFP_NOFAIL usage in a more appropriate
    location and manner

Jason Wang (1):
  vduse: avoid using __GFP_NOFAIL

 drivers/vdpa/vdpa_user/iova_domain.c | 19 ++++++-----
 drivers/vdpa/vdpa_user/iova_domain.h |  1 +
 include/linux/gfp_types.h            |  5 ++-
 mm/page_alloc.c                      | 50 ++++++++++++++--------------
 4 files changed, 41 insertions(+), 34 deletions(-)

-- 
2.34.1




* [PATCH v4 1/3] vduse: avoid using __GFP_NOFAIL
  2024-08-30 20:28 [PATCH v4 0/3] mm/vdpa: correct misuse of non-direct-reclaim __GFP_NOFAIL and improve related doc and warn Barry Song
@ 2024-08-30 20:28 ` Barry Song
  2024-09-02  7:33   ` David Hildenbrand
  2024-08-30 20:28 ` [PATCH v4 2/3] mm: document __GFP_NOFAIL must be blockable Barry Song
  2024-08-30 20:28 ` [PATCH v4 3/3] mm: warn about illegal __GFP_NOFAIL usage in a more appropriate location and manner Barry Song
  2 siblings, 1 reply; 17+ messages in thread
From: Barry Song @ 2024-08-30 20:28 UTC (permalink / raw)
  To: akpm, linux-mm, virtualization
  Cc: david, 42.hyeyoo, cl, hailong.liu, hch, iamjoonsoo.kim, mhocko,
	penberg, rientjes, roman.gushchin, torvalds, urezki, v-songbaohua,
	vbabka, laoar.shao, Jason Wang, Xie Yongji

From: Jason Wang <jasowang@redhat.com>

mm doesn't support non-blockable __GFP_NOFAIL allocation, because
persisting in providing __GFP_NOFAIL service to users that cannot
perform direct memory reclaim would only result in an endless busy
loop.

Therefore, in such cases, the current mm core may directly return
a NULL pointer:

static inline struct page *
__alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
                                                struct alloc_context *ac)
{
        ...
        if (gfp_mask & __GFP_NOFAIL) {
                /*
                 * All existing users of the __GFP_NOFAIL are blockable, so warn
                 * of any new users that actually require GFP_NOWAIT
                 */
                if (WARN_ON_ONCE_GFP(!can_direct_reclaim, gfp_mask))
                        goto fail;
                ...
        }
        ...
fail:
        warn_alloc(gfp_mask, ac->nodemask,
                        "page allocation failure: order:%u", order);
got_pg:
        return page;
}

Unfortunately, vduse does that nofail allocation under a non-sleepable
lock. A possible way to fix that is to move the page allocation out
of the lock into the caller, but having to allocate a huge number of
pages and an auxiliary page array seems to be problematic as well, per
Tetsuo: "You should implement proper error handling instead of
using __GFP_NOFAIL if count can become large."

So I chose another way: do not release the kernel bounce pages when
the user tries to register userspace bounce pages. Then we can
avoid allocating in paths where failure is not expected (e.g. in
the release path). We pay for this with higher memory usage, since the
kernel bounce pages are kept, but further optimizations could be done
on top.

Fixes: 6c77ed22880d ("vduse: Support using userspace pages as bounce buffer")
Reviewed-by: Xie Yongji <xieyongji@bytedance.com>
Tested-by: Xie Yongji <xieyongji@bytedance.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
[v-songbaohua@oppo.com: Refine the changelog]
Signed-off-by: Barry Song <v-songbaohua@oppo.com>
---
 drivers/vdpa/vdpa_user/iova_domain.c | 19 +++++++++++--------
 drivers/vdpa/vdpa_user/iova_domain.h |  1 +
 2 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/drivers/vdpa/vdpa_user/iova_domain.c b/drivers/vdpa/vdpa_user/iova_domain.c
index 791d38d6284c..58116f89d8da 100644
--- a/drivers/vdpa/vdpa_user/iova_domain.c
+++ b/drivers/vdpa/vdpa_user/iova_domain.c
@@ -162,6 +162,7 @@ static void vduse_domain_bounce(struct vduse_iova_domain *domain,
 				enum dma_data_direction dir)
 {
 	struct vduse_bounce_map *map;
+	struct page *page;
 	unsigned int offset;
 	void *addr;
 	size_t sz;
@@ -178,7 +179,10 @@ static void vduse_domain_bounce(struct vduse_iova_domain *domain,
 			    map->orig_phys == INVALID_PHYS_ADDR))
 			return;
 
-		addr = kmap_local_page(map->bounce_page);
+		page = domain->user_bounce_pages ?
+		       map->user_bounce_page : map->bounce_page;
+
+		addr = kmap_local_page(page);
 		do_bounce(map->orig_phys + offset, addr + offset, sz, dir);
 		kunmap_local(addr);
 		size -= sz;
@@ -270,9 +274,8 @@ int vduse_domain_add_user_bounce_pages(struct vduse_iova_domain *domain,
 				memcpy_to_page(pages[i], 0,
 					       page_address(map->bounce_page),
 					       PAGE_SIZE);
-			__free_page(map->bounce_page);
 		}
-		map->bounce_page = pages[i];
+		map->user_bounce_page = pages[i];
 		get_page(pages[i]);
 	}
 	domain->user_bounce_pages = true;
@@ -297,17 +300,17 @@ void vduse_domain_remove_user_bounce_pages(struct vduse_iova_domain *domain)
 		struct page *page = NULL;
 
 		map = &domain->bounce_maps[i];
-		if (WARN_ON(!map->bounce_page))
+		if (WARN_ON(!map->user_bounce_page))
 			continue;
 
 		/* Copy user page to kernel page if it's in use */
 		if (map->orig_phys != INVALID_PHYS_ADDR) {
-			page = alloc_page(GFP_ATOMIC | __GFP_NOFAIL);
+			page = map->bounce_page;
 			memcpy_from_page(page_address(page),
-					 map->bounce_page, 0, PAGE_SIZE);
+					 map->user_bounce_page, 0, PAGE_SIZE);
 		}
-		put_page(map->bounce_page);
-		map->bounce_page = page;
+		put_page(map->user_bounce_page);
+		map->user_bounce_page = NULL;
 	}
 	domain->user_bounce_pages = false;
 out:
diff --git a/drivers/vdpa/vdpa_user/iova_domain.h b/drivers/vdpa/vdpa_user/iova_domain.h
index f92f22a7267d..7f3f0928ec78 100644
--- a/drivers/vdpa/vdpa_user/iova_domain.h
+++ b/drivers/vdpa/vdpa_user/iova_domain.h
@@ -21,6 +21,7 @@
 
 struct vduse_bounce_map {
 	struct page *bounce_page;
+	struct page *user_bounce_page;
 	u64 orig_phys;
 };
 
-- 
2.34.1




* [PATCH v4 2/3] mm: document __GFP_NOFAIL must be blockable
  2024-08-30 20:28 [PATCH v4 0/3] mm/vdpa: correct misuse of non-direct-reclaim __GFP_NOFAIL and improve related doc and warn Barry Song
  2024-08-30 20:28 ` [PATCH v4 1/3] vduse: avoid using __GFP_NOFAIL Barry Song
@ 2024-08-30 20:28 ` Barry Song
  2024-09-02  7:34   ` David Hildenbrand
  2024-08-30 20:28 ` [PATCH v4 3/3] mm: warn about illegal __GFP_NOFAIL usage in a more appropriate location and manner Barry Song
  2 siblings, 1 reply; 17+ messages in thread
From: Barry Song @ 2024-08-30 20:28 UTC (permalink / raw)
  To: akpm, linux-mm, virtualization
  Cc: david, 42.hyeyoo, cl, hailong.liu, hch, iamjoonsoo.kim, mhocko,
	penberg, rientjes, roman.gushchin, torvalds, urezki, v-songbaohua,
	vbabka, laoar.shao, Christoph Hellwig, Davidlohr Bueso,
	Eugenio Pérez, Jason Wang, Kees Cook, Lorenzo Stoakes,
	Maxime Coquelin, Michael S. Tsirkin, Xuan Zhuo

From: Barry Song <v-songbaohua@oppo.com>

Non-blocking allocation with __GFP_NOFAIL is not supported and may still
result in a NULL pointer (if we didn't return NULL, we would end up
busy-looping in non-sleepable contexts):

static inline struct page *
__alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
						struct alloc_context *ac)
{
	...
	/*
	 * Make sure that __GFP_NOFAIL request doesn't leak out and make sure
	 * we always retry
	 */
	if (gfp_mask & __GFP_NOFAIL) {
		/*
		 * All existing users of the __GFP_NOFAIL are blockable, so warn
		 * of any new users that actually require GFP_NOWAIT
		 */
		if (WARN_ON_ONCE_GFP(!can_direct_reclaim, gfp_mask))
			goto fail;
		...
	}
	...
fail:
	warn_alloc(gfp_mask, ac->nodemask,
			"page allocation failure: order:%u", order);
got_pg:
	return page;
}

Highlight this in the documentation of __GFP_NOFAIL so that non-mm
subsystems can reject any illegal usage of __GFP_NOFAIL with GFP_ATOMIC,
GFP_NOWAIT, etc.
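
For instance, a subsystem-level wrapper could reject such combinations
up front; a minimal sketch (subsys_alloc() is a hypothetical helper,
not part of this patch):

static void *subsys_alloc(size_t size, gfp_t gfp)
{
	/* non-blockable __GFP_NOFAIL is a bug, not an allocation failure */
	if (WARN_ON_ONCE((gfp & __GFP_NOFAIL) &&
			 !(gfp & __GFP_DIRECT_RECLAIM)))
		return NULL;

	return kmalloc(size, gfp);
}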

Signed-off-by: Barry Song <v-songbaohua@oppo.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: "Eugenio Pérez" <eperezma@redhat.com>
Cc: Hailong.Liu <hailong.liu@oppo.com>
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Kees Cook <kees@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Maxime Coquelin <maxime.coquelin@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
 include/linux/gfp_types.h | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/linux/gfp_types.h b/include/linux/gfp_types.h
index 313be4ad79fd..4a1fa7706b0c 100644
--- a/include/linux/gfp_types.h
+++ b/include/linux/gfp_types.h
@@ -215,7 +215,8 @@ enum {
  * the caller still has to check for failures) while costly requests try to be
  * not disruptive and back off even without invoking the OOM killer.
  * The following three modifiers might be used to override some of these
- * implicit rules.
+ * implicit rules. Please note that all of them must be used along with
+ * %__GFP_DIRECT_RECLAIM flag.
  *
  * %__GFP_NORETRY: The VM implementation will try only very lightweight
  * memory direct reclaim to get some memory under memory pressure (thus
@@ -246,6 +247,8 @@ enum {
  * cannot handle allocation failures. The allocation could block
  * indefinitely but will never return with failure. Testing for
  * failure is pointless.
+ * It _must_ be blockable and used together with __GFP_DIRECT_RECLAIM.
+ * It should _never_ be used in non-sleepable contexts.
  * New users should be evaluated carefully (and the flag should be
  * used only when there is no reasonable failure policy) but it is
  * definitely preferable to use the flag rather than opencode endless
-- 
2.34.1




* [PATCH v4 3/3] mm: warn about illegal __GFP_NOFAIL usage in a more appropriate location and manner
  2024-08-30 20:28 [PATCH v4 0/3] mm/vdpa: correct misuse of non-direct-reclaim __GFP_NOFAIL and improve related doc and warn Barry Song
  2024-08-30 20:28 ` [PATCH v4 1/3] vduse: avoid using __GFP_NOFAIL Barry Song
  2024-08-30 20:28 ` [PATCH v4 2/3] mm: document __GFP_NOFAIL must be blockable Barry Song
@ 2024-08-30 20:28 ` Barry Song
  2024-09-01 20:18   ` Vlastimil Babka
                     ` (3 more replies)
  2 siblings, 4 replies; 17+ messages in thread
From: Barry Song @ 2024-08-30 20:28 UTC (permalink / raw)
  To: akpm, linux-mm, virtualization
  Cc: david, 42.hyeyoo, cl, hailong.liu, hch, iamjoonsoo.kim, mhocko,
	penberg, rientjes, roman.gushchin, torvalds, urezki, v-songbaohua,
	vbabka, laoar.shao

From: Barry Song <v-songbaohua@oppo.com>

Three points for this change:

1. We should consolidate all warnings in one place. Currently, the
   order > 1 warning is in the hotpath, while others are in less
   likely scenarios. Moving all warnings to the slowpath will reduce
   the overhead for order > 1 and increase the visibility of other
   warnings.

2. We currently have two warnings for order: one for order > 1 in
   the hotpath and another for order > costly_order in the laziest
   path. I suggest standardizing on order > 1 since it’s been in
   use for a long time.

3. We don't need to check for __GFP_NOWARN in this case. __GFP_NOWARN
   is meant to suppress allocation failure reports, but here we're
   dealing with bug detection, not allocation failures. So replace
   WARN_ON_ONCE_GFP by WARN_ON_ONCE.
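
For reference, WARN_ON_ONCE_GFP() in mm/internal.h is roughly the
following, i.e. it stays silent whenever __GFP_NOWARN is set:

#define WARN_ON_ONCE_GFP(cond, gfp)	({				\
	static bool __section(".data.once") __warned;			\
	int __ret_warn_once = !!(cond);					\
									\
	if (unlikely(__ret_warn_once && !__warned &&			\
		     !(gfp & __GFP_NOWARN))) {				\
		__warned = true;					\
		WARN_ON(1);						\
	}								\
	unlikely(__ret_warn_once);					\
})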

Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Barry Song <v-songbaohua@oppo.com>
---
 mm/page_alloc.c | 50 ++++++++++++++++++++++++-------------------------
 1 file changed, 25 insertions(+), 25 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index c81ee5662cc7..e790b4227322 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3033,12 +3033,6 @@ struct page *rmqueue(struct zone *preferred_zone,
 {
 	struct page *page;
 
-	/*
-	 * We most definitely don't want callers attempting to
-	 * allocate greater than order-1 page units with __GFP_NOFAIL.
-	 */
-	WARN_ON_ONCE((gfp_flags & __GFP_NOFAIL) && (order > 1));
-
 	if (likely(pcp_allowed_order(order))) {
 		page = rmqueue_pcplist(preferred_zone, zone, order,
 				       migratetype, alloc_flags);
@@ -4175,6 +4169,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 {
 	bool can_direct_reclaim = gfp_mask & __GFP_DIRECT_RECLAIM;
 	bool can_compact = gfp_compaction_allowed(gfp_mask);
+	bool nofail = gfp_mask & __GFP_NOFAIL;
 	const bool costly_order = order > PAGE_ALLOC_COSTLY_ORDER;
 	struct page *page = NULL;
 	unsigned int alloc_flags;
@@ -4187,6 +4182,25 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	unsigned int zonelist_iter_cookie;
 	int reserve_flags;
 
+	if (unlikely(nofail)) {
+		/*
+		 * We most definitely don't want callers attempting to
+		 * allocate greater than order-1 page units with __GFP_NOFAIL.
+		 */
+		WARN_ON_ONCE(order > 1);
+		/*
+		 * Also we don't support __GFP_NOFAIL without __GFP_DIRECT_RECLAIM,
+		 * otherwise, we may result in lockup.
+		 */
+		WARN_ON_ONCE(!can_direct_reclaim);
+		/*
+		 * PF_MEMALLOC request from this context is rather bizarre
+		 * because we cannot reclaim anything and only can loop waiting
+		 * for somebody to do a work for us.
+		 */
+		WARN_ON_ONCE(current->flags & PF_MEMALLOC);
+	}
+
 restart:
 	compaction_retries = 0;
 	no_progress_loops = 0;
@@ -4404,29 +4418,15 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	 * Make sure that __GFP_NOFAIL request doesn't leak out and make sure
 	 * we always retry
 	 */
-	if (gfp_mask & __GFP_NOFAIL) {
+	if (unlikely(nofail)) {
 		/*
-		 * All existing users of the __GFP_NOFAIL are blockable, so warn
-		 * of any new users that actually require GFP_NOWAIT
+		 * Lacking direct_reclaim we can't do anything to reclaim memory,
+		 * we disregard these unreasonable nofail requests and still
+		 * return NULL
 		 */
-		if (WARN_ON_ONCE_GFP(!can_direct_reclaim, gfp_mask))
+		if (!can_direct_reclaim)
 			goto fail;
 
-		/*
-		 * PF_MEMALLOC request from this context is rather bizarre
-		 * because we cannot reclaim anything and only can loop waiting
-		 * for somebody to do a work for us
-		 */
-		WARN_ON_ONCE_GFP(current->flags & PF_MEMALLOC, gfp_mask);
-
-		/*
-		 * non failing costly orders are a hard requirement which we
-		 * are not prepared for much so let's warn about these users
-		 * so that we can identify them and convert them to something
-		 * else.
-		 */
-		WARN_ON_ONCE_GFP(costly_order, gfp_mask);
-
 		/*
 		 * Help non-failing allocations by giving some access to memory
 		 * reserves normally used for high priority non-blocking
-- 
2.34.1




* Re: [PATCH v4 3/3] mm: warn about illegal __GFP_NOFAIL usage in a more appropriate location and manner
  2024-08-30 20:28 ` [PATCH v4 3/3] mm: warn about illegal __GFP_NOFAIL usage in a more appropriate location and manner Barry Song
@ 2024-09-01 20:18   ` Vlastimil Babka
  2024-09-02  3:23   ` Yafang Shao
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 17+ messages in thread
From: Vlastimil Babka @ 2024-09-01 20:18 UTC (permalink / raw)
  To: Barry Song, akpm, linux-mm, virtualization
  Cc: david, 42.hyeyoo, cl, hailong.liu, hch, iamjoonsoo.kim, mhocko,
	penberg, rientjes, roman.gushchin, torvalds, urezki, v-songbaohua,
	laoar.shao

On 8/30/24 22:28, Barry Song wrote:
> From: Barry Song <v-songbaohua@oppo.com>
> 
> Three points for this change:
> 
> 1. We should consolidate all warnings in one place. Currently, the
>    order > 1 warning is in the hotpath, while others are in less
>    likely scenarios. Moving all warnings to the slowpath will reduce
>    the overhead for order > 1 and increase the visibility of other
>    warnings.
> 
> 2. We currently have two warnings for order: one for order > 1 in
>    the hotpath and another for order > costly_order in the laziest
>    path. I suggest standardizing on order > 1 since it’s been in
>    use for a long time.
> 
> 3. We don't need to check for __GFP_NOWARN in this case. __GFP_NOWARN
>    is meant to suppress allocation failure reports, but here we're
>    dealing with bug detection, not allocation failures. So replace
>    WARN_ON_ONCE_GFP by WARN_ON_ONCE.
> 
> Suggested-by: Vlastimil Babka <vbabka@suse.cz>
> Signed-off-by: Barry Song <v-songbaohua@oppo.com>

Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Thanks!

> ---
>  mm/page_alloc.c | 50 ++++++++++++++++++++++++-------------------------
>  1 file changed, 25 insertions(+), 25 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index c81ee5662cc7..e790b4227322 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3033,12 +3033,6 @@ struct page *rmqueue(struct zone *preferred_zone,
>  {
>  	struct page *page;
>  
> -	/*
> -	 * We most definitely don't want callers attempting to
> -	 * allocate greater than order-1 page units with __GFP_NOFAIL.
> -	 */
> -	WARN_ON_ONCE((gfp_flags & __GFP_NOFAIL) && (order > 1));
> -
>  	if (likely(pcp_allowed_order(order))) {
>  		page = rmqueue_pcplist(preferred_zone, zone, order,
>  				       migratetype, alloc_flags);
> @@ -4175,6 +4169,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
>  {
>  	bool can_direct_reclaim = gfp_mask & __GFP_DIRECT_RECLAIM;
>  	bool can_compact = gfp_compaction_allowed(gfp_mask);
> +	bool nofail = gfp_mask & __GFP_NOFAIL;
>  	const bool costly_order = order > PAGE_ALLOC_COSTLY_ORDER;
>  	struct page *page = NULL;
>  	unsigned int alloc_flags;
> @@ -4187,6 +4182,25 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
>  	unsigned int zonelist_iter_cookie;
>  	int reserve_flags;
>  
> +	if (unlikely(nofail)) {
> +		/*
> +		 * We most definitely don't want callers attempting to
> +		 * allocate greater than order-1 page units with __GFP_NOFAIL.
> +		 */
> +		WARN_ON_ONCE(order > 1);
> +		/*
> +		 * Also we don't support __GFP_NOFAIL without __GFP_DIRECT_RECLAIM,
> +		 * otherwise, we may result in lockup.
> +		 */
> +		WARN_ON_ONCE(!can_direct_reclaim);
> +		/*
> +		 * PF_MEMALLOC request from this context is rather bizarre
> +		 * because we cannot reclaim anything and only can loop waiting
> +		 * for somebody to do a work for us.
> +		 */
> +		WARN_ON_ONCE(current->flags & PF_MEMALLOC);
> +	}
> +
>  restart:
>  	compaction_retries = 0;
>  	no_progress_loops = 0;
> @@ -4404,29 +4418,15 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
>  	 * Make sure that __GFP_NOFAIL request doesn't leak out and make sure
>  	 * we always retry
>  	 */
> -	if (gfp_mask & __GFP_NOFAIL) {
> +	if (unlikely(nofail)) {
>  		/*
> -		 * All existing users of the __GFP_NOFAIL are blockable, so warn
> -		 * of any new users that actually require GFP_NOWAIT
> +		 * Lacking direct_reclaim we can't do anything to reclaim memory,
> +		 * we disregard these unreasonable nofail requests and still
> +		 * return NULL
>  		 */
> -		if (WARN_ON_ONCE_GFP(!can_direct_reclaim, gfp_mask))
> +		if (!can_direct_reclaim)
>  			goto fail;
>  
> -		/*
> -		 * PF_MEMALLOC request from this context is rather bizarre
> -		 * because we cannot reclaim anything and only can loop waiting
> -		 * for somebody to do a work for us
> -		 */
> -		WARN_ON_ONCE_GFP(current->flags & PF_MEMALLOC, gfp_mask);
> -
> -		/*
> -		 * non failing costly orders are a hard requirement which we
> -		 * are not prepared for much so let's warn about these users
> -		 * so that we can identify them and convert them to something
> -		 * else.
> -		 */
> -		WARN_ON_ONCE_GFP(costly_order, gfp_mask);
> -
>  		/*
>  		 * Help non-failing allocations by giving some access to memory
>  		 * reserves normally used for high priority non-blocking




* Re: [PATCH v4 3/3] mm: warn about illegal __GFP_NOFAIL usage in a more appropriate location and manner
  2024-08-30 20:28 ` [PATCH v4 3/3] mm: warn about illegal __GFP_NOFAIL usage in a more appropriate location and manner Barry Song
  2024-09-01 20:18   ` Vlastimil Babka
@ 2024-09-02  3:23   ` Yafang Shao
  2024-09-02  4:00     ` Barry Song
  2024-09-02  7:40   ` David Hildenbrand
  2024-09-02  7:58   ` Michal Hocko
  3 siblings, 1 reply; 17+ messages in thread
From: Yafang Shao @ 2024-09-02  3:23 UTC (permalink / raw)
  To: Barry Song
  Cc: akpm, linux-mm, virtualization, david, 42.hyeyoo, cl, hailong.liu,
	hch, iamjoonsoo.kim, mhocko, penberg, rientjes, roman.gushchin,
	torvalds, urezki, v-songbaohua, vbabka

On Sat, Aug 31, 2024 at 4:29 AM Barry Song <21cnbao@gmail.com> wrote:
>
> From: Barry Song <v-songbaohua@oppo.com>
>
> Three points for this change:
>
> 1. We should consolidate all warnings in one place. Currently, the
>    order > 1 warning is in the hotpath, while others are in less
>    likely scenarios. Moving all warnings to the slowpath will reduce
>    the overhead for order > 1 and increase the visibility of other
>    warnings.
>
> 2. We currently have two warnings for order: one for order > 1 in
>    the hotpath and another for order > costly_order in the laziest
>    path. I suggest standardizing on order > 1 since it’s been in
>    use for a long time.
>
> 3. We don't need to check for __GFP_NOWARN in this case. __GFP_NOWARN
>    is meant to suppress allocation failure reports, but here we're
>    dealing with bug detection, not allocation failures. So replace
>    WARN_ON_ONCE_GFP by WARN_ON_ONCE.
>
> Suggested-by: Vlastimil Babka <vbabka@suse.cz>
> Signed-off-by: Barry Song <v-songbaohua@oppo.com>
> ---
>  mm/page_alloc.c | 50 ++++++++++++++++++++++++-------------------------
>  1 file changed, 25 insertions(+), 25 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index c81ee5662cc7..e790b4227322 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3033,12 +3033,6 @@ struct page *rmqueue(struct zone *preferred_zone,
>  {
>         struct page *page;
>
> -       /*
> -        * We most definitely don't want callers attempting to
> -        * allocate greater than order-1 page units with __GFP_NOFAIL.
> -        */
> -       WARN_ON_ONCE((gfp_flags & __GFP_NOFAIL) && (order > 1));
> -
>         if (likely(pcp_allowed_order(order))) {
>                 page = rmqueue_pcplist(preferred_zone, zone, order,
>                                        migratetype, alloc_flags);
> @@ -4175,6 +4169,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
>  {
>         bool can_direct_reclaim = gfp_mask & __GFP_DIRECT_RECLAIM;
>         bool can_compact = gfp_compaction_allowed(gfp_mask);
> +       bool nofail = gfp_mask & __GFP_NOFAIL;
>         const bool costly_order = order > PAGE_ALLOC_COSTLY_ORDER;
>         struct page *page = NULL;
>         unsigned int alloc_flags;
> @@ -4187,6 +4182,25 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
>         unsigned int zonelist_iter_cookie;
>         int reserve_flags;
>
> +       if (unlikely(nofail)) {
> +               /*
> +                * We most definitely don't want callers attempting to
> +                * allocate greater than order-1 page units with __GFP_NOFAIL.
> +                */
> +               WARN_ON_ONCE(order > 1);
> +               /*
> +                * Also we don't support __GFP_NOFAIL without __GFP_DIRECT_RECLAIM,
> +                * otherwise, we may result in lockup.
> +                */
> +               WARN_ON_ONCE(!can_direct_reclaim);
> +               /*
> +                * PF_MEMALLOC request from this context is rather bizarre
> +                * because we cannot reclaim anything and only can loop waiting
> +                * for somebody to do a work for us.
> +                */
> +               WARN_ON_ONCE(current->flags & PF_MEMALLOC);

I believe we should add the warnings below as well:

  WARN_ON_ONCE(gfp_mask & __GFP_NOMEMALLOC);
  WARN_ON_ONCE(gfp_mask & __GFP_NORETRY);
  WARN_ON_ONCE(gfp_mask & __GFP_RETRY_MAYFAIL);
  ...

I'm not sure if that is enough.
__GFP_NOFAIL is a really horrible thing.

-- 
Regards
Yafang



* Re: [PATCH v4 3/3] mm: warn about illegal __GFP_NOFAIL usage in a more appropriate location and manner
  2024-09-02  3:23   ` Yafang Shao
@ 2024-09-02  4:00     ` Barry Song
  2024-09-02  5:47       ` Yafang Shao
  0 siblings, 1 reply; 17+ messages in thread
From: Barry Song @ 2024-09-02  4:00 UTC (permalink / raw)
  To: Yafang Shao
  Cc: akpm, linux-mm, virtualization, david, 42.hyeyoo, cl, hailong.liu,
	hch, iamjoonsoo.kim, mhocko, penberg, rientjes, roman.gushchin,
	torvalds, urezki, v-songbaohua, vbabka

On Mon, Sep 2, 2024 at 3:23 PM Yafang Shao <laoar.shao@gmail.com> wrote:
>
> On Sat, Aug 31, 2024 at 4:29 AM Barry Song <21cnbao@gmail.com> wrote:
> >
> > From: Barry Song <v-songbaohua@oppo.com>
> >
> > Three points for this change:
> >
> > 1. We should consolidate all warnings in one place. Currently, the
> >    order > 1 warning is in the hotpath, while others are in less
> >    likely scenarios. Moving all warnings to the slowpath will reduce
> >    the overhead for order > 1 and increase the visibility of other
> >    warnings.
> >
> > 2. We currently have two warnings for order: one for order > 1 in
> >    the hotpath and another for order > costly_order in the laziest
> >    path. I suggest standardizing on order > 1 since it’s been in
> >    use for a long time.
> >
> > 3. We don't need to check for __GFP_NOWARN in this case. __GFP_NOWARN
> >    is meant to suppress allocation failure reports, but here we're
> >    dealing with bug detection, not allocation failures. So replace
> >    WARN_ON_ONCE_GFP by WARN_ON_ONCE.
> >
> > Suggested-by: Vlastimil Babka <vbabka@suse.cz>
> > Signed-off-by: Barry Song <v-songbaohua@oppo.com>
> > ---
> >  mm/page_alloc.c | 50 ++++++++++++++++++++++++-------------------------
> >  1 file changed, 25 insertions(+), 25 deletions(-)
> >
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index c81ee5662cc7..e790b4227322 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -3033,12 +3033,6 @@ struct page *rmqueue(struct zone *preferred_zone,
> >  {
> >         struct page *page;
> >
> > -       /*
> > -        * We most definitely don't want callers attempting to
> > -        * allocate greater than order-1 page units with __GFP_NOFAIL.
> > -        */
> > -       WARN_ON_ONCE((gfp_flags & __GFP_NOFAIL) && (order > 1));
> > -
> >         if (likely(pcp_allowed_order(order))) {
> >                 page = rmqueue_pcplist(preferred_zone, zone, order,
> >                                        migratetype, alloc_flags);
> > @@ -4175,6 +4169,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
> >  {
> >         bool can_direct_reclaim = gfp_mask & __GFP_DIRECT_RECLAIM;
> >         bool can_compact = gfp_compaction_allowed(gfp_mask);
> > +       bool nofail = gfp_mask & __GFP_NOFAIL;
> >         const bool costly_order = order > PAGE_ALLOC_COSTLY_ORDER;
> >         struct page *page = NULL;
> >         unsigned int alloc_flags;
> > @@ -4187,6 +4182,25 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
> >         unsigned int zonelist_iter_cookie;
> >         int reserve_flags;
> >
> > +       if (unlikely(nofail)) {
> > +               /*
> > +                * We most definitely don't want callers attempting to
> > +                * allocate greater than order-1 page units with __GFP_NOFAIL.
> > +                */
> > +               WARN_ON_ONCE(order > 1);
> > +               /*
> > +                * Also we don't support __GFP_NOFAIL without __GFP_DIRECT_RECLAIM,
> > +                * otherwise, we may result in lockup.
> > +                */
> > +               WARN_ON_ONCE(!can_direct_reclaim);
> > +               /*
> > +                * PF_MEMALLOC request from this context is rather bizarre
> > +                * because we cannot reclaim anything and only can loop waiting
> > +                * for somebody to do a work for us.
> > +                */
> > +               WARN_ON_ONCE(current->flags & PF_MEMALLOC);
>
> I believe we should add the warnings below as well:
>
>   WARN_ON_ONCE(gfp_mask & __GFP_NOMEMALLOC);
>   WARN_ON_ONCE(gfp_mask & __GFP_NORETRY);
>   WARN_ON_ONCE(gfp_mask & __GFP_RETRY_MAYFAIL);
>   ...
>
> I'm not sure if that is enough.
> __GFP_NOFAIL is a really horrible thing.

Thanks! I'd prefer to keep this patchset focused on the existing
warnings and bugs. Any new warnings about size limits or checks
for new flags can be addressed separately.

>
> --
> Regards
> Yafang

Thanks
Barry



* Re: [PATCH v4 3/3] mm: warn about illegal __GFP_NOFAIL usage in a more appropriate location and manner
  2024-09-02  4:00     ` Barry Song
@ 2024-09-02  5:47       ` Yafang Shao
  0 siblings, 0 replies; 17+ messages in thread
From: Yafang Shao @ 2024-09-02  5:47 UTC (permalink / raw)
  To: Barry Song
  Cc: akpm, linux-mm, virtualization, david, 42.hyeyoo, cl, hailong.liu,
	hch, iamjoonsoo.kim, mhocko, penberg, rientjes, roman.gushchin,
	torvalds, urezki, v-songbaohua, vbabka

On Mon, Sep 2, 2024 at 12:00 PM Barry Song <21cnbao@gmail.com> wrote:
>
> On Mon, Sep 2, 2024 at 3:23 PM Yafang Shao <laoar.shao@gmail.com> wrote:
> >
> > On Sat, Aug 31, 2024 at 4:29 AM Barry Song <21cnbao@gmail.com> wrote:
> > >
> > > From: Barry Song <v-songbaohua@oppo.com>
> > >
> > > Three points for this change:
> > >
> > > 1. We should consolidate all warnings in one place. Currently, the
> > >    order > 1 warning is in the hotpath, while others are in less
> > >    likely scenarios. Moving all warnings to the slowpath will reduce
> > >    the overhead for order > 1 and increase the visibility of other
> > >    warnings.
> > >
> > > 2. We currently have two warnings for order: one for order > 1 in
> > >    the hotpath and another for order > costly_order in the laziest
> > >    path. I suggest standardizing on order > 1 since it’s been in
> > >    use for a long time.
> > >
> > > 3. We don't need to check for __GFP_NOWARN in this case. __GFP_NOWARN
> > >    is meant to suppress allocation failure reports, but here we're
> > >    dealing with bug detection, not allocation failures. So replace
> > >    WARN_ON_ONCE_GFP by WARN_ON_ONCE.
> > >
> > > Suggested-by: Vlastimil Babka <vbabka@suse.cz>
> > > Signed-off-by: Barry Song <v-songbaohua@oppo.com>
> > > ---
> > >  mm/page_alloc.c | 50 ++++++++++++++++++++++++-------------------------
> > >  1 file changed, 25 insertions(+), 25 deletions(-)
> > >
> > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > > index c81ee5662cc7..e790b4227322 100644
> > > --- a/mm/page_alloc.c
> > > +++ b/mm/page_alloc.c
> > > @@ -3033,12 +3033,6 @@ struct page *rmqueue(struct zone *preferred_zone,
> > >  {
> > >         struct page *page;
> > >
> > > -       /*
> > > -        * We most definitely don't want callers attempting to
> > > -        * allocate greater than order-1 page units with __GFP_NOFAIL.
> > > -        */
> > > -       WARN_ON_ONCE((gfp_flags & __GFP_NOFAIL) && (order > 1));
> > > -
> > >         if (likely(pcp_allowed_order(order))) {
> > >                 page = rmqueue_pcplist(preferred_zone, zone, order,
> > >                                        migratetype, alloc_flags);
> > > @@ -4175,6 +4169,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
> > >  {
> > >         bool can_direct_reclaim = gfp_mask & __GFP_DIRECT_RECLAIM;
> > >         bool can_compact = gfp_compaction_allowed(gfp_mask);
> > > +       bool nofail = gfp_mask & __GFP_NOFAIL;
> > >         const bool costly_order = order > PAGE_ALLOC_COSTLY_ORDER;
> > >         struct page *page = NULL;
> > >         unsigned int alloc_flags;
> > > @@ -4187,6 +4182,25 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
> > >         unsigned int zonelist_iter_cookie;
> > >         int reserve_flags;
> > >
> > > +       if (unlikely(nofail)) {
> > > +               /*
> > > +                * We most definitely don't want callers attempting to
> > > +                * allocate greater than order-1 page units with __GFP_NOFAIL.
> > > +                */
> > > +               WARN_ON_ONCE(order > 1);
> > > +               /*
> > > +                * Also we don't support __GFP_NOFAIL without __GFP_DIRECT_RECLAIM,
> > > +                * otherwise, we may result in lockup.
> > > +                */
> > > +               WARN_ON_ONCE(!can_direct_reclaim);
> > > +               /*
> > > +                * PF_MEMALLOC request from this context is rather bizarre
> > > +                * because we cannot reclaim anything and only can loop waiting
> > > +                * for somebody to do a work for us.
> > > +                */
> > > +               WARN_ON_ONCE(current->flags & PF_MEMALLOC);
> >
> > I believe we should add the warnings below as well:
> >
> >   WARN_ON_ONCE(gfp_mask & __GFP_NOMEMALLOC);
> >   WARN_ON_ONCE(gfp_mask & __GFP_NORETRY);
> >   WARN_ON_ONCE(gfp_mask & __GFP_RETRY_MAYFAIL);
> >   ...
> >
> > I'm not sure if that is enough.
> > __GFP_NOFAIL is a really horrible thing.
>
> Thanks! I'd prefer to keep this patchset focused on the existing
> warnings and bugs. Any new warnings about size limits or checks
> for new flags can be addressed separately.

OK
Thanks for your work.

-- 
Regards
Yafang



* Re: [PATCH v4 1/3] vduse: avoid using __GFP_NOFAIL
  2024-08-30 20:28 ` [PATCH v4 1/3] vduse: avoid using __GFP_NOFAIL Barry Song
@ 2024-09-02  7:33   ` David Hildenbrand
  2024-09-02  7:58     ` Jason Wang
  0 siblings, 1 reply; 17+ messages in thread
From: David Hildenbrand @ 2024-09-02  7:33 UTC (permalink / raw)
  To: Barry Song, akpm, linux-mm, virtualization
  Cc: 42.hyeyoo, cl, hailong.liu, hch, iamjoonsoo.kim, mhocko, penberg,
	rientjes, roman.gushchin, torvalds, urezki, v-songbaohua, vbabka,
	laoar.shao, Jason Wang, Xie Yongji

On 30.08.24 22:28, Barry Song wrote:
> From: Jason Wang <jasowang@redhat.com>
> 
> mm doesn't support non-blockable __GFP_NOFAIL allocation, because
> persisting in providing __GFP_NOFAIL service to users that cannot
> perform direct memory reclaim would only result in an endless busy
> loop.
> 
> Therefore, in such cases, the current mm core may directly return
> a NULL pointer:
> 
> static inline struct page *
> __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
>                                                  struct alloc_context *ac)
> {
>          ...
>          if (gfp_mask & __GFP_NOFAIL) {
>                  /*
>                   * All existing users of the __GFP_NOFAIL are blockable, so warn
>                   * of any new users that actually require GFP_NOWAIT
>                   */
>                  if (WARN_ON_ONCE_GFP(!can_direct_reclaim, gfp_mask))
>                          goto fail;
>                  ...
>          }
>          ...
> fail:
>          warn_alloc(gfp_mask, ac->nodemask,
>                          "page allocation failure: order:%u", order);
> got_pg:
>          return page;
> }
> 
> Unfortunately, vduse does that nofail allocation under a non-sleepable
> lock. A possible way to fix that is to move the page allocation out
> of the lock into the caller, but having to allocate a huge number of
> pages and an auxiliary page array seems to be problematic as well, per
> Tetsuo: "You should implement proper error handling instead of
> using __GFP_NOFAIL if count can become large."
> 
> So I chose another way: do not release the kernel bounce pages when
> the user tries to register userspace bounce pages. Then we can
> avoid allocating in paths where failure is not expected (e.g. in
> the release path). We pay for this with higher memory usage, since the
> kernel bounce pages are kept, but further optimizations could be done
> on top.
> 
> Fixes: 6c77ed22880d ("vduse: Support using userspace pages as bounce buffer")
> Reviewed-by: Xie Yongji <xieyongji@bytedance.com>
> Tested-by: Xie Yongji <xieyongji@bytedance.com>
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> [v-songbaohua@oppo.com: Refine the changelog]
> Signed-off-by: Barry Song <v-songbaohua@oppo.com>
> ---
>   drivers/vdpa/vdpa_user/iova_domain.c | 19 +++++++++++--------
>   drivers/vdpa/vdpa_user/iova_domain.h |  1 +
>   2 files changed, 12 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/vdpa/vdpa_user/iova_domain.c b/drivers/vdpa/vdpa_user/iova_domain.c
> index 791d38d6284c..58116f89d8da 100644
> --- a/drivers/vdpa/vdpa_user/iova_domain.c
> +++ b/drivers/vdpa/vdpa_user/iova_domain.c
> @@ -162,6 +162,7 @@ static void vduse_domain_bounce(struct vduse_iova_domain *domain,
>   				enum dma_data_direction dir)
>   {
>   	struct vduse_bounce_map *map;
> +	struct page *page;
>   	unsigned int offset;
>   	void *addr;
>   	size_t sz;
> @@ -178,7 +179,10 @@ static void vduse_domain_bounce(struct vduse_iova_domain *domain,
>   			    map->orig_phys == INVALID_PHYS_ADDR))
>   			return;
>   
> -		addr = kmap_local_page(map->bounce_page);
> +		page = domain->user_bounce_pages ?
> +		       map->user_bounce_page : map->bounce_page;
> +
> +		addr = kmap_local_page(page);
>   		do_bounce(map->orig_phys + offset, addr + offset, sz, dir);
>   		kunmap_local(addr);
>   		size -= sz;
> @@ -270,9 +274,8 @@ int vduse_domain_add_user_bounce_pages(struct vduse_iova_domain *domain,
>   				memcpy_to_page(pages[i], 0,
>   					       page_address(map->bounce_page),
>   					       PAGE_SIZE);
> -			__free_page(map->bounce_page);
>   		}
> -		map->bounce_page = pages[i];
> +		map->user_bounce_page = pages[i];
>   		get_page(pages[i]);
>   	}
>   	domain->user_bounce_pages = true;
> @@ -297,17 +300,17 @@ void vduse_domain_remove_user_bounce_pages(struct vduse_iova_domain *domain)
>   		struct page *page = NULL;
>   
>   		map = &domain->bounce_maps[i];
> -		if (WARN_ON(!map->bounce_page))
> +		if (WARN_ON(!map->user_bounce_page))
>   			continue;
>   
>   		/* Copy user page to kernel page if it's in use */
>   		if (map->orig_phys != INVALID_PHYS_ADDR) {
> -			page = alloc_page(GFP_ATOMIC | __GFP_NOFAIL);
> +			page = map->bounce_page;

Why don't we need a kmap_local_page(map->bounce_page) here, but we might 
perform one / have performed one in vduse_domain_bounce?

Maybe we should simply use

memcpy_page(map->bounce_page, 0, map->user_bounce_page, 0, PAGE_SIZE)

?
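
For reference, memcpy_page() from include/linux/highmem.h handles the
kmap_local/kunmap_local pair internally; roughly:

static inline void memcpy_page(struct page *dst_page, size_t dst_off,
			       struct page *src_page, size_t src_off,
			       size_t len)
{
	char *dst = kmap_local_page(dst_page);
	char *src = kmap_local_page(src_page);

	memcpy(dst + dst_off, src + src_off, len);
	kunmap_local(src);
	kunmap_local(dst);
}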


-- 
Cheers,

David / dhildenb




* Re: [PATCH v4 2/3] mm: document __GFP_NOFAIL must be blockable
  2024-08-30 20:28 ` [PATCH v4 2/3] mm: document __GFP_NOFAIL must be blockable Barry Song
@ 2024-09-02  7:34   ` David Hildenbrand
  0 siblings, 0 replies; 17+ messages in thread
From: David Hildenbrand @ 2024-09-02  7:34 UTC (permalink / raw)
  To: Barry Song, akpm, linux-mm, virtualization
  Cc: 42.hyeyoo, cl, hailong.liu, hch, iamjoonsoo.kim, mhocko, penberg,
	rientjes, roman.gushchin, torvalds, urezki, v-songbaohua, vbabka,
	laoar.shao, Christoph Hellwig, Davidlohr Bueso,
	Eugenio Pérez, Jason Wang, Kees Cook, Lorenzo Stoakes,
	Maxime Coquelin, Michael S. Tsirkin, Xuan Zhuo

On 30.08.24 22:28, Barry Song wrote:
> From: Barry Song <v-songbaohua@oppo.com>
> 
> Non-blocking allocation with __GFP_NOFAIL is not supported and may still
> result in a NULL pointer (if we didn't return NULL, we would end up
> busy-looping in non-sleepable contexts):
> 
> static inline struct page *
> __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
> 						struct alloc_context *ac)
> {
> 	...
> 	/*
> 	 * Make sure that __GFP_NOFAIL request doesn't leak out and make sure
> 	 * we always retry
> 	 */
> 	if (gfp_mask & __GFP_NOFAIL) {
> 		/*
> 		 * All existing users of the __GFP_NOFAIL are blockable, so warn
> 		 * of any new users that actually require GFP_NOWAIT
> 		 */
> 		if (WARN_ON_ONCE_GFP(!can_direct_reclaim, gfp_mask))
> 			goto fail;
> 		...
> 	}
> 	...
> fail:
> 	warn_alloc(gfp_mask, ac->nodemask,
> 			"page allocation failure: order:%u", order);
> got_pg:
> 	return page;
> }
> 
> Highlight this in the documentation of __GFP_NOFAIL so that non-mm
> subsystems can reject any illegal usage of __GFP_NOFAIL with GFP_ATOMIC,
> GFP_NOWAIT, etc.
> 
> Signed-off-by: Barry Song <v-songbaohua@oppo.com>
> Acked-by: Michal Hocko <mhocko@suse.com>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Acked-by: Vlastimil Babka <vbabka@suse.cz>
> Acked-by: Davidlohr Bueso <dave@stgolabs.net>
> Cc: Christoph Lameter <cl@linux.com>
> Cc: David Rientjes <rientjes@google.com>
> Cc: "Eugenio Pérez" <eperezma@redhat.com>
> Cc: Hailong.Liu <hailong.liu@oppo.com>
> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
> Cc: Jason Wang <jasowang@redhat.com>
> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> Cc: Kees Cook <kees@kernel.org>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> Cc: Maxime Coquelin <maxime.coquelin@redhat.com>
> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Pekka Enberg <penberg@kernel.org>
> Cc: Roman Gushchin <roman.gushchin@linux.dev>
> Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
> Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> ---

Acked-by: David Hildenbrand <david@redhat.com>

-- 
Cheers,

David / dhildenb




* Re: [PATCH v4 3/3] mm: warn about illegal __GFP_NOFAIL usage in a more appropriate location and manner
  2024-08-30 20:28 ` [PATCH v4 3/3] mm: warn about illegal __GFP_NOFAIL usage in a more appropriate location and manner Barry Song
  2024-09-01 20:18   ` Vlastimil Babka
  2024-09-02  3:23   ` Yafang Shao
@ 2024-09-02  7:40   ` David Hildenbrand
  2024-09-02  7:58   ` Michal Hocko
  3 siblings, 0 replies; 17+ messages in thread
From: David Hildenbrand @ 2024-09-02  7:40 UTC (permalink / raw)
  To: Barry Song, akpm, linux-mm, virtualization
  Cc: 42.hyeyoo, cl, hailong.liu, hch, iamjoonsoo.kim, mhocko, penberg,
	rientjes, roman.gushchin, torvalds, urezki, v-songbaohua, vbabka,
	laoar.shao

On 30.08.24 22:28, Barry Song wrote:
> From: Barry Song <v-songbaohua@oppo.com>
> 
> Three points for this change:
> 
> 1. We should consolidate all warnings in one place. Currently, the
>     order > 1 warning is in the hotpath, while others are in less
>     likely scenarios. Moving all warnings to the slowpath will reduce
>     the overhead for order > 1 and increase the visibility of other
>     warnings.
> 
> 2. We currently have two warnings for order: one for order > 1 in
>     the hotpath and another for order > costly_order in the laziest
>     path. I suggest standardizing on order > 1 since it’s been in
>     use for a long time.
> 
> 3. We don't need to check for __GFP_NOWARN in this case. __GFP_NOWARN
>     is meant to suppress allocation failure reports, but here we're
>     dealing with bug detection, not allocation failures. So replace
>     WARN_ON_ONCE_GFP by WARN_ON_ONCE.
> 
> Suggested-by: Vlastimil Babka <vbabka@suse.cz>
> Signed-off-by: Barry Song <v-songbaohua@oppo.com>
> ---
>   mm/page_alloc.c | 50 ++++++++++++++++++++++++-------------------------
>   1 file changed, 25 insertions(+), 25 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index c81ee5662cc7..e790b4227322 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3033,12 +3033,6 @@ struct page *rmqueue(struct zone *preferred_zone,
>   {
>   	struct page *page;
>   
> -	/*
> -	 * We most definitely don't want callers attempting to
> -	 * allocate greater than order-1 page units with __GFP_NOFAIL.
> -	 */
> -	WARN_ON_ONCE((gfp_flags & __GFP_NOFAIL) && (order > 1));
> -
>   	if (likely(pcp_allowed_order(order))) {
>   		page = rmqueue_pcplist(preferred_zone, zone, order,
>   				       migratetype, alloc_flags);
> @@ -4175,6 +4169,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
>   {
>   	bool can_direct_reclaim = gfp_mask & __GFP_DIRECT_RECLAIM;
>   	bool can_compact = gfp_compaction_allowed(gfp_mask);
> +	bool nofail = gfp_mask & __GFP_NOFAIL;
>   	const bool costly_order = order > PAGE_ALLOC_COSTLY_ORDER;
>   	struct page *page = NULL;
>   	unsigned int alloc_flags;
> @@ -4187,6 +4182,25 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
>   	unsigned int zonelist_iter_cookie;
>   	int reserve_flags;
>   
> +	if (unlikely(nofail)) {
> +		/*
> +		 * We most definitely don't want callers attempting to
> +		 * allocate greater than order-1 page units with __GFP_NOFAIL.
> +		 */

Acked-by: David Hildenbrand <david@redhat.com>

Should we also clarify that in the docs? Currently we have "Using this 
flag for costly allocations is _highly_ discouraged."

We'd likely want to say something like "Allocating pages from the buddy 
with __GFP_NOFAIL and order > 1 is not supported."

-- 
Cheers,

David / dhildenb




* Re: [PATCH v4 1/3] vduse: avoid using __GFP_NOFAIL
  2024-09-02  7:33   ` David Hildenbrand
@ 2024-09-02  7:58     ` Jason Wang
  2024-09-02  8:30       ` David Hildenbrand
  0 siblings, 1 reply; 17+ messages in thread
From: Jason Wang @ 2024-09-02  7:58 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Barry Song, akpm, linux-mm, virtualization, 42.hyeyoo, cl,
	hailong.liu, hch, iamjoonsoo.kim, mhocko, penberg, rientjes,
	roman.gushchin, torvalds, urezki, v-songbaohua, vbabka,
	laoar.shao, Xie Yongji

On Mon, Sep 2, 2024 at 3:33 PM David Hildenbrand <david@redhat.com> wrote:
>
> On 30.08.24 22:28, Barry Song wrote:
> > From: Jason Wang <jasowang@redhat.com>
> >
> > mm doesn't support non-blockable __GFP_NOFAIL allocation, because
> > persisting in providing __GFP_NOFAIL service to users that cannot
> > perform direct memory reclaim would only result in an endless busy
> > loop.
> >
> > Therefore, in such cases, the current mm core may directly return
> > a NULL pointer:
> >
> > static inline struct page *
> > __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
> >                                                  struct alloc_context *ac)
> > {
> >          ...
> >          if (gfp_mask & __GFP_NOFAIL) {
> >                  /*
> >                   * All existing users of the __GFP_NOFAIL are blockable, so warn
> >                   * of any new users that actually require GFP_NOWAIT
> >                   */
> >                  if (WARN_ON_ONCE_GFP(!can_direct_reclaim, gfp_mask))
> >                          goto fail;
> >                  ...
> >          }
> >          ...
> > fail:
> >          warn_alloc(gfp_mask, ac->nodemask,
> >                          "page allocation failure: order:%u", order);
> > got_pg:
> >          return page;
> > }
> >
> > Unfortunately, vduse does that nofail allocation under a non-sleepable
> > lock. A possible way to fix that is to move the page allocation out
> > of the lock into the caller, but having to allocate a huge number of
> > pages and an auxiliary page array seems to be problematic as well, per
> > Tetsuo: "You should implement proper error handling instead of
> > using __GFP_NOFAIL if count can become large."
> >
> > So I chose another way: do not release the kernel bounce pages when
> > the user tries to register userspace bounce pages. Then we can
> > avoid allocating in paths where failure is not expected (e.g. in
> > the release path). We pay for this with higher memory usage, since the
> > kernel bounce pages are kept, but further optimizations could be done
> > on top.
> >
> > Fixes: 6c77ed22880d ("vduse: Support using userspace pages as bounce buffer")
> > Reviewed-by: Xie Yongji <xieyongji@bytedance.com>
> > Tested-by: Xie Yongji <xieyongji@bytedance.com>
> > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > [v-songbaohua@oppo.com: Refine the changelog]
> > Signed-off-by: Barry Song <v-songbaohua@oppo.com>
> > ---
> >   drivers/vdpa/vdpa_user/iova_domain.c | 19 +++++++++++--------
> >   drivers/vdpa/vdpa_user/iova_domain.h |  1 +
> >   2 files changed, 12 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/vdpa/vdpa_user/iova_domain.c b/drivers/vdpa/vdpa_user/iova_domain.c
> > index 791d38d6284c..58116f89d8da 100644
> > --- a/drivers/vdpa/vdpa_user/iova_domain.c
> > +++ b/drivers/vdpa/vdpa_user/iova_domain.c
> > @@ -162,6 +162,7 @@ static void vduse_domain_bounce(struct vduse_iova_domain *domain,
> >                               enum dma_data_direction dir)
> >   {
> >       struct vduse_bounce_map *map;
> > +     struct page *page;
> >       unsigned int offset;
> >       void *addr;
> >       size_t sz;
> > @@ -178,7 +179,10 @@ static void vduse_domain_bounce(struct vduse_iova_domain *domain,
> >                           map->orig_phys == INVALID_PHYS_ADDR))
> >                       return;
> >
> > -             addr = kmap_local_page(map->bounce_page);
> > +             page = domain->user_bounce_pages ?
> > +                    map->user_bounce_page : map->bounce_page;
> > +
> > +             addr = kmap_local_page(page);
> >               do_bounce(map->orig_phys + offset, addr + offset, sz, dir);
> >               kunmap_local(addr);
> >               size -= sz;
> > @@ -270,9 +274,8 @@ int vduse_domain_add_user_bounce_pages(struct vduse_iova_domain *domain,
> >                               memcpy_to_page(pages[i], 0,
> >                                              page_address(map->bounce_page),
> >                                              PAGE_SIZE);
> > -                     __free_page(map->bounce_page);
> >               }
> > -             map->bounce_page = pages[i];
> > +             map->user_bounce_page = pages[i];
> >               get_page(pages[i]);
> >       }
> >       domain->user_bounce_pages = true;
> > @@ -297,17 +300,17 @@ void vduse_domain_remove_user_bounce_pages(struct vduse_iova_domain *domain)
> >               struct page *page = NULL;
> >
> >               map = &domain->bounce_maps[i];
> > -             if (WARN_ON(!map->bounce_page))
> > +             if (WARN_ON(!map->user_bounce_page))
> >                       continue;
> >
> >               /* Copy user page to kernel page if it's in use */
> >               if (map->orig_phys != INVALID_PHYS_ADDR) {
> > -                     page = alloc_page(GFP_ATOMIC | __GFP_NOFAIL);
> > +                     page = map->bounce_page;
>
> Why don't we need a kmap_local_page(map->bounce_page) here, but we might
> perform one / have performed one in vduse_domain_bounce?

I think it's another bug that needs to be fixed.

Yongji, do you want to fix this?

Thanks

>
> Maybe we should simply use
>
> memcpy_page(map->bounce_page, 0, map->user_bounce_page, 0, PAGE_SIZE)
>
> ?
>
>
> --
> Cheers,
>
> David / dhildenb
>




* Re: [PATCH v4 3/3] mm: warn about illegal __GFP_NOFAIL usage in a more appropriate location and manner
  2024-08-30 20:28 ` [PATCH v4 3/3] mm: warn about illegal __GFP_NOFAIL usage in a more appropriate location and manner Barry Song
                     ` (2 preceding siblings ...)
  2024-09-02  7:40   ` David Hildenbrand
@ 2024-09-02  7:58   ` Michal Hocko
  2024-09-03 22:39     ` Barry Song
  3 siblings, 1 reply; 17+ messages in thread
From: Michal Hocko @ 2024-09-02  7:58 UTC (permalink / raw)
  To: Barry Song
  Cc: akpm, linux-mm, virtualization, david, 42.hyeyoo, cl, hailong.liu,
	hch, iamjoonsoo.kim, penberg, rientjes, roman.gushchin, torvalds,
	urezki, v-songbaohua, vbabka, laoar.shao

On Sat 31-08-24 08:28:23, Barry Song wrote:
> From: Barry Song <v-songbaohua@oppo.com>
> 
> Three points for this change:
> 
> 1. We should consolidate all warnings in one place. Currently, the
>    order > 1 warning is in the hotpath, while others are in less
>    likely scenarios. Moving all warnings to the slowpath will reduce
>    the overhead for order > 1 and increase the visibility of other
>    warnings.
> 
> 2. We currently have two warnings for order: one for order > 1 in
>    the hotpath and another for order > costly_order in the laziest
>    path. I suggest standardizing on order > 1 since it’s been in
>    use for a long time.
> 
> 3. We don't need to check for __GFP_NOWARN in this case. __GFP_NOWARN
>    is meant to suppress allocation failure reports, but here we're
>    dealing with bug detection, not allocation failures. So replace
>    WARN_ON_ONCE_GFP by WARN_ON_ONCE.
> 
> Suggested-by: Vlastimil Babka <vbabka@suse.cz>
> Signed-off-by: Barry Song <v-songbaohua@oppo.com>

Acked-by: Michal Hocko <mhocko@suse.com>

Updating the doc about order > 1 sounds like it would still fall into
the scope of this patch. I do not think we absolutely have to document
each unsupported gfp flag combination for __GFP_NOFAIL, but the order
limit is a good addition, with a note that kvmalloc should be used
instead in such a case.
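
E.g. something like the following sketch; kvmalloc() can fall back to
vmalloc, so it does not depend on order > 1 physically contiguous
pages:

	/* sketch: large must-succeed buffer via kvmalloc() rather than
	 * an order > 1 page allocation with __GFP_NOFAIL */
	buf = kvmalloc(size, GFP_KERNEL | __GFP_NOFAIL);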

> ---
>  mm/page_alloc.c | 50 ++++++++++++++++++++++++-------------------------
>  1 file changed, 25 insertions(+), 25 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index c81ee5662cc7..e790b4227322 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3033,12 +3033,6 @@ struct page *rmqueue(struct zone *preferred_zone,
>  {
>  	struct page *page;
>  
> -	/*
> -	 * We most definitely don't want callers attempting to
> -	 * allocate greater than order-1 page units with __GFP_NOFAIL.
> -	 */
> -	WARN_ON_ONCE((gfp_flags & __GFP_NOFAIL) && (order > 1));
> -
>  	if (likely(pcp_allowed_order(order))) {
>  		page = rmqueue_pcplist(preferred_zone, zone, order,
>  				       migratetype, alloc_flags);
> @@ -4175,6 +4169,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
>  {
>  	bool can_direct_reclaim = gfp_mask & __GFP_DIRECT_RECLAIM;
>  	bool can_compact = gfp_compaction_allowed(gfp_mask);
> +	bool nofail = gfp_mask & __GFP_NOFAIL;
>  	const bool costly_order = order > PAGE_ALLOC_COSTLY_ORDER;
>  	struct page *page = NULL;
>  	unsigned int alloc_flags;
> @@ -4187,6 +4182,25 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
>  	unsigned int zonelist_iter_cookie;
>  	int reserve_flags;
>  
> +	if (unlikely(nofail)) {
> +		/*
> +		 * We most definitely don't want callers attempting to
> +		 * allocate greater than order-1 page units with __GFP_NOFAIL.
> +		 */
> +		WARN_ON_ONCE(order > 1);
> +		/*
> +		 * Also we don't support __GFP_NOFAIL without __GFP_DIRECT_RECLAIM,
> +		 * otherwise, we may result in lockup.
> +		 */
> +		WARN_ON_ONCE(!can_direct_reclaim);
> +		/*
> +		 * PF_MEMALLOC request from this context is rather bizarre
> +		 * because we cannot reclaim anything and only can loop waiting
> +		 * for somebody to do a work for us.
> +		 */
> +		WARN_ON_ONCE(current->flags & PF_MEMALLOC);
> +	}
> +
>  restart:
>  	compaction_retries = 0;
>  	no_progress_loops = 0;
> @@ -4404,29 +4418,15 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
>  	 * Make sure that __GFP_NOFAIL request doesn't leak out and make sure
>  	 * we always retry
>  	 */
> -	if (gfp_mask & __GFP_NOFAIL) {
> +	if (unlikely(nofail)) {
>  		/*
> -		 * All existing users of the __GFP_NOFAIL are blockable, so warn
> -		 * of any new users that actually require GFP_NOWAIT
> +		 * Lacking direct_reclaim we can't do anything to reclaim memory,
> +		 * we disregard these unreasonable nofail requests and still
> +		 * return NULL
>  		 */
> -		if (WARN_ON_ONCE_GFP(!can_direct_reclaim, gfp_mask))
> +		if (!can_direct_reclaim)
>  			goto fail;
>  
> -		/*
> -		 * PF_MEMALLOC request from this context is rather bizarre
> -		 * because we cannot reclaim anything and only can loop waiting
> -		 * for somebody to do a work for us
> -		 */
> -		WARN_ON_ONCE_GFP(current->flags & PF_MEMALLOC, gfp_mask);
> -
> -		/*
> -		 * non failing costly orders are a hard requirement which we
> -		 * are not prepared for much so let's warn about these users
> -		 * so that we can identify them and convert them to something
> -		 * else.
> -		 */
> -		WARN_ON_ONCE_GFP(costly_order, gfp_mask);
> -
>  		/*
>  		 * Help non-failing allocations by giving some access to memory
>  		 * reserves normally used for high priority non-blocking
> -- 
> 2.34.1

-- 
Michal Hocko
SUSE Labs


* Re: [PATCH v4 1/3] vduse: avoid using __GFP_NOFAIL
  2024-09-02  7:58     ` Jason Wang
@ 2024-09-02  8:30       ` David Hildenbrand
  2024-09-03  0:35         ` Jason Wang
  0 siblings, 1 reply; 17+ messages in thread
From: David Hildenbrand @ 2024-09-02  8:30 UTC (permalink / raw)
  To: Jason Wang
  Cc: Barry Song, akpm, linux-mm, virtualization, 42.hyeyoo, cl,
	hailong.liu, hch, iamjoonsoo.kim, mhocko, penberg, rientjes,
	roman.gushchin, torvalds, urezki, v-songbaohua, vbabka,
	laoar.shao, Xie Yongji

On 02.09.24 09:58, Jason Wang wrote:
> On Mon, Sep 2, 2024 at 3:33 PM David Hildenbrand <david@redhat.com> wrote:
>>
>> On 30.08.24 22:28, Barry Song wrote:
>>> From: Jason Wang <jasowang@redhat.com>
>>>
>>> mm doesn't support non-blockable __GFP_NOFAIL allocations, because
>>> persisting in providing __GFP_NOFAIL service to non-blocking users
>>> that cannot perform direct memory reclaim would only result in an
>>> endless busy loop.
>>>
>>> Therefore, in such cases, the current mm-core may directly return
>>> a NULL pointer:
>>>
>>> static inline struct page *
>>> __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
>>>                                                   struct alloc_context *ac)
>>> {
>>>           ...
>>>           if (gfp_mask & __GFP_NOFAIL) {
>>>                   /*
>>>                    * All existing users of the __GFP_NOFAIL are blockable, so warn
>>>                    * of any new users that actually require GFP_NOWAIT
>>>                    */
>>>                   if (WARN_ON_ONCE_GFP(!can_direct_reclaim, gfp_mask))
>>>                           goto fail;
>>>                   ...
>>>           }
>>>           ...
>>> fail:
>>>           warn_alloc(gfp_mask, ac->nodemask,
>>>                           "page allocation failure: order:%u", order);
>>> got_pg:
>>>           return page;
>>> }
>>>
>>> Unfortunately, vdpa does that nofail allocation under a non-sleepable
>>> lock. A possible way to fix that is to move the page allocation out
>>> of the lock into the caller, but having to allocate a huge number of
>>> pages and an auxiliary page array seems to be problematic as well, per
>>> Tetsuo: "You should implement proper error handling instead of
>>> using __GFP_NOFAIL if count can become large."
>>>
>>> So I chose another way, which does not release kernel bounce pages
>>> when userspace tries to register its own bounce pages. Then we can
>>> avoid allocating in paths where failure is not expected (e.g. in
>>> the release path). We pay for this with more memory usage, as we
>>> don't release kernel bounce pages, but further optimizations could
>>> be done on top.
>>>
>>> Fixes: 6c77ed22880d ("vduse: Support using userspace pages as bounce buffer")
>>> Reviewed-by: Xie Yongji <xieyongji@bytedance.com>
>>> Tested-by: Xie Yongji <xieyongji@bytedance.com>
>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>>> [v-songbaohua@oppo.com: Refine the changelog]
>>> Signed-off-by: Barry Song <v-songbaohua@oppo.com>
>>> ---
>>>    drivers/vdpa/vdpa_user/iova_domain.c | 19 +++++++++++--------
>>>    drivers/vdpa/vdpa_user/iova_domain.h |  1 +
>>>    2 files changed, 12 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/drivers/vdpa/vdpa_user/iova_domain.c b/drivers/vdpa/vdpa_user/iova_domain.c
>>> index 791d38d6284c..58116f89d8da 100644
>>> --- a/drivers/vdpa/vdpa_user/iova_domain.c
>>> +++ b/drivers/vdpa/vdpa_user/iova_domain.c
>>> @@ -162,6 +162,7 @@ static void vduse_domain_bounce(struct vduse_iova_domain *domain,
>>>                                enum dma_data_direction dir)
>>>    {
>>>        struct vduse_bounce_map *map;
>>> +     struct page *page;
>>>        unsigned int offset;
>>>        void *addr;
>>>        size_t sz;
>>> @@ -178,7 +179,10 @@ static void vduse_domain_bounce(struct vduse_iova_domain *domain,
>>>                            map->orig_phys == INVALID_PHYS_ADDR))
>>>                        return;
>>>
>>> -             addr = kmap_local_page(map->bounce_page);
>>> +             page = domain->user_bounce_pages ?
>>> +                    map->user_bounce_page : map->bounce_page;
>>> +
>>> +             addr = kmap_local_page(page);
>>>                do_bounce(map->orig_phys + offset, addr + offset, sz, dir);
>>>                kunmap_local(addr);
>>>                size -= sz;
>>> @@ -270,9 +274,8 @@ int vduse_domain_add_user_bounce_pages(struct vduse_iova_domain *domain,
>>>                                memcpy_to_page(pages[i], 0,
>>>                                               page_address(map->bounce_page),
>>>                                               PAGE_SIZE);
>>> -                     __free_page(map->bounce_page);
>>>                }
>>> -             map->bounce_page = pages[i];
>>> +             map->user_bounce_page = pages[i];
>>>                get_page(pages[i]);
>>>        }
>>>        domain->user_bounce_pages = true;
>>> @@ -297,17 +300,17 @@ void vduse_domain_remove_user_bounce_pages(struct vduse_iova_domain *domain)
>>>                struct page *page = NULL;
>>>
>>>                map = &domain->bounce_maps[i];
>>> -             if (WARN_ON(!map->bounce_page))
>>> +             if (WARN_ON(!map->user_bounce_page))
>>>                        continue;
>>>
>>>                /* Copy user page to kernel page if it's in use */
>>>                if (map->orig_phys != INVALID_PHYS_ADDR) {
>>> -                     page = alloc_page(GFP_ATOMIC | __GFP_NOFAIL);
>>> +                     page = map->bounce_page;
>>
>> Why don't we need a kmap_local_page(map->bounce_page) here, but we might
>> perform one / have performed one in vduse_domain_bounce?
> 
> I think it's another bug that needs to be fixed.
> 
> Yongji, do you want to fix this?

Or maybe it works because "map->bounce_page" is now always a kernel 
page, and never one from user space that might reside in highmem.
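
That is, roughly the following distinction (an illustration only,
do_something() is a stand-in; kmap_local_page() would also be correct
for lowmem pages, where it simply degenerates to page_address()):

        if (PageHighMem(page)) {
                /* e.g. a pinned user page: needs a temporary mapping */
                void *addr = kmap_local_page(page);
                do_something(addr);
                kunmap_local(addr);
        } else {
                /* a GFP_KERNEL page: permanently mapped lowmem */
                do_something(page_address(page));
        }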

-- 
Cheers,

David / dhildenb



* Re: [PATCH v4 1/3] vduse: avoid using __GFP_NOFAIL
  2024-09-02  8:30       ` David Hildenbrand
@ 2024-09-03  0:35         ` Jason Wang
  0 siblings, 0 replies; 17+ messages in thread
From: Jason Wang @ 2024-09-03  0:35 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Barry Song, akpm, linux-mm, virtualization, 42.hyeyoo, cl,
	hailong.liu, hch, iamjoonsoo.kim, mhocko, penberg, rientjes,
	roman.gushchin, torvalds, urezki, v-songbaohua, vbabka,
	laoar.shao, Xie Yongji

On Mon, Sep 2, 2024 at 4:30 PM David Hildenbrand <david@redhat.com> wrote:
>
> On 02.09.24 09:58, Jason Wang wrote:
> > On Mon, Sep 2, 2024 at 3:33 PM David Hildenbrand <david@redhat.com> wrote:
> >>
> >> On 30.08.24 22:28, Barry Song wrote:
> >>> From: Jason Wang <jasowang@redhat.com>
> >>>
> >>> mm doesn't support non-blockable __GFP_NOFAIL allocations, because
> >>> persisting in providing __GFP_NOFAIL service to non-blocking users
> >>> that cannot perform direct memory reclaim would only result in an
> >>> endless busy loop.
> >>>
> >>> Therefore, in such cases, the current mm-core may directly return
> >>> a NULL pointer:
> >>>
> >>> static inline struct page *
> >>> __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
> >>>                                                   struct alloc_context *ac)
> >>> {
> >>>           ...
> >>>           if (gfp_mask & __GFP_NOFAIL) {
> >>>                   /*
> >>>                    * All existing users of the __GFP_NOFAIL are blockable, so warn
> >>>                    * of any new users that actually require GFP_NOWAIT
> >>>                    */
> >>>                   if (WARN_ON_ONCE_GFP(!can_direct_reclaim, gfp_mask))
> >>>                           goto fail;
> >>>                   ...
> >>>           }
> >>>           ...
> >>> fail:
> >>>           warn_alloc(gfp_mask, ac->nodemask,
> >>>                           "page allocation failure: order:%u", order);
> >>> got_pg:
> >>>           return page;
> >>> }
> >>>
> >>> Unfortunately, vdpa does that nofail allocation under a non-sleepable
> >>> lock. A possible way to fix that is to move the page allocation out
> >>> of the lock into the caller, but having to allocate a huge number of
> >>> pages and an auxiliary page array seems to be problematic as well, per
> >>> Tetsuo: "You should implement proper error handling instead of
> >>> using __GFP_NOFAIL if count can become large."
> >>>
> >>> So I chose another way, which does not release kernel bounce pages
> >>> when userspace tries to register its own bounce pages. Then we can
> >>> avoid allocating in paths where failure is not expected (e.g. in
> >>> the release path). We pay for this with more memory usage, as we
> >>> don't release kernel bounce pages, but further optimizations could
> >>> be done on top.
> >>>
> >>> Fixes: 6c77ed22880d ("vduse: Support using userspace pages as bounce buffer")
> >>> Reviewed-by: Xie Yongji <xieyongji@bytedance.com>
> >>> Tested-by: Xie Yongji <xieyongji@bytedance.com>
> >>> Signed-off-by: Jason Wang <jasowang@redhat.com>
> >>> [v-songbaohua@oppo.com: Refine the changelog]
> >>> Signed-off-by: Barry Song <v-songbaohua@oppo.com>
> >>> ---
> >>>    drivers/vdpa/vdpa_user/iova_domain.c | 19 +++++++++++--------
> >>>    drivers/vdpa/vdpa_user/iova_domain.h |  1 +
> >>>    2 files changed, 12 insertions(+), 8 deletions(-)
> >>>
> >>> diff --git a/drivers/vdpa/vdpa_user/iova_domain.c b/drivers/vdpa/vdpa_user/iova_domain.c
> >>> index 791d38d6284c..58116f89d8da 100644
> >>> --- a/drivers/vdpa/vdpa_user/iova_domain.c
> >>> +++ b/drivers/vdpa/vdpa_user/iova_domain.c
> >>> @@ -162,6 +162,7 @@ static void vduse_domain_bounce(struct vduse_iova_domain *domain,
> >>>                                enum dma_data_direction dir)
> >>>    {
> >>>        struct vduse_bounce_map *map;
> >>> +     struct page *page;
> >>>        unsigned int offset;
> >>>        void *addr;
> >>>        size_t sz;
> >>> @@ -178,7 +179,10 @@ static void vduse_domain_bounce(struct vduse_iova_domain *domain,
> >>>                            map->orig_phys == INVALID_PHYS_ADDR))
> >>>                        return;
> >>>
> >>> -             addr = kmap_local_page(map->bounce_page);
> >>> +             page = domain->user_bounce_pages ?
> >>> +                    map->user_bounce_page : map->bounce_page;
> >>> +
> >>> +             addr = kmap_local_page(page);
> >>>                do_bounce(map->orig_phys + offset, addr + offset, sz, dir);
> >>>                kunmap_local(addr);
> >>>                size -= sz;
> >>> @@ -270,9 +274,8 @@ int vduse_domain_add_user_bounce_pages(struct vduse_iova_domain *domain,
> >>>                                memcpy_to_page(pages[i], 0,
> >>>                                               page_address(map->bounce_page),
> >>>                                               PAGE_SIZE);
> >>> -                     __free_page(map->bounce_page);
> >>>                }
> >>> -             map->bounce_page = pages[i];
> >>> +             map->user_bounce_page = pages[i];
> >>>                get_page(pages[i]);
> >>>        }
> >>>        domain->user_bounce_pages = true;
> >>> @@ -297,17 +300,17 @@ void vduse_domain_remove_user_bounce_pages(struct vduse_iova_domain *domain)
> >>>                struct page *page = NULL;
> >>>
> >>>                map = &domain->bounce_maps[i];
> >>> -             if (WARN_ON(!map->bounce_page))
> >>> +             if (WARN_ON(!map->user_bounce_page))
> >>>                        continue;
> >>>
> >>>                /* Copy user page to kernel page if it's in use */
> >>>                if (map->orig_phys != INVALID_PHYS_ADDR) {
> >>> -                     page = alloc_page(GFP_ATOMIC | __GFP_NOFAIL);
> >>> +                     page = map->bounce_page;
> >>
> >> Why don't we need a kmap_local_page(map->bounce_page) here, but we might
> >> perform one / have performed one in vduse_domain_bounce?
> >
> > I think it's another bug that needs to be fixed.
> >
> > Yongji, do you want to fix this?
>
> Or maybe it works because "map->bounce_page" is now always a kernel
> page,

Yes, the userspace bounce page is now map->user_bounce_page, so
map->bounce_page always stays a kernel page.

> and never one from user space that might reside in highmem.

Right. So we are actually fine :)

Thanks

>
> --
> Cheers,
>
> David / dhildenb
>



* Re: [PATCH v4 3/3] mm: warn about illegal __GFP_NOFAIL usage in a more appropriate location and manner
  2024-09-02  7:58   ` Michal Hocko
@ 2024-09-03 22:39     ` Barry Song
  2024-09-04  7:22       ` Michal Hocko
  0 siblings, 1 reply; 17+ messages in thread
From: Barry Song @ 2024-09-03 22:39 UTC (permalink / raw)
  To: mhocko
  Cc: 21cnbao, 42.hyeyoo, akpm, cl, david, hailong.liu, hch,
	iamjoonsoo.kim, laoar.shao, linux-mm, penberg, rientjes,
	roman.gushchin, torvalds, urezki, v-songbaohua, vbabka,
	virtualization

On Mon, Sep 2, 2024 at 7:58 PM Michal Hocko <mhocko@suse.com> wrote:
>
> On Sat 31-08-24 08:28:23, Barry Song wrote:
> > From: Barry Song <v-songbaohua@oppo.com>
> >
> > Three points for this change:
> >
> > 1. We should consolidate all warnings in one place. Currently, the
> >    order > 1 warning is in the hotpath, while others are in less
> >    likely scenarios. Moving all warnings to the slowpath will reduce
> >    the overhead for order > 1 and increase the visibility of other
> >    warnings.
> >
> > 2. We currently have two warnings for order: one for order > 1 in
> >    the hotpath and another for order > costly_order in a rarely
> >    reached slowpath branch. I suggest standardizing on order > 1
> >    since it’s been in use for a long time.
> >
> > 3. We don't need to check for __GFP_NOWARN in this case. __GFP_NOWARN
> >    is meant to suppress allocation failure reports, but here we're
> >    dealing with bug detection, not allocation failures. So replace
> >    WARN_ON_ONCE_GFP by WARN_ON_ONCE.
> >
> > Suggested-by: Vlastimil Babka <vbabka@suse.cz>
> > Signed-off-by: Barry Song <v-songbaohua@oppo.com>
>
> Acked-by: Michal Hocko <mhocko@suse.com>
>
> Updating the doc about order > 1 sounds like it would still fall into
> the scope of this patch. I do not think we absolutely have to document
> every unsupported gfp flag combination for __GFP_NOFAIL, but the order
> limit is a good addition, with a note that kvmalloc should be used
> instead in such a case.

Hi Andrew,
If there are no objections from Michal and David, could you please
squash the following:

From fc7a2a49e8d0811d706d13d2080393274f316806 Mon Sep 17 00:00:00 2001
From: Barry Song <v-songbaohua@oppo.com>
Date: Wed, 4 Sep 2024 10:26:19 +1200
Subject: [PATCH] mm: also update the doc for __GFP_NOFAIL with order > 1

We only support __GFP_NOFAIL allocations with order <= 1, and if
someone wants larger memory, they should consider using kvmalloc()
instead.

Suggested-by: David Hildenbrand <david@redhat.com>
Suggested-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Barry Song <v-songbaohua@oppo.com>
---
 include/linux/gfp_types.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/linux/gfp_types.h b/include/linux/gfp_types.h
index 4a1fa7706b0c..65db9349f905 100644
--- a/include/linux/gfp_types.h
+++ b/include/linux/gfp_types.h
@@ -253,7 +253,8 @@ enum {
  * used only when there is no reasonable failure policy) but it is
  * definitely preferable to use the flag rather than opencode endless
  * loop around allocator.
- * Using this flag for costly allocations is _highly_ discouraged.
+ * Allocating pages from the buddy with __GFP_NOFAIL and order > 1 is
+ * not supported. Please consider using kvmalloc() instead.
  */
 #define __GFP_IO	((__force gfp_t)___GFP_IO)
 #define __GFP_FS	((__force gfp_t)___GFP_FS)
-- 
2.34.1


>
> > ---
> >  mm/page_alloc.c | 50 ++++++++++++++++++++++++-------------------------
> >  1 file changed, 25 insertions(+), 25 deletions(-)
> >
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index c81ee5662cc7..e790b4227322 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -3033,12 +3033,6 @@ struct page *rmqueue(struct zone *preferred_zone,
> >  {
> >       struct page *page;
> > 
> > -     /*
> > -      * We most definitely don't want callers attempting to
> > -      * allocate greater than order-1 page units with __GFP_NOFAIL.
> > -      */
> > -     WARN_ON_ONCE((gfp_flags & __GFP_NOFAIL) && (order > 1));
> > -
> >       if (likely(pcp_allowed_order(order))) {
> >               page = rmqueue_pcplist(preferred_zone, zone, order,
> >                                      migratetype, alloc_flags);
> > @@ -4175,6 +4169,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
> >  {
> >       bool can_direct_reclaim = gfp_mask & __GFP_DIRECT_RECLAIM;
> >       bool can_compact = gfp_compaction_allowed(gfp_mask);
> > +     bool nofail = gfp_mask & __GFP_NOFAIL;
> >       const bool costly_order = order > PAGE_ALLOC_COSTLY_ORDER;
> >       struct page *page = NULL;
> >       unsigned int alloc_flags;
> > @@ -4187,6 +4182,25 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
> >       unsigned int zonelist_iter_cookie;
> >       int reserve_flags;
> > 
> > +     if (unlikely(nofail)) {
> > +             /*
> > +              * We most definitely don't want callers attempting to
> > +              * allocate greater than order-1 page units with __GFP_NOFAIL.
> > +              */
> > +             WARN_ON_ONCE(order > 1);
> > +             /*
> > +              * Also we don't support __GFP_NOFAIL without __GFP_DIRECT_RECLAIM,
> > +              * otherwise, we may result in lockup.
> > +              */
> > +             WARN_ON_ONCE(!can_direct_reclaim);
> > +             /*
> > +              * PF_MEMALLOC request from this context is rather bizarre
> > +              * because we cannot reclaim anything and only can loop waiting
> > +              * for somebody to do a work for us.
> > +              */
> > +             WARN_ON_ONCE(current->flags & PF_MEMALLOC);
> > +     }
> > +
> >  restart:
> >       compaction_retries = 0;
> >       no_progress_loops = 0;
> > @@ -4404,29 +4418,15 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
> >        * Make sure that __GFP_NOFAIL request doesn't leak out and make sure
> >        * we always retry
> >        */
> > -     if (gfp_mask & __GFP_NOFAIL) {
> > +     if (unlikely(nofail)) {
> >               /*
> > -              * All existing users of the __GFP_NOFAIL are blockable, so warn
> > -              * of any new users that actually require GFP_NOWAIT
> > +              * Lacking direct_reclaim we can't do anything to reclaim memory,
> > +              * we disregard these unreasonable nofail requests and still
> > +              * return NULL
> >                */
> > -             if (WARN_ON_ONCE_GFP(!can_direct_reclaim, gfp_mask))
> > +             if (!can_direct_reclaim)
> >                       goto fail;
> > 
> > -             /*
> > -              * PF_MEMALLOC request from this context is rather bizarre
> > -              * because we cannot reclaim anything and only can loop waiting
> > -              * for somebody to do a work for us
> > -              */
> > -             WARN_ON_ONCE_GFP(current->flags & PF_MEMALLOC, gfp_mask);
> > -
> > -             /*
> > -              * non failing costly orders are a hard requirement which we
> > -              * are not prepared for much so let's warn about these users
> > -              * so that we can identify them and convert them to something
> > -              * else.
> > -              */
> > -             WARN_ON_ONCE_GFP(costly_order, gfp_mask);
> > -
> >               /*
> >                * Help non-failing allocations by giving some access to memory
> >                * reserves normally used for high priority non-blocking
> > --
> > 2.34.1
>
> --
> Michal Hocko
> SUSE Labs

Thanks
Barry



* Re: [PATCH v4 3/3] mm: warn about illegal __GFP_NOFAIL usage in a more appropriate location and manner
  2024-09-03 22:39     ` Barry Song
@ 2024-09-04  7:22       ` Michal Hocko
  0 siblings, 0 replies; 17+ messages in thread
From: Michal Hocko @ 2024-09-04  7:22 UTC (permalink / raw)
  To: Barry Song
  Cc: 42.hyeyoo, akpm, cl, david, hailong.liu, hch, iamjoonsoo.kim,
	laoar.shao, linux-mm, penberg, rientjes, roman.gushchin, torvalds,
	urezki, v-songbaohua, vbabka, virtualization

On Wed 04-09-24 10:39:35, Barry Song wrote:
> On Mon, Sep 2, 2024 at 7:58 PM Michal Hocko <mhocko@suse.com> wrote:
> >
> > On Sat 31-08-24 08:28:23, Barry Song wrote:
> > > From: Barry Song <v-songbaohua@oppo.com>
> > >
> > > Three points for this change:
> > >
> > > 1. We should consolidate all warnings in one place. Currently, the
> > >    order > 1 warning is in the hotpath, while others are in less
> > >    likely scenarios. Moving all warnings to the slowpath will reduce
> > >    the overhead for order > 1 and increase the visibility of other
> > >    warnings.
> > >
> > > 2. We currently have two warnings for order: one for order > 1 in
> > >    the hotpath and another for order > costly_order in a rarely
> > >    reached slowpath branch. I suggest standardizing on order > 1
> > >    since it’s been in use for a long time.
> > >
> > > 3. We don't need to check for __GFP_NOWARN in this case. __GFP_NOWARN
> > >    is meant to suppress allocation failure reports, but here we're
> > >    dealing with bug detection, not allocation failures. So replace
> > >    WARN_ON_ONCE_GFP by WARN_ON_ONCE.
> > >
> > > Suggested-by: Vlastimil Babka <vbabka@suse.cz>
> > > Signed-off-by: Barry Song <v-songbaohua@oppo.com>
> >
> > Acked-by: Michal Hocko <mhocko@suse.com>
> >
> > Updating the doc about order > 1 sounds like it would still fall into
> > the scope of this patch. I do not think we absolutely have to document
> > every unsupported gfp flag combination for __GFP_NOFAIL, but the order
> > limit is a good addition, with a note that kvmalloc should be used
> > instead in such a case.
> 
> Hi Andrew,
> If there are no objections from Michal and David, could you please
> squash the following:
> 
> From fc7a2a49e8d0811d706d13d2080393274f316806 Mon Sep 17 00:00:00 2001
> From: Barry Song <v-songbaohua@oppo.com>
> Date: Wed, 4 Sep 2024 10:26:19 +1200
> Subject: [PATCH] mm: also update the doc for __GFP_NOFAIL with order > 1
> 
> We only support __GFP_NOFAIL allocations with order <= 1, and if
> someone wants larger memory, they should consider using kvmalloc()
> instead.
> 
> Suggested-by: David Hildenbrand <david@redhat.com>
> Suggested-by: Michal Hocko <mhocko@suse.com>
> Signed-off-by: Barry Song <v-songbaohua@oppo.com>

LGTM. Thanks!

> ---
>  include/linux/gfp_types.h | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/gfp_types.h b/include/linux/gfp_types.h
> index 4a1fa7706b0c..65db9349f905 100644
> --- a/include/linux/gfp_types.h
> +++ b/include/linux/gfp_types.h
> @@ -253,7 +253,8 @@ enum {
>   * used only when there is no reasonable failure policy) but it is
>   * definitely preferable to use the flag rather than opencode endless
>   * loop around allocator.
> - * Using this flag for costly allocations is _highly_ discouraged.
> + * Allocating pages from the buddy with __GFP_NOFAIL and order > 1 is
> + * not supported. Please consider using kvmalloc() instead.
>   */
>  #define __GFP_IO	((__force gfp_t)___GFP_IO)
>  #define __GFP_FS	((__force gfp_t)___GFP_FS)
> -- 
> 2.34.1
> 
> 
> >
> > > ---
> > >  mm/page_alloc.c | 50 ++++++++++++++++++++++++-------------------------
> > >  1 file changed, 25 insertions(+), 25 deletions(-)
> > >
> > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > > index c81ee5662cc7..e790b4227322 100644
> > > --- a/mm/page_alloc.c
> > > +++ b/mm/page_alloc.c
> > > @@ -3033,12 +3033,6 @@ struct page *rmqueue(struct zone *preferred_zone,
> > >  {
> > >       struct page *page;
> > > 
> > > -     /*
> > > -      * We most definitely don't want callers attempting to
> > > -      * allocate greater than order-1 page units with __GFP_NOFAIL.
> > > -      */
> > > -     WARN_ON_ONCE((gfp_flags & __GFP_NOFAIL) && (order > 1));
> > > -
> > >       if (likely(pcp_allowed_order(order))) {
> > >               page = rmqueue_pcplist(preferred_zone, zone, order,
> > >                                      migratetype, alloc_flags);
> > > @@ -4175,6 +4169,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
> > >  {
> > >       bool can_direct_reclaim = gfp_mask & __GFP_DIRECT_RECLAIM;
> > >       bool can_compact = gfp_compaction_allowed(gfp_mask);
> > > +     bool nofail = gfp_mask & __GFP_NOFAIL;
> > >       const bool costly_order = order > PAGE_ALLOC_COSTLY_ORDER;
> > >       struct page *page = NULL;
> > >       unsigned int alloc_flags;
> > > @@ -4187,6 +4182,25 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
> > >       unsigned int zonelist_iter_cookie;
> > >       int reserve_flags;
> > > 
> > > +     if (unlikely(nofail)) {
> > > +             /*
> > > +              * We most definitely don't want callers attempting to
> > > +              * allocate greater than order-1 page units with __GFP_NOFAIL.
> > > +              */
> > > +             WARN_ON_ONCE(order > 1);
> > > +             /*
> > > +              * Also we don't support __GFP_NOFAIL without __GFP_DIRECT_RECLAIM,
> > > +              * otherwise, we may result in lockup.
> > > +              */
> > > +             WARN_ON_ONCE(!can_direct_reclaim);
> > > +             /*
> > > +              * PF_MEMALLOC request from this context is rather bizarre
> > > +              * because we cannot reclaim anything and only can loop waiting
> > > +              * for somebody to do a work for us.
> > > +              */
> > > +             WARN_ON_ONCE(current->flags & PF_MEMALLOC);
> > > +     }
> > > +
> > >  restart:
> > >       compaction_retries = 0;
> > >       no_progress_loops = 0;
> > > @@ -4404,29 +4418,15 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
> > >        * Make sure that __GFP_NOFAIL request doesn't leak out and make sure
> > >        * we always retry
> > >        */
> > > -     if (gfp_mask & __GFP_NOFAIL) {
> > > +     if (unlikely(nofail)) {
> > >               /*
> > > -              * All existing users of the __GFP_NOFAIL are blockable, so warn
> > > -              * of any new users that actually require GFP_NOWAIT
> > > +              * Lacking direct_reclaim we can't do anything to reclaim memory,
> > > +              * we disregard these unreasonable nofail requests and still
> > > +              * return NULL
> > >                */
> > > -             if (WARN_ON_ONCE_GFP(!can_direct_reclaim, gfp_mask))
> > > +             if (!can_direct_reclaim)
> > >                       goto fail;
> > > 
> > > -             /*
> > > -              * PF_MEMALLOC request from this context is rather bizarre
> > > -              * because we cannot reclaim anything and only can loop waiting
> > > -              * for somebody to do a work for us
> > > -              */
> > > -             WARN_ON_ONCE_GFP(current->flags & PF_MEMALLOC, gfp_mask);
> > > -
> > > -             /*
> > > -              * non failing costly orders are a hard requirement which we
> > > -              * are not prepared for much so let's warn about these users
> > > -              * so that we can identify them and convert them to something
> > > -              * else.
> > > -              */
> > > -             WARN_ON_ONCE_GFP(costly_order, gfp_mask);
> > > -
> > >               /*
> > >                * Help non-failing allocations by giving some access to memory
> > >                * reserves normally used for high priority non-blocking
> > > --
> > > 2.34.1
> >
> > --
> > Michal Hocko
> > SUSE Labs
> 
> Thanks
> Barry

-- 
Michal Hocko
SUSE Labs

