All of lore.kernel.org
 help / color / mirror / Atom feed
From: Michal Hocko <mhocko@suse.com>
To: Uladzislau Rezki <urezki@gmail.com>
Cc: Hailong Liu <hailong.liu@oppo.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Barry Song <21cnbao@gmail.com>,
	Christoph Hellwig <hch@infradead.org>,
	Vlastimil Babka <vbabka@suse.cz>,
	Tangquan Zheng <zhengtangquan@oppo.com>,
	stable@vger.kernel.org, Baoquan He <bhe@redhat.com>,
	Matthew Wilcox <willy@infradead.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [RESEND PATCH v1] mm/vmalloc: fix page mapping if vm_area_alloc_pages() with high order fallback to order 0
Date: Tue, 27 Aug 2024 08:49:35 +0200	[thread overview]
Message-ID: <Zs12_8AZ0k_WRWUE@tiehlicka> (raw)
In-Reply-To: <Zsx3ULRaVu5Lh46Q@pc636>

On Mon 26-08-24 14:38:40, Uladzislau Rezki wrote:
> On Mon, Aug 26, 2024 at 09:52:42AM +0200, Michal Hocko wrote:
> > On Fri 23-08-24 18:42:47, Uladzislau Rezki wrote:
> > [...]
> > > @@ -3666,7 +3655,16 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
> > >  	set_vm_area_page_order(area, page_shift - PAGE_SHIFT);
> > >  	page_order = vm_area_page_order(area);
> > >  
> > > -	area->nr_pages = vm_area_alloc_pages(gfp_mask | __GFP_NOWARN,
> > > +	/*
> > > +	 * Higher order nofail allocations are really expensive and
> > > +	 * potentially dangerous (pre-mature OOM, disruptive reclaim
> > > +	 * and compaction etc.
> > > +	 *
> > > +	 * Please note, the __vmalloc_node_range_noprof() falls-back
> > > +	 * to order-0 pages if high-order attempt has been unsuccessful.
> > > +	 */
> > > +	area->nr_pages = vm_area_alloc_pages(page_order ?
> > > +		gfp_mask &= ~__GFP_NOFAIL : gfp_mask | __GFP_NOWARN,
> > >  		node, page_order, nr_small_pages, area->pages);
> > >  
> > >  	atomic_long_add(area->nr_pages, &nr_vmalloc_pages);
> > > <snip>
> > > 
> > > Is that aligned with your wish?
> > 
> > I am not a great fan of modifying gfp_mask inside the ternary operator
> > like that. It makes the code harder to read. Is there any actual reason
> > to simply drop GFP_NOFAIL unconditionally and rely do the NOFAIL
> > handling for all orders at the same place?
> > 
> 1. So, for bulk we have below:
> 
> /* gfp_t bulk_gfp = gfp & ~__GFP_NOFAIL; */
> 
> I am not sure if we need it but it says it does not support it which
> is not clear for me why we have to drop __GFP_NOFAIL for bulk(). There
> is a fallback to a single page allocator. If passing __GFP_NOFAIL does
> not trigger any warning or panic a system, then i do not follow why
> we drop that flag.
> 
> Is that odd?

I suspect this was a pre-caution more than anything.

> 2. High-order allocations. Do you think we should not care much about
> it when __GFP_NOFAIL is set? Same here, there is a fallback for order-0
> if "high" fails, it is more likely NO_FAIL succeed for order-0. Thus
> keeping NOFAIL for high-order sounds like not a good approach to me.

We should avoid high order allocations with GFP_NOFAIL at all cost.

> 3. "... at the same place?"
> Do you mean in the __vmalloc_node_range_noprof()?
> 
> __vmalloc_node_range_noprof()
>     -> __vmalloc_area_node(gfp_mask)
>         -> vm_area_alloc_pages()
> 
> if, so it is not straight forward, i.e. there is one more allocation:
> 
> <snip>
> static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
> 				 pgprot_t prot, unsigned int page_shift,
> 				 int node)
> {
> ...
> 	/* Please note that the recursion is strictly bounded. */
> 	if (array_size > PAGE_SIZE) {
> 		area->pages = __vmalloc_node_noprof(array_size, 1, nested_gfp, node,
> 					area->caller);
> 	} else {
> 		area->pages = kmalloc_node_noprof(array_size, nested_gfp, node);
> 	}
> ...
> }
> <snip>
> 
> whereas it is easier to do it inside of the __vmalloc_area_node().

Right. The allocation path is quite convoluted here. If it is just too
much of a hassle to implement NOFAIL at a single place then we should
aim at reducing that. Having that at 3 different layers is just begging
for inconsistences.
-- 
Michal Hocko
SUSE Labs

  reply	other threads:[~2024-08-27  6:49 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-08 12:19 [RESEND PATCH v1] mm/vmalloc: fix page mapping if vm_area_alloc_pages() with high order fallback to order 0 Hailong Liu
2024-08-08 13:01 ` Baoquan He
2024-08-08 14:57 ` Uladzislau Rezki
2024-08-08 21:05 ` Barry Song
2024-08-09  9:33   ` Michal Hocko
2024-08-09  9:41     ` Uladzislau Rezki
2024-08-16  5:07       ` Andrew Morton
2024-08-16  7:19         ` Uladzislau Rezki
2024-08-16  9:12         ` Hailong Liu
2024-08-16 10:13           ` Uladzislau Rezki
2024-08-16 11:46             ` Hailong Liu
2024-08-16 12:32               ` Michal Hocko
2024-08-23 16:42                 ` Uladzislau Rezki
2024-08-26  7:52                   ` Michal Hocko
2024-08-26 12:38                     ` Uladzislau Rezki
2024-08-27  6:49                       ` Michal Hocko [this message]
2024-08-27 12:47                         ` Uladzislau Rezki
2024-08-27 13:37                           ` Michal Hocko
2024-08-27 15:29                             ` Uladzislau Rezki
2024-08-28  7:14                               ` Michal Hocko
2024-08-28 17:23                                 ` Uladzislau Rezki
2024-08-19 11:59               ` Uladzislau Rezki
2024-08-19 12:57                 ` Hailong Liu
2024-08-19 13:38                   ` Uladzislau Rezki
2024-08-19 13:45                     ` Uladzislau Rezki
2024-08-20  1:59                     ` Hailong Liu
2024-08-20  6:44                       ` Uladzislau Rezki
2024-08-20  6:54                         ` Hailong Liu
2024-08-16 16:11             ` Baoquan He
2024-08-16 16:15               ` Baoquan He
  -- strict thread matches above, loose matches on Subject: below --
2024-08-08 12:04 Hailong Liu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Zs12_8AZ0k_WRWUE@tiehlicka \
    --to=mhocko@suse.com \
    --cc=21cnbao@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=bhe@redhat.com \
    --cc=hailong.liu@oppo.com \
    --cc=hch@infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=stable@vger.kernel.org \
    --cc=urezki@gmail.com \
    --cc=vbabka@suse.cz \
    --cc=willy@infradead.org \
    --cc=zhengtangquan@oppo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.