* Re: [RESEND PATCH v1] mm/vmalloc: fix page mapping if vm_area_alloc_pages() with high order fallback to order 0
[not found] <20240808122019.3361-1-hailong.liu@oppo.com>
@ 2024-08-08 21:05 ` Barry Song
2024-08-09 9:33 ` Michal Hocko
0 siblings, 1 reply; 27+ messages in thread
From: Barry Song @ 2024-08-08 21:05 UTC (permalink / raw)
To: Hailong Liu
Cc: Andrew Morton, Uladzislau Rezki, Christoph Hellwig,
Vlastimil Babka, Michal Hocko, Tangquan Zheng, stable, Baoquan He,
Matthew Wilcox, linux-mm, linux-kernel
On Fri, Aug 9, 2024 at 12:20 AM Hailong Liu <hailong.liu@oppo.com> wrote:
>
> The __vmap_pages_range_noflush() assumes its argument pages** contains
> pages with the same page shift. However, since commit e9c3cda4d86e
> ("mm, vmalloc: fix high order __GFP_NOFAIL allocations"), if gfp_flags
> includes __GFP_NOFAIL with high order in vm_area_alloc_pages()
> and page allocation failed for high order, the pages** may contain
> two different page shifts (high order and order-0). This could
> lead __vmap_pages_range_noflush() to perform incorrect mappings,
> potentially resulting in memory corruption.
>
> Users might encounter this as follows (vmap_allow_huge = true, 2M is for PMD_SIZE):
> kvmalloc(2M, __GFP_NOFAIL|GFP_X)
> __vmalloc_node_range_noprof(vm_flags=VM_ALLOW_HUGE_VMAP)
> vm_area_alloc_pages(order=9) ---> order-9 allocation failed and fallback to order-0
> vmap_pages_range()
> vmap_pages_range_noflush()
> __vmap_pages_range_noflush(page_shift = 21) ----> wrong mapping happens
>
> We can remove the fallback code because if a high-order
> allocation fails, __vmalloc_node_range_noprof() will retry with
> order-0. Therefore, it is unnecessary to fallback to order-0
> here. Therefore, fix this by removing the fallback code.
>
> Fixes: e9c3cda4d86e ("mm, vmalloc: fix high order __GFP_NOFAIL allocations")
> Signed-off-by: Hailong Liu <hailong.liu@oppo.com>
> Reported-by: Tangquan Zheng <zhengtangquan@oppo.com>
> Cc: <stable@vger.kernel.org>
> CC: Barry Song <21cnbao@gmail.com>
> CC: Baoquan He <bhe@redhat.com>
> CC: Matthew Wilcox <willy@infradead.org>
> ---
Acked-by: Barry Song <baohua@kernel.org>
because we already have a fallback here:
void *__vmalloc_node_range_noprof :
fail:
if (shift > PAGE_SHIFT) {
shift = PAGE_SHIFT;
align = real_align;
size = real_size;
goto again;
}
> mm/vmalloc.c | 11 ++---------
> 1 file changed, 2 insertions(+), 9 deletions(-)
>
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 6b783baf12a1..af2de36549d6 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -3584,15 +3584,8 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
> page = alloc_pages_noprof(alloc_gfp, order);
> else
> page = alloc_pages_node_noprof(nid, alloc_gfp, order);
> - if (unlikely(!page)) {
> - if (!nofail)
> - break;
> -
> - /* fall back to the zero order allocations */
> - alloc_gfp |= __GFP_NOFAIL;
> - order = 0;
> - continue;
> - }
> + if (unlikely(!page))
> + break;
>
> /*
> * Higher order allocations must be able to be treated as
> ---
> Sorry for fat fingers. with .rej file. resend this.
>
> Baoquan suggests set page_shift to 0 if fallback in (2 and concern about
> performance of retry with order-0. But IMO with retry,
> - Save memory usage if high order allocation failed.
> - Keep consistancy with align and page-shift.
> - make use of bulk allocator with order-0
>
> [2] https://lore.kernel.org/lkml/20240725035318.471-1-hailong.liu@oppo.com/
> --
> 2.30.0
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RESEND PATCH v1] mm/vmalloc: fix page mapping if vm_area_alloc_pages() with high order fallback to order 0
2024-08-08 21:05 ` [RESEND PATCH v1] mm/vmalloc: fix page mapping if vm_area_alloc_pages() with high order fallback to order 0 Barry Song
@ 2024-08-09 9:33 ` Michal Hocko
2024-08-09 9:41 ` Uladzislau Rezki
0 siblings, 1 reply; 27+ messages in thread
From: Michal Hocko @ 2024-08-09 9:33 UTC (permalink / raw)
To: Barry Song
Cc: Hailong Liu, Andrew Morton, Uladzislau Rezki, Christoph Hellwig,
Vlastimil Babka, Tangquan Zheng, stable, Baoquan He,
Matthew Wilcox, linux-mm, linux-kernel
On Fri 09-08-24 09:05:05, Barry Song wrote:
> On Fri, Aug 9, 2024 at 12:20 AM Hailong Liu <hailong.liu@oppo.com> wrote:
> >
> > The __vmap_pages_range_noflush() assumes its argument pages** contains
> > pages with the same page shift. However, since commit e9c3cda4d86e
> > ("mm, vmalloc: fix high order __GFP_NOFAIL allocations"), if gfp_flags
> > includes __GFP_NOFAIL with high order in vm_area_alloc_pages()
> > and page allocation failed for high order, the pages** may contain
> > two different page shifts (high order and order-0). This could
> > lead __vmap_pages_range_noflush() to perform incorrect mappings,
> > potentially resulting in memory corruption.
> >
> > Users might encounter this as follows (vmap_allow_huge = true, 2M is for PMD_SIZE):
> > kvmalloc(2M, __GFP_NOFAIL|GFP_X)
> > __vmalloc_node_range_noprof(vm_flags=VM_ALLOW_HUGE_VMAP)
> > vm_area_alloc_pages(order=9) ---> order-9 allocation failed and fallback to order-0
> > vmap_pages_range()
> > vmap_pages_range_noflush()
> > __vmap_pages_range_noflush(page_shift = 21) ----> wrong mapping happens
> >
> > We can remove the fallback code because if a high-order
> > allocation fails, __vmalloc_node_range_noprof() will retry with
> > order-0. Therefore, it is unnecessary to fallback to order-0
> > here. Therefore, fix this by removing the fallback code.
> >
> > Fixes: e9c3cda4d86e ("mm, vmalloc: fix high order __GFP_NOFAIL allocations")
> > Signed-off-by: Hailong Liu <hailong.liu@oppo.com>
> > Reported-by: Tangquan Zheng <zhengtangquan@oppo.com>
> > Cc: <stable@vger.kernel.org>
> > CC: Barry Song <21cnbao@gmail.com>
> > CC: Baoquan He <bhe@redhat.com>
> > CC: Matthew Wilcox <willy@infradead.org>
> > ---
>
> Acked-by: Barry Song <baohua@kernel.org>
>
> because we already have a fallback here:
>
> void *__vmalloc_node_range_noprof :
>
> fail:
> if (shift > PAGE_SHIFT) {
> shift = PAGE_SHIFT;
> align = real_align;
> size = real_size;
> goto again;
> }
This really deserves a comment because this is not really clear at all.
The code is also fragile and it would benefit from some re-org.
Thanks for the fix.
Acked-by: Michal Hocko <mhocko@suse.com>
--
Michal Hocko
SUSE Labs
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RESEND PATCH v1] mm/vmalloc: fix page mapping if vm_area_alloc_pages() with high order fallback to order 0
2024-08-09 9:33 ` Michal Hocko
@ 2024-08-09 9:41 ` Uladzislau Rezki
2024-08-16 5:07 ` Andrew Morton
0 siblings, 1 reply; 27+ messages in thread
From: Uladzislau Rezki @ 2024-08-09 9:41 UTC (permalink / raw)
To: Michal Hocko
Cc: Barry Song, Hailong Liu, Andrew Morton, Uladzislau Rezki,
Christoph Hellwig, Vlastimil Babka, Tangquan Zheng, stable,
Baoquan He, Matthew Wilcox, linux-mm, linux-kernel
On Fri, Aug 09, 2024 at 11:33:06AM +0200, Michal Hocko wrote:
> On Fri 09-08-24 09:05:05, Barry Song wrote:
> > On Fri, Aug 9, 2024 at 12:20 AM Hailong Liu <hailong.liu@oppo.com> wrote:
> > >
> > > The __vmap_pages_range_noflush() assumes its argument pages** contains
> > > pages with the same page shift. However, since commit e9c3cda4d86e
> > > ("mm, vmalloc: fix high order __GFP_NOFAIL allocations"), if gfp_flags
> > > includes __GFP_NOFAIL with high order in vm_area_alloc_pages()
> > > and page allocation failed for high order, the pages** may contain
> > > two different page shifts (high order and order-0). This could
> > > lead __vmap_pages_range_noflush() to perform incorrect mappings,
> > > potentially resulting in memory corruption.
> > >
> > > Users might encounter this as follows (vmap_allow_huge = true, 2M is for PMD_SIZE):
> > > kvmalloc(2M, __GFP_NOFAIL|GFP_X)
> > > __vmalloc_node_range_noprof(vm_flags=VM_ALLOW_HUGE_VMAP)
> > > vm_area_alloc_pages(order=9) ---> order-9 allocation failed and fallback to order-0
> > > vmap_pages_range()
> > > vmap_pages_range_noflush()
> > > __vmap_pages_range_noflush(page_shift = 21) ----> wrong mapping happens
> > >
> > > We can remove the fallback code because if a high-order
> > > allocation fails, __vmalloc_node_range_noprof() will retry with
> > > order-0. Therefore, it is unnecessary to fallback to order-0
> > > here. Therefore, fix this by removing the fallback code.
> > >
> > > Fixes: e9c3cda4d86e ("mm, vmalloc: fix high order __GFP_NOFAIL allocations")
> > > Signed-off-by: Hailong Liu <hailong.liu@oppo.com>
> > > Reported-by: Tangquan Zheng <zhengtangquan@oppo.com>
> > > Cc: <stable@vger.kernel.org>
> > > CC: Barry Song <21cnbao@gmail.com>
> > > CC: Baoquan He <bhe@redhat.com>
> > > CC: Matthew Wilcox <willy@infradead.org>
> > > ---
> >
> > Acked-by: Barry Song <baohua@kernel.org>
> >
> > because we already have a fallback here:
> >
> > void *__vmalloc_node_range_noprof :
> >
> > fail:
> > if (shift > PAGE_SHIFT) {
> > shift = PAGE_SHIFT;
> > align = real_align;
> > size = real_size;
> > goto again;
> > }
>
> This really deserves a comment because this is not really clear at all.
> The code is also fragile and it would benefit from some re-org.
>
> Thanks for the fix.
>
> Acked-by: Michal Hocko <mhocko@suse.com>
>
I agree. This is only clear for people who know the code. A "fallback"
to order-0 should be commented.
--
Uladzislau Rezki
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RESEND PATCH v1] mm/vmalloc: fix page mapping if vm_area_alloc_pages() with high order fallback to order 0
2024-08-09 9:41 ` Uladzislau Rezki
@ 2024-08-16 5:07 ` Andrew Morton
2024-08-16 7:19 ` Uladzislau Rezki
2024-08-16 9:12 ` Hailong Liu
0 siblings, 2 replies; 27+ messages in thread
From: Andrew Morton @ 2024-08-16 5:07 UTC (permalink / raw)
To: Uladzislau Rezki
Cc: Michal Hocko, Barry Song, Hailong Liu, Christoph Hellwig,
Vlastimil Babka, Tangquan Zheng, stable, Baoquan He,
Matthew Wilcox, linux-mm, linux-kernel
On Fri, 9 Aug 2024 11:41:42 +0200 Uladzislau Rezki <urezki@gmail.com> wrote:
> > > Acked-by: Barry Song <baohua@kernel.org>
> > >
> > > because we already have a fallback here:
> > >
> > > void *__vmalloc_node_range_noprof :
> > >
> > > fail:
> > > if (shift > PAGE_SHIFT) {
> > > shift = PAGE_SHIFT;
> > > align = real_align;
> > > size = real_size;
> > > goto again;
> > > }
> >
> > This really deserves a comment because this is not really clear at all.
> > The code is also fragile and it would benefit from some re-org.
> >
> > Thanks for the fix.
> >
> > Acked-by: Michal Hocko <mhocko@suse.com>
> >
> I agree. This is only clear for people who know the code. A "fallback"
> to order-0 should be commented.
It's been a week. Could someone please propose a fixup patch to add
this comment?
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RESEND PATCH v1] mm/vmalloc: fix page mapping if vm_area_alloc_pages() with high order fallback to order 0
2024-08-16 5:07 ` Andrew Morton
@ 2024-08-16 7:19 ` Uladzislau Rezki
2024-08-16 9:12 ` Hailong Liu
1 sibling, 0 replies; 27+ messages in thread
From: Uladzislau Rezki @ 2024-08-16 7:19 UTC (permalink / raw)
To: Andrew Morton
Cc: Uladzislau Rezki, Michal Hocko, Barry Song, Hailong Liu,
Christoph Hellwig, Vlastimil Babka, Tangquan Zheng, stable,
Baoquan He, Matthew Wilcox, linux-mm, linux-kernel
On Thu, Aug 15, 2024 at 10:07:09PM -0700, Andrew Morton wrote:
> On Fri, 9 Aug 2024 11:41:42 +0200 Uladzislau Rezki <urezki@gmail.com> wrote:
>
> > > > Acked-by: Barry Song <baohua@kernel.org>
> > > >
> > > > because we already have a fallback here:
> > > >
> > > > void *__vmalloc_node_range_noprof :
> > > >
> > > > fail:
> > > > if (shift > PAGE_SHIFT) {
> > > > shift = PAGE_SHIFT;
> > > > align = real_align;
> > > > size = real_size;
> > > > goto again;
> > > > }
> > >
> > > This really deserves a comment because this is not really clear at all.
> > > The code is also fragile and it would benefit from some re-org.
> > >
> > > Thanks for the fix.
> > >
> > > Acked-by: Michal Hocko <mhocko@suse.com>
> > >
> > I agree. This is only clear for people who know the code. A "fallback"
> > to order-0 should be commented.
>
> It's been a week. Could someone please propose a fixup patch to add
> this comment?
>
I will send the patch. This is week i have a vacation, thus i am a bit
slow.
--
Uladzislau Rezki
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RESEND PATCH v1] mm/vmalloc: fix page mapping if vm_area_alloc_pages() with high order fallback to order 0
2024-08-16 5:07 ` Andrew Morton
2024-08-16 7:19 ` Uladzislau Rezki
@ 2024-08-16 9:12 ` Hailong Liu
2024-08-16 10:13 ` Uladzislau Rezki
1 sibling, 1 reply; 27+ messages in thread
From: Hailong Liu @ 2024-08-16 9:12 UTC (permalink / raw)
To: Andrew Morton
Cc: Uladzislau Rezki, Michal Hocko, Barry Song, Christoph Hellwig,
Vlastimil Babka, Tangquan Zheng, stable, Baoquan He,
Matthew Wilcox, linux-mm, linux-kernel
On Thu, 15. Aug 22:07, Andrew Morton wrote:
> On Fri, 9 Aug 2024 11:41:42 +0200 Uladzislau Rezki <urezki@gmail.com> wrote:
>
> > > > Acked-by: Barry Song <baohua@kernel.org>
> > > >
> > > > because we already have a fallback here:
> > > >
> > > > void *__vmalloc_node_range_noprof :
> > > >
> > > > fail:
> > > > if (shift > PAGE_SHIFT) {
> > > > shift = PAGE_SHIFT;
> > > > align = real_align;
> > > > size = real_size;
> > > > goto again;
> > > > }
> > >
> > > This really deserves a comment because this is not really clear at all.
> > > The code is also fragile and it would benefit from some re-org.
> > >
> > > Thanks for the fix.
> > >
> > > Acked-by: Michal Hocko <mhocko@suse.com>
> > >
> > I agree. This is only clear for people who know the code. A "fallback"
> > to order-0 should be commented.
>
> It's been a week. Could someone please propose a fixup patch to add
> this comment?
Hi Andrew:
Do you mean that I need to send a v2 patch with the the comments included?
--
Brs,
hailong.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RESEND PATCH v1] mm/vmalloc: fix page mapping if vm_area_alloc_pages() with high order fallback to order 0
2024-08-16 9:12 ` Hailong Liu
@ 2024-08-16 10:13 ` Uladzislau Rezki
2024-08-16 11:46 ` Hailong Liu
2024-08-16 16:11 ` Baoquan He
0 siblings, 2 replies; 27+ messages in thread
From: Uladzislau Rezki @ 2024-08-16 10:13 UTC (permalink / raw)
To: Hailong Liu
Cc: Andrew Morton, Uladzislau Rezki, Michal Hocko, Barry Song,
Christoph Hellwig, Vlastimil Babka, Tangquan Zheng, stable,
Baoquan He, Matthew Wilcox, linux-mm, linux-kernel
On Fri, Aug 16, 2024 at 05:12:32PM +0800, Hailong Liu wrote:
> On Thu, 15. Aug 22:07, Andrew Morton wrote:
> > On Fri, 9 Aug 2024 11:41:42 +0200 Uladzislau Rezki <urezki@gmail.com> wrote:
> >
> > > > > Acked-by: Barry Song <baohua@kernel.org>
> > > > >
> > > > > because we already have a fallback here:
> > > > >
> > > > > void *__vmalloc_node_range_noprof :
> > > > >
> > > > > fail:
> > > > > if (shift > PAGE_SHIFT) {
> > > > > shift = PAGE_SHIFT;
> > > > > align = real_align;
> > > > > size = real_size;
> > > > > goto again;
> > > > > }
> > > >
> > > > This really deserves a comment because this is not really clear at all.
> > > > The code is also fragile and it would benefit from some re-org.
> > > >
> > > > Thanks for the fix.
> > > >
> > > > Acked-by: Michal Hocko <mhocko@suse.com>
> > > >
> > > I agree. This is only clear for people who know the code. A "fallback"
> > > to order-0 should be commented.
> >
> > It's been a week. Could someone please propose a fixup patch to add
> > this comment?
>
> Hi Andrew:
>
> Do you mean that I need to send a v2 patch with the the comments included?
>
It is better to post v2.
But before, could you please comment on:
in case of order-0, bulk path may easily fail and fallback to the single
page allocator. If an request is marked as NO_FAIL, i am talking about
order-0 request, your change breaks GFP_NOFAIL for !order.
Am i missing something obvious?
Thanks!
--
Uladzsislau Rezki
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RESEND PATCH v1] mm/vmalloc: fix page mapping if vm_area_alloc_pages() with high order fallback to order 0
2024-08-16 10:13 ` Uladzislau Rezki
@ 2024-08-16 11:46 ` Hailong Liu
2024-08-16 12:32 ` Michal Hocko
2024-08-19 11:59 ` Uladzislau Rezki
2024-08-16 16:11 ` Baoquan He
1 sibling, 2 replies; 27+ messages in thread
From: Hailong Liu @ 2024-08-16 11:46 UTC (permalink / raw)
To: Uladzislau Rezki
Cc: Andrew Morton, Michal Hocko, Barry Song, Christoph Hellwig,
Vlastimil Babka, Tangquan Zheng, stable, Baoquan He,
Matthew Wilcox, linux-mm, linux-kernel
On Fri, 16. Aug 12:13, Uladzislau Rezki wrote:
> On Fri, Aug 16, 2024 at 05:12:32PM +0800, Hailong Liu wrote:
> > On Thu, 15. Aug 22:07, Andrew Morton wrote:
> > > On Fri, 9 Aug 2024 11:41:42 +0200 Uladzislau Rezki <urezki@gmail.com> wrote:
> > >
> > > > > > Acked-by: Barry Song <baohua@kernel.org>
> > > > > >
> > > > > > because we already have a fallback here:
> > > > > >
> > > > > > void *__vmalloc_node_range_noprof :
> > > > > >
> > > > > > fail:
> > > > > > if (shift > PAGE_SHIFT) {
> > > > > > shift = PAGE_SHIFT;
> > > > > > align = real_align;
> > > > > > size = real_size;
> > > > > > goto again;
> > > > > > }
> > > > >
> > > > > This really deserves a comment because this is not really clear at all.
> > > > > The code is also fragile and it would benefit from some re-org.
> > > > >
> > > > > Thanks for the fix.
> > > > >
> > > > > Acked-by: Michal Hocko <mhocko@suse.com>
> > > > >
> > > > I agree. This is only clear for people who know the code. A "fallback"
> > > > to order-0 should be commented.
> > >
> > > It's been a week. Could someone please propose a fixup patch to add
> > > this comment?
> >
> > Hi Andrew:
> >
> > Do you mean that I need to send a v2 patch with the the comments included?
> >
> It is better to post v2.
Got it.
>
> But before, could you please comment on:
>
> in case of order-0, bulk path may easily fail and fallback to the single
> page allocator. If an request is marked as NO_FAIL, i am talking about
> order-0 request, your change breaks GFP_NOFAIL for !order.
>
> Am i missing something obvious?
For order-0, alloc_pages(GFP_X | __GFP_NOFAIL, 0), buddy allocator will handle
the flag correctly. IMO we don't need to handle the flag here.
>
> Thanks!
>
> --
> Uladzsislau Rezki
--
help you, help me,
Hailong.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RESEND PATCH v1] mm/vmalloc: fix page mapping if vm_area_alloc_pages() with high order fallback to order 0
2024-08-16 11:46 ` Hailong Liu
@ 2024-08-16 12:32 ` Michal Hocko
2024-08-23 16:42 ` Uladzislau Rezki
2024-08-19 11:59 ` Uladzislau Rezki
1 sibling, 1 reply; 27+ messages in thread
From: Michal Hocko @ 2024-08-16 12:32 UTC (permalink / raw)
To: Hailong Liu
Cc: Uladzislau Rezki, Andrew Morton, Barry Song, Christoph Hellwig,
Vlastimil Babka, Tangquan Zheng, stable, Baoquan He,
Matthew Wilcox, linux-mm, linux-kernel
On Fri 16-08-24 19:46:26, Hailong Liu wrote:
> On Fri, 16. Aug 12:13, Uladzislau Rezki wrote:
> > On Fri, Aug 16, 2024 at 05:12:32PM +0800, Hailong Liu wrote:
> > > On Thu, 15. Aug 22:07, Andrew Morton wrote:
> > > > On Fri, 9 Aug 2024 11:41:42 +0200 Uladzislau Rezki <urezki@gmail.com> wrote:
> > > >
> > > > > > > Acked-by: Barry Song <baohua@kernel.org>
> > > > > > >
> > > > > > > because we already have a fallback here:
> > > > > > >
> > > > > > > void *__vmalloc_node_range_noprof :
> > > > > > >
> > > > > > > fail:
> > > > > > > if (shift > PAGE_SHIFT) {
> > > > > > > shift = PAGE_SHIFT;
> > > > > > > align = real_align;
> > > > > > > size = real_size;
> > > > > > > goto again;
> > > > > > > }
> > > > > >
> > > > > > This really deserves a comment because this is not really clear at all.
> > > > > > The code is also fragile and it would benefit from some re-org.
> > > > > >
> > > > > > Thanks for the fix.
> > > > > >
> > > > > > Acked-by: Michal Hocko <mhocko@suse.com>
> > > > > >
> > > > > I agree. This is only clear for people who know the code. A "fallback"
> > > > > to order-0 should be commented.
> > > >
> > > > It's been a week. Could someone please propose a fixup patch to add
> > > > this comment?
> > >
> > > Hi Andrew:
> > >
> > > Do you mean that I need to send a v2 patch with the the comments included?
> > >
> > It is better to post v2.
> Got it.
>
> >
> > But before, could you please comment on:
> >
> > in case of order-0, bulk path may easily fail and fallback to the single
> > page allocator. If an request is marked as NO_FAIL, i am talking about
> > order-0 request, your change breaks GFP_NOFAIL for !order.
> >
> > Am i missing something obvious?
> For order-0, alloc_pages(GFP_X | __GFP_NOFAIL, 0), buddy allocator will handle
> the flag correctly. IMO we don't need to handle the flag here.
Let me clarify what I would like to have clarified:
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 6b783baf12a1..fea90a39f5c5 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -3510,13 +3510,13 @@ void *vmap_pfn(unsigned long *pfns, unsigned int count, pgprot_t prot)
EXPORT_SYMBOL_GPL(vmap_pfn);
#endif /* CONFIG_VMAP_PFN */
+/* GFP_NOFAIL semantic is implemented by __vmalloc_node_range_noprof */
static inline unsigned int
vm_area_alloc_pages(gfp_t gfp, int nid,
unsigned int order, unsigned int nr_pages, struct page **pages)
{
unsigned int nr_allocated = 0;
- gfp_t alloc_gfp = gfp;
- bool nofail = gfp & __GFP_NOFAIL;
+ gfp_t alloc_gfp = gfp & ~ __GFP_NOFAIL;
struct page *page;
int i;
@@ -3527,9 +3527,6 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
* more permissive.
*/
if (!order) {
- /* bulk allocator doesn't support nofail req. officially */
- gfp_t bulk_gfp = gfp & ~__GFP_NOFAIL;
-
while (nr_allocated < nr_pages) {
unsigned int nr, nr_pages_request;
@@ -3547,12 +3544,12 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
* but mempolicy wants to alloc memory by interleaving.
*/
if (IS_ENABLED(CONFIG_NUMA) && nid == NUMA_NO_NODE)
- nr = alloc_pages_bulk_array_mempolicy_noprof(bulk_gfp,
+ nr = alloc_pages_bulk_array_mempolicy_noprof(alloc_gfp,
nr_pages_request,
pages + nr_allocated);
else
- nr = alloc_pages_bulk_array_node_noprof(bulk_gfp, nid,
+ nr = alloc_pages_bulk_array_node_noprof(alloc_gfp, nid,
nr_pages_request,
pages + nr_allocated);
@@ -3566,13 +3563,6 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
if (nr != nr_pages_request)
break;
}
- } else if (gfp & __GFP_NOFAIL) {
- /*
- * Higher order nofail allocations are really expensive and
- * potentially dangerous (pre-mature OOM, disruptive reclaim
- * and compaction etc.
- */
- alloc_gfp &= ~__GFP_NOFAIL;
}
/* High-order pages or fallback path if "bulk" fails. */
--
Michal Hocko
SUSE Labs
^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [RESEND PATCH v1] mm/vmalloc: fix page mapping if vm_area_alloc_pages() with high order fallback to order 0
2024-08-16 10:13 ` Uladzislau Rezki
2024-08-16 11:46 ` Hailong Liu
@ 2024-08-16 16:11 ` Baoquan He
2024-08-16 16:15 ` Baoquan He
1 sibling, 1 reply; 27+ messages in thread
From: Baoquan He @ 2024-08-16 16:11 UTC (permalink / raw)
To: Uladzislau Rezki
Cc: Hailong Liu, Andrew Morton, Michal Hocko, Barry Song,
Christoph Hellwig, Vlastimil Babka, Tangquan Zheng, stable,
Matthew Wilcox, linux-mm, linux-kernel
On 08/16/24 at 12:13pm, Uladzislau Rezki wrote:
> On Fri, Aug 16, 2024 at 05:12:32PM +0800, Hailong Liu wrote:
> > On Thu, 15. Aug 22:07, Andrew Morton wrote:
> > > On Fri, 9 Aug 2024 11:41:42 +0200 Uladzislau Rezki <urezki@gmail.com> wrote:
> > >
> > > > > > Acked-by: Barry Song <baohua@kernel.org>
> > > > > >
> > > > > > because we already have a fallback here:
> > > > > >
> > > > > > void *__vmalloc_node_range_noprof :
> > > > > >
> > > > > > fail:
> > > > > > if (shift > PAGE_SHIFT) {
> > > > > > shift = PAGE_SHIFT;
> > > > > > align = real_align;
> > > > > > size = real_size;
> > > > > > goto again;
> > > > > > }
> > > > >
> > > > > This really deserves a comment because this is not really clear at all.
> > > > > The code is also fragile and it would benefit from some re-org.
> > > > >
> > > > > Thanks for the fix.
> > > > >
> > > > > Acked-by: Michal Hocko <mhocko@suse.com>
> > > > >
> > > > I agree. This is only clear for people who know the code. A "fallback"
> > > > to order-0 should be commented.
> > >
> > > It's been a week. Could someone please propose a fixup patch to add
> > > this comment?
> >
> > Hi Andrew:
> >
> > Do you mean that I need to send a v2 patch with the the comments included?
> >
> It is better to post v2.
>
> But before, could you please comment on:
>
> in case of order-0, bulk path may easily fail and fallback to the single
> page allocator. If an request is marked as NO_FAIL, i am talking about
> order-0 request, your change breaks GFP_NOFAIL for !order.
In case order-0, bulk_gfp masks off __GFP_NOFAIL, but alloc_gfp doesn't.
So alloc_gfp has __GFP_NOFAIL in fallback, it won't be failed by
alloc_pages().
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RESEND PATCH v1] mm/vmalloc: fix page mapping if vm_area_alloc_pages() with high order fallback to order 0
2024-08-16 16:11 ` Baoquan He
@ 2024-08-16 16:15 ` Baoquan He
0 siblings, 0 replies; 27+ messages in thread
From: Baoquan He @ 2024-08-16 16:15 UTC (permalink / raw)
To: Uladzislau Rezki
Cc: Hailong Liu, Andrew Morton, Michal Hocko, Barry Song,
Christoph Hellwig, Vlastimil Babka, Tangquan Zheng, stable,
Matthew Wilcox, linux-mm, linux-kernel
On 08/17/24 at 12:11am, Baoquan He wrote:
> On 08/16/24 at 12:13pm, Uladzislau Rezki wrote:
> > On Fri, Aug 16, 2024 at 05:12:32PM +0800, Hailong Liu wrote:
> > > On Thu, 15. Aug 22:07, Andrew Morton wrote:
> > > > On Fri, 9 Aug 2024 11:41:42 +0200 Uladzislau Rezki <urezki@gmail.com> wrote:
> > > >
> > > > > > > Acked-by: Barry Song <baohua@kernel.org>
> > > > > > >
> > > > > > > because we already have a fallback here:
> > > > > > >
> > > > > > > void *__vmalloc_node_range_noprof :
> > > > > > >
> > > > > > > fail:
> > > > > > > if (shift > PAGE_SHIFT) {
> > > > > > > shift = PAGE_SHIFT;
> > > > > > > align = real_align;
> > > > > > > size = real_size;
> > > > > > > goto again;
> > > > > > > }
> > > > > >
> > > > > > This really deserves a comment because this is not really clear at all.
> > > > > > The code is also fragile and it would benefit from some re-org.
> > > > > >
> > > > > > Thanks for the fix.
> > > > > >
> > > > > > Acked-by: Michal Hocko <mhocko@suse.com>
> > > > > >
> > > > > I agree. This is only clear for people who know the code. A "fallback"
> > > > > to order-0 should be commented.
> > > >
> > > > It's been a week. Could someone please propose a fixup patch to add
> > > > this comment?
> > >
> > > Hi Andrew:
> > >
> > > Do you mean that I need to send a v2 patch with the the comments included?
> > >
> > It is better to post v2.
> >
> > But before, could you please comment on:
> >
> > in case of order-0, bulk path may easily fail and fallback to the single
> > page allocator. If an request is marked as NO_FAIL, i am talking about
> > order-0 request, your change breaks GFP_NOFAIL for !order.
>
> In case order-0, bulk_gfp masks off __GFP_NOFAIL, but alloc_gfp doesn't.
> So alloc_gfp has __GFP_NOFAIL in fallback, it won't be failed by
> alloc_pages().
Please ignore this, I didn't update my local mail box, didn't see
Hailong's reply.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RESEND PATCH v1] mm/vmalloc: fix page mapping if vm_area_alloc_pages() with high order fallback to order 0
2024-08-16 11:46 ` Hailong Liu
2024-08-16 12:32 ` Michal Hocko
@ 2024-08-19 11:59 ` Uladzislau Rezki
2024-08-19 12:57 ` Hailong Liu
1 sibling, 1 reply; 27+ messages in thread
From: Uladzislau Rezki @ 2024-08-19 11:59 UTC (permalink / raw)
To: Hailong Liu, Michal Hocko
Cc: Uladzislau Rezki, Andrew Morton, Michal Hocko, Barry Song,
Christoph Hellwig, Vlastimil Babka, Tangquan Zheng, stable,
Baoquan He, Matthew Wilcox, linux-mm, linux-kernel
On Fri, Aug 16, 2024 at 07:46:26PM +0800, Hailong Liu wrote:
> On Fri, 16. Aug 12:13, Uladzislau Rezki wrote:
> > On Fri, Aug 16, 2024 at 05:12:32PM +0800, Hailong Liu wrote:
> > > On Thu, 15. Aug 22:07, Andrew Morton wrote:
> > > > On Fri, 9 Aug 2024 11:41:42 +0200 Uladzislau Rezki <urezki@gmail.com> wrote:
> > > >
> > > > > > > Acked-by: Barry Song <baohua@kernel.org>
> > > > > > >
> > > > > > > because we already have a fallback here:
> > > > > > >
> > > > > > > void *__vmalloc_node_range_noprof :
> > > > > > >
> > > > > > > fail:
> > > > > > > if (shift > PAGE_SHIFT) {
> > > > > > > shift = PAGE_SHIFT;
> > > > > > > align = real_align;
> > > > > > > size = real_size;
> > > > > > > goto again;
> > > > > > > }
> > > > > >
> > > > > > This really deserves a comment because this is not really clear at all.
> > > > > > The code is also fragile and it would benefit from some re-org.
> > > > > >
> > > > > > Thanks for the fix.
> > > > > >
> > > > > > Acked-by: Michal Hocko <mhocko@suse.com>
> > > > > >
> > > > > I agree. This is only clear for people who know the code. A "fallback"
> > > > > to order-0 should be commented.
> > > >
> > > > It's been a week. Could someone please propose a fixup patch to add
> > > > this comment?
> > >
> > > Hi Andrew:
> > >
> > > Do you mean that I need to send a v2 patch with the the comments included?
> > >
> > It is better to post v2.
> Got it.
>
> >
> > But before, could you please comment on:
> >
> > in case of order-0, bulk path may easily fail and fallback to the single
> > page allocator. If an request is marked as NO_FAIL, i am talking about
> > order-0 request, your change breaks GFP_NOFAIL for !order.
> >
> > Am i missing something obvious?
> For order-0, alloc_pages(GFP_X | __GFP_NOFAIL, 0), buddy allocator will handle
> the flag correctly. IMO we don't need to handle the flag here.
>
Agree. As for comment, i meant to comment the below fallback:
<snip>
fail:
if (shift > PAGE_SHIFT) {
shift = PAGE_SHIFT;
align = real_align;
size = real_size;
goto again;
}
<snip>
--
Uladzislau Rezki
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RESEND PATCH v1] mm/vmalloc: fix page mapping if vm_area_alloc_pages() with high order fallback to order 0
2024-08-19 11:59 ` Uladzislau Rezki
@ 2024-08-19 12:57 ` Hailong Liu
2024-08-19 13:38 ` Uladzislau Rezki
0 siblings, 1 reply; 27+ messages in thread
From: Hailong Liu @ 2024-08-19 12:57 UTC (permalink / raw)
To: Uladzislau Rezki
Cc: Michal Hocko, Andrew Morton, Barry Song, Christoph Hellwig,
Vlastimil Babka, Tangquan Zheng, stable, Baoquan He,
Matthew Wilcox, linux-mm, linux-kernel
On Mon, 19. Aug 13:59, Uladzislau Rezki wrote:
> On Fri, Aug 16, 2024 at 07:46:26PM +0800, Hailong Liu wrote:
> > On Fri, 16. Aug 12:13, Uladzislau Rezki wrote:
> > > On Fri, Aug 16, 2024 at 05:12:32PM +0800, Hailong Liu wrote:
> > > > On Thu, 15. Aug 22:07, Andrew Morton wrote:
> > > > > On Fri, 9 Aug 2024 11:41:42 +0200 Uladzislau Rezki <urezki@gmail.com> wrote:
> > > > >
> > > > > > > > Acked-by: Barry Song <baohua@kernel.org>
> > > > > > > >
> > > > > > > > because we already have a fallback here:
> > > > > > > >
> > > > > > > > void *__vmalloc_node_range_noprof :
> > > > > > > >
> > > > > > > > fail:
> > > > > > > > if (shift > PAGE_SHIFT) {
> > > > > > > > shift = PAGE_SHIFT;
> > > > > > > > align = real_align;
> > > > > > > > size = real_size;
> > > > > > > > goto again;
> > > > > > > > }
> > > > > > >
> > > > > > > This really deserves a comment because this is not really clear at all.
> > > > > > > The code is also fragile and it would benefit from some re-org.
> > > > > > >
> > > > > > > Thanks for the fix.
> > > > > > >
> > > > > > > Acked-by: Michal Hocko <mhocko@suse.com>
> > > > > > >
> > > > > > I agree. This is only clear for people who know the code. A "fallback"
> > > > > > to order-0 should be commented.
> > > > >
> > > > > It's been a week. Could someone please propose a fixup patch to add
> > > > > this comment?
> > > >
> > > > Hi Andrew:
> > > >
> > > > Do you mean that I need to send a v2 patch with the the comments included?
> > > >
> > > It is better to post v2.
> > Got it.
> >
> > >
> > > But before, could you please comment on:
> > >
> > > in case of order-0, bulk path may easily fail and fallback to the single
> > > page allocator. If an request is marked as NO_FAIL, i am talking about
> > > order-0 request, your change breaks GFP_NOFAIL for !order.
> > >
> > > Am i missing something obvious?
> > For order-0, alloc_pages(GFP_X | __GFP_NOFAIL, 0), buddy allocator will handle
> > the flag correctly. IMO we don't need to handle the flag here.
> >
> Agree. As for comment, i meant to comment the below fallback:
Michal send a craft that make nofail logic more clearer and I check the branch
found Andrew already merged in -stable branch. So we can include these with a
new patch.
>
> <snip>
> fail:
> if (shift > PAGE_SHIFT) {
> shift = PAGE_SHIFT;
> align = real_align;
> size = real_size;
> goto again;
> }
> <snip>
>
> --
> Uladzislau Rezki
--
Help you, Help me,
Hailong.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RESEND PATCH v1] mm/vmalloc: fix page mapping if vm_area_alloc_pages() with high order fallback to order 0
2024-08-19 12:57 ` Hailong Liu
@ 2024-08-19 13:38 ` Uladzislau Rezki
2024-08-19 13:45 ` Uladzislau Rezki
2024-08-20 1:59 ` Hailong Liu
0 siblings, 2 replies; 27+ messages in thread
From: Uladzislau Rezki @ 2024-08-19 13:38 UTC (permalink / raw)
To: Hailong Liu
Cc: Uladzislau Rezki, Michal Hocko, Andrew Morton, Barry Song,
Christoph Hellwig, Vlastimil Babka, Tangquan Zheng, stable,
Baoquan He, Matthew Wilcox, linux-mm, linux-kernel
On Mon, Aug 19, 2024 at 08:57:38PM +0800, Hailong Liu wrote:
> On Mon, 19. Aug 13:59, Uladzislau Rezki wrote:
> > On Fri, Aug 16, 2024 at 07:46:26PM +0800, Hailong Liu wrote:
> > > On Fri, 16. Aug 12:13, Uladzislau Rezki wrote:
> > > > On Fri, Aug 16, 2024 at 05:12:32PM +0800, Hailong Liu wrote:
> > > > > On Thu, 15. Aug 22:07, Andrew Morton wrote:
> > > > > > On Fri, 9 Aug 2024 11:41:42 +0200 Uladzislau Rezki <urezki@gmail.com> wrote:
> > > > > >
> > > > > > > > > Acked-by: Barry Song <baohua@kernel.org>
> > > > > > > > >
> > > > > > > > > because we already have a fallback here:
> > > > > > > > >
> > > > > > > > > void *__vmalloc_node_range_noprof :
> > > > > > > > >
> > > > > > > > > fail:
> > > > > > > > > if (shift > PAGE_SHIFT) {
> > > > > > > > > shift = PAGE_SHIFT;
> > > > > > > > > align = real_align;
> > > > > > > > > size = real_size;
> > > > > > > > > goto again;
> > > > > > > > > }
> > > > > > > >
> > > > > > > > This really deserves a comment because this is not really clear at all.
> > > > > > > > The code is also fragile and it would benefit from some re-org.
> > > > > > > >
> > > > > > > > Thanks for the fix.
> > > > > > > >
> > > > > > > > Acked-by: Michal Hocko <mhocko@suse.com>
> > > > > > > >
> > > > > > > I agree. This is only clear for people who know the code. A "fallback"
> > > > > > > to order-0 should be commented.
> > > > > >
> > > > > > It's been a week. Could someone please propose a fixup patch to add
> > > > > > this comment?
> > > > >
> > > > > Hi Andrew:
> > > > >
> > > > > Do you mean that I need to send a v2 patch with the the comments included?
> > > > >
> > > > It is better to post v2.
> > > Got it.
> > >
> > > >
> > > > But before, could you please comment on:
> > > >
> > > > in case of order-0, bulk path may easily fail and fallback to the single
> > > > page allocator. If an request is marked as NO_FAIL, i am talking about
> > > > order-0 request, your change breaks GFP_NOFAIL for !order.
> > > >
> > > > Am i missing something obvious?
> > > For order-0, alloc_pages(GFP_X | __GFP_NOFAIL, 0), buddy allocator will handle
> > > the flag correctly. IMO we don't need to handle the flag here.
> > >
> > Agree. As for comment, i meant to comment the below fallback:
> Michal send a craft that make nofail logic more clearer and I check the branch
> found Andrew already merged in -stable branch. So we can include these with a
> new patch.
>
Just to confirm. Will you send an extra patch with the comment?
--
Uladzislau Rezki
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RESEND PATCH v1] mm/vmalloc: fix page mapping if vm_area_alloc_pages() with high order fallback to order 0
2024-08-19 13:38 ` Uladzislau Rezki
@ 2024-08-19 13:45 ` Uladzislau Rezki
2024-08-20 1:59 ` Hailong Liu
1 sibling, 0 replies; 27+ messages in thread
From: Uladzislau Rezki @ 2024-08-19 13:45 UTC (permalink / raw)
To: Hailong Liu
Cc: Hailong Liu, Michal Hocko, Andrew Morton, Barry Song,
Christoph Hellwig, Vlastimil Babka, Tangquan Zheng, stable,
Baoquan He, Matthew Wilcox, linux-mm, linux-kernel
On Mon, Aug 19, 2024 at 03:38:51PM +0200, Uladzislau Rezki wrote:
> On Mon, Aug 19, 2024 at 08:57:38PM +0800, Hailong Liu wrote:
> > On Mon, 19. Aug 13:59, Uladzislau Rezki wrote:
> > > On Fri, Aug 16, 2024 at 07:46:26PM +0800, Hailong Liu wrote:
> > > > On Fri, 16. Aug 12:13, Uladzislau Rezki wrote:
> > > > > On Fri, Aug 16, 2024 at 05:12:32PM +0800, Hailong Liu wrote:
> > > > > > On Thu, 15. Aug 22:07, Andrew Morton wrote:
> > > > > > > On Fri, 9 Aug 2024 11:41:42 +0200 Uladzislau Rezki <urezki@gmail.com> wrote:
> > > > > > >
> > > > > > > > > > Acked-by: Barry Song <baohua@kernel.org>
> > > > > > > > > >
> > > > > > > > > > because we already have a fallback here:
> > > > > > > > > >
> > > > > > > > > > void *__vmalloc_node_range_noprof :
> > > > > > > > > >
> > > > > > > > > > fail:
> > > > > > > > > > if (shift > PAGE_SHIFT) {
> > > > > > > > > > shift = PAGE_SHIFT;
> > > > > > > > > > align = real_align;
> > > > > > > > > > size = real_size;
> > > > > > > > > > goto again;
> > > > > > > > > > }
> > > > > > > > >
> > > > > > > > > This really deserves a comment because this is not really clear at all.
> > > > > > > > > The code is also fragile and it would benefit from some re-org.
> > > > > > > > >
> > > > > > > > > Thanks for the fix.
> > > > > > > > >
> > > > > > > > > Acked-by: Michal Hocko <mhocko@suse.com>
> > > > > > > > >
> > > > > > > > I agree. This is only clear for people who know the code. A "fallback"
> > > > > > > > to order-0 should be commented.
> > > > > > >
> > > > > > > It's been a week. Could someone please propose a fixup patch to add
> > > > > > > this comment?
> > > > > >
> > > > > > Hi Andrew:
> > > > > >
> > > > > > Do you mean that I need to send a v2 patch with the the comments included?
> > > > > >
> > > > > It is better to post v2.
> > > > Got it.
> > > >
> > > > >
> > > > > But before, could you please comment on:
> > > > >
> > > > > in case of order-0, bulk path may easily fail and fallback to the single
> > > > > page allocator. If an request is marked as NO_FAIL, i am talking about
> > > > > order-0 request, your change breaks GFP_NOFAIL for !order.
> > > > >
> > > > > Am i missing something obvious?
> > > > For order-0, alloc_pages(GFP_X | __GFP_NOFAIL, 0), buddy allocator will handle
> > > > the flag correctly. IMO we don't need to handle the flag here.
> > > >
> > > Agree. As for comment, i meant to comment the below fallback:
> > Michal send a craft that make nofail logic more clearer and I check the branch
> > found Andrew already merged in -stable branch. So we can include these with a
> > new patch.
> >
> Just to confirm. Will you send an extra patch with the comment?
>
Also, an idea to handle NOFAIL outside of vm_area_alloc_pages() looks
sounds good to me.
--
Uladzislau Rezki
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RESEND PATCH v1] mm/vmalloc: fix page mapping if vm_area_alloc_pages() with high order fallback to order 0
2024-08-19 13:38 ` Uladzislau Rezki
2024-08-19 13:45 ` Uladzislau Rezki
@ 2024-08-20 1:59 ` Hailong Liu
2024-08-20 6:44 ` Uladzislau Rezki
1 sibling, 1 reply; 27+ messages in thread
From: Hailong Liu @ 2024-08-20 1:59 UTC (permalink / raw)
To: Uladzislau Rezki
Cc: Michal Hocko, Andrew Morton, Barry Song, Christoph Hellwig,
Vlastimil Babka, Tangquan Zheng, stable, Baoquan He,
Matthew Wilcox, linux-mm, linux-kernel
On Mon, 19. Aug 15:38, Uladzislau Rezki wrote:
> On Mon, Aug 19, 2024 at 08:57:38PM +0800, Hailong Liu wrote:
> > On Mon, 19. Aug 13:59, Uladzislau Rezki wrote:
> > > On Fri, Aug 16, 2024 at 07:46:26PM +0800, Hailong Liu wrote:
> > > > On Fri, 16. Aug 12:13, Uladzislau Rezki wrote:
> > > > > On Fri, Aug 16, 2024 at 05:12:32PM +0800, Hailong Liu wrote:
> > > > > > On Thu, 15. Aug 22:07, Andrew Morton wrote:
> > > > > > > On Fri, 9 Aug 2024 11:41:42 +0200 Uladzislau Rezki <urezki@gmail.com> wrote:
> > > > > > >
> > > > > > > > > > Acked-by: Barry Song <baohua@kernel.org>
> > > > > > > > > >
> > > > > > > > > > because we already have a fallback here:
> > > > > > > > > >
> > > > > > > > > > void *__vmalloc_node_range_noprof :
> > > > > > > > > >
> > > > > > > > > > fail:
> > > > > > > > > > if (shift > PAGE_SHIFT) {
> > > > > > > > > > shift = PAGE_SHIFT;
> > > > > > > > > > align = real_align;
> > > > > > > > > > size = real_size;
> > > > > > > > > > goto again;
> > > > > > > > > > }
> > > > > > > > >
> > > > > > > > > This really deserves a comment because this is not really clear at all.
> > > > > > > > > The code is also fragile and it would benefit from some re-org.
> > > > > > > > >
> > > > > > > > > Thanks for the fix.
> > > > > > > > >
> > > > > > > > > Acked-by: Michal Hocko <mhocko@suse.com>
> > > > > > > > >
> > > > > > > > I agree. This is only clear for people who know the code. A "fallback"
> > > > > > > > to order-0 should be commented.
> > > > > > >
> > > > > > > It's been a week. Could someone please propose a fixup patch to add
> > > > > > > this comment?
> > > > > >
> > > > > > Hi Andrew:
> > > > > >
> > > > > > Do you mean that I need to send a v2 patch with the the comments included?
> > > > > >
> > > > > It is better to post v2.
> > > > Got it.
> > > >
> > > > >
> > > > > But before, could you please comment on:
> > > > >
> > > > > in case of order-0, bulk path may easily fail and fallback to the single
> > > > > page allocator. If an request is marked as NO_FAIL, i am talking about
> > > > > order-0 request, your change breaks GFP_NOFAIL for !order.
> > > > >
> > > > > Am i missing something obvious?
> > > > For order-0, alloc_pages(GFP_X | __GFP_NOFAIL, 0), buddy allocator will handle
> > > > the flag correctly. IMO we don't need to handle the flag here.
> > > >
> > > Agree. As for comment, i meant to comment the below fallback:
> > Michal send a craft that make nofail logic more clearer and I check the branch
> > found Andrew already merged in -stable branch. So we can include these with a
> > new patch.
> >
> Just to confirm. Will you send an extra patch with the comment?
>
If this is not urgent, I can send this patch later this week. :)
> --
> Uladzislau Rezki
--
Help you, Help me,
Hailong.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RESEND PATCH v1] mm/vmalloc: fix page mapping if vm_area_alloc_pages() with high order fallback to order 0
2024-08-20 1:59 ` Hailong Liu
@ 2024-08-20 6:44 ` Uladzislau Rezki
2024-08-20 6:54 ` Hailong Liu
0 siblings, 1 reply; 27+ messages in thread
From: Uladzislau Rezki @ 2024-08-20 6:44 UTC (permalink / raw)
To: Hailong Liu
Cc: Uladzislau Rezki, Michal Hocko, Andrew Morton, Barry Song,
Christoph Hellwig, Vlastimil Babka, Tangquan Zheng, stable,
Baoquan He, Matthew Wilcox, linux-mm, linux-kernel
On Tue, Aug 20, 2024 at 09:59:50AM +0800, Hailong Liu wrote:
> On Mon, 19. Aug 15:38, Uladzislau Rezki wrote:
> > On Mon, Aug 19, 2024 at 08:57:38PM +0800, Hailong Liu wrote:
> > > On Mon, 19. Aug 13:59, Uladzislau Rezki wrote:
> > > > On Fri, Aug 16, 2024 at 07:46:26PM +0800, Hailong Liu wrote:
> > > > > On Fri, 16. Aug 12:13, Uladzislau Rezki wrote:
> > > > > > On Fri, Aug 16, 2024 at 05:12:32PM +0800, Hailong Liu wrote:
> > > > > > > On Thu, 15. Aug 22:07, Andrew Morton wrote:
> > > > > > > > On Fri, 9 Aug 2024 11:41:42 +0200 Uladzislau Rezki <urezki@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > > > Acked-by: Barry Song <baohua@kernel.org>
> > > > > > > > > > >
> > > > > > > > > > > because we already have a fallback here:
> > > > > > > > > > >
> > > > > > > > > > > void *__vmalloc_node_range_noprof :
> > > > > > > > > > >
> > > > > > > > > > > fail:
> > > > > > > > > > > if (shift > PAGE_SHIFT) {
> > > > > > > > > > > shift = PAGE_SHIFT;
> > > > > > > > > > > align = real_align;
> > > > > > > > > > > size = real_size;
> > > > > > > > > > > goto again;
> > > > > > > > > > > }
> > > > > > > > > >
> > > > > > > > > > This really deserves a comment because this is not really clear at all.
> > > > > > > > > > The code is also fragile and it would benefit from some re-org.
> > > > > > > > > >
> > > > > > > > > > Thanks for the fix.
> > > > > > > > > >
> > > > > > > > > > Acked-by: Michal Hocko <mhocko@suse.com>
> > > > > > > > > >
> > > > > > > > > I agree. This is only clear for people who know the code. A "fallback"
> > > > > > > > > to order-0 should be commented.
> > > > > > > >
> > > > > > > > It's been a week. Could someone please propose a fixup patch to add
> > > > > > > > this comment?
> > > > > > >
> > > > > > > Hi Andrew:
> > > > > > >
> > > > > > > Do you mean that I need to send a v2 patch with the the comments included?
> > > > > > >
> > > > > > It is better to post v2.
> > > > > Got it.
> > > > >
> > > > > >
> > > > > > But before, could you please comment on:
> > > > > >
> > > > > > in case of order-0, bulk path may easily fail and fallback to the single
> > > > > > page allocator. If an request is marked as NO_FAIL, i am talking about
> > > > > > order-0 request, your change breaks GFP_NOFAIL for !order.
> > > > > >
> > > > > > Am i missing something obvious?
> > > > > For order-0, alloc_pages(GFP_X | __GFP_NOFAIL, 0), buddy allocator will handle
> > > > > the flag correctly. IMO we don't need to handle the flag here.
> > > > >
> > > > Agree. As for comment, i meant to comment the below fallback:
> > > Michal send a craft that make nofail logic more clearer and I check the branch
> > > found Andrew already merged in -stable branch. So we can include these with a
> > > new patch.
> > >
> > Just to confirm. Will you send an extra patch with the comment?
> >
> If this is not urgent, I can send this patch later this week. :)
>
This is for synchronization, so we both do not do a double work :)
--
Uladzislau Rezki
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RESEND PATCH v1] mm/vmalloc: fix page mapping if vm_area_alloc_pages() with high order fallback to order 0
2024-08-20 6:44 ` Uladzislau Rezki
@ 2024-08-20 6:54 ` Hailong Liu
0 siblings, 0 replies; 27+ messages in thread
From: Hailong Liu @ 2024-08-20 6:54 UTC (permalink / raw)
To: Uladzislau Rezki
Cc: Michal Hocko, Andrew Morton, Barry Song, Christoph Hellwig,
Vlastimil Babka, Tangquan Zheng, stable, Baoquan He,
Matthew Wilcox, linux-mm, linux-kernel
On Tue, 20. Aug 08:44, Uladzislau Rezki wrote:
> On Tue, Aug 20, 2024 at 09:59:50AM +0800, Hailong Liu wrote:
> > On Mon, 19. Aug 15:38, Uladzislau Rezki wrote:
> > > On Mon, Aug 19, 2024 at 08:57:38PM +0800, Hailong Liu wrote:
> > > > On Mon, 19. Aug 13:59, Uladzislau Rezki wrote:
> > > > > On Fri, Aug 16, 2024 at 07:46:26PM +0800, Hailong Liu wrote:
> > > > > > On Fri, 16. Aug 12:13, Uladzislau Rezki wrote:
> > > > > > > On Fri, Aug 16, 2024 at 05:12:32PM +0800, Hailong Liu wrote:
> > > > > > > > On Thu, 15. Aug 22:07, Andrew Morton wrote:
> > > > > > > > > On Fri, 9 Aug 2024 11:41:42 +0200 Uladzislau Rezki <urezki@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > > > > Acked-by: Barry Song <baohua@kernel.org>
> > > > > > > > > > > >
> > > > > > > > > > > > because we already have a fallback here:
> > > > > > > > > > > >
> > > > > > > > > > > > void *__vmalloc_node_range_noprof :
> > > > > > > > > > > >
> > > > > > > > > > > > fail:
> > > > > > > > > > > > if (shift > PAGE_SHIFT) {
> > > > > > > > > > > > shift = PAGE_SHIFT;
> > > > > > > > > > > > align = real_align;
> > > > > > > > > > > > size = real_size;
> > > > > > > > > > > > goto again;
> > > > > > > > > > > > }
> > > > > > > > > > >
> > > > > > > > > > > This really deserves a comment because this is not really clear at all.
> > > > > > > > > > > The code is also fragile and it would benefit from some re-org.
> > > > > > > > > > >
> > > > > > > > > > > Thanks for the fix.
> > > > > > > > > > >
> > > > > > > > > > > Acked-by: Michal Hocko <mhocko@suse.com>
> > > > > > > > > > >
> > > > > > > > > > I agree. This is only clear for people who know the code. A "fallback"
> > > > > > > > > > to order-0 should be commented.
> > > > > > > > >
> > > > > > > > > It's been a week. Could someone please propose a fixup patch to add
> > > > > > > > > this comment?
> > > > > > > >
> > > > > > > > Hi Andrew:
> > > > > > > >
> > > > > > > > Do you mean that I need to send a v2 patch with the the comments included?
> > > > > > > >
> > > > > > > It is better to post v2.
> > > > > > Got it.
> > > > > >
> > > > > > >
> > > > > > > But before, could you please comment on:
> > > > > > >
> > > > > > > in case of order-0, bulk path may easily fail and fallback to the single
> > > > > > > page allocator. If an request is marked as NO_FAIL, i am talking about
> > > > > > > order-0 request, your change breaks GFP_NOFAIL for !order.
> > > > > > >
> > > > > > > Am i missing something obvious?
> > > > > > For order-0, alloc_pages(GFP_X | __GFP_NOFAIL, 0), buddy allocator will handle
> > > > > > the flag correctly. IMO we don't need to handle the flag here.
> > > > > >
> > > > > Agree. As for comment, i meant to comment the below fallback:
> > > > Michal send a craft that make nofail logic more clearer and I check the branch
> > > > found Andrew already merged in -stable branch. So we can include these with a
> > > > new patch.
> > > >
> > > Just to confirm. Will you send an extra patch with the comment?
> > >
> > If this is not urgent, I can send this patch later this week. :)
> >
> This is for synchronization, so we both do not do a double work :)
Ahh I guess you already have a plan, so I don't need to do this. Wating for your
patch :)
>
> --
> Uladzislau Rezki
--
Help you, Help me,
Hailong.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RESEND PATCH v1] mm/vmalloc: fix page mapping if vm_area_alloc_pages() with high order fallback to order 0
2024-08-16 12:32 ` Michal Hocko
@ 2024-08-23 16:42 ` Uladzislau Rezki
2024-08-26 7:52 ` Michal Hocko
0 siblings, 1 reply; 27+ messages in thread
From: Uladzislau Rezki @ 2024-08-23 16:42 UTC (permalink / raw)
To: Michal Hocko
Cc: Hailong Liu, Uladzislau Rezki, Andrew Morton, Barry Song,
Christoph Hellwig, Vlastimil Babka, Tangquan Zheng, stable,
Baoquan He, Matthew Wilcox, linux-mm, linux-kernel
Hello, Michal.
>
> Let me clarify what I would like to have clarified:
>
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 6b783baf12a1..fea90a39f5c5 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -3510,13 +3510,13 @@ void *vmap_pfn(unsigned long *pfns, unsigned int count, pgprot_t prot)
> EXPORT_SYMBOL_GPL(vmap_pfn);
> #endif /* CONFIG_VMAP_PFN */
>
> +/* GFP_NOFAIL semantic is implemented by __vmalloc_node_range_noprof */
> static inline unsigned int
> vm_area_alloc_pages(gfp_t gfp, int nid,
> unsigned int order, unsigned int nr_pages, struct page **pages)
> {
> unsigned int nr_allocated = 0;
> - gfp_t alloc_gfp = gfp;
> - bool nofail = gfp & __GFP_NOFAIL;
> + gfp_t alloc_gfp = gfp & ~ __GFP_NOFAIL;
> struct page *page;
> int i;
>
> @@ -3527,9 +3527,6 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
> * more permissive.
> */
> if (!order) {
> - /* bulk allocator doesn't support nofail req. officially */
> - gfp_t bulk_gfp = gfp & ~__GFP_NOFAIL;
> -
> while (nr_allocated < nr_pages) {
> unsigned int nr, nr_pages_request;
>
> @@ -3547,12 +3544,12 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
> * but mempolicy wants to alloc memory by interleaving.
> */
> if (IS_ENABLED(CONFIG_NUMA) && nid == NUMA_NO_NODE)
> - nr = alloc_pages_bulk_array_mempolicy_noprof(bulk_gfp,
> + nr = alloc_pages_bulk_array_mempolicy_noprof(alloc_gfp,
> nr_pages_request,
> pages + nr_allocated);
>
> else
> - nr = alloc_pages_bulk_array_node_noprof(bulk_gfp, nid,
> + nr = alloc_pages_bulk_array_node_noprof(alloc_gfp, nid,
> nr_pages_request,
> pages + nr_allocated);
>
> @@ -3566,13 +3563,6 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
> if (nr != nr_pages_request)
> break;
> }
> - } else if (gfp & __GFP_NOFAIL) {
> - /*
> - * Higher order nofail allocations are really expensive and
> - * potentially dangerous (pre-mature OOM, disruptive reclaim
> - * and compaction etc.
> - */
> - alloc_gfp &= ~__GFP_NOFAIL;
> }
>
> /* High-order pages or fallback path if "bulk" fails. */
> --
>
See below the change. It does not do any functional change and it is rather
a small refactoring, which includes the comment i wanted to add and what you
wanted to be clarified(if i got you correctly):
<snip>
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 3f9b6bd707d2..24fad2e48799 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -3531,8 +3531,6 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
unsigned int order, unsigned int nr_pages, struct page **pages)
{
unsigned int nr_allocated = 0;
- gfp_t alloc_gfp = gfp;
- bool nofail = gfp & __GFP_NOFAIL;
struct page *page;
int i;
@@ -3543,9 +3541,6 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
* more permissive.
*/
if (!order) {
- /* bulk allocator doesn't support nofail req. officially */
- gfp_t bulk_gfp = gfp & ~__GFP_NOFAIL;
-
while (nr_allocated < nr_pages) {
unsigned int nr, nr_pages_request;
@@ -3563,12 +3558,12 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
* but mempolicy wants to alloc memory by interleaving.
*/
if (IS_ENABLED(CONFIG_NUMA) && nid == NUMA_NO_NODE)
- nr = alloc_pages_bulk_array_mempolicy_noprof(bulk_gfp,
+ nr = alloc_pages_bulk_array_mempolicy_noprof(gfp & ~__GFP_NOFAIL,
nr_pages_request,
pages + nr_allocated);
-
else
- nr = alloc_pages_bulk_array_node_noprof(bulk_gfp, nid,
+ /* bulk allocator doesn't support nofail req. officially */
+ nr = alloc_pages_bulk_array_node_noprof(gfp & ~__GFP_NOFAIL, nid,
nr_pages_request,
pages + nr_allocated);
@@ -3582,24 +3577,18 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
if (nr != nr_pages_request)
break;
}
- } else if (gfp & __GFP_NOFAIL) {
- /*
- * Higher order nofail allocations are really expensive and
- * potentially dangerous (pre-mature OOM, disruptive reclaim
- * and compaction etc.
- */
- alloc_gfp &= ~__GFP_NOFAIL;
}
/* High-order pages or fallback path if "bulk" fails. */
while (nr_allocated < nr_pages) {
- if (!nofail && fatal_signal_pending(current))
+ if (!(gfp & __GFP_NOFAIL) && fatal_signal_pending(current))
break;
if (nid == NUMA_NO_NODE)
- page = alloc_pages_noprof(alloc_gfp, order);
+ page = alloc_pages_noprof(gfp, order);
else
- page = alloc_pages_node_noprof(nid, alloc_gfp, order);
+ page = alloc_pages_node_noprof(nid, gfp, order);
+
if (unlikely(!page))
break;
@@ -3666,7 +3655,16 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
set_vm_area_page_order(area, page_shift - PAGE_SHIFT);
page_order = vm_area_page_order(area);
- area->nr_pages = vm_area_alloc_pages(gfp_mask | __GFP_NOWARN,
+ /*
+ * Higher order nofail allocations are really expensive and
+ * potentially dangerous (pre-mature OOM, disruptive reclaim
+ * and compaction etc.
+ *
+ * Please note, the __vmalloc_node_range_noprof() falls-back
+ * to order-0 pages if high-order attempt has been unsuccessful.
+ */
+ area->nr_pages = vm_area_alloc_pages(page_order ?
+ gfp_mask &= ~__GFP_NOFAIL : gfp_mask | __GFP_NOWARN,
node, page_order, nr_small_pages, area->pages);
atomic_long_add(area->nr_pages, &nr_vmalloc_pages);
<snip>
Is that aligned with your wish?
Thanks!
--
Uladzislau Rezki
^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [RESEND PATCH v1] mm/vmalloc: fix page mapping if vm_area_alloc_pages() with high order fallback to order 0
2024-08-23 16:42 ` Uladzislau Rezki
@ 2024-08-26 7:52 ` Michal Hocko
2024-08-26 12:38 ` Uladzislau Rezki
0 siblings, 1 reply; 27+ messages in thread
From: Michal Hocko @ 2024-08-26 7:52 UTC (permalink / raw)
To: Uladzislau Rezki
Cc: Hailong Liu, Andrew Morton, Barry Song, Christoph Hellwig,
Vlastimil Babka, Tangquan Zheng, stable, Baoquan He,
Matthew Wilcox, linux-mm, linux-kernel
On Fri 23-08-24 18:42:47, Uladzislau Rezki wrote:
[...]
> @@ -3666,7 +3655,16 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
> set_vm_area_page_order(area, page_shift - PAGE_SHIFT);
> page_order = vm_area_page_order(area);
>
> - area->nr_pages = vm_area_alloc_pages(gfp_mask | __GFP_NOWARN,
> + /*
> + * Higher order nofail allocations are really expensive and
> + * potentially dangerous (pre-mature OOM, disruptive reclaim
> + * and compaction etc.
> + *
> + * Please note, the __vmalloc_node_range_noprof() falls-back
> + * to order-0 pages if high-order attempt has been unsuccessful.
> + */
> + area->nr_pages = vm_area_alloc_pages(page_order ?
> + gfp_mask &= ~__GFP_NOFAIL : gfp_mask | __GFP_NOWARN,
> node, page_order, nr_small_pages, area->pages);
>
> atomic_long_add(area->nr_pages, &nr_vmalloc_pages);
> <snip>
>
> Is that aligned with your wish?
I am not a great fan of modifying gfp_mask inside the ternary operator
like that. It makes the code harder to read. Is there any actual reason
to simply drop GFP_NOFAIL unconditionally and rely do the NOFAIL
handling for all orders at the same place?
Not that I care about this much TBH. It is an improvement to drop all
the NOFAIL specifics from vm_area_alloc_pages.
--
Michal Hocko
SUSE Labs
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RESEND PATCH v1] mm/vmalloc: fix page mapping if vm_area_alloc_pages() with high order fallback to order 0
2024-08-26 7:52 ` Michal Hocko
@ 2024-08-26 12:38 ` Uladzislau Rezki
2024-08-27 6:49 ` Michal Hocko
0 siblings, 1 reply; 27+ messages in thread
From: Uladzislau Rezki @ 2024-08-26 12:38 UTC (permalink / raw)
To: Michal Hocko
Cc: Uladzislau Rezki, Hailong Liu, Andrew Morton, Barry Song,
Christoph Hellwig, Vlastimil Babka, Tangquan Zheng, stable,
Baoquan He, Matthew Wilcox, linux-mm, linux-kernel
On Mon, Aug 26, 2024 at 09:52:42AM +0200, Michal Hocko wrote:
> On Fri 23-08-24 18:42:47, Uladzislau Rezki wrote:
> [...]
> > @@ -3666,7 +3655,16 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
> > set_vm_area_page_order(area, page_shift - PAGE_SHIFT);
> > page_order = vm_area_page_order(area);
> >
> > - area->nr_pages = vm_area_alloc_pages(gfp_mask | __GFP_NOWARN,
> > + /*
> > + * Higher order nofail allocations are really expensive and
> > + * potentially dangerous (pre-mature OOM, disruptive reclaim
> > + * and compaction etc.
> > + *
> > + * Please note, the __vmalloc_node_range_noprof() falls-back
> > + * to order-0 pages if high-order attempt has been unsuccessful.
> > + */
> > + area->nr_pages = vm_area_alloc_pages(page_order ?
> > + gfp_mask &= ~__GFP_NOFAIL : gfp_mask | __GFP_NOWARN,
> > node, page_order, nr_small_pages, area->pages);
> >
> > atomic_long_add(area->nr_pages, &nr_vmalloc_pages);
> > <snip>
> >
> > Is that aligned with your wish?
>
> I am not a great fan of modifying gfp_mask inside the ternary operator
> like that. It makes the code harder to read. Is there any actual reason
> to simply drop GFP_NOFAIL unconditionally and rely do the NOFAIL
> handling for all orders at the same place?
>
1. So, for bulk we have below:
/* gfp_t bulk_gfp = gfp & ~__GFP_NOFAIL; */
I am not sure if we need it but it says it does not support it which
is not clear for me why we have to drop __GFP_NOFAIL for bulk(). There
is a fallback to a single page allocator. If passing __GFP_NOFAIL does
not trigger any warning or panic a system, then i do not follow why
we drop that flag.
Is that odd?
2. High-order allocations. Do you think we should not care much about
it when __GFP_NOFAIL is set? Same here, there is a fallback for order-0
if "high" fails, it is more likely NO_FAIL succeed for order-0. Thus
keeping NOFAIL for high-order sounds like not a good approach to me.
3. "... at the same place?"
Do you mean in the __vmalloc_node_range_noprof()?
__vmalloc_node_range_noprof()
-> __vmalloc_area_node(gfp_mask)
-> vm_area_alloc_pages()
if, so it is not straight forward, i.e. there is one more allocation:
<snip>
static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
pgprot_t prot, unsigned int page_shift,
int node)
{
...
/* Please note that the recursion is strictly bounded. */
if (array_size > PAGE_SIZE) {
area->pages = __vmalloc_node_noprof(array_size, 1, nested_gfp, node,
area->caller);
} else {
area->pages = kmalloc_node_noprof(array_size, nested_gfp, node);
}
...
}
<snip>
whereas it is easier to do it inside of the __vmalloc_area_node().
>
> Not that I care about this much TBH. It is an improvement to drop all
> the NOFAIL specifics from vm_area_alloc_pages.
>
I agree. I also do not like modifying gfp flags on different levels and
different cases. To me there is only one case. It is high-order requests
with NOFAIL. For this i think we should keep our approach, i mean
dropping NOFAIL and repeat because we have a fallback.
--
Uladzislau Rezki
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RESEND PATCH v1] mm/vmalloc: fix page mapping if vm_area_alloc_pages() with high order fallback to order 0
2024-08-26 12:38 ` Uladzislau Rezki
@ 2024-08-27 6:49 ` Michal Hocko
2024-08-27 12:47 ` Uladzislau Rezki
0 siblings, 1 reply; 27+ messages in thread
From: Michal Hocko @ 2024-08-27 6:49 UTC (permalink / raw)
To: Uladzislau Rezki
Cc: Hailong Liu, Andrew Morton, Barry Song, Christoph Hellwig,
Vlastimil Babka, Tangquan Zheng, stable, Baoquan He,
Matthew Wilcox, linux-mm, linux-kernel
On Mon 26-08-24 14:38:40, Uladzislau Rezki wrote:
> On Mon, Aug 26, 2024 at 09:52:42AM +0200, Michal Hocko wrote:
> > On Fri 23-08-24 18:42:47, Uladzislau Rezki wrote:
> > [...]
> > > @@ -3666,7 +3655,16 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
> > > set_vm_area_page_order(area, page_shift - PAGE_SHIFT);
> > > page_order = vm_area_page_order(area);
> > >
> > > - area->nr_pages = vm_area_alloc_pages(gfp_mask | __GFP_NOWARN,
> > > + /*
> > > + * Higher order nofail allocations are really expensive and
> > > + * potentially dangerous (pre-mature OOM, disruptive reclaim
> > > + * and compaction etc.
> > > + *
> > > + * Please note, the __vmalloc_node_range_noprof() falls-back
> > > + * to order-0 pages if high-order attempt has been unsuccessful.
> > > + */
> > > + area->nr_pages = vm_area_alloc_pages(page_order ?
> > > + gfp_mask &= ~__GFP_NOFAIL : gfp_mask | __GFP_NOWARN,
> > > node, page_order, nr_small_pages, area->pages);
> > >
> > > atomic_long_add(area->nr_pages, &nr_vmalloc_pages);
> > > <snip>
> > >
> > > Is that aligned with your wish?
> >
> > I am not a great fan of modifying gfp_mask inside the ternary operator
> > like that. It makes the code harder to read. Is there any actual reason
> > to simply drop GFP_NOFAIL unconditionally and rely do the NOFAIL
> > handling for all orders at the same place?
> >
> 1. So, for bulk we have below:
>
> /* gfp_t bulk_gfp = gfp & ~__GFP_NOFAIL; */
>
> I am not sure if we need it but it says it does not support it which
> is not clear for me why we have to drop __GFP_NOFAIL for bulk(). There
> is a fallback to a single page allocator. If passing __GFP_NOFAIL does
> not trigger any warning or panic a system, then i do not follow why
> we drop that flag.
>
> Is that odd?
I suspect this was a pre-caution more than anything.
> 2. High-order allocations. Do you think we should not care much about
> it when __GFP_NOFAIL is set? Same here, there is a fallback for order-0
> if "high" fails, it is more likely NO_FAIL succeed for order-0. Thus
> keeping NOFAIL for high-order sounds like not a good approach to me.
We should avoid high order allocations with GFP_NOFAIL at all cost.
> 3. "... at the same place?"
> Do you mean in the __vmalloc_node_range_noprof()?
>
> __vmalloc_node_range_noprof()
> -> __vmalloc_area_node(gfp_mask)
> -> vm_area_alloc_pages()
>
> if, so it is not straight forward, i.e. there is one more allocation:
>
> <snip>
> static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
> pgprot_t prot, unsigned int page_shift,
> int node)
> {
> ...
> /* Please note that the recursion is strictly bounded. */
> if (array_size > PAGE_SIZE) {
> area->pages = __vmalloc_node_noprof(array_size, 1, nested_gfp, node,
> area->caller);
> } else {
> area->pages = kmalloc_node_noprof(array_size, nested_gfp, node);
> }
> ...
> }
> <snip>
>
> whereas it is easier to do it inside of the __vmalloc_area_node().
Right. The allocation path is quite convoluted here. If it is just too
much of a hassle to implement NOFAIL at a single place then we should
aim at reducing that. Having that at 3 different layers is just begging
for inconsistences.
--
Michal Hocko
SUSE Labs
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RESEND PATCH v1] mm/vmalloc: fix page mapping if vm_area_alloc_pages() with high order fallback to order 0
2024-08-27 6:49 ` Michal Hocko
@ 2024-08-27 12:47 ` Uladzislau Rezki
2024-08-27 13:37 ` Michal Hocko
0 siblings, 1 reply; 27+ messages in thread
From: Uladzislau Rezki @ 2024-08-27 12:47 UTC (permalink / raw)
To: Michal Hocko
Cc: Uladzislau Rezki, Hailong Liu, Andrew Morton, Barry Song,
Christoph Hellwig, Vlastimil Babka, Tangquan Zheng, stable,
Baoquan He, Matthew Wilcox, linux-mm, linux-kernel
On Tue, Aug 27, 2024 at 08:49:35AM +0200, Michal Hocko wrote:
> On Mon 26-08-24 14:38:40, Uladzislau Rezki wrote:
> > On Mon, Aug 26, 2024 at 09:52:42AM +0200, Michal Hocko wrote:
> > > On Fri 23-08-24 18:42:47, Uladzislau Rezki wrote:
> > > [...]
> > > > @@ -3666,7 +3655,16 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
> > > > set_vm_area_page_order(area, page_shift - PAGE_SHIFT);
> > > > page_order = vm_area_page_order(area);
> > > >
> > > > - area->nr_pages = vm_area_alloc_pages(gfp_mask | __GFP_NOWARN,
> > > > + /*
> > > > + * Higher order nofail allocations are really expensive and
> > > > + * potentially dangerous (pre-mature OOM, disruptive reclaim
> > > > + * and compaction etc.
> > > > + *
> > > > + * Please note, the __vmalloc_node_range_noprof() falls-back
> > > > + * to order-0 pages if high-order attempt has been unsuccessful.
> > > > + */
> > > > + area->nr_pages = vm_area_alloc_pages(page_order ?
> > > > + gfp_mask &= ~__GFP_NOFAIL : gfp_mask | __GFP_NOWARN,
> > > > node, page_order, nr_small_pages, area->pages);
> > > >
> > > > atomic_long_add(area->nr_pages, &nr_vmalloc_pages);
> > > > <snip>
> > > >
> > > > Is that aligned with your wish?
> > >
> > > I am not a great fan of modifying gfp_mask inside the ternary operator
> > > like that. It makes the code harder to read. Is there any actual reason
> > > to simply drop GFP_NOFAIL unconditionally and rely do the NOFAIL
> > > handling for all orders at the same place?
> > >
> > 1. So, for bulk we have below:
> >
> > /* gfp_t bulk_gfp = gfp & ~__GFP_NOFAIL; */
> >
> > I am not sure if we need it but it says it does not support it which
> > is not clear for me why we have to drop __GFP_NOFAIL for bulk(). There
> > is a fallback to a single page allocator. If passing __GFP_NOFAIL does
> > not trigger any warning or panic a system, then i do not follow why
> > we drop that flag.
> >
> > Is that odd?
>
> I suspect this was a pre-caution more than anything.
>
OK, then i drop it.
> > 2. High-order allocations. Do you think we should not care much about
> > it when __GFP_NOFAIL is set? Same here, there is a fallback for order-0
> > if "high" fails, it is more likely NO_FAIL succeed for order-0. Thus
> > keeping NOFAIL for high-order sounds like not a good approach to me.
>
> We should avoid high order allocations with GFP_NOFAIL at all cost.
>
What do you propose here? Fail such request?
> > 3. "... at the same place?"
> > Do you mean in the __vmalloc_node_range_noprof()?
> >
> > __vmalloc_node_range_noprof()
> > -> __vmalloc_area_node(gfp_mask)
> > -> vm_area_alloc_pages()
> >
> > if, so it is not straight forward, i.e. there is one more allocation:
> >
> > <snip>
> > static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
> > pgprot_t prot, unsigned int page_shift,
> > int node)
> > {
> > ...
> > /* Please note that the recursion is strictly bounded. */
> > if (array_size > PAGE_SIZE) {
> > area->pages = __vmalloc_node_noprof(array_size, 1, nested_gfp, node,
> > area->caller);
> > } else {
> > area->pages = kmalloc_node_noprof(array_size, nested_gfp, node);
> > }
> > ...
> > }
> > <snip>
> >
> > whereas it is easier to do it inside of the __vmalloc_area_node().
>
> Right. The allocation path is quite convoluted here. If it is just too
> much of a hassle to implement NOFAIL at a single place then we should
> aim at reducing that. Having that at 3 different layers is just begging
> for inconsistences.
>
Hard to not agree :)
--
Uladzislau Rezki
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RESEND PATCH v1] mm/vmalloc: fix page mapping if vm_area_alloc_pages() with high order fallback to order 0
2024-08-27 12:47 ` Uladzislau Rezki
@ 2024-08-27 13:37 ` Michal Hocko
2024-08-27 15:29 ` Uladzislau Rezki
0 siblings, 1 reply; 27+ messages in thread
From: Michal Hocko @ 2024-08-27 13:37 UTC (permalink / raw)
To: Uladzislau Rezki
Cc: Hailong Liu, Andrew Morton, Barry Song, Christoph Hellwig,
Vlastimil Babka, Tangquan Zheng, stable, Baoquan He,
Matthew Wilcox, linux-mm, linux-kernel
On Tue 27-08-24 14:47:30, Uladzislau Rezki wrote:
> On Tue, Aug 27, 2024 at 08:49:35AM +0200, Michal Hocko wrote:
[...]
> > > 2. High-order allocations. Do you think we should not care much about
> > > it when __GFP_NOFAIL is set? Same here, there is a fallback for order-0
> > > if "high" fails, it is more likely NO_FAIL succeed for order-0. Thus
> > > keeping NOFAIL for high-order sounds like not a good approach to me.
> >
> > We should avoid high order allocations with GFP_NOFAIL at all cost.
> >
> What do you propose here? Fail such request?
We shouldn't have any hard requirements for higher order allocations in the vmalloc
right? In other words we can always fallback to base pages.
--
Michal Hocko
SUSE Labs
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RESEND PATCH v1] mm/vmalloc: fix page mapping if vm_area_alloc_pages() with high order fallback to order 0
2024-08-27 13:37 ` Michal Hocko
@ 2024-08-27 15:29 ` Uladzislau Rezki
2024-08-28 7:14 ` Michal Hocko
0 siblings, 1 reply; 27+ messages in thread
From: Uladzislau Rezki @ 2024-08-27 15:29 UTC (permalink / raw)
To: Michal Hocko
Cc: Uladzislau Rezki, Hailong Liu, Andrew Morton, Barry Song,
Christoph Hellwig, Vlastimil Babka, Tangquan Zheng, stable,
Baoquan He, Matthew Wilcox, linux-mm, linux-kernel
On Tue, Aug 27, 2024 at 03:37:38PM +0200, Michal Hocko wrote:
> On Tue 27-08-24 14:47:30, Uladzislau Rezki wrote:
> > On Tue, Aug 26, 2024 at 08:49:35AM +0200, Michal Hocko wrote:
> [...]
> > > > 2. High-order allocations. Do you think we should not care much about
> > > > it when __GFP_NOFAIL is set? Same here, there is a fallback for order-0
> > > > if "high" fails, it is more likely NO_FAIL succeed for order-0. Thus
> > > > keeping NOFAIL for high-order sounds like not a good approach to me.
> > >
> > > We should avoid high order allocations with GFP_NOFAIL at all cost.
> > >
> > What do you propose here? Fail such request?
>
> We shouldn't have any hard requirements for higher order allocations in the vmalloc
> right? In other words we can always fallback to base pages.
>
We always drop NOFAIL for high-order, if it fails we fall-back to
order-0. I got the feeling that you wanted just bail-out fully if
high-order and NOFAIL.
--
Uladzislau Rezki
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RESEND PATCH v1] mm/vmalloc: fix page mapping if vm_area_alloc_pages() with high order fallback to order 0
2024-08-27 15:29 ` Uladzislau Rezki
@ 2024-08-28 7:14 ` Michal Hocko
2024-08-28 17:23 ` Uladzislau Rezki
0 siblings, 1 reply; 27+ messages in thread
From: Michal Hocko @ 2024-08-28 7:14 UTC (permalink / raw)
To: Uladzislau Rezki
Cc: Hailong Liu, Andrew Morton, Barry Song, Christoph Hellwig,
Vlastimil Babka, Tangquan Zheng, stable, Baoquan He,
Matthew Wilcox, linux-mm, linux-kernel
On Tue 27-08-24 17:29:34, Uladzislau Rezki wrote:
> On Tue, Aug 27, 2024 at 03:37:38PM +0200, Michal Hocko wrote:
> > On Tue 27-08-24 14:47:30, Uladzislau Rezki wrote:
> > > On Tue, Aug 26, 2024 at 08:49:35AM +0200, Michal Hocko wrote:
> > [...]
> > > > > 2. High-order allocations. Do you think we should not care much about
> > > > > it when __GFP_NOFAIL is set? Same here, there is a fallback for order-0
> > > > > if "high" fails, it is more likely NO_FAIL succeed for order-0. Thus
> > > > > keeping NOFAIL for high-order sounds like not a good approach to me.
> > > >
> > > > We should avoid high order allocations with GFP_NOFAIL at all cost.
> > > >
> > > What do you propose here? Fail such request?
> >
> > We shouldn't have any hard requirements for higher order allocations in the vmalloc
> > right? In other words we can always fallback to base pages.
> >
> We always drop NOFAIL for high-order, if it fails we fall-back to
> order-0. I got the feeling that you wanted just bail-out fully if
> high-order and NOFAIL.
Nope. We should always fall back to order 0 for both NOFAIL and regular
vmalloc allocations.
--
Michal Hocko
SUSE Labs
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RESEND PATCH v1] mm/vmalloc: fix page mapping if vm_area_alloc_pages() with high order fallback to order 0
2024-08-28 7:14 ` Michal Hocko
@ 2024-08-28 17:23 ` Uladzislau Rezki
0 siblings, 0 replies; 27+ messages in thread
From: Uladzislau Rezki @ 2024-08-28 17:23 UTC (permalink / raw)
To: Michal Hocko
Cc: Uladzislau Rezki, Hailong Liu, Andrew Morton, Barry Song,
Christoph Hellwig, Vlastimil Babka, Tangquan Zheng, stable,
Baoquan He, Matthew Wilcox, linux-mm, linux-kernel
On Wed, Aug 28, 2024 at 09:14:53AM +0200, Michal Hocko wrote:
> On Tue 27-08-24 17:29:34, Uladzislau Rezki wrote:
> > On Tue, Aug 27, 2024 at 03:37:38PM +0200, Michal Hocko wrote:
> > > On Tue 27-08-24 14:47:30, Uladzislau Rezki wrote:
> > > > On Tue, Aug 26, 2024 at 08:49:35AM +0200, Michal Hocko wrote:
> > > [...]
> > > > > > 2. High-order allocations. Do you think we should not care much about
> > > > > > it when __GFP_NOFAIL is set? Same here, there is a fallback for order-0
> > > > > > if "high" fails, it is more likely NO_FAIL succeed for order-0. Thus
> > > > > > keeping NOFAIL for high-order sounds like not a good approach to me.
> > > > >
> > > > > We should avoid high order allocations with GFP_NOFAIL at all cost.
> > > > >
> > > > What do you propose here? Fail such request?
> > >
> > > We shouldn't have any hard requirements for higher order allocations in the vmalloc
> > > right? In other words we can always fallback to base pages.
> > >
> > We always drop NOFAIL for high-order, if it fails we fall-back to
> > order-0. I got the feeling that you wanted just bail-out fully if
> > high-order and NOFAIL.
>
> Nope. We should always fall back to order 0 for both NOFAIL and regular
> vmalloc allocations.
>
Good.
Thanks for the ACK!
--
Uladzislau Rezki
^ permalink raw reply [flat|nested] 27+ messages in thread
end of thread, other threads:[~2024-08-28 17:24 UTC | newest]
Thread overview: 27+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20240808122019.3361-1-hailong.liu@oppo.com>
2024-08-08 21:05 ` [RESEND PATCH v1] mm/vmalloc: fix page mapping if vm_area_alloc_pages() with high order fallback to order 0 Barry Song
2024-08-09 9:33 ` Michal Hocko
2024-08-09 9:41 ` Uladzislau Rezki
2024-08-16 5:07 ` Andrew Morton
2024-08-16 7:19 ` Uladzislau Rezki
2024-08-16 9:12 ` Hailong Liu
2024-08-16 10:13 ` Uladzislau Rezki
2024-08-16 11:46 ` Hailong Liu
2024-08-16 12:32 ` Michal Hocko
2024-08-23 16:42 ` Uladzislau Rezki
2024-08-26 7:52 ` Michal Hocko
2024-08-26 12:38 ` Uladzislau Rezki
2024-08-27 6:49 ` Michal Hocko
2024-08-27 12:47 ` Uladzislau Rezki
2024-08-27 13:37 ` Michal Hocko
2024-08-27 15:29 ` Uladzislau Rezki
2024-08-28 7:14 ` Michal Hocko
2024-08-28 17:23 ` Uladzislau Rezki
2024-08-19 11:59 ` Uladzislau Rezki
2024-08-19 12:57 ` Hailong Liu
2024-08-19 13:38 ` Uladzislau Rezki
2024-08-19 13:45 ` Uladzislau Rezki
2024-08-20 1:59 ` Hailong Liu
2024-08-20 6:44 ` Uladzislau Rezki
2024-08-20 6:54 ` Hailong Liu
2024-08-16 16:11 ` Baoquan He
2024-08-16 16:15 ` Baoquan He
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).