* [PATCH 0/2] frozen pages for large kmalloc
From: Vlastimil Babka @ 2025-05-29 8:56 UTC
To: Christoph Lameter, David Rientjes
Cc: Andrew Morton, Roman Gushchin, Harry Yoo, Matthew Wilcox,
linux-mm, linux-kernel, Vlastimil Babka
In [1] I suggested we start warning about get_page() done on large
kmalloc pages, as the first step towards converting them to frozen pages.
We exposed that to -next and indeed got such a warning [2]. But it turns
out the affected code uses sendpage_ok() and thus would avoid the
get_page() if the page were actually frozen.
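For context, sendpage_ok() lives in include/linux/net.h and is roughly:

	static inline bool sendpage_ok(struct page *page)
	{
		/* refuse pages the network stack must not pin: slab pages,
		 * and pages with a zero refcount - which after this series
		 * includes frozen large kmalloc pages */
		return !PageSlab(page) && page_count(page) >= 1;
	}

so with the page frozen, callers see false and fall back to copying
instead of taking a reference.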
So in this version, patch 1 freezes the large kmalloc pages at the same
time as it adds the warnings and refusals to get_page()/put_page() on
large kmalloc pages, the same as we do for slab pages.
While doing that, I noticed that large kmalloc doesn't observe NUMA
policies, while the rest of the allocator does. Patch 2 fixes that, as I
assume Christoph would welcome the change.
Given the timing, I would expose this to -next after the current merge
window closes, thus aiming for 6.17.
[1] https://lore.kernel.org/all/20250417074102.4543-2-vbabka@suse.cz/
[2] https://lore.kernel.org/all/202505221248.595a9117-lkp@intel.com/
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
Vlastimil Babka (2):
mm, slab: use frozen pages for large kmalloc
mm, slab: support NUMA policy for large kmalloc
include/linux/mm.h | 4 +++-
mm/slub.c | 9 +++++++--
2 files changed, 10 insertions(+), 3 deletions(-)
---
base-commit: 9c32cda43eb78f78c73aee4aa344b777714e259b
change-id: 20250529-frozen-pages-for-large-kmalloc-bd4d2522e52b
Best regards,
--
Vlastimil Babka <vbabka@suse.cz>
* [PATCH 1/2] mm, slab: use frozen pages for large kmalloc
From: Vlastimil Babka @ 2025-05-29 8:56 UTC
To: Christoph Lameter, David Rientjes
Cc: Andrew Morton, Roman Gushchin, Harry Yoo, Matthew Wilcox,
linux-mm, linux-kernel, Vlastimil Babka
Since slab pages are now frozen, it makes sense to have large kmalloc()
objects behave the same as small kmalloc() ones, as the choice between the
two is an implementation detail depending on the allocation size.

Notably, increasing the refcount on a slab page containing a kmalloc()
object is no longer possible, so the behavior should be consistent for
large kmalloc pages.

Therefore, change large kmalloc to use the frozen pages API.

Because of some unexpected fallout in the slab pages case (see commit
b9c0e49abfca ("mm: decline to manipulate the refcount on a slab page")),
implement the same kind of checks and warnings as part of this change.

Notably, networking code using sendpage_ok() to determine whether the
page refcount can be manipulated in the network stack should continue
behaving correctly. Before this change, the function returns true for
large kmalloc pages and the page refcount can be manipulated; after this
change, it will return false.
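For illustration, a hypothetical snippet (any size above
KMALLOC_MAX_CACHE_SIZE takes the large kmalloc path):

	void *buf = kmalloc(16384, GFP_KERNEL);	/* large kmalloc: frozen compound page */
	struct page *page = virt_to_page(buf);

	get_page(page);	/* after this change: warns once, leaves the refcount alone */
	put_page(page);	/* after this change: silently refused, as for slab pages */
	kfree(buf);	/* freeing goes through kfree() as before */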
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
include/linux/mm.h | 4 +++-
mm/slub.c | 7 +++++--
2 files changed, 8 insertions(+), 3 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index bf55206935c467f7508e863332063bb15f904a24..d3eb6adf9fa949fbd611470182a03c743b16aac7 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1549,6 +1549,8 @@ static inline void get_page(struct page *page)
struct folio *folio = page_folio(page);
if (WARN_ON_ONCE(folio_test_slab(folio)))
return;
+ if (WARN_ON_ONCE(folio_test_large_kmalloc(folio)))
+ return;
folio_get(folio);
}
@@ -1643,7 +1645,7 @@ static inline void put_page(struct page *page)
{
struct folio *folio = page_folio(page);
- if (folio_test_slab(folio))
+ if (folio_test_slab(folio) || folio_test_large_kmalloc(folio))
return;
folio_put(folio);
diff --git a/mm/slub.c b/mm/slub.c
index dc9e729e1d269b5d362cb5bc44f824640ffd00f3..d7a62063a1676a327e13536bf724f0160f1fc8dc 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4281,8 +4281,11 @@ static void *___kmalloc_large_node(size_t size, gfp_t flags, int node)
if (unlikely(flags & GFP_SLAB_BUG_MASK))
flags = kmalloc_fix_flags(flags);
+ if (node == NUMA_NO_NODE)
+ node = numa_mem_id();
+
flags |= __GFP_COMP;
- folio = (struct folio *)alloc_pages_node_noprof(node, flags, order);
+ folio = (struct folio *)__alloc_frozen_pages_noprof(flags, order, node, NULL);
if (folio) {
ptr = folio_address(folio);
lruvec_stat_mod_folio(folio, NR_SLAB_UNRECLAIMABLE_B,
@@ -4778,7 +4781,7 @@ static void free_large_kmalloc(struct folio *folio, void *object)
lruvec_stat_mod_folio(folio, NR_SLAB_UNRECLAIMABLE_B,
-(PAGE_SIZE << order));
__folio_clear_large_kmalloc(folio);
- folio_put(folio);
+ free_frozen_pages(&folio->page, order);
}
/*
--
2.49.0
* [PATCH 2/2] mm, slab: support NUMA policy for large kmalloc
From: Vlastimil Babka @ 2025-05-29 8:56 UTC
To: Christoph Lameter, David Rientjes
Cc: Andrew Morton, Roman Gushchin, Harry Yoo, Matthew Wilcox,
linux-mm, linux-kernel, Vlastimil Babka
The slab allocator observes the task's numa policy in various places
such as allocating slab pages. Large kmalloc allocations currently do
not, which seems to be an unintended omission. It is simple to correct
that, so make ___kmalloc_large_node() behave the same way as
alloc_slab_page().
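The difference is only in which page allocator entry point is used;
roughly (my own summary of the two paths):

	/* explicit node requested: honor it and bypass the task mempolicy */
	folio = (struct folio *)__alloc_frozen_pages_noprof(flags, order, node, NULL);

	/* NUMA_NO_NODE: let the task's mempolicy (interleave, preferred node,
	 * bind, ...) choose, which is what alloc_slab_page() effectively does */
	folio = (struct folio *)alloc_frozen_pages_noprof(flags, order);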
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
mm/slub.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/mm/slub.c b/mm/slub.c
index d7a62063a1676a327e13536bf724f0160f1fc8dc..d87015fad2df65629050d9bcd224facd3d2f4033 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4281,11 +4281,13 @@ static void *___kmalloc_large_node(size_t size, gfp_t flags, int node)
if (unlikely(flags & GFP_SLAB_BUG_MASK))
flags = kmalloc_fix_flags(flags);
+ flags |= __GFP_COMP;
+
if (node == NUMA_NO_NODE)
- node = numa_mem_id();
+ folio = (struct folio *)alloc_frozen_pages_noprof(flags, order);
+ else
+ folio = (struct folio *)__alloc_frozen_pages_noprof(flags, order, node, NULL);
- flags |= __GFP_COMP;
- folio = (struct folio *)__alloc_frozen_pages_noprof(flags, order, node, NULL);
if (folio) {
ptr = folio_address(folio);
lruvec_stat_mod_folio(folio, NR_SLAB_UNRECLAIMABLE_B,
--
2.49.0
* Re: [PATCH 2/2] mm, slab: support NUMA policy for large kmalloc
From: Christoph Lameter (Ampere) @ 2025-05-29 14:57 UTC
To: Vlastimil Babka
Cc: David Rientjes, Andrew Morton, Roman Gushchin, Harry Yoo,
Matthew Wilcox, linux-mm, linux-kernel
On Thu, 29 May 2025, Vlastimil Babka wrote:
> The slab allocator observes the task's numa policy in various places
> such as allocating slab pages. Large kmalloc allocations currently do
> not, which seems to be an unintended omission. It is simple to correct
> that, so make ___kmalloc_large_node() behave the same way as
> alloc_slab_page().
Large kmalloc allocations lead to the use of the page allocator, which
implements the NUMA policies for the allocations.
This patch is not necessary.
* Re: [PATCH 1/2] mm, slab: use frozen pages for large kmalloc
From: Roman Gushchin @ 2025-05-29 15:46 UTC
To: Vlastimil Babka
Cc: Christoph Lameter, David Rientjes, Andrew Morton, Harry Yoo,
Matthew Wilcox, linux-mm, linux-kernel
On Thu, May 29, 2025 at 10:56:26AM +0200, Vlastimil Babka wrote:
> Since slab pages are now frozen, it makes sense to have large kmalloc()
> objects behave the same as small kmalloc() ones, as the choice between the
> two is an implementation detail depending on the allocation size.
>
> Notably, increasing the refcount on a slab page containing a kmalloc()
> object is no longer possible, so the behavior should be consistent for
> large kmalloc pages.
>
> Therefore, change large kmalloc to use the frozen pages API.
>
> Because of some unexpected fallout in the slab pages case (see commit
> b9c0e49abfca ("mm: decline to manipulate the refcount on a slab page")),
> implement the same kind of checks and warnings as part of this change.
>
> Notably, networking code using sendpage_ok() to determine whether the
> page refcount can be manipulated in the network stack should continue
> behaving correctly. Before this change, the function returns true for
> large kmalloc pages and the page refcount can be manipulated; after this
> change, it will return false.
>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index bf55206935c467f7508e863332063bb15f904a24..d3eb6adf9fa949fbd611470182a03c743b16aac7 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1549,6 +1549,8 @@ static inline void get_page(struct page *page)
> struct folio *folio = page_folio(page);
> if (WARN_ON_ONCE(folio_test_slab(folio)))
> return;
> + if (WARN_ON_ONCE(folio_test_large_kmalloc(folio)))
> + return;
> folio_get(folio);
I guess eventually we can convert them to VM_WARN_ON_ONCE()?
* Re: [PATCH 2/2] mm, slab: support NUMA policy for large kmalloc
From: Vlastimil Babka @ 2025-05-29 15:59 UTC
To: Christoph Lameter (Ampere)
Cc: David Rientjes, Andrew Morton, Roman Gushchin, Harry Yoo,
Matthew Wilcox, linux-mm, linux-kernel
On 5/29/25 16:57, Christoph Lameter (Ampere) wrote:
> On Thu, 29 May 2025, Vlastimil Babka wrote:
>
>> The slab allocator observes the task's numa policy in various places
>> such as allocating slab pages. Large kmalloc allocations currently do
>> not, which seems to be an unintended omission. It is simple to correct
>> that, so make ___kmalloc_large_node() behave the same way as
>> alloc_slab_page().
>
> Large kmalloc allocations lead to the use of the page allocator, which
> implements the NUMA policies for the allocations.
>
> This patch is not necessary.
I'm confused, as that's only true depending on which page allocator entry
point you use. AFAICS before this series, it's using
alloc_pages_node_noprof(), which only does

	if (nid == NUMA_NO_NODE)
		nid = numa_mem_id();

and applies no mempolicies.

I see this patch as analogous to your commit 1941b31482a6 ("Reenable NUMA
policy support in the slab allocator").

Am I missing something?
* Re: [PATCH 1/2] mm, slab: use frozen pages for large kmalloc
From: Harry Yoo @ 2025-05-30 4:17 UTC
To: Vlastimil Babka
Cc: Christoph Lameter, David Rientjes, Andrew Morton, Roman Gushchin,
Matthew Wilcox, linux-mm, linux-kernel
On Thu, May 29, 2025 at 10:56:26AM +0200, Vlastimil Babka wrote:
> Since slab pages are now frozen, it makes sense to have large kmalloc()
> objects behave the same as small kmalloc() ones, as the choice between the
> two is an implementation detail depending on the allocation size.
>
> Notably, increasing the refcount on a slab page containing a kmalloc()
> object is no longer possible, so the behavior should be consistent for
> large kmalloc pages.
>
> Therefore, change large kmalloc to use the frozen pages API.
>
> Because of some unexpected fallout in the slab pages case (see commit
> b9c0e49abfca ("mm: decline to manipulate the refcount on a slab page")),
> implement the same kind of checks and warnings as part of this change.
>
> Notably, networking code using sendpage_ok() to determine whether the
> page refcount can be manipulated in the network stack should continue
> behaving correctly. Before this change, the function returns true for
> large kmalloc pages and the page refcount can be manipulated; after this
> change, it will return false.
>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> ---
Acked-by: Harry Yoo <harry.yoo@oracle.com>
--
Cheers,
Harry / Hyeonggon
> include/linux/mm.h | 4 +++-
> mm/slub.c | 7 +++++--
> 2 files changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index bf55206935c467f7508e863332063bb15f904a24..d3eb6adf9fa949fbd611470182a03c743b16aac7 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1549,6 +1549,8 @@ static inline void get_page(struct page *page)
> struct folio *folio = page_folio(page);
> if (WARN_ON_ONCE(folio_test_slab(folio)))
> return;
> + if (WARN_ON_ONCE(folio_test_large_kmalloc(folio)))
> + return;
> folio_get(folio);
> }
>
> @@ -1643,7 +1645,7 @@ static inline void put_page(struct page *page)
> {
> struct folio *folio = page_folio(page);
>
> - if (folio_test_slab(folio))
> + if (folio_test_slab(folio) || folio_test_large_kmalloc(folio))
> return;
>
> folio_put(folio);
> diff --git a/mm/slub.c b/mm/slub.c
> index dc9e729e1d269b5d362cb5bc44f824640ffd00f3..d7a62063a1676a327e13536bf724f0160f1fc8dc 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -4281,8 +4281,11 @@ static void *___kmalloc_large_node(size_t size, gfp_t flags, int node)
> if (unlikely(flags & GFP_SLAB_BUG_MASK))
> flags = kmalloc_fix_flags(flags);
>
> + if (node == NUMA_NO_NODE)
> + node = numa_mem_id();
> +
> flags |= __GFP_COMP;
> - folio = (struct folio *)alloc_pages_node_noprof(node, flags, order);
> + folio = (struct folio *)__alloc_frozen_pages_noprof(flags, order, node, NULL);
> if (folio) {
> ptr = folio_address(folio);
> lruvec_stat_mod_folio(folio, NR_SLAB_UNRECLAIMABLE_B,
> @@ -4778,7 +4781,7 @@ static void free_large_kmalloc(struct folio *folio, void *object)
> lruvec_stat_mod_folio(folio, NR_SLAB_UNRECLAIMABLE_B,
> -(PAGE_SIZE << order));
> __folio_clear_large_kmalloc(folio);
> - folio_put(folio);
> + free_frozen_pages(&folio->page, order);
> }
>
> /*
>
> --
> 2.49.0
>
* Re: [PATCH 2/2] mm, slab: support NUMA policy for large kmalloc
From: Harry Yoo @ 2025-05-30 7:11 UTC
To: Vlastimil Babka
Cc: Christoph Lameter, David Rientjes, Andrew Morton, Roman Gushchin,
Matthew Wilcox, linux-mm, linux-kernel
On Thu, May 29, 2025 at 10:56:27AM +0200, Vlastimil Babka wrote:
> The slab allocator observes the task's numa policy in various places
> such as allocating slab pages. Large kmalloc allocations currently do
> not, which seems to be an unintended omission. It is simple to correct
> that, so make ___kmalloc_large_node() behave the same way as
> alloc_slab_page().
>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> ---
Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
--
Cheers,
Harry / Hyeonggon
> mm/slub.c | 8 +++++---
> 1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/mm/slub.c b/mm/slub.c
> index d7a62063a1676a327e13536bf724f0160f1fc8dc..d87015fad2df65629050d9bcd224facd3d2f4033 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -4281,11 +4281,13 @@ static void *___kmalloc_large_node(size_t size, gfp_t flags, int node)
> if (unlikely(flags & GFP_SLAB_BUG_MASK))
> flags = kmalloc_fix_flags(flags);
>
> + flags |= __GFP_COMP;
> +
> if (node == NUMA_NO_NODE)
> - node = numa_mem_id();
> + folio = (struct folio *)alloc_frozen_pages_noprof(flags, order);
> + else
> + folio = (struct folio *)__alloc_frozen_pages_noprof(flags, order, node, NULL);
>
> - flags |= __GFP_COMP;
> - folio = (struct folio *)__alloc_frozen_pages_noprof(flags, order, node, NULL);
> if (folio) {
> ptr = folio_address(folio);
> lruvec_stat_mod_folio(folio, NR_SLAB_UNRECLAIMABLE_B,
>
> --
> 2.49.0
>
* Re: [PATCH 1/2] mm, slab: use frozen pages for large kmalloc
From: Matthew Wilcox @ 2025-05-30 16:59 UTC
To: Roman Gushchin
Cc: Vlastimil Babka, Christoph Lameter, David Rientjes, Andrew Morton,
Harry Yoo, linux-mm, linux-kernel
On Thu, May 29, 2025 at 03:46:26PM +0000, Roman Gushchin wrote:
> > @@ -1549,6 +1549,8 @@ static inline void get_page(struct page *page)
> > struct folio *folio = page_folio(page);
> > if (WARN_ON_ONCE(folio_test_slab(folio)))
> > return;
> > + if (WARN_ON_ONCE(folio_test_large_kmalloc(folio)))
> > + return;
> > folio_get(folio);
>
> I guess eventually we can convert them to VM_WARN_ON_ONCE()?
Eventually they go away.
* Re: [PATCH 2/2] mm, slab: support NUMA policy for large kmalloc
From: Christoph Lameter (Ampere) @ 2025-05-30 19:05 UTC
To: Vlastimil Babka
Cc: David Rientjes, Andrew Morton, Roman Gushchin, Harry Yoo,
Matthew Wilcox, linux-mm, linux-kernel
On Thu, 29 May 2025, Vlastimil Babka wrote:
> On 5/29/25 16:57, Christoph Lameter (Ampere) wrote:
> > On Thu, 29 May 2025, Vlastimil Babka wrote:
> >
> >> The slab allocator observes the task's numa policy in various places
> >> such as allocating slab pages. Large kmalloc allocations currently do
> >> not, which seems to be an unintended omission. It is simple to correct
> >> that, so make ___kmalloc_large_node() behave the same way as
> >> alloc_slab_page().
> >
> > Large kmalloc allocations lead to the use of the page allocator, which
> > implements the NUMA policies for the allocations.
> >
> > This patch is not necessary.
>
> I'm confused, as that's only true depending on which page allocator entry
> point you use. AFAICS before this series, it's using
> alloc_pages_node_noprof() which only does
>
>
> if (nid == NUMA_NO_NODE)
> nid = numa_mem_id();
>
> and no mempolicies.
That is a bug.
> I see this patch as analogical to your commit 1941b31482a6 ("Reenable NUMA
> policy support in the slab allocator")
>
> Am I missing something?
The page allocator has its own NUMA support.
The patch to reenable NUMA support dealt with an issue within the
allocator where the memory policies were ignored.
It seems that the error was repeated for large kmalloc allocations.
Instead of respecting memory allocation policies, the allocation is forced
to be local to the node.

Forcing allocations to a specific node is possible with __GFP_THISNODE; the
default needs to be to follow memory policies.
* Re: [PATCH 2/2] mm, slab: support NUMA policy for large kmalloc
From: Vlastimil Babka @ 2025-06-02 8:24 UTC
To: Christoph Lameter (Ampere)
Cc: David Rientjes, Andrew Morton, Roman Gushchin, Harry Yoo,
Matthew Wilcox, linux-mm, linux-kernel
On 5/30/25 21:05, Christoph Lameter (Ampere) wrote:
> On Thu, 29 May 2025, Vlastimil Babka wrote:
>
>> On 5/29/25 16:57, Christoph Lameter (Ampere) wrote:
>> > On Thu, 29 May 2025, Vlastimil Babka wrote:
>> >
>> >> The slab allocator observes the task's numa policy in various places
>> >> such as allocating slab pages. Large kmalloc allocations currently do
>> >> not, which seems to be an unintended omission. It is simple to correct
>> >> that, so make ___kmalloc_large_node() behave the same way as
>> >> alloc_slab_page().
>> >
>> > Large kmalloc allocations lead to the use of the page allocator, which
>> > implements the NUMA policies for the allocations.
>> >
>> > This patch is not necessary.
>>
>> I'm confused, as that's only true depending on which page allocator entry
>> point you use. AFAICS before this series, it's using
>> alloc_pages_node_noprof(), which only does
>>
>> 	if (nid == NUMA_NO_NODE)
>> 		nid = numa_mem_id();
>>
>> and applies no mempolicies.
>
> That is a bug.
>
>> I see this patch as analogous to your commit 1941b31482a6 ("Reenable NUMA
>> policy support in the slab allocator").
>>
>> Am I missing something?
>
> The page allocator has its own NUMA support.

It has support for respecting a preferred node (or a forced node with
__GFP_THISNODE), a nodemask, and cpusets.

Mempolicy support is arguably outside the page allocator itself -
alloc_pages_mpol() lives in mm/mempolicy.c. Some generic-looking wrappers
such as alloc_pages() lead to it, while others (__alloc_pages()) don't.
It's kind of a mess.
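Roughly, my own simplified summary of the call paths:

	alloc_pages(gfp, order)
	  -> alloc_pages_mpol() in mm/mempolicy.c, respects the task mempolicy
	alloc_pages_node(nid, gfp, order) / __alloc_pages(gfp, order, nid, nodemask)
	  -> preferred node / nodemask / cpusets only, no mempolicy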
> The patch to reenable NUMA support dealt with an issue within the
> allocator where the memory policies were ignored.
I'd agree in the sense that the issue was within the slab allocator, which
called the non-mempolicy-aware page allocator entry point; that was not
intended.
> It seems that the error was repeated for large kmalloc allocations.
After some digging, it seems you're right and the error was introduced in
commit c4cab557521a ("mm/slab_common: cleanup kmalloc_large()") in v6.1.
I'll add a Fixes: tag and reword the changelog accordingly.
> Instead of respecting memory allocation policies, the allocation is forced
> to be local to the node.
It's not forced but preferred, unless the kmalloc() caller itself passes
__GFP_THISNODE. The slab layer doesn't add it.
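(A caller that really wants node-local-or-fail memory would have to do
something like, hypothetically:

	buf = kmalloc_node(size, GFP_KERNEL | __GFP_THISNODE, nid);

and the slab layer just passes the flag through to the page allocator.)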
> Forcing allocations to a specific node is possible with __GFP_THISNODE;
> the default needs to be to follow memory policies.
>