* [PATCH v2 0/2] frozen pages for large kmalloc
@ 2025-06-02 11:02 Vlastimil Babka
  2025-06-02 11:02 ` [PATCH v2 1/2] mm, slab: restore NUMA policy support " Vlastimil Babka
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Vlastimil Babka @ 2025-06-02 11:02 UTC
  To: Christoph Lameter, David Rientjes
  Cc: Andrew Morton, Roman Gushchin, Harry Yoo, Matthew Wilcox,
	linux-mm, linux-kernel, Vlastimil Babka

In [1] I suggested that we start warning about get_page() done on large
kmalloc pages as the first step towards converting them to frozen pages.
We exposed that to -next and indeed got such a warning [2]. But it turns
out the code is using sendpage_ok() and thus would avoid the get_page()
if the page was actually frozen.

So this version (in patch 2) freezes the large kmalloc pages at the same
time as adding the warnings and refusals to get_page()/put_page() on
them, the same as we do for slab pages.

While doing that I've noticed that large kmalloc doesn't observe NUMA
policies, while the rest of the allocator does. This turns out to be a
regression from v6.1, so I'm restoring that first in patch 1. There is
no Cc: stable as it's not fixing a critical bug, but we can submit it to
e.g. latest LTS afterwards.

Given the timing I would expose this to -next after the current merge
window closes, thus towards 6.17.

[1] https://lore.kernel.org/all/20250417074102.4543-2-vbabka@suse.cz/
[2] https://lore.kernel.org/all/202505221248.595a9117-lkp@intel.com/

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
Changes in v2:
- Reword commit log of patch 1 to acknowledge it's restoring pre-6.1
  behavior, add Fixes:
- Change the order of the two patches to allow possible backport of the
  NUMA fix.
- Link to v1: https://patch.msgid.link/20250529-frozen-pages-for-large-kmalloc-v1-0-b3aa52a8fa17@suse.cz

---
Vlastimil Babka (2):
      mm, slab: restore NUMA policy support for large kmalloc
      mm, slab: use frozen pages for large kmalloc

 include/linux/mm.h | 4 +++-
 mm/slub.c          | 9 +++++++--
 2 files changed, 10 insertions(+), 3 deletions(-)
---
base-commit: 9c32cda43eb78f78c73aee4aa344b777714e259b
change-id: 20250529-frozen-pages-for-large-kmalloc-bd4d2522e52b

Best regards,
-- 
Vlastimil Babka <vbabka@suse.cz>




* [PATCH v2 1/2] mm, slab: restore NUMA policy support for large kmalloc
  2025-06-02 11:02 [PATCH v2 0/2] frozen pages for large kmalloc Vlastimil Babka
@ 2025-06-02 11:02 ` Vlastimil Babka
  2025-06-02 16:52   ` Christoph Lameter (Ampere)
                     ` (2 more replies)
  2025-06-02 11:02 ` [PATCH v2 2/2] mm, slab: use frozen pages " Vlastimil Babka
  2025-06-17 10:01 ` [PATCH v2 0/2] " Vlastimil Babka
  2 siblings, 3 replies; 7+ messages in thread
From: Vlastimil Babka @ 2025-06-02 11:02 UTC
  To: Christoph Lameter, David Rientjes
  Cc: Andrew Morton, Roman Gushchin, Harry Yoo, Matthew Wilcox,
	linux-mm, linux-kernel, Vlastimil Babka

The slab allocator observes the task's NUMA policy in various places
such as allocating slab pages. Large kmalloc() allocations used to do
that too, until an unintended change by c4cab557521a ("mm/slab_common:
cleanup kmalloc_large()") resulted in ignoring mempolicy and just
preferring the local node. Restore the NUMA policy support.
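
To illustrate the behavior difference described above, here is a minimal
userspace sketch, not kernel code. It assumes a hypothetical round-robin
"interleave" policy as a stand-in for the task mempolicy; the names
model_policy, model_policy_node(), model_pick_node_before() and
model_pick_node_after() are invented for illustration only. It models
only which node is preferred when the caller passes NUMA_NO_NODE.

```c
#include <assert.h>

#define NUMA_NO_NODE (-1)

/* Hypothetical stand-in for a task mempolicy: round-robin interleave. */
struct model_policy {
	int nr_nodes;	/* number of nodes the policy spreads over */
	int next;	/* next node the policy will hand out */
};

/* Ask the (modelled) policy which node the next allocation should prefer. */
static int model_policy_node(struct model_policy *pol)
{
	int node;

	assert(pol->nr_nodes > 0);
	node = pol->next;
	pol->next = (pol->next + 1) % pol->nr_nodes;
	return node;
}

/*
 * Behavior after c4cab557521a, as described in the commit log:
 * NUMA_NO_NODE collapses to the local node and the policy is ignored.
 */
static int model_pick_node_before(struct model_policy *pol, int local_node,
				  int node)
{
	(void)pol;
	return node == NUMA_NO_NODE ? local_node : node;
}

/*
 * Behavior restored by this patch: NUMA_NO_NODE defers to the policy,
 * while an explicit node request is still passed through unchanged.
 */
static int model_pick_node_after(struct model_policy *pol, int local_node,
				 int node)
{
	(void)local_node;
	if (node == NUMA_NO_NODE)
		return model_policy_node(pol);
	return node;
}
```

With a two-node policy, repeated NUMA_NO_NODE requests in the "before"
model always pick the local node, whereas the "after" model alternates
between the nodes as the modelled policy dictates.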

Fixes: c4cab557521a ("mm/slab_common: cleanup kmalloc_large()")
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 mm/slub.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/mm/slub.c b/mm/slub.c
index dc9e729e1d269b5d362cb5bc44f824640ffd00f3..11356c701f9f857a2e8cf40bf963ac3abdb5e010 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4282,7 +4282,12 @@ static void *___kmalloc_large_node(size_t size, gfp_t flags, int node)
 		flags = kmalloc_fix_flags(flags);
 
 	flags |= __GFP_COMP;
-	folio = (struct folio *)alloc_pages_node_noprof(node, flags, order);
+
+	if (node == NUMA_NO_NODE)
+		folio = (struct folio *)alloc_pages_noprof(flags, order);
+	else
+		folio = (struct folio *)__alloc_pages_noprof(flags, order, node, NULL);
+
 	if (folio) {
 		ptr = folio_address(folio);
 		lruvec_stat_mod_folio(folio, NR_SLAB_UNRECLAIMABLE_B,

-- 
2.49.0




* [PATCH v2 2/2] mm, slab: use frozen pages for large kmalloc
  2025-06-02 11:02 [PATCH v2 0/2] frozen pages for large kmalloc Vlastimil Babka
  2025-06-02 11:02 ` [PATCH v2 1/2] mm, slab: restore NUMA policy support " Vlastimil Babka
@ 2025-06-02 11:02 ` Vlastimil Babka
  2025-06-17 10:01 ` [PATCH v2 0/2] " Vlastimil Babka
  2 siblings, 0 replies; 7+ messages in thread
From: Vlastimil Babka @ 2025-06-02 11:02 UTC
  To: Christoph Lameter, David Rientjes
  Cc: Andrew Morton, Roman Gushchin, Harry Yoo, Matthew Wilcox,
	linux-mm, linux-kernel, Vlastimil Babka

Since slab pages are now frozen, it makes sense to have large kmalloc()
objects behave the same as small kmalloc() objects, as the choice between
the two is an implementation detail depending on allocation size.

Notably, increasing the refcount on a slab page containing a kmalloc()
object is no longer possible, so large kmalloc pages should behave
consistently.

Therefore, change large kmalloc to use the frozen pages API.

Because of some unexpected fallout in the slab pages case (see commit
b9c0e49abfca ("mm: decline to manipulate the refcount on a slab page")),
implement the same kind of checks and warnings as part of this change.

Notably, networking code using sendpage_ok() to determine whether the
page refcount can be manipulated in the network stack should continue
behaving correctly. Before this change, the function returns true for
large kmalloc pages and page refcount can be manipulated. After this
change, the function will return false.
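
The interaction described above can be sketched in a small userspace
model, not kernel code. It assumes that sendpage_ok() amounts to "not a
slab page and refcount at least 1", and that a frozen page is one whose
refcount stays at 0; struct model_page and the model_*() helpers are
hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for struct page / struct folio. */
struct model_page {
	int refcount;		/* 0 models a frozen page */
	bool slab;		/* models folio_test_slab() */
	bool large_kmalloc;	/* models folio_test_large_kmalloc() */
};

/* Assumed shape of the sendpage_ok() check: not slab, refcount >= 1. */
static bool model_sendpage_ok(const struct model_page *page)
{
	return !page->slab && page->refcount >= 1;
}

/* Refuse to take a reference on slab or large kmalloc pages. */
static void model_get_page(struct model_page *page)
{
	if (page->slab || page->large_kmalloc)
		return;		/* the kernel would also WARN_ON_ONCE() here */
	page->refcount++;
}

/* Silently ignore put on slab or large kmalloc pages. */
static void model_put_page(struct model_page *page)
{
	if (page->slab || page->large_kmalloc)
		return;
	assert(page->refcount > 0);
	page->refcount--;
}
```

In this model a frozen large kmalloc page (refcount 0) fails
model_sendpage_ok(), so a caller that checks it first never reaches
model_get_page(); a caller that skips the check leaves the refcount
untouched.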

Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Acked-by: Harry Yoo <harry.yoo@oracle.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 include/linux/mm.h | 4 +++-
 mm/slub.c          | 6 +++---
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index bf55206935c467f7508e863332063bb15f904a24..d3eb6adf9fa949fbd611470182a03c743b16aac7 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1549,6 +1549,8 @@ static inline void get_page(struct page *page)
 	struct folio *folio = page_folio(page);
 	if (WARN_ON_ONCE(folio_test_slab(folio)))
 		return;
+	if (WARN_ON_ONCE(folio_test_large_kmalloc(folio)))
+		return;
 	folio_get(folio);
 }
 
@@ -1643,7 +1645,7 @@ static inline void put_page(struct page *page)
 {
 	struct folio *folio = page_folio(page);
 
-	if (folio_test_slab(folio))
+	if (folio_test_slab(folio) || folio_test_large_kmalloc(folio))
 		return;
 
 	folio_put(folio);
diff --git a/mm/slub.c b/mm/slub.c
index 11356c701f9f857a2e8cf40bf963ac3abdb5e010..d87015fad2df65629050d9bcd224facd3d2f4033 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4284,9 +4284,9 @@ static void *___kmalloc_large_node(size_t size, gfp_t flags, int node)
 	flags |= __GFP_COMP;
 
 	if (node == NUMA_NO_NODE)
-		folio = (struct folio *)alloc_pages_noprof(flags, order);
+		folio = (struct folio *)alloc_frozen_pages_noprof(flags, order);
 	else
-		folio = (struct folio *)__alloc_pages_noprof(flags, order, node, NULL);
+		folio = (struct folio *)__alloc_frozen_pages_noprof(flags, order, node, NULL);
 
 	if (folio) {
 		ptr = folio_address(folio);
@@ -4783,7 +4783,7 @@ static void free_large_kmalloc(struct folio *folio, void *object)
 	lruvec_stat_mod_folio(folio, NR_SLAB_UNRECLAIMABLE_B,
 			      -(PAGE_SIZE << order));
 	__folio_clear_large_kmalloc(folio);
-	folio_put(folio);
+	free_frozen_pages(&folio->page, order);
 }
 
 /*

-- 
2.49.0




* Re: [PATCH v2 1/2] mm, slab: restore NUMA policy support for large kmalloc
  2025-06-02 11:02 ` [PATCH v2 1/2] mm, slab: restore NUMA policy support " Vlastimil Babka
@ 2025-06-02 16:52   ` Christoph Lameter (Ampere)
  2025-06-02 21:54   ` Roman Gushchin
  2025-06-04  8:53   ` Harry Yoo
  2 siblings, 0 replies; 7+ messages in thread
From: Christoph Lameter (Ampere) @ 2025-06-02 16:52 UTC
  To: Vlastimil Babka
  Cc: David Rientjes, Andrew Morton, Roman Gushchin, Harry Yoo,
	Matthew Wilcox, linux-mm, linux-kernel

On Mon, 2 Jun 2025, Vlastimil Babka wrote:

> The slab allocator observes the task's NUMA policy in various places
> such as allocating slab pages. Large kmalloc() allocations used to do
> that too, until an unintended change by c4cab557521a ("mm/slab_common:
> cleanup kmalloc_large()") resulted in ignoring mempolicy and just
> preferring the local node. Restore the NUMA policy support.

Acked-by: Christoph Lameter (Ampere) <cl@gentwo.org>




* Re: [PATCH v2 1/2] mm, slab: restore NUMA policy support for large kmalloc
  2025-06-02 11:02 ` [PATCH v2 1/2] mm, slab: restore NUMA policy support " Vlastimil Babka
  2025-06-02 16:52   ` Christoph Lameter (Ampere)
@ 2025-06-02 21:54   ` Roman Gushchin
  2025-06-04  8:53   ` Harry Yoo
  2 siblings, 0 replies; 7+ messages in thread
From: Roman Gushchin @ 2025-06-02 21:54 UTC
  To: Vlastimil Babka
  Cc: Christoph Lameter, David Rientjes, Andrew Morton, Harry Yoo,
	Matthew Wilcox, linux-mm, linux-kernel

Vlastimil Babka <vbabka@suse.cz> writes:

> The slab allocator observes the task's NUMA policy in various places
> such as allocating slab pages. Large kmalloc() allocations used to do
> that too, until an unintended change by c4cab557521a ("mm/slab_common:
> cleanup kmalloc_large()") resulted in ignoring mempolicy and just
> preferring the local node. Restore the NUMA policy support.
>
> Fixes: c4cab557521a ("mm/slab_common: cleanup kmalloc_large()")
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>

Acked-by: Roman Gushchin <roman.gushchin@linux.dev>

Thanks!



* Re: [PATCH v2 1/2] mm, slab: restore NUMA policy support for large kmalloc
  2025-06-02 11:02 ` [PATCH v2 1/2] mm, slab: restore NUMA policy support " Vlastimil Babka
  2025-06-02 16:52   ` Christoph Lameter (Ampere)
  2025-06-02 21:54   ` Roman Gushchin
@ 2025-06-04  8:53   ` Harry Yoo
  2 siblings, 0 replies; 7+ messages in thread
From: Harry Yoo @ 2025-06-04  8:53 UTC
  To: Vlastimil Babka
  Cc: Christoph Lameter, David Rientjes, Andrew Morton, Roman Gushchin,
	Matthew Wilcox, linux-mm, linux-kernel

On Mon, Jun 02, 2025 at 01:02:12PM +0200, Vlastimil Babka wrote:
> The slab allocator observes the task's NUMA policy in various places
> such as allocating slab pages. Large kmalloc() allocations used to do
> that too, until an unintended change by c4cab557521a ("mm/slab_common:
> cleanup kmalloc_large()") resulted in ignoring mempolicy and just
> preferring the local node. Restore the NUMA policy support.
> 
> Fixes: c4cab557521a ("mm/slab_common: cleanup kmalloc_large()")

Oops, I broke it unintentionally :(

> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> ---

Reviewed-by: Harry Yoo <harry.yoo@oracle.com>

Thanks for fixing!

-- 
Cheers,
Harry / Hyeonggon

>  mm/slub.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/slub.c b/mm/slub.c
> index dc9e729e1d269b5d362cb5bc44f824640ffd00f3..11356c701f9f857a2e8cf40bf963ac3abdb5e010 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -4282,7 +4282,12 @@ static void *___kmalloc_large_node(size_t size, gfp_t flags, int node)
>  		flags = kmalloc_fix_flags(flags);
>  
>  	flags |= __GFP_COMP;
> -	folio = (struct folio *)alloc_pages_node_noprof(node, flags, order);
> +
> +	if (node == NUMA_NO_NODE)
> +		folio = (struct folio *)alloc_pages_noprof(flags, order);
> +	else
> +		folio = (struct folio *)__alloc_pages_noprof(flags, order, node, NULL);
> +
>  	if (folio) {
>  		ptr = folio_address(folio);
>  		lruvec_stat_mod_folio(folio, NR_SLAB_UNRECLAIMABLE_B,
> 
> -- 
> 2.49.0



* Re: [PATCH v2 0/2] frozen pages for large kmalloc
  2025-06-02 11:02 [PATCH v2 0/2] frozen pages for large kmalloc Vlastimil Babka
  2025-06-02 11:02 ` [PATCH v2 1/2] mm, slab: restore NUMA policy support " Vlastimil Babka
  2025-06-02 11:02 ` [PATCH v2 2/2] mm, slab: use frozen pages " Vlastimil Babka
@ 2025-06-17 10:01 ` Vlastimil Babka
  2 siblings, 0 replies; 7+ messages in thread
From: Vlastimil Babka @ 2025-06-17 10:01 UTC
  To: Christoph Lameter, David Rientjes
  Cc: Andrew Morton, Roman Gushchin, Harry Yoo, Matthew Wilcox,
	linux-mm, linux-kernel

On 6/2/25 13:02, Vlastimil Babka wrote:
> In [1] I have suggested we start warning for get_page() done on large
> kmalloc pages as the first step to convert them to frozen pages.
> We exposed to -next and indeed got such a warning [2]. But it turns out
> the code is using sendpage_ok() and thus would avoid the get_page() if
> the page was actually frozen.
> 
> So in this version, freeze the large kmalloc pages at the same time as
> adding the warnings and refusals to get_page/put_page() on large kmalloc
> pages, the same as we do for slab pages - in patch 2.
> 
> While doing that I've noticed that large kmalloc doesn't observe NUMA
> policies, while the rest of the allocator does. This turns out to be a
> regression from v6.1, so I'm restoring that first in patch 1. There is
> no Cc: stable as it's not fixing a critical bug, but we can submit it to
> e.g. latest LTS afterwards.
> 
> Given the timing I would expose this to -next after the current merge
> window closes, thus towards 6.17.

It's now in slab/for-next. Went with Cc: stable for the first patch in the
end. Thanks for the reviews!

> [1] https://lore.kernel.org/all/20250417074102.4543-2-vbabka@suse.cz/
> [2] https://lore.kernel.org/all/202505221248.595a9117-lkp@intel.com/
> 
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> ---
> Changes in v2:
> - Reword commit log of patch 1 to acknowledge it's restoring pre-6.1
>   behavior, add Fixes:
> - Change the order of the two patches to allow possible backport of the
>   NUMA fix.
> - Link to v1: https://patch.msgid.link/20250529-frozen-pages-for-large-kmalloc-v1-0-b3aa52a8fa17@suse.cz
> 
> ---
> Vlastimil Babka (2):
>       mm, slab: restore NUMA policy support for large kmalloc
>       mm, slab: use frozen pages for large kmalloc
> 
>  include/linux/mm.h | 4 +++-
>  mm/slub.c          | 9 +++++++--
>  2 files changed, 10 insertions(+), 3 deletions(-)
> ---
> base-commit: 9c32cda43eb78f78c73aee4aa344b777714e259b
> change-id: 20250529-frozen-pages-for-large-kmalloc-bd4d2522e52b
> 
> Best regards,



