Linux-mm Archive on lore.kernel.org
* [PATCH] block: partitions: replace __get_free_page() with kmalloc()
@ 2026-05-20  8:15 Mike Rapoport (Microsoft)
  2026-05-25  6:08 ` Christoph Hellwig
  0 siblings, 1 reply; 9+ messages in thread
From: Mike Rapoport (Microsoft) @ 2026-05-20  8:15 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Mike Rapoport, linux-block, linux-kernel, linux-mm

check_partition() allocates a buffer to use as the backing buffer for a
seq_buf.

This buffer can be allocated with kmalloc(), as nothing about it requires
going directly to the page allocator.

Replace use of __get_free_page() with kmalloc() and free_page() with
kfree().

Link: https://lore.kernel.org/all/635405e4-9423-4a25-a6e7-e03c8ea0bcbe@redhat.com
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
---
This is a (tiny) part of larger work of replacing page allocator calls
with kmalloc:

Also in git:
https://git.kernel.org/pub/scm/linux/kernel/git/rppt/linux.git gfp-to-kmalloc/block

Signed-off-by: Mike Rapoport <rppt@kernel.org>
---
 block/partitions/core.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/block/partitions/core.c b/block/partitions/core.c
index 5d5332ce586b..b5c59b79ca7c 100644
--- a/block/partitions/core.c
+++ b/block/partitions/core.c
@@ -124,7 +124,7 @@ static struct parsed_partitions *check_partition(struct gendisk *hd)
 	state = allocate_partitions(hd);
 	if (!state)
 		return NULL;
-	state->pp_buf.buffer = (char *)__get_free_page(GFP_KERNEL);
+	state->pp_buf.buffer = kmalloc(PAGE_SIZE, GFP_KERNEL);
 	if (!state->pp_buf.buffer) {
 		free_partitions(state);
 		return NULL;
@@ -154,7 +154,7 @@ static struct parsed_partitions *check_partition(struct gendisk *hd)
 	if (res > 0) {
 		printk(KERN_INFO "%s", seq_buf_str(&state->pp_buf));
 
-		free_page((unsigned long)state->pp_buf.buffer);
+		kfree(state->pp_buf.buffer);
 		return state;
 	}
 	if (state->access_beyond_eod)
@@ -170,7 +170,7 @@ static struct parsed_partitions *check_partition(struct gendisk *hd)
 		printk(KERN_INFO "%s", seq_buf_str(&state->pp_buf));
 	}
 
-	free_page((unsigned long)state->pp_buf.buffer);
+	kfree(state->pp_buf.buffer);
 	free_partitions(state);
 	return ERR_PTR(res);
 }

---
base-commit: 5d6919055dec134de3c40167a490f33c74c12581
change-id: 20260520-block-25582753fd38

Best regards,
--  
Sincerely yours,
Mike.
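
For readers outside the kernel tree, the pattern the patch switches to can
be sketched in userspace. This is only an illustration under stated
assumptions: malloc()/free() stand in for kmalloc()/kfree(), and
DEMO_PAGE_SIZE and every demo_* name are hypothetical, not kernel API. The
point it shows is that the buffer is handled as a plain pointer, with no
cast to or from unsigned long.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for PAGE_SIZE; real page sizes vary by arch. */
#define DEMO_PAGE_SIZE 4096

/* Simplified stand-in for the kernel's struct seq_buf. */
struct demo_seq_buf {
	char *buffer;
	size_t size;
	size_t len;
};

/* Allocate the backing buffer; the allocator returns a usable pointer. */
static int demo_buf_init(struct demo_seq_buf *s)
{
	s->buffer = malloc(DEMO_PAGE_SIZE); /* kernel: kmalloc(PAGE_SIZE, GFP_KERNEL) */
	if (!s->buffer)
		return -1;
	s->size = DEMO_PAGE_SIZE;
	s->len = 0;
	s->buffer[0] = '\0';
	return 0;
}

/* Append a string, truncating if the buffer is full. */
static void demo_buf_puts(struct demo_seq_buf *s, const char *str)
{
	assert(s->buffer && s->len < s->size);
	int n = snprintf(s->buffer + s->len, s->size - s->len, "%s", str);
	if (n > 0) {
		size_t room = s->size - s->len - 1;
		s->len += (size_t)n < room ? (size_t)n : room;
	}
}

/* Release the buffer; no cast is needed on this side either. */
static void demo_buf_free(struct demo_seq_buf *s)
{
	free(s->buffer); /* kernel: kfree(state->pp_buf.buffer) */
	s->buffer = NULL;
	s->size = s->len = 0;
}
```

A caller would run demo_buf_init(), append partition names with
demo_buf_puts(), print the result, and finish with demo_buf_free() on every
exit path, mirroring the two kfree() sites in the diff.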




* Re: [PATCH] block: partitions: replace __get_free_page() with kmalloc()
  2026-05-20  8:15 [PATCH] block: partitions: replace __get_free_page() with kmalloc() Mike Rapoport (Microsoft)
@ 2026-05-25  6:08 ` Christoph Hellwig
  2026-05-25  6:52   ` Mike Rapoport
  0 siblings, 1 reply; 9+ messages in thread
From: Christoph Hellwig @ 2026-05-25  6:08 UTC (permalink / raw)
  To: Mike Rapoport (Microsoft); +Cc: Jens Axboe, linux-block, linux-kernel, linux-mm

On Wed, May 20, 2026 at 11:15:52AM +0300, Mike Rapoport (Microsoft) wrote:
> check_partition() allocates a buffer to use as backing buffer for
> seq_buf.
> 
> This buffer can be allocated with kmalloc() as there's nothing special
> about it to go directly to the page allocator.
> 
> Replace use of __get_free_page() with kmalloc() and free_page() with
> kfree().

So I heard various vague references that we should replace
__get_free_page with kmalloc, but nothing definitive.  Can you please
point to a good resource for that?




* Re: [PATCH] block: partitions: replace __get_free_page() with kmalloc()
  2026-05-25  6:08 ` Christoph Hellwig
@ 2026-05-25  6:52   ` Mike Rapoport
  2026-05-25  7:16     ` Christoph Hellwig
  0 siblings, 1 reply; 9+ messages in thread
From: Mike Rapoport @ 2026-05-25  6:52 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Jens Axboe, linux-block, linux-kernel, linux-mm

On Sun, May 24, 2026 at 11:08:31PM -0700, Christoph Hellwig wrote:
> On Wed, May 20, 2026 at 11:15:52AM +0300, Mike Rapoport (Microsoft) wrote:
> > check_partition() allocates a buffer to use as backing buffer for
> > seq_buf.
> > 
> > This buffer can be allocated with kmalloc() as there's nothing special
> > about it to go directly to the page allocator.
> > 
> > Replace use of __get_free_page() with kmalloc() and free_page() with
> > kfree().
> 
> So I heard various vague references that we should replace
> __get_free_page with kmalloc, but nothing definitive.  Can you please
> point to a good resource for that?

There was a fairly recent discussion when I posted patches that change
__get_free_page to return void *:

https://lore.kernel.org/all/20251018093002.3660549-1-rppt@kernel.org/

And an old thread when Al posted similar patches:

https://lore.kernel.org/all/CA+55aFwp4iy4rtX2gE2WjBGFL=NxMVnoFeHqYa2j1dYOMMGqxg@mail.gmail.com/T/#u

-- 
Sincerely yours,
Mike.



* Re: [PATCH] block: partitions: replace __get_free_page() with kmalloc()
  2026-05-25  6:52   ` Mike Rapoport
@ 2026-05-25  7:16     ` Christoph Hellwig
  2026-05-25  9:35       ` Mike Rapoport
  0 siblings, 1 reply; 9+ messages in thread
From: Christoph Hellwig @ 2026-05-25  7:16 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Christoph Hellwig, Jens Axboe, linux-block, linux-kernel,
	linux-mm

On Mon, May 25, 2026 at 09:52:09AM +0300, Mike Rapoport wrote:
> On Sun, May 24, 2026 at 11:08:31PM -0700, Christoph Hellwig wrote:
> > On Wed, May 20, 2026 at 11:15:52AM +0300, Mike Rapoport (Microsoft) wrote:
> > > check_partition() allocates a buffer to use as backing buffer for
> > > seq_buf.
> > > 
> > > This buffer can be allocated with kmalloc() as there's nothing special
> > > about it to go directly to the page allocator.
> > > 
> > > Replace use of __get_free_page() with kmalloc() and free_page() with
> > > kfree().
> > 
> > So I heard various vague references that we should replace
> > __get_free_page with kmalloc, but nothing definitive.  Can you please
> > point to a good resource for that?
> 
> There was quite recent discussion when I posted patches that change
> __get_free_page to return void *:
> 
> https://lore.kernel.org/all/20251018093002.3660549-1-rppt@kernel.org/

This doesn't tell much more.

> And an old thread when Al posted similar patches:
> 
> https://lore.kernel.org/all/CA+55aFwp4iy4rtX2gE2WjBGFL=NxMVnoFeHqYa2j1dYOMMGqxg@mail.gmail.com/T/#u

This does, but it still fails to explain why kmalloc performs just as
well as __get_free_page(s) these days.

And such an explanation or link to it should go into every patch doing
this switch.




* Re: [PATCH] block: partitions: replace __get_free_page() with kmalloc()
  2026-05-25  7:16     ` Christoph Hellwig
@ 2026-05-25  9:35       ` Mike Rapoport
  2026-05-26  6:27         ` Christoph Hellwig
  2026-05-26 12:07         ` Vlastimil Babka
  0 siblings, 2 replies; 9+ messages in thread
From: Mike Rapoport @ 2026-05-25  9:35 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Jens Axboe, linux-block, linux-kernel, linux-mm

On Mon, May 25, 2026 at 12:16:23AM -0700, Christoph Hellwig wrote:
> On Mon, May 25, 2026 at 09:52:09AM +0300, Mike Rapoport wrote:
> > On Sun, May 24, 2026 at 11:08:31PM -0700, Christoph Hellwig wrote:
> > > On Wed, May 20, 2026 at 11:15:52AM +0300, Mike Rapoport (Microsoft) wrote:
> > > > check_partition() allocates a buffer to use as backing buffer for
> > > > seq_buf.
> > > > 
> > > > This buffer can be allocated with kmalloc() as there's nothing special
> > > > about it to go directly to the page allocator.
> > > > 
> > > > Replace use of __get_free_page() with kmalloc() and free_page() with
> > > > kfree().
> > > 
> > > So I heard various vague references that we should replace
> > > __get_free_page with kmalloc, but nothing definitive.  Can you please
> > > point to a good resource for that?
> > 
> > There was quite recent discussion when I posted patches that change
> > __get_free_page to return void *:
> > 
> > https://lore.kernel.org/all/20251018093002.3660549-1-rppt@kernel.org/
> 
> This doesn't tell much more.
> 
> > And an old thread when Al posted similar patches:
> > 
> > https://lore.kernel.org/all/CA+55aFwp4iy4rtX2gE2WjBGFL=NxMVnoFeHqYa2j1dYOMMGqxg@mail.gmail.com/T/#u
> 
> This does, but it still fails to explain why kmalloc performs just as
> well as __get_free_page(s) these days.

I don't think that in this case - a single allocation on the cold path -
the performance difference is even measurable.

Nevertheless, allocations from slab caches are way faster than
__get_free_page() (i.e. alloc_pages()), as the fast path is essentially a
lockless cmpxchg. Allocations that need to refill the cache call
alloc_pages() with a little slab bookkeeping overhead.

-- 
Sincerely yours,
Mike.
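
The fast-path versus refill split described above can be illustrated with a
toy userspace model. This is a sketch only, not how the kernel slab
allocator is implemented: it is single-threaded with no locking or cmpxchg,
malloc() stands in for alloc_pages(), backing pages are never returned, and
every toy_* name and TOY_* constant is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_OBJ_SIZE  256
#define TOY_PAGE_SIZE 4096
#define TOY_PER_PAGE  (TOY_PAGE_SIZE / TOY_OBJ_SIZE)

struct toy_cache {
	void *free_objs[TOY_PER_PAGE]; /* stack of ready-to-use objects */
	size_t nr_free;
	unsigned long refills;         /* how often the slow path ran */
};

/* Slow path: take one "page" from the backing allocator and carve it up.
 * Only called when the cache is empty, so the stack cannot overflow. */
static int toy_refill(struct toy_cache *c)
{
	char *page = malloc(TOY_PAGE_SIZE); /* stands in for alloc_pages() */

	if (!page)
		return -1;
	for (size_t i = 0; i < TOY_PER_PAGE; i++)
		c->free_objs[c->nr_free++] = page + i * TOY_OBJ_SIZE;
	c->refills++;
	return 0;
}

/* Fast path: pop a cached object; fall back to a refill when empty. */
static void *toy_alloc(struct toy_cache *c)
{
	if (c->nr_free == 0 && toy_refill(c) < 0)
		return NULL;
	return c->free_objs[--c->nr_free];
}

/* Free: push the object back. The toy has no overflow handling beyond
 * the assertion, which a real allocator would need. */
static void toy_free(struct toy_cache *c, void *obj)
{
	assert(c->nr_free < TOY_PER_PAGE);
	c->free_objs[c->nr_free++] = obj;
}
```

In this model only one allocation in TOY_PER_PAGE reaches the backing
allocator; the rest are array pops. Whether that translates into a
measurable difference for a given kernel path is a separate question, as
the replies in this thread discuss.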



* Re: [PATCH] block: partitions: replace __get_free_page() with kmalloc()
  2026-05-25  9:35       ` Mike Rapoport
@ 2026-05-26  6:27         ` Christoph Hellwig
  2026-05-26 12:07         ` Vlastimil Babka
  1 sibling, 0 replies; 9+ messages in thread
From: Christoph Hellwig @ 2026-05-26  6:27 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Christoph Hellwig, Jens Axboe, linux-block, linux-kernel,
	linux-mm

On Mon, May 25, 2026 at 12:35:56PM +0300, Mike Rapoport wrote:
> > This does, but it still fails to explain why kmalloc performs just as
> > well as __get_free_page(s) these days.
> 
> I don't think that in this case - a single allocation on the cold path -
> the performance difference is even measurable.

Well, please state that.

> Nevertheless allocations from slab caches are way faster than
> __get_free_page() (i.e.  alloc_pages()) as it's essentially lockless
> cmpxchg. Allocations that need to refill the cache do alloc_pages() with a
> little of slab bookkeeping overhead.

Please state that too.




* Re: [PATCH] block: partitions: replace __get_free_page() with kmalloc()
  2026-05-25  9:35       ` Mike Rapoport
  2026-05-26  6:27         ` Christoph Hellwig
@ 2026-05-26 12:07         ` Vlastimil Babka
  2026-05-26 14:37           ` Matthew Wilcox
  1 sibling, 1 reply; 9+ messages in thread
From: Vlastimil Babka @ 2026-05-26 12:07 UTC (permalink / raw)
  To: Mike Rapoport, Christoph Hellwig, Matthew Wilcox
  Cc: Jens Axboe, linux-block, linux-kernel, linux-mm

On 5/25/26 11:35 AM, Mike Rapoport wrote:
> On Mon, May 25, 2026 at 12:16:23AM -0700, Christoph Hellwig wrote:
>>
>> This does, but it still fails to explain why kmalloc performs just as
>> well as __get_free_page(s) these days.
> 
> I don't think that in this case - a single allocation on the cold path -
> the performance difference is even measurable.
> 
> Nevertheless allocations from slab caches are way faster than
> __get_free_page() (i.e.  alloc_pages()) as it's essentially lockless
> cmpxchg. Allocations that need to refill the cache do alloc_pages() with a

Probably not "way faster" but the fast path is quite similar - percpu
pcplist protected by spin_trylock (pages) vs sheaves with local_trylock
(slab), should slightly favour slab because spinlocks are typically not
inlined and local_trylock is.

The main reasons for switching AFAIU would be related to the
folio/memdesc conversions? If one needs just a kernel memory buffer,
kmalloc() it is, even if it happens to be page size. Page allocator
should be only used if you need e.g. the refcounting or anything else
that struct page provides. But then in some cases the memdesc conversion
would need adjustments at some point. With kmalloc() we can forget about
this user.

Matthew can probably state it better or even link to something
authoritative?

> little of slab bookkeeping overhead.
> 




* Re: [PATCH] block: partitions: replace __get_free_page() with kmalloc()
  2026-05-26 12:07         ` Vlastimil Babka
@ 2026-05-26 14:37           ` Matthew Wilcox
  2026-05-26 20:57             ` Vlastimil Babka
  0 siblings, 1 reply; 9+ messages in thread
From: Matthew Wilcox @ 2026-05-26 14:37 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Mike Rapoport, Christoph Hellwig, Jens Axboe, linux-block,
	linux-kernel, linux-mm

On Tue, May 26, 2026 at 02:07:36PM +0200, Vlastimil Babka wrote:
> The main reasons for switching AFAIU would be related with the
> folio/memdesc conversions? If one needs just a kernel memory buffer,
> kmalloc() it is, even if it happens to be page size. Page allocator
> should be only used if you need e.g. the refcounting or anything else
> that struct page provides. But then in some cases the memdesc conversion
> would need adjustments at some point. With kmalloc() we can forget about
> this user.

No, I think this is unrelated to memdescs.

I've seen a few people say slightly wrong things about
folios/pages/memdescs recently, so let me try to clarify the end state.

I do not intend to get rid of the ability to allocate a bare page of
memory with something like alloc_pages() or get_free_page().  It's
just that the struct page associated with it will contain far less
information (because it's smaller).

https://kernelnewbies.org/MatthewWilcox/Memdescs has a bit more
information, but to distill it:

You get a u64 worth of data (technically one per page, but if you
allocate multiple pages, they're all going to be the same).
Bits 0-3 will be type 0 (to indicate that it has no memdesc).  
Bits 4-10 will be subtype 2 (to indicate no information about owner).
Bit 11 will be clear to indicate that this page should not be mappable
to userspace.
Bits 12-17 will store the allocation order.
The top few bits will encode zone/node/section like page->flags
do today.

That doesn't leave many free bits for the user, but that's OK because
most allocations don't actually need any bits in struct page.  If you do
want something like a refcount or list_head, see the "Managed memory"
section on that page.  If you actually want a full-fat folio, well,
allocate a folio, not a page.
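
The bit layout listed above can be written down as a small encode/decode
sketch. It assumes exactly the field positions given in this message (type
in bits 0-3, subtype in bits 4-10, the userspace-mappable flag in bit 11,
order in bits 12-17) and leaves out the top zone/node/section bits, whose
widths are not stated here. All md_* names and MD_* constants are
hypothetical, and the layout is a described end state that may still
change.

```c
#include <assert.h>
#include <stdint.h>

/* Field positions as listed in the message; names are hypothetical. */
#define MD_TYPE_SHIFT    0
#define MD_TYPE_MASK     0xfULL   /* bits 0-3 */
#define MD_SUBTYPE_SHIFT 4
#define MD_SUBTYPE_MASK  0x7fULL  /* bits 4-10 */
#define MD_MAPPABLE_BIT  11       /* bit 11 */
#define MD_ORDER_SHIFT   12
#define MD_ORDER_MASK    0x3fULL  /* bits 12-17 */

/* Build the word for a bare page: type 0 (no memdesc), subtype 2 (no
 * owner information), mappable bit clear, plus the allocation order. */
static uint64_t md_bare_page(unsigned int order)
{
	assert(order <= MD_ORDER_MASK);
	return (0ULL << MD_TYPE_SHIFT) |
	       (2ULL << MD_SUBTYPE_SHIFT) |
	       ((uint64_t)order << MD_ORDER_SHIFT);
}

static unsigned int md_type(uint64_t v)
{
	return (unsigned int)((v >> MD_TYPE_SHIFT) & MD_TYPE_MASK);
}

static unsigned int md_subtype(uint64_t v)
{
	return (unsigned int)((v >> MD_SUBTYPE_SHIFT) & MD_SUBTYPE_MASK);
}

static int md_user_mappable(uint64_t v)
{
	return (int)((v >> MD_MAPPABLE_BIT) & 1);
}

static unsigned int md_order(uint64_t v)
{
	return (unsigned int)((v >> MD_ORDER_SHIFT) & MD_ORDER_MASK);
}
```

For an order-3 allocation, md_bare_page(3) yields 0x3020 in the low 18
bits: subtype 2 contributes 0x20 and order 3 contributes 0x3000.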



* Re: [PATCH] block: partitions: replace __get_free_page() with kmalloc()
  2026-05-26 14:37           ` Matthew Wilcox
@ 2026-05-26 20:57             ` Vlastimil Babka
  0 siblings, 0 replies; 9+ messages in thread
From: Vlastimil Babka @ 2026-05-26 20:57 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Mike Rapoport, Christoph Hellwig, Jens Axboe, linux-block,
	linux-kernel, linux-mm

On 5/26/26 16:37, Matthew Wilcox wrote:
> On Tue, May 26, 2026 at 02:07:36PM +0200, Vlastimil Babka wrote:
>> The main reasons for switching AFAIU would be related with the
>> folio/memdesc conversions? If one needs just a kernel memory buffer,
>> kmalloc() it is, even if it happens to be page size. Page allocator
>> should be only used if you need e.g. the refcounting or anything else
>> that struct page provides. But then in some cases the memdesc conversion
>> would need adjustments at some point. With kmalloc() we can forget about
>> this user.
> 
> No, I think this is unrelated to memdescs.
> 
> I've seen a few people say slightly wrong things about
> folios/pages/memdescs recently, so let me try to clarify the end state.
> 
> I do not intend to get rid of the ability to allocate a bare page of
> memory with something like alloc_pages() or get_free_page().  It's
> just that the struct page associated with it will contain far less
> information (because it's smaller).

Alright, but isn't it still the case that if you don't need any of what
struct page provides today or will do in the future, it's better if you just
use kmalloc()? I thought you said so yourself?

https://lore.kernel.org/all/aPQxN7-FeFB6vTuv@casper.infradead.org/

So what exactly would your rationale for "Most of them shouldn't be using
get_free_pages() at all, they should be using kmalloc()." be?

> https://kernelnewbies.org/MatthewWilcox/Memdescs has a bit more
> information, but to distill it:
> 
> You get a u64 worth of data (technically one per page, but if you
> allocate multiple pages, they're all going to be the same).
> Bits 0-3 will be type 0 (to indicate that it has no memdesc).  
> Bits 4-10 will be subtype 2 (to indicate no information about owner).
> Bit 11 will be clear to indicate that this page should not be mappable
> to userspace.
> Bits 12-17 will store the allocation order.
> The top few bits will encode zone/node/section like page->flags
> do today.
> 
> That doesn't leave many free bits for the user, but that's OK because
> most allocations don't actually need any bits in struct page.  If you do
> want something like a refcount or list_head, see the "Managed memory"
> section on that page.  If you actually want a full-fat folio, well,
> allocate a folio, not a page.



