All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mike Rapoport <rppt@kernel.org>
To: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: linux-mm@kvack.org
Subject: Re: [PATCH v2] mm: Fix memblock_free_late() when using deferred struct page
Date: Fri, 20 Feb 2026 11:00:15 +0200	[thread overview]
Message-ID: <aZgin4snWf0RD98X@kernel.org> (raw)
In-Reply-To: <39289588fddb4844264546cd103ba4595430f313.camel@kernel.crashing.org>

On Fri, Feb 20, 2026 at 09:46:50AM +1100, Benjamin Herrenschmidt wrote:
> On Thu, 2026-02-19 at 12:16 +0200, Mike Rapoport wrote:
> > 
> > Let's split it. EFI does weird things with memory already, like mremapping
> > normal memory for example.
> 
> Yup.
> 
> > Here's my take on the split. Lightly tested on qemu and recovered ~45M of
> > ram with the OVMF version I have :)
> 
> Nice :-) I'll test this here.
> 
> > > 
> > +struct efi_freeable_range {
> > +	u64 start;
> > +	u64 end;
> > +};
> > 
> 
> Haha, you went the blunt way :-) I was trying to avoid creating yet-
> another structure with "start/end" :-) 

Well, seems to me the easiest and the most efficient :)
I could have used "struct range", but I don't like it's semantics with
excluding the end. It would mean adding/subtracting 1 everywhere, seems
error prone to me.
 
> > +
> > +static struct efi_freeable_range *ranges_to_free;
> > +
> >  void __init efi_free_boot_services(void)
> >  {
> 
> I was going to call it efi_unmap_boot_services() to avoid having two
> things with almost the same name.

I wanted to minimize churn, but in the end it's not that much to change and
efi_unmap_boot_services() is a better name.
 
> >  	struct efi_memory_map_data data = { 0 };
> >  	efi_memory_desc_t *md;
> >  	int num_entries = 0;
> > +	int idx = 0;
> >  	void *new, *new_md;
> >  
> >  	/* Keep all regions for /sys/kernel/debug/efi */
> >  	if (efi_enabled(EFI_DBG))
> >  		return;
> >  
> > +	ranges_to_free = kzalloc(sizeof(*ranges_to_free) * efi.memmap.nr_map,
> > +				 GFP_KERNEL);
> > +	if (!ranges_to_free) {
> > +		pr_err("Failed to allocate storage for freeable EFI regions\n");
> > +		return;
> > +	}
> 
> Do we still want to do the whole unmap dance in that case ? I mean, OOM
> here means the system is pretty much a goner at that stage but ...

There is another potential OOM in that function. If it happens, we just
skip remapping and return. So return here is consistent :) 
 
> >  	for_each_efi_memory_desc(md) {
> >  		unsigned long long start = md->phys_addr;
> >  		unsigned long long size = md->num_pages << EFI_PAGE_SHIFT;
> > @@ -471,7 +486,15 @@ void __init efi_free_boot_services(void)
> >  			start = SZ_1M;
> >  		}
> >  
> > -		memblock_free_late(start, size);
> > +		/*
> > +		 * With CONFIG_DEFERRED_STRUCT_PAGE_INIT parts of the memory
> > +		 * map are still not initialized and we can't reliably free
> > +		 * memory here.
> > +		 * Queue the ranges to free at a later point.
> > +		 */
> > +		ranges_to_free[idx].start = start;
> > +		ranges_to_free[idx].end = start + size;
> > +		idx++;
> 
> Do we want to make this conditional to CONFIG_DEFERRED_STRUCT_PAGE_INIT
> or we don't care ?

I think it'll add ugliness for no good reason. If we want to keep systems
with CONFIG_DEFERRED_STRUCT_PAGE_INIT=n behave the same way as now, we need
several more if (CONFIG_DEFERRED_STRUCT_PAGE_INIT) and it becomes hairy.

And the change is quite small IMHO to just make it for everything.

> >  	}
> >  
> >  	if (!num_entries)
> > @@ -512,6 +535,23 @@ void __init efi_free_boot_services(void)
> >  	}
> >  }
> >  
> > +static int __init efi_free_boot_services_memory(void)
> > +{
> > +	struct efi_freeable_range *range = ranges_to_free;
> > +
> > +	while (range->start) {
> > +		void *start = phys_to_virt(range->start);
> > +		void *end = phys_to_virt(range->end);
> > +
> > +		free_reserved_area(start, end, -1, NULL);
> 
> I assume here too the total_ram_page_inc stuff is taken care of ? I
> haven't really looked. This feels like a fragile counter.

This is a fragile counter :)
free_reserved_area() -> free_reserved_page() take care of it.
 
> > +		range++;
> > +	}
> > +	kfree(ranges_to_free);
> > +
> > +	return 0;
> > +}
> > +late_initcall(efi_free_boot_services_memory);
> > +
> >  /*
> >   * A number of config table entries get remapped to virtual addresses
> >   * after entering EFI virtual mode. However, the kexec kernel requires
> > 
> > base-commit: 05f7e89ab9731565d8a62e3b5d1ec206485eeb0b
> > -- 
> > 2.51.0
> > 
> >  
> > > Cheers,
> > > Ben.
> > > 
> > 
> 

-- 
Sincerely yours,
Mike.


  parent reply	other threads:[~2026-02-20  9:00 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-02-03  8:02 [PATCH] mm: Fix memblock_free_late() when using deferred struct page Benjamin Herrenschmidt
2026-02-03 18:40 ` Mike Rapoport
2026-02-03 19:53   ` Benjamin Herrenschmidt
2026-02-04  7:39     ` Mike Rapoport
2026-02-04  9:02       ` Benjamin Herrenschmidt
2026-02-06 10:33         ` Mike Rapoport
2026-02-10  1:04           ` Benjamin Herrenschmidt
2026-02-10  2:10             ` Benjamin Herrenschmidt
2026-02-10  6:17               ` Benjamin Herrenschmidt
2026-02-10  8:34                 ` Benjamin Herrenschmidt
2026-02-10 14:32                   ` Mike Rapoport
2026-02-10 23:23                     ` Benjamin Herrenschmidt
2026-02-11  5:20                       ` Mike Rapoport
2026-02-16  5:34                       ` Benjamin Herrenschmidt
2026-02-16  6:51                         ` Benjamin Herrenschmidt
2026-02-16  4:53                     ` Benjamin Herrenschmidt
2026-02-16 15:28                       ` Mike Rapoport
2026-02-16 10:36           ` Alexander Potapenko
2026-02-17  8:28 ` [PATCH v2] " Benjamin Herrenschmidt
2026-02-17 12:32   ` Mike Rapoport
2026-02-17 22:00     ` Benjamin Herrenschmidt
2026-02-17 21:47   ` Benjamin Herrenschmidt
2026-02-18  0:15     ` Benjamin Herrenschmidt
2026-02-18  8:05       ` Mike Rapoport
2026-02-19  2:48         ` Benjamin Herrenschmidt
2026-02-19 10:16           ` Mike Rapoport
2026-02-19 22:46             ` Benjamin Herrenschmidt
2026-02-20  4:57               ` Benjamin Herrenschmidt
2026-02-20  9:09                 ` Mike Rapoport
2026-02-23  2:54                   ` Benjamin Herrenschmidt
2026-02-24  5:56                   ` Benjamin Herrenschmidt
2026-02-20  9:00               ` Mike Rapoport [this message]
2026-02-20  5:12             ` Benjamin Herrenschmidt
2026-02-20  5:15             ` Benjamin Herrenschmidt
2026-02-20  5:47             ` Benjamin Herrenschmidt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aZgin4snWf0RD98X@kernel.org \
    --to=rppt@kernel.org \
    --cc=benh@kernel.crashing.org \
    --cc=linux-mm@kvack.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.