All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mike Rapoport <rppt@kernel.org>
To: Aaron Thompson <dev@aaront.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/1] Fix memblock_free_late() deferred init bug, redux
Date: Tue, 7 Feb 2023 10:02:32 +0200	[thread overview]
Message-ID: <Y+IFmA1FVNRtpEFZ@kernel.org> (raw)
In-Reply-To: <20230206071211.3157-1-dev@aaront.org>

Hi Aaron,

On Mon, Feb 06, 2023 at 07:12:09AM +0000, Aaron Thompson wrote:
> Hi Mike,
> 
> Unfortunately my attempted bugfix 115d9d77bb0f ("mm: Always release pages to the
> buddy allocator in memblock_free_late()") is itself buggy. It's true that all
> reserved pages are initialized by the time memblock_free_late() is called, but
> if the pages being freed are in the deferred init range, __free_one_page() might
> access nearby uninitialized pages when trying to coalesce buddies, in which case
> badness ensues :(
> 
> deferred_init_maxorder() handles this by initializing a max-order-sized block of
> pages before freeing any of them. We could change memblock_free_late() to do
> that, but it seems to me that having memblock_free_late() get involved in
> initializing and freeing free pages would be a bit messy. I think it will be
> simpler to free the reserved pages later, as part of deferred init or after.
> 
> I can see a few ways to accomplish that:
> 
> 1. Have memblock_free_late() remove the range from memblock.reserved. Deferred
>    init will then handle that range as just another free range, so the pages
>    will be initialized and freed by deferred_init_maxorder().
> 
>    This is the simplest fix, but the problem is that the pages will be
>    initialized twice, first by memmap_init_reserved_pages() and again by
>    deferred_init_maxorder(). It looks risky to me to blindly zero out an
>    already-initialized page struct, but if we could say for certain that the
>    page structs for memblock-reserved ranges aren't actually used, at least
>    until after deferred init is done, then this could work. I don't know the
>    usage of page structs well enough to say.
> 
> 2. Take 1 and fix the double-init problem. In addition to removing the range
>    from memblock.reserved, also set a flag on the range in memblock.memory that
>    indicates the pages for that range have already been initialized.
>    deferred_init_maxorder() would skip initializing pages for ranges with the
>    flag set, but it would still free them.
> 
>    This seems like a bit of a conceptual stretch of the memblock region flags
>    since this is not a property of the memory itself but rather of the page
>    structs corresponding to that memory. But it gets the job done.
> 
> 3. Defer the freeing of reserved pages until after deferred init is completely
>    done. Have memblock_free_late() set a flag on the range in memblock.reserved,
>    and have memblock_discard() call __free_pages_core() on those ranges.
> 
>    I think this would be a reasonable use of flags on reserved regions. They are
>    not currently used.

I think 3 is the most straight-forward as a concept. It'll need some care
for ARCH_KEEP_MEMBLOCK architectures, e.g. arm64, though

I also can think about

4. Extend initialization of the memory map around the reserved regions in
memmap_init_reserved_pages()/reserve_bootmem_region(). If these functions
initialize the memory map of the entire pageblock surrounding the reserved
range, __free_one_page() will certainly access initialized struct pages.
 
> The included patch implements option 1 because it is the simplest, but it should
> not be used if the double-init of the page structs is unsafe. In my testing I
> verified that the count, mapcount, and lru list head of all pages are at their
> defaults when memblock_free_late() is called by efi_free_boot_services(), but
> that's obviously not conclusive. I have draft versions of 2 and 3 that I can
> finish up quickly if either of those are preferable.

At this point of the release cycle I prefer to revert 115d9d77bb0f ("mm:
Always release pages to the buddy allocator in memblock_free_late()") and
to work on the proper fix for the next release.
 
> Please let me know what you think, and sorry for introducing this bug.
> 
> Thanks,
> Aaron
> 
> Aaron Thompson (1):
>   mm: Defer freeing reserved pages in memblock_free_late()
> 
>  mm/internal.h                     |  2 ++
>  mm/memblock.c                     | 36 ++++++++++++++++++++-----------
>  mm/page_alloc.c                   | 17 +++++++++++++++
>  tools/testing/memblock/internal.h |  7 +++---
>  4 files changed, 47 insertions(+), 15 deletions(-)
> 
> -- 
> 2.30.2
> 
> 

-- 
Sincerely yours,
Mike.


  parent reply	other threads:[~2023-02-07  8:02 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-06  7:12 [PATCH 0/1] Fix memblock_free_late() deferred init bug, redux Aaron Thompson
2023-02-06  7:12 ` [PATCH 1/1] mm: Defer freeing reserved pages in memblock_free_late() Aaron Thompson
2023-02-07  8:02 ` Mike Rapoport [this message]
2023-02-07  8:28   ` [PATCH 0/1] Fix memblock_free_late() deferred init bug, redux Aaron Thompson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y+IFmA1FVNRtpEFZ@kernel.org \
    --to=rppt@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=dev@aaront.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.