* FAILED: patch "[PATCH] mm: Always release pages to the buddy allocator in" failed to apply to 5.15-stable tree
From: gregkh @ 2023-01-14 16:45 UTC (permalink / raw)
To: dev, rppt; +Cc: stable
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable@vger.kernel.org>.
Possible dependencies:
115d9d77bb0f ("mm: Always release pages to the buddy allocator in memblock_free_late().")
16802e55dea9 ("memblock tests: Add skeleton of the memblock simulator")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 115d9d77bb0f9152c60b6e8646369fa7f6167593 Mon Sep 17 00:00:00 2001
From: Aaron Thompson <dev@aaront.org>
Date: Fri, 6 Jan 2023 22:22:44 +0000
Subject: [PATCH] mm: Always release pages to the buddy allocator in
memblock_free_late().
If CONFIG_DEFERRED_STRUCT_PAGE_INIT is enabled, memblock_free_pages()
only releases pages to the buddy allocator if they are not in the
deferred range. This is correct for free pages (as defined by
for_each_free_mem_pfn_range_in_zone()) because free pages in the
deferred range will be initialized and released as part of the deferred
init process. memblock_free_pages() is called by memblock_free_late(),
which is used to free reserved ranges after memblock_free_all() has
run. All pages in reserved ranges have been initialized at that point,
and accordingly, those pages are not touched by the deferred init
process. This means that currently, if the pages that
memblock_free_late() intends to release are in the deferred range, they
will never be released to the buddy allocator. They will forever be
reserved.
In addition, memblock_free_pages() calls kmsan_memblock_free_pages(),
which is also correct for free pages but is not correct for reserved
pages. KMSAN metadata for reserved pages is initialized by
kmsan_init_shadow(), which runs shortly before memblock_free_all().
For both of these reasons, memblock_free_pages() should only be called
for free pages, and memblock_free_late() should call __free_pages_core()
directly instead.
One case where this issue can occur in the wild is EFI boot on
x86_64. The x86 EFI code reserves all EFI boot services memory ranges
via memblock_reserve() and frees them later via memblock_free_late()
(efi_reserve_boot_services() and efi_free_boot_services(),
respectively). If any of those ranges happens to fall within the
deferred init range, the pages will not be released and that memory will
be unavailable.
For example, on an Amazon EC2 t3.micro VM (1 GB) booting via EFI:
v6.2-rc2:
# grep -E 'Node|spanned|present|managed' /proc/zoneinfo
Node 0, zone      DMA
        spanned  4095
        present  3999
        managed  3840
Node 0, zone    DMA32
        spanned  246652
        present  245868
        managed  178867
v6.2-rc2 + patch:
# grep -E 'Node|spanned|present|managed' /proc/zoneinfo
Node 0, zone      DMA
        spanned  4095
        present  3999
        managed  3840
Node 0, zone    DMA32
        spanned  246652
        present  245868
        managed  222816   # +43,949 pages
Fixes: 3a80a7fa7989 ("mm: meminit: initialise a subset of struct pages if CONFIG_DEFERRED_STRUCT_PAGE_INIT is set")
Signed-off-by: Aaron Thompson <dev@aaront.org>
Link: https://lore.kernel.org/r/01010185892de53e-e379acfb-7044-4b24-b30a-e2657c1ba989-000000@us-west-2.amazonses.com
Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
diff --git a/mm/memblock.c b/mm/memblock.c
index d036c7861310..685e30e6d27c 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1640,7 +1640,13 @@ void __init memblock_free_late(phys_addr_t base, phys_addr_t size)
 	end = PFN_DOWN(base + size);
 
 	for (; cursor < end; cursor++) {
-		memblock_free_pages(pfn_to_page(cursor), cursor, 0);
+		/*
+		 * Reserved pages are always initialized by the end of
+		 * memblock_free_all() (by memmap_init() and, if deferred
+		 * initialization is enabled, memmap_init_reserved_pages()), so
+		 * these pages can be released directly to the buddy allocator.
+		 */
+		__free_pages_core(pfn_to_page(cursor), 0);
 		totalram_pages_inc();
 	}
 }
diff --git a/tools/testing/memblock/internal.h b/tools/testing/memblock/internal.h
index fdb7f5db7308..85973e55489e 100644
--- a/tools/testing/memblock/internal.h
+++ b/tools/testing/memblock/internal.h
@@ -15,6 +15,10 @@ bool mirrored_kernelcore = false;
 
 struct page {};
 
+void __free_pages_core(struct page *page, unsigned int order)
+{
+}
+
 void memblock_free_pages(struct page *page, unsigned long pfn,
 			 unsigned int order)
 {
* Re: FAILED: patch "[PATCH] mm: Always release pages to the buddy allocator in" failed to apply to 5.15-stable tree
From: Mike Rapoport @ 2023-01-15 8:32 UTC (permalink / raw)
To: gregkh; +Cc: dev, stable
On Sat, Jan 14, 2023 at 05:45:53PM +0100, gregkh@linuxfoundation.org wrote:
>
> The patch below does not apply to the 5.15-stable tree.
> If someone wants it applied there, or to any other stable or longterm
> tree, then please email the backport, including the original git commit
> id to <stable@vger.kernel.org>.
The patch below applies to 5.15, 5.10 and 5.4.
As for 4.19 and 4.14, they still have bootmem/nobootmem, so I'd rather
not touch them.
From c292bd7e64214fcc78b7c72a9ccd3973dd19b7fb Mon Sep 17 00:00:00 2001
From: Aaron Thompson <dev@aaront.org>
Date: Fri, 6 Jan 2023 22:22:44 +0000
Subject: [PATCH] mm: Always release pages to the buddy allocator in
memblock_free_late().
If CONFIG_DEFERRED_STRUCT_PAGE_INIT is enabled, memblock_free_pages()
only releases pages to the buddy allocator if they are not in the
deferred range. This is correct for free pages (as defined by
for_each_free_mem_pfn_range_in_zone()) because free pages in the
deferred range will be initialized and released as part of the deferred
init process. memblock_free_pages() is called by memblock_free_late(),
which is used to free reserved ranges after memblock_free_all() has
run. All pages in reserved ranges have been initialized at that point,
and accordingly, those pages are not touched by the deferred init
process. This means that currently, if the pages that
memblock_free_late() intends to release are in the deferred range, they
will never be released to the buddy allocator. They will forever be
reserved.
In addition, memblock_free_pages() calls kmsan_memblock_free_pages(),
which is also correct for free pages but is not correct for reserved
pages. KMSAN metadata for reserved pages is initialized by
kmsan_init_shadow(), which runs shortly before memblock_free_all().
For both of these reasons, memblock_free_pages() should only be called
for free pages, and memblock_free_late() should call __free_pages_core()
directly instead.
One case where this issue can occur in the wild is EFI boot on
x86_64. The x86 EFI code reserves all EFI boot services memory ranges
via memblock_reserve() and frees them later via memblock_free_late()
(efi_reserve_boot_services() and efi_free_boot_services(),
respectively). If any of those ranges happens to fall within the
deferred init range, the pages will not be released and that memory will
be unavailable.
For example, on an Amazon EC2 t3.micro VM (1 GB) booting via EFI:
v6.2-rc2:
# grep -E 'Node|spanned|present|managed' /proc/zoneinfo
Node 0, zone      DMA
        spanned  4095
        present  3999
        managed  3840
Node 0, zone    DMA32
        spanned  246652
        present  245868
        managed  178867
v6.2-rc2 + patch:
# grep -E 'Node|spanned|present|managed' /proc/zoneinfo
Node 0, zone      DMA
        spanned  4095
        present  3999
        managed  3840
Node 0, zone    DMA32
        spanned  246652
        present  245868
        managed  222816   # +43,949 pages
Fixes: 3a80a7fa7989 ("mm: meminit: initialise a subset of struct pages if CONFIG_DEFERRED_STRUCT_PAGE_INIT is set")
Signed-off-by: Aaron Thompson <dev@aaront.org>
Link: https://lore.kernel.org/r/01010185892de53e-e379acfb-7044-4b24-b30a-e2657c1ba989-000000@us-west-2.amazonses.com
Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
---
mm/memblock.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/mm/memblock.c b/mm/memblock.c
index 2b7397781c99..838d59a74c65 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1615,7 +1615,13 @@ void __init __memblock_free_late(phys_addr_t base, phys_addr_t size)
 	end = PFN_DOWN(base + size);
 
 	for (; cursor < end; cursor++) {
-		memblock_free_pages(pfn_to_page(cursor), cursor, 0);
+		/*
+		 * Reserved pages are always initialized by the end of
+		 * memblock_free_all() (by memmap_init() and, if deferred
+		 * initialization is enabled, memmap_init_reserved_pages()), so
+		 * these pages can be released directly to the buddy allocator.
+		 */
+		__free_pages_core(pfn_to_page(cursor), 0);
 		totalram_pages_inc();
 	}
 }
--
2.35.1
> thanks,
>
> greg k-h
>
--
Sincerely yours,
Mike.
* Re: FAILED: patch "[PATCH] mm: Always release pages to the buddy allocator in" failed to apply to 5.15-stable tree
From: Aaron Thompson @ 2023-01-15 9:03 UTC (permalink / raw)
To: Mike Rapoport; +Cc: gregkh, stable
On 2023-01-15 00:32, Mike Rapoport wrote:
> On Sat, Jan 14, 2023 at 05:45:53PM +0100, gregkh@linuxfoundation.org
> wrote:
>>
>> The patch below does not apply to the 5.15-stable tree.
>> If someone wants it applied there, or to any other stable or longterm
>> tree, then please email the backport, including the original git
>> commit
>> id to <stable@vger.kernel.org>.
>
> The patch below applies to 5.15, 5.10 and 5.4.
Thanks Mike. The code works as intended, but the commit message and the
comments have some inaccuracies (changed function names, no KMSAN). Does
that matter? I can send updated patches if so, just let me know.
> As for 4.19 and 4.14, they still have bootmem/nobootmem, so I'd rather
> not touch them.
>
> From c292bd7e64214fcc78b7c72a9ccd3973dd19b7fb Mon Sep 17 00:00:00 2001
> From: Aaron Thompson <dev@aaront.org>
> Date: Fri, 6 Jan 2023 22:22:44 +0000
> Subject: [PATCH] mm: Always release pages to the buddy allocator in
> memblock_free_late().
>
> If CONFIG_DEFERRED_STRUCT_PAGE_INIT is enabled, memblock_free_pages()
> only releases pages to the buddy allocator if they are not in the
> deferred range. This is correct for free pages (as defined by
> for_each_free_mem_pfn_range_in_zone()) because free pages in the
> deferred range will be initialized and released as part of the deferred
> init process. memblock_free_pages() is called by memblock_free_late(),
> which is used to free reserved ranges after memblock_free_all() has
> run. All pages in reserved ranges have been initialized at that point,
> and accordingly, those pages are not touched by the deferred init
> process. This means that currently, if the pages that
> memblock_free_late() intends to release are in the deferred range, they
> will never be released to the buddy allocator. They will forever be
> reserved.
>
> In addition, memblock_free_pages() calls kmsan_memblock_free_pages(),
> which is also correct for free pages but is not correct for reserved
> pages. KMSAN metadata for reserved pages is initialized by
> kmsan_init_shadow(), which runs shortly before memblock_free_all().
>
> For both of these reasons, memblock_free_pages() should only be called
> for free pages, and memblock_free_late() should call
> __free_pages_core()
> directly instead.
>
> One case where this issue can occur in the wild is EFI boot on
> x86_64. The x86 EFI code reserves all EFI boot services memory ranges
> via memblock_reserve() and frees them later via memblock_free_late()
> (efi_reserve_boot_services() and efi_free_boot_services(),
> respectively). If any of those ranges happens to fall within the
> deferred init range, the pages will not be released and that memory
> will
> be unavailable.
>
> For example, on an Amazon EC2 t3.micro VM (1 GB) booting via EFI:
>
> v6.2-rc2:
> # grep -E 'Node|spanned|present|managed' /proc/zoneinfo
> Node 0, zone DMA
> spanned 4095
> present 3999
> managed 3840
> Node 0, zone DMA32
> spanned 246652
> present 245868
> managed 178867
>
> v6.2-rc2 + patch:
> # grep -E 'Node|spanned|present|managed' /proc/zoneinfo
> Node 0, zone DMA
> spanned 4095
> present 3999
> managed 3840
> Node 0, zone DMA32
> spanned 246652
> present 245868
> managed 222816 # +43,949 pages
>
> Fixes: 3a80a7fa7989 ("mm: meminit: initialise a subset of struct pages
> if CONFIG_DEFERRED_STRUCT_PAGE_INIT is set")
> Signed-off-by: Aaron Thompson <dev@aaront.org>
> Link:
> https://lore.kernel.org/r/01010185892de53e-e379acfb-7044-4b24-b30a-e2657c1ba989-000000@us-west-2.amazonses.com
> Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
> ---
> mm/memblock.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/mm/memblock.c b/mm/memblock.c
> index 2b7397781c99..838d59a74c65 100644
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
> @@ -1615,7 +1615,13 @@ void __init __memblock_free_late(phys_addr_t
> base, phys_addr_t size)
> end = PFN_DOWN(base + size);
>
> for (; cursor < end; cursor++) {
> - memblock_free_pages(pfn_to_page(cursor), cursor, 0);
> + /*
> + * Reserved pages are always initialized by the end of
> + * memblock_free_all() (by memmap_init() and, if deferred
> + * initialization is enabled, memmap_init_reserved_pages()), so
> + * these pages can be released directly to the buddy allocator.
> + */
> + __free_pages_core(pfn_to_page(cursor), 0);
> totalram_pages_inc();
> }
> }
> --
> 2.35.1
>
>
>> thanks,
>>
>> greg k-h
>>
Thanks,
-- Aaron
* Re: FAILED: patch "[PATCH] mm: Always release pages to the buddy allocator in" failed to apply to 5.15-stable tree
From: Mike Rapoport @ 2023-01-15 18:44 UTC (permalink / raw)
To: Aaron Thompson; +Cc: gregkh, stable
On Sun, Jan 15, 2023 at 09:03:05AM +0000, Aaron Thompson wrote:
>
> On 2023-01-15 00:32, Mike Rapoport wrote:
> > On Sat, Jan 14, 2023 at 05:45:53PM +0100, gregkh@linuxfoundation.org
> > wrote:
> > >
> > > The patch below does not apply to the 5.15-stable tree.
> > > If someone wants it applied there, or to any other stable or longterm
> > > tree, then please email the backport, including the original git
> > > commit
> > > id to <stable@vger.kernel.org>.
> >
> > The patch below applies to 5.15, 5.10 and 5.4.
>
> Thanks Mike. The code works as intended, but the commit message and the
> comments have some inaccuracies (changed function names, no KMSAN). Does
> that matter? I can send updated patches if so, just let me know.
It's up to stable folks, I think.
Greg?
> > As for 4.19 and 4.14, they still have bootmem/nobootmem, so I'd rather
> > not touch them.
> >
> > From c292bd7e64214fcc78b7c72a9ccd3973dd19b7fb Mon Sep 17 00:00:00 2001
> > From: Aaron Thompson <dev@aaront.org>
> > Date: Fri, 6 Jan 2023 22:22:44 +0000
> > Subject: [PATCH] mm: Always release pages to the buddy allocator in
> > memblock_free_late().
> >
> > If CONFIG_DEFERRED_STRUCT_PAGE_INIT is enabled, memblock_free_pages()
> > only releases pages to the buddy allocator if they are not in the
> > deferred range. This is correct for free pages (as defined by
> > for_each_free_mem_pfn_range_in_zone()) because free pages in the
> > deferred range will be initialized and released as part of the deferred
> > init process. memblock_free_pages() is called by memblock_free_late(),
> > which is used to free reserved ranges after memblock_free_all() has
> > run. All pages in reserved ranges have been initialized at that point,
> > and accordingly, those pages are not touched by the deferred init
> > process. This means that currently, if the pages that
> > memblock_free_late() intends to release are in the deferred range, they
> > will never be released to the buddy allocator. They will forever be
> > reserved.
> >
> > In addition, memblock_free_pages() calls kmsan_memblock_free_pages(),
> > which is also correct for free pages but is not correct for reserved
> > pages. KMSAN metadata for reserved pages is initialized by
> > kmsan_init_shadow(), which runs shortly before memblock_free_all().
> >
> > For both of these reasons, memblock_free_pages() should only be called
> > for free pages, and memblock_free_late() should call __free_pages_core()
> > directly instead.
> >
> > One case where this issue can occur in the wild is EFI boot on
> > x86_64. The x86 EFI code reserves all EFI boot services memory ranges
> > via memblock_reserve() and frees them later via memblock_free_late()
> > (efi_reserve_boot_services() and efi_free_boot_services(),
> > respectively). If any of those ranges happens to fall within the
> > deferred init range, the pages will not be released and that memory will
> > be unavailable.
> >
> > For example, on an Amazon EC2 t3.micro VM (1 GB) booting via EFI:
> >
> > v6.2-rc2:
> > # grep -E 'Node|spanned|present|managed' /proc/zoneinfo
> > Node 0, zone DMA
> > spanned 4095
> > present 3999
> > managed 3840
> > Node 0, zone DMA32
> > spanned 246652
> > present 245868
> > managed 178867
> >
> > v6.2-rc2 + patch:
> > # grep -E 'Node|spanned|present|managed' /proc/zoneinfo
> > Node 0, zone DMA
> > spanned 4095
> > present 3999
> > managed 3840
> > Node 0, zone DMA32
> > spanned 246652
> > present 245868
> > managed 222816 # +43,949 pages
> >
> > Fixes: 3a80a7fa7989 ("mm: meminit: initialise a subset of struct pages
> > if CONFIG_DEFERRED_STRUCT_PAGE_INIT is set")
> > Signed-off-by: Aaron Thompson <dev@aaront.org>
> > Link:
> > https://lore.kernel.org/r/01010185892de53e-e379acfb-7044-4b24-b30a-e2657c1ba989-000000@us-west-2.amazonses.com
> > Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
> > ---
> > mm/memblock.c | 8 +++++++-
> > 1 file changed, 7 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/memblock.c b/mm/memblock.c
> > index 2b7397781c99..838d59a74c65 100644
> > --- a/mm/memblock.c
> > +++ b/mm/memblock.c
> > @@ -1615,7 +1615,13 @@ void __init __memblock_free_late(phys_addr_t
> > base, phys_addr_t size)
> > end = PFN_DOWN(base + size);
> >
> > for (; cursor < end; cursor++) {
> > - memblock_free_pages(pfn_to_page(cursor), cursor, 0);
> > + /*
> > + * Reserved pages are always initialized by the end of
> > + * memblock_free_all() (by memmap_init() and, if deferred
> > + * initialization is enabled, memmap_init_reserved_pages()), so
> > + * these pages can be released directly to the buddy allocator.
> > + */
> > + __free_pages_core(pfn_to_page(cursor), 0);
> > totalram_pages_inc();
> > }
> > }
> > --
> > 2.35.1
> >
> >
> > > thanks,
> > >
> > > greg k-h
> > >
>
> Thanks,
> -- Aaron
--
Sincerely yours,
Mike.