From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	patches@lists.linux.dev, Aaron Thompson <dev@aaront.org>,
	"Mike Rapoport (IBM)" <rppt@kernel.org>,
	Sasha Levin <sashal@kernel.org>
Subject: [PATCH 5.10 50/64] mm: Always release pages to the buddy allocator in memblock_free_late().
Date: Mon, 16 Jan 2023 16:51:57 +0100
Message-ID: <20230116154745.308157681@linuxfoundation.org>
In-Reply-To: <20230116154743.577276578@linuxfoundation.org>

From: Aaron Thompson <dev@aaront.org>

[ Upstream commit 115d9d77bb0f9152c60b6e8646369fa7f6167593 ]

If CONFIG_DEFERRED_STRUCT_PAGE_INIT is enabled, memblock_free_pages()
only releases pages to the buddy allocator if they are not in the
deferred range. This is correct for free pages (as defined by
for_each_free_mem_pfn_range_in_zone()) because free pages in the
deferred range will be initialized and released as part of the deferred
init process. memblock_free_pages() is called by memblock_free_late(),
which is used to free reserved ranges after memblock_free_all() has
run. All pages in reserved ranges have been initialized at that point,
and accordingly, those pages are not touched by the deferred init
process. This means that currently, if the pages that
memblock_free_late() intends to release are in the deferred range, they
will never be released to the buddy allocator. They will forever be
reserved.

In addition, memblock_free_pages() calls kmsan_memblock_free_pages(),
which is also correct for free pages but is not correct for reserved
pages. KMSAN metadata for reserved pages is initialized by
kmsan_init_shadow(), which runs shortly before memblock_free_all().

For both of these reasons, memblock_free_pages() should only be called
for free pages, and memblock_free_late() should call __free_pages_core()
directly instead.
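
The difference between the two paths can be sketched with a toy userspace
model. This is only an illustration of the reasoning above, not kernel code:
every toy_* name, the struct layout, and the single "first deferred pfn"
threshold are assumptions made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model; all names below are hypothetical, not kernel API. */
enum toy_page_state { TOY_RESERVED = 0, TOY_IN_BUDDY };

struct toy_zone {
	unsigned long first_deferred_pfn;	/* pfns >= this are "deferred" */
	enum toy_page_state *state;		/* one entry per pfn */
	unsigned long nr_pages;
};

/* Stands in for the deferred-range check done in memblock_free_pages(). */
bool toy_in_deferred_range(const struct toy_zone *z, unsigned long pfn)
{
	return pfn >= z->first_deferred_pfn;
}

/* Stands in for handing one page to the buddy allocator. */
void toy_release_to_buddy(struct toy_zone *z, unsigned long pfn)
{
	assert(pfn < z->nr_pages);
	z->state[pfn] = TOY_IN_BUDDY;
}

/* Old behaviour: pages in the deferred range are skipped and stay reserved. */
void toy_free_late_old(struct toy_zone *z, unsigned long start,
		       unsigned long end)
{
	for (unsigned long pfn = start; pfn < end; pfn++) {
		if (toy_in_deferred_range(z, pfn))
			continue;
		toy_release_to_buddy(z, pfn);
	}
}

/* New behaviour: every page in the range is released unconditionally. */
void toy_free_late_new(struct toy_zone *z, unsigned long start,
		       unsigned long end)
{
	for (unsigned long pfn = start; pfn < end; pfn++)
		toy_release_to_buddy(z, pfn);
}

/* Counts the pages that ended up in the buddy allocator. */
unsigned long toy_count_in_buddy(const struct toy_zone *z)
{
	unsigned long n = 0;

	for (unsigned long pfn = 0; pfn < z->nr_pages; pfn++)
		if (z->state[pfn] == TOY_IN_BUDDY)
			n++;
	return n;
}
```

In this model, freeing a reserved range that straddles first_deferred_pfn
releases only the non-deferred part with toy_free_late_old(), while
toy_free_late_new() releases all of it.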

One case where this issue can occur in the wild is EFI boot on
x86_64. The x86 EFI code reserves all EFI boot services memory ranges
via memblock_reserve() and frees them later via memblock_free_late()
(efi_reserve_boot_services() and efi_free_boot_services(),
respectively). If any of those ranges happens to fall within the
deferred init range, the pages will not be released and that memory will
be unavailable.

For example, on an Amazon EC2 t3.micro VM (1 GB) booting via EFI:

v6.2-rc2:
  # grep -E 'Node|spanned|present|managed' /proc/zoneinfo
  Node 0, zone      DMA
          spanned  4095
          present  3999
          managed  3840
  Node 0, zone    DMA32
          spanned  246652
          present  245868
          managed  178867

v6.2-rc2 + patch:
  # grep -E 'Node|spanned|present|managed' /proc/zoneinfo
  Node 0, zone      DMA
          spanned  4095
          present  3999
          managed  3840
  Node 0, zone    DMA32
          spanned  246652
          present  245868
          managed  222816   # +43,949 pages
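
As a quick check of the DMA32 figures above, the arithmetic can be written
out as below. The helper names are made up for this sketch, and the 4 KiB
page size is an assumption (the x86_64 base page size), not something stated
in the zoneinfo output.

```c
#include <assert.h>

/* Assumption: 4 KiB base pages, as on x86_64. */
#define ASSUMED_PAGE_SIZE_KIB 4UL

/* Difference in "managed" pages between two /proc/zoneinfo readings. */
unsigned long pages_recovered(unsigned long managed_before,
			      unsigned long managed_after)
{
	assert(managed_after >= managed_before);
	return managed_after - managed_before;
}

/* Converts a page count to KiB under the page-size assumption above. */
unsigned long pages_to_kib(unsigned long pages)
{
	return pages * ASSUMED_PAGE_SIZE_KIB;
}
```

With the values shown, pages_recovered(178867, 222816) is 43949 pages, and
pages_to_kib(43949) is 175796 KiB, or roughly 171 MiB of the VM's 1 GB.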

Fixes: 3a80a7fa7989 ("mm: meminit: initialise a subset of struct pages if CONFIG_DEFERRED_STRUCT_PAGE_INIT is set")
Signed-off-by: Aaron Thompson <dev@aaront.org>
Link: https://lore.kernel.org/r/01010185892de53e-e379acfb-7044-4b24-b30a-e2657c1ba989-000000@us-west-2.amazonses.com
Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 mm/memblock.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/mm/memblock.c b/mm/memblock.c
index f72d53957033..f6a4dffb9a88 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1597,7 +1597,13 @@ void __init __memblock_free_late(phys_addr_t base, phys_addr_t size)
 	end = PFN_DOWN(base + size);
 
 	for (; cursor < end; cursor++) {
-		memblock_free_pages(pfn_to_page(cursor), cursor, 0);
+		/*
+		 * Reserved pages are always initialized by the end of
+		 * memblock_free_all() (by memmap_init() and, if deferred
+		 * initialization is enabled, memmap_init_reserved_pages()), so
+		 * these pages can be released directly to the buddy allocator.
+		 */
+		__free_pages_core(pfn_to_page(cursor), 0);
 		totalram_pages_inc();
 	}
 }
-- 
2.35.1



