From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
patches@lists.linux.dev, Jinjiang Tu <tujinjiang@huawei.com>,
syzbot+3b220254df55d8ca8a61@syzkaller.appspotmail.com,
David Hildenbrand <david@redhat.com>,
Miaohe Lin <linmiaohe@huawei.com>, Zi Yan <ziy@nvidia.com>,
Oscar Salvador <osalvador@suse.de>,
Kefeng Wang <wangkefeng.wang@huawei.com>,
Michal Hocko <mhocko@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>
Subject: [PATCH 6.15 78/92] mm/vmscan: fix hwpoisoned large folio handling in shrink_folio_list
Date: Wed, 30 Jul 2025 11:36:26 +0200
Message-ID: <20250730093233.774349408@linuxfoundation.org>
In-Reply-To: <20250730093230.629234025@linuxfoundation.org>
6.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jinjiang Tu <tujinjiang@huawei.com>
commit 9f1e8cd0b7c4c944e9921b52a6661b5eda2705ab upstream.
In shrink_folio_list(), the hwpoisoned folio may be a large folio, which
unmap_poisoned_folio() can't handle. For THP, try_to_unmap_one() must be
passed TTU_SPLIT_HUGE_PMD so that it splits the huge PMD first and then
retries. Without TTU_SPLIT_HUGE_PMD, we trigger a null-ptr deref of
pvmw.pte. Even if we pass TTU_SPLIT_HUGE_PMD, we trigger a WARN_ON_ONCE
because the page isn't in the swapcache.
Since UCE is rare in the real world, and a race with reclamation is even
rarer, just skipping the hwpoisoned large folio is enough. memory_failure()
will handle it if the UCE is triggered again.
This happens when memory reclaim of a large folio races with
memory_failure(), and it leads to a kernel panic. The race is as
follows:
cpu0                               cpu1
 shrink_folio_list                 memory_failure
                                    TestSetPageHWPoison
  unmap_poisoned_folio
  --> trigger BUG_ON due to
      unmap_poisoned_folio couldn't
      handle large folio
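The decision the patch adds to shrink_folio_list() can be modelled in
userspace roughly as below. This is only an illustrative sketch, not kernel
code: struct model_folio, enum reclaim_action and classify_folio() are
hypothetical names invented for the example, and the two booleans merely
stand in for folio_contain_hwpoisoned_page() and folio_test_large().

```c
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct folio; not the kernel type. */
struct model_folio {
	bool hwpoisoned;	/* models folio_contain_hwpoisoned_page() */
	bool large;		/* models folio_test_large() */
};

/* Hypothetical labels for the three paths reclaim can take here. */
enum reclaim_action {
	ACTION_CONTINUE,	/* not poisoned: reclaim proceeds as usual */
	ACTION_KEEP_LOCKED,	/* poisoned large folio: skip it ("goto keep_locked") */
	ACTION_UNMAP_POISON,	/* poisoned small folio: unmap_poisoned_folio() path */
};

/* Mirrors the order of the checks in the patched hunk. */
static enum reclaim_action classify_folio(const struct model_folio *f)
{
	if (!f->hwpoisoned)
		return ACTION_CONTINUE;

	/* The new check: a poisoned large folio is left alone. */
	if (f->large)
		return ACTION_KEEP_LOCKED;

	return ACTION_UNMAP_POISON;
}
```

In this model only a poisoned small folio ever reaches the unmap path, which
is the invariant the new comment on unmap_poisoned_folio() asks callers to
guarantee (the hugetlb exception mentioned there is not modelled).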
[tujinjiang@huawei.com: add comment to unmap_poisoned_folio()]
Link: https://lkml.kernel.org/r/69fd4e00-1b13-d5f7-1c82-705c7d977ea4@huawei.com
Link: https://lkml.kernel.org/r/20250627125747.3094074-2-tujinjiang@huawei.com
Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>
Fixes: 1b0449544c64 ("mm/vmscan: don't try to reclaim hwpoison folio")
Reported-by: syzbot+3b220254df55d8ca8a61@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/68412d57.050a0220.2461cf.000e.GAE@google.com/
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Acked-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
mm/memory-failure.c | 4 ++++
mm/vmscan.c | 8 ++++++++
2 files changed, 12 insertions(+)
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1561,6 +1561,10 @@ static int get_hwpoison_page(struct page
 	return ret;
 }
 
+/*
+ * The caller must guarantee the folio isn't large folio, except hugetlb.
+ * try_to_unmap() can't handle it.
+ */
 int unmap_poisoned_folio(struct folio *folio, unsigned long pfn, bool must_kill)
 {
 	enum ttu_flags ttu = TTU_IGNORE_MLOCK | TTU_SYNC | TTU_HWPOISON;
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1128,6 +1128,14 @@ retry:
 			goto keep;
 
 		if (folio_contain_hwpoisoned_page(folio)) {
+			/*
+			 * unmap_poisoned_folio() can't handle large
+			 * folio, just skip it. memory_failure() will
+			 * handle it if the UCE is triggered again.
+			 */
+			if (folio_test_large(folio))
+				goto keep_locked;
+
 			unmap_poisoned_folio(folio, folio_pfn(folio), false);
 			folio_unlock(folio);
 			folio_put(folio);