patches.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	patches@lists.linux.dev, Miaohe Lin <linmiaohe@huawei.com>,
	David Hildenbrand <david@redhat.com>,
	Naoya Horiguchi <nao.horiguchi@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Sasha Levin <sashal@kernel.org>
Subject: [PATCH 5.4 30/81] mm/memory-failure: fix VM_BUG_ON_PAGE(PagePoisoned(page)) when unpoison memory
Date: Tue, 30 Sep 2025 16:46:32 +0200	[thread overview]
Message-ID: <20250930143820.928129957@linuxfoundation.org> (raw)
In-Reply-To: <20250930143819.654157320@linuxfoundation.org>

5.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Miaohe Lin <linmiaohe@huawei.com>

[ Upstream commit d613f53c83ec47089c4e25859d5e8e0359f6f8da ]

When I did memory failure tests, below panic occurs:

page dumped because: VM_BUG_ON_PAGE(PagePoisoned(page))
kernel BUG at include/linux/page-flags.h:616!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 3 PID: 720 Comm: bash Not tainted 6.10.0-rc1-00195-g148743902568 #40
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Call Trace:
 <TASK>
 unpoison_memory+0x2f3/0x590
 simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110
 debugfs_attr_write+0x42/0x60
 full_proxy_write+0x5b/0x80
 vfs_write+0xd5/0x540
 ksys_write+0x64/0xe0
 do_syscall_64+0xb9/0x1d0
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f08f0314887
RSP: 002b:00007ffece710078 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000009 RCX: 00007f08f0314887
RDX: 0000000000000009 RSI: 0000564787a30410 RDI: 0000000000000001
RBP: 0000564787a30410 R08: 000000000000fefe R09: 000000007fffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000009
R13: 00007f08f041b780 R14: 00007f08f0417600 R15: 00007f08f0416a00
 </TASK>
Modules linked in: hwpoison_inject
---[ end trace 0000000000000000 ]---
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Kernel panic - not syncing: Fatal exception
Kernel Offset: 0x31c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
---[ end Kernel panic - not syncing: Fatal exception ]---

The root cause is that unpoison_memory() tries to check the PG_HWPoison
flags of an uninitialized page.  So VM_BUG_ON_PAGE(PagePoisoned(page)) is
triggered.  This can be reproduced by below steps:

1.Offline memory block:

 echo offline > /sys/devices/system/memory/memory12/state

2.Get offlined memory pfn:

 page-types -b n -rlN

3.Write pfn to unpoison-pfn

 echo <pfn> > /sys/kernel/debug/hwpoison/unpoison-pfn

This scenario can be identified by pfn_to_online_page() returning NULL.
And ZONE_DEVICE pages are never expected, so we can simply fail if
pfn_to_online_page() == NULL to fix the bug.

Link: https://lkml.kernel.org/r/20250828024618.1744895-1-linmiaohe@huawei.com
Fixes: f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to zones until online")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Suggested-by: David Hildenbrand <david@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[ Adjust context ]
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 mm/memory-failure.c |    7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1543,10 +1543,9 @@ int unpoison_memory(unsigned long pfn)
 	static DEFINE_RATELIMIT_STATE(unpoison_rs, DEFAULT_RATELIMIT_INTERVAL,
 					DEFAULT_RATELIMIT_BURST);
 
-	if (!pfn_valid(pfn))
-		return -ENXIO;
-
-	p = pfn_to_page(pfn);
+	p = pfn_to_online_page(pfn);
+	if (!p)
+		return -EIO;
 	page = compound_head(p);
 
 	if (!PageHWPoison(p)) {



  parent reply	other threads:[~2025-09-30 14:52 UTC|newest]

Thread overview: 88+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-09-30 14:46 [PATCH 5.4 00/81] 5.4.300-rc1 review Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 01/81] usb: hub: Fix flushing of delayed work used for post resume purposes Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 02/81] net: Fix null-ptr-deref by sock_lock_init_class_and_name() and rmmod Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 03/81] NFSv4: Dont clear capabilities that wont be reset Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 04/81] tcp_bpf: Call sk_msg_free() when tcp_bpf_send_verdict() fails to allocate psock->cork Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 05/81] EDAC/altera: Delete an inappropriate dma_free_coherent() call Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 06/81] ocfs2: fix recursive semaphore deadlock in fiemap call Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 07/81] mtd: rawnand: stm32_fmc2: fix ECC overwrite Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 08/81] fuse: check if copy_file_range() returns larger than requested size Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 09/81] fuse: prevent overflow in copy_file_range return value Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 10/81] mm/khugepaged: fix the address passed to notifier on testing young Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 11/81] mtd: rawnand: stm32_fmc2: avoid overlapping mappings on ECC buffer Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 12/81] mtd: nand: raw: atmel: Fix comment in timings preparation Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 13/81] mtd: nand: raw: atmel: Respect tAR, tCLR in read setup timing Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 14/81] tty: hvc_console: Call hvc_kick in hvc_write unconditionally Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 15/81] USB: serial: option: add Telit Cinterion FN990A w/audio compositions Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 16/81] USB: serial: option: add Telit Cinterion LE910C4-WWX new compositions Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 17/81] net: fec: Fix possible NPD in fec_enet_phy_reset_after_clk_enable() Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 18/81] igb: fix link test skipping when interface is admin down Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 19/81] genirq/affinity: Add irq_update_affinity_desc() Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 20/81] genirq: Export affinity setter for modules Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 21/81] genirq: Provide new interfaces for affinity hints Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 22/81] i40e: Use irq_update_affinity_hint() Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 23/81] i40e: fix IRQ freeing in i40e_vsi_request_irq_msix error path Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 24/81] can: j1939: j1939_sk_bind(): call j1939_priv_put() immediately when j1939_local_ecu_get() failed Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 25/81] can: j1939: j1939_local_ecu_get(): undo increment when j1939_local_ecu_get() fails Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 26/81] dmaengine: ti: edma: Fix memory allocation size for queue_priority_map Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 27/81] dmaengine: qcom: bam_dma: Fix DT error handling for num-channels/ees Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 28/81] phy: ti-pipe3: fix device leak at unbind Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 29/81] soc: qcom: mdt_loader: Deal with zero e_shentsize Greg Kroah-Hartman
2025-09-30 14:46 ` Greg Kroah-Hartman [this message]
2025-09-30 14:46 ` [PATCH 5.4 31/81] ALSA: firewire-motu: drop EPOLLOUT from poll return values as write is not supported Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 32/81] wifi: mac80211: fix incorrect type for ret Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 33/81] pcmcia: omap_cf: Mark driver struct with __refdata to prevent section mismatch Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 34/81] cgroup: split cgroup_destroy_wq into 3 workqueues Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 35/81] net: natsemi: fix `rx_dropped` double accounting on `netif_rx()` failure Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 36/81] i40e: remove redundant memory barrier when cleaning Tx descs Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 37/81] tcp: Clear tcp_sk(sk)->fastopen_rsk in tcp_disconnect() Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 38/81] Revert "net/mlx5e: Update and set Xon/Xoff upon port speed set" Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 39/81] net: liquidio: fix overflow in octeon_init_instr_queue() Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 40/81] cnic: Fix use-after-free bugs in cnic_delete_task Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 41/81] nilfs2: fix CFI failure when accessing /sys/fs/nilfs2/features/* Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 42/81] power: supply: bq27xxx: fix error return in case of no bq27000 hdq battery Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 43/81] power: supply: bq27xxx: restrict no-battery detection to bq27000 Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 44/81] mmc: mvsdio: Fix dma_unmap_sg() nents value Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 45/81] rds: ib: Increment i_fastreg_wrs before bailing out Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 46/81] ASoC: wm8940: Correct typo in control name Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 47/81] ASoC: wm8974: Correct PLL rate rounding Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 48/81] ASoC: SOF: Intel: hda-stream: Fix incorrect variable used in error message Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 49/81] usb: gadget: dummy_hcd: remove usage of list iterator past the loop body Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 50/81] USB: gadget: dummy-hcd: Fix locking bug in RT-enabled kernels Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 51/81] serial: sc16is7xx: fix bug in flow control levels init Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 52/81] net: rfkill: gpio: add DT support Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 53/81] net: rfkill: gpio: Fix crash due to dereferencering uninitialized pointer Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 54/81] KVM: SVM: Sync TPR from LAPIC into VMCB::V_TPR even if AVIC is active Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 55/81] ALSA: usb-audio: Fix block comments in mixer_quirks Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 56/81] ALSA: usb-audio: Avoid multiple assignments " Greg Kroah-Hartman
2025-09-30 14:46 ` [PATCH 5.4 57/81] ALSA: usb-audio: Simplify NULL comparison " Greg Kroah-Hartman
2025-09-30 14:47 ` [PATCH 5.4 58/81] ALSA: usb-audio: Remove unneeded wmb() " Greg Kroah-Hartman
2025-09-30 14:47 ` [PATCH 5.4 59/81] ALSA: usb-audio: Add mixer quirk for Sony DualSense PS5 Greg Kroah-Hartman
2025-09-30 14:47 ` [PATCH 5.4 60/81] ALSA: usb-audio: Convert comma to semicolon Greg Kroah-Hartman
2025-09-30 14:47 ` [PATCH 5.4 61/81] ALSA: usb-audio: Fix build with CONFIG_INPUT=n Greg Kroah-Hartman
2025-09-30 14:47 ` [PATCH 5.4 62/81] usb: core: Add 0x prefix to quirks debug output Greg Kroah-Hartman
2025-09-30 14:47 ` [PATCH 5.4 63/81] IB/mlx5: Fix obj_type mismatch for SRQ event subscriptions Greg Kroah-Hartman
2025-09-30 14:47 ` [PATCH 5.4 64/81] can: rcar_can: rcar_can_resume(): fix s2ram with PSCI Greg Kroah-Hartman
2025-09-30 14:47 ` [PATCH 5.4 65/81] can: hi311x: populate ndo_change_mtu() to prevent buffer overflow Greg Kroah-Hartman
2025-09-30 14:47 ` [PATCH 5.4 66/81] can: sun4i_can: " Greg Kroah-Hartman
2025-09-30 14:47 ` [PATCH 5.4 67/81] can: mcba_usb: " Greg Kroah-Hartman
2025-09-30 14:47 ` [PATCH 5.4 68/81] can: peak_usb: fix shift-out-of-bounds issue Greg Kroah-Hartman
2025-09-30 14:47 ` [PATCH 5.4 69/81] drm/gma500: Fix null dereference in hdmi teardown Greg Kroah-Hartman
2025-09-30 14:47 ` [PATCH 5.4 70/81] i40e: fix idx validation in i40e_validate_queue_map Greg Kroah-Hartman
2025-09-30 14:47 ` [PATCH 5.4 71/81] i40e: fix input validation logic for action_meta Greg Kroah-Hartman
2025-09-30 14:47 ` [PATCH 5.4 72/81] i40e: add max boundary check for VF filters Greg Kroah-Hartman
2025-09-30 14:47 ` [PATCH 5.4 73/81] fbcon: fix integer overflow in fbcon_do_set_font Greg Kroah-Hartman
2025-09-30 14:47 ` [PATCH 5.4 74/81] fbcon: Fix OOB access in font allocation Greg Kroah-Hartman
2025-09-30 14:47 ` [PATCH 5.4 75/81] mm/migrate_device: dont add folio to be freed to LRU in migrate_device_finalize() Greg Kroah-Hartman
2025-09-30 14:47 ` [PATCH 5.4 76/81] i40e: increase max descriptors for XL710 Greg Kroah-Hartman
2025-09-30 14:47 ` [PATCH 5.4 77/81] i40e: add validation for ring_len param Greg Kroah-Hartman
2025-09-30 14:47 ` [PATCH 5.4 78/81] i40e: fix idx validation in config queues msg Greg Kroah-Hartman
2025-09-30 14:47 ` [PATCH 5.4 79/81] i40e: fix validation of VF state in get resources Greg Kroah-Hartman
2025-09-30 14:47 ` [PATCH 5.4 80/81] i40e: add mask to apply valid bits for itr_idx Greg Kroah-Hartman
2025-09-30 14:47 ` [PATCH 5.4 81/81] mm/hugetlb: fix folio is still mapped when deleted Greg Kroah-Hartman
2025-09-30 17:06 ` [PATCH 5.4 00/81] 5.4.300-rc1 review Florian Fainelli
2025-09-30 18:52 ` Brett A C Sheffield
2025-10-01  9:11 ` [PATCH 5.4 00/81] " Jon Hunter
2025-10-01 12:07 ` Naresh Kamboju
2025-10-01 13:37 ` [External] : " ALOK TIWARI
2025-10-01 16:21 ` Shuah Khan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250930143820.928129957@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=akpm@linux-foundation.org \
    --cc=david@redhat.com \
    --cc=linmiaohe@huawei.com \
    --cc=nao.horiguchi@gmail.com \
    --cc=patches@lists.linux.dev \
    --cc=sashal@kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).