patches.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	patches@lists.linux.dev, Jann Horn <jannh@google.com>,
	Zach OKeefe <zokeefe@google.com>,
	"Kirill A. Shutemov" <kirill.shutemov@intel.linux.com>,
	Yang Shi <shy828301@gmail.com>,
	David Hildenbrand <david@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Bjoern Doebel <doebel@amazon.de>
Subject: [PATCH 5.15 38/64] mm/khugepaged: fix ->anon_vma race
Date: Sun,  7 Sep 2025 21:58:20 +0200	[thread overview]
Message-ID: <20250907195604.457007923@linuxfoundation.org> (raw)
In-Reply-To: <20250907195603.394640159@linuxfoundation.org>

5.15-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jann Horn <jannh@google.com>

commit 023f47a8250c6bdb4aebe744db4bf7f73414028b upstream.

If an ->anon_vma is attached to the VMA, collapse_and_free_pmd() requires
it to be locked.

Page table traversal is allowed under any one of the mmap lock, the
anon_vma lock (if the VMA is associated with an anon_vma), and the
mapping lock (if the VMA is associated with a mapping); and so to be
able to remove page tables, we must hold all three of them.
retract_page_tables() bails out if an ->anon_vma is attached, but does
this check before holding the mmap lock (as the comment above the check
explains).

If we racily merged an existing ->anon_vma (shared with a child
process) from a neighboring VMA, subsequent rmap traversals on pages
belonging to the child will be able to see the page tables that we are
concurrently removing while assuming that nothing else can access them.

Repeat the ->anon_vma check once we hold the mmap lock to ensure that
there really is no concurrent page table access.

Hitting this bug causes a lockdep warning in collapse_and_free_pmd(),
in the line "lockdep_assert_held_write(&vma->anon_vma->root->rwsem)".
It can also lead to use-after-free access.

Link: https://lore.kernel.org/linux-mm/CAG48ez3434wZBKFFbdx4M9j6eUwSUVPd4dxhzW_k_POneSDF+A@mail.gmail.com/
Link: https://lkml.kernel.org/r/20230111133351.807024-1-jannh@google.com
Fixes: f3f0e1d2150b ("khugepaged: add support of collapse for tmpfs/shmem pages")
Signed-off-by: Jann Horn <jannh@google.com>
Reported-by: Zach O'Keefe <zokeefe@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@intel.linux.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[doebel@amazon.de: Kernel 5.15 uses a different control flow pattern,
 context adjustments.]
Signed-off-by: Bjoern Doebel <doebel@amazon.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 mm/khugepaged.c |   15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1609,7 +1609,7 @@ static void retract_page_tables(struct a
 		 * has higher cost too. It would also probably require locking
 		 * the anon_vma.
 		 */
-		if (vma->anon_vma)
+		if (READ_ONCE(vma->anon_vma))
 			continue;
 		addr = vma->vm_start + ((pgoff - vma->vm_pgoff) << PAGE_SHIFT);
 		if (addr & ~HPAGE_PMD_MASK)
@@ -1631,6 +1631,19 @@ static void retract_page_tables(struct a
 			if (!khugepaged_test_exit(mm)) {
 				struct mmu_notifier_range range;
 
+				/*
+				 * Re-check whether we have an ->anon_vma, because
+				 * collapse_and_free_pmd() requires that either no
+				 * ->anon_vma exists or the anon_vma is locked.
+				 * We already checked ->anon_vma above, but that check
+				 * is racy because ->anon_vma can be populated under the
+				 * mmap lock in read mode.
+				 */
+				if (vma->anon_vma) {
+					mmap_write_unlock(mm);
+					continue;
+				}
+
 				mmu_notifier_range_init(&range,
 							MMU_NOTIFY_CLEAR, 0,
 							NULL, mm, addr,



  parent reply	other threads:[~2025-09-07 20:13 UTC|newest]

Thread overview: 77+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-09-07 19:57 [PATCH 5.15 00/64] 5.15.192-rc1 review Greg Kroah-Hartman
2025-09-07 19:57 ` [PATCH 5.15 01/64] bpf: Add cookie object to bpf maps Greg Kroah-Hartman
2025-09-07 19:57 ` [PATCH 5.15 02/64] bpf: Move cgroup iterator helpers to bpf.h Greg Kroah-Hartman
2025-09-07 19:57 ` [PATCH 5.15 03/64] bpf: Move bpf map owner out of common struct Greg Kroah-Hartman
2025-09-07 19:57 ` [PATCH 5.15 04/64] bpf: Fix oob access in cgroup local storage Greg Kroah-Hartman
2025-09-07 19:57 ` [PATCH 5.15 05/64] drm/amd/display: Dont warn when missing DCE encoder caps Greg Kroah-Hartman
2025-09-07 19:57 ` [PATCH 5.15 06/64] fs: writeback: fix use-after-free in __mark_inode_dirty() Greg Kroah-Hartman
2025-09-07 19:57 ` [PATCH 5.15 07/64] tee: fix NULL pointer dereference in tee_shm_put Greg Kroah-Hartman
2025-09-07 19:57 ` [PATCH 5.15 08/64] arm64: dts: rockchip: Add vcc-supply to SPI flash on rk3399-pinebook-pro Greg Kroah-Hartman
2025-09-07 19:57 ` [PATCH 5.15 09/64] wifi: cfg80211: fix use-after-free in cmp_bss() Greg Kroah-Hartman
2025-09-07 19:57 ` [PATCH 5.15 10/64] netfilter: br_netfilter: do not check confirmed bit in br_nf_local_in() after confirm Greg Kroah-Hartman
2025-09-07 19:57 ` [PATCH 5.15 11/64] netfilter: conntrack: helper: Replace -EEXIST by -EBUSY Greg Kroah-Hartman
2025-09-07 19:57 ` [PATCH 5.15 12/64] Bluetooth: Fix use-after-free in l2cap_sock_cleanup_listen() Greg Kroah-Hartman
2025-09-07 19:57 ` [PATCH 5.15 13/64] xirc2ps_cs: fix register access when enabling FullDuplex Greg Kroah-Hartman
2025-09-07 19:57 ` [PATCH 5.15 14/64] mISDN: Fix memory leak in dsp_hwec_enable() Greg Kroah-Hartman
2025-09-07 19:57 ` [PATCH 5.15 15/64] icmp: fix icmp_ndo_send address translation for reply direction Greg Kroah-Hartman
2025-09-07 19:57 ` [PATCH 5.15 16/64] i40e: Fix potential invalid access when MAC list is empty Greg Kroah-Hartman
2025-09-07 19:57 ` [PATCH 5.15 17/64] net: ethernet: mtk_eth_soc: fix tx vlan tag for llc packets Greg Kroah-Hartman
2025-09-07 19:58 ` [PATCH 5.15 18/64] wifi: cw1200: cap SSID length in cw1200_do_join() Greg Kroah-Hartman
2025-09-07 19:58 ` [PATCH 5.15 19/64] wifi: libertas: cap SSID len in lbs_associate() Greg Kroah-Hartman
2025-09-07 19:58 ` [PATCH 5.15 20/64] net: thunder_bgx: add a missing of_node_put Greg Kroah-Hartman
2025-09-07 19:58 ` [PATCH 5.15 21/64] net: thunder_bgx: decrement cleanup index before use Greg Kroah-Hartman
2025-09-07 19:58 ` [PATCH 5.15 22/64] ipv4: Fix NULL vs error pointer check in inet_blackhole_dev_init() Greg Kroah-Hartman
2025-09-07 19:58 ` [PATCH 5.15 23/64] ax25: properly unshare skbs in ax25_kiss_rcv() Greg Kroah-Hartman
2025-09-07 19:58 ` [PATCH 5.15 24/64] net: atm: fix memory leak in atm_register_sysfs when device_register fail Greg Kroah-Hartman
2025-09-07 19:58 ` [PATCH 5.15 25/64] ppp: fix memory leak in pad_compress_skb Greg Kroah-Hartman
2025-09-07 19:58 ` [PATCH 5.15 26/64] ptp: Add generic PTP is_sync() function Greg Kroah-Hartman
2025-09-07 19:58 ` [PATCH 5.15 27/64] net: phy: mscc: Fix memory leak when using one step timestamping Greg Kroah-Hartman
2025-09-07 19:58 ` [PATCH 5.15 28/64] phy: mscc: Stop taking ts_lock for tx_queue and use its own lock Greg Kroah-Hartman
2025-09-07 19:58 ` [PATCH 5.15 29/64] ALSA: usb-audio: Add mute TLV for playback volumes on some devices Greg Kroah-Hartman
2025-09-07 19:58 ` [PATCH 5.15 30/64] pcmcia: Fix a NULL pointer dereference in __iodyn_find_io_region() Greg Kroah-Hartman
2025-09-07 19:58 ` [PATCH 5.15 31/64] x86/mm/64: define ARCH_PAGE_TABLE_SYNC_MASK and arch_sync_kernel_mappings() Greg Kroah-Hartman
2025-09-07 19:58 ` [PATCH 5.15 32/64] mm: move page table sync declarations to linux/pgtable.h Greg Kroah-Hartman
2025-09-07 19:58 ` [PATCH 5.15 33/64] wifi: mwifiex: Initialize the chan_stats array to zero Greg Kroah-Hartman
2025-09-07 19:58 ` [PATCH 5.15 34/64] drm/amdgpu: drop hw access in non-DC audio fini Greg Kroah-Hartman
2025-09-07 19:58 ` [PATCH 5.15 35/64] scsi: lpfc: Fix buffer free/clear order in deferred receive path Greg Kroah-Hartman
2025-09-07 19:58 ` [PATCH 5.15 36/64] batman-adv: fix OOB read/write in network-coding decode Greg Kroah-Hartman
2025-09-07 19:58 ` [PATCH 5.15 37/64] e1000e: fix heap overflow in e1000_set_eeprom Greg Kroah-Hartman
2025-09-07 19:58 ` Greg Kroah-Hartman [this message]
2025-09-07 19:58 ` [PATCH 5.15 39/64] cpufreq/sched: Explicitly synchronize limits_changed flag handling Greg Kroah-Hartman
2025-09-07 19:58 ` [PATCH 5.15 40/64] KVM: x86: Take irqfds.lock when adding/deleting IRQ bypass producer Greg Kroah-Hartman
2025-09-07 19:58 ` [PATCH 5.15 41/64] spi: tegra114: Remove unnecessary NULL-pointer checks Greg Kroah-Hartman
2025-09-07 19:58 ` [PATCH 5.15 42/64] spi: tegra114: Dont fail set_cs_timing when delays are zero Greg Kroah-Hartman
2025-09-07 19:58 ` [PATCH 5.15 43/64] iio: chemical: pms7003: use aligned_s64 for timestamp Greg Kroah-Hartman
2025-09-07 19:58 ` [PATCH 5.15 44/64] iio: light: opt3001: fix deadlock due to concurrent flag access Greg Kroah-Hartman
2025-09-07 19:58 ` [PATCH 5.15 45/64] gpio: pca953x: fix IRQ storm on system wake up Greg Kroah-Hartman
2025-09-07 19:58 ` [PATCH 5.15 46/64] dma-buf: insert memory barrier before updating num_fences Greg Kroah-Hartman
2025-09-07 19:58 ` [PATCH 5.15 47/64] dmaengine: mediatek: Fix a possible deadlock error in mtk_cqdma_tx_status() Greg Kroah-Hartman
2025-09-07 19:58 ` [PATCH 5.15 48/64] net: dsa: microchip: update tag_ksz masks for KSZ9477 family Greg Kroah-Hartman
2025-09-07 19:58 ` [PATCH 5.15 49/64] net: dsa: microchip: linearize skb for tail-tagging switches Greg Kroah-Hartman
2025-09-07 19:58 ` [PATCH 5.15 50/64] vmxnet3: update MTU after device quiesce Greg Kroah-Hartman
2025-09-07 19:58 ` [PATCH 5.15 51/64] arm64: dts: marvell: uDPU: define pinctrl state for alarm LEDs Greg Kroah-Hartman
2025-09-07 19:58 ` [PATCH 5.15 52/64] randstruct: gcc-plugin: Remove bogus void member Greg Kroah-Hartman
2025-09-07 19:58 ` [PATCH 5.15 53/64] randstruct: gcc-plugin: Fix attribute addition Greg Kroah-Hartman
2025-09-07 19:58 ` [PATCH 5.15 54/64] mm/slub: avoid accessing metadata when pointer is invalid in object_err() Greg Kroah-Hartman
2025-09-07 19:58 ` [PATCH 5.15 55/64] ALSA: hda/hdmi: Add pin fix for another HP EliteDesk 800 G4 model Greg Kroah-Hartman
2025-09-07 19:58 ` [PATCH 5.15 56/64] pcmcia: Add error handling for add_interval() in do_validate_mem() Greg Kroah-Hartman
2025-09-07 19:58 ` [PATCH 5.15 57/64] spi: spi-fsl-lpspi: Fix transmissions when using CONT Greg Kroah-Hartman
2025-09-07 19:58 ` [PATCH 5.15 58/64] spi: spi-fsl-lpspi: Set correct chip-select polarity bit Greg Kroah-Hartman
2025-09-07 19:58 ` [PATCH 5.15 59/64] spi: spi-fsl-lpspi: Reset FIFO and disable module on transfer abort Greg Kroah-Hartman
2025-09-07 19:58 ` [PATCH 5.15 60/64] drm/bridge: ti-sn65dsi86: fix REFCLK setting Greg Kroah-Hartman
2025-09-07 19:58 ` [PATCH 5.15 61/64] perf bpf-event: Fix use-after-free in synthesis Greg Kroah-Hartman
2025-09-07 19:58 ` [PATCH 5.15 62/64] clk: qcom: gdsc: Set retain_ff before moving to HW CTRL Greg Kroah-Hartman
2025-09-07 19:58 ` [PATCH 5.15 63/64] spi: tegra114: Use value to check for invalid delays Greg Kroah-Hartman
2025-09-07 19:58 ` [PATCH 5.15 64/64] dmaengine: mediatek: Fix a flag reuse error in mtk_cqdma_tx_status() Greg Kroah-Hartman
2025-09-08  2:35 ` [PATCH 5.15 00/64] 5.15.192-rc1 review Florian Fainelli
2025-09-08  9:27 ` Brett A C Sheffield
2025-09-08 15:01 ` Jon Hunter
2025-09-08 18:24 ` Naresh Kamboju
2025-09-09 10:29   ` Greg Kroah-Hartman
2025-09-09 14:18     ` Naresh Kamboju
2025-09-09 14:37       ` Greg Kroah-Hartman
2025-09-08 22:52 ` Shuah Khan
2025-09-09  6:14 ` Ron Economos
2025-09-09 14:10 ` Vijayendra Suman
2025-09-17  8:03   ` Pavel Machek
2025-09-09 17:36 ` Hardik Garg

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250907195604.457007923@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=akpm@linux-foundation.org \
    --cc=david@redhat.com \
    --cc=doebel@amazon.de \
    --cc=jannh@google.com \
    --cc=kirill.shutemov@intel.linux.com \
    --cc=patches@lists.linux.dev \
    --cc=shy828301@gmail.com \
    --cc=stable@vger.kernel.org \
    --cc=zokeefe@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).