From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 207B42BE7AC
	for ; Fri, 20 Jun 2025 15:11:41 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201
ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1750432302; cv=none; b=ERIAJ9Thl8lOoNyA89mLOaBJgrdMMp0DBBqlZe8/PfVYubutXWh57ugIbxuuUkFcHm6zGMFI/wPDaCYTFXTl/b9F/4K+3n8XpoFi/8/TDiB7ebYvNmz6VT1kBnGLmV+zUf7BBpX1Fl2BqgdczAh3mk2afWrp6JXbFW4n1B5HCB8=
ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1750432302; c=relaxed/simple;
	bh=nrO9GX5YgbHVI3tjaekCl7w0+DtLxxv7yPyJBqIPGsM=;
	h=Subject:To:Cc:From:Date:Message-ID:MIME-Version:Content-Type; b=uj+lTRobm1zdUwYzwR7cCLOZaCP5lXYL73dcCce0yknlbcbUHeoxYsgsvgMqa54bdbX5bsrU6XiG5K9uS/p7JWVv6fgarS405SVOJJ/y8QA7ReO+znHQZrXaqijiGBnd7Yu/PxuyTpC2Xkw0xhrtSyC9cPRRJgIhYW6mvzhGUZw=
ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linuxfoundation.org header.i=@linuxfoundation.org header.b=XlxGJ3m5; arc=none smtp.client-ip=10.30.226.201
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (1024-bit key) header.d=linuxfoundation.org header.i=@linuxfoundation.org header.b="XlxGJ3m5"
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 31FC9C4CEEF;
	Fri, 20 Jun 2025 15:11:41 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org;
	s=korg; t=1750432301;
	bh=nrO9GX5YgbHVI3tjaekCl7w0+DtLxxv7yPyJBqIPGsM=;
	h=Subject:To:Cc:From:Date:From;
	b=XlxGJ3m5oZ+dwJ+NFXter8r/K50A+8KYyEygLRJK9jmW1HUxhGPDAt4JZsKAMashk
	 jt+J5zPDsPF82+iyf4TmKfmJ2Ow8DIf7o8A6ClZPBGAVhRNMSs/T44XjZKV5k+RA0T
	 KwHXFoK9svrPsz/EqckUa8enhjE903Yrd0SKCaFU=
Subject: FAILED: patch "[PATCH] mm: close theoretical race where stale TLB entries
 could" failed to apply to 5.4-stable tree
To: ryan.roberts@arm.com,akpm@linux-foundation.org,david@redhat.com,jannh@google.com,liam.howlett@oracle.com,lorenzo.stoakes@oracle.com,mgorman@suse.de,stable@vger.kernel.org,vbabka@suse.cz
Cc:
From:
Date: Fri, 20 Jun 2025 17:11:29 +0200
Message-ID: <2025062029-saturday-conical-0eae@gregkh>
Precedence: bulk
X-Mailing-List: stable@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Type: text/plain; charset=ANSI_X3.4-1968
Content-Transfer-Encoding: 8bit


The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to .

To reproduce the conflict and resubmit, you may use the following commands:

git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 383c4613c67c26e90e8eebb72e3083457d02033f
#
git commit -s
git send-email --to '' --in-reply-to '2025062029-saturday-conical-0eae@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..

Possible dependencies:



thanks,

greg k-h

------------------ original commit in Linus's tree ------------------

>From 383c4613c67c26e90e8eebb72e3083457d02033f Mon Sep 17 00:00:00 2001
From: Ryan Roberts
Date: Fri, 6 Jun 2025 10:28:07 +0100
Subject: [PATCH] mm: close theoretical race where stale TLB entries could
 linger

Commit 3ea277194daa ("mm, mprotect: flush TLB if potentially racing with a
parallel reclaim leaving stale TLB entries") described a theoretical race
as such:

"""
Nadav Amit identified a theoretical race between page reclaim and mprotect
due to TLB flushes being batched outside of the PTL being held.
He described the race as follows:

        CPU0                            CPU1
        ----                            ----
                                        user accesses memory using RW PTE
                                        [PTE now cached in TLB]
        try_to_unmap_one()
        ==> ptep_get_and_clear()
        ==> set_tlb_ubc_flush_pending()
                                        mprotect(addr, PROT_READ)
                                        ==> change_pte_range()
                                        ==> [ PTE non-present - no flush ]

                                        user writes using cached RW PTE
        ...

        try_to_unmap_flush()

The same type of race exists for reads when protecting for PROT_NONE and
also exists for operations that can leave an old TLB entry behind such as
munmap, mremap and madvise.
"""

The solution was to introduce flush_tlb_batched_pending() and call it under
the PTL from mprotect/madvise/munmap/mremap to complete any pending TLB
flushes.

However, while madvise_free_pte_range() and
madvise_cold_or_pageout_pte_range() were both retro-fitted to call
flush_tlb_batched_pending() immediately after initially acquiring the PTL,
they both temporarily release the PTL to split a large folio if they
stumble upon one.  In this case, when re-acquiring the PTL,
flush_tlb_batched_pending() must be called again, but it previously was
not.  Let's fix that.

There are 2 Fixes: tags here: the first is the commit that fixed
madvise_free_pte_range().  The second is the commit that added
madvise_cold_or_pageout_pte_range(), which looks like it copy/pasted the
faulty pattern from madvise_free_pte_range().

This is a theoretical bug discovered during code review.

Link: https://lkml.kernel.org/r/20250606092809.4194056-1-ryan.roberts@arm.com
Fixes: 3ea277194daa ("mm, mprotect: flush TLB if potentially racing with a parallel reclaim leaving stale TLB entries")
Fixes: 9c276cc65a58 ("mm: introduce MADV_COLD")
Signed-off-by: Ryan Roberts
Reviewed-by: Jann Horn
Acked-by: David Hildenbrand
Cc: Liam Howlett
Cc: Lorenzo Stoakes
Cc: Mel Gorman
Cc: Vlastimil Babka
Cc:
Signed-off-by: Andrew Morton

diff --git a/mm/madvise.c b/mm/madvise.c
index 5f7a66a1617e..1d44a35ae85c 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -508,6 +508,7 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
 					pte_offset_map_lock(mm, pmd, addr, &ptl);
 				if (!start_pte)
 					break;
+				flush_tlb_batched_pending(mm);
 				arch_enter_lazy_mmu_mode();
 				if (!err)
 					nr = 0;
@@ -741,6 +742,7 @@ static int madvise_free_pte_range(pmd_t *pmd, unsigned long addr,
 				start_pte = pte;
 				if (!start_pte)
 					break;
+				flush_tlb_batched_pending(mm);
 				arch_enter_lazy_mmu_mode();
 				if (!err)
 					nr = 0;
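
For anyone preparing the backport, here is a rough user-space sketch of the
locking pattern the commit message describes: flush under the PTL, drop the
PTL to split a large folio, then flush again after re-taking it. This is only
a model under stated assumptions. struct mm_model and every model_*() helper
are hypothetical names invented for illustration; none of them are kernel
APIs, and the real flush_tlb_batched_pending() does considerably more than
clear a flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the state involved; not a kernel structure. */
struct mm_model {
	bool ptl_held;		/* stands in for the page table lock */
	bool flush_pending;	/* reclaim cleared a PTE and deferred the TLB flush */
	int flushes_done;	/* how many deferred flushes were completed */
};

static void model_lock(struct mm_model *mm)
{
	assert(!mm->ptl_held);
	mm->ptl_held = true;
}

static void model_unlock(struct mm_model *mm)
{
	assert(mm->ptl_held);
	mm->ptl_held = false;
}

/* Stand-in for flush_tlb_batched_pending(): must run with the lock held. */
static void model_flush_batched_pending(struct mm_model *mm)
{
	assert(mm->ptl_held);
	if (mm->flush_pending) {
		mm->flush_pending = false;
		mm->flushes_done++;
	}
}

/* Stand-in for reclaim on another CPU batching a flush outside the lock. */
static void model_reclaim_defers_flush(struct mm_model *mm)
{
	mm->flush_pending = true;
}

/*
 * Models one pass of the madvise PTE walk that hits a large folio.
 * Returns true if a deferred flush is still outstanding at the point
 * where the walk resumes touching PTEs, i.e. the window the patch closes.
 */
static bool model_walk(struct mm_model *mm, bool flush_after_relock)
{
	bool stale;

	model_lock(mm);
	model_flush_batched_pending(mm);	/* the pre-existing call */

	/* Large folio found: the lock is dropped while it is split. */
	model_unlock(mm);
	model_reclaim_defers_flush(mm);		/* may happen while unlocked */
	model_lock(mm);

	if (flush_after_relock)
		model_flush_batched_pending(mm);	/* the call the patch adds */

	stale = mm->flush_pending;
	model_unlock(mm);
	return stale;
}
```

In this model, model_walk(&mm, false) on a zero-initialised struct mm_model
returns true (a deferred flush is still pending when the walk resumes), while
model_walk(&mm, true) returns false and leaves flushes_done at 1. That is the
whole point of the one-line additions in the diff above; whether the 5.4.y
code has the same unlock/relock window in both functions is something the
backporter would need to check against that tree.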