From: Mike Kravetz <mike.kravetz@oracle.com>
To: gregkh@linuxfoundation.org
Cc: akpm@linux-foundation.org, almasrymina@google.com,
axelrasmussen@google.com, david@redhat.com,
harperchen1110@gmail.com, nadav.amit@gmail.com,
naoya.horiguchi@linux.dev, peterx@redhat.com, riel@surriel.com,
stable@vger.kernel.org, vbabka@suse.cz, willy@infradead.org
Subject: Re: FAILED: patch "[PATCH] hugetlb: don't delete vma_lock in hugetlb MADV_DONTNEED" failed to apply to 6.0-stable tree
Date: Mon, 5 Dec 2022 11:37:08 -0800 [thread overview]
Message-ID: <Y45IZJ+NX76pJ+yn@monkey> (raw)
In-Reply-To: <167006657717429@kroah.com>
On 12/03/22 12:22, gregkh@linuxfoundation.org wrote:
>
> The patch below does not apply to the 6.0-stable tree.
> If someone wants it applied there, or to any other stable or longterm
> tree, then please email the backport, including the original git commit
> id to <stable@vger.kernel.org>.
Not exactly sure of the policy here where the commit title does not apply to
the backport. I left the title 'as is' and added a note about differences.
The commit message already anticipated a backport and mentions this as well.
From d0c568ab5c0b2c2c3aae8a3dd04a7c41cf1088c9 Mon Sep 17 00:00:00 2001
From: Mike Kravetz <mike.kravetz@oracle.com>
Date: Mon, 14 Nov 2022 15:55:06 -0800
Subject: [PATCH 2/2] hugetlb: don't delete vma_lock in hugetlb MADV_DONTNEED
processing
commit 04ada095dcfc4ae359418053c0be94453bdf1e84 upstream.
madvise(MADV_DONTNEED) ends up calling zap_page_range() to clear page
tables associated with the address range. For hugetlb vmas,
zap_page_range will call __unmap_hugepage_range_final. However,
__unmap_hugepage_range_final assumes the passed vma is about to be removed
and deletes the vma_lock to prevent pmd sharing as the vma is on the way
out. In the case of madvise(MADV_DONTNEED) the vma remains, but the
missing vma_lock prevents pmd sharing and could potentially lead to issues
with truncation/fault races.
This issue was originally reported here [1] as a BUG triggered in
page_try_dup_anon_rmap. Prior to the introduction of the hugetlb
vma_lock, __unmap_hugepage_range_final cleared the VM_MAYSHARE flag to
prevent pmd sharing. Subsequent faults on this vma were confused as
VM_MAYSHARE indicates a sharable vma, but was not set so page_mapping was
not set in new pages added to the page table. This resulted in pages that
appeared anonymous in a VM_SHARED vma and triggered the BUG.
Address issue by adding a new zap flag ZAP_FLAG_UNMAP to indicate an unmap
call from unmap_vmas(). This is used to indicate the 'final' unmapping of
a hugetlb vma. When called via MADV_DONTNEED, this flag is not set and
the vm_lock is not deleted.
NOTE - Prior to the introduction of the huegtlb vma_lock in v6.1, this
issue is addressed by not clearing the VM_MAYSHARE flag when
__unmap_hugepage_range_final is called in the MADV_DONTNEED case.
[1] https://lore.kernel.org/lkml/CAO4mrfdLMXsao9RF4fUE8-Wfde8xmjsKrTNMNC9wjUb6JudD0g@mail.gmail.com/
Link: https://lkml.kernel.org/r/20221114235507.294320-3-mike.kravetz@oracle.com
Fixes: 90e7e7f5ef3f ("mm: enable MADV_DONTNEED for hugetlb mappings")
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reported-by: Wei Chen <harperchen1110@gmail.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mina Almasry <almasrymina@google.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev>
Cc: Peter Xu <peterx@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
---
include/linux/mm.h | 2 ++
mm/hugetlb.c | 25 ++++++++++++++-----------
mm/memory.c | 2 +-
3 files changed, 17 insertions(+), 12 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 7368a99e4e55..fa4d973c1363 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1794,6 +1794,8 @@ struct zap_details {
* default, the flag is not set.
*/
#define ZAP_FLAG_DROP_MARKER ((__force zap_flags_t) BIT(0))
+/* Set in unmap_vmas() to indicate a final unmap call. Only used by hugetlb */
+#define ZAP_FLAG_UNMAP ((__force zap_flags_t) BIT(1))
#ifdef CONFIG_MMU
extern bool can_do_mlock(void);
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index dbb558e71e9e..022a3bfafec4 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5145,17 +5145,20 @@ void __unmap_hugepage_range_final(struct mmu_gather *tlb,
{
__unmap_hugepage_range(tlb, vma, start, end, ref_page, zap_flags);
- /*
- * Clear this flag so that x86's huge_pmd_share page_table_shareable
- * test will fail on a vma being torn down, and not grab a page table
- * on its way out. We're lucky that the flag has such an appropriate
- * name, and can in fact be safely cleared here. We could clear it
- * before the __unmap_hugepage_range above, but all that's necessary
- * is to clear it before releasing the i_mmap_rwsem. This works
- * because in the context this is called, the VMA is about to be
- * destroyed and the i_mmap_rwsem is held.
- */
- vma->vm_flags &= ~VM_MAYSHARE;
+ if (zap_flags & ZAP_FLAG_UNMAP) { /* final unmap */
+ /*
+ * Clear this flag so that x86's huge_pmd_share
+ * page_table_shareable test will fail on a vma being torn
+ * down, and not grab a page table on its way out. We're lucky
+ * that the flag has such an appropriate name, and can in fact
+ * be safely cleared here. We could clear it before the
+ * __unmap_hugepage_range above, but all that's necessary
+ * is to clear it before releasing the i_mmap_rwsem. This works
+ * because in the context this is called, the VMA is about to
+ * be destroyed and the i_mmap_rwsem is held.
+ */
+ vma->vm_flags &= ~VM_MAYSHARE;
+ }
}
void unmap_hugepage_range(struct vm_area_struct *vma, unsigned long start,
diff --git a/mm/memory.c b/mm/memory.c
index 68d5b3dcec2e..a0fdaa74091f 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1712,7 +1712,7 @@ void unmap_vmas(struct mmu_gather *tlb,
{
struct mmu_notifier_range range;
struct zap_details details = {
- .zap_flags = ZAP_FLAG_DROP_MARKER,
+ .zap_flags = ZAP_FLAG_DROP_MARKER | ZAP_FLAG_UNMAP,
/* Careful - we need to zap private pages too! */
.even_cows = true,
};
--
2.38.1
prev parent reply other threads:[~2022-12-05 19:42 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-12-03 11:22 FAILED: patch "[PATCH] hugetlb: don't delete vma_lock in hugetlb MADV_DONTNEED" failed to apply to 6.0-stable tree gregkh
2022-12-05 19:37 ` Mike Kravetz [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Y45IZJ+NX76pJ+yn@monkey \
--to=mike.kravetz@oracle.com \
--cc=akpm@linux-foundation.org \
--cc=almasrymina@google.com \
--cc=axelrasmussen@google.com \
--cc=david@redhat.com \
--cc=gregkh@linuxfoundation.org \
--cc=harperchen1110@gmail.com \
--cc=nadav.amit@gmail.com \
--cc=naoya.horiguchi@linux.dev \
--cc=peterx@redhat.com \
--cc=riel@surriel.com \
--cc=stable@vger.kernel.org \
--cc=vbabka@suse.cz \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).