* + mm-rmap-fix-a-mlock-race-condition-in-folio_referenced_one.patch added to mm-unstable branch
@ 2025-09-19 20:19 Andrew Morton
From: Andrew Morton @ 2025-09-19 20:19 UTC
To: mm-commits, willy, vbabka, surenb, shakeel.butt, rppt, riel,
mhocko, lorenzo.stoakes, liam.howlett, hughd, hannes, david,
baolin.wang, kas, akpm
The patch titled
Subject: mm/rmap: fix a mlock race condition in folio_referenced_one()
has been added to the -mm mm-unstable branch. Its filename is
mm-rmap-fix-a-mlock-race-condition-in-folio_referenced_one.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/mm-rmap-fix-a-mlock-race-condition-in-folio_referenced_one.patch
This patch will later appear in the mm-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Kiryl Shutsemau <kas@kernel.org>
Subject: mm/rmap: fix a mlock race condition in folio_referenced_one()
Date: Fri, 19 Sep 2025 13:40:33 +0100
The mlock_vma_folio() function requires the page table lock to be held in
order to safely mlock the folio. However, folio_referenced_one() mlocks
large folios outside of the page_vma_mapped_walk() loop, where the page
table lock has already been dropped.
Rework the mlock logic to use the same code path inside the loop for both
large and small folios.
Use PVMW_PGTABLE_CROSSED to detect when the folio is mapped across a page
table boundary.
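
As a rough illustration of the reworked in-loop decision, here is a minimal userspace sketch. It is a model, not kernel code: `struct walk_step`, `may_mlock()` and `walk_folio()` are hypothetical names, and the sketch assumes the boundary-crossed flag stays set once the walk has moved into a second page table.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the walk state; not the kernel's struct. */
struct walk_step {
	bool pte_mapped;	/* models pvmw.pte being non-NULL */
	bool pgtable_crossed;	/* models PVMW_PGTABLE_CROSSED being set */
	int nr_pages;		/* models pvmw.nr_pages */
};

/*
 * Models one loop iteration for a VM_LOCKED VMA. It runs while the
 * (modeled) page table lock for the current entry is held, and returns
 * true when the folio may be mlocked at this step.
 */
static bool may_mlock(const struct walk_step *step, int *ptes)
{
	(*ptes)++;

	/* PTE-mapped, but not every page has been seen yet: keep walking. */
	if (step->pte_mapped && *ptes != step->nr_pages)
		return false;

	/* The held lock covers only part of the PTEs: do not mlock. */
	if (step->pgtable_crossed)
		return false;

	return true;
}

/*
 * Walks a PTE-mapped folio of nr_pages pages. cross_at is the index of
 * the first page that lives in a second page table, or -1 if the folio
 * sits in a single page table. Returns the 1-based step at which the
 * folio would be mlocked, or 0 if it never is.
 */
static int walk_folio(int nr_pages, int cross_at)
{
	int ptes = 0;
	bool crossed = false;

	assert(nr_pages > 0);
	for (int i = 0; i < nr_pages; i++) {
		struct walk_step step;

		if (i == cross_at)
			crossed = true;	/* assumed sticky once set */
		step.pte_mapped = true;
		step.pgtable_crossed = crossed;
		step.nr_pages = nr_pages;
		if (may_mlock(&step, &ptes))
			return i + 1;
	}
	return 0;
}
```

In this model a small folio and a PMD-mapped folio qualify on the first step, a fully mapped large folio qualifies on its last PTE, and a folio that straddles two page tables never qualifies.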
Link: https://lkml.kernel.org/r/20250919124036.455709-3-kirill@shutemov.name
Signed-off-by: Kiryl Shutsemau <kas@kernel.org>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/rmap.c | 57 ++++++++++++++++++----------------------------------
1 file changed, 20 insertions(+), 37 deletions(-)
--- a/mm/rmap.c~mm-rmap-fix-a-mlock-race-condition-in-folio_referenced_one
+++ a/mm/rmap.c
@@ -850,34 +850,34 @@ static bool folio_referenced_one(struct
 {
 	struct folio_referenced_arg *pra = arg;
 	DEFINE_FOLIO_VMA_WALK(pvmw, folio, vma, address, 0);
-	int referenced = 0;
-	unsigned long start = address, ptes = 0;
+	int ptes = 0, referenced = 0;

 	while (page_vma_mapped_walk(&pvmw)) {
 		address = pvmw.address;

 		if (vma->vm_flags & VM_LOCKED) {
-			if (!folio_test_large(folio) || !pvmw.pte) {
-				/* Restore the mlock which got missed */
-				mlock_vma_folio(folio, vma);
-				page_vma_mapped_walk_done(&pvmw);
-				pra->vm_flags |= VM_LOCKED;
-				return false; /* To break the loop */
-			}
+			ptes++;
+			pra->mapcount--;
+
+			/* Only mlock fully mapped pages */
+			if (pvmw.pte && ptes != pvmw.nr_pages)
+				continue;
+
 			/*
-			 * For large folio fully mapped to VMA, will
-			 * be handled after the pvmw loop.
+			 * All PTEs must be protected by page table lock in
+			 * order to mlock the page.
 			 *
-			 * For large folio cross VMA boundaries, it's
-			 * expected to be picked by page reclaim. But
-			 * should skip reference of pages which are in
-			 * the range of VM_LOCKED vma. As page reclaim
-			 * should just count the reference of pages out
-			 * the range of VM_LOCKED vma.
+			 * If page table boundary has been crossed, current ptl
+			 * only protects part of the ptes.
 			 */
-			ptes++;
-			pra->mapcount--;
-			continue;
+			if (pvmw.flags & PVMW_PGTABLE_CROSSED)
+				continue;
+
+			/* Restore the mlock which got missed */
+			mlock_vma_folio(folio, vma);
+			page_vma_mapped_walk_done(&pvmw);
+			pra->vm_flags |= VM_LOCKED;
+			return false; /* To break the loop */
 		}

 		/*
@@ -913,23 +913,6 @@ static bool folio_referenced_one(struct
 		pra->mapcount--;
 	}

-	if ((vma->vm_flags & VM_LOCKED) &&
-			folio_test_large(folio) &&
-			folio_within_vma(folio, vma)) {
-		unsigned long s_align, e_align;
-
-		s_align = ALIGN_DOWN(start, PMD_SIZE);
-		e_align = ALIGN_DOWN(start + folio_size(folio) - 1, PMD_SIZE);
-
-		/* folio doesn't cross page table boundary and fully mapped */
-		if ((s_align == e_align) && (ptes == folio_nr_pages(folio))) {
-			/* Restore the mlock which got missed */
-			mlock_vma_folio(folio, vma);
-			pra->vm_flags |= VM_LOCKED;
-			return false; /* To break the loop */
-		}
-	}
-
 	if (referenced)
 		folio_clear_idle(folio);
 	if (folio_test_clear_young(folio))
_
Patches currently in -mm which might be from kas@kernel.org are
mm-khugepaged-do-not-fail-collapse_pte_mapped_thp-on-scan_pmd_null.patch
mm-page_vma_mapped-track-if-the-page-is-mapped-across-page-table-boundary.patch
mm-rmap-fix-a-mlock-race-condition-in-folio_referenced_one.patch
mm-rmap-mlock-large-folios-in-try_to_unmap_one.patch
mm-fault-try-to-map-the-entire-file-folio-in-finish_fault.patch
mm-rmap-improve-mlock-tracking-for-large-folios.patch
* + mm-rmap-fix-a-mlock-race-condition-in-folio_referenced_one.patch added to mm-unstable branch
@ 2025-09-23 21:30 Andrew Morton
From: Andrew Morton @ 2025-09-23 21:30 UTC
To: mm-commits, shakeel.butt, lorenzo.stoakes, hannes, david,
baolin.wang, kas, akpm
The patch titled
Subject: mm/rmap: fix a mlock race condition in folio_referenced_one()
has been added to the -mm mm-unstable branch. Its filename is
mm-rmap-fix-a-mlock-race-condition-in-folio_referenced_one.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/mm-rmap-fix-a-mlock-race-condition-in-folio_referenced_one.patch
This patch will later appear in the mm-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Kiryl Shutsemau <kas@kernel.org>
Subject: mm/rmap: fix a mlock race condition in folio_referenced_one()
Date: Tue, 23 Sep 2025 12:07:07 +0100
The mlock_vma_folio() function requires the page table lock to be held in
order to safely mlock the folio. However, folio_referenced_one() mlocks
large folios outside of the page_vma_mapped_walk() loop, where the page
table lock has already been dropped.
Rework the mlock logic to use the same code path inside the loop for both
large and small folios.
Use PVMW_PGTABLE_CROSSED to detect when the folio is mapped across a page
table boundary.
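
For comparison, the check that the flag replaces derived the boundary crossing from addresses: it rounded the first and last byte of the folio down to PMD_SIZE and compared the results. A minimal userspace sketch of that arithmetic follows. The `EX_`-prefixed constants and `crosses_page_table()` are hypothetical names, and the 4 KiB page and 2 MiB PMD sizes are assumptions that match a common x86-64 configuration, not values taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

#define EX_PAGE_SIZE	4096UL			/* assumed page size */
#define EX_PMD_SIZE	(512UL * EX_PAGE_SIZE)	/* assumed: 2 MiB per page table */
#define EX_ALIGN_DOWN(x, a)	((x) & ~((a) - 1))	/* a must be a power of two */

/*
 * Returns true when the range [start, start + size) spans more than one
 * PMD-sized region, that is, when its PTEs live in more than one page
 * table and so sit under more than one page table lock.
 */
static bool crosses_page_table(unsigned long start, unsigned long size)
{
	unsigned long s_align, e_align;

	assert(size > 0);
	s_align = EX_ALIGN_DOWN(start, EX_PMD_SIZE);
	e_align = EX_ALIGN_DOWN(start + size - 1, EX_PMD_SIZE);

	return s_align != e_align;
}
```

Under these assumptions, a 16 KiB folio at 0x3fc000 ends at 0x3fffff and stays inside one page table, while the same folio at 0x3fe000 ends at 0x401fff and crosses into the next one.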
Link: https://lkml.kernel.org/r/20250923110711.690639-3-kirill@shutemov.name
Signed-off-by: Kiryl Shutsemau <kas@kernel.org>
Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/rmap.c | 57 ++++++++++++++++++----------------------------------
1 file changed, 20 insertions(+), 37 deletions(-)
--- a/mm/rmap.c~mm-rmap-fix-a-mlock-race-condition-in-folio_referenced_one
+++ a/mm/rmap.c
@@ -850,34 +850,34 @@ static bool folio_referenced_one(struct
 {
 	struct folio_referenced_arg *pra = arg;
 	DEFINE_FOLIO_VMA_WALK(pvmw, folio, vma, address, 0);
-	int referenced = 0;
-	unsigned long start = address, ptes = 0;
+	int ptes = 0, referenced = 0;

 	while (page_vma_mapped_walk(&pvmw)) {
 		address = pvmw.address;

 		if (vma->vm_flags & VM_LOCKED) {
-			if (!folio_test_large(folio) || !pvmw.pte) {
-				/* Restore the mlock which got missed */
-				mlock_vma_folio(folio, vma);
-				page_vma_mapped_walk_done(&pvmw);
-				pra->vm_flags |= VM_LOCKED;
-				return false; /* To break the loop */
-			}
+			ptes++;
+			pra->mapcount--;
+
+			/* Only mlock fully mapped pages */
+			if (pvmw.pte && ptes != pvmw.nr_pages)
+				continue;
+
 			/*
-			 * For large folio fully mapped to VMA, will
-			 * be handled after the pvmw loop.
+			 * All PTEs must be protected by page table lock in
+			 * order to mlock the page.
 			 *
-			 * For large folio cross VMA boundaries, it's
-			 * expected to be picked by page reclaim. But
-			 * should skip reference of pages which are in
-			 * the range of VM_LOCKED vma. As page reclaim
-			 * should just count the reference of pages out
-			 * the range of VM_LOCKED vma.
+			 * If page table boundary has been crossed, current ptl
+			 * only protects part of the ptes.
 			 */
-			ptes++;
-			pra->mapcount--;
-			continue;
+			if (pvmw.flags & PVMW_PGTABLE_CROSSED)
+				continue;
+
+			/* Restore the mlock which got missed */
+			mlock_vma_folio(folio, vma);
+			page_vma_mapped_walk_done(&pvmw);
+			pra->vm_flags |= VM_LOCKED;
+			return false; /* To break the loop */
 		}

 		/*
@@ -913,23 +913,6 @@ static bool folio_referenced_one(struct
 		pra->mapcount--;
 	}

-	if ((vma->vm_flags & VM_LOCKED) &&
-			folio_test_large(folio) &&
-			folio_within_vma(folio, vma)) {
-		unsigned long s_align, e_align;
-
-		s_align = ALIGN_DOWN(start, PMD_SIZE);
-		e_align = ALIGN_DOWN(start + folio_size(folio) - 1, PMD_SIZE);
-
-		/* folio doesn't cross page table boundary and fully mapped */
-		if ((s_align == e_align) && (ptes == folio_nr_pages(folio))) {
-			/* Restore the mlock which got missed */
-			mlock_vma_folio(folio, vma);
-			pra->vm_flags |= VM_LOCKED;
-			return false; /* To break the loop */
-		}
-	}
-
 	if (referenced)
 		folio_clear_idle(folio);
 	if (folio_test_clear_young(folio))
_
Patches currently in -mm which might be from kas@kernel.org are
mm-page_vma_mapped-track-if-the-page-is-mapped-across-page-table-boundary.patch
mm-rmap-fix-a-mlock-race-condition-in-folio_referenced_one.patch
mm-rmap-mlock-large-folios-in-try_to_unmap_one.patch
mm-fault-try-to-map-the-entire-file-folio-in-finish_fault.patch
mm-filemap-map-entire-large-folio-faultaround.patch
mm-rmap-improve-mlock-tracking-for-large-folios.patch