* [PATCH V1 1/2] docs/mm: explain when and why rmap locks need to be taken during mremap()
@ 2025-08-26 6:58 Harry Yoo
2025-08-26 6:58 ` [PATCH V1 2/2] mm: document when rmap locks can be skipped when setting need_rmap_locks Harry Yoo
` (2 more replies)
0 siblings, 3 replies; 12+ messages in thread
From: Harry Yoo @ 2025-08-26 6:58 UTC (permalink / raw)
To: Andrew Morton, Suren Baghdasaryan, Liam R . Howlett,
Lorenzo Stoakes, David Hildenbrand, Kees Cook
Cc: Vlastimil Babka, Shakeel Butt, Mike Rapoport, Michal Hocko,
Jonathan Corbet, Jann Horn, Pedro Falcato, Rik van Riel, linux-mm,
linux-doc, linux-kernel, Harry Yoo
While move_ptes() has a comment explaining why rmap locks are needed,
Documentation/mm/process_addrs.rst does not. Without being aware of that
comment, I spent hours figuring out how things could go wrong and why,
in some cases, rmap locks can be safely skipped.
Add a more comprehensive explanation to the documentation to save time
for others.
Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
---
Documentation/mm/process_addrs.rst | 32 ++++++++++++++++++++++++++++++
1 file changed, 32 insertions(+)
diff --git a/Documentation/mm/process_addrs.rst b/Documentation/mm/process_addrs.rst
index be49e2a269e4..ee7c0dba339e 100644
--- a/Documentation/mm/process_addrs.rst
+++ b/Documentation/mm/process_addrs.rst
@@ -744,6 +744,38 @@ You can observe this in the :c:func:`!mremap` implementation in the functions
:c:func:`!take_rmap_locks` and :c:func:`!drop_rmap_locks` which perform the rmap
side of lock acquisition, invoked ultimately by :c:func:`!move_page_tables`.
+.. note:: If :c:func:`!mremap()` -> :c:func:`!move_ptes()` does not take rmap
+ locks, :c:func:`!rmap_walk()` may miss a pte for the folio.
+
+ The problematic sequence is as follows:
+
+ 1. :c:func:`!rmap_walk()` checks the destination VMA, finds no pte,
+ and releases the page table lock.
+ 2. :c:func:`!move_ptes()` moves the page tables from the source to the
+ destination.
+ 3. :c:func:`!rmap_walk()` checks the source VMA, finds no pte, and
+ thus rmap walk misses it.
+
+ Taking rmap locks in :c:func:`!move_ptes()` ensures that
+ :c:func:`!rmap_walk()` sees the pte in either the source or
+ destination VMA.
+
+ There are two cases where rmap locks can be skipped:
+
+ 1. If the source VMA is guaranteed to be visited before the
+ destination VMA during rmap walk, :c:func:`!rmap_walk()` will
+ encounter the pte in one of the two VMAs. VMAs associated with
+ an anon_vma are organized in an interval tree, so the src->dst
+ order is guaranteed when the source VMA's vm_pgoff precedes
+ the destination VMA's vm_pgoff.
+
+ 2. When :c:func:`!exec()` relocates a temporary stack VMA via
+ :c:func:`!relocate_vma_down()`, there is no separate destination
+ VMA. Instead, the source VMA is marked as a temporary stack and
+ relocated. In this case, the folios belonging to the VMA cannot be
+ migrated until the relocation is complete, avoiding the need to
+ acquire rmap locks for performance reasons.
+
VMA lock internals
------------------
--
2.43.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH V1 2/2] mm: document when rmap locks can be skipped when setting need_rmap_locks
2025-08-26 6:58 [PATCH V1 1/2] docs/mm: explain when and why rmap locks need to be taken during mremap() Harry Yoo
@ 2025-08-26 6:58 ` Harry Yoo
2025-08-26 9:46 ` Lorenzo Stoakes
2025-08-26 7:22 ` [PATCH V1 1/2] docs/mm: explain when and why rmap locks need to be taken during mremap() Jonathan Corbet
2025-08-26 9:55 ` Lorenzo Stoakes
2 siblings, 1 reply; 12+ messages in thread
From: Harry Yoo @ 2025-08-26 6:58 UTC (permalink / raw)
To: Andrew Morton, Suren Baghdasaryan, Liam R . Howlett,
Lorenzo Stoakes, David Hildenbrand, Kees Cook
Cc: Vlastimil Babka, Shakeel Butt, Mike Rapoport, Michal Hocko,
Jonathan Corbet, Jann Horn, Pedro Falcato, Rik van Riel, linux-mm,
linux-doc, linux-kernel, Harry Yoo
While move_ptes() explains when rmap locks can be skipped, when reading
the code setting pmc.need_rmap_locks it is not immediately obvious when
need_rmap_locks can be false. Add a brief explanation in copy_vma() and
relocate_vma_down(), and add a pointer to the comment in move_ptes().
Meanwhile, fix and improve the comment in move_ptes().
Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
---
mm/mremap.c | 4 +++-
mm/vma.c | 7 +++++++
mm/vma_exec.c | 5 +++++
3 files changed, 15 insertions(+), 1 deletion(-)
diff --git a/mm/mremap.c b/mm/mremap.c
index e618a706aff5..86adb872bea0 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -218,8 +218,10 @@ static int move_ptes(struct pagetable_move_control *pmc,
* When need_rmap_locks is false, we use other ways to avoid
* such races:
*
- * - During exec() shift_arg_pages(), we use a specially tagged vma
+ * - During exec() relocate_vma_down(), we use a specially tagged vma
* which rmap call sites look for using vma_is_temporary_stack().
+ * Folios mapped in the temporary stack vma cannot be migrated until
+ * the relocation is complete.
*
* - During mremap(), new_vma is often known to be placed after vma
* in rmap traversal order. This ensures rmap will always observe
diff --git a/mm/vma.c b/mm/vma.c
index 3b12c7579831..3da49f79e9ba 100644
--- a/mm/vma.c
+++ b/mm/vma.c
@@ -1842,6 +1842,11 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
vmg.next = vma_iter_next_rewind(&vmi, NULL);
new_vma = vma_merge_new_range(&vmg);
+ /*
+ * rmap locks can be skipped as long as new_vma is traversed
+ * after vma during rmap walk (new_vma->vm_pgoff >= vma->vm_pgoff).
+ * See the comment in move_ptes().
+ */
if (new_vma) {
/*
* Source vma may have been merged into new_vma
@@ -1879,6 +1884,8 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
new_vma->vm_ops->open(new_vma);
if (vma_link(mm, new_vma))
goto out_vma_link;
+
+ /* new_vma->pg_off is always >= vma->pg_off if not merged */
*need_rmap_locks = false;
}
return new_vma;
diff --git a/mm/vma_exec.c b/mm/vma_exec.c
index 922ee51747a6..a895dd39ac46 100644
--- a/mm/vma_exec.c
+++ b/mm/vma_exec.c
@@ -63,6 +63,11 @@ int relocate_vma_down(struct vm_area_struct *vma, unsigned long shift)
* process cleanup to remove whatever mess we made.
*/
pmc.for_stack = true;
+ /*
+ * pmc.need_rmap_locks is false since rmap locks can be safely skipped
+ * because migration is disabled for this vma during relocation.
+ * See the comment in move_ptes().
+ */
if (length != move_page_tables(&pmc))
return -ENOMEM;
--
2.43.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH V1 1/2] docs/mm: explain when and why rmap locks need to be taken during mremap()
2025-08-26 6:58 [PATCH V1 1/2] docs/mm: explain when and why rmap locks need to be taken during mremap() Harry Yoo
2025-08-26 6:58 ` [PATCH V1 2/2] mm: document when rmap locks can be skipped when setting need_rmap_locks Harry Yoo
@ 2025-08-26 7:22 ` Jonathan Corbet
2025-08-26 8:37 ` Lorenzo Stoakes
2025-08-26 9:55 ` Lorenzo Stoakes
2 siblings, 1 reply; 12+ messages in thread
From: Jonathan Corbet @ 2025-08-26 7:22 UTC (permalink / raw)
To: Harry Yoo, Andrew Morton, Suren Baghdasaryan, Liam R . Howlett,
Lorenzo Stoakes, David Hildenbrand, Kees Cook
Cc: Vlastimil Babka, Shakeel Butt, Mike Rapoport, Michal Hocko,
Jann Horn, Pedro Falcato, Rik van Riel, linux-mm, linux-doc,
linux-kernel, Harry Yoo
Harry Yoo <harry.yoo@oracle.com> writes:
> While move_ptes() has a comment explaining why rmap locks are needed,
> Documentation/mm/process_addrs.rst does not. Without being aware of that
> comment, I spent hours figuring out how things could go wrong and why,
> in some cases, rmap locks can be safely skipped.
>
> Add a more comprehensive explanation to the documentation to save time
> for others.
>
> Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
> ---
> Documentation/mm/process_addrs.rst | 32 ++++++++++++++++++++++++++++++
> 1 file changed, 32 insertions(+)
>
> diff --git a/Documentation/mm/process_addrs.rst b/Documentation/mm/process_addrs.rst
> index be49e2a269e4..ee7c0dba339e 100644
> --- a/Documentation/mm/process_addrs.rst
> +++ b/Documentation/mm/process_addrs.rst
> @@ -744,6 +744,38 @@ You can observe this in the :c:func:`!mremap` implementation in the functions
> :c:func:`!take_rmap_locks` and :c:func:`!drop_rmap_locks` which perform the rmap
> side of lock acquisition, invoked ultimately by :c:func:`!move_page_tables`.
>
> +.. note:: If :c:func:`!mremap()` -> :c:func:`!move_ptes()` does not take rmap
> + locks, :c:func:`!rmap_walk()` may miss a pte for the folio.
> +
> + The problematic sequence is as follows:
Please don't use :c:func: - just write function() and all the right
things will happen. (For extra credit, fix the existing usages :)
Thanks,
jon
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH V1 1/2] docs/mm: explain when and why rmap locks need to be taken during mremap()
2025-08-26 7:22 ` [PATCH V1 1/2] docs/mm: explain when and why rmap locks need to be taken during mremap() Jonathan Corbet
@ 2025-08-26 8:37 ` Lorenzo Stoakes
2025-08-26 9:48 ` Harry Yoo
0 siblings, 1 reply; 12+ messages in thread
From: Lorenzo Stoakes @ 2025-08-26 8:37 UTC (permalink / raw)
To: Jonathan Corbet
Cc: Harry Yoo, Andrew Morton, Suren Baghdasaryan, Liam R . Howlett,
David Hildenbrand, Kees Cook, Vlastimil Babka, Shakeel Butt,
Mike Rapoport, Michal Hocko, Jann Horn, Pedro Falcato,
Rik van Riel, linux-mm, linux-doc, linux-kernel
Harry - one brief very nitty note - could you do a cover letter even for 2 patch
series?
This is a subjective thing and literally just my taste but I prefer it :P
obviously this is optional as a result, but I feel it's neater.
Thanks :)
On Tue, Aug 26, 2025 at 01:22:03AM -0600, Jonathan Corbet wrote:
> Harry Yoo <harry.yoo@oracle.com> writes:
>
> > While move_ptes() has a comment explaining why rmap locks are needed,
> > Documentation/mm/process_addrs.rst does not. Without being aware of that
> > comment, I spent hours figuring out how things could go wrong and why,
> > in some cases, rmap locks can be safely skipped.
> >
> > Add a more comprehensive explanation to the documentation to save time
> > for others.
> >
> > Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
> > ---
> > Documentation/mm/process_addrs.rst | 32 ++++++++++++++++++++++++++++++
> > 1 file changed, 32 insertions(+)
> >
> > diff --git a/Documentation/mm/process_addrs.rst b/Documentation/mm/process_addrs.rst
> > index be49e2a269e4..ee7c0dba339e 100644
> > --- a/Documentation/mm/process_addrs.rst
> > +++ b/Documentation/mm/process_addrs.rst
> > @@ -744,6 +744,38 @@ You can observe this in the :c:func:`!mremap` implementation in the functions
> > :c:func:`!take_rmap_locks` and :c:func:`!drop_rmap_locks` which perform the rmap
> > side of lock acquisition, invoked ultimately by :c:func:`!move_page_tables`.
> >
> > +.. note:: If :c:func:`!mremap()` -> :c:func:`!move_ptes()` does not take rmap
> > + locks, :c:func:`!rmap_walk()` may miss a pte for the folio.
> > +
> > + The problematic sequence is as follows:
>
> Please don't use :c:func: - just write function() and all the right
> things will happen. (For extra credit, fix the existing usages :)
Yeah sorry Jon on latter bit, I did mean to get to that but workload
been... well you can see on lore :P
I have a real backlog even more than usual right now too due to daring to take a
day off on a national holiday here in the UK :))
Harry - more than happy for you to do the above as part of this series or
separately, will sling you some tags accordingly.
If you're not already doing it (expect you are) you can generate docs via:
make SPHINXDIRS=mm htmldocs
Then get access to generated HTML in a browser locally in Documentation/output/
>
> Thanks,
>
> jon
Cheers, Lorenzo
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH V1 2/2] mm: document when rmap locks can be skipped when setting need_rmap_locks
2025-08-26 6:58 ` [PATCH V1 2/2] mm: document when rmap locks can be skipped when setting need_rmap_locks Harry Yoo
@ 2025-08-26 9:46 ` Lorenzo Stoakes
2025-08-27 6:52 ` Harry Yoo
0 siblings, 1 reply; 12+ messages in thread
From: Lorenzo Stoakes @ 2025-08-26 9:46 UTC (permalink / raw)
To: Harry Yoo
Cc: Andrew Morton, Suren Baghdasaryan, Liam R . Howlett,
David Hildenbrand, Kees Cook, Vlastimil Babka, Shakeel Butt,
Mike Rapoport, Michal Hocko, Jonathan Corbet, Jann Horn,
Pedro Falcato, Rik van Riel, linux-mm, linux-doc, linux-kernel
On Tue, Aug 26, 2025 at 03:58:48PM +0900, Harry Yoo wrote:
> While move_ptes() explains when rmap locks can be skipped, when reading
> the code setting pmc.need_rmap_locks it is not immediately obvious when
> need_rmap_locks can be false. Add a brief explanation in copy_vma() and
> relocate_vma_down(), and add a pointer to the comment in move_ptes().
>
> Meanwhile, fix and improve the comment in move_ptes().
>
> Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
This is great thanks! :)
> ---
> mm/mremap.c | 4 +++-
> mm/vma.c | 7 +++++++
> mm/vma_exec.c | 5 +++++
> 3 files changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/mm/mremap.c b/mm/mremap.c
> index e618a706aff5..86adb872bea0 100644
> --- a/mm/mremap.c
> +++ b/mm/mremap.c
> @@ -218,8 +218,10 @@ static int move_ptes(struct pagetable_move_control *pmc,
> * When need_rmap_locks is false, we use other ways to avoid
> * such races:
> *
> - * - During exec() shift_arg_pages(), we use a specially tagged vma
> + * - During exec() relocate_vma_down(), we use a specially tagged vma
> * which rmap call sites look for using vma_is_temporary_stack().
> + * Folios mapped in the temporary stack vma cannot be migrated until
> + * the relocation is complete.
Can we actually move this comment over to move_page_tables()? As this is
relevant to the whole operation. Also could you put a comment referencing this
comment in copy_vma_and_data() as this is where we actually determine whether
this is the case or not in _most cases_.
Let's just get all the 'need rmap locks' and 'corner cases where it's fine
anyway' in one place that is logical :)
Also could you put a comment in copy_vma() over in mm/vma.c saying 'see the
comment in mm/mremap.c' or even risk mentioning the function name (risky as code
changes but still :P) e.g. 'see comment in move_page_tables()' or something.
I'm confused by the 'folios mapped' and 'migrate' bits - and I think people will
be confused by that.
I think better to say 'page tables for the temporary stack cannot be adjusted
until the relocation is complete'.
> *
> * - During mremap(), new_vma is often known to be placed after vma
> * in rmap traversal order. This ensures rmap will always observe
This whole bit after could really do with some ASCII diagrams btw :)) ;) but you
know maybe out of scope here.
> diff --git a/mm/vma.c b/mm/vma.c
> index 3b12c7579831..3da49f79e9ba 100644
> --- a/mm/vma.c
> +++ b/mm/vma.c
> @@ -1842,6 +1842,11 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
> vmg.next = vma_iter_next_rewind(&vmi, NULL);
> new_vma = vma_merge_new_range(&vmg);
>
> + /*
> + * rmap locks can be skipped as long as new_vma is traversed
> + * after vma during rmap walk (new_vma->vm_pgoff >= vma->vm_pgoff).
> + * See the comment in move_ptes().
> + */
Obv. would prefer this to say 'move_page_tables()' as mentioned above :P
> if (new_vma) {
> /*
> * Source vma may have been merged into new_vma
> @@ -1879,6 +1884,8 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
> new_vma->vm_ops->open(new_vma);
> if (vma_link(mm, new_vma))
> goto out_vma_link;
> +
> + /* new_vma->pg_off is always >= vma->pg_off if not merged */
Err, new_vma is NULL? :) I'm not sure this comment is too useful.
> *need_rmap_locks = false;
> }
> return new_vma;
> diff --git a/mm/vma_exec.c b/mm/vma_exec.c
> index 922ee51747a6..a895dd39ac46 100644
> --- a/mm/vma_exec.c
> +++ b/mm/vma_exec.c
> @@ -63,6 +63,11 @@ int relocate_vma_down(struct vm_area_struct *vma, unsigned long shift)
> * process cleanup to remove whatever mess we made.
> */
> pmc.for_stack = true;
> + /*
> + * pmc.need_rmap_locks is false since rmap locks can be safely skipped
> + * because migration is disabled for this vma during relocation.
> + * See the comment in move_ptes().
> + */
Let's reword this also, people will get confused about migration here.
'pmc.need_rmap_locks is false since rmap explicitly checks for
vma_is_temporary_stack() and thus extra care does not need to be taken here
during stack relocation. See the comment in move_page_tables().'
> if (length != move_page_tables(&pmc))
> return -ENOMEM;
>
> --
> 2.43.0
>
Cheers, Lorenzo
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH V1 1/2] docs/mm: explain when and why rmap locks need to be taken during mremap()
2025-08-26 8:37 ` Lorenzo Stoakes
@ 2025-08-26 9:48 ` Harry Yoo
2025-08-26 9:58 ` Lorenzo Stoakes
0 siblings, 1 reply; 12+ messages in thread
From: Harry Yoo @ 2025-08-26 9:48 UTC (permalink / raw)
To: Lorenzo Stoakes
Cc: Jonathan Corbet, Andrew Morton, Suren Baghdasaryan,
Liam R . Howlett, David Hildenbrand, Kees Cook, Vlastimil Babka,
Shakeel Butt, Mike Rapoport, Michal Hocko, Jann Horn,
Pedro Falcato, Rik van Riel, linux-mm, linux-doc, linux-kernel
On Tue, Aug 26, 2025 at 09:37:23AM +0100, Lorenzo Stoakes wrote:
> Harry - one brief very nitty note - could you do a cover letter even for 2 patch
> series?
>
> This is a subjective thing and literally just my taste but I prefer it :P
> obviously this is optional as a result, but I feel it's neater.
>
No problem! will do cover letter from next time.
> On Tue, Aug 26, 2025 at 01:22:03AM -0600, Jonathan Corbet wrote:
> > Harry Yoo <harry.yoo@oracle.com> writes:
> >
> > > While move_ptes() has a comment explaining why rmap locks are needed,
> > > Documentation/mm/process_addrs.rst does not. Without being aware of that
> > > comment, I spent hours figuring out how things could go wrong and why,
> > > in some cases, rmap locks can be safely skipped.
> > >
> > > Add a more comprehensive explanation to the documentation to save time
> > > for others.
> > >
> > > Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
> > > ---
> > > Documentation/mm/process_addrs.rst | 32 ++++++++++++++++++++++++++++++
> > > 1 file changed, 32 insertions(+)
> > >
> > > diff --git a/Documentation/mm/process_addrs.rst b/Documentation/mm/process_addrs.rst
> > > index be49e2a269e4..ee7c0dba339e 100644
> > > --- a/Documentation/mm/process_addrs.rst
> > > +++ b/Documentation/mm/process_addrs.rst
> > > @@ -744,6 +744,38 @@ You can observe this in the :c:func:`!mremap` implementation in the functions
> > > :c:func:`!take_rmap_locks` and :c:func:`!drop_rmap_locks` which perform the rmap
> > > side of lock acquisition, invoked ultimately by :c:func:`!move_page_tables`.
> > >
> > > +.. note:: If :c:func:`!mremap()` -> :c:func:`!move_ptes()` does not take rmap
> > > + locks, :c:func:`!rmap_walk()` may miss a pte for the folio.
> > > +
> > > + The problematic sequence is as follows:
> >
> > Please don't use :c:func: - just write function() and all the right
> > things will happen. (For extra credit, fix the existing usages :)
Hi Jonathan and Lorenzo,
**blaming myself for thinking**
"Hmmm it's already there, it should be fine to use it..."
> Yeah sorry Jon on latter bit, I did mean to get to that but workload
> been... well you can see on lore :P
>
> I have a real backlog even more than usual right now too due to daring to take a
> day off on a national holiday here in the UK :))
>
> Harry - more than happy for you to do the above as part of this series or
> separately, will sling you some tags accordingly.
Okay, I'll do as a part of the series (process_addrs.rst and memory-model.rst).
> If you're not already doing it (expect you are) you can generate docs via:
>
> make SPHINXDIRS=mm htmldocs
>
> Then get access to generated HTML in a browser locally in Documentation/output/
Thanks and yeah I'm doing it! :)
> >
> > Thanks,
> >
> > jon
>
> Cheers, Lorenzo
--
Cheers,
Harry / Hyeonggon
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH V1 1/2] docs/mm: explain when and why rmap locks need to be taken during mremap()
2025-08-26 6:58 [PATCH V1 1/2] docs/mm: explain when and why rmap locks need to be taken during mremap() Harry Yoo
2025-08-26 6:58 ` [PATCH V1 2/2] mm: document when rmap locks can be skipped when setting need_rmap_locks Harry Yoo
2025-08-26 7:22 ` [PATCH V1 1/2] docs/mm: explain when and why rmap locks need to be taken during mremap() Jonathan Corbet
@ 2025-08-26 9:55 ` Lorenzo Stoakes
2 siblings, 0 replies; 12+ messages in thread
From: Lorenzo Stoakes @ 2025-08-26 9:55 UTC (permalink / raw)
To: Harry Yoo
Cc: Andrew Morton, Suren Baghdasaryan, Liam R . Howlett,
David Hildenbrand, Kees Cook, Vlastimil Babka, Shakeel Butt,
Mike Rapoport, Michal Hocko, Jonathan Corbet, Jann Horn,
Pedro Falcato, Rik van Riel, linux-mm, linux-doc, linux-kernel
On Tue, Aug 26, 2025 at 03:58:47PM +0900, Harry Yoo wrote:
> While move_ptes() has a comment explaining why rmap locks are needed,
> Documentation/mm/process_addrs.rst does not. Without being aware of that
> comment, I spent hours figuring out how things could go wrong and why,
> in some cases, rmap locks can be safely skipped.
>
> Add a more comprehensive explanation to the documentation to save time
> for others.
>
> Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
Again great, but needs some fix ups, see below!
> ---
> Documentation/mm/process_addrs.rst | 32 ++++++++++++++++++++++++++++++
> 1 file changed, 32 insertions(+)
>
> diff --git a/Documentation/mm/process_addrs.rst b/Documentation/mm/process_addrs.rst
> index be49e2a269e4..ee7c0dba339e 100644
> --- a/Documentation/mm/process_addrs.rst
> +++ b/Documentation/mm/process_addrs.rst
> @@ -744,6 +744,38 @@ You can observe this in the :c:func:`!mremap` implementation in the functions
> :c:func:`!take_rmap_locks` and :c:func:`!drop_rmap_locks` which perform the rmap
> side of lock acquisition, invoked ultimately by :c:func:`!move_page_tables`.
>
> +.. note:: If :c:func:`!mremap()` -> :c:func:`!move_ptes()` does not take rmap
> + locks, :c:func:`!rmap_walk()` may miss a pte for the folio.
> +
> + The problematic sequence is as follows:
> +
> + 1. :c:func:`!rmap_walk()` checks the destination VMA, finds no pte,
> + and releases the page table lock.
> + 2. :c:func:`!move_ptes()` moves the page tables from the source to the
> + destination.
> + 3. :c:func:`!rmap_walk()` checks the source VMA, finds no pte, and
> + thus rmap walk misses it.
> +
> + Taking rmap locks in :c:func:`!move_ptes()` ensures that
> + :c:func:`!rmap_walk()` sees the pte in either the source or
> + destination VMA.
> +
> + There are two cases where rmap locks can be skipped:
> +
> + 1. If the source VMA is guaranteed to be visited before the
> + destination VMA during rmap walk, :c:func:`!rmap_walk()` will
> + encounter the pte in one of the two VMAs. VMAs associated with
> + an anon_vma are organized in an interval tree, so the src->dst
> + order is guaranteed when the source VMA's vm_pgoff precedes
> + the destination VMA's vm_pgoff.
> +
> + 2. When :c:func:`!exec()` relocates a temporary stack VMA via
> + :c:func:`!relocate_vma_down()`, there is no separate destination
> + VMA. Instead, the source VMA is marked as a temporary stack and
> + relocated. In this case, the folios belonging to the VMA cannot be
> + migrated until the relocation is complete, avoiding the need to
Again, let's not say migrated, or refer to folios. 'In this case page
tables cannot be modified until...' I'd say something about rmap will check
vma_is_temporary_stack() so there's no need to acquire rmap locks in this
instance.
> + acquire rmap locks for performance reasons.
> +
This is the wrong place for this.
Here I'm talking about the rmap lock _override_ for instances where we move >=
PMD, you're talking more generally.
So let's move things around a bit:
Some functions manipulate page table levels above PMD (that is PUD,
P4D and PGD page tables). Most notable of these is mremap(), which
is capable of moving higher level page tables.
-> insert here.
In these instances, it is required that all locks are taken, that is the
mmap lock, the VMA lock and the relevant rmap locks.
^--- then here reword this to:
'In instances where higher level page tables are moved, it is required...'
> VMA lock internals
> ------------------
>
> --
> 2.43.0
>
Cheers, Lorenzo
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH V1 1/2] docs/mm: explain when and why rmap locks need to be taken during mremap()
2025-08-26 9:48 ` Harry Yoo
@ 2025-08-26 9:58 ` Lorenzo Stoakes
2025-08-27 7:18 ` Harry Yoo
0 siblings, 1 reply; 12+ messages in thread
From: Lorenzo Stoakes @ 2025-08-26 9:58 UTC (permalink / raw)
To: Harry Yoo
Cc: Jonathan Corbet, Andrew Morton, Suren Baghdasaryan,
Liam R . Howlett, David Hildenbrand, Kees Cook, Vlastimil Babka,
Shakeel Butt, Mike Rapoport, Michal Hocko, Jann Horn,
Pedro Falcato, Rik van Riel, linux-mm, linux-doc, linux-kernel
On Tue, Aug 26, 2025 at 06:48:41PM +0900, Harry Yoo wrote:
> Hi Jonathan and Lorenzo,
>
> **blaming myself for thinking**
> "Hmmm it's already there, it should be fine to use it..."
It's totally understandable :)
>
> > Yeah sorry Jon on latter bit, I did mean to get to that but workload
> > been... well you can see on lore :P
> >
> > I have a real backlog even more than usual right now too due to daring to take a
> > day off on a national holiday here in the UK :))
> >
> > Harry - more than happy for you to do the above as part of this series or
> > separately, will sling you some tags accordingly.
>
> Okay, I'll do as a part of the series (process_addrs.rst and memory-model.rst).
Can you please though make sure the formatting is all good? That doc really
needs the function names to stand out, so that's key.
I _think_ Jon fixed it so that should work fine but do check first!
>
> > If you're not already doing it (expect you are) you can generate docs via:
> >
> > make SPHINXDIRS=mm htmldocs
> >
> > Then get access to generated HTML in a browser locally in Documentation/output/
>
> Thanks and yeah I'm doing it! :)
Cheers!
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH V1 2/2] mm: document when rmap locks can be skipped when setting need_rmap_locks
2025-08-26 9:46 ` Lorenzo Stoakes
@ 2025-08-27 6:52 ` Harry Yoo
2025-08-27 11:16 ` Lorenzo Stoakes
0 siblings, 1 reply; 12+ messages in thread
From: Harry Yoo @ 2025-08-27 6:52 UTC (permalink / raw)
To: Lorenzo Stoakes
Cc: Andrew Morton, Suren Baghdasaryan, Liam R . Howlett,
David Hildenbrand, Kees Cook, Vlastimil Babka, Shakeel Butt,
Mike Rapoport, Michal Hocko, Jonathan Corbet, Jann Horn,
Pedro Falcato, Rik van Riel, linux-mm, linux-doc, linux-kernel
On Tue, Aug 26, 2025 at 10:46:24AM +0100, Lorenzo Stoakes wrote:
> On Tue, Aug 26, 2025 at 03:58:48PM +0900, Harry Yoo wrote:
> > While move_ptes() explains when rmap locks can be skipped, when reading
> > the code setting pmc.need_rmap_locks it is not immediately obvious when
> > need_rmap_locks can be false. Add a brief explanation in copy_vma() and
> > relocate_vma_down(), and add a pointer to the comment in move_ptes().
> >
> > Meanwhile, fix and improve the comment in move_ptes().
> >
> > Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
>
> This is great thanks! :)
You're welcome!
> > ---
> > mm/mremap.c | 4 +++-
> > mm/vma.c | 7 +++++++
> > mm/vma_exec.c | 5 +++++
> > 3 files changed, 15 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/mremap.c b/mm/mremap.c
> > index e618a706aff5..86adb872bea0 100644
> > --- a/mm/mremap.c
> > +++ b/mm/mremap.c
> > @@ -218,8 +218,10 @@ static int move_ptes(struct pagetable_move_control *pmc,
> > * When need_rmap_locks is false, we use other ways to avoid
> > * such races:
> > *
> > - * - During exec() shift_arg_pages(), we use a specially tagged vma
> > + * - During exec() relocate_vma_down(), we use a specially tagged vma
> > * which rmap call sites look for using vma_is_temporary_stack().
> > + * Folios mapped in the temporary stack vma cannot be migrated until
> > + * the relocation is complete.
>
> Can we actually move this comment over to move_page_tables()? As this is
> relevant to the whole operation.
Sounds good, will do.
> Also could you put a comment referencing this
> comment in copy_vma_and_data() as this is where we actually determine whether
> this is the case or not in _most cases_.
>
> Let's just get all the 'need rmap locks' and 'corner cases where it's fine
> anyway' in one place that is logical :)
Will do.
> Also could you put a comment in copy_vma() over in mm/vma.c saying 'see the
> comment in mm/mremap.c' or even risk mentioning the function name (risky as code
> changes but still :P) e.g. 'see comment in move_page_tables()' or something.
Will take a risk and do "See the comment in move_page_tables()" :)
> I'm confused by the 'folios mapped' and 'migrate' bits - and I think people will
> be confused by that.
>
> I think better to say 'page tables for the temporary stack cannot be adjusted
> until the relocation is complete'.
But is that correct?
Out of all rmap users, only try_to_migrate() cares about
VM_STACK_INCOMPLETE_SETUP via invalid_migration_vma().
I'm not sure what prevents from try_to_unmap() from unmapping it while
it's relocated?
Looks like it's always been like this since a8bef8ff6ea1 ("mm: migration:
avoid race between shift_arg_pages() and rmap_walk() during migration by
not migrating temporary stacks")
> > *
> > * - During mremap(), new_vma is often known to be placed after vma
> > * in rmap traversal order. This ensures rmap will always observe
>
> This whole bit after could really do with some ASCII diagrams btw :)) ;) but you
> know maybe out of scope here.
>
> > diff --git a/mm/vma.c b/mm/vma.c
> > index 3b12c7579831..3da49f79e9ba 100644
> > --- a/mm/vma.c
> > +++ b/mm/vma.c
> > @@ -1842,6 +1842,11 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
> > vmg.next = vma_iter_next_rewind(&vmi, NULL);
> > new_vma = vma_merge_new_range(&vmg);
> >
> > + /*
> > + * rmap locks can be skipped as long as new_vma is traversed
> > + * after vma during rmap walk (new_vma->vm_pgoff >= vma->vm_pgoff).
> > + * See the comment in move_ptes().
> > + */
>
> Obv. would prefer this to say 'move_page_tables()' as mentioned above :P
Will do.
> > if (new_vma) {
> > /*
> > * Source vma may have been merged into new_vma
> > @@ -1879,6 +1884,8 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
> > new_vma->vm_ops->open(new_vma);
> > if (vma_link(mm, new_vma))
> > goto out_vma_link;
> > +
> > + /* new_vma->pg_off is always >= vma->pg_off if not merged */
>
> Err, new_vma is NULL? :) I'm not sure this comment is too useful.
Sometimes the line between "worth commenting" and "too much comment" is
vague to me :) I'll remove it. Thanks.
> > *need_rmap_locks = false;
> > }
> > return new_vma;
> > diff --git a/mm/vma_exec.c b/mm/vma_exec.c
> > index 922ee51747a6..a895dd39ac46 100644
> > --- a/mm/vma_exec.c
> > +++ b/mm/vma_exec.c
> > @@ -63,6 +63,11 @@ int relocate_vma_down(struct vm_area_struct *vma, unsigned long shift)
> > * process cleanup to remove whatever mess we made.
> > */
> > pmc.for_stack = true;
> > + /*
> > + * pmc.need_rmap_locks is false since rmap locks can be safely skipped
> > + * because migration is disabled for this vma during relocation.
> > + * See the comment in move_ptes().
> > + */
>
> Let's reword this also, people will get confused about migration here.
>
> 'pmc.need_rmap_locks is false since rmap explicitly checks for
> vma_is_temporary_stack() and thus extra care does not need to be taken here
> during stack relocation. See the comment in move_page_tables().'
This looks good! except for one thing, not all rmap users check for
vma_is_temporary_stack().
> > if (length != move_page_tables(&pmc))
> > return -ENOMEM;
> >
> > --
> > 2.43.0
> >
>
> Cheers, Lorenzo
--
Cheers,
Harry / Hyeonggon
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH V1 1/2] docs/mm: explain when and why rmap locks need to be taken during mremap()
2025-08-26 9:58 ` Lorenzo Stoakes
@ 2025-08-27 7:18 ` Harry Yoo
2025-08-27 9:25 ` Lorenzo Stoakes
0 siblings, 1 reply; 12+ messages in thread
From: Harry Yoo @ 2025-08-27 7:18 UTC (permalink / raw)
To: Lorenzo Stoakes
Cc: Jonathan Corbet, Andrew Morton, Suren Baghdasaryan,
Liam R . Howlett, David Hildenbrand, Kees Cook, Vlastimil Babka,
Shakeel Butt, Mike Rapoport, Michal Hocko, Jann Horn,
Pedro Falcato, Rik van Riel, linux-mm, linux-doc, linux-kernel
On Tue, Aug 26, 2025 at 10:58:58AM +0100, Lorenzo Stoakes wrote:
> On Tue, Aug 26, 2025 at 06:48:41PM +0900, Harry Yoo wrote:
> > > Yeah sorry Jon on latter bit, I did mean to get to that but workload
> > > been... well you can see on lore :P
> > >
> > > I have a real backlog even more than usual right now too due to daring to take a
> > > day off on a national holiday here in the UK :))
> > >
> > > Harry - more than happy for you to do the above as part of this series or
> > > separately, will sling you some tags accordingly.
> >
> > Okay, I'll do as a part of the series (process_addrs.rst and memory-model.rst).
>
> Can you please though make sure the formatting is all good? That doc really
> needs the function names to stand out, so that's key.
>
> I _think_ Jon fixed it so that should work fine but do check first!
vma_start_write() works well, but it doens't process
anon_vma_[try]lock_read() properly. (only characters after [try] stand out)
--
Cheers,
Harry / Hyeonggon
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH V1 1/2] docs/mm: explain when and why rmap locks need to be taken during mremap()
2025-08-27 7:18 ` Harry Yoo
@ 2025-08-27 9:25 ` Lorenzo Stoakes
0 siblings, 0 replies; 12+ messages in thread
From: Lorenzo Stoakes @ 2025-08-27 9:25 UTC (permalink / raw)
To: Harry Yoo
Cc: Jonathan Corbet, Andrew Morton, Suren Baghdasaryan,
Liam R . Howlett, David Hildenbrand, Kees Cook, Vlastimil Babka,
Shakeel Butt, Mike Rapoport, Michal Hocko, Jann Horn,
Pedro Falcato, Rik van Riel, linux-mm, linux-doc, linux-kernel
On Wed, Aug 27, 2025 at 04:18:30PM +0900, Harry Yoo wrote:
> On Tue, Aug 26, 2025 at 10:58:58AM +0100, Lorenzo Stoakes wrote:
> > On Tue, Aug 26, 2025 at 06:48:41PM +0900, Harry Yoo wrote:
> > > > Yeah sorry Jon on latter bit, I did mean to get to that but workload
> > > > been... well you can see on lore :P
> > > >
> > > > I have a real backlog even more than usual right now too due to daring to take a
> > > > day off on a national holiday here in the UK :))
> > > >
> > > > Harry - more than happy for you to do the above as part of this series or
> > > > separately, will sling you some tags accordingly.
> > >
> > > Okay, I'll do as a part of the series (process_addrs.rst and memory-model.rst).
> >
> > Can you please though make sure the formatting is all good? That doc really
> > needs the function names to stand out, so that's key.
> >
> > I _think_ Jon fixed it so that should work fine but do check first!
>
> vma_start_write() works well, but it doens't process
> anon_vma_[try]lock_read() properly. (only characters after [try] stand out)
Hm strange, maybe keep what exists for now.
Jon - any thoughts? I can't remember what we decided for getting formatting even
without kerneldocs etc.?
Cheers, Lorenzo
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH V1 2/2] mm: document when rmap locks can be skipped when setting need_rmap_locks
2025-08-27 6:52 ` Harry Yoo
@ 2025-08-27 11:16 ` Lorenzo Stoakes
0 siblings, 0 replies; 12+ messages in thread
From: Lorenzo Stoakes @ 2025-08-27 11:16 UTC (permalink / raw)
To: Harry Yoo
Cc: Andrew Morton, Suren Baghdasaryan, Liam R . Howlett,
David Hildenbrand, Kees Cook, Vlastimil Babka, Shakeel Butt,
Mike Rapoport, Michal Hocko, Jonathan Corbet, Jann Horn,
Pedro Falcato, Rik van Riel, linux-mm, linux-doc, linux-kernel
On Wed, Aug 27, 2025 at 03:52:12PM +0900, Harry Yoo wrote:
> On Tue, Aug 26, 2025 at 10:46:24AM +0100, Lorenzo Stoakes wrote:
> > On Tue, Aug 26, 2025 at 03:58:48PM +0900, Harry Yoo wrote:
> > > While move_ptes() explains when rmap locks can be skipped, when reading
> > > the code setting pmc.need_rmap_locks it is not immediately obvious when
> > > need_rmap_locks can be false. Add a brief explanation in copy_vma() and
> > > relocate_vma_down(), and add a pointer to the comment in move_ptes().
> > >
> > > Meanwhile, fix and improve the comment in move_ptes().
> > >
> > > Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
> >
> > This is great thanks! :)
>
> You're welcome!
>
> > > ---
> > > mm/mremap.c | 4 +++-
> > > mm/vma.c | 7 +++++++
> > > mm/vma_exec.c | 5 +++++
> > > 3 files changed, 15 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/mm/mremap.c b/mm/mremap.c
> > > index e618a706aff5..86adb872bea0 100644
> > > --- a/mm/mremap.c
> > > +++ b/mm/mremap.c
> > > @@ -218,8 +218,10 @@ static int move_ptes(struct pagetable_move_control *pmc,
> > > * When need_rmap_locks is false, we use other ways to avoid
> > > * such races:
> > > *
> > > - * - During exec() shift_arg_pages(), we use a specially tagged vma
> > > + * - During exec() relocate_vma_down(), we use a specially tagged vma
> > > * which rmap call sites look for using vma_is_temporary_stack().
> > > + * Folios mapped in the temporary stack vma cannot be migrated until
> > > + * the relocation is complete.
> >
> > Can we actually move this comment over to move_page_tables()? As this is
> > relevant to the whole operation.
>
> Sounds good, will do.
>
> > Also could you put a comment referencing this
> > comment in copy_vma_and_data() as this is where we actually determine whether
> > this is the case or not in _most cases_.
> >
> > Let's just get all the 'need rmap locks' and 'corner cases where it's fine
> > anyway' in one place that is logical :)
>
> Will do.
>
> > Also could you put a comment in copy_vma() over in mm/vma.c saying 'see the
> > comment in mm/mremap.c' or even risk mentioning the function name (risky as code
> > changes but still :P) e.g. 'see comment in move_page_tables()' or something.
>
> Will take a risk and do "See the comment in move_page_tables()" :)
>
> > I'm confused by the 'folios mapped' and 'migrate' bits - and I think people will
> > be confused by that.
> >
> > I think better to say 'page tables for the temporary stack cannot be adjusted
> > until the relocation is complete'.
>
> But is that correct?
>
> Out of all rmap users, only try_to_migrate() cares about
> VM_STACK_INCOMPLETE_SETUP via invalid_migration_vma().
Right, I think you are being too succinct here then :) as observed in my reading
it this way, I think reasonably somebody might be confused by this also.
Let's make this whole comment clearer.
So what's going on here USUALLY for mremap (not the relocate_vma_down() case
which we will get on to) is really that (assuming new VMA case for now):
1. We create a new VMA at destination.
2. The rmap clones its mappings to the new VMA _as well as_ the old.
Therefore, an rmap traversal will encounter BOTH sets of page tables when
traversing from the folio up the rmap.
Note that the mentioned cloning happens in anon_vma_clone() for anon and/or
vma_link() -> vma_link_file() -> (under i mmap write lock obv.)
__vma_link_file().
The original VMA will only be detached after the operation is complete.
If we unmap say, and say the new VMA is before the old VMA, meaning we dont' get
saved by traversal order, then all that will happen is the try_to_unmap() will
not be able to unmap and the mapcount will stay >0 and all is fine.
All rmap operations are best effort.
The only cases in which we set rmap locks like this are:
- self-merge
- higher order page table moves (see should_take_rmap_locks())
So really all this should talk about the self-merge case, rather than broadly
'why we don't care'.
So the comment's already a mess as it is I think.
So we can self-merge if we mremap() to a place immediately adjacent to the
existing VMA.
In that case, we'll simply change the range of the VMA, without doing any
cloning etc, and move page tables 'internally' within the same VMA.
The rmap traversal order guarantees that racing rmap operations are fine when
you self-merge _after_ a VMA, because we're serialised by PTE PTL, and if you
'miss' an entry in the page tables, you'll 'catch' it again when you traverse
later in the range.
HOWEVER, things are a problem if you self-merge _before_ a VMA.
Examining the relevant bit in copy_vma():
An aside about vma pointer changing...:
new_vma = vma_merge_new_range(&vmg);
if (new_vma) {
Merged---------^
We have firstly to consider whether we get a merge that will change the
actual VMA pointer - this can only be the case (+ in anon case) if:
1. The old VMA is not faulted in yet (otherwise we'd need to keep the
same vma->vm_pgoff) meaning we can change this to make the merge even
possible.
2. There was a VMA prior to the location in which we move that will be
expanded to fill the range.
This means it ha to be _before_ the old VMA for this to happen, e.g.:
if (unlikely(vma_start >= new_vma->vm_start &&
vma_start < new_vma->vm_end)) {
new_vma will be the _merged_ VMA, with vma_start the _old_ VMA's
start. So for this condition to be true we must have:
vs = vma_start (old VMA ->vm_start)
ve = old VMA ->vm_end
nvs = new_vma->vm_start = addr
nve = new_vma->vm_end
mremap():
vs ve
|---------|
| |
|---------|
addr addr + len
|--------|
| |
|--------|
Leads to merge:
nvs nve
vs ve
|--------.---------|
| . |
|--------.---------|
*vmap = vma = new_vma;
Therefore reset VMA--------------^
OK, but if we merge _at all_, we _also_ have to consider the 'before' case, and
pertinent to the rmap lock discussion, this is any case where we get a merge
where _what the rmap indexes on_ would be traversed in a way that could lead to
'seeing' the destination before the source when there is only a single VMA in
play.
Phew.
See how complicated this is? :)
So the pertinent code for this is:
if (new_vma) {
...
*need_rmap_locks = (new_vma->vm_pgoff <= vma->vm_pgoff);
} ...
The 'before' is the ->vm_pgoff of the new and old VMA's.
Note that we've covered off the case of the VMA changing, so we don't get a UAF
above, and in that case vma == new_vma and thus vm_pgoff will match.
But if that occurred, for the merge to succeed, the vm_pgoff would _have_ to be
prior in the traversal order, so it's correct to then take rmap locks.
Otherwise, in other merge cases, where there is no 3-way merge with a prior VMA
causing the VMA pointer to change, we simple take rmap locks in the case where
the traversal order would not save us.
So yeah given all the above, I do not think the existing comment is very
helpful.
It's _only_ in this case where we need to care.
So I thinkt he conflated thing here is this edge case of the damn horrible
initial stack relocation.
It's the _only place_ where we invoke move_page_tables() outside of mremap.
And of course we don't take any rmap locks, so in comes the explanation about
why that's ok.
We obviously also _don't apply the copy_vma()_ logic there.
We are vma_expand()'ing so it's a single VMA, which is problematic.
Therefore we are handling this with the vma_is_temporary_stack() check.
I think try_to_unmap() is ok to race, as if you race and don't find the page
table entries, you won't get to the folio + thus won't get to update the
mapcount so it's fine.
I think migration is very much NOT fine, because suddenly you have migration
entries that might get left behind as per a8bef8ff6ea1.
OK phew.
I am going to leave it to you to find a way to write the above up :)
Also please reflect, as far as is sensible, in the docs.
A sort of vague beginning might be something like:
* When need_rmap_locks is true, we take the i_mmap_rwsem and anon_vma
* locks to ensure that rmap will always observe either the old or the
* new ptes. This is the easiest way to avoid races with
* truncate_pagecache(), page migration, etc...
->
When dealing with racing rmap operations, we must be careful, as the
purpose of an rmap walk operation is to traverse from a folio to linked
VMAs, and then typically back down again through the page tables
mapping them.
Since we are manipulating page tables here, we may interfere with an
rmap operation if we are not careful about locking.
The only case where this is meaningful is a self-merge... blah
etc. etc. etc. :)
>
> I'm not sure what prevents from try_to_unmap() from unmapping it while
> it's relocated?
See above.
>
> Looks like it's always been like this since a8bef8ff6ea1 ("mm: migration:
> avoid race between shift_arg_pages() and rmap_walk() during migration by
> not migrating temporary stacks")
>
> > > *
> > > * - During mremap(), new_vma is often known to be placed after vma
> > > * in rmap traversal order. This ensures rmap will always observe
> >
> > This whole bit after could really do with some ASCII diagrams btw :)) ;) but you
> > know maybe out of scope here.
> >
> > > diff --git a/mm/vma.c b/mm/vma.c
> > > index 3b12c7579831..3da49f79e9ba 100644
> > > --- a/mm/vma.c
> > > +++ b/mm/vma.c
> > > @@ -1842,6 +1842,11 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
> > > vmg.next = vma_iter_next_rewind(&vmi, NULL);
> > > new_vma = vma_merge_new_range(&vmg);
> > >
> > > + /*
> > > + * rmap locks can be skipped as long as new_vma is traversed
> > > + * after vma during rmap walk (new_vma->vm_pgoff >= vma->vm_pgoff).
> > > + * See the comment in move_ptes().
> > > + */
> >
> > Obv. would prefer this to say 'move_page_tables()' as mentioned above :P
>
> Will do.
>
> > > if (new_vma) {
> > > /*
> > > * Source vma may have been merged into new_vma
> > > @@ -1879,6 +1884,8 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
> > > new_vma->vm_ops->open(new_vma);
> > > if (vma_link(mm, new_vma))
> > > goto out_vma_link;
> > > +
> > > + /* new_vma->pg_off is always >= vma->pg_off if not merged */
> >
> > Err, new_vma is NULL? :) I'm not sure this comment is too useful.
>
> Sometimes the line between "worth commenting" and "too much comment" is
> vague to me :) I'll remove it. Thanks.
It's fine :) instinct to comment more is nice :)
>
> > > *need_rmap_locks = false;
> > > }
> > > return new_vma;
> > > diff --git a/mm/vma_exec.c b/mm/vma_exec.c
> > > index 922ee51747a6..a895dd39ac46 100644
> > > --- a/mm/vma_exec.c
> > > +++ b/mm/vma_exec.c
> > > @@ -63,6 +63,11 @@ int relocate_vma_down(struct vm_area_struct *vma, unsigned long shift)
> > > * process cleanup to remove whatever mess we made.
> > > */
> > > pmc.for_stack = true;
> > > + /*
> > > + * pmc.need_rmap_locks is false since rmap locks can be safely skipped
> > > + * because migration is disabled for this vma during relocation.
> > > + * See the comment in move_ptes().
> > > + */
> >
> > Let's reword this also, people will get confused about migration here.
> >
> > 'pmc.need_rmap_locks is false since rmap explicitly checks for
> > vma_is_temporary_stack() and thus extra care does not need to be taken here
> > during stack relocation. See the comment in move_page_tables().'
>
> This looks good! except for one thing, not all rmap users check for
> vma_is_temporary_stack().
Yup see above!
>
> > > if (length != move_page_tables(&pmc))
> > > return -ENOMEM;
> > >
> > > --
> > > 2.43.0
> > >
> >
> > Cheers, Lorenzo
>
> --
> Cheers,
> Harry / Hyeonggon
Cheers, Lorenzo
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2025-08-27 11:17 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-26 6:58 [PATCH V1 1/2] docs/mm: explain when and why rmap locks need to be taken during mremap() Harry Yoo
2025-08-26 6:58 ` [PATCH V1 2/2] mm: document when rmap locks can be skipped when setting need_rmap_locks Harry Yoo
2025-08-26 9:46 ` Lorenzo Stoakes
2025-08-27 6:52 ` Harry Yoo
2025-08-27 11:16 ` Lorenzo Stoakes
2025-08-26 7:22 ` [PATCH V1 1/2] docs/mm: explain when and why rmap locks need to be taken during mremap() Jonathan Corbet
2025-08-26 8:37 ` Lorenzo Stoakes
2025-08-26 9:48 ` Harry Yoo
2025-08-26 9:58 ` Lorenzo Stoakes
2025-08-27 7:18 ` Harry Yoo
2025-08-27 9:25 ` Lorenzo Stoakes
2025-08-26 9:55 ` Lorenzo Stoakes
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).