From: Lorenzo Stoakes <ljs@kernel.org>
To: Rik van Riel <riel@surriel.com>
Cc: lsf-pc@lists.linux-foundation.org, linux-mm@kvack.org,
David Hildenbrand <david@kernel.org>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
Vlastimil Babka <vbabka@kernel.org>,
Suren Baghdasaryan <surenb@google.com>,
Pedro Falcato <pfalcato@suse.de>,
Ryan Roberts <ryan.roberts@arm.com>,
Harry Yoo <harry.yoo@oracle.com>, Jann Horn <jannh@google.com>,
Chris Li <chriscli@google.com>, Barry Song <baohua@kernel.org>
Subject: Re: [LSF/MM/BPF TOPIC] The Future of the Anonymous Reverse Mapping [RESEND]
Date: Mon, 4 May 2026 09:01:07 +0100
Message-ID: <afhFEQfC-nISu-rj@lucifer>
In-Reply-To: <8c729d621f281dd0b1a891f4aaccb0ba4956b219.camel@surriel.com>
On Sun, May 03, 2026 at 02:26:36PM -0400, Rik van Riel wrote:
> On Sat, 2026-05-02 at 07:53 +0100, Lorenzo Stoakes wrote:
> > As is time-honoured LSF tradition, I am sharing code for my proposal.
> >
> > I worked a very long day yesterday and got the _very_ rough PoC code
> > into
> > some kind of vaguely shareable state.
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/ljs/linux.git/log/?h=project/cow-context
> >
> > CAVEATS:
> >
> > * The code is not great, it's 'experimental, wave your arms, hope for
> > the
> > best' stuff used for experimentation.
>
> First, some refcounting that confuses me.
>
> The changelog, and the code in dup_cow_context
> shows that only the parent's cow context gets
> an increased refcount.
>
> However, the code in __put_cow_context seems
> to unconditionally decrement refcounts all up
> the hierarchy, instead of bailing out once it
> encounters a parent that still has a non-zero
> refcount.
>
> How is that supposed to work?
Ah no it does bail out :)
This code is very much a PoC, so perhaps not ideally clear :)
So __put_cow_context() calls delete_child_from_parent():

	void __put_cow_context(struct cow_context *context)
	{
		...
		for (curr = context; curr; curr = parent) {
			...
			parent = delete_child_from_parent(curr);
			...
		}
	}

	static struct cow_context *delete_child_from_parent(struct cow_context *context)
	{
		...
		struct cow_context *parent = context->parent;

		if (!parent)
			return NULL;
		...
		if (!refcount_dec_and_test(&parent->refcnt))
			return NULL;

And only if the refcount drops to 0 do we propagate (because the parent
being dropped drops a ref from its parent).

		return parent;
	}

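
To make the bail-out concrete, here is a minimal userspace sketch of the
same loop shape. It is not the PoC code: struct ctx, dec_and_test(),
drop_parent_ref(), put_ctx_chain() and demo_bail_out() are hypothetical
names, the refcount is a plain int rather than refcount_t, and it assumes
a simplified model in which each child holds exactly one reference on its
parent and a context is "freed" by setting a flag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the PoC's struct cow_context. */
struct ctx {
	struct ctx *parent;
	int refcnt;	/* plain int here; the kernel code uses refcount_t */
	int freed;	/* flag set instead of freeing, so it can be inspected */
};

/* Mimics refcount_dec_and_test(): true only when the count reaches zero. */
static int dec_and_test(int *refcnt)
{
	return --(*refcnt) == 0;
}

/* Drop c's reference on its parent; return the parent only if it died. */
static struct ctx *drop_parent_ref(struct ctx *c)
{
	struct ctx *parent = c->parent;

	if (!parent)
		return NULL;
	if (!dec_and_test(&parent->refcnt))
		return NULL;	/* parent still referenced: the walk stops */
	return parent;
}

/* Called once c's own refcount has already reached zero. */
static void put_ctx_chain(struct ctx *c)
{
	struct ctx *curr, *parent;

	for (curr = c; curr; curr = parent) {
		parent = drop_parent_ref(curr);
		curr->freed = 1;
	}
}

/* root <- mid <- leaf, where root also has a second (unmodelled) child. */
static int demo_bail_out(void)
{
	struct ctx root = { NULL, 2, 0 };	/* refs held by mid and a sibling */
	struct ctx mid = { &root, 1, 0 };	/* ref held by leaf */
	struct ctx leaf = { &mid, 0, 0 };	/* last reference already dropped */

	put_ctx_chain(&leaf);
	assert(leaf.freed && mid.freed);	/* both reached zero */
	assert(!root.freed);			/* sibling still pins root */
	return root.refcnt;
}
```

In this model demo_bail_out() returns 1: leaf and mid are released, root
loses only the one reference mid held, and the loop terminates there
rather than continuing up the hierarchy.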
>
> Now, having the remaps array cloned at fork
> time does make the refcounting on that side
> a lot simpler. I like that.
Thanks :)
>
> However, it does raise another question.
>
> Say we have process A, with child process B.
>
> Process A has memory mapped at address X.
>
> Process B munmaps memory at address X, and
> then maps new memory at address X.
>
> If I haven't missed something important, the
> remap table does not need to get used, because
> the offset and the virtual address match.
>
> How does the COW walk handle that situation?
So in the example given the folio would become AnonExclusive (or rather
!folio_maybe_mapped_shared(folio)) so would only walk process A.
If you unmapped in the parent, the folio would become AnonExclusive and
then get moved to process B's mm's cow context.
So it'd work perfectly fine and be efficient in that case.
But in an example where say process C also forks so the folio remains
shared, we would end up doing a useless walk into process B, then find
either that there isn't a folio there or that the folio was unrelated.
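
As a rough illustration of that useless-walk case, here is a toy
userspace model. It is only a sketch under loud assumptions and says
nothing about how the PoC actually walks: struct proc, walk_folio() and
demo_walk() are hypothetical, a small array index stands in for a virtual
address, an integer id stands in for a folio, and C is assumed to be a
second child of A.

```c
#include <assert.h>
#include <stddef.h>

#define NR_SLOTS 4	/* toy address space: slot index stands in for an address */

/* Hypothetical toy process, not a kernel structure. */
struct proc {
	const struct proc *parent;	/* fork parent, NULL for the root */
	int folio_at[NR_SLOTS];		/* folio id mapped at each slot, 0 = none */
};

struct walk_stats {
	int hits;	/* visits that found the target folio */
	int useless;	/* visits that found nothing or an unrelated folio */
};

/* True if p is owner itself or was forked (transitively) from owner. */
static int descends_from(const struct proc *p, const struct proc *owner)
{
	for (; p; p = p->parent)
		if (p == owner)
			return 1;
	return 0;
}

/* Visit owner and every descendant at the same slot, looking for folio. */
static struct walk_stats walk_folio(const struct proc *procs[], size_t nr,
				    const struct proc *owner, int slot,
				    int folio)
{
	struct walk_stats st = { 0, 0 };
	size_t i;

	for (i = 0; i < nr; i++) {
		if (!descends_from(procs[i], owner))
			continue;
		if (procs[i]->folio_at[slot] == folio)
			st.hits++;
		else
			st.useless++;	/* remapped or unmapped in this process */
	}
	return st;
}

static struct walk_stats demo_walk(void)
{
	struct proc a = { NULL, { 0, 7, 0, 0 } };	/* A maps folio 7 at slot 1 */
	struct proc b = { &a, { 0, 9, 0, 0 } };	/* B remapped slot 1 to folio 9 */
	struct proc c = { &a, { 0, 7, 0, 0 } };	/* C still shares folio 7 */
	const struct proc *procs[] = { &a, &b, &c };

	return walk_folio(procs, 3, &a, 1, 7);
}
```

In this model the walk for folio 7 finds it in A and C and makes one
wasted visit to B, which has an unrelated folio at the same address.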
In my slides (I will put them somewhere after LSF) I argue that these kinds
of situations are likely to be the minority, because most memory is
AnonExclusive and that which isn't largely remains untouched (i.e. all the
walks would be valid).
There are cases also where anon_vma can do useless walks (anon_vma is at
the mapping granularity, whereas a folio might or might not still be mapped
at lower levels, assuming not moved by folio_move_anon_rmap()).
>
> Overall, I like that you are trying to tackle
> the problems associated with anon_vma, but
> have to wonder if this implementation will
> be able to avoid some of the complexity
> inherent in the problem space.
Thanks :) And yeah I think unavoidably there will be difficult corner
cases. I also don't suggest that this approach is necessarily going to be
the one that ultimately works, there's a HUGE TODO left on stabilisation
(esp. in the migration case), and I plan to do a lot of testing around
latency and edge cases to really exercise it.
However, no matter the outcome, this should give us insights into the anon
rmap (and the testing work will provide a good testbed too) - so either
way, I am determined to improve the anon rmap even if I need to look to
another approach.
>
> --
> All Rights Reversed.
Cheers, Lorenzo