From: "Lorenzo Stoakes (Oracle)" <ljs@kernel.org>
To: Suren Baghdasaryan <surenb@google.com>
Cc: akpm@linux-foundation.org, hannes@cmpxchg.org, david@kernel.org,
mhocko@kernel.org, zhengqi.arch@bytedance.com,
yuzhao@google.com, shakeel.butt@linux.dev, willy@infradead.org,
Liam.Howlett@oracle.com, axelrasmussen@google.com,
yuanchu@google.com, weixugc@google.com, linux-mm@kvack.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/1] mm/vmscan: prevent MGLRU reclaim from pinning address space
Date: Mon, 23 Mar 2026 13:43:16 +0000
Message-ID: <f22cb9d9-7fc8-4a79-ada8-02d66a1155b2@lucifer.local>
In-Reply-To: <20260322070843.941997-1-surenb@google.com>

On Sun, Mar 22, 2026 at 12:08:43AM -0700, Suren Baghdasaryan wrote:
> When shrinking lruvec, MGLRU pins address space before walking it.
> This is excessive since all it needs for walking the page range is
> a stable mm_struct to be able to take and release mmap_read_lock and
> a stable mm->mm_mt tree to walk. This address space pinning results
Hmm, I guess exit_mmap() calls __mt_destroy(), but that'll just destroy the
allocated state and leave the tree empty, right? So traversing the tree at
that point would just do nothing?
> in delays when releasing the memory of a dying process. This also
> prevents mm reapers (both in-kernel oom-reaper and userspace
> process_mrelease()) from doing their job during MGLRU scan because
> they check task_will_free_mem() which will yield negative result due
> to the elevated mm->mm_users.
>
> Replace unnecessary address space pinning with mm_struct pinning by
> replacing mmget/mmput with mmgrab/mmdrop calls. mm_mt is contained
> within mm_struct itself, therefore it won't be freed as long as
> mm_struct is stable and it won't change during the walk because
> mmap_read_lock is being held.
>
> Fixes: bd74fdaea146 ("mm: multi-gen LRU: support page table walks")
> Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> ---
> mm/vmscan.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 33287ba4a500..68e8e90e38f5 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2863,8 +2863,9 @@ static struct mm_struct *get_next_mm(struct lru_gen_mm_walk *walk)
> return NULL;
Not related to this series, but I really don't like how tightly coupled
MGLRU is to the rest of the 'classic' reclaim code.

Right in the middle of vmscan you walk into generic mm walker logic, and the
only hint that it's MGLRU is that you see lru_gen_xxx stuff. (I'm also
annoyed that we call it MGLRU but it's called lru_gen_xxx in the kernel :)
>
> clear_bit(key, &mm->lru_gen.bitmap);
> + mmgrab(mm);
Is the mm somehow pinned here, or would destruction remove it from the mm
list first, meaning that we can safely assume mm points to something sane
to grab? I guess this must already have been the case for mmget_not_zero()
to have been usable before, though.
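To make the distinction between the two counters concrete, here is a minimal
userspace sketch. It is a toy model, not kernel code: struct toy_mm, the
toy_* helpers and the nr_mappings field are all hypothetical names invented
for illustration, and the model ignores locking and atomics entirely. It
only captures the idea that mm_users keeps the address space alive while
mm_count keeps the struct itself alive.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model, NOT the kernel's struct mm_struct. */
struct toy_mm {
	int users;       /* models mm_users: keeps the address space alive */
	int count;       /* models mm_count: keeps the struct itself alive */
	int nr_mappings; /* stands in for the entries in mm->mm_mt */
	bool freed;      /* set when the struct would be freed */
};

static void toy_mm_init(struct toy_mm *mm, int nr_mappings)
{
	mm->users = 1;   /* the owning process */
	mm->count = 1;   /* all users together hold one struct reference */
	mm->nr_mappings = nr_mappings;
	mm->freed = false;
}

/* Models mmgrab(): pin the struct only. */
static void toy_mmgrab(struct toy_mm *mm)
{
	mm->count++;
}

/* Models mmdrop(): the final drop frees the struct synchronously. */
static void toy_mmdrop(struct toy_mm *mm)
{
	if (--mm->count == 0)
		mm->freed = true;
}

/* Models mmget_not_zero(): fails once the address space is going away. */
static bool toy_mmget_not_zero(struct toy_mm *mm)
{
	if (mm->users == 0)
		return false;
	mm->users++;
	return true;
}

/* Models mmput(): the final put empties the mappings, then drops the
 * struct reference that the users collectively held. */
static void toy_mmput(struct toy_mm *mm)
{
	if (--mm->users == 0) {
		mm->nr_mappings = 0; /* the exit_mmap() step in this model */
		toy_mmdrop(mm);
	}
}

/* Models a walk: reports how many mappings are still there to visit. */
static int toy_walk(const struct toy_mm *mm)
{
	return mm->nr_mappings;
}
```

In this model, a walker that called toy_mmgrab() before the owner's final
toy_mmput() still holds a valid struct afterwards, toy_walk() simply finds
nothing to visit, and toy_mmget_not_zero() fails. Whether the real list
removal and teardown ordering give the same guarantee is exactly the
question above.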
>
> - return mmget_not_zero(mm) ? mm : NULL;
> + return mm;
> }
>
> void lru_gen_add_mm(struct mm_struct *mm)
> @@ -3064,7 +3065,7 @@ static bool iterate_mm_list(struct lru_gen_mm_walk *walk, struct mm_struct **ite
> reset_bloom_filter(mm_state, walk->seq + 1);
>
> if (*iter)
> - mmput_async(*iter);
> + mmdrop(*iter);
This will now be a blocking call that could free the mm (via __mmdrop()),
so it could take a while. Is that ok?

If the code was intentionally deferring work here before, doesn't that
imply that being slow here might somehow be an issue? Or was it just
because they could? :)
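The difference between the two styles of final drop can be sketched in
userspace as follows. This is only an illustration under stated
assumptions: struct toy_ref, struct toy_deferred and the toy_* functions
are hypothetical names, and the fixed-size array merely stands in for a
workqueue. It does not reproduce what mmput_async() or __mmdrop() actually
do; it only shows who pays for the release.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_DEFERRED 8

typedef void (*toy_release_fn)(void *obj);

/* Toy stand-in for a workqueue: release callbacks queued to run later. */
struct toy_deferred {
	toy_release_fn fn[TOY_MAX_DEFERRED];
	void *obj[TOY_MAX_DEFERRED];
	size_t n;
};

struct toy_ref {
	int count;    /* outstanding references */
	int released; /* number of times the release callback ran */
};

static void toy_release(void *obj)
{
	((struct toy_ref *)obj)->released++;
}

/* Synchronous drop: the caller pays for the release on the final drop. */
static void toy_drop_sync(struct toy_ref *r)
{
	if (--r->count == 0)
		toy_release(r);
}

/* Deferred drop: the final drop only queues the release for later. */
static void toy_drop_async(struct toy_ref *r, struct toy_deferred *q)
{
	if (--r->count != 0)
		return;
	if (q->n == TOY_MAX_DEFERRED) {
		toy_release(r); /* queue full: fall back to releasing inline */
		return;
	}
	q->fn[q->n] = toy_release;
	q->obj[q->n] = r;
	q->n++;
}

/* Runs everything queued so far; models the worker running later. */
static void toy_run_deferred(struct toy_deferred *q)
{
	for (size_t i = 0; i < q->n; i++)
		q->fn[i](q->obj[i]);
	q->n = 0;
}
```

With toy_drop_sync() the release has already run by the time the call
returns, whereas with toy_drop_async() nothing is released until
toy_run_deferred() is called. How expensive the synchronous path is
depends entirely on what the release callback has to do, which is the
point of the question above.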
>
> *iter = mm;
>
>
> base-commit: 8c65073d94c8b7cc3170de31af38edc9f5d96f0e
> --
> 2.53.0.1018.g2bb0e51243-goog
>
Thanks, Lorenzo