From: Minchan Kim <minchan@kernel.org>
To: Michal Hocko <mhocko@suse.com>
Cc: akpm@linux-foundation.org, hca@linux.ibm.com,
	linux-s390@vger.kernel.org, david@kernel.org, brauner@kernel.org,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	surenb@google.com, timmurray@google.com
Subject: Re: [PATCH v1 3/3] mm: process_mrelease: introduce PROCESS_MRELEASE_REAP_KILL flag
Date: Fri, 24 Apr 2026 15:49:19 -0700
Message-ID: <aevzbx_Pk5Cu5exa@google.com>
In-Reply-To: <aesiYAumkLCyedf0@tiehlicka>

On Fri, Apr 24, 2026 at 09:57:20AM +0200, Michal Hocko wrote:
> On Tue 21-04-26 16:02:39, Minchan Kim wrote:
> > Currently, process_mrelease() requires userspace to send a SIGKILL signal
> > prior to the call. This separation introduces a scheduling race window
> > where the victim task may receive the signal and enter the exit path
> > before the reaper can invoke process_mrelease().
> > 
> > When the victim enters the exit path (do_exit -> exit_mm), it clears its
> > task->mm immediately. This causes process_mrelease() to fail with -ESRCH,
> > leaving the actual address space teardown (exit_mmap) to be deferred until
> > the mm's reference count drops to zero. In Android, arbitrary reference counts
> > (e.g., async I/O, reading /proc/<pid>/cmdline, or various other remote
> > VM accesses) frequently delay this teardown indefinitely, defeating the
> > purpose of expedited reclamation.
> > 
> > This delay keeps memory pressure high, forcing the system to unnecessarily
> > kill additional innocent background apps before the memory from the first
> > victim is recovered.
> 
> Thanks, this makes the motivation much more clear and usecase very
> sound.
> 
> > This patch introduces the PROCESS_MRELEASE_REAP_KILL UAPI flag to support
> > an integrated auto-kill mode. When specified, process_mrelease() directly
> > injects a SIGKILL into the target task.
> > 
> > To solve the race condition deterministically, we grab the mm reference
> > via mmget() and set the MMF_UNSTABLE flag *before* sending the SIGKILL.
> > Using mmget() instead of mmgrab() keeps mm_users > 0, preventing the
> > victim from calling exit_mmap() in its own exit path.
> 
> Why is this needed? Address space tear down is an operation that can run
> from several execution contexts.

Agreed.

> 
> > This ensures that
> > the memory is reclaimed synchronously and deterministically by the reaper
> > in the context of process_mrelease(), avoiding delays caused by
> > non-deterministic scheduling of the victim task.
> 
> The memory is still reclaimed synchronously from the mrelease context.
> This is really confusing.
> 
> Please also explain why do you need to do all that ugly
> task_will_free_mem hoops. Why cannot you simply kill the task if
> task_will_free_mem fails (if PROCESS_MRELEASE_REAP_KILL is used).

I wanted to handle shared address spaces.
Even though we are okay with the target task not being in a SIGKILL
state yet (since we are about to kill it), we must ensure that all
*other* processes sharing the same mm are also dying.

If we simply bypass the check and force a kill when there are living sharers,
the memory will NOT be freed even after the target task dies because
the other processes still pin the mm.

So, to address this, I think we need to modify task_will_free_mem() slightly
to ignore the exit state of the *target* task only, while still checking that
all *other* sharing processes are dying:

static bool task_will_free_mem(struct task_struct *task, bool ignore_exit)
{
	...
	/* Ignore the target task's signal state. */
	if (!__task_will_free_mem(task, ignore_exit))
		return false;

	/*
	 * Other processes sharing the mm with the target must
	 * still be in the exit state.
	 */
	for_each_process(p) {
		...
		if (!__task_will_free_mem(p, false))
			return false;
	}
	...
}
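
To make the intended semantics concrete, here is a minimal userspace toy
model of the check sketched above. It is not kernel code: struct toy_task,
toy_task_dying() and toy_will_free_mem() are hypothetical names made up for
illustration, and the single "dying" flag is an assumption standing in for
whatever __task_will_free_mem() really inspects.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a process; NOT the kernel's struct task_struct. */
struct toy_task {
	int mm_id;	/* tasks with the same mm_id share one address space */
	bool dying;	/* assumption: "already killed or exiting" */
};

/* Hypothetical analogue of __task_will_free_mem(task, ignore_exit). */
static bool toy_task_dying(const struct toy_task *t, bool ignore_exit)
{
	return ignore_exit || t->dying;
}

/*
 * Returns true only if reaping the target's mm can actually free memory:
 * the target is dying (or its state is ignored because we are about to
 * kill it), and every other task sharing the mm is dying too.
 */
static bool toy_will_free_mem(const struct toy_task *tasks, size_t n,
			      size_t target, bool ignore_exit)
{
	assert(target < n);

	/* The target's own state may be ignored. */
	if (!toy_task_dying(&tasks[target], ignore_exit))
		return false;

	/* Other sharers of the same mm are never given that exemption. */
	for (size_t i = 0; i < n; i++) {
		if (i == target || tasks[i].mm_id != tasks[target].mm_id)
			continue;
		if (!toy_task_dying(&tasks[i], false))
			return false;
	}
	return true;
}
```

With two live tasks sharing one mm, toy_will_free_mem(..., true) on either
of them returns false, because the other sharer still pins the mm. Once
the other sharer is marked dying, the same call returns true even though
the target itself has not been killed yet.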

