Date: Tue, 14 Apr 2026 13:00:16 -0700
From: Minchan Kim
To: Michal Hocko
Cc: akpm@linux-foundation.org, david@kernel.org, brauner@kernel.org,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org, surenb@google.com,
	timmurray@google.com
Subject: Re: [RFC 0/3] mm: process_mrelease: expedited reclaim and auto-kill support
References: <20260413223948.556351-1-minchan@kernel.org>

On Tue, Apr 14, 2026 at 08:57:57AM +0200, Michal Hocko wrote:
> On Mon 13-04-26 15:39:45, Minchan Kim wrote:
> > This patch series introduces optimizations to expedite memory reclamation
> > in process_mrelease() and provides a secure, race-free "auto-kill"
> > mechanism for efficient container shutdown and OOM handling.
> >
> > Currently, process_mrelease() unmaps pages but leaves clean file folios
> > on the LRU list, relying on standard memory reclaim to eventually free
> > them. Furthermore, requiring userspace to send a SIGKILL prior to
> > invoking process_mrelease() introduces scheduling race conditions where
> > the victim task may enter the exit path prematurely, bypassing expedited
> > reclamation hooks.
> >
> > This series addresses these limitations in three logical steps.
> >
> > Patch #1: mm: process_mrelease: expedite clean file folio reclaim via mmu_gather
> > Integrates clean file folio eviction directly into the low-level TLB
> > batching (mmu_gather) infrastructure. Symmetrically truncates clean file
> > folios alongside anonymous pages during the unmap loop.
>
> Why do we need to care about clean page cache? Is this a form of
> drop_caches?
The goal is to ensure the memory is actually freed by the time
process_mrelease() returns. Currently, process_mrelease() unmaps pages,
but page cache folios remain on the LRU, leaving them to be reclaimed
later by kswapd or direct reclaim. This delay defeats the purpose of an
"expedited" release. It's not a global drop_caches, but rather a
targeted eviction for the victim process, so that its memory becomes
immediately available for other urgent allocations.

> > Patch #2: mm: process_mrelease: skip LRU movement for exclusive file folios
> > Skips costly LRU marking (folio_mark_accessed) for exclusive file-backed
> > folios undergoing process_mrelease reclaim. Perf profiling reveals that
> > LRU movement accounts for ~55% of overhead during unmap.
>
> OK, but why is this not desirable behavior for mrelease?

In Android, lmkd kills background apps under memory pressure and then
calls process_mrelease. If the memory release is slow due to LRU
overhead (~55% as noted above), it cannot keep up with the allocation
speed of the foreground app. This delay often leads to "over-killing":
killing more background apps than necessary, because the system has not
yet seen the memory freed by the first kill.

> > Patch #3: mm: process_mrelease: introduce PROCESS_MRELEASE_REAP_KILL flag
> > Adds an auto-kill flag supporting atomic teardown. Utilizes a dedicated
> > signal code (KILL_MRELEASE) to guarantee MMF_UNSTABLE is marked in the
> > signal delivery path, preventing scheduling races.
>
> Could you explain why those races are a real problem?

The race occurs when the victim process starts its own exit path (after
SIGKILL) before the caller can invoke process_mrelease. If the victim
reaches the exit path first, the caller may lose the window to apply
these expedited reclamation optimizations. By combining the kill and the
release into an atomic operation with a dedicated signal code, we
guarantee that the process is reaped efficiently without competing with
the process's own teardown logic.
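For reference, a minimal sketch of the current two-step userspace
sequence whose window patch #3 is meant to close. pidfd_open() and
process_mrelease() are the real syscalls (Linux 5.15+); the fallback
syscall numbers are the x86_64 values, and the helper name here is just
for illustration:

```c
/* Sketch: today's kill-then-reap sequence. Between kill() and
 * process_mrelease() the victim may already be running its own exit
 * path; if it finishes first, the reaper releases nothing itself. */
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434		/* x86_64 */
#endif
#ifndef SYS_process_mrelease
#define SYS_process_mrelease 448	/* x86_64, Linux >= 5.15 */
#endif

/* Returns 0 if this caller reaped the victim's mm, 1 if it lost the
 * race (the victim exited on its own first). */
static int racy_kill_and_reap(void)
{
	int pidfd, ret, reaped;
	pid_t pid = fork();

	if (pid == 0) {			/* victim: sleep until killed */
		pause();
		_exit(0);
	}

	pidfd = (int)syscall(SYS_pidfd_open, pid, 0);
	assert(pidfd >= 0);

	/* Step 1: userspace must deliver SIGKILL first ... */
	assert(kill(pid, SIGKILL) == 0);

	/* Step 2: ... and only then reap the address space. This is
	 * the window in which the victim can enter its exit path. */
	ret = (int)syscall(SYS_process_mrelease, pidfd, 0);
	reaped = (ret == 0) ? 0 : 1;
	if (ret != 0)
		assert(errno == ESRCH || errno == EINVAL);

	waitpid(pid, NULL, 0);
	close(pidfd);
	return reaped;
}
```

With the proposed flag, the two steps would collapse into a single
call, something like syscall(SYS_process_mrelease, pidfd,
PROCESS_MRELEASE_REAP_KILL) (hypothetical; the flag exists only in this
series), letting the kernel mark the mm before the victim can race into
its own teardown.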
> --
> Michal Hocko
> SUSE Labs