Date: Tue, 14 Apr 2026 13:00:16 -0700
From: Minchan Kim
To: Michal Hocko
Cc: akpm@linux-foundation.org, david@kernel.org, brauner@kernel.org,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org, surenb@google.com,
	timmurray@google.com
Subject: Re: [RFC 0/3] mm: process_mrelease: expedited reclaim and auto-kill support
References: <20260413223948.556351-1-minchan@kernel.org>

On Tue, Apr 14, 2026 at 08:57:57AM +0200, Michal Hocko wrote:
> On Mon 13-04-26 15:39:45, Minchan Kim wrote:
> > This patch series introduces optimizations to expedite memory reclamation
> > in process_mrelease() and provides a secure, race-free "auto-kill"
> > mechanism for efficient container shutdown and OOM handling.
> >
> > Currently, process_mrelease() unmaps pages but leaves clean file folios
> > on the LRU list, relying on standard memory reclaim to eventually free
> > them. Furthermore, requiring userspace to send a SIGKILL prior to
> > invoking process_mrelease() introduces scheduling race conditions where
> > the victim task may enter the exit path prematurely, bypassing expedited
> > reclamation hooks.
> >
> > This series addresses these limitations in three logical steps.
> >
> > Patch #1: mm: process_mrelease: expedite clean file folio reclaim via mmu_gather
> > Integrates clean file folio eviction directly into the low-level TLB
> > batching (mmu_gather) infrastructure. Symmetrically truncates clean file
> > folios alongside anonymous pages during the unmap loop.
>
> Why do we need to care about clean page cache? Is this a form of
> drop_caches?
The goal is to ensure the memory is actually freed by the time
process_mrelease() returns. Currently, process_mrelease() unmaps pages,
but page cache folios remain on the LRU, leaving them to be reclaimed
later by kswapd or direct reclaim. This delay defeats the purpose of an
"expedited" release. It's not a global drop_caches, but rather a
targeted eviction for the victim process, so that its memory becomes
immediately available for other urgent allocations.

> > Patch #2: mm: process_mrelease: skip LRU movement for exclusive file folios
> > Skips costly LRU marking (folio_mark_accessed) for exclusive file-backed
> > folios undergoing process_mrelease reclaim. Perf profiling reveals that
> > LRU movement accounts for ~55% of overhead during unmap.
>
> OK, but why is this not desirable behavior for mrelease?

In Android, lmkd kills background apps under memory pressure and then
calls process_mrelease. If the memory release is slow due to LRU
overhead (~55% as noted above), it cannot keep up with the allocation
speed of the foreground app. This delay often leads to "over-killing":
killing more background apps than necessary, because the system has not
yet seen the memory freed by the first kill.

> > Patch #3: mm: process_mrelease: introduce PROCESS_MRELEASE_REAP_KILL flag
> > Adds an auto-kill flag supporting atomic teardown. Utilizes a dedicated
> > signal code (KILL_MRELEASE) to guarantee MMF_UNSTABLE is marked in the
> > signal delivery path, preventing scheduling races.
>
> Could you explain why those races are a real problem?

The race occurs when the victim process starts its own exit path (after
SIGKILL) before the caller can invoke process_mrelease. If the victim
reaches the exit path first, the caller may lose the window to apply
these expedited reclamation optimizations. By combining the kill and the
release into an atomic operation with a dedicated signal code, we
guarantee that the process is reaped efficiently without competing with
the process's own teardown logic.
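For reference, a minimal sketch of the current two-step userspace
sequence whose window patch #3 is meant to close. pidfd_open() and
process_mrelease() are the real syscalls (Linux 5.15+); the fallback
syscall numbers are the x86_64 values, and the helper name here is just
for illustration:

```c
/* Sketch: today's kill-then-reap sequence. Between kill() and
 * process_mrelease() the victim may already be running its own exit
 * path; if it finishes first, the reaper releases nothing itself. */
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434		/* x86_64 */
#endif
#ifndef SYS_process_mrelease
#define SYS_process_mrelease 448	/* x86_64, Linux >= 5.15 */
#endif

/* Returns 0 if this caller reaped the victim's mm, 1 if it lost the
 * race (the victim exited on its own first). */
static int racy_kill_and_reap(void)
{
	int pidfd, ret, reaped;
	pid_t pid = fork();

	if (pid == 0) {			/* victim: sleep until killed */
		pause();
		_exit(0);
	}

	pidfd = (int)syscall(SYS_pidfd_open, pid, 0);
	assert(pidfd >= 0);

	/* Step 1: userspace must deliver SIGKILL first ... */
	assert(kill(pid, SIGKILL) == 0);

	/* Step 2: ... and only then reap the address space. This is
	 * the window in which the victim can enter its exit path. */
	ret = (int)syscall(SYS_process_mrelease, pidfd, 0);
	reaped = (ret == 0) ? 0 : 1;
	if (ret != 0)
		assert(errno == ESRCH || errno == EINVAL);

	waitpid(pid, NULL, 0);
	close(pidfd);
	return reaped;
}
```

With the proposed flag, the two steps would collapse into a single
call, something like syscall(SYS_process_mrelease, pidfd,
PROCESS_MRELEASE_REAP_KILL) (hypothetical; the flag exists only in this
series), letting the kernel mark the mm before the victim can race into
its own teardown.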
> --
> Michal Hocko
> SUSE Labs