Date: Thu, 23 Apr 2026 17:08:39 -0700
From: Minchan Kim
To: Suren Baghdasaryan
Cc: "David Hildenbrand (Arm)", Michal Hocko, akpm@linux-foundation.org,
 brauner@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 timmurray@google.com, Johannes Weiner
Subject: Re: [RFC 0/3] mm: process_mrelease: expedited reclaim and auto-kill support
References: <28c3ae9f-c974-454c-b8ed-ba0ba0a5706d@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Apr 23, 2026 at 03:36:57PM -0700, Suren Baghdasaryan wrote:
> On Thu, Apr 23, 2026 at 2:50 AM David Hildenbrand (Arm) wrote:
> >
> > On 4/23/26 09:50, Michal Hocko wrote:
> > > On Mon 20-04-26 14:53:23, Minchan Kim wrote:
> > >> On Fri, Apr 17, 2026 at 09:11:21AM +0200, Michal Hocko wrote:
> > > [...]
> > >>> Yes. All which make sense, really. I am still not convinced about the
> > >>> clean page cache because that just seems like a hack to workaround wrong
> > >>> userspace oom heuristics.
> > >>
> > >> I see it a bit differently. When platform decides to kill a process
> > >> to free up memory, they want that memory back right away.
> > >>
> > >> So it doesn't make much sense for the kernel to ignore that and leave the clean
> > >> file pages to be picked up slowly by kswapd later.
> > >>
> > >> In some aspects, you can think of LMKD as a more specialized, userspace version
> > >> of kswapd. It has high-level knowledge of process priorities and knows exactly
> > >> which process is safe to kill to get memory instantly.
> > >> The kernel's kswapd,
> > >> however, operates globally without this specific process-level awareness, which
> > >> makes it less suited for this kind of targeted reclamation.
> > >>
> > >> If we force LMKD to rely on the slower global kswapd to actually free the clean
> > >> pages, it defeats the whole purpose of targeting a specific process.
> > >>
> > >> So letting process_mrelease speed this up isn't a hack at all. It's just helping
> > >> the kernel do what the admin wanted in the first place: fast, targeted memory.
> > >
> > > This is a very creative/disruptive way to do a memory reclaim. From a
> > > user POV I would much rather see clean page cache reclaimed before my
> > > apps start to disappear. But this is obviously your call and your users
> > > that will care.
> > >
> > > Anyway, I still maintain my position. I do not think it is a good
> > > idea to drop clean page cache as you do not know whether there are other
> > > users.
>
> I'm very much familiar with these issues in Android and really want to
> find a good solution for them. IIUC, this RFC tries to address 2
> things at once:
> 1. handling clean private page cache when reaping memory of a kill victim;
> 2. addressing a race between kill() and process_mrelease() when
> process_mrelease() can't happen before the kill() but if it happens too
> late after the victim passed its exit_mm() then process_mrelease()
> fails to find the mm to reap. This defeats the purpose of
> process_mrelease() call because the actual memory (released by
> exit_mmap()) might not yet be free and a successful process_mrelease()
> would be very beneficial.
>
> I see these two as separate issues and I'm not sure combining them
> into a single discussion is a good idea.

Yeah, they are two different issues, so I tried to show both problems
in the cover letter and address each issue one by one, in its own patch.

I can easily drop either of them if it's not well received.
I am also fine with sending them separately if combining them is confusing.
No problem.

>
> >
> > IIRC, Johannes raised in the past that we cannot predict the future.
> >
> > For example, if an app gets OOM-killed, wouldn't we usually try restarting it,
> > re-consuming the clean pagecache pages we would be evicting here?
>
> Sure, we can't predict which app the user will use next, so when
> killing we usually kill the least recently used one. That's a
> reasonable strategy in most cases.
> In general, if speeding up the victim's reclaim negatively affects the
> overall user workflow then this would mean we are selecting wrong kill
> targets. In that case, we would need to adjust the target selection
> strategy.
>
> Thanks for tackling this Minchan! I'll try to review the patches this
> weekend and provide my feedback.

Please go with the second patchset:

https://lore.kernel.org/linux-mm/20260421230239.172582-1-minchan@kernel.org/

Thanks!
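
For illustration only, the race described in item 2 of the quoted text can be
written down as a toy state machine. This is a userspace model, not kernel
code and not taken from the patches: the names VictimState and
toy_process_mrelease are hypothetical, and the errno values (EINVAL before the
kill, ESRCH once the mm is gone) are assumptions based on the documented
behavior of process_mrelease(2).

```python
import errno
from enum import Enum, auto


class VictimState(Enum):
    """Hypothetical lifecycle stages of a kill victim, as seen by the reaper."""
    ALIVE = auto()        # kill() has not been issued yet
    DYING = auto()        # kill() issued, mm still attached to the task
    MM_DETACHED = auto()  # victim passed exit_mm(); exit_mmap() may still be running


def toy_process_mrelease(state: VictimState) -> int:
    """Return 0 on success or a negative errno, modelling the syscall's outcome.

    Assumption of this sketch: the real call rejects a task that is not
    exiting with EINVAL and reports ESRCH when it cannot find an mm to reap.
    """
    if state is VictimState.ALIVE:
        # Too early: the reaper must wait until after kill().
        return -errno.EINVAL
    if state is VictimState.DYING:
        # The only window in which reaping can help.
        return 0
    # Too late: no mm to reap, although its memory may not be free yet.
    return -errno.ESRCH


if __name__ == "__main__":
    for state in VictimState:
        print(f"{state.name}: {toy_process_mrelease(state)}")
```

Only the middle state lets the reaper do useful work, which is why a call that
arrives after exit_mm() gains nothing even though exit_mmap() may still be
freeing pages.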