From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 18 Dec 2025 21:16:11 +0800
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Subject: Re: [PATCH v2 13/28] mm: migrate: prevent memory cgroup release in folio_migrate_mapping()
To: "David Hildenbrand (Red Hat)", hannes@cmpxchg.org, hughd@google.com,
 mhocko@suse.com, roman.gushchin@linux.dev, shakeel.butt@linux.dev,
 muchun.song@linux.dev, lorenzo.stoakes@oracle.com, ziy@nvidia.com,
 harry.yoo@oracle.com, imran.f.khan@oracle.com, kamalesh.babulal@oracle.com,
 axelrasmussen@google.com, yuanchu@google.com, weixugc@google.com,
 chenridong@huaweicloud.com, mkoutny@suse.com, akpm@linux-foundation.org,
 hamzamahfooz@linux.microsoft.com, apais@linux.microsoft.com,
 lance.yang@linux.dev
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
 Muchun Song, Qi Zheng
References: <1554459c705a46324b83799ede617b670b9e22fb.1765956025.git.zhengqi.arch@bytedance.com>
 <3a6ab69e-a2cc-4c61-9de1-9b0958c72dda@kernel.org>
 <02c3be32-4826-408d-8b96-1db51dcababf@linux.dev>
 <4effa243-bae3-45e4-8662-dca86a7e5d12@linux.dev>
 <11a60eba-3447-47de-9d59-af5842f5dc5e@kernel.org>
 <3c32d80a-ba0e-4ed2-87ae-fb80fc3374f7@linux.dev>
 <49341ca3-1fc9-43d9-abbd-ecaabdda6ce0@kernel.org>
From: Qi Zheng
In-Reply-To: <49341ca3-1fc9-43d9-abbd-ecaabdda6ce0@kernel.org>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
X-Migadu-Flow: FLOW_OUT

On 12/18/25 9:04 PM, David Hildenbrand (Red Hat) wrote:
> On 12/18/25 14:00, Qi Zheng wrote:
>>
>>
>> On 12/18/25 7:56 PM, David Hildenbrand (Red Hat) wrote:
>>> On 12/18/25 12:40, Qi Zheng wrote:
>>>>
>>>>
>>>> On 12/18/25 5:43 PM, David Hildenbrand (Red Hat) wrote:
>>>>> On 12/18/25 10:36, Qi Zheng wrote:
>>>>>>
>>>>>>
>>>>>> On 12/18/25 5:09 PM, David Hildenbrand (Red Hat) wrote:
>>>>>>> On 12/17/25 08:27, Qi Zheng wrote:
>>>>>>>> From: Muchun Song
>>>>>>>>
>>>>>>>> In the near future, a folio will no longer pin its corresponding
>>>>>>>> memory cgroup. To ensure safety, it will only be appropriate to
>>>>>>>> hold the rcu read lock or acquire a reference to the memory cgroup
>>>>>>>> returned by folio_memcg(), thereby preventing it from being
>>>>>>>> released.
>>>>>>>>
>>>>>>>> In the current patch, the rcu read lock is employed to safeguard
>>>>>>>> against the release of the memory cgroup in
>>>>>>>> folio_migrate_mapping().
>>>>>>>
>>>>>>> We usually avoid talking about "patches".
>>>>>>
>>>>>> Got it.
>>>>>>
>>>>>>>
>>>>>>> In __folio_migrate_mapping(), the rcu read lock ...
>>>>>>
>>>>>> Will do.
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> This serves as a preparatory measure for the reparenting of the
>>>>>>>> LRU pages.
>>>>>>>>
>>>>>>>> Signed-off-by: Muchun Song
>>>>>>>> Signed-off-by: Qi Zheng
>>>>>>>> Reviewed-by: Harry Yoo
>>>>>>>> ---
>>>>>>>>  mm/migrate.c | 2 ++
>>>>>>>>  1 file changed, 2 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/mm/migrate.c b/mm/migrate.c
>>>>>>>> index 5169f9717f606..8bcd588c083ca 100644
>>>>>>>> --- a/mm/migrate.c
>>>>>>>> +++ b/mm/migrate.c
>>>>>>>> @@ -671,6 +671,7 @@ static int __folio_migrate_mapping(struct
>>>>>>>> address_space *mapping,
>>>>>>>>          struct lruvec *old_lruvec, *new_lruvec;
>>>>>>>>          struct mem_cgroup *memcg;
>>>>>>>> +        rcu_read_lock();
>>>>>>>>          memcg = folio_memcg(folio);
>>>>>>>
>>>>>>> In general, LGTM
>>>>>>>
>>>>>>> I wonder, though, whether we should embed that in the API.
>>>>>>>
>>>>>>> Like "lock RCU and get the memcg" in one operation, and "return
>>>>>>> memcg and unlock RCU" in another operation.
>>>>>>
>>>>>> Do you mean adding a helper function like
>>>>>> get_mem_cgroup_from_folio()?
>>>>>
>>>>> Right, something like
>>>>>
>>>>> memcg = folio_memcg_begin(folio);
>>>>> folio_memcg_end(memcg);
>>>>
>>>> For some longer or might-sleep critical sections (such as those pointed
>>>> out by Johannes), perhaps it can be defined like this:
>>>>
>>>> struct mem_cgroup *folio_memcg_begin(struct folio *folio)
>>>> {
>>>>      return get_mem_cgroup_from_folio(folio);
>>>> }
>>>>
>>>> void folio_memcg_end(struct mem_cgroup *memcg)
>>>> {
>>>>      mem_cgroup_put(memcg);
>>>> }
>>>>
>>>> But for some short critical sections, using the RCU lock directly might
>>>> be the most convenient option?
>>>>
>>>
>>> Then put the rcu read locking in there instead?
>>
>> So for some longer or might-sleep critical sections, using:
>>
>> memcg = folio_memcg_begin(folio);
>> do_some_thing(memcg);
>> folio_memcg_end(memcg);
>>
>> for some short critical sections, using:
>>
>> rcu_read_lock();
>> memcg = folio_memcg(folio);
>> do_some_thing(memcg);
>> rcu_read_unlock();
>>
>> Right?
>
> What I mean is:
>
> memcg = folio_memcg_begin(folio);
> do_some_thing(memcg);
> folio_memcg_end(memcg);
>
> but do the rcu_read_lock() in folio_memcg_begin() and the
> rcu_read_unlock() in folio_memcg_end().
>
> You could also have (expensive) variants, as you describe, that mess
> with getting/dropping the memcg.

Or simply use folio_memcg_begin(folio)/folio_memcg_end(memcg) in all cases.

Or add a parameter to them:

struct mem_cgroup *folio_memcg_begin(struct folio *folio, bool get_refcnt)
{
	struct mem_cgroup *memcg;

	if (get_refcnt) {
		memcg = get_mem_cgroup_from_folio(folio);
	} else {
		rcu_read_lock();
		memcg = folio_memcg(folio);
	}

	return memcg;
}

void folio_memcg_end(struct mem_cgroup *memcg, bool get_refcnt)
{
	if (get_refcnt)
		mem_cgroup_put(memcg);
	else
		rcu_read_unlock();
}

>
> But my point was about hiding the rcu details in a set of helpers.
>
> Sorry if what I say is confusing.
>
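
For illustration, the variant that hides the RCU details entirely inside the
helpers could look roughly like the sketch below. It is a userspace mock, not
kernel code: struct folio, struct mem_cgroup, folio_memcg(), rcu_read_lock()
and rcu_read_unlock() are stand-ins defined here (the mock RCU only counts
nesting depth), and folio_memcg_begin()/folio_memcg_end() are the hypothetical
helper names from this thread, not existing kernel functions. memcg_id_of() is
likewise made up just to show a caller.

```c
#include <assert.h>

/* Mock types: stand-ins for the kernel structures, for illustration only. */
struct mem_cgroup { int id; };
struct folio { struct mem_cgroup *memcg; };

/* Mock RCU: only tracks read-side nesting depth so the pairing is checkable. */
static int rcu_depth;

static void rcu_read_lock(void)
{
	rcu_depth++;
}

static void rcu_read_unlock(void)
{
	assert(rcu_depth > 0);
	rcu_depth--;
}

/* Mock of folio_memcg(): returns the memcg the folio is charged to. */
static struct mem_cgroup *folio_memcg(struct folio *folio)
{
	return folio->memcg;
}

/*
 * Hypothetical helper: enter the RCU read-side section and look up the
 * memcg in one step, so callers never touch RCU directly.
 */
static struct mem_cgroup *folio_memcg_begin(struct folio *folio)
{
	rcu_read_lock();
	return folio_memcg(folio);
}

/*
 * Hypothetical helper: leave the read-side section. The memcg argument is
 * unused in this variant; it is kept only so that a refcount-based variant
 * could share the same signature.
 */
static void folio_memcg_end(struct mem_cgroup *memcg)
{
	(void)memcg;
	rcu_read_unlock();
}

/* Example caller (made up): read one field under the begin/end pair. */
static int memcg_id_of(struct folio *folio)
{
	struct mem_cgroup *memcg;
	int id;

	memcg = folio_memcg_begin(folio);
	id = memcg ? memcg->id : -1;
	folio_memcg_end(memcg);

	return id;
}
```

With this shape the caller must not sleep between begin and end, so the
longer or might-sleep sections would still need a refcount-based variant.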