From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from out-170.mta1.migadu.com (out-170.mta1.migadu.com [95.215.58.170])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5980C390C8A
	for ; Tue, 30 Jun 2026 16:31:14 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=95.215.58.170
ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1782837076; cv=none; b=m1FyiNIPwE0yCIcqsLIF/E5tZJcvMmjLkYmHt61Zs4kRaaUli1IH44cFqdYtuFBlPgebqv4vB1WipWZewKn2XoOj8zvOksO189Wm/hplFRyE1MQpihDd6eWyvHFB/W76EzbLFAS+Y0J2pFZhBnqLBJRxcYp1NlhYActcfQyFqsA=
ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1782837076; c=relaxed/simple;
	bh=wtwFpbp6pIPLwrKWd4uh5V4g7nNfmeyD0J5Ko1tGRxA=;
	h=Message-ID:Date:MIME-Version:Subject:To:Cc:References:From:
	 In-Reply-To:Content-Type; b=dFV97CIC0QesNBzjGezzy6X26OJjrdLK7BOLxSLGZyfDDt3SH5xUh12YNGSr6J6W3x7NpW9xi37gkDMELNpob9NX45z4aBFI1OZYecB2NywM2ZzFH8EmFxwv5C+EuWI5NTqi+aYXkWVJNudvkC2jCASDUQXrgINJ4FT0B6/Tybw=
ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=qtt7N7T+; arc=none smtp.client-ip=95.215.58.170
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="qtt7N7T+"
Message-ID:
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1;
	t=1782837072;
	h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
	 to:to:cc:cc:mime-version:mime-version:content-type:content-type:
	 content-transfer-encoding:content-transfer-encoding:
	 in-reply-to:in-reply-to:references:references;
	bh=5Xu6GzUkNCE14ufar9+KLOm5/yfYEVOMR8CoxaWNrKU=;
	b=qtt7N7T+U2sk/W/u8fGrzG+0CoCpe4CScmp3L/9RHZMf5U8PEFP6fLKvEoaqOe+fhohd25
	0IIYU2Lrm1tLUBArCArKtmdFuqsNX0jE7PLJnvF/+SA1c7YdGqrOyF9hS8+w5D8jsU3SJt
	6Ka614oNxoNUFFgYvJsjrLXazUgxkQw=
Date: Tue, 30 Jun 2026 17:30:58 +0100
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Subject: Re: [PATCH v3 1/2] mm/vmpressure: skip tree=true accounting on cgroup v2
To: Johannes Weiner
Cc: Andrew Morton, david@kernel.org, linux-mm@kvack.org, tj@kernel.org,
	mkoutny@suse.com, shakeel.butt@linux.dev, roman.gushchin@linux.dev,
	liam@infradead.org, linux-kernel@vger.kernel.org, ljs@kernel.org,
	mhocko@suse.com, rppt@kernel.org, surenb@google.com, vbabka@kernel.org,
	kernel-team@meta.com
References: <20260630112617.1198623-1-usama.arif@linux.dev>
	<20260630112617.1198623-2-usama.arif@linux.dev>
Content-Language: en-US
X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers.
From: Usama Arif
In-Reply-To:
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
X-Migadu-Flow: FLOW_OUT

On 30/06/2026 17:07, Johannes Weiner wrote:
> On Tue, Jun 30, 2026 at 04:23:32AM -0700, Usama Arif wrote:
>> vmpressure() has two outputs gated by the @tree argument:
>>
>>   @tree=false  drives in-kernel socket pressure (mem_cgroup_set_
>>                socket_pressure), consumed by TCP/SCTP. This only
>>                applies on cgroup v2; on v1 socket memory is charged
>>                separately via tcpmem and the consumer reads
>>                memcg->tcpmem_pressure instead.
>>
>>   @tree=true   drives userspace eventfd notifications via the v1
>>                memory.pressure_level / cgroup.event_control interface.
>>                v2 has no equivalent: userspace gets reclaim signals
>>                through memory.pressure (PSI), which does not touch
>>                vmpressure.
>>
>> The existing early return covered v1 + @tree=false. The symmetric
>> v2 + @tree=true case was falling through and doing the full lock /
>> accumulate / schedule_work / parent-walk dance for an events list
>> that can never be populated. bpftrace on a 176-core production host
>> (cgroup v2, CONFIG_MEMCG_V1=n, 285 memcgs, sustained reclaim) showed
>> ~16,200 @tree=true vmpressure() calls per minute. Add an early return
>> that skips cgroup v2 + tree = true which avoids us doing all this work.
>> On a v2-only host this also eliminates a lock contention path that can
>> serialise reclaimers on a single global sr_lock.
>>
>> Acked-by: Shakeel Butt
>> Signed-off-by: Usama Arif
>> ---
>>  mm/vmpressure.c | 10 ++++++----
>>  1 file changed, 6 insertions(+), 4 deletions(-)
>>
>> diff --git a/mm/vmpressure.c b/mm/vmpressure.c
>> index f053554e5826..c82cee1ab43b 100644
>> --- a/mm/vmpressure.c
>> +++ b/mm/vmpressure.c
>> @@ -246,11 +246,13 @@ void vmpressure(gfp_t gfp, int order, struct mem_cgroup *memcg, bool tree,
>>  		return;
>>
>>  	/*
>> -	 * The in-kernel users only care about the reclaim efficiency
>> -	 * for this @memcg rather than the whole subtree, and there
>> -	 * isn't and won't be any in-kernel user in a legacy cgroup.
>> +	 * Only two combinations have a consumer:
>> +	 *   cgroup v2 + tree=false -> in-kernel socket pressure
>> +	 *   cgroup v1 + tree=true  -> userspace eventfds (memory.pressure_level)
>> +	 * Skip the other two: nothing consumes the result.
>>  	 */
>> -	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys) && !tree)
>> +	if ((!cgroup_subsys_on_dfl(memory_cgrp_subsys) && !tree) ||
>> +	    (cgroup_subsys_on_dfl(memory_cgrp_subsys) && tree))
>>  		return;
>
> I had already acked this one, with a half serious suggestion to make
> this
>
> 	if (cgroup_subsys_on_dfl(memory_cgrp_subsys) == tree)
> 		return;
>
> Anyway, no strong feelings. If nobody agrees,
>
> Acked-by: Johannes Weiner

Yeah sorry about this! I just amended my last patch to move code from
vmpressure-v1.c to memcontrol-v1.c and just sent it, without other changes.
Forgot Shakeel's ack on v2 as well :(

Andrew, would you mind applying the below fixlet? I can also respin if
it's easier.

Thanks!!

>From 969c19da782bbcd77ae4b9e94d3a9e1d78c198d7 Mon Sep 17 00:00:00 2001
From: Usama Arif
Date: Tue, 30 Jun 2026 09:25:05 -0700
Subject: [fixlet] mm/vmpressure: skip tree=true accounting on cgroup v2

Simplify the guard. Both cgroup_subsys_on_dfl() and tree are bool, so
the two combinations that have no consumer (v1 + tree=false,
v2 + tree=true) are exactly the cases where dfl == tree.

Suggested-by: Johannes Weiner
Signed-off-by: Usama Arif
---
 mm/vmpressure.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/mm/vmpressure.c b/mm/vmpressure.c
index 14470141bbe6..9629240d77ad 100644
--- a/mm/vmpressure.c
+++ b/mm/vmpressure.c
@@ -120,8 +120,7 @@ void vmpressure(gfp_t gfp, int order, struct mem_cgroup *memcg, bool tree,
 	 *   cgroup v1 + tree=true  -> userspace eventfds (memory.pressure_level)
 	 * Skip the other two: nothing consumes the result.
 	 */
-	if ((!cgroup_subsys_on_dfl(memory_cgrp_subsys) && !tree) ||
-	    (cgroup_subsys_on_dfl(memory_cgrp_subsys) && tree))
+	if (cgroup_subsys_on_dfl(memory_cgrp_subsys) == tree)
 		return;
 
 	vmpr = memcg_to_vmpressure(memcg);
-- 
2.53.0-Meta
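
For anyone who wants to convince themselves that the one-line guard matches
the two-clause version, here is a small userspace sketch that walks all four
(dfl, tree) combinations. It is not kernel code: skip_two_clause(),
skip_equality() and guards_agree() are hypothetical names made up for this
illustration, and the on_dfl parameter stands in for the result of
cgroup_subsys_on_dfl(memory_cgrp_subsys), assumed to be a plain bool.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the two-clause guard:
 * skip on v1 + tree=false or on v2 + tree=true.
 */
static bool skip_two_clause(bool on_dfl, bool tree)
{
	return (!on_dfl && !tree) || (on_dfl && tree);
}

/* Hypothetical stand-in for the simplified guard: skip when dfl == tree. */
static bool skip_equality(bool on_dfl, bool tree)
{
	return on_dfl == tree;
}

/* Returns true if both guards agree on every (on_dfl, tree) combination. */
static bool guards_agree(void)
{
	for (int d = 0; d <= 1; d++) {
		for (int t = 0; t <= 1; t++) {
			bool on_dfl = d != 0;
			bool tree = t != 0;

			if (skip_two_clause(on_dfl, tree) !=
			    skip_equality(on_dfl, tree))
				return false;
		}
	}
	return true;
}
```

Under those assumptions guards_agree() returns true, and the two cases that
still fall through to the accounting are v2 + tree=false and v1 + tree=true,
matching the comment in the patch.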