From: Shakeel Butt <shakeel.butt@linux.dev>
To: Tejun Heo
Cc: "Paul E. McKenney", Andrew Morton, JP Kobryn, Johannes Weiner, Ying Huang, Vlastimil Babka, Alexei Starovoitov, Sebastian Andrzej Siewior, Michal Koutný, bpf@vger.kernel.org, linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Meta kernel team
Subject: [PATCH 2/2] cgroup: explain the race between updater and flusher
Date: Thu, 3 Jul 2025 13:00:12 -0700
Message-ID: <20250703200012.3734798-2-shakeel.butt@linux.dev>
In-Reply-To: <20250703200012.3734798-1-shakeel.butt@linux.dev>
References: <20250703200012.3734798-1-shakeel.butt@linux.dev>

Currently the rstat updater and the flusher can race and cause a
scenario where the stats updater skips adding the css to the lockless
list but the flusher might not see those updates done by the skipped
updater. This is a benign race: the subsequent flusher will flush those
stats, and at the moment there aren't any rstat users which are not fine
with this kind of race. However, some future user might want a stricter
guarantee, so let's add appropriate comments and data_race() tags to
ease the job of future users.

Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
---
 kernel/cgroup/rstat.c | 32 +++++++++++++++++++++++++++++---
 1 file changed, 29 insertions(+), 3 deletions(-)

diff --git a/kernel/cgroup/rstat.c b/kernel/cgroup/rstat.c
index c8a48cf83878..b98c03b1af25 100644
--- a/kernel/cgroup/rstat.c
+++ b/kernel/cgroup/rstat.c
@@ -60,6 +60,12 @@ static inline struct llist_head *ss_lhead_cpu(struct cgroup_subsys *ss, int cpu)
  * Atomically inserts the css in the ss's llist for the given cpu. This is
  * reentrant safe i.e. safe against softirq, hardirq and nmi. The ss's llist
  * will be processed at the flush time to create the update tree.
+ *
+ * NOTE: if the user needs the guarantee that the updater either adds itself in
+ * the lockless list or the concurrent flusher flushes its updated stats, a
+ * memory barrier is needed before the call to css_rstat_updated() i.e. a
+ * barrier after updating the per-cpu stats and before calling
+ * css_rstat_updated().
  */
 __bpf_kfunc void css_rstat_updated(struct cgroup_subsys_state *css, int cpu)
 {
@@ -86,8 +92,13 @@ __bpf_kfunc void css_rstat_updated(struct cgroup_subsys_state *css, int cpu)
 		return;

 	rstatc = css_rstat_cpu(css, cpu);
-	/* If already on list return. */
-	if (llist_on_list(&rstatc->lnode))
+	/*
+	 * If already on list return. This check is racy and smp_mb() is needed
+	 * to pair it with the smp_mb() in css_process_update_tree() if the
+	 * guarantee that the updated stats are visible to concurrent flusher is
+	 * needed.
+	 */
+	if (data_race(llist_on_list(&rstatc->lnode)))
 		return;

 	/*
@@ -145,9 +156,24 @@ static void css_process_update_tree(struct cgroup_subsys *ss, int cpu)
 	struct llist_head *lhead = ss_lhead_cpu(ss, cpu);
 	struct llist_node *lnode;

-	while ((lnode = llist_del_first_init(lhead))) {
+	while ((lnode = data_race(llist_del_first_init(lhead)))) {
 		struct css_rstat_cpu *rstatc;

+		/*
+		 * smp_mb() is needed here (more specifically in between
+		 * init_llist_node() and per-cpu stats flushing) if the
+		 * guarantee is required by a rstat user where either the
+		 * updater should add itself on the lockless list or the
+		 * flusher flushes the stats updated by the updater who has
+		 * observed that it is already on the list. The
+		 * corresponding barrier pair for this one should be before
+		 * css_rstat_updated() by the user.
+		 *
+		 * For now, there aren't any such users, so not adding the
+		 * barrier here but if such a use-case arises, please add
+		 * smp_mb() here.
+		 */
+
 		rstatc = container_of(lnode, struct css_rstat_cpu, lnode);
 		__css_process_update_tree(rstatc->owner, cpu);
 	}
-- 
2.47.1
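
For readers who want to see the barrier pairing described in the comments
in isolation, here is a minimal userspace sketch. It is not kernel code and
not part of this patch: struct model_rstat_cpu and the model_*() helpers are
hypothetical names, an atomic flag stands in for llist_on_list(), and C11
atomic_thread_fence(memory_order_seq_cst) stands in for smp_mb(). It only
illustrates the ordering a stricter user would need: the updater writes its
stats, issues a full barrier, then checks the list; the flusher takes the
node off the list, issues a full barrier, then reads the stats.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for one css's per-cpu rstat state. */
struct model_rstat_cpu {
	atomic_long pending;	/* models the not-yet-flushed per-cpu stats */
	atomic_bool on_list;	/* models llist_on_list(&rstatc->lnode) */
	long flushed;		/* total consumed so far, flusher-private */
};

static void model_init(struct model_rstat_cpu *rc)
{
	atomic_init(&rc->pending, 0);
	atomic_init(&rc->on_list, false);
	rc->flushed = 0;
}

/* Updater side: stats update, full barrier, then the racy on-list check. */
static void model_update(struct model_rstat_cpu *rc, long delta)
{
	bool expected = false;

	atomic_fetch_add_explicit(&rc->pending, delta, memory_order_relaxed);

	/* Models the smp_mb() a strict user adds before css_rstat_updated(). */
	atomic_thread_fence(memory_order_seq_cst);

	/* Already queued: with both barriers, the flusher will see our delta. */
	if (atomic_load_explicit(&rc->on_list, memory_order_relaxed))
		return;

	/* Models the insertion; only one context wins the exchange. */
	atomic_compare_exchange_strong_explicit(&rc->on_list, &expected, true,
						memory_order_relaxed,
						memory_order_relaxed);
}

/* Flusher side: dequeue, full barrier, then read the stats. */
static bool model_flush(struct model_rstat_cpu *rc)
{
	/* Models llist_del_first_init(): take the node off the list. */
	if (!atomic_exchange_explicit(&rc->on_list, false, memory_order_relaxed))
		return false;	/* nothing was queued */

	/* Models the smp_mb() the new comment says to add if needed. */
	atomic_thread_fence(memory_order_seq_cst);

	rc->flushed += atomic_exchange_explicit(&rc->pending, 0,
						memory_order_relaxed);
	return true;
}
```

With both fences in place, an updater that observes on_list as true cannot
have its delta missed by the flusher that cleared the flag. If either fence
is dropped, the sketch shows the benign race the commit message describes:
the delta can stay in pending until a later update queues the node again
and a later flush picks it up.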