From mboxrd@z Thu Jan 1 00:00:00 1970
From: Shakeel Butt
To: Tejun Heo
Cc: Andrew Morton, JP Kobryn, Johannes Weiner, Michal Hocko,
	Roman Gushchin, Muchun Song, Vlastimil Babka, Alexei Starovoitov,
	Sebastian Andrzej Siewior, Michal Koutný, Harry Yoo, Yosry Ahmed,
	bpf@vger.kernel.org, linux-mm@kvack.org, cgroups@vger.kernel.org,
	linux-kernel@vger.kernel.org, Meta kernel team
Subject: [PATCH v3 3/4] cgroup: remove per-cpu per-subsystem locks
Date: Tue, 17 Jun 2025 12:57:24 -0700
Message-ID: <20250617195725.1191132-4-shakeel.butt@linux.dev>
In-Reply-To: <20250617195725.1191132-1-shakeel.butt@linux.dev>
References: <20250617195725.1191132-1-shakeel.butt@linux.dev>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

The rstat update side used to insert the cgroup whose stats were updated
into the update tree, and the read side flushed the update tree to get
the up-to-date stats. The per-cpu per-subsystem locks were used to
synchronize the update and flush sides.
However, the update side no longer accesses the update tree and instead
uses per-cpu lockless lists, so there is no need for locks to synchronize
the update and flush sides. Let's remove them.

Suggested-by: JP Kobryn
Signed-off-by: Shakeel Butt
---
 include/linux/cgroup-defs.h   |   7 ---
 include/trace/events/cgroup.h |  47 ----------------
 kernel/cgroup/rstat.c         | 100 ++--------------------------------
 3 files changed, 4 insertions(+), 150 deletions(-)

diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h
index 04191d99228c..6b93a64115fe 100644
--- a/include/linux/cgroup-defs.h
+++ b/include/linux/cgroup-defs.h
@@ -375,12 +375,6 @@ struct css_rstat_cpu {
 	 * Child cgroups with stat updates on this cpu since the last read
 	 * are linked on the parent's ->updated_children through
 	 * ->updated_next. updated_children is terminated by its container css.
-	 *
-	 * In addition to being more compact, singly-linked list pointing to
-	 * the css makes it unnecessary for each per-cpu struct to point back
-	 * to the associated css.
-	 *
-	 * Protected by per-cpu css->ss->rstat_ss_cpu_lock.
 	 */
 	struct cgroup_subsys_state *updated_children;
 	struct cgroup_subsys_state *updated_next;	/* NULL if not on the list */
@@ -824,7 +818,6 @@ struct cgroup_subsys {
 	unsigned int depends_on;
 
 	spinlock_t rstat_ss_lock;
-	raw_spinlock_t __percpu *rstat_ss_cpu_lock;
 	struct llist_head __percpu *lhead; /* lockless update list head */
 };
 
diff --git a/include/trace/events/cgroup.h b/include/trace/events/cgroup.h
index 7d332387be6c..ba9229af9a34 100644
--- a/include/trace/events/cgroup.h
+++ b/include/trace/events/cgroup.h
@@ -257,53 +257,6 @@ DEFINE_EVENT(cgroup_rstat, cgroup_rstat_unlock,
 	TP_ARGS(cgrp, cpu, contended)
 );
 
-/*
- * Related to per CPU locks:
- * global rstat_base_cpu_lock for base stats
- * cgroup_subsys::rstat_ss_cpu_lock for subsystem stats
- */
-DEFINE_EVENT(cgroup_rstat, cgroup_rstat_cpu_lock_contended,
-
-	TP_PROTO(struct cgroup *cgrp, int cpu, bool contended),
-
-	TP_ARGS(cgrp, cpu, contended)
-);
-
-DEFINE_EVENT(cgroup_rstat, cgroup_rstat_cpu_lock_contended_fastpath,
-
-	TP_PROTO(struct cgroup *cgrp, int cpu, bool contended),
-
-	TP_ARGS(cgrp, cpu, contended)
-);
-
-DEFINE_EVENT(cgroup_rstat, cgroup_rstat_cpu_locked,
-
-	TP_PROTO(struct cgroup *cgrp, int cpu, bool contended),
-
-	TP_ARGS(cgrp, cpu, contended)
-);
-
-DEFINE_EVENT(cgroup_rstat, cgroup_rstat_cpu_locked_fastpath,
-
-	TP_PROTO(struct cgroup *cgrp, int cpu, bool contended),
-
-	TP_ARGS(cgrp, cpu, contended)
-);
-
-DEFINE_EVENT(cgroup_rstat, cgroup_rstat_cpu_unlock,
-
-	TP_PROTO(struct cgroup *cgrp, int cpu, bool contended),
-
-	TP_ARGS(cgrp, cpu, contended)
-);
-
-DEFINE_EVENT(cgroup_rstat, cgroup_rstat_cpu_unlock_fastpath,
-
-	TP_PROTO(struct cgroup *cgrp, int cpu, bool contended),
-
-	TP_ARGS(cgrp, cpu, contended)
-);
-
 #endif /* _TRACE_CGROUP_H */
 
 /* This part must be outside protection */
diff --git a/kernel/cgroup/rstat.c b/kernel/cgroup/rstat.c
index 823a4c7c3fea..c8a48cf83878 100644
--- a/kernel/cgroup/rstat.c
+++ b/kernel/cgroup/rstat.c
@@ -10,7 +10,6 @@
 #include
 
 static DEFINE_SPINLOCK(rstat_base_lock);
-static DEFINE_PER_CPU(raw_spinlock_t, rstat_base_cpu_lock);
 static DEFINE_PER_CPU(struct llist_head, rstat_backlog_list);
 
 static void cgroup_base_stat_flush(struct cgroup *cgrp, int cpu);
@@ -53,74 +52,6 @@ static inline struct llist_head *ss_lhead_cpu(struct cgroup_subsys *ss, int cpu)
 	return per_cpu_ptr(&rstat_backlog_list, cpu);
 }
 
-static raw_spinlock_t *ss_rstat_cpu_lock(struct cgroup_subsys *ss, int cpu)
-{
-	if (ss)
-		return per_cpu_ptr(ss->rstat_ss_cpu_lock, cpu);
-
-	return per_cpu_ptr(&rstat_base_cpu_lock, cpu);
-}
-
-/*
- * Helper functions for rstat per CPU locks.
- *
- * This makes it easier to diagnose locking issues and contention in
- * production environments. The parameter @fast_path determine the
- * tracepoints being added, allowing us to diagnose "flush" related
- * operations without handling high-frequency fast-path "update" events.
- */
-static __always_inline
-unsigned long _css_rstat_cpu_lock(struct cgroup_subsys_state *css, int cpu,
-		const bool fast_path)
-{
-	struct cgroup *cgrp = css->cgroup;
-	raw_spinlock_t *cpu_lock;
-	unsigned long flags;
-	bool contended;
-
-	/*
-	 * The _irqsave() is needed because the locks used for flushing are
-	 * spinlock_t which is a sleeping lock on PREEMPT_RT. Acquiring this lock
-	 * with the _irq() suffix only disables interrupts on a non-PREEMPT_RT
-	 * kernel. The raw_spinlock_t below disables interrupts on both
-	 * configurations. The _irqsave() ensures that interrupts are always
-	 * disabled and later restored.
-	 */
-	cpu_lock = ss_rstat_cpu_lock(css->ss, cpu);
-	contended = !raw_spin_trylock_irqsave(cpu_lock, flags);
-	if (contended) {
-		if (fast_path)
-			trace_cgroup_rstat_cpu_lock_contended_fastpath(cgrp, cpu, contended);
-		else
-			trace_cgroup_rstat_cpu_lock_contended(cgrp, cpu, contended);
-
-		raw_spin_lock_irqsave(cpu_lock, flags);
-	}
-
-	if (fast_path)
-		trace_cgroup_rstat_cpu_locked_fastpath(cgrp, cpu, contended);
-	else
-		trace_cgroup_rstat_cpu_locked(cgrp, cpu, contended);
-
-	return flags;
-}
-
-static __always_inline
-void _css_rstat_cpu_unlock(struct cgroup_subsys_state *css, int cpu,
-		unsigned long flags, const bool fast_path)
-{
-	struct cgroup *cgrp = css->cgroup;
-	raw_spinlock_t *cpu_lock;
-
-	if (fast_path)
-		trace_cgroup_rstat_cpu_unlock_fastpath(cgrp, cpu, false);
-	else
-		trace_cgroup_rstat_cpu_unlock(cgrp, cpu, false);
-
-	cpu_lock = ss_rstat_cpu_lock(css->ss, cpu);
-	raw_spin_unlock_irqrestore(cpu_lock, flags);
-}
-
 /**
  * css_rstat_updated - keep track of updated rstat_cpu
  * @css: target cgroup subsystem state
@@ -323,15 +254,12 @@ static struct cgroup_subsys_state *css_rstat_updated_list(
 {
 	struct css_rstat_cpu *rstatc = css_rstat_cpu(root, cpu);
 	struct cgroup_subsys_state *head = NULL, *parent, *child;
-	unsigned long flags;
-
-	flags = _css_rstat_cpu_lock(root, cpu, false);
 
 	css_process_update_tree(root->ss, cpu);
 
 	/* Return NULL if this subtree is not on-list */
 	if (!rstatc->updated_next)
-		goto unlock_ret;
+		return NULL;
 
 	/*
 	 * Unlink @root from its parent. As the updated_children list is
@@ -363,8 +291,7 @@ static struct cgroup_subsys_state *css_rstat_updated_list(
 	rstatc->updated_children = root;
 	if (child != root)
 		head = css_rstat_push_children(head, child, cpu);
-unlock_ret:
-	_css_rstat_cpu_unlock(root, cpu, flags, false);
+
 	return head;
 }
 
@@ -560,34 +487,15 @@ int __init ss_rstat_init(struct cgroup_subsys *ss)
 {
 	int cpu;
 
-#ifdef CONFIG_SMP
-	/*
-	 * On uniprocessor machines, arch_spinlock_t is defined as an empty
-	 * struct. Avoid allocating a size of zero by having this block
-	 * excluded in this case. It's acceptable to leave the subsystem locks
-	 * unitialized since the associated lock functions are no-ops in the
-	 * non-smp case.
-	 */
-	if (ss) {
-		ss->rstat_ss_cpu_lock = alloc_percpu(raw_spinlock_t);
-		if (!ss->rstat_ss_cpu_lock)
-			return -ENOMEM;
-	}
-#endif
-
 	if (ss) {
 		ss->lhead = alloc_percpu(struct llist_head);
-		if (!ss->lhead) {
-			free_percpu(ss->rstat_ss_cpu_lock);
+		if (!ss->lhead)
 			return -ENOMEM;
-		}
 	}
 
 	spin_lock_init(ss_rstat_lock(ss));
-	for_each_possible_cpu(cpu) {
-		raw_spin_lock_init(ss_rstat_cpu_lock(ss, cpu));
+	for_each_possible_cpu(cpu)
 		init_llist_head(ss_lhead_cpu(ss, cpu));
-	}
 
 	return 0;
 }
-- 
2.47.1