From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1EB57C369D3 for ; Tue, 22 Apr 2025 14:01:25 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 188916B0006; Tue, 22 Apr 2025 10:01:24 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 139346B0007; Tue, 22 Apr 2025 10:01:24 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id F41A36B000A; Tue, 22 Apr 2025 10:01:23 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id D61236B0006 for ; Tue, 22 Apr 2025 10:01:23 -0400 (EDT) Received: from smtpin14.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay10.hostedemail.com (Postfix) with ESMTP id CDC8EC188C for ; Tue, 22 Apr 2025 14:01:24 +0000 (UTC) X-FDA: 83361842088.14.93E5958 Received: from out-181.mta0.migadu.com (out-181.mta0.migadu.com [91.218.175.181]) by imf12.hostedemail.com (Postfix) with ESMTP id E34BC40004 for ; Tue, 22 Apr 2025 14:01:22 +0000 (UTC) Authentication-Results: imf12.hostedemail.com; dkim=pass header.d=linux.dev header.s=key1 header.b=QBRFHDv1; dmarc=pass (policy=none) header.from=linux.dev; spf=pass (imf12.hostedemail.com: domain of yosry.ahmed@linux.dev designates 91.218.175.181 as permitted sender) smtp.mailfrom=yosry.ahmed@linux.dev ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1745330483; a=rsa-sha256; cv=none; b=FdyWmeoRjMb2crf4i00GBzrcIM19BzYPntHlGbhJOb6b8qM3xgeyDuoIEbinJNygcQjaHX Y6qq7zZpqXTWQXeXKXTJwo5wWDa1GmqcjfqQXKM97MagX5EyUHUBlXENNGRrMZIxZ8a36c ZIxcg5zCK0R/vKl7lhw3ZULE6BJOVyQ= ARC-Authentication-Results: i=1; imf12.hostedemail.com; dkim=pass header.d=linux.dev header.s=key1 header.b=QBRFHDv1; dmarc=pass (policy=none) header.from=linux.dev; spf=pass (imf12.hostedemail.com: domain of yosry.ahmed@linux.dev designates 91.218.175.181 as permitted sender) smtp.mailfrom=yosry.ahmed@linux.dev ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1745330483; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=KZiyAKGmgRNuRBsZBpRX/xVMhFS+3p/30Wi1DCLvGIo=; b=XRnGprDLOvjRiRShP5TYaYZR87eTS7PeLdJ597esdX3kts11CfZsgfecudFYulpwkaezmY dq70rQRUJ2ixGMhP/wZpqNhEPrS6DD5UNbi9EG3kcaos2HdoVbeYYuyi+2+iB3H9tTYCz2 m65Bpk5ZtjjtNp4Twwg2BsqNSZ3M+z8= Date: Tue, 22 Apr 2025 07:01:15 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1745330480; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=KZiyAKGmgRNuRBsZBpRX/xVMhFS+3p/30Wi1DCLvGIo=; b=QBRFHDv1EvPS1zuu6ArnOt8XsHWRYR3F8ozD+tjUPysKe5ofjqULjfrAJKg2qxBmlIHq3Q 16+1tmHmJU9+znGcrRm1Z+G+p0nbgd8qdOvFgDh7VdhJBjeHqh2Sad9ZFxFZBEaBfCqzqV KCFEl/TZJIsbTdHKRdoNjok+Sfwq6fE= X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. From: Yosry Ahmed To: JP Kobryn Cc: tj@kernel.org, shakeel.butt@linux.dev, yosryahmed@google.com, mkoutny@suse.com, hannes@cmpxchg.org, akpm@linux-foundation.org, linux-mm@kvack.org, cgroups@vger.kernel.org, kernel-team@meta.com Subject: Re: [PATCH v4 5/5] cgroup: use subsystem-specific rstat locks to avoid contention Message-ID: References: <20250404011050.121777-1-inwardvessel@gmail.com> <20250404011050.121777-6-inwardvessel@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20250404011050.121777-6-inwardvessel@gmail.com> X-Migadu-Flow: FLOW_OUT X-Rspamd-Server: rspam10 X-Rspamd-Queue-Id: E34BC40004 X-Stat-Signature: cqhcbc6fig84jih1g6j4occ3w1da5b7b X-Rspam-User: X-HE-Tag: 1745330482-110672 X-HE-Meta: U2FsdGVkX191x+CARaN5LT7UbQmhjAG2+ZetyhT/mucxLisHlrRW6FlCk4dg0hE7dP1+Hs1HRin+CizJnqt2kZ8NWDTPBynJcoavYwaeDcUgLqdKUNBoDioCcJYQN/1o7J6witGVjbgilVH+KUO8CEqAIgwAIJiqyFFAJkPzDx0HoWXOnyf2g6OiFdkA830L7dwm2/4OklKq1m6p8KW3WjOZDo8pdZKXFjRmj2vKfCZ8Q2VaEqLUrnsEbQ9ns5ml0ooJUEgnXm7pV/l4JjgocgN7tAinJOJJv8YtTAAj8f7cuXtMbZcxC9oFEDPU6Y6/iA2J5atw0rpGRTtArvjmvJDoidaY6F3aMst124wUbXfIe1R5i5figw+I1YC5SIbKSFPZT5oTss6LV9Nsbhe7xczggzIXmlTRuEWDn0IiJM8qNGzVsSaTGP28Nzac/7CSff6QNhQmY5smrVJr0wz4KHnb2VeAwP1z17qxDzCqhQh3Y3Ctq2j8v/Pejx6h+Yj3ytXBO4B/5yvlUXM3aHCCsfWaISw35k5ViNFvzTcoKEPWsN6Dzlj5kk8JMWK+zVW1YLXupKMIzWBoLaobmaS5coo2gGPBekpIe3Gwy41EqUxxn2hmnmSKTwuLDiiDDNB4dVwHEkZgIVWJ9DArBplFCZt6b0EQ4Rokd84mrPJG+cLEAHMUXBktYwXnEOXIMAKNwObwzRbBO0RQSQNf+YHyCCtmWZ7pw2CQtiHZZxvie1IXJGl5Qa8S753gSI4Wr66mN/Xf9ChWc9xCyf5qmprwk3Uh4aUzeL3A8QfPpkxfX32BHr5k0F6O4+/OFX2zGUrwtlZyNIW11iSpzLxPwdgZkVb0J38EGjAWkHZ+Qvj5FwCPnKIuQGq3eAHW2CXe/0yL3ObrLlMBFl/WmHWZkhzeV8hvyZYNlzAHk/7b0fCGbFR8eNLnLGQNZhxyhW1Syxy0SE3cII9rntkuoxhvMen ELQOGafM aPvzfbKS5AGeHJ1QE38cD4CZqpVzFGsqF6pMkjoNNT7HFf8xhsVYd5scAkckUYI2g9ADB9n+EngUxlQwgoK1mgqNVSCLiW10cUBHPD1GbDERa9bMBA7ISwve1wKWMJAaBCELZyA8gO0kqmX0FhA2XdpfTwAObQlB87hshYN6ZAGj+gJgFg0bdsos8SD1+omi1hJVpq29iLqHMHC4= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Thu, Apr 03, 2025 at 06:10:50PM -0700, JP Kobryn wrote: > It is possible to eliminate contention between subsystems when > updating/flushing stats by using subsystem-specific locks. Let the existing > rstat locks be dedicated to the cgroup base stats and rename them to > reflect that. Add similar locks to the cgroup_subsys struct for use with > individual subsystems. > > Lock initialization is done in the new function ss_rstat_init(ss) which > replaces cgroup_rstat_boot(void). If NULL is passed to this function, the > global base stat locks will be initialized. Otherwise, the subsystem locks > will be initialized. > > Change the existing lock helper functions to accept a reference to a css. > Then within these functions, conditionally select the appropriate locks > based on the subsystem affiliation of the given css. Add helper functions > for this selection routine to avoid repeated code. > > Signed-off-by: JP Kobryn > --- > block/blk-cgroup.c | 2 +- > include/linux/cgroup-defs.h | 16 +++-- > include/trace/events/cgroup.h | 12 +++- > kernel/cgroup/cgroup-internal.h | 2 +- > kernel/cgroup/cgroup.c | 10 +++- > kernel/cgroup/rstat.c | 101 +++++++++++++++++++++++--------- > 6 files changed, 103 insertions(+), 40 deletions(-) > > diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c > index 0560ea402856..62d0bf1e1a04 100644 > --- a/block/blk-cgroup.c > +++ b/block/blk-cgroup.c > @@ -1074,7 +1074,7 @@ static void __blkcg_rstat_flush(struct blkcg *blkcg, int cpu) > /* > * For covering concurrent parent blkg update from blkg_release(). > * > - * When flushing from cgroup, cgroup_rstat_lock is always held, so > + * When flushing from cgroup, the subsystem lock is always held, so > * this lock won't cause contention most of time. > */ > raw_spin_lock_irqsave(&blkg_stat_lock, flags); > diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h > index c58c21c2110a..bb5a355524d6 100644 > --- a/include/linux/cgroup-defs.h > +++ b/include/linux/cgroup-defs.h > @@ -223,7 +223,10 @@ struct cgroup_subsys_state { > /* > * A singly-linked list of css structures to be rstat flushed. > * This is a scratch field to be used exclusively by > - * css_rstat_flush_locked() and protected by cgroup_rstat_lock. > + * css_rstat_flush_locked(). > + * > + * Protected by rstat_base_lock when css is cgroup::self. > + * Protected by css->ss->rstat_ss_lock otherwise. > */ > struct cgroup_subsys_state *rstat_flush_next; > }; > @@ -359,11 +362,11 @@ struct css_rstat_cpu { > * are linked on the parent's ->updated_children through > * ->updated_next. > * > - * In addition to being more compact, singly-linked list pointing > - * to the cgroup makes it unnecessary for each per-cpu struct to > - * point back to the associated cgroup. > + * In addition to being more compact, singly-linked list pointing to > + * the css makes it unnecessary for each per-cpu struct to point back > + * to the associated css. > * > - * Protected by per-cpu cgroup_rstat_cpu_lock. > + * Protected by per-cpu css->ss->rstat_ss_cpu_lock. > */ > struct cgroup_subsys_state *updated_children; /* terminated by self cgroup */ This rename belongs in the previous patch, also the comment about updated_children should probably say "self css" now. > struct cgroup_subsys_state *updated_next; /* NULL iff not on the list */ > @@ -793,6 +796,9 @@ struct cgroup_subsys { > * specifies the mask of subsystems that this one depends on. > */ > unsigned int depends_on; > + > + spinlock_t rstat_ss_lock; > + raw_spinlock_t __percpu *rstat_ss_cpu_lock; Can we use local_lock_t here instead? I guess it would be annoying because we won't be able to have common code for locking/unlocking. It's annoying because the local lock is a spinlock under the hood for non-RT kernels anyway.. > }; > > extern struct percpu_rw_semaphore cgroup_threadgroup_rwsem; [..] > diff --git a/kernel/cgroup/rstat.c b/kernel/cgroup/rstat.c > index 37d9e5012b2d..bcc253aec774 100644 > --- a/kernel/cgroup/rstat.c > +++ b/kernel/cgroup/rstat.c > @@ -9,8 +9,8 @@ > > #include > > -static DEFINE_SPINLOCK(cgroup_rstat_lock); > -static DEFINE_PER_CPU(raw_spinlock_t, cgroup_rstat_cpu_lock); > +static DEFINE_SPINLOCK(rstat_base_lock); > +static DEFINE_PER_CPU(raw_spinlock_t, rstat_base_cpu_lock); Can we do something like this (not sure the macro usage is correct): static DEFINE_PER_CPU(raw_spinlock_t, rstat_base_cpu_lock) = __SPIN_LOCK_UNLOCKED(rstat_base_cpu_lock); This should initialize the per-CPU spinlocks the same way DEFINE_SPINLOCK does IIUC. > > static void cgroup_base_stat_flush(struct cgroup *cgrp, int cpu); > [..] > @@ -422,12 +443,36 @@ void css_rstat_exit(struct cgroup_subsys_state *css) > css->rstat_cpu = NULL; > } > > -void __init cgroup_rstat_boot(void) > +/** > + * ss_rstat_init - subsystem-specific rstat initialization > + * @ss: target subsystem > + * > + * If @ss is NULL, the static locks associated with the base stats > + * are initialized. If @ss is non-NULL, the subsystem-specific locks > + * are initialized. > + */ > +int __init ss_rstat_init(struct cgroup_subsys *ss) > { > int cpu; > > + if (!ss) { > + spin_lock_init(&rstat_base_lock); IIUC locks defined with DEFINE_SPINLOCK() do not need to be initialized, and I believe we can achieve the same for the per-CPU locks as I described above and eliminate this branch completely. > + > + for_each_possible_cpu(cpu) > + raw_spin_lock_init(per_cpu_ptr(&rstat_base_cpu_lock, cpu)); > + > + return 0; > + } > + > + spin_lock_init(&ss->rstat_ss_lock); > + ss->rstat_ss_cpu_lock = alloc_percpu(raw_spinlock_t); > + if (!ss->rstat_ss_cpu_lock) > + return -ENOMEM; > + > for_each_possible_cpu(cpu) > - raw_spin_lock_init(per_cpu_ptr(&cgroup_rstat_cpu_lock, cpu)); > + raw_spin_lock_init(per_cpu_ptr(ss->rstat_ss_cpu_lock, cpu)); > + > + return 0; > } > > /* > -- > 2.47.1 > >