From: Yosry Ahmed <yosry.ahmed@linux.dev>
To: JP Kobryn <inwardvessel@gmail.com>
Cc: tj@kernel.org, shakeel.butt@linux.dev, yosryahmed@google.com,
mkoutny@suse.com, hannes@cmpxchg.org, akpm@linux-foundation.org,
linux-mm@kvack.org, cgroups@vger.kernel.org,
kernel-team@meta.com
Subject: Re: [PATCH v4 5/5] cgroup: use subsystem-specific rstat locks to avoid contention
Date: Tue, 22 Apr 2025 07:01:15 -0700 [thread overview]
Message-ID: <aAehK23MNX1FsRjF@Asmaa.> (raw)
In-Reply-To: <20250404011050.121777-6-inwardvessel@gmail.com>
On Thu, Apr 03, 2025 at 06:10:50PM -0700, JP Kobryn wrote:
> It is possible to eliminate contention between subsystems when
> updating/flushing stats by using subsystem-specific locks. Let the existing
> rstat locks be dedicated to the cgroup base stats and rename them to
> reflect that. Add similar locks to the cgroup_subsys struct for use with
> individual subsystems.
>
> Lock initialization is done in the new function ss_rstat_init(ss) which
> replaces cgroup_rstat_boot(void). If NULL is passed to this function, the
> global base stat locks will be initialized. Otherwise, the subsystem locks
> will be initialized.
>
> Change the existing lock helper functions to accept a reference to a css.
> Then within these functions, conditionally select the appropriate locks
> based on the subsystem affiliation of the given css. Add helper functions
> for this selection routine to avoid repeated code.
>
> Signed-off-by: JP Kobryn <inwardvessel@gmail.com>
> ---
> block/blk-cgroup.c | 2 +-
> include/linux/cgroup-defs.h | 16 +++--
> include/trace/events/cgroup.h | 12 +++-
> kernel/cgroup/cgroup-internal.h | 2 +-
> kernel/cgroup/cgroup.c | 10 +++-
> kernel/cgroup/rstat.c | 101 +++++++++++++++++++++++---------
> 6 files changed, 103 insertions(+), 40 deletions(-)
>
> diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
> index 0560ea402856..62d0bf1e1a04 100644
> --- a/block/blk-cgroup.c
> +++ b/block/blk-cgroup.c
> @@ -1074,7 +1074,7 @@ static void __blkcg_rstat_flush(struct blkcg *blkcg, int cpu)
> /*
> * For covering concurrent parent blkg update from blkg_release().
> *
> - * When flushing from cgroup, cgroup_rstat_lock is always held, so
> + * When flushing from cgroup, the subsystem lock is always held, so
> * this lock won't cause contention most of time.
> */
> raw_spin_lock_irqsave(&blkg_stat_lock, flags);
> diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h
> index c58c21c2110a..bb5a355524d6 100644
> --- a/include/linux/cgroup-defs.h
> +++ b/include/linux/cgroup-defs.h
> @@ -223,7 +223,10 @@ struct cgroup_subsys_state {
> /*
> * A singly-linked list of css structures to be rstat flushed.
> * This is a scratch field to be used exclusively by
> - * css_rstat_flush_locked() and protected by cgroup_rstat_lock.
> + * css_rstat_flush_locked().
> + *
> + * Protected by rstat_base_lock when css is cgroup::self.
> + * Protected by css->ss->rstat_ss_lock otherwise.
> */
> struct cgroup_subsys_state *rstat_flush_next;
> };
> @@ -359,11 +362,11 @@ struct css_rstat_cpu {
> * are linked on the parent's ->updated_children through
> * ->updated_next.
> *
> - * In addition to being more compact, singly-linked list pointing
> - * to the cgroup makes it unnecessary for each per-cpu struct to
> - * point back to the associated cgroup.
> + * In addition to being more compact, singly-linked list pointing to
> + * the css makes it unnecessary for each per-cpu struct to point back
> + * to the associated css.
> *
> - * Protected by per-cpu cgroup_rstat_cpu_lock.
> + * Protected by per-cpu css->ss->rstat_ss_cpu_lock.
> */
> struct cgroup_subsys_state *updated_children; /* terminated by self cgroup */
This rename belongs in the previous patch, also the comment about
updated_children should probably say "self css" now.
> struct cgroup_subsys_state *updated_next; /* NULL iff not on the list */
> @@ -793,6 +796,9 @@ struct cgroup_subsys {
> * specifies the mask of subsystems that this one depends on.
> */
> unsigned int depends_on;
> +
> + spinlock_t rstat_ss_lock;
> + raw_spinlock_t __percpu *rstat_ss_cpu_lock;
Can we use local_lock_t here instead? I guess it would be annoying
because we won't be able to have common code for locking/unlocking. It's
annoying because the local lock is a spinlock under the hood for non-RT
kernels anyway..
> };
>
> extern struct percpu_rw_semaphore cgroup_threadgroup_rwsem;
[..]
> diff --git a/kernel/cgroup/rstat.c b/kernel/cgroup/rstat.c
> index 37d9e5012b2d..bcc253aec774 100644
> --- a/kernel/cgroup/rstat.c
> +++ b/kernel/cgroup/rstat.c
> @@ -9,8 +9,8 @@
>
> #include <trace/events/cgroup.h>
>
> -static DEFINE_SPINLOCK(cgroup_rstat_lock);
> -static DEFINE_PER_CPU(raw_spinlock_t, cgroup_rstat_cpu_lock);
> +static DEFINE_SPINLOCK(rstat_base_lock);
> +static DEFINE_PER_CPU(raw_spinlock_t, rstat_base_cpu_lock);
Can we do something like this (not sure the macro usage is correct):
static DEFINE_PER_CPU(raw_spinlock_t, rstat_base_cpu_lock) = __SPIN_LOCK_UNLOCKED(rstat_base_cpu_lock);
This should initialize the per-CPU spinlocks the same way
DEFINE_SPINLOCK does IIUC.
>
> static void cgroup_base_stat_flush(struct cgroup *cgrp, int cpu);
>
[..]
> @@ -422,12 +443,36 @@ void css_rstat_exit(struct cgroup_subsys_state *css)
> css->rstat_cpu = NULL;
> }
>
> -void __init cgroup_rstat_boot(void)
> +/**
> + * ss_rstat_init - subsystem-specific rstat initialization
> + * @ss: target subsystem
> + *
> + * If @ss is NULL, the static locks associated with the base stats
> + * are initialized. If @ss is non-NULL, the subsystem-specific locks
> + * are initialized.
> + */
> +int __init ss_rstat_init(struct cgroup_subsys *ss)
> {
> int cpu;
>
> + if (!ss) {
> + spin_lock_init(&rstat_base_lock);
IIUC locks defined with DEFINE_SPINLOCK() do not need to be initialized,
and I believe we can achieve the same for the per-CPU locks as I
described above and eliminate this branch completely.
> +
> + for_each_possible_cpu(cpu)
> + raw_spin_lock_init(per_cpu_ptr(&rstat_base_cpu_lock, cpu));
> +
> + return 0;
> + }
> +
> + spin_lock_init(&ss->rstat_ss_lock);
> + ss->rstat_ss_cpu_lock = alloc_percpu(raw_spinlock_t);
> + if (!ss->rstat_ss_cpu_lock)
> + return -ENOMEM;
> +
> for_each_possible_cpu(cpu)
> - raw_spin_lock_init(per_cpu_ptr(&cgroup_rstat_cpu_lock, cpu));
> + raw_spin_lock_init(per_cpu_ptr(ss->rstat_ss_cpu_lock, cpu));
> +
> + return 0;
> }
>
> /*
> --
> 2.47.1
>
>
next prev parent reply other threads:[~2025-04-22 14:01 UTC|newest]
Thread overview: 38+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-04-04 1:10 [PATCH v4 0/5] cgroup: separate rstat trees JP Kobryn
2025-04-04 1:10 ` [PATCH v4 1/5] cgroup: move rstat base stat objects into their own struct JP Kobryn
2025-04-15 17:16 ` Michal Koutný
2025-04-22 12:13 ` Yosry Ahmed
2025-05-29 18:58 ` Ihor Solodrai
2025-05-29 19:11 ` Yonghong Song
2025-04-04 1:10 ` [PATCH v4 2/5] cgroup: add helper for checking when css is cgroup::self JP Kobryn
2025-04-22 12:19 ` Yosry Ahmed
[not found] ` <68078968.5d0a0220.2c3c35.bab3SMTPIN_ADDED_BROKEN@mx.google.com>
2025-04-24 16:59 ` JP Kobryn
2025-04-04 1:10 ` [PATCH v4 3/5] cgroup: change rstat function signatures from cgroup-based to css-based JP Kobryn
2025-04-04 20:00 ` Tejun Heo
2025-04-04 20:09 ` Tejun Heo
2025-04-04 21:21 ` JP Kobryn
2025-04-22 12:35 ` Yosry Ahmed
2025-04-22 12:39 ` Yosry Ahmed
[not found] ` <68078d3c.050a0220.3d37e.6d82SMTPIN_ADDED_BROKEN@mx.google.com>
2025-04-24 17:10 ` JP Kobryn
2025-04-04 1:10 ` [PATCH v4 4/5] cgroup: use separate rstat trees for each subsystem JP Kobryn
2025-04-15 17:15 ` Michal Koutný
2025-04-16 21:43 ` JP Kobryn
2025-04-17 9:26 ` Michal Koutný
2025-04-17 19:05 ` JP Kobryn
2025-04-17 20:10 ` JP Kobryn
2025-04-21 18:18 ` JP Kobryn
2025-04-22 13:33 ` Yosry Ahmed
[not found] ` <68079aa7.df0a0220.30a1a0.cbb2SMTPIN_ADDED_BROKEN@mx.google.com>
2025-04-30 23:43 ` JP Kobryn
2025-05-06 9:37 ` Yosry Ahmed
2025-04-04 1:10 ` [PATCH v4 5/5] cgroup: use subsystem-specific rstat locks to avoid contention JP Kobryn
2025-04-04 20:28 ` Tejun Heo
2025-04-11 3:31 ` JP Kobryn
2025-04-15 17:15 ` Michal Koutný
2025-04-15 19:30 ` Tejun Heo
2025-04-16 9:50 ` Michal Koutný
2025-04-16 18:10 ` JP Kobryn
2025-04-16 18:14 ` Tejun Heo
2025-04-16 18:01 ` JP Kobryn
2025-04-22 14:01 ` Yosry Ahmed [this message]
2025-04-24 17:25 ` Shakeel Butt
[not found] ` <6807a132.df0a0220.28dc80.a1f0SMTPIN_ADDED_BROKEN@mx.google.com>
2025-04-25 0:18 ` JP Kobryn
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aAehK23MNX1FsRjF@Asmaa. \
--to=yosry.ahmed@linux.dev \
--cc=akpm@linux-foundation.org \
--cc=cgroups@vger.kernel.org \
--cc=hannes@cmpxchg.org \
--cc=inwardvessel@gmail.com \
--cc=kernel-team@meta.com \
--cc=linux-mm@kvack.org \
--cc=mkoutny@suse.com \
--cc=shakeel.butt@linux.dev \
--cc=tj@kernel.org \
--cc=yosryahmed@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.