public inbox for stable@vger.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: "Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
	patches@lists.linux.dev, "Yafang Shao" <laoar.shao@gmail.com>,
	"Tejun Heo" <tj@kernel.org>, "Michal Koutný" <mkoutny@suse.com>
Subject: [PATCH 6.1 31/38] cgroup: Make operations on the cgroup root_list RCU safe
Date: Thu, 15 Aug 2024 15:26:05 +0200	[thread overview]
Message-ID: <20240815131834.154399557@linuxfoundation.org> (raw)
In-Reply-To: <20240815131832.944273699@linuxfoundation.org>

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Yafang Shao <laoar.shao@gmail.com>

commit d23b5c577715892c87533b13923306acc6243f93 upstream.

At present, when we perform operations on the cgroup root_list, we must
hold the cgroup_mutex, which is a relatively heavyweight lock. In reality,
we can make operations on this list RCU-safe, eliminating the need to hold
the cgroup_mutex during traversal. Modifications to the list only occur in
the cgroup root setup and destroy paths, which should be infrequent in a
production environment. In contrast, traversal may occur frequently.
Therefore, making it RCU-safe would be beneficial.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/linux/cgroup-defs.h     |    1 +
 kernel/cgroup/cgroup-internal.h |    3 ++-
 kernel/cgroup/cgroup.c          |   23 ++++++++++++++++-------
 3 files changed, 19 insertions(+), 8 deletions(-)

--- a/include/linux/cgroup-defs.h
+++ b/include/linux/cgroup-defs.h
@@ -540,6 +540,7 @@ struct cgroup_root {
 
 	/* A list running through the active hierarchies */
 	struct list_head root_list;
+	struct rcu_head rcu;
 
 	/* Hierarchy-specific flags */
 	unsigned int flags;
--- a/kernel/cgroup/cgroup-internal.h
+++ b/kernel/cgroup/cgroup-internal.h
@@ -170,7 +170,8 @@ extern struct list_head cgroup_roots;
 
 /* iterate across the hierarchies */
 #define for_each_root(root)						\
-	list_for_each_entry((root), &cgroup_roots, root_list)
+	list_for_each_entry_rcu((root), &cgroup_roots, root_list,	\
+				lockdep_is_held(&cgroup_mutex))
 
 /**
  * for_each_subsys - iterate all enabled cgroup subsystems
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -1346,7 +1346,7 @@ static void cgroup_exit_root_id(struct c
 
 void cgroup_free_root(struct cgroup_root *root)
 {
-	kfree(root);
+	kfree_rcu(root, rcu);
 }
 
 static void cgroup_destroy_root(struct cgroup_root *root)
@@ -1379,7 +1379,7 @@ static void cgroup_destroy_root(struct c
 	spin_unlock_irq(&css_set_lock);
 
 	if (!list_empty(&root->root_list)) {
-		list_del(&root->root_list);
+		list_del_rcu(&root->root_list);
 		cgroup_root_count--;
 	}
 
@@ -1419,7 +1419,15 @@ static inline struct cgroup *__cset_cgro
 		}
 	}
 
-	BUG_ON(!res_cgroup);
+	/*
+	 * If cgroup_mutex is not held, the cgrp_cset_link will be freed
+	 * before we remove the cgroup root from the root_list. Consequently,
+	 * when accessing a cgroup root, the cset_link may have already been
+	 * freed, resulting in a NULL res_cgroup. However, by holding the
+	 * cgroup_mutex, we ensure that res_cgroup can't be NULL.
+	 * If we don't hold cgroup_mutex in the caller, we must do the NULL
+	 * check.
+	 */
 	return res_cgroup;
 }
 
@@ -1468,7 +1476,6 @@ static struct cgroup *current_cgns_cgrou
 static struct cgroup *cset_cgroup_from_root(struct css_set *cset,
 					    struct cgroup_root *root)
 {
-	lockdep_assert_held(&cgroup_mutex);
 	lockdep_assert_held(&css_set_lock);
 
 	return __cset_cgroup_from_root(cset, root);
@@ -1476,7 +1483,9 @@ static struct cgroup *cset_cgroup_from_r
 
 /*
  * Return the cgroup for "task" from the given hierarchy. Must be
- * called with cgroup_mutex and css_set_lock held.
+ * called with css_set_lock held to prevent task's groups from being modified.
+ * Must be called with either cgroup_mutex or rcu read lock to prevent the
+ * cgroup root from being destroyed.
  */
 struct cgroup *task_cgroup_from_root(struct task_struct *task,
 				     struct cgroup_root *root)
@@ -2037,7 +2046,7 @@ void init_cgroup_root(struct cgroup_fs_c
 	struct cgroup_root *root = ctx->root;
 	struct cgroup *cgrp = &root->cgrp;
 
-	INIT_LIST_HEAD(&root->root_list);
+	INIT_LIST_HEAD_RCU(&root->root_list);
 	atomic_set(&root->nr_cgrps, 1);
 	cgrp->root = root;
 	init_cgroup_housekeeping(cgrp);
@@ -2120,7 +2129,7 @@ int cgroup_setup_root(struct cgroup_root
 	 * care of subsystems' refcounts, which are explicitly dropped in
 	 * the failure exit path.
 	 */
-	list_add(&root->root_list, &cgroup_roots);
+	list_add_rcu(&root->root_list, &cgroup_roots);
 	cgroup_root_count++;
 
 	/*



  parent reply	other threads:[~2024-08-15 14:06 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-15 13:25 [PATCH 6.1 00/38] 6.1.106-rc1 review Greg Kroah-Hartman
2024-08-15 13:25 ` [PATCH 6.1 01/38] mptcp: pass addr to mptcp_pm_alloc_anno_list Greg Kroah-Hartman
2024-08-15 13:25 ` [PATCH 6.1 02/38] mptcp: pm: reduce indentation blocks Greg Kroah-Hartman
2024-08-15 13:25 ` [PATCH 6.1 03/38] mptcp: pm: dont try to create sf if alloc failed Greg Kroah-Hartman
2024-08-15 13:25 ` [PATCH 6.1 04/38] mptcp: pm: do not ignore subflow if signal flag is also set Greg Kroah-Hartman
2024-08-15 13:25 ` [PATCH 6.1 05/38] selftests: mptcp: join: test both signal & subflow Greg Kroah-Hartman
2024-08-15 13:25 ` [PATCH 6.1 06/38] ASoC: topology: Clean up route loading Greg Kroah-Hartman
2024-08-15 13:25 ` [PATCH 6.1 07/38] ASoC: topology: Fix route memory corruption Greg Kroah-Hartman
2024-08-15 13:25 ` [PATCH 6.1 08/38] exec: Fix ToCToU between perm check and set-uid/gid usage Greg Kroah-Hartman
2024-08-15 13:25 ` [PATCH 6.1 09/38] LoongArch: Define __ARCH_WANT_NEW_STAT in unistd.h Greg Kroah-Hartman
2024-08-15 13:25 ` [PATCH 6.1 10/38] nfsd: move reply cache initialization into nfsd startup Greg Kroah-Hartman
2024-08-15 13:25 ` [PATCH 6.1 11/38] nfsd: move init of percpu reply_cache_stats counters back to nfsd_init_net Greg Kroah-Hartman
2024-08-15 13:25 ` [PATCH 6.1 12/38] NFSD: Refactor nfsd_reply_cache_free_locked() Greg Kroah-Hartman
2024-08-15 13:25 ` [PATCH 6.1 13/38] NFSD: Rename nfsd_reply_cache_alloc() Greg Kroah-Hartman
2024-08-15 13:25 ` [PATCH 6.1 14/38] NFSD: Replace nfsd_prune_bucket() Greg Kroah-Hartman
2024-08-15 13:25 ` [PATCH 6.1 15/38] NFSD: Refactor the duplicate reply cache shrinker Greg Kroah-Hartman
2024-08-15 13:25 ` [PATCH 6.1 16/38] NFSD: Rewrite synopsis of nfsd_percpu_counters_init() Greg Kroah-Hartman
2024-08-15 13:25 ` [PATCH 6.1 17/38] NFSD: Fix frame size warning in svc_export_parse() Greg Kroah-Hartman
2024-08-15 13:25 ` [PATCH 6.1 18/38] sunrpc: dont change ->sv_stats if it doesnt exist Greg Kroah-Hartman
2024-08-15 13:25 ` [PATCH 6.1 19/38] nfsd: stop setting ->pg_stats for unused stats Greg Kroah-Hartman
2024-08-15 13:25 ` [PATCH 6.1 20/38] sunrpc: pass in the sv_stats struct through svc_create_pooled Greg Kroah-Hartman
2024-08-15 13:25 ` [PATCH 6.1 21/38] sunrpc: remove ->pg_stats from svc_program Greg Kroah-Hartman
2024-08-15 13:25 ` [PATCH 6.1 22/38] sunrpc: use the struct net as the svc proc private Greg Kroah-Hartman
2024-08-15 13:25 ` [PATCH 6.1 23/38] nfsd: rename NFSD_NET_* to NFSD_STATS_* Greg Kroah-Hartman
2024-08-15 13:25 ` [PATCH 6.1 24/38] nfsd: expose /proc/net/sunrpc/nfsd in net namespaces Greg Kroah-Hartman
2024-08-15 13:25 ` [PATCH 6.1 25/38] nfsd: make all of the nfsd stats per-network namespace Greg Kroah-Hartman
2024-08-15 13:26 ` [PATCH 6.1 26/38] nfsd: remove nfsd_stats, make th_cnt a global counter Greg Kroah-Hartman
2024-08-15 13:26 ` [PATCH 6.1 27/38] nfsd: make svc_stat per-network namespace instead of global Greg Kroah-Hartman
2024-08-15 13:26 ` [PATCH 6.1 28/38] nvme/pci: Add APST quirk for Lenovo N60z laptop Greg Kroah-Hartman
2024-08-15 13:26 ` [PATCH 6.1 29/38] mptcp: fully established after ADD_ADDR echo on MPJ Greg Kroah-Hartman
2024-08-15 13:26 ` [PATCH 6.1 30/38] drm/i915/gem: Fix Virtual Memory mapping boundaries calculation Greg Kroah-Hartman
2024-08-15 13:26 ` Greg Kroah-Hartman [this message]
2024-08-15 13:26 ` [PATCH 6.1 32/38] drm/i915: Add a function to mmap framebuffer obj Greg Kroah-Hartman
2024-08-15 13:26 ` [PATCH 6.1 33/38] drm/i915: Fix a NULL vs IS_ERR() bug Greg Kroah-Hartman
2024-08-15 13:26 ` [PATCH 6.1 34/38] drm/i915/gem: Adjust vma offset for framebuffer mmap offset Greg Kroah-Hartman
2024-08-15 13:26 ` [PATCH 6.1 35/38] binfmt_flat: Fix corruption when not offsetting data start Greg Kroah-Hartman
2024-08-15 13:26 ` [PATCH 6.1 36/38] cgroup: Move rcu_head up near the top of cgroup_root Greg Kroah-Hartman
2024-08-15 13:26 ` [PATCH 6.1 37/38] wifi: cfg80211: restrict NL80211_ATTR_TXQ_QUANTUM values Greg Kroah-Hartman
2024-08-15 13:26 ` [PATCH 6.1 38/38] KVM: arm64: Dont pass a TLBI level hint when zapping table entries Greg Kroah-Hartman
2024-08-15 18:54 ` [PATCH 6.1 00/38] 6.1.106-rc1 review Pavel Machek
2024-08-15 18:55 ` Peter Schneider
2024-08-15 21:43 ` Florian Fainelli
2024-08-16  8:47 ` Anders Roxell
2024-08-16 11:24 ` Mark Brown
2024-08-16 19:44 ` Jon Hunter
2024-08-16 20:44 ` Ron Economos

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240815131834.154399557@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=laoar.shao@gmail.com \
    --cc=mkoutny@suse.com \
    --cc=patches@lists.linux.dev \
    --cc=stable@vger.kernel.org \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox