cgroups.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Tejun Heo <tj@kernel.org>
To: gregkh@linuxfoundation.org
Cc: kernel-team@fb.com, linux-kernel@vger.kernel.org,
	cgroups@vger.kernel.org, lizefan@huawei.com, hannes@cmpxchg.org,
	namhyung@kernel.org, ast@kernel.org, daniel@iogearbox.net,
	Tejun Heo <tj@kernel.org>
Subject: [PATCH 04/10] kernfs: use dumber locking for kernfs_find_and_get_node_by_ino()
Date: Mon,  4 Nov 2019 15:59:38 -0800	[thread overview]
Message-ID: <20191104235944.3470866-5-tj@kernel.org> (raw)
In-Reply-To: <20191104235944.3470866-1-tj@kernel.org>

kernfs_find_and_get_node_by_ino() uses RCU protection.  It's currently
a bit buggy because it can look up a node which hasn't been activated
yet and thus may end up exposing a node that the kernfs user is still
prepping.

While it can be fixed by pushing it further in the current direction,
it's already complicated and isn't clear whether the complexity is
justified.  The main use of kernfs_find_and_get_node_by_ino() is for
exportfs operations.  They aren't super hot and all the follow-up
operations (e.g. mapping to path) use normal locking anyway.

Let's switch to a dumber locking scheme and protect the lookup with
kernfs_idr_lock.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
---
 fs/kernfs/dir.c   | 45 +++++++++------------------------------------
 fs/kernfs/mount.c | 11 +----------
 2 files changed, 10 insertions(+), 46 deletions(-)

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index 7d4af6cea2a6..798f0f03b62b 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -508,10 +508,6 @@ void kernfs_put(struct kernfs_node *kn)
 	struct kernfs_node *parent;
 	struct kernfs_root *root;
 
-	/*
-	 * kernfs_node is freed with ->count 0, kernfs_find_and_get_node_by_ino
-	 * depends on this to filter reused stale node
-	 */
 	if (!kn || !atomic_dec_and_test(&kn->count))
 		return;
 	root = kernfs_root(kn);
@@ -646,11 +642,7 @@ static struct kernfs_node *__kernfs_new_node(struct kernfs_root *root,
 	kn->id.ino = ret;
 	kn->id.generation = gen;
 
-	/*
-	 * set ino first. This RELEASE is paired with atomic_inc_not_zero in
-	 * kernfs_find_and_get_node_by_ino
-	 */
-	atomic_set_release(&kn->count, 1);
+	atomic_set(&kn->count, 1);
 	atomic_set(&kn->active, KN_DEACTIVATED_BIAS);
 	RB_CLEAR_NODE(&kn->rb);
 
@@ -716,38 +708,19 @@ struct kernfs_node *kernfs_find_and_get_node_by_ino(struct kernfs_root *root,
 {
 	struct kernfs_node *kn;
 
-	rcu_read_lock();
+	spin_lock(&kernfs_idr_lock);
+
 	kn = idr_find(&root->ino_idr, ino);
 	if (!kn)
-		goto out;
+		goto err_unlock;
 
-	/*
-	 * Since kernfs_node is freed in RCU, it's possible an old node for ino
-	 * is freed, but reused before RCU grace period. But a freed node (see
-	 * kernfs_put) or an incompletedly initialized node (see
-	 * __kernfs_new_node) should have 'count' 0. We can use this fact to
-	 * filter out such node.
-	 */
-	if (!atomic_inc_not_zero(&kn->count)) {
-		kn = NULL;
-		goto out;
-	}
-
-	/*
-	 * The node could be a new node or a reused node. If it's a new node,
-	 * we are ok. If it's reused because of RCU (because of
-	 * SLAB_TYPESAFE_BY_RCU), the __kernfs_new_node always sets its 'ino'
-	 * before 'count'. So if 'count' is uptodate, 'ino' should be uptodate,
-	 * hence we can use 'ino' to filter stale node.
-	 */
-	if (kn->id.ino != ino)
-		goto out;
-	rcu_read_unlock();
+	if (unlikely(!atomic_inc_not_zero(&kn->count)))
+		goto err_unlock;
 
+	spin_unlock(&kernfs_idr_lock);
 	return kn;
-out:
-	rcu_read_unlock();
-	kernfs_put(kn);
+err_unlock:
+	spin_unlock(&kernfs_idr_lock);
 	return NULL;
 }
 
diff --git a/fs/kernfs/mount.c b/fs/kernfs/mount.c
index 6c12fac2c287..067b7c380056 100644
--- a/fs/kernfs/mount.c
+++ b/fs/kernfs/mount.c
@@ -363,18 +363,9 @@ void kernfs_kill_sb(struct super_block *sb)
 
 void __init kernfs_init(void)
 {
-
-	/*
-	 * the slab is freed in RCU context, so kernfs_find_and_get_node_by_ino
-	 * can access the slab lock free. This could introduce stale nodes,
-	 * please see how kernfs_find_and_get_node_by_ino filters out stale
-	 * nodes.
-	 */
 	kernfs_node_cache = kmem_cache_create("kernfs_node_cache",
 					      sizeof(struct kernfs_node),
-					      0,
-					      SLAB_PANIC | SLAB_TYPESAFE_BY_RCU,
-					      NULL);
+					      0, SLAB_PANIC, NULL);
 
 	/* Creates slab cache for kernfs inode attributes */
 	kernfs_iattrs_cache  = kmem_cache_create("kernfs_iattrs_cache",
-- 
2.17.1


  parent reply	other threads:[~2019-11-04 23:59 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-04 23:59 [PATCHSET cgroup/for-5.5] kernfs,cgroup: support 64bit inos and unify cgroup IDs Tejun Heo
2019-11-04 23:59 ` [PATCH 01/10] kernfs: fix ino wrap-around detection Tejun Heo
2019-11-04 23:59 ` [PATCH 02/10] writeback: use ino_t for inodes in tracepoints Tejun Heo
2019-11-05  9:11   ` Jan Kara
2019-11-04 23:59 ` [PATCH 03/10] netprio: use css ID instead of cgroup ID Tejun Heo
2019-11-05 13:27   ` Neil Horman
2019-11-04 23:59 ` Tejun Heo [this message]
2019-11-04 23:59 ` [PATCH 05/10] kernfs: kernfs_find_and_get_node_by_ino() should only look up activated nodes Tejun Heo
2019-11-04 23:59 ` [PATCH 06/10] kernfs: convert kernfs_node->id from union kernfs_node_id to u64 Tejun Heo
2019-11-04 23:59 ` [PATCH 07/10] kernfs: combine ino/id lookup functions into kernfs_find_and_get_node_by_id() Tejun Heo
2019-11-04 23:59 ` [PATCH 08/10] kernfs: implement custom exportfs ops and fid type Tejun Heo
2019-11-04 23:59 ` [PATCH 09/10] kernfs: use 64bit inos if ino_t is 64bit Tejun Heo
2019-11-04 23:59 ` [PATCH 10/10] cgroup: use cgrp->kn->id as the cgroup ID Tejun Heo
2019-11-06 13:50 ` [PATCHSET cgroup/for-5.5] kernfs,cgroup: support 64bit inos and unify cgroup IDs Namhyung Kim
2019-11-07 15:31   ` Tejun Heo
2019-11-12 15:51 ` Tejun Heo
2019-11-12 16:08   ` Greg KH
2019-11-12 16:09 ` Greg KH
2019-11-12 16:19   ` Tejun Heo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191104235944.3470866-5-tj@kernel.org \
    --to=tj@kernel.org \
    --cc=ast@kernel.org \
    --cc=cgroups@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=gregkh@linuxfoundation.org \
    --cc=hannes@cmpxchg.org \
    --cc=kernel-team@fb.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lizefan@huawei.com \
    --cc=namhyung@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).