public inbox for linux-fsdevel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH][RFC] make take_dentry_name_snapshot() lockless
@ 2024-12-09  3:52 Al Viro
  2024-12-09  6:33 ` Mateusz Guzik
                   ` (2 more replies)
  0 siblings, 3 replies; 25+ messages in thread
From: Al Viro @ 2024-12-09  3:52 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Linus Torvalds

There's a bunch of places where we are accessing dentry names without
sufficient protection and where locking environment is not predictable
enough to fix the things that way; take_dentry_name_snapshot() is
one variant of solution.  It does, however, have a problem - copying
is cheap, but bouncing ->d_lock may be nasty on seriously shared dentries.

How about the following (completely untested)?

Use ->d_seq instead of grabbing ->d_lock; in case of shortname dentries
that avoids any stores to shared data objects and in case of long names
we are down to (unavoidable) atomic_inc on the external_name refcount.
    
Makes the thing safer as well - the areas where ->d_seq is held odd are
all nested inside the areas where ->d_lock is held, and the latter are
much more numerous.
    
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
diff --git a/fs/dcache.c b/fs/dcache.c
index b4d5e9e1e43d..78fd7e2a3011 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -329,16 +329,34 @@ static inline int dname_external(const struct dentry *dentry)
 
 void take_dentry_name_snapshot(struct name_snapshot *name, struct dentry *dentry)
 {
-	spin_lock(&dentry->d_lock);
+	unsigned seq;
+
+	rcu_read_lock();
+retry:
+	seq = read_seqcount_begin(&dentry->d_seq);
 	name->name = dentry->d_name;
-	if (unlikely(dname_external(dentry))) {
-		atomic_inc(&external_name(dentry)->u.count);
-	} else {
-		memcpy(name->inline_name, dentry->d_iname,
-		       dentry->d_name.len + 1);
+	if (read_seqcount_retry(&dentry->d_seq, seq))
+		goto retry;
+	// ->name and ->len are at least consistent with each other, so if
+	// ->name points to dentry->d_iname, ->len is below DNAME_INLINE_LEN
+	if (likely(name->name.name == dentry->d_iname)) {
+		memcpy(name->inline_name, dentry->d_iname, name->name.len + 1);
 		name->name.name = name->inline_name;
+		if (read_seqcount_retry(&dentry->d_seq, seq))
+			goto retry;
+	} else {
+		struct external_name *p;
+		p = container_of(name->name.name, struct external_name, name[0]);
+		// get a valid reference
+		if (unlikely(!atomic_inc_not_zero(&p->u.count)))
+			goto retry;
+		if (read_seqcount_retry(&dentry->d_seq, seq)) {
+			if (unlikely(atomic_dec_and_test(&p->u.count)))
+				kfree_rcu(p, u.head);
+			goto retry;
+		}
 	}
-	spin_unlock(&dentry->d_lock);
+	rcu_read_unlock();
 }
 EXPORT_SYMBOL(take_dentry_name_snapshot);
 

^ permalink raw reply related	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2024-12-24 20:24 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-12-09  3:52 [PATCH][RFC] make take_dentry_name_snapshot() lockless Al Viro
2024-12-09  6:33 ` Mateusz Guzik
2024-12-09  6:58   ` Al Viro
2024-12-09  7:18     ` Mateusz Guzik
2024-12-09  7:41       ` Al Viro
2024-12-09 18:27 ` Linus Torvalds
2024-12-09 21:17   ` Al Viro
2024-12-09 22:28     ` Al Viro
2024-12-09 22:49       ` Linus Torvalds
2024-12-09 22:55         ` Linus Torvalds
2024-12-09 23:12           ` Al Viro
2024-12-10  2:45             ` Al Viro
2024-12-10  2:48               ` [PATCH 1/5] make sure that DCACHE_INLINE_LEN is a multiple of word size Al Viro
2024-12-10  2:48                 ` [PATCH 2/5] dcache: back inline names with a struct-wrapped array of unsigned long Al Viro
2024-12-10  2:48                 ` [PATCH 3/5] make take_dentry_name_snapshot() lockless Al Viro
2024-12-10  2:48                 ` [PATCH 4/5] dissolve external_name.u into separate members Al Viro
2024-12-10  2:48                 ` [PATCH 5/5] ext4 fast_commit: make use of name_snapshot primitives Al Viro
2024-12-23  4:25               ` [PATCH][RFC] make take_dentry_name_snapshot() lockless Al Viro
2024-12-23  4:37                 ` Al Viro
2024-12-23 21:31                 ` Jens Axboe
2024-12-24 19:18                   ` Al Viro
2024-12-24 19:44                     ` Linus Torvalds
2024-12-24 20:24                       ` Al Viro
2024-12-09 19:06 ` Linus Torvalds
2024-12-09 20:27   ` Al Viro

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox