From: Mateusz Guzik <mjguzik@gmail.com>
To: brauner@kernel.org
Cc: viro@zeniv.linux.org.uk, jack@suse.cz,
linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
Mateusz Guzik <mjguzik@gmail.com>
Subject: [RFC PATCH] fs: call inode_sb_list_add() outside of inode hash lock
Date: Thu, 20 Mar 2025 01:46:43 +0100 [thread overview]
Message-ID: <20250320004643.1903287-1-mjguzik@gmail.com> (raw)
As both locks are highly contended during significant inode churn,
holding the inode hash lock while waiting for the sb list lock
exacerbates the problem.
Why moving it out is safe: the inode at hand still has I_NEW set and
anyone who finds it through legitimate means waits for the bit to clear,
by which time inode_sb_list_add() is guaranteed to have finished.
This significantly drops hash lock contention for me when stating 20
separate trees in parallel, each with 1000 directories * 1000 files.
However, no speed up was observed as contention increased on the other
locks, notably dentry LRU.
Even so, removal of the lock ordering will help making this faster
later.
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
---
There were ideas about using bitlocks, which ran into trouble because of
spinlocks being taken *after* and RT kernel not liking that.
I'm thinking thanks to RCU this very much can be hacked around to
reverse the hash-specific lock -> inode lock: you find the inode you
are looking for, lock it and only then take the bitlock and validate it
remained in place.
The above patch removes an impediment -- the only other lock possibly
taken with the hash thing.
Even if the above idea does not pan out, the patch has some value in
decoupling these.
I am however not going to strongly argue for it given lack of results.
fs/inode.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/fs/inode.c b/fs/inode.c
index e188bb1eb07a..8efd38c27321 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -1307,8 +1307,8 @@ struct inode *inode_insert5(struct inode *inode, unsigned long hashval,
}
if (set && unlikely(set(inode, data))) {
- inode = NULL;
- goto unlock;
+ spin_unlock(&inode_hash_lock);
+ return NULL;
}
/*
@@ -1320,14 +1320,14 @@ struct inode *inode_insert5(struct inode *inode, unsigned long hashval,
hlist_add_head_rcu(&inode->i_hash, head);
spin_unlock(&inode->i_lock);
+ spin_unlock(&inode_hash_lock);
+
/*
* Add inode to the sb list if it's not already. It has I_NEW at this
* point, so it should be safe to test i_sb_list locklessly.
*/
if (list_empty(&inode->i_sb_list))
inode_sb_list_add(inode);
-unlock:
- spin_unlock(&inode_hash_lock);
return inode;
}
@@ -1456,8 +1456,8 @@ struct inode *iget_locked(struct super_block *sb, unsigned long ino)
inode->i_state = I_NEW;
hlist_add_head_rcu(&inode->i_hash, head);
spin_unlock(&inode->i_lock);
- inode_sb_list_add(inode);
spin_unlock(&inode_hash_lock);
+ inode_sb_list_add(inode);
/* Return the locked inode with I_NEW set, the
* caller is responsible for filling in the contents
--
2.43.0
next reply other threads:[~2025-03-20 0:46 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-03-20 0:46 Mateusz Guzik [this message]
2025-03-20 11:05 ` [RFC PATCH] fs: call inode_sb_list_add() outside of inode hash lock Jan Kara
2025-03-20 12:07 ` Christian Brauner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250320004643.1903287-1-mjguzik@gmail.com \
--to=mjguzik@gmail.com \
--cc=brauner@kernel.org \
--cc=jack@suse.cz \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).