public inbox for linux-kernel@vger.kernel.org
From: Al Viro <viro@zeniv.linux.org.uk>
To: Tejun Heo <tj@kernel.org>
Cc: Imran Khan <imran.f.khan@oracle.com>,
	gregkh@linuxfoundation.org, akpm@linux-foundation.org,
	linux-kernel@vger.kernel.org
Subject: Re: [RESEND PATCH v7 7/8] kernfs: Replace per-fs rwsem with hashed rwsems.
Date: Mon, 21 Mar 2022 17:55:53 +0000
Message-ID: <Yji8KT2K7ZKOQ+6S@zeniv-ca.linux.org.uk>
In-Reply-To: <Yjir/d5S3J1PTiux@slm.duckdns.org>

On Mon, Mar 21, 2022 at 06:46:53AM -1000, Tejun Heo wrote:
> On Mon, Mar 21, 2022 at 07:29:45AM +0000, Al Viro wrote:
> ...
> > stabilizing the tree topology.  Turn it into rwlock if you wish,
> > with that thing being a reader and existing users - writers.
> > And don't bother with further scaling, until and unless you see a real
> > contention on it.
> 
> Given how rare these renames are, in the (unlikely) case the rename rwsem
> becomes a problem, we should probably just switch it to a percpu_rwsem.

Why bother with rwsem, when we don't need anything blocking under it?
DEFINE_RWLOCK instead of DEFINE_SPINLOCK and don't make it static.

Again, we already have a spinlock protecting ->parent and ->name.
Existing users:

kernfs_name() - can be shared.
kernfs_path_from_node() - can be shared.

pr_cont_kernfs_name() - exclusive, since that thing works into a static buffer.
pr_cont_kernfs_path() - exclusive, same reasons.

kernfs_get_parent() - can be shared, but its callers need to be reviewed;
that's the prime breeding ground for rename races.

kernfs_walk_ns() - this is fucking insane; on the surface, it needs to
be exclusive due to the use of the same static buffer.  It uses that
buffer to generate a pathname, *THEN* walks over it with strsep().
That's an... interesting approach, for the lack of other printable
terms - we walk the chain of ancestors, concatenating their names
into a buffer and separating those names with slashes, then we walk
that buffer, searching for slashes...  WTF?

kernfs_rename_ns() - exclusive; that's where the tree topology gets
changed.

So we can just turn that spinlock into rwlock, replace the existing
uses with read_lock()/read_unlock() in kernfs_{name,path_from_node,get_parent}
and with write_lock()/write_unlock() in the rest of fs/kernfs/dir.c,
make it non-static, put extern into kernfs-internal.h and there you
go...

Wait a sec; what happens if e.g. kernfs_path_from_node() races with
__kernfs_remove()?  We do _not_ clear ->parent, but we do drop references
that used to pin what it used to point to, unless I'm misreading that
code...  Or is it somehow prevented by drain-related logic?  Seeing
that it seems to be possible to have kernfs_path_from_node() called from
an interrupt context, that could be delicate...
