public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Al Viro <viro@zeniv.linux.org.uk>
To: Imran Khan <imran.f.khan@oracle.com>
Cc: tj@kernel.org, gregkh@linuxfoundation.org,
	akpm@linux-foundation.org, linux-kernel@vger.kernel.org
Subject: Re: [RESEND PATCH v7 1/8] kernfs: Introduce interface to access global kernfs_open_file_mutex.
Date: Thu, 17 Mar 2022 21:34:49 +0000	[thread overview]
Message-ID: <YjOpedPDj+3KCJjk@zeniv-ca.linux.org.uk> (raw)
In-Reply-To: <20220317072612.163143-2-imran.f.khan@oracle.com>

On Thu, Mar 17, 2022 at 06:26:05PM +1100, Imran Khan wrote:

> @@ -570,9 +571,10 @@ static void kernfs_put_open_node(struct kernfs_node *kn,
>  				 struct kernfs_open_file *of)
>  {
>  	struct kernfs_open_node *on = kn->attr.open;
> +	struct mutex *mutex = NULL;
>  	unsigned long flags;
>  
> -	mutex_lock(&kernfs_open_file_mutex);
> +	mutex = kernfs_open_file_mutex_lock(kn);
>  	spin_lock_irqsave(&kernfs_open_node_lock, flags);

Can that ever be reached with local interrupts disabled?  I mean, what is
that spin_lock_irqsave() about?

> @@ -745,11 +747,12 @@ static int kernfs_fop_release(struct inode *inode, struct file *filp)
>  {
>  	struct kernfs_node *kn = inode->i_private;
>  	struct kernfs_open_file *of = kernfs_of(filp);
> +	struct mutex *lock = NULL;
>  
>  	if (kn->flags & KERNFS_HAS_RELEASE) {
> -		mutex_lock(&kernfs_open_file_mutex);
> +		lock = kernfs_open_file_mutex_lock(kn);
>  		kernfs_release_file(kn, of);
> -		mutex_unlock(&kernfs_open_file_mutex);
> +		mutex_unlock(lock);

Careful - you are about to remove the existing exclusion between *all*
->release() instances, same node or not.

In particular, if some driver had them manipulate a driver-local list of
some kind, relying upon the kernfs to provide the exclusion, it'd break
as soon as you turn that thing into per-node (or hashed) mutex.

It's _probably_ safe, seeing that the one and only instance of ->release()
in the entire tree (cgroup_file_release()) is rather limited in what
it's doing, and while it calls a submethod (cftype.release()) there's only
a couple of instances of that (cgroup_procs_release() and
cgroup_pressure_release(), both in kernel/cgroup/cgroup.c).  Neither
seems to rely upon the global exclusion.

However, that's a change of rules and it needs to be documented as such.

Incidentally, what's the point of having kernfs_open_node->refcnt
atomic_t?  All users are under kernfs_open_node_lock...  AFAICS,
it's simply "->files is non-empty or something is in
kernfs_drain_open_files() for the node in question", so I'm not
even sure we want a counter there...

Note that kernfs_drain_open_files() can't overlap with
kernfs_fops_open() adding to the list of files (and we seriously
rely upon that - you don't want ops->release() called while in
the middle of ops->open()).  kernfs_fops_open() starts with
grabbing an active reference; kernfs_drain_open_files() is
not called until we had
	* prevented new active references being grabbed and
	* waited for all active references to be dropped.

So kernfs_drain_open_files() can do the following:
	1) optimistically check for ->attr.open being NULL;
bugger off if it is.  We know that nobody could be currently
trying to add anything to it, mutex or no mutex.
	2) grab the mutex
	3) recheck ->attr.open; it might have become NULL.
If it had, unlock and bugger off.
	4) walk the list, doing unmaps/releases.
	5) unlock and bugger off.
The only thing doing removals from the list is
kernfs_put_open_node() and it grabs that mutex.
So it can't get to the "remove from list, free the container
of list head" until we are through.

IOW, there's no reason to hold a reference to kernfs_open_node
in kernfs_drain_open_files() at all.  And that makes ->refcnt
completely useless - kernfs_put_open_node() should do
	list_del(&of->list);
	if (list_empty(&on->files))
		kn->attr.open = NULL;
	else
		on = NULL;
and to hell with refcounting.

As the matter of fact, we can do even better - make freeing
that thing rcu-delayed, use rcu_assign_pointer() for stores,
rcu_dereference() for loads and have kernfs_notify() do
	rcu_read_lock();
	on = rcu_dereference(kn->attr.open);
	if (on) {
		atomic_inc(&on->event);
		wake_up_interruptible(&on->poll);
	}
	rcu_read_unlock();
and kernfs_open_node_lock becomes useless - all places that
grab it are under kernfs_open_file_mutex.

  reply	other threads:[~2022-03-17 21:37 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-17  7:26 [RESEND PATCH v7 0/8] kernfs: Introduce interface to access global kernfs_open_file_mutex Imran Khan
2022-03-17  7:26 ` [RESEND PATCH v7 1/8] " Imran Khan
2022-03-17 21:34   ` Al Viro [this message]
2022-04-05  5:36     ` Imran Khan
2022-04-05 14:24       ` Al Viro
2022-04-06  4:54         ` Imran Khan
2022-04-06 14:54           ` Al Viro
2022-04-06 15:18             ` Tejun Heo
2022-04-14  0:01           ` Imran Khan
2022-03-18 17:10   ` Eric W. Biederman
2022-03-21  0:10     ` Imran Khan
2022-03-17  7:26 ` [RESEND PATCH v7 2/8] kernfs: Replace global kernfs_open_file_mutex with hashed mutexes Imran Khan
2022-03-17  7:26 ` [RESEND PATCH v7 3/8] kernfs: Introduce interface to access kernfs_open_node_lock Imran Khan
2022-03-17  7:26 ` [RESEND PATCH v7 4/8] kernfs: Replace global kernfs_open_node_lock with hashed spinlocks Imran Khan
2022-03-17  7:26 ` [RESEND PATCH v7 5/8] kernfs: Use a per-fs rwsem to protect per-fs list of kernfs_super_info Imran Khan
2022-03-17  7:26 ` [RESEND PATCH v7 6/8] kernfs: Introduce interface to access per-fs rwsem Imran Khan
2022-03-17  7:26 ` [RESEND PATCH v7 7/8] kernfs: Replace per-fs rwsem with hashed rwsems Imran Khan
2022-03-18  0:07   ` Al Viro
2022-03-21  1:57     ` Imran Khan
2022-03-21  7:29       ` Al Viro
2022-03-21 16:46         ` Tejun Heo
2022-03-21 17:55           ` Al Viro
2022-03-21 19:20             ` Tejun Heo
2022-03-22  2:40               ` Al Viro
2022-03-22 17:08                 ` Tejun Heo
2022-03-22 20:26                   ` Al Viro
2022-03-22 21:20                     ` Tejun Heo
2022-03-28  0:15                 ` Imran Khan
2022-03-28 17:30                   ` Tejun Heo
2022-03-30  2:23                 ` Imran Khan
2022-03-17  7:26 ` [RESEND PATCH v7 8/8] kernfs: Add a document to describe hashed locks used in kernfs Imran Khan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YjOpedPDj+3KCJjk@zeniv-ca.linux.org.uk \
    --to=viro@zeniv.linux.org.uk \
    --cc=akpm@linux-foundation.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=imran.f.khan@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox