public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Al Viro <viro@zeniv.linux.org.uk>
To: David Howells <dhowells@redhat.com>
Cc: linux-afs@lists.infradead.org,
	Konstantin Khlebnikov <khlebnikov@yandex-team.ru>,
	linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 01/27] vfs, afs, ext4: Make the inode hash table RCU searchable
Date: Sun, 31 May 2020 14:09:24 +0100	[thread overview]
Message-ID: <20200531130924.GY23230@ZenIV.linux.org.uk> (raw)
In-Reply-To: <159078960778.679399.5682252853189947919.stgit@warthog.procyon.org.uk>

On Fri, May 29, 2020 at 11:00:07PM +0100, David Howells wrote:

> @@ -1245,15 +1282,9 @@ static int test_inode_iunique(struct super_block *sb, unsigned long ino)
>  	struct inode *inode;
>  
>  	spin_lock(&inode_hash_lock);
> -	hlist_for_each_entry(inode, b, i_hash) {
> -		if (inode->i_ino == ino && inode->i_sb == sb) {
> -			spin_unlock(&inode_hash_lock);
> -			return 0;
> -		}
> -	}
> +	inode = __find_inode_by_ino_rcu(sb, b, ino);
>  	spin_unlock(&inode_hash_lock);
> -
> -	return 1;
> +	return inode ? 0 : 1;
>  }

Nit: that's really return !inode

> +/**
> + * find_inode_rcu - find an inode in the inode cache
> + * @sb:		Super block of file system to search
> + * @hashval:	Key to hash
> + * @test:	Function to test match on an inode
> + * @data:	Data for test function
> + *
> + * Search for the inode specified by @hashval and @data in the inode cache,
> + * where the helper function @test will return 0 if the inode does not match
> + * and 1 if it does.  The @test function must be responsible for taking the
> + * i_lock spin_lock and checking i_state for an inode being freed or being
> + * initialized.
> + *
> + * If successful, this will return the inode for which the @test function
> + * returned 1 and NULL otherwise.
> + *
> + * The @test function is not permitted to take a ref on any inode presented
> + * unless the caller is holding the inode hashtable lock.  It is also not
> + * permitted to sleep, since it may be called with the RCU read lock held.
> + *
> + * The caller must hold either the RCU read lock or the inode hashtable lock.

Just how could that caller be holding inode_hash_lock?  It's static and IMO
should remain such - it's too low-level detail of fs/inode.c for having the
code outside play with it.

Require the caller to hold rcu_read_lock() and make "not permitted to take
a ref or sleep" unconditional.

> +struct inode *find_inode_rcu(struct super_block *sb, unsigned long hashval,
> +			     int (*test)(struct inode *, void *), void *data)
> +{
> +	struct hlist_head *head = inode_hashtable + hash(sb, hashval);
> +	struct inode *inode;
> +
> +	RCU_LOCKDEP_WARN(!lockdep_is_held(&inode_hash_lock) && !rcu_read_lock_held(),
> +			 "suspicious find_inode_by_ino_rcu() usage");

... and modify that RCU_LOCKDEP_WARN (including the function name, preferably ;-)

> +
> +	hlist_for_each_entry_rcu(inode, head, i_hash) {
> +		if (inode->i_sb == sb &&
> +		    !(READ_ONCE(inode->i_state) & (I_FREEING | I_WILL_FREE)) &&
> +		    test(inode, data))
> +			return inode;
> +	}
> +	return NULL;
> +}
> +EXPORT_SYMBOL(find_inode_rcu);
> +
> +/**
> + * find_inode_by_rcu - Find an inode in the inode cache
> + * @sb:		Super block of file system to search
> + * @ino:	The inode number to match
> + *
> + * Search for the inode specified by @hashval and @data in the inode cache,
> + * where the helper function @test will return 0 if the inode does not match
> + * and 1 if it does.  The @test function must be responsible for taking the
> + * i_lock spin_lock and checking i_state for an inode being freed or being
> + * initialized.
> + *
> + * If successful, this will return the inode for which the @test function
> + * returned 1 and NULL otherwise.
> + *
> + * The @test function is not permitted to take a ref on any inode presented
> + * unless the caller is holding the inode hashtable lock.  It is also not
> + * permitted to sleep, since it may be called with the RCU read lock held.
> + *
> + * The caller must hold either the RCU read lock or the inode hashtable lock.
> + */

Ditto.

> +struct inode *find_inode_by_ino_rcu(struct super_block *sb,
> +				    unsigned long ino)
> +{
> +	struct hlist_head *head = inode_hashtable + hash(sb, ino);
> +	struct inode *inode;
> +
> +	RCU_LOCKDEP_WARN(!lockdep_is_held(&inode_hash_lock) && !rcu_read_lock_held(),
> +			 "suspicious find_inode_by_ino_rcu() usage");
> +
> +	hlist_for_each_entry_rcu(inode, head, i_hash) {
> +		if (inode->i_ino == ino &&
> +		    inode->i_sb == sb &&
> +		    !(READ_ONCE(inode->i_state) & (I_FREEING | I_WILL_FREE)))
> +		    return inode;
> +	}
> +	return NULL;
> +}
> +EXPORT_SYMBOL(find_inode_by_ino_rcu);

> @@ -1540,6 +1652,7 @@ static void iput_final(struct inode *inode)
>  {
>  	struct super_block *sb = inode->i_sb;
>  	const struct super_operations *op = inode->i_sb->s_op;
> +	unsigned long state;
>  	int drop;
>  
>  	WARN_ON(inode->i_state & I_NEW);
> @@ -1555,16 +1668,20 @@ static void iput_final(struct inode *inode)
>  		return;
>  	}
>  
> +	state = READ_ONCE(inode->i_state);
>  	if (!drop) {
> -		inode->i_state |= I_WILL_FREE;
> +		WRITE_ONCE(inode->i_state, state | I_WILL_FREE);
>  		spin_unlock(&inode->i_lock);
> +
>  		write_inode_now(inode, 1);
> +
>  		spin_lock(&inode->i_lock);
> -		WARN_ON(inode->i_state & I_NEW);
> -		inode->i_state &= ~I_WILL_FREE;
> +		state = READ_ONCE(inode->i_state);
> +		WARN_ON(state & I_NEW);
> +		state &= ~I_WILL_FREE;
>  	}
>  
> -	inode->i_state |= I_FREEING;
> +	WRITE_ONCE(inode->i_state, state | I_FREEING);
>  	if (!list_empty(&inode->i_lru))
>  		inode_lru_list_del(inode);
>  	spin_unlock(&inode->i_lock);

Umm..  I see the point of those WRITE_ONCE, but what's READ_ONCE for?

  reply	other threads:[~2020-05-31 13:09 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-29 21:59 [PATCH 00/27] afs: Improvements David Howells
2020-05-29 22:00 ` [PATCH 01/27] vfs, afs, ext4: Make the inode hash table RCU searchable David Howells
2020-05-31 13:09   ` Al Viro [this message]
2020-05-31 14:20     ` David Howells
2020-05-29 22:00 ` [PATCH 02/27] rxrpc: Map the EACCES error produced by some ICMP6 to EHOSTUNREACH David Howells
2020-05-29 22:00 ` [PATCH 03/27] rxrpc: Adjust /proc/net/rxrpc/calls to display call->debug_id not user_ID David Howells
2020-05-29 22:00 ` [PATCH 04/27] afs: Always include dir in bulk status fetch from afs_do_lookup() David Howells
2020-05-29 22:00 ` [PATCH 05/27] afs: Use the serverUnique field in the UVLDB record to reduce rpc ops David Howells
2020-05-29 22:00 ` [PATCH 06/27] afs: Split the usage count on struct afs_server David Howells
2020-05-29 22:00 ` [PATCH 07/27] afs: Actively poll fileservers to maintain NAT or firewall openings David Howells
2020-05-29 22:01 ` [PATCH 08/27] afs: Show more information in /proc/net/afs/servers David Howells
2020-05-29 22:01 ` [PATCH 09/27] afs: Make callback processing more efficient David Howells
2020-05-29 22:01 ` [PATCH 10/27] afs: Set error flag rather than return error from file status decode David Howells
2020-05-29 22:01 ` [PATCH 11/27] afs: Remove the error argument from afs_protocol_error() David Howells
2020-05-29 22:01 ` [PATCH 12/27] afs: Rename struct afs_fs_cursor to afs_operation David Howells
2020-05-29 22:01 ` [PATCH 13/27] afs: Build an abstraction around an "operation" concept David Howells
2020-05-29 22:01 ` [PATCH 14/27] afs: Don't get epoch from a server because it may be ambiguous David Howells
2020-05-29 22:01 ` [PATCH 15/27] afs: Fix handling of CB.ProbeUuid cache manager op David Howells
2020-05-29 22:02 ` [PATCH 16/27] afs: Retain more of the VLDB record for alias detection David Howells
2020-05-29 22:02 ` [PATCH 17/27] afs: Implement client support for the YFSVL.GetCellName RPC op David Howells
2020-05-29 22:02 ` [PATCH 18/27] afs: Detect cell aliases 1 - Cells with root volumes David Howells
2020-06-06  1:58   ` Kees Cook
2020-05-29 22:02 ` [PATCH 19/27] afs: Detect cell aliases 2 - Cells with no " David Howells
2020-05-29 22:02 ` [PATCH 20/27] afs: Detect cell aliases 3 - YFS Cells with a canonical cell name op David Howells
2020-05-29 22:02 ` [PATCH 21/27] afs: Add a tracepoint to track the lifetime of the afs_volume struct David Howells
2020-05-29 22:02 ` [PATCH 22/27] afs: Reorganise volume and server trees to be rooted on the cell David Howells
2020-05-29 22:02 ` [PATCH 23/27] afs: Fix the by-UUID server tree to allow servers with the same UUID David Howells
2020-05-29 22:02 ` [PATCH 24/27] afs: Fix afs_statfs() to not let the values go below zero David Howells
2020-05-29 22:03 ` [PATCH 25/27] afs: Don't use probe running state to make decisions outside probe code David Howells
2020-05-29 22:03 ` [PATCH 26/27] afs: Show more a bit more server state in /proc/net/afs/servers David Howells
2020-05-29 22:03 ` [PATCH 27/27] afs: Adjust the fileserver rotation algorithm to reprobe/retry more quickly David Howells

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200531130924.GY23230@ZenIV.linux.org.uk \
    --to=viro@zeniv.linux.org.uk \
    --cc=dhowells@redhat.com \
    --cc=khlebnikov@yandex-team.ru \
    --cc=linux-afs@lists.infradead.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox