From: Dave Chinner <david@fromorbit.com>
To: Jeff Layton <jlayton@kernel.org>
Cc: "J. Bruce Fields" <bfields@fieldses.org>, Jan Kara <jack@suse.cz>,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
viro@zeniv.linux.org.uk, linux-nfs@vger.kernel.org,
neilb@suse.de, jack@suse.de, linux-ext4@vger.kernel.org,
tytso@mit.edu, adilger.kernel@dilger.ca,
linux-xfs@vger.kernel.org, darrick.wong@oracle.com,
linux-btrfs@vger.kernel.org, clm@fb.com, jbacik@fb.com,
dsterba@suse.com, linux-integrity@vger.kernel.org,
zohar@linux.vnet.ibm.com, dmitry.kasatkin@gmail.com,
linux-afs@lists.infradead.org, dhowells@redhat.com,
jaltman@auristor.com
Subject: Re: [PATCH v3 19/19] fs: handle inode->i_version more efficiently
Date: Tue, 19 Dec 2017 09:07:59 +1100 [thread overview]
Message-ID: <20171218220759.GF4094@dastard> (raw)
In-Reply-To: <1513625720.3422.68.camel@kernel.org>
On Mon, Dec 18, 2017 at 02:35:20PM -0500, Jeff Layton wrote:
> [PATCH] SQUASH: add memory barriers around i_version accesses
Why explicit memory barriers rather than annotating the operations
with the required semantics and getting the barriers the arch
requires automatically? I suspect this should be using
atomic_read_acquire() and atomic_cmpxchg_release(), because AFAICT
the atomic_cmpxchg needs to have release semantics to match the
acquire semantics needed for the load of the current value.
>From include/linux/atomics.h:
* For compound atomics performing both a load and a store, ACQUIRE
* semantics apply only to the load and RELEASE semantics only to the
* store portion of the operation. Note that a failed cmpxchg_acquire
* does -not- imply any memory ordering constraints.
Memory barriers hurt my brain. :/
At minimum, shouldn't the atomic op specific barriers be used rather
than full memory barriers? i.e:
> diff --git a/include/linux/iversion.h b/include/linux/iversion.h
> index a9fbf99709df..39ec9aa9e08e 100644
> --- a/include/linux/iversion.h
> +++ b/include/linux/iversion.h
> @@ -87,6 +87,25 @@ static inline void
> inode_set_iversion_raw(struct inode *inode, const u64 val)
> {
> atomic64_set(&inode->i_version, val);
> + smp_wmb();
smp_mb__after_atomic();
.....
> +static inline u64
> +inode_peek_iversion_raw(const struct inode *inode)
> +{
> + smp_rmb();
smp_mb__before_atomic();
> + return atomic64_read(&inode->i_version);
> }
And, of course, these will require comments explaining what they
match with and what they are ordering.
> @@ -152,7 +171,7 @@ inode_maybe_inc_iversion(struct inode *inode, bool force)
> {
> u64 cur, old, new;
>
> - cur = (u64)atomic64_read(&inode->i_version);
> + cur = inode_peek_iversion_raw(inode);
> for (;;) {
> /* If flag is clear then we needn't do anything */
> if (!force && !(cur & I_VERSION_QUERIED))
cmpxchg in this loop needs a release barrier so everyone will
see the change?
> @@ -183,23 +202,6 @@ inode_inc_iversion(struct inode *inode)
> inode_maybe_inc_iversion(inode, true);
> }
>
> -/**
> - * inode_peek_iversion_raw - grab a "raw" iversion value
> - * @inode: inode from which i_version should be read
> - *
> - * Grab a "raw" inode->i_version value and return it. The i_version is not
> - * flagged or converted in any way. This is mostly used to access a self-managed
> - * i_version.
> - *
> - * With those filesystems, we want to treat the i_version as an entirely
> - * opaque value.
> - */
> -static inline u64
> -inode_peek_iversion_raw(const struct inode *inode)
> -{
> - return atomic64_read(&inode->i_version);
> -}
> -
> /**
> * inode_iversion_need_inc - is the i_version in need of being incremented?
> * @inode: inode to check
> @@ -248,7 +250,7 @@ inode_query_iversion(struct inode *inode)
> {
> u64 cur, old, new;
>
> - cur = atomic64_read(&inode->i_version);
> + cur = inode_peek_iversion_raw(inode);
> for (;;) {
> /* If flag is already set, then no need to swap */
> if (cur & I_VERSION_QUERIED)
cmpxchg in this loop needs a release barrier so everyone will
see the change on load?
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
next prev parent reply other threads:[~2017-12-18 22:08 UTC|newest]
Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-12-18 15:11 [PATCH v3 00/19] fs: rework and optimize i_version handling in filesystems Jeff Layton
2017-12-18 15:11 ` [PATCH v3 01/19] fs: new API for handling inode->i_version Jeff Layton
2017-12-18 17:46 ` Jeff Layton
2017-12-18 15:11 ` [PATCH v3 02/19] fs: don't take the i_lock in inode_inc_iversion Jeff Layton
2017-12-18 15:11 ` [PATCH v3 03/19] fat: convert to new i_version API Jeff Layton
2017-12-18 15:11 ` [PATCH v3 04/19] affs: " Jeff Layton
2017-12-18 15:11 ` [PATCH v3 05/19] afs: " Jeff Layton
2017-12-18 15:11 ` [PATCH v3 06/19] btrfs: " Jeff Layton
2017-12-18 15:11 ` [PATCH v3 07/19] exofs: switch " Jeff Layton
2017-12-18 15:11 ` [PATCH v3 08/19] ext2: convert " Jeff Layton
2017-12-18 15:11 ` [PATCH v3 09/19] ext4: " Jeff Layton
2017-12-18 15:11 ` [PATCH v3 10/19] nfs: " Jeff Layton
2017-12-18 15:11 ` [PATCH v3 11/19] nfsd: " Jeff Layton
2017-12-18 15:11 ` [PATCH v3 12/19] ocfs2: " Jeff Layton
2017-12-18 15:11 ` [PATCH v3 13/19] ufs: use " Jeff Layton
2017-12-18 15:11 ` [PATCH v3 14/19] xfs: convert to " Jeff Layton
2017-12-18 15:11 ` [PATCH v3 15/19] IMA: switch IMA over " Jeff Layton
2017-12-18 15:11 ` [PATCH v3 16/19] fs: only set S_VERSION when updating times if necessary Jeff Layton
2017-12-18 16:07 ` Jan Kara
2017-12-18 17:25 ` Jeff Layton
2017-12-18 15:11 ` [PATCH v3 17/19] xfs: avoid setting XFS_ILOG_CORE if i_version doesn't need incrementing Jeff Layton
2017-12-18 15:11 ` [PATCH v3 18/19] btrfs: only dirty the inode in btrfs_update_time if something was changed Jeff Layton
2017-12-18 15:11 ` [PATCH v3 19/19] fs: handle inode->i_version more efficiently Jeff Layton
2017-12-18 16:34 ` Jan Kara
2017-12-18 17:22 ` Jeff Layton
2017-12-18 17:36 ` J. Bruce Fields
2017-12-18 19:35 ` Jeff Layton
2017-12-18 22:07 ` Dave Chinner [this message]
2017-12-20 14:03 ` Jeff Layton
2017-12-20 16:41 ` Jan Kara
2017-12-21 11:25 ` Jeff Layton
2017-12-21 11:48 ` Jan Kara
2017-12-19 9:29 ` Jan Kara
2017-12-19 17:14 ` Jeff Layton
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20171218220759.GF4094@dastard \
--to=david@fromorbit.com \
--cc=adilger.kernel@dilger.ca \
--cc=bfields@fieldses.org \
--cc=clm@fb.com \
--cc=darrick.wong@oracle.com \
--cc=dhowells@redhat.com \
--cc=dmitry.kasatkin@gmail.com \
--cc=dsterba@suse.com \
--cc=jack@suse.cz \
--cc=jack@suse.de \
--cc=jaltman@auristor.com \
--cc=jbacik@fb.com \
--cc=jlayton@kernel.org \
--cc=linux-afs@lists.infradead.org \
--cc=linux-btrfs@vger.kernel.org \
--cc=linux-ext4@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-integrity@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-nfs@vger.kernel.org \
--cc=linux-xfs@vger.kernel.org \
--cc=neilb@suse.de \
--cc=tytso@mit.edu \
--cc=viro@zeniv.linux.org.uk \
--cc=zohar@linux.vnet.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).