From: bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org (J. Bruce Fields)
To: Theodore Ts'o <tytso-3s7WtUTddSA@public.gmane.org>
Cc: dsterba-AlSwsSmVLrQ@public.gmane.org,
Liu Bo <bo.li.liu-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>,
linux-btrfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
fdmanana-IBi9RG/b67k@public.gmane.org,
kzak-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org,
linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn@public.gmane.org,
linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org,
mingming.cao-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org
Subject: Re: i_version vs iversion (Was: Re: [RFC PATCH v2 1/2] Btrfs: add noi_version option to disable MS_I_VERSION)
Date: Thu, 25 Jun 2015 14:46:44 -0400 [thread overview]
Message-ID: <20150625184644.GA12300@fieldses.org> (raw)
In-Reply-To: <20150623163241.GA6645-AKGzg7BKzIDYtjvyW6yDsg@public.gmane.org>
On Tue, Jun 23, 2015 at 12:32:41PM -0400, Theodore Ts'o wrote:
> On Thu, Jun 18, 2015 at 04:38:56PM +0200, David Sterba wrote:
> > Moving the discussion to fsdevel.
> >
> > Summary: disabling MS_I_VERSION brings some speedups to btrfs, but the
> > generic 'noiversion' option cannot be used to achieve that. It is
> > processed before it reaches btrfs superblock callback, where
> > MS_I_VERSION is forced.
> >
> > The proposed fix is to add btrfs-specific i_version/noi_version to btrfs,
> > to which I object.
>
> I was talking to Mingming about this on today's ext4 conference call,
> and one of the reasons why ext4 turns off i_version update by default
> is because it does a real number on our performance as well --- and
> furthermore, the only real user of the field from what we can tell is
> NFSv4, which not all that many ext4 users actually care about.
>
> This has caused pain for the nfsv4 folks since it means that they need
> to tell people to use a special mount option for ext4 if they are
> actually using this for nfsv4, and I suspect they won't be all that
> eager to hear that btrfs is going to go the same way.
Yes, thanks for looking into this!
> This however got us thinking --- even in if NFSv4 is depending on
> i_version, it doesn't actually _look_ at that field all that often.
Most clients will query it on every write. (I just took a quick look at
the code and I believe the Linux client's requesting it immediately
after every write, except in the O_DIRECT and delegated cases.)
> It's only going to look at it in a response to a client's getattr
> call, and that in turn is used to so the client can do its local disk
> cache invalidation if anby of the data blocks of the inode has changed.
>
> So what if we have a per-inode flag which "don't update I_VERSION",
> which is off by default, but after the i_version has been updated at
> least once, is set, so the i_version field won't be updated again ---
> at least until something has actually looked at the i_version field,
> when the "don't update I_VERSOIN" flag will get cleared again.
>
> So basically, if we know there are no microphones in the forest, we
> don't need to make the tree fall. However, if someone has sampled the
> i_version field, then the next time the inode gets updated, we will
> update the i_version field so the NFSv4 client can hear the sound of
> the tree crashing to the forst floor and so it can invalidate its
> local cache of the file. :-)
>
> This should significantly improve the performance of using the
> i_version field if the file system is being exported via NFSv4, and if
> NFSv4 is not in use, no one will be looking at the i_version field, so
> the performance impact will be very slight, and thus we could enable
> i_version updates by default for btrfs and ext4.
>
> And this should make the distribution folks happy, since it will unify
> the behavior of all file systems, and make life easier for users who
> won't need to set certain magic mount options depending on what file
> system they are using and whether they are using NFSv4 or not.
>
> Does this sound reasonable?
Just to make sure I understand, the logic is something like:
to read the i_version:
inode->i_version_seen = true;
return inode->i_version
to update the i_version:
/*
* If nobody's seen this value of i_version then we can
* keep using it, otherwise we need a new one:
*/
if (inode->i_version_seen)
inode->i_version++;
inode->i_version_seen = false;
Looks OK to me. As I say I'd expect i_version_seen == true to end up
being the common case in a lot of v4 workloads, so I'm more skeptical of
the claim of a performance improvement in the v4 case.
Could maintaining the new flag be a significant drag in itself? If not,
then I guess we're not making things any worse there, so fine.
--b.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
next prev parent reply other threads:[~2015-06-25 18:46 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <1434527672-5762-1-git-send-email-bo.li.liu@oracle.com>
[not found] ` <20150617153306.GY6761@twin.jikos.cz>
[not found] ` <20150617155234.GB7773@localhost.localdomain>
[not found] ` <20150617170118.GA6761@twin.jikos.cz>
[not found] ` <20150618024607.GA8530@localhost.localdomain>
2015-06-18 14:38 ` i_version vs iversion (Was: Re: [RFC PATCH v2 1/2] Btrfs: add noi_version option to disable MS_I_VERSION) David Sterba
2015-06-19 11:44 ` Karel Zak
2015-06-22 20:42 ` Dave Chinner
2015-06-24 17:28 ` David Sterba
2015-06-23 16:32 ` Theodore Ts'o
2015-06-24 8:23 ` Liu Bo
2015-06-24 18:02 ` David Sterba
[not found] ` <20150624180215.GC726-AlSwsSmVLrQ@public.gmane.org>
2015-06-24 23:17 ` Theodore Ts'o
[not found] ` <20150624231750.GE14324-AKGzg7BKzIDYtjvyW6yDsg@public.gmane.org>
2015-06-24 23:59 ` Dave Chinner
[not found] ` <20150623163241.GA6645-AKGzg7BKzIDYtjvyW6yDsg@public.gmane.org>
2015-06-25 18:46 ` J. Bruce Fields [this message]
2015-06-25 22:12 ` Theodore Ts'o
2015-06-26 13:32 ` J. Bruce Fields
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20150625184644.GA12300@fieldses.org \
--to=bfields-uc3wqj2krung9huczpvpmw@public.gmane.org \
--cc=bo.li.liu-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org \
--cc=chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org \
--cc=dsterba-AlSwsSmVLrQ@public.gmane.org \
--cc=fdmanana-IBi9RG/b67k@public.gmane.org \
--cc=kzak-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
--cc=linux-btrfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
--cc=linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
--cc=linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
--cc=mingming.cao-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org \
--cc=tytso-3s7WtUTddSA@public.gmane.org \
--cc=viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn@public.gmane.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).