linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org (J. Bruce Fields)
To: Theodore Ts'o <tytso-3s7WtUTddSA@public.gmane.org>
Cc: dsterba-AlSwsSmVLrQ@public.gmane.org,
	Liu Bo <bo.li.liu-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>,
	linux-btrfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	fdmanana-IBi9RG/b67k@public.gmane.org,
	kzak-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn@public.gmane.org,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org,
	mingming.cao-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org
Subject: Re: i_version vs iversion (Was: Re: [RFC PATCH v2 1/2] Btrfs: add noi_version option to disable MS_I_VERSION)
Date: Thu, 25 Jun 2015 14:46:44 -0400	[thread overview]
Message-ID: <20150625184644.GA12300@fieldses.org> (raw)
In-Reply-To: <20150623163241.GA6645-AKGzg7BKzIDYtjvyW6yDsg@public.gmane.org>

On Tue, Jun 23, 2015 at 12:32:41PM -0400, Theodore Ts'o wrote:
> On Thu, Jun 18, 2015 at 04:38:56PM +0200, David Sterba wrote:
> > Moving the discussion to fsdevel.
> > 
> > Summary: disabling MS_I_VERSION brings some speedups to btrfs, but the
> > generic 'noiversion' option cannot be used to achieve that. It is
> > processed before it reaches btrfs superblock callback, where
> > MS_I_VERSION is forced.
> > 
> > The proposed fix is to add btrfs-specific i_version/noi_version to btrfs,
> > to which I object.
> 
> I was talking to Mingming about this on today's ext4 conference call,
> and one of the reasons why ext4 turns off i_version update by default
> is because it does a real number on our performance as well --- and
> furthermore, the only real user of the field from what we can tell is
> NFSv4, which not all that many ext4 users actually care about.
> 
> This has caused pain for the nfsv4 folks since it means that they need
> to tell people to use a special mount option for ext4 if they are
> actually using this for nfsv4, and I suspect they won't be all that
> eager to hear that btrfs is going to go the same way.

Yes, thanks for looking into this!

> This however got us thinking --- even in if NFSv4 is depending on
> i_version, it doesn't actually _look_ at that field all that often.

Most clients will query it on every write.  (I just took a quick look at
the code and I believe the Linux client's requesting it immediately
after every write, except in the O_DIRECT and delegated cases.)

> It's only going to look at it in a response to a client's getattr
> call, and that in turn is used to so the client can do its local disk
> cache invalidation if anby of the data blocks of the inode has changed.
> 
> So what if we have a per-inode flag which "don't update I_VERSION",
> which is off by default, but after the i_version has been updated at
> least once, is set, so the i_version field won't be updated again ---
> at least until something has actually looked at the i_version field,
> when the "don't update I_VERSOIN" flag will get cleared again.
> 
> So basically, if we know there are no microphones in the forest, we
> don't need to make the tree fall.  However, if someone has sampled the
> i_version field, then the next time the inode gets updated, we will
> update the i_version field so the NFSv4 client can hear the sound of
> the tree crashing to the forst floor and so it can invalidate its
> local cache of the file.  :-)
> 
> This should significantly improve the performance of using the
> i_version field if the file system is being exported via NFSv4, and if
> NFSv4 is not in use, no one will be looking at the i_version field, so
> the performance impact will be very slight, and thus we could enable
> i_version updates by default for btrfs and ext4.
> 
> And this should make the distribution folks happy, since it will unify
> the behavior of all file systems, and make life easier for users who
> won't need to set certain magic mount options depending on what file
> system they are using and whether they are using NFSv4 or not.
> 
> Does this sound reasonable?

Just to make sure I understand, the logic is something like:

	to read the i_version:

		inode->i_version_seen = true;
		return inode->i_version

	to update the i_version:

		/*
		 * If nobody's seen this value of i_version then we can
		 * keep using it, otherwise we need a new one:
		 */
		if (inode->i_version_seen)
			inode->i_version++;
		inode->i_version_seen = false;

Looks OK to me.  As I say I'd expect i_version_seen == true to end up
being the common case in a lot of v4 workloads, so I'm more skeptical of
the claim of a performance improvement in the v4 case.

Could maintaining the new flag be a significant drag in itself?  If not,
then I guess we're not making things any worse there, so fine.

--b.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

  parent reply	other threads:[~2015-06-25 18:46 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <1434527672-5762-1-git-send-email-bo.li.liu@oracle.com>
     [not found] ` <20150617153306.GY6761@twin.jikos.cz>
     [not found]   ` <20150617155234.GB7773@localhost.localdomain>
     [not found]     ` <20150617170118.GA6761@twin.jikos.cz>
     [not found]       ` <20150618024607.GA8530@localhost.localdomain>
2015-06-18 14:38         ` i_version vs iversion (Was: Re: [RFC PATCH v2 1/2] Btrfs: add noi_version option to disable MS_I_VERSION) David Sterba
2015-06-19 11:44           ` Karel Zak
2015-06-22 20:42           ` Dave Chinner
2015-06-24 17:28             ` David Sterba
2015-06-23 16:32           ` Theodore Ts'o
2015-06-24  8:23             ` Liu Bo
2015-06-24 18:02             ` David Sterba
     [not found]               ` <20150624180215.GC726-AlSwsSmVLrQ@public.gmane.org>
2015-06-24 23:17                 ` Theodore Ts'o
     [not found]                   ` <20150624231750.GE14324-AKGzg7BKzIDYtjvyW6yDsg@public.gmane.org>
2015-06-24 23:59                     ` Dave Chinner
     [not found]             ` <20150623163241.GA6645-AKGzg7BKzIDYtjvyW6yDsg@public.gmane.org>
2015-06-25 18:46               ` J. Bruce Fields [this message]
2015-06-25 22:12                 ` Theodore Ts'o
2015-06-26 13:32                   ` J. Bruce Fields

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150625184644.GA12300@fieldses.org \
    --to=bfields-uc3wqj2krung9huczpvpmw@public.gmane.org \
    --cc=bo.li.liu-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org \
    --cc=chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org \
    --cc=dsterba-AlSwsSmVLrQ@public.gmane.org \
    --cc=fdmanana-IBi9RG/b67k@public.gmane.org \
    --cc=kzak-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
    --cc=linux-btrfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=mingming.cao-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org \
    --cc=tytso-3s7WtUTddSA@public.gmane.org \
    --cc=viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).