public inbox for linux-xfs@vger.kernel.org
From: Ben Myers <bpm@sgi.com>
To: Eric Sandeen <sandeen@sandeen.net>,
	Jean Noel Cordenner <jean-noel.cordenner@bull.net>,
	Dave Chinner <david@fromorbit.com>
Cc: xfs@oss.sgi.com
Subject: Re: [PATCH 4/4] xfs: open code inc_inode_iversion when logging an inode
Date: Mon, 30 Sep 2013 17:39:46 -0500
Message-ID: <20130930223946.GQ1935@sgi.com>
In-Reply-To: <5249FA36.1070609@sandeen.net>

Hi Gents,

On Mon, Sep 30, 2013 at 05:24:54PM -0500, Eric Sandeen wrote:
> On 9/29/13 6:37 PM, Dave Chinner wrote:
> > From: Dave Chinner <dchinner@redhat.com>
> > 
> > Michael L Semon reported that generic/069 runtime increased on v5
> > superblocks by 100% compared to v4 superblocks. His perf-based
> > analysis pointed directly at the timestamp updates being done by the
> > write path in this workload. The append writers are doing 4-byte
> > writes, so there are lots of timestamp updates occurring.
> > 
> > The thing is, they aren't being triggered by timestamp changes -
> > they are being triggered by the inode change counter needing to be
> > updated. That is, every write(2) system call needs to bump the inode
> > version count, and it does that through the timestamp update
> > mechanism. Hence for v5 filesystems, test generic/069 is running 3
> > orders of magnitude more timestamp update transactions on v5
> > filesystems due to the fact it does a huge number of *4 byte*
> > write(2) calls.
> > 
> > This isn't a real world scenario we really need to address - anyone
> > doing such sequential IO should be using fwrite(3), not write(2).
> > i.e. fwrite(3) buffers the writes in userspace to minimise the
> > number of write(2) syscalls, and the problem goes away.
> > 
> > However, there is a small change we can make to improve the
> > situation - removing the expensive lock operation on the change
> > counter update.  All inode version counter changes in XFS occur
> > under the ip->i_ilock during a transaction, and therefore we
> > don't actually need the spin lock that provides exclusive access to
> > it through inode_inc_iversion().
> > 
> > Hence avoid the lock and just open code the increment ourselves when
> > logging the inode.
> > 
> > Reported-by: Michael L. Semon <mlsemon35@gmail.com>
> > Signed-off-by: Dave Chinner <dchinner@redhat.com>
> > ---
> >  fs/xfs/xfs_trans_inode.c | 3 +--
> >  1 file changed, 1 insertion(+), 2 deletions(-)
> > 
> > diff --git a/fs/xfs/xfs_trans_inode.c b/fs/xfs/xfs_trans_inode.c
> > index 53dfe46..e6601c1 100644
> > --- a/fs/xfs/xfs_trans_inode.c
> > +++ b/fs/xfs/xfs_trans_inode.c
> > @@ -118,8 +118,7 @@ xfs_trans_log_inode(
> >  	 */
> >  	if (!(ip->i_itemp->ili_item.li_desc->lid_flags & XFS_LID_DIRTY) &&
> >  	    IS_I_VERSION(VFS_I(ip))) {
> > -		inode_inc_iversion(VFS_I(ip));
> > -		ip->i_d.di_changecount = VFS_I(ip)->i_version;
> 
> comment about the reason for the open-code might be good, too?
> 
> otherwise some semantic patcher might "fix" it for you again later...
> 
> -Eric
> 
> > +		ip->i_d.di_changecount = ++VFS_I(ip)->i_version;
> >  		flags |= XFS_ILOG_CORE;
> >  	}
> >  
> > 

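Adding a comment strikes me as a good idea too.  But isn't that lock there for
a reason?  I suspect dropping it will break i_version on 32-bit systems in the
same way unlocked access breaks i_size there: a 64-bit value cannot be read or
written atomically, so a reader that does not hold the ilock could see a torn
value.  Jean added this function; hopefully he can shed some light.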
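
To make the question concrete, here is a minimal userspace sketch of the two
increment styles under discussion.  It is not kernel code: the struct, the
function names and the use of pthread mutexes in place of the ilock and the
VFS i_lock spinlock are all assumptions made for illustration.  It only shows
that writers already serialised by an outer lock count correctly with or
without the inner lock.  It says nothing about a reader that takes neither
lock, which is the 32-bit case I am worried about.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_inode {
	pthread_mutex_t ilock;     /* stands in for ip->i_ilock */
	pthread_mutex_t i_lock;    /* stands in for the VFS inode->i_lock */
	uint64_t        i_version; /* 64-bit change counter */
};

/* Increment under the inner lock, as the helper being removed did. */
static void model_inc_locked(struct model_inode *ip)
{
	pthread_mutex_lock(&ip->i_lock);
	ip->i_version++;
	pthread_mutex_unlock(&ip->i_lock);
}

/* Open-coded increment: safe only if the caller serialises writers. */
static uint64_t model_inc_open_coded(struct model_inode *ip)
{
	return ++ip->i_version;
}

struct writer_arg {
	struct model_inode *ip;
	int iters;
	int use_inner_lock;
};

/* Every writer holds the outer lock, modelling "under ilock in a transaction". */
static void *writer(void *p)
{
	struct writer_arg *a = p;

	for (int i = 0; i < a->iters; i++) {
		pthread_mutex_lock(&a->ip->ilock);
		if (a->use_inner_lock)
			model_inc_locked(a->ip);
		else
			model_inc_open_coded(a->ip);
		pthread_mutex_unlock(&a->ip->ilock);
	}
	return NULL;
}

/* Run nthreads writers (at most 16) and return the final counter value. */
static uint64_t run_writers(int nthreads, int iters, int use_inner_lock)
{
	struct model_inode ino;
	struct writer_arg arg = { &ino, iters, use_inner_lock };
	pthread_t tids[16];

	assert(nthreads > 0 && nthreads <= 16);
	ino.i_version = 0;
	pthread_mutex_init(&ino.ilock, NULL);
	pthread_mutex_init(&ino.i_lock, NULL);

	for (int i = 0; i < nthreads; i++) {
		int rc = pthread_create(&tids[i], NULL, writer, &arg);
		assert(rc == 0);
		(void)rc;
	}
	for (int i = 0; i < nthreads; i++)
		pthread_join(tids[i], NULL);

	pthread_mutex_destroy(&ino.ilock);
	pthread_mutex_destroy(&ino.i_lock);
	return ino.i_version;
}
```

Calling run_writers(4, 10000, 0) and run_writers(4, 10000, 1) should both
return 40000, since the outer lock already orders the writers.  Whether every
reader of i_version is ordered against them too is the open question.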

Thanks,
	Ben
