public inbox for linux-xfs@vger.kernel.org
From: Eric Sandeen <sandeen@sandeen.net>
To: Dave Chinner <david@fromorbit.com>, xfs@oss.sgi.com
Subject: Re: [PATCH 2/5] xfs: open code inc_inode_iversion when logging an inode
Date: Mon, 18 Nov 2013 15:54:24 -0600
Message-ID: <528A8C90.3010401@sandeen.net>
In-Reply-To: <1383280040-21979-3-git-send-email-david@fromorbit.com>

On 10/31/13, 11:27 PM, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> Michael L Semon reported that generic/069 runtime increased on v5
> superblocks by 100% compared to v4 superblocks. His perf-based
> analysis pointed directly at the timestamp updates being done by the
> write path in this workload. The append writers are doing 4-byte
> writes, so there are lots of timestamp updates occurring.
> 
> The thing is, they aren't being triggered by timestamp changes -
> they are being triggered by the inode change counter needing to be
> updated. That is, every write(2) system call needs to bump the inode
> version count, and it does that through the timestamp update
> mechanism. Hence test generic/069 is running 3 orders of
> magnitude more timestamp update transactions on v5 filesystems,
> because it does a huge number of *4 byte* write(2) calls.
> 
> This isn't a real world scenario we really need to address - anyone
> doing such sequential IO should be using fwrite(3), not write(2).
> i.e. fwrite(3) buffers the writes in userspace to minimise the
> number of write(2) syscalls, and the problem goes away.
> 
> However, there is a small change we can make to improve the
> situation - removing the expensive lock operation on the change
> counter update.  All inode version counter changes in XFS occur
> under the ip->i_ilock during a transaction, and therefore we
> don't actually need the spin lock that provides exclusive access to
> it through inode_inc_iversion().
> 
> Hence avoid the lock and just open code the increment ourselves when
> logging the inode.

Well, ok.  Maybe worth a note about why the unlocked read is 99.9999% ok...

Reviewed-by: Eric Sandeen <sandeen@redhat.com>
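
For readers following the fwrite(3) vs. write(2) point in the commit
message, here is a minimal userspace sketch of why buffering makes the
problem go away: small appends are collected in memory and only reach the
kernel when the buffer fills or is flushed. This is not stdio's actual
implementation. The names wbuf, wbuf_append, wbuf_flush and the 4096-byte
buffer size are all hypothetical, chosen for illustration, and the sketch
skips EINTR handling.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

#define WBUF_SIZE 4096	/* hypothetical buffer size, not stdio's */

/* Hypothetical append buffer that counts the write(2) calls it issues. */
struct wbuf {
	int		fd;		/* destination file descriptor */
	size_t		used;		/* bytes currently buffered */
	unsigned long	syscalls;	/* write(2) calls that succeeded */
	char		data[WBUF_SIZE];
};

/* Push everything buffered so far to the kernel. */
int wbuf_flush(struct wbuf *wb)
{
	size_t off = 0;

	while (off < wb->used) {
		ssize_t n = write(wb->fd, wb->data + off, wb->used - off);

		if (n < 0)
			return -1;	/* sketch: no EINTR retry */
		wb->syscalls++;
		off += (size_t)n;
	}
	wb->used = 0;
	return 0;
}

/* Copy a small record into the buffer, flushing first if it would not fit. */
int wbuf_append(struct wbuf *wb, const void *p, size_t len)
{
	assert(len <= WBUF_SIZE);
	if (wb->used + len > WBUF_SIZE && wbuf_flush(wb) < 0)
		return -1;
	memcpy(wb->data + wb->used, p, len);
	wb->used += len;
	return 0;
}
```

With 4-byte records, one buffer's worth of appends costs a single write(2)
instead of one per record, so the kernel sees correspondingly fewer inode
change counter updates.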

> Reported-by: Michael L. Semon <mlsemon35@gmail.com>
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---
>  fs/xfs/xfs_trans_inode.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/xfs/xfs_trans_inode.c b/fs/xfs/xfs_trans_inode.c
> index 1bba7f6..50c3f56 100644
> --- a/fs/xfs/xfs_trans_inode.c
> +++ b/fs/xfs/xfs_trans_inode.c
> @@ -111,12 +111,14 @@ xfs_trans_log_inode(
>  
>  	/*
>  	 * First time we log the inode in a transaction, bump the inode change
> -	 * counter if it is configured for this to occur.
> +	 * counter if it is configured for this to occur. We don't use
> +	 * inode_inc_version() because there is no need for extra locking around
> +	 * i_version as we already hold the inode locked exclusively for
> +	 * metadata modification.
>  	 */
>  	if (!(ip->i_itemp->ili_item.li_desc->lid_flags & XFS_LID_DIRTY) &&
>  	    IS_I_VERSION(VFS_I(ip))) {
> -		inode_inc_iversion(VFS_I(ip));
> -		ip->i_d.di_changecount = VFS_I(ip)->i_version;
> +		ip->i_d.di_changecount = ++VFS_I(ip)->i_version;
>  		flags |= XFS_ILOG_CORE;
>  	}
>  
> 
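
The locking argument behind the hunk above can be sketched in userspace
with pthreads. This is only an analogy, not kernel code: toy_inode, its
fields and both functions are hypothetical stand-ins, with ilock playing
the role of ip->i_ilock and ver_lock playing the role of the lock taken
inside the helper. The point it tries to show is that when every writer
already holds the outer lock, the inner lock adds cost without adding
exclusion for those writers. It says nothing about readers that look at
the counter without holding the outer lock.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for an inode with a change counter. */
struct toy_inode {
	pthread_mutex_t	ilock;		/* analogue of ip->i_ilock */
	pthread_mutex_t	ver_lock;	/* analogue of the helper's own lock */
	uint64_t	version;	/* analogue of i_version */
	uint64_t	disk_changecount; /* analogue of di_changecount */
};

void toy_inode_init(struct toy_inode *ip)
{
	pthread_mutex_init(&ip->ilock, NULL);
	pthread_mutex_init(&ip->ver_lock, NULL);
	ip->version = 0;
	ip->disk_changecount = 0;
}

/* "Before": bump the counter under its own lock, then copy it out. */
void toy_log_inode_locked_helper(struct toy_inode *ip)
{
	pthread_mutex_lock(&ip->ver_lock);
	ip->version++;
	pthread_mutex_unlock(&ip->ver_lock);
	ip->disk_changecount = ip->version;
}

/* "After": open-coded increment. The caller must hold ip->ilock. */
void toy_log_inode_open_coded(struct toy_inode *ip)
{
	/* Debug check: trylock on a mutex that is already held gives EBUSY. */
	assert(pthread_mutex_trylock(&ip->ilock) == EBUSY);
	ip->disk_changecount = ++ip->version;
}
```

Both variants leave version and disk_changecount equal when called with
ilock held; the second simply skips one lock/unlock pair per call.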


