From: Eric Sandeen <sandeen@sandeen.net>
To: Dave Chinner <david@fromorbit.com>, xfs@oss.sgi.com
Subject: Re: [PATCH 2/5] xfs: open code inc_inode_iversion when logging an inode
Date: Mon, 18 Nov 2013 15:54:24 -0600
Message-ID: <528A8C90.3010401@sandeen.net>
In-Reply-To: <1383280040-21979-3-git-send-email-david@fromorbit.com>
On 10/31/13, 11:27 PM, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
>
> Michael L Semon reported that generic/069 runtime increased on v5
> superblocks by 100% compared to v4 superblocks. His perf-based
> analysis pointed directly at the timestamp updates being done by the
> write path in this workload. The append writers are doing 4-byte
> writes, so there are lots of timestamp updates occurring.
>
> The thing is, they aren't being triggered by timestamp changes -
> they are being triggered by the inode change counter needing to be
> updated. That is, every write(2) system call needs to bump the inode
> version count, and it does that through the timestamp update
> mechanism. Hence test generic/069 is running 3 orders of magnitude
> more timestamp update transactions on v5 filesystems, because it
> does a huge number of *4 byte* write(2) calls.
>
> This isn't a real world scenario we really need to address - anyone
> doing such sequential IO should be using fwrite(3), not write(2).
> i.e. fwrite(3) buffers the writes in userspace to minimise the
> number of write(2) syscalls, and the problem goes away.
>
> However, there is a small change we can make to improve the
> situation - removing the expensive lock operation on the change
> counter update. All inode version counter changes in XFS occur
> under the ip->i_ilock during a transaction, and therefore we
> don't actually need the spin lock that provides exclusive access to
> it through inode_inc_iversion().
>
> Hence avoid the lock and just open code the increment ourselves when
> logging the inode.

Well, OK. Maybe worth a note about why the unlocked read is 99.9999% OK...

Reviewed-by: Eric Sandeen <sandeen@redhat.com>

> Reported-by: Michael L. Semon <mlsemon35@gmail.com>
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---
> fs/xfs/xfs_trans_inode.c | 8 +++++---
> 1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/fs/xfs/xfs_trans_inode.c b/fs/xfs/xfs_trans_inode.c
> index 1bba7f6..50c3f56 100644
> --- a/fs/xfs/xfs_trans_inode.c
> +++ b/fs/xfs/xfs_trans_inode.c
> @@ -111,12 +111,14 @@ xfs_trans_log_inode(
>
>  	/*
>  	 * First time we log the inode in a transaction, bump the inode change
> -	 * counter if it is configured for this to occur.
> +	 * counter if it is configured for this to occur. We don't use
> +	 * inode_inc_version() because there is no need for extra locking around
> +	 * i_version as we already hold the inode locked exclusively for
> +	 * metadata modification.
>  	 */
>  	if (!(ip->i_itemp->ili_item.li_desc->lid_flags & XFS_LID_DIRTY) &&
>  	    IS_I_VERSION(VFS_I(ip))) {
> -		inode_inc_iversion(VFS_I(ip));
> -		ip->i_d.di_changecount = VFS_I(ip)->i_version;
> +		ip->i_d.di_changecount = ++VFS_I(ip)->i_version;
>  		flags |= XFS_ILOG_CORE;
>  	}
>
>
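
For anyone skimming the thread, here is a rough userspace sketch of the before/after shape of the hunk above. It is only an illustration under stated assumptions: struct toy_inode and the toy_* functions are hypothetical names, and pthread mutexes stand in for both the spinlock taken inside the helper and ip->i_ilock. It is not the kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical userspace stand-in; NOT the kernel's struct inode/xfs_inode. */
struct toy_inode {
	pthread_mutex_t i_lock;    /* models the lock taken inside the helper */
	pthread_mutex_t i_ilock;   /* models ip->i_ilock held by the transaction */
	uint64_t i_version;        /* in-memory change counter */
	uint64_t di_changecount;   /* copy destined for the on-disk inode */
};

static void toy_inode_init(struct toy_inode *ip)
{
	pthread_mutex_init(&ip->i_lock, NULL);
	pthread_mutex_init(&ip->i_ilock, NULL);
	ip->i_version = 0;
	ip->di_changecount = 0;
}

/* "Before" shape: bump under the inner lock, then copy the value out. */
static void toy_log_inode_locked(struct toy_inode *ip)
{
	pthread_mutex_lock(&ip->i_lock);
	ip->i_version++;
	pthread_mutex_unlock(&ip->i_lock);
	ip->di_changecount = ip->i_version;
}

/*
 * "After" shape: the caller already holds i_ilock exclusively, so every
 * writer of i_version is serialised and the inner lock is skipped.
 */
static void toy_log_inode_open_coded(struct toy_inode *ip)
{
	/* Sanity check for the sketch: the outer lock must already be held. */
	assert(pthread_mutex_trylock(&ip->i_ilock) == EBUSY);
	ip->di_changecount = ++ip->i_version;
}
```

Both variants leave i_version and di_changecount equal after the call; the second simply drops one lock/unlock pair per logged inode. The sketch only covers writers. A reader that samples i_version without holding i_ilock gets no protection from the outer lock, which is presumably the case the "unlocked read" remark above refers to.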