From: Alex Elder <aelder@sgi.com>
To: Dave Chinner <david@fromorbit.com>
Cc: xfs@oss.sgi.com
Subject: Re: [PATCH 2/2] xfs: don't truncate prealloc from frequently accessed inodes
Date: Thu, 14 Oct 2010 12:22:50 -0500
Message-ID: <1287076970.2362.521.camel@doink>
In-Reply-To: <1286187236-16682-3-git-send-email-david@fromorbit.com>
On Mon, 2010-10-04 at 21:13 +1100, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
>
> A long standing problem for streaming writes through the NFS server
> has been that the NFS server opens and closes file descriptors on an
> inode for every write. The result of this behaviour is that the
> ->release() function is called on every close and that results in
> XFS truncating speculative preallocation beyond the EOF. This has
> an adverse effect on file layout when multiple files are being
> written at the same time - they interleave their extents and can
> result in severe fragmentation.
>
> To avoid this problem, keep a count of the number of ->release calls
> made on an inode. For most cases, an inode is only going to be opened
> once for writing and then closed again during its lifetime in
> cache. Hence if there are multiple ->release calls, there is a good
> chance that the inode is being accessed by the NFS server. Hence
> count up every time ->release is called while there are delalloc
> blocks still outstanding on the inode.
>
> If this count is non-zero when ->release is next called, then do not
> truncate away the speculative preallocation - leave it there so that
> subsequent writes do not need to reallocate the delalloc space. This
> will prevent interleaving of extents of different inodes written
> concurrently to the same AG.
>
> If we get this wrong, it is not a big deal as we truncate
> speculative allocation beyond EOF anyway in xfs_inactive() when the
> inode is thrown out of the cache.
>
> The new counter in the struct xfs_inode fits into a hole in the
> structure on 64 bit machines, so does not grow the size of the inode
> at all.
This seems reasonable, and I have no real objection to
it. However, I have a question and a comment related
to the affected code (and not your specific change).
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---
> fs/xfs/xfs_inode.h | 1 +
> fs/xfs/xfs_vnodeops.c | 15 ++++++++++++++-
> 2 files changed, 15 insertions(+), 1 deletions(-)
>
> diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
> index 1594190..82aad5e 100644
> --- a/fs/xfs/xfs_inode.h
> +++ b/fs/xfs/xfs_inode.h
> @@ -261,6 +261,7 @@ typedef struct xfs_inode {
> xfs_fsize_t i_size; /* in-memory size */
> xfs_fsize_t i_new_size; /* size when write completes */
> atomic_t i_iocount; /* outstanding I/O count */
> + int i_dirty_releases; /* dirty ->release calls */
>
> /* VFS inode */
> struct inode i_vnode; /* embedded VFS inode */
> diff --git a/fs/xfs/xfs_vnodeops.c b/fs/xfs/xfs_vnodeops.c
> index b7bdc43..0c8eeba 100644
> --- a/fs/xfs/xfs_vnodeops.c
> +++ b/fs/xfs/xfs_vnodeops.c
OK, this comment is unrelated to your exact change. But just above
the next hunk there's a big nasty condition, which appears to
be *almost* duplicated in xfs_inactive() (twice!). It would be
very nice if, while you're modifying this nearby code, you
could encapsulate that condition in a macro that has a meaningful
name.
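As a rough sketch of the kind of encapsulation meant here, something
along these lines would do. Note the assumptions: the struct, its
fields and the helper name are simplified placeholders made up for
illustration, not the real xfs_inode fields or the actual condition
in xfs_release(), and it uses a static inline function rather than
a macro.

```c
#include <stdbool.h>

/* Placeholder inode: illustrative fields only, not the xfs_inode layout. */
struct eof_demo_inode {
	bool		is_regular_file;	/* stand-in for the file-type check */
	long long	size;			/* stand-in for the in-memory size */
	unsigned int	delayed_blks;		/* stand-in for the delalloc block count */
	bool		keeps_prealloc;		/* stand-in for "don't trim" inode flags */
};

/*
 * One named predicate instead of an open-coded condition at each
 * call site. The individual terms are assumptions chosen for the
 * example; only the shape (a single, meaningfully named test) matters.
 */
static inline bool eof_demo_may_have_eofblocks(const struct eof_demo_inode *ip)
{
	return ip->is_regular_file &&
	       (ip->size > 0 || ip->delayed_blks > 0) &&
	       !ip->keeps_prealloc;
}
```

Since the copies are only *almost* identical, a real helper would
probably need an extra argument or a second variant to cover the
differences between the call sites.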
> @@ -979,14 +979,27 @@ xfs_release(
> * chance to drop them once the last reference to
> * the inode is dropped, so we'll never leak blocks
> * permanently.
I'm curious what the effect is if we simply don't do the truncate
*except* when the inode becomes inactive. It means we hang onto
the stuff for a while longer, and maybe it makes things messier
in the event of a crash. Can you tell me why we do the truncate
here as well as in xfs_inactive() (or what the problem is with
*not* doing it here)?
> + *
> + * Further, count the number of times we get here in
> + * the life of this inode. If the inode is being
> + * opened, written and closed frequently and we have
> + * delayed allocation blocks outstanding (e.g. streaming
> + * writes from the NFS server), truncating the
> + * blocks past EOF will cause fragmentation to occur.
> + * In this case don't do the truncation, either.
> */
> + if (ip->i_delayed_blks)
> + ip->i_dirty_releases++;
> + if (ip->i_dirty_releases > 1)
> + goto out;
> +
> error = xfs_free_eofblocks(mp, ip,
> XFS_FREE_EOF_TRYLOCK);
> if (error)
> return error;
> }
> }
> -
> +out:
> return 0;
> }
>
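To make the effect of the new counter concrete, here is a small
user-space model of the decision the hunk above adds to
xfs_release(). It is only a sketch under stated assumptions: the
struct is a stand-in holding just the two fields the patch touches,
and the function names are hypothetical. The surrounding condition,
locking and the real xfs_free_eofblocks() call are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the two xfs_inode fields used by the hunk above. */
struct release_demo_inode {
	unsigned int	i_delayed_blks;		/* outstanding delalloc blocks */
	int		i_dirty_releases;	/* dirty ->release calls so far */
};

/*
 * Models the added check: returns true when this ->release call
 * would go on to truncate the preallocation beyond EOF, false when
 * it would take the "goto out" path and leave it in place.
 */
static bool release_demo_truncates(struct release_demo_inode *ip)
{
	if (ip->i_delayed_blks)
		ip->i_dirty_releases++;
	if (ip->i_dirty_releases > 1)
		return false;	/* skip truncation */
	return true;		/* would call xfs_free_eofblocks() */
}

/* Walks through an NFS-like open/write/close pattern. */
static void release_demo_selfcheck(void)
{
	struct release_demo_inode ip = { .i_delayed_blks = 8 };

	assert(release_demo_truncates(&ip));	/* first dirty release trims */
	assert(!release_demo_truncates(&ip));	/* second one keeps prealloc */
	assert(!release_demo_truncates(&ip));	/* and so does every later one */
}
```

In this model the first dirty release still truncates, and every
release after the second dirty one skips the truncation. Nothing in
the hunk as posted resets the counter, so a later release with no
delalloc blocks outstanding is skipped as well.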