From: Dave Chinner <david@fromorbit.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: xfs@oss.sgi.com
Subject: Re: [PATCH 7/7] xfs: xfs_fs_write_inode() can fail to write inodes synchronously
Date: Tue, 26 Jan 2010 15:51:49 +1100 [thread overview]
Message-ID: <20100126045149.GC15853@discord.disaster> (raw)
In-Reply-To: <20100125160354.GA30227@infradead.org>
On Mon, Jan 25, 2010 at 11:03:54AM -0500, Christoph Hellwig wrote:
> On Mon, Jan 25, 2010 at 05:22:44PM +1100, Dave Chinner wrote:
> > When an inode has already be flushed delayed write,
> > xfs_inode_clean() returns true and hence xfs_fs_write_inode() can
> > return on a synchronous inode write without having written the
> > inode. Currently these sycnhronous writes only come from the unmount
> > path or the nfsd on a synchronous export so should be fairly rare.
>
> They also come from sync_filesystem, which is uses by the sync system
> call, in the unmount code and from cachefiles.
True - I'll update the comment - but I still think it'll be fairly
rare.
> > Realistically, a synchronous inode write is not necessary here; we
> > can treat this like fsync where we either force the log if there are
> > no unlogged changes, or do a sync transaction if there are unlogged
> > changes. The will result real synchronous semantics as the fsync
> > will issue barriers, but may slow down the above two configurations
> > as a result. However, if the inode is not pinned and has no unlogged
> > changes, then the fsync code is a no-op and hence it may be faster
> > than the existing code.
>
> If we get a lot of cases where we need to write out the inode
> synchronously the barrier might hit us really hard, though.
No different to running wsync, though, where all transactions
are synchronous and will issue barriers all the time.
> If
> we have a lot of delalloc I/O outstanding I fear this might actually
> happen in practice as the inode gets modified between the first
> ->write_inode with wait == 0 by I/O completion.
So far in my testing I haven't seen a big hit - the performance
tests I've done are on filesystems with barriers enabled. I just
checked barrier vs nobarrier sync times after creating 400,000
single block files in parallel - nobarrier = 27s, barrier = 29s.
> > + error = EAGAIN;
> > + if (!xfs_ilock_nowait(ip, XFS_ILOCK_SHARED))
> > + goto out;
> > + if (xfs_ipincount(ip) || !xfs_iflock_nowait(ip))
> > + goto out_unlock;
>
> So if we make this non-blocking even for the wait case, don't we
> still have a race window there bulkstat could miss the updates, even
> after a sync?
Yes, you're right. But even if we lock here properly, a delwri flush
is non-blocking and hence can still return EAGAIN. We really only need
this if a newly allocated inode has not been previously flushed for
bulkstat to work correctly. We would need to race with a concurrent
transaction between the fsync call and the below checks for this
flush to fail, which I think should be a relatively rare ocurrence.
What I will look at is whether I can get xfs_fsync() to take a
locked inode and return with it still locked. Then this race
condition will go away completely and hence the delwri flush
will only occur if the inode has not been flushed yet (based
on the flock).
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
prev parent reply other threads:[~2010-01-26 4:50 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-01-25 6:22 [PATCH 0/7] Delayed write metadata writeback V3 Dave Chinner
2010-01-25 6:22 ` [PATCH 1/7] xfs: Make inode reclaim states explicit Dave Chinner
2010-01-25 11:47 ` Christoph Hellwig
2010-01-25 6:22 ` [PATCH 2/7] xfs: Use delayed write for inodes rather than async Dave Chinner
2010-01-25 11:53 ` Christoph Hellwig
2010-01-25 22:31 ` Dave Chinner
2010-01-26 7:28 ` Dave Chinner
2010-01-26 16:10 ` Christoph Hellwig
2010-02-01 22:59 ` Dave Chinner
2010-01-25 15:29 ` Christoph Hellwig
2010-01-25 6:22 ` [PATCH 3/7] xfs: Don't issue buffer IO direct from AIL push Dave Chinner
2010-01-25 11:59 ` Christoph Hellwig
2010-01-25 6:22 ` [PATCH 4/7] xfs: Sort delayed write buffers before dispatch Dave Chinner
2010-01-25 12:00 ` Christoph Hellwig
2010-01-26 4:13 ` Dave Chinner
2010-01-25 6:22 ` [PATCH 5/7] xfs: Use delay write promotion for dquot flushing Dave Chinner
2010-01-25 12:03 ` Christoph Hellwig
2010-01-25 6:22 ` [PATCH 6/7] xfs: kill the unused XFS_QMOPT_* flush flags Dave Chinner
2010-01-25 12:04 ` Christoph Hellwig
2010-01-25 6:22 ` [PATCH 7/7] xfs: xfs_fs_write_inode() can fail to write inodes synchronously Dave Chinner
2010-01-25 16:03 ` Christoph Hellwig
2010-01-26 4:51 ` Dave Chinner [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20100126045149.GC15853@discord.disaster \
--to=david@fromorbit.com \
--cc=hch@infradead.org \
--cc=xfs@oss.sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox