From: Dave Chinner <david@fromorbit.com>
To: Kevan Rehm <kfr@sgi.com>
Cc: xfs@oss.sgi.com
Subject: Re: question on xfs_vm_writepage in combination with fsync
Date: Tue, 21 Jun 2011 09:50:40 +1000
Message-ID: <20110620235040.GQ561@dastard>
In-Reply-To: <4DFFB3F3.3070606@sgi.com>
On Mon, Jun 20, 2011 at 03:56:19PM -0500, Kevan Rehm wrote:
> Greetings,
>
> I've run into a case where the fsync() system call seems to have
> returned before all file data was actually on disk. (A SLES11SP1 system
> crash occurred shortly after an fsync which had returned zero. After
> restarting the machine, the last I/O before the fsync is not in the
> file.) In attempting to find the problem, I've come across code I don't
> understand, and am hoping someone can enlighten me as to how things are
> supposed to work.
>
> Routine xfs_vm_writepage has various situations under which it will
> decide it can't currently initiate writeback on a page, and in that case
> calls redirty_page_for_writepage, unlocks the page, and returns zero.
> That seems to me to be incompatible with fsync(), so I'm obviously
> missing some key piece of logic.
>
> The calling sequence of routines involved in fsync is:
>
> do_fsync->vfs_fsync->vfs_fsync_range->
> filemap_write_and_wait_range->
> __filemap_fdatawrite_range->
> do_writepages->generic_writepages->
> write_cache_pages
>
> Routine write_cache_pages walks the radix tree and calls
> clear_page_dirty_for_io and then __writepage on each dirty page to
> initiate writeback. __writepage calls xfs_vm_writepage. That routine
> is occasionally unable to immediately start writeback of the page, and
> so it calls redirty_page_for_writepage without setting the writeback flag.

Hi Kevan,

The current xfs_vm_writepage mainline code will only enter the
redirty path if:

	- it is called from direct memory reclaim
	- it is called within a transaction context and we need to
	  do an allocation transaction
	- it is WB_SYNC_NONE writeback and we can't get the inode
	  lock without blocking during block mapping (EAGAIN case).

None of these cases are triggered by fsync() driven (WB_SYNC_ALL)
writeback, so AFAICT fsync() based writeback should not be skipping
writeback of dirty pages in the given fsync range. So for a mainline
kernel I don't think there are any problems w.r.t. fsync() and
redirtying pages causing dirty pages to be skipped during writeback.
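
To make the reasoning above concrete, here is a rough userspace model of
the three redirty conditions. It is only a sketch: every name in it
(writepage_ctx, would_redirty, fsync_ctx, the SYNC_* values) is made up
for illustration and none of it is the actual kernel code. It also
assumes that fsync() driven writeback is never in direct reclaim and
never runs inside a transaction, which is what the argument above
relies on.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for WB_SYNC_NONE / WB_SYNC_ALL. */
enum sync_mode { SYNC_NONE, SYNC_ALL };

/* Hypothetical summary of the state writepage would look at. */
struct writepage_ctx {
	enum sync_mode mode;
	bool from_direct_reclaim; /* called from direct memory reclaim */
	bool in_transaction;      /* already inside a transaction context */
	bool needs_allocation;    /* page needs an allocation transaction */
	bool ilock_would_block;   /* inode lock not available without blocking */
};

/* Returns true if the page would be redirtied instead of written. */
static bool would_redirty(const struct writepage_ctx *c)
{
	if (c->from_direct_reclaim)
		return true;
	if (c->in_transaction && c->needs_allocation)
		return true;
	/* EAGAIN case: only non-blocking (SYNC_NONE) writeback backs off. */
	if (c->mode == SYNC_NONE && c->ilock_would_block)
		return true;
	return false;
}

/* Context as assumed for fsync(): SYNC_ALL, no reclaim, no transaction. */
static struct writepage_ctx fsync_ctx(bool needs_allocation,
				      bool ilock_would_block)
{
	struct writepage_ctx c = {
		.mode = SYNC_ALL,
		.from_direct_reclaim = false,
		.in_transaction = false,
		.needs_allocation = needs_allocation,
		.ilock_would_block = ilock_would_block,
	};
	return c;
}
```

Under those assumptions would_redirty() is false for every context that
fsync_ctx() can produce, whatever the allocation and lock state. Flip
the mode to SYNC_NONE with a contended inode lock and the page is
redirtied, which is the EAGAIN case.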

However, the mainline writeback path has changed significantly
(especially for WB_SYNC_ALL writeback) since sles11sp1 was
snapshotted (2.6.32, right?). Hence it is possible that one (or
several) of the changes fixed this bug without us even realising it
was a problem.

That said, having dirty pages after an fsync is not necessarily an
fsync bug - something could have dirtied them while the fsync was in
progress. I don't know any details of how this occurred, so I'm
simply speculating that there could be other causes of the dirty
pages you are seeing...

Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com