From: Kevan Rehm <kfr@sgi.com>
To: xfs@oss.sgi.com
Subject: question on xfs_vm_writepage in combination with fsync
Date: Mon, 20 Jun 2011 15:56:19 -0500
Message-ID: <4DFFB3F3.3070606@sgi.com>

Greetings,

I've run into a case where the fsync() system call seems to have
returned before all file data was actually on disk.  (A SLES11SP1 system
crash occurred shortly after an fsync which had returned zero.  After
restarting the machine, the last I/O before the fsync is not in the
file.)  In attempting to find the problem, I've come across code I don't
understand, and am hoping someone can enlighten me as to how things are
supposed to work.
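
For reference, the userspace pattern in question looks roughly like the sketch below. This is only an illustration of what "fsync returned zero" means here; write_and_sync is a hypothetical helper name, not code from the affected application.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: write len bytes to path, then fsync the file.
 * Returns 0 only if open, every write, fsync, and close all succeeded;
 * returns -1 otherwise.
 */
int write_and_sync(const char *path, const void *buf, size_t len)
{
	const char *p = buf;
	int fd;

	assert(path != NULL && (buf != NULL || len == 0));

	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd < 0)
		return -1;

	/* write() may be partial, so loop until everything is handed over. */
	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n <= 0) {		/* treat any failure as fatal for brevity */
			close(fd);
			return -1;
		}
		p += n;
		len -= (size_t)n;
	}

	/* The application assumes the data is on stable storage once this is 0. */
	if (fsync(fd) != 0) {
		close(fd);
		return -1;
	}
	return close(fd) == 0 ? 0 : -1;
}
```

A zero return from a helper like this is what the application relied on before the crash.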

Routine xfs_vm_writepage has various situations under which it will
decide it can't currently initiate writeback on a page, and in that case
calls redirty_page_for_writepage, unlocks the page, and returns zero.
That seems to me to be incompatible with fsync(), so I'm obviously
missing some key piece of logic.
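
To make the pattern I am asking about concrete, here is a toy userspace model of it. This is not the XFS code; struct toy_page, its busy field, and toy_writepage are names I made up for illustration, and the flags only loosely stand in for PG_dirty, PG_writeback, and PG_locked.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a page's state; not the kernel's struct page. */
struct toy_page {
	bool dirty;	/* models PG_dirty */
	bool writeback;	/* models PG_writeback */
	bool locked;	/* models PG_locked */
	bool busy;	/* hypothetical: "cannot start I/O right now" */
};

/*
 * Models the behaviour described above. The caller has already cleared
 * the dirty flag and holds the page lock. If I/O cannot be started, the
 * page is redirtied and unlocked and 0 is returned, so the caller cannot
 * tell this case apart from a successful submission.
 */
int toy_writepage(struct toy_page *pg)
{
	assert(pg->locked && !pg->dirty);

	if (pg->busy) {
		pg->dirty = true;	/* redirty_page_for_writepage() analogue */
		pg->locked = false;	/* unlock_page() analogue */
		return 0;
	}

	pg->writeback = true;		/* I/O submitted */
	pg->locked = false;
	return 0;
}
```

Both branches return zero, which is the part that looks incompatible with a data-integrity sync.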

The calling sequence of routines involved in fsync is:

do_fsync->vfs_fsync->vfs_fsync_range->
	filemap_write_and_wait_range->
	__filemap_fdatawrite_range->
	do_writepages->generic_writepages->
	write_cache_pages

Routine write_cache_pages walks the radix tree and calls
clear_page_dirty_for_io and then __writepage on each dirty page to
initiate writeback.  __writepage calls xfs_vm_writepage.  That routine
is occasionally unable to immediately start writeback of the page, and
so it calls redirty_page_for_writepage without setting the writeback flag.

When write_cache_pages resumes after the __writepage call, it continues
walking the radix tree starting additional writebacks on dirty pages,
but nothing I can see will ever come back and try again to start
writeback on the page that xfs_vm_writepage couldn't write back.
Eventually control bubbles back up to filemap_write_and_wait_range(),
where wait_on_page_writeback_range is called, but that routine only
waits for writebacks to complete; it doesn't do anything about dirty
pages.  So it appears to me that the dirty page will be left dirty
indefinitely even though the wbc specified WB_SYNC_ALL.
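
The sequence as I read it can be modelled in a few lines of userspace C. This is a sketch of my understanding only, not kernel code: the TOY_* flags and the toy_* functions are invented names, and TOY_BUSY is a hypothetical marker for "writepage could not start I/O on this page".

```c
#include <assert.h>
#include <stddef.h>

enum {
	TOY_DIRTY     = 1,	/* models PG_dirty */
	TOY_WRITEBACK = 2,	/* models PG_writeback */
	TOY_BUSY      = 4	/* hypothetical: writepage cannot start I/O */
};

/* Write pass: visits each dirty page exactly once, with no retry. */
static void toy_write_pass(unsigned *flags, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!(flags[i] & TOY_DIRTY))
			continue;
		flags[i] &= ~TOY_DIRTY;		/* clear_page_dirty_for_io analogue */
		if (flags[i] & TOY_BUSY)
			flags[i] |= TOY_DIRTY;	/* redirtied, no I/O started */
		else
			flags[i] |= TOY_WRITEBACK;	/* I/O started */
	}
}

/* Wait pass: completes writeback only; dirty pages are not examined. */
static void toy_wait_pass(unsigned *flags, size_t n)
{
	for (size_t i = 0; i < n; i++)
		flags[i] &= ~TOY_WRITEBACK;
}

/* Returns how many pages are still dirty after one write pass plus wait. */
size_t toy_sync_range(unsigned *flags, size_t n)
{
	size_t still_dirty = 0;

	assert(flags != NULL || n == 0);
	toy_write_pass(flags, n);
	toy_wait_pass(flags, n);

	for (size_t i = 0; i < n; i++)
		if (flags[i] & TOY_DIRTY)
			still_dirty++;
	return still_dirty;
}
```

In this model any page marked TOY_BUSY is still dirty when toy_sync_range returns, which matches what I see in the crash dump.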

I'd like to believe that I am missing something, and that the code is
correct, but I do have a crash dump where I can see dirty pages in files
that were recently fsync'd.  And I can't believe the problem is
something inside XFS, because I see other filesystems also call
redirty_page_for_writepage, so I think the same problem could occur with
them.

Could someone please describe to me how fsync is supposed to work in
combination with xfs_vm_writepage?

Thanks in advance,

Regards, Kevan


Thread overview: 2+ messages
2011-06-20 20:56 Kevan Rehm [this message]
2011-06-20 23:50 ` question on xfs_vm_writepage in combination with fsync Dave Chinner
