linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Eric Sandeen <sandeen@sgi.com>
To: linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Some O_DIRECT vs. block_truncate_page problems
Date: 23 Sep 2002 13:09:37 -0500	[thread overview]
Message-ID: <1032804577.27949.17.camel@stout.americas.sgi.com> (raw)

I'm trying to track down a bug I'm seeing after I run xfs's
defragmenter, where data is present past EOF in the last block.

xfs_fsr does this (essentially):

o allocate some contiguous space with an xfs-specific ioctl
o copy the fragmented file into the new file with O_DIRECT
o truncate() the new file back to the correct size (since it's currently
  a multiple of the block size)
o sync the new file
o do "xfs_inval_cached_pages", which is essentially:
    - filemap_fdatasync(ip->i_mapping);
    - fsync_inode_data_buffers(ip);
    - filemap_fdatawait(ip->i_mapping);
    - truncate_inode_pages(ip->i_mapping, first);
o swap the extents between the old & the new files

When this is all done, I see data past EOF in the resulting file.

I believe that this is because the buffer from the truncate() call is
not being synced to disk, and gets thrown away when we do
truncate_inode_pages().

block_truncate_page() does a __mark_buffer_dirty(bh) at the end, but it
does not file the buffer on the inode's dirty data queue, so the
fsync_inode_data_buffers() does not see it.  The inode only has a page
on it's clean pages list, so filemap_fdatasync does not see it either.

Since generic_file_direct_IO expects to do filemap_fdatasync,
fsync_inode_data_buffers, filemap_fdatawait to flush data to disk, I
think that the behavior after a truncate() will be broken for any
filesystem.

This patch seems to fix things up for me; is there a reason that the
buffer was not inserted originally?

--- linux/fs/buffer.c_1.109	Mon Sep 23 13:10:56 2002
+++ linux/fs/buffer.c	Mon Sep 23 13:04:44 2002
@@ -2028,7 +2028,12 @@
 	flush_dcache_page(page);
 	kunmap(page);
 
-	__mark_buffer_dirty(bh);
+	if (!atomic_set_buffer_dirty(bh)) {
+		__mark_dirty(bh);
+		buffer_insert_inode_data_queue(bh, inode);
+		balance_dirty();
+	}
+
 	err = 0;


On a related note, while testing some O_DIRECT cases on ext2, I found
another, possibly related, bug.

If you O_DIRECT write 2 blocks into a new file, truncate() the file to
1.5 blocks, then do an O_DIRECT read of the last block, you get
incorrect data.  Rather than 1/2 data, 1/2 zeroes, I get all zeroes.
This test passes on xfs.

Comments?

-Eric


-- 
Eric Sandeen      XFS for Linux     http://oss.sgi.com/projects/xfs
sandeen@sgi.com   SGI, Inc.         651-683-3102


             reply	other threads:[~2002-09-23 18:16 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2002-09-23 18:09 Eric Sandeen [this message]
2002-09-24  5:39 ` Some O_DIRECT vs. block_truncate_page problems Andrew Morton
2002-09-24 12:48   ` Some O_DIRECT vs. block_truncate_page problemsr Eric Sandeen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1032804577.27949.17.camel@stout.americas.sgi.com \
    --to=sandeen@sgi.com \
    --cc=linux-fsdevel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).