All of lore.kernel.org
 help / color / mirror / Atom feed
* Some O_DIRECT vs. block_truncate_page problems
@ 2002-09-23 18:09 Eric Sandeen
  2002-09-24  5:39 ` Andrew Morton
  0 siblings, 1 reply; 3+ messages in thread
From: Eric Sandeen @ 2002-09-23 18:09 UTC (permalink / raw)
  To: linux-fsdevel

I'm trying to track down a bug I'm seeing after I run xfs's
defragmenter, where data is present past EOF in the last block.

xfs_fsr does this (essentially):

o allocate some contiguous space with an xfs-specific ioctl
o copy the fragmented file into the new file with O_DIRECT
o truncate() the new file back to the correct size (since it's currently
  a multiple of the block size)
o sync the new file
o do "xfs_inval_cached_pages", which is essentially:
    - filemap_fdatasync(ip->i_mapping);
    - fsync_inode_data_buffers(ip);
    - filemap_fdatawait(ip->i_mapping);
    - truncate_inode_pages(ip->i_mapping, first);
o swap the extents between the old & the new files

When this is all done, I see data past EOF in the resulting file.

I believe that this is because the buffer from the truncate() call is
not being synced to disk, and gets thrown away when we do
truncate_inode_pages().

block_truncate_page() does a __mark_buffer_dirty(bh) at the end, but it
does not file the buffer on the inode's dirty data queue, so the
fsync_inode_data_buffers() does not see it.  The inode only has a page
on it's clean pages list, so filemap_fdatasync does not see it either.

Since generic_file_direct_IO expects to do filemap_fdatasync,
fsync_inode_data_buffers, filemap_fdatawait to flush data to disk, I
think that the behavior after a truncate() will be broken for any
filesystem.

This patch seems to fix things up for me; is there a reason that the
buffer was not inserted originally?

--- linux/fs/buffer.c_1.109	Mon Sep 23 13:10:56 2002
+++ linux/fs/buffer.c	Mon Sep 23 13:04:44 2002
@@ -2028,7 +2028,12 @@
 	flush_dcache_page(page);
 	kunmap(page);
 
-	__mark_buffer_dirty(bh);
+	if (!atomic_set_buffer_dirty(bh)) {
+		__mark_dirty(bh);
+		buffer_insert_inode_data_queue(bh, inode);
+		balance_dirty();
+	}
+
 	err = 0;


On a related note, while testing some O_DIRECT cases on ext2, I found
another, possibly related, bug.

If you O_DIRECT write 2 blocks into a new file, truncate() the file to
1.5 blocks, then do an O_DIRECT read of the last block, you get
incorrect data.  Rather than 1/2 data, 1/2 zeroes, I get all zeroes.
This test passes on xfs.

Comments?

-Eric


-- 
Eric Sandeen      XFS for Linux     http://oss.sgi.com/projects/xfs
sandeen@sgi.com   SGI, Inc.         651-683-3102


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2002-09-24 12:48 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2002-09-23 18:09 Some O_DIRECT vs. block_truncate_page problems Eric Sandeen
2002-09-24  5:39 ` Andrew Morton
2002-09-24 12:48   ` Some O_DIRECT vs. block_truncate_page problemsr Eric Sandeen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.