linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] fs: truncate blocks outside i_size after O_DIRECT write error.
@ 2008-11-30 13:32 Dmitri Monakhov
  2008-12-02 23:58 ` Andrew Morton
  0 siblings, 1 reply; 2+ messages in thread
From: Dmitri Monakhov @ 2008-11-30 13:32 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, akpm, Dmitri Monakhov

In case of error extending write may have instantiated a few  blocks
outside i_size. We need to trim these blocks. We have to do it *regardless*
to blocksize. At least ext2, ext3 and reiserfs interpret
(i_size < biggest block) condition as error. Fsck will complain about wrong
i_size. Then fsck will fix the error by changing i_size according to the
biggest block. This is bad because this blocks contain garbage from previous
write attempt. And result in data corruption.

Changes from previous version:
 - Move error handling from generic mm code to __blockdev_direct_IO(). As soon
   as i remember this is 11'th version of the patch, and now finally it looks
   simple and clean.

####TESTCASE_BEGIN
$touch /mnt/test/BIG_FILE
## at this moment /mnt/test/BIG_FILE size and blocks equal to zero
open("/mnt/test/BIG_FILE", O_WRONLY|O_CREAT|O_DIRECT, 0666) = 3
write(3, "aaaaaaaaaaaa"..., 104857600) = -1 ENOSPC (No space left on device)
## size and block sould't be changed because write op failed.
$stat /mnt/test/BIG_FILE
File: `/mnt/test/BIG_FILE'
Size: 0 Blocks: 110896 IO Block: 1024 regular empty file
<<<<<<<<^^^^^^^^^^^^^^^^^^^^^^^^^^^^^file size is less than biggest block idx
Device: fe07h/65031d Inode: 14 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2007-01-24 20:03:38.000000000 +0300
Modify: 2007-01-24 20:03:38.000000000 +0300
Change: 2007-01-24 20:03:39.000000000 +0300

#fsck.ext3 -f /dev/VG/test
e2fsck 1.39 (29-May-2006)
Pass 1: Checking inodes, blocks, and sizes
Inode 14, i_size is 0, should be 56556544. Fix<y>? yes
Pass 2: Checking directory structure
....
#####TESTCASE_ENDdiff --git a/fs/direct-io.c b/fs/direct-io.c
index af0558d..4e88bea 100644

Signed-off-by: Dmitri Monakhov <dmonakhov@openvz.org>
---
 fs/direct-io.c |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/fs/direct-io.c b/fs/direct-io.c
index af0558d..4e88bea 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -1209,6 +1209,16 @@ __blockdev_direct_IO(int rw, struct kiocb *iocb, struct inode *inode,
 	retval = direct_io_worker(rw, iocb, inode, iov, offset,
 				nr_segs, blkbits, get_block, end_io, dio);
 
+	/*
+	 * In case of error extending write may have instantiated a few
+	 * blocks outside i_size. Trim these off again for DIO_LOCKING.
+	 * NOTE: DIO_NO_LOCK/DIO_OWN_LOCK callers have to handle this by
+	 * it's own meaner.
+	 */
+	if (unlikely(retval < 0 && (rw & WRITE) && (end > i_size_read(inode))
+			&& dio_lock_type == DIO_LOCKING))
+		vmtruncate(inode, inode->i_size);
+
 	if (rw == READ && dio_lock_type == DIO_LOCKING)
 		release_i_mutex = 0;
 
-- 
1.5.4.3

^ permalink raw reply related	[flat|nested] 2+ messages in thread

* Re: [PATCH] fs: truncate blocks outside i_size after O_DIRECT write error.
  2008-11-30 13:32 [PATCH] fs: truncate blocks outside i_size after O_DIRECT write error Dmitri Monakhov
@ 2008-12-02 23:58 ` Andrew Morton
  0 siblings, 0 replies; 2+ messages in thread
From: Andrew Morton @ 2008-12-02 23:58 UTC (permalink / raw)
  To: Dmitri Monakhov; +Cc: linux-kernel, linux-fsdevel, dmonakhov

On Sun, 30 Nov 2008 16:32:46 +0300
Dmitri Monakhov <dmonakhov@openvz.org> wrote:

> In case of error extending write may have instantiated a few  blocks
> outside i_size. We need to trim these blocks. We have to do it *regardless*
> to blocksize. At least ext2, ext3 and reiserfs interpret
> (i_size < biggest block) condition as error. Fsck will complain about wrong
> i_size. Then fsck will fix the error by changing i_size according to the
> biggest block. This is bad because this blocks contain garbage from previous
> write attempt. And result in data corruption.
> 
> Changes from previous version:
>  - Move error handling from generic mm code to __blockdev_direct_IO(). As soon
>    as i remember this is 11'th version of the patch, and now finally it looks
>    simple and clean.

Only 11?  We can do better than that..

> ####TESTCASE_BEGIN
> $touch /mnt/test/BIG_FILE
> ## at this moment /mnt/test/BIG_FILE size and blocks equal to zero
> open("/mnt/test/BIG_FILE", O_WRONLY|O_CREAT|O_DIRECT, 0666) = 3
> write(3, "aaaaaaaaaaaa"..., 104857600) = -1 ENOSPC (No space left on device)
> ## size and block sould't be changed because write op failed.
> $stat /mnt/test/BIG_FILE
> File: `/mnt/test/BIG_FILE'
> Size: 0 Blocks: 110896 IO Block: 1024 regular empty file
> <<<<<<<<^^^^^^^^^^^^^^^^^^^^^^^^^^^^^file size is less than biggest block idx
> Device: fe07h/65031d Inode: 14 Links: 1
> Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
> Access: 2007-01-24 20:03:38.000000000 +0300
> Modify: 2007-01-24 20:03:38.000000000 +0300
> Change: 2007-01-24 20:03:39.000000000 +0300
> 
> #fsck.ext3 -f /dev/VG/test
> e2fsck 1.39 (29-May-2006)
> Pass 1: Checking inodes, blocks, and sizes
> Inode 14, i_size is 0, should be 56556544. Fix<y>? yes
> Pass 2: Checking directory structure
> ....
> #####TESTCASE_ENDdiff --git a/fs/direct-io.c b/fs/direct-io.c
> index af0558d..4e88bea 100644
> 
> Signed-off-by: Dmitri Monakhov <dmonakhov@openvz.org>
> ---
>  fs/direct-io.c |   10 ++++++++++
>  1 files changed, 10 insertions(+), 0 deletions(-)
> 
> diff --git a/fs/direct-io.c b/fs/direct-io.c
> index af0558d..4e88bea 100644
> --- a/fs/direct-io.c
> +++ b/fs/direct-io.c
> @@ -1209,6 +1209,16 @@ __blockdev_direct_IO(int rw, struct kiocb *iocb, struct inode *inode,
>  	retval = direct_io_worker(rw, iocb, inode, iov, offset,
>  				nr_segs, blkbits, get_block, end_io, dio);
>  
> +	/*
> +	 * In case of error extending write may have instantiated a few
> +	 * blocks outside i_size. Trim these off again for DIO_LOCKING.
> +	 * NOTE: DIO_NO_LOCK/DIO_OWN_LOCK callers have to handle this by
> +	 * it's own meaner.
> +	 */
> +	if (unlikely(retval < 0 && (rw & WRITE) && (end > i_size_read(inode))
> +			&& dio_lock_type == DIO_LOCKING))
> +		vmtruncate(inode, inode->i_size);
> +
>  	if (rw == READ && dio_lock_type == DIO_LOCKING)
>  		release_i_mutex = 0;

Really we should use i_size_read() in both cases (i_mutex might not be
held).  How does this look?

	/*
	 * In case of error extending write may have instantiated a few
	 * blocks outside i_size. Trim these off again for DIO_LOCKING.
	 * NOTE: DIO_NO_LOCK/DIO_OWN_LOCK callers have to handle this by
	 * it's own meaner.
	 */
	if (unlikely(retval < 0 && (rw & WRITE))) {
		loff_t isize = i_size_read(inode);

		if (end > isize && dio_lock_type == DIO_LOCKING)
			vmtruncate(inode, isize);
	}



^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2008-12-02 23:58 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-11-30 13:32 [PATCH] fs: truncate blocks outside i_size after O_DIRECT write error Dmitri Monakhov
2008-12-02 23:58 ` Andrew Morton

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).