[PATCH 2/2] ext4: handle errors in ext4_clear

linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH 2/2] ext4: handle errors in ext4_clear_blocks()
@ 2011-03-01 12:28 Amir Goldstein
  2011-03-21  1:32 ` Ted Ts'o
  2011-03-21  1:37 ` Ted Ts'o
  0 siblings, 2 replies; 5+ messages in thread
From: Amir Goldstein @ 2011-03-01 12:28 UTC (permalink / raw)
  To: Theodore Tso; +Cc: Ext4 Developers List

Checking return code from ext4_journal_get_write_access() is important
with snapshots, because this function invokes COW, so may return new
errors, such as ENOSPC.

ext4_clear_blocks() now returns < 0 for fatal errors, in which case,
ext4_free_data() is aborted.

Signed-off-by: Amir Goldstein <amir73il@users.sf.net>
---
 fs/ext4/inode.c |   46 ++++++++++++++++++++++++++--------------------
 1 files changed, 26 insertions(+), 20 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 67e7a3c..13d3952 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4096,6 +4096,9 @@ no_top:
  *
  * We release `count' blocks on disk, but (last - first) may be greater
  * than `count' because there can be holes in there.
+ *
+ * Return 0 on success, 1 on invalid block range
+ * and < 0 on fatal error.
  */
 static int ext4_clear_blocks(handle_t *handle, struct inode *inode,
 			     struct buffer_head *bh,
@@ -4122,25 +4125,21 @@ static int ext4_clear_blocks(handle_t *handle,
struct inode *inode,
 		if (bh) {
 			BUFFER_TRACE(bh, "call ext4_handle_dirty_metadata");
 			err = ext4_handle_dirty_metadata(handle, inode, bh);
-			if (unlikely(err)) {
-				ext4_std_error(inode->i_sb, err);
-				return 1;
-			}
+			if (unlikely(err))
+				goto out_err;
 		}
 		err = ext4_mark_inode_dirty(handle, inode);
-		if (unlikely(err)) {
-			ext4_std_error(inode->i_sb, err);
-			return 1;
-		}
+		if (unlikely(err))
+			goto out_err;
 		err = ext4_truncate_restart_trans(handle, inode,
 						  blocks_for_truncate(inode));
-		if (unlikely(err)) {
-			ext4_std_error(inode->i_sb, err);
-			return 1;
-		}
+		if (unlikely(err))
+			goto out_err;
 		if (bh) {
 			BUFFER_TRACE(bh, "retaking write access");
-			ext4_journal_get_write_access(handle, bh);
+			err = ext4_journal_get_write_access(handle, bh);
+			if (unlikely(err))
+				goto out_err;
 		}
 	}

@@ -4149,6 +4148,9 @@ static int ext4_clear_blocks(handle_t *handle,
struct inode *inode,

 	ext4_free_blocks(handle, inode, NULL, block_to_free, count, flags);
 	return 0;
+out_err:
+	ext4_std_error(inode->i_sb, err);
+	return err;
 }

 /**
@@ -4182,7 +4184,7 @@ static void ext4_free_data(handle_t *handle,
struct inode *inode,
 	ext4_fsblk_t nr;		    /* Current block # */
 	__le32 *p;			    /* Pointer into inode/ind
 					       for current block */
-	int err;
+	int err = 0;

 	if (this_bh) {				/* For indirect block */
 		BUFFER_TRACE(this_bh, "get_write_access");
@@ -4204,9 +4206,10 @@ static void ext4_free_data(handle_t *handle,
struct inode *inode,
 			} else if (nr == block_to_free + count) {
 				count++;
 			} else {
-				if (ext4_clear_blocks(handle, inode, this_bh,
-						      block_to_free, count,
-						      block_to_free_p, p))
+				err = ext4_clear_blocks(handle, inode, this_bh,
+						        block_to_free, count,
+						        block_to_free_p, p);
+				if (err)
 					break;
 				block_to_free = nr;
 				block_to_free_p = p;
@@ -4215,9 +4218,12 @@ static void ext4_free_data(handle_t *handle,
struct inode *inode,
 		}
 	}

-	if (count > 0)
-		ext4_clear_blocks(handle, inode, this_bh, block_to_free,
-				  count, block_to_free_p, p);
+	if (!err && count > 0)
+		err = ext4_clear_blocks(handle, inode, this_bh, block_to_free,
+					count, block_to_free_p, p);
+	if (err < 0)
+		/* fatal error */
+		return;

 	if (this_bh) {
 		BUFFER_TRACE(this_bh, "call ext4_handle_dirty_metadata");
-- 
1.7.0.4

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH 2/2] ext4: handle errors in ext4_clear_blocks()
  2011-03-01 12:28 [PATCH 2/2] ext4: handle errors in ext4_clear_blocks() Amir Goldstein
@ 2011-03-21  1:32 ` Ted Ts'o
  2011-03-21  1:37 ` Ted Ts'o
  1 sibling, 0 replies; 5+ messages in thread
From: Ted Ts'o @ 2011-03-21  1:32 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Ext4 Developers List

On Tue, Mar 01, 2011 at 02:28:26PM +0200, Amir Goldstein wrote:
> Checking return code from ext4_journal_get_write_access() is important
> with snapshots, because this function invokes COW, so may return new
> errors, such as ENOSPC.
> 
> ext4_clear_blocks() now returns < 0 for fatal errors, in which case,
> ext4_free_data() is aborted.

I'll apply this patch because it's not wrong, but I don't think it's
enough.  Just aborting ext4_free_data() is going to get you into
beaucoup trouble, since that happens inside ext4_truncate().  Aborting
ext4_free_data without even returning an error code is going to leave
the file system corrupted.  But aborting the truncate code (this is
only in the non-extent-mapped inode case) is going to get _awfully_
messy.

I suspect you may need to estimate how many blocks you need, and check
to make sure you can COW that many blocks, and reserve them so the you
don't have to deal with a failure in the middle of an ext4_truncate()
operation.  This is not going to be pretty....

	    	    	      	    - Ted

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 2/2] ext4: handle errors in ext4_clear_blocks()
  2011-03-01 12:28 [PATCH 2/2] ext4: handle errors in ext4_clear_blocks() Amir Goldstein
  2011-03-21  1:32 ` Ted Ts'o
@ 2011-03-21  1:37 ` Ted Ts'o
  2011-03-21  4:23   ` Amir Goldstein
  1 sibling, 1 reply; 5+ messages in thread
From: Ted Ts'o @ 2011-03-21  1:37 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Ext4 Developers List

On Tue, Mar 01, 2011 at 02:28:26PM +0200, Amir Goldstein wrote:
> +out_err:
> +	ext4_std_error(inode->i_sb, err);
> +	return err;

Umm, you do realize this function will mark the file system as dirty,
and possibly remount the file system read-only, or panic the system,
right?

That's appropriate for normal journal failures (since there's no
recovering from them), but if you're proposing that a simple lack of
free blocks when doing a COW operation will result in a ENOSPC, all a
malicious userspace application needs to do to potentially bring the
system down is to attempt to unlink or truncate a file while COW
snapshots are enabled.....

Do you really want me to include this patch?  I think you may want to
rethink how you want to handle this case....

						- Ted

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 2/2] ext4: handle errors in ext4_clear_blocks()
  2011-03-21  1:37 ` Ted Ts'o
@ 2011-03-21  4:23   ` Amir Goldstein
  2011-03-21  5:18     ` Amir Goldstein
  0 siblings, 1 reply; 5+ messages in thread
From: Amir Goldstein @ 2011-03-21  4:23 UTC (permalink / raw)
  To: Ted Ts'o; +Cc: Ext4 Developers List

On Mon, Mar 21, 2011 at 3:37 AM, Ted Ts'o <tytso@mit.edu> wrote:
> On Tue, Mar 01, 2011 at 02:28:26PM +0200, Amir Goldstein wrote:
>> +out_err:
>> +     ext4_std_error(inode->i_sb, err);
>> +     return err;
>
> Umm, you do realize this function will mark the file system as dirty,
> and possibly remount the file system read-only, or panic the system,
> right?

I am counting on that. In fact I think we should not allow snapshots
and errors=continue,
otherwise we just as good (bad) as LVM snapshot that vanishes when it
runs out of space.

>
> That's appropriate for normal journal failures (since there's no
> recovering from them), but if you're proposing that a simple lack of
> free blocks when doing a COW operation will result in a ENOSPC, all a
> malicious userspace application needs to do to potentially bring the
> system down is to attempt to unlink or truncate a file while COW
> snapshots are enabled.....

It's not that easy to cause ENOSPC during COW.
s_snapshot_r_blocks has blocks reserved for COW,
but I do want the fs to freeze if that reservation somehow runs out.
The reservation is based on metadata size estimation, calculated
for no of used inodes no of dirs and no of used blocks. It may fall
short when there are very many hard links or very long file names,
resulting in lots of directory blocks and when all of them are COWed.

short in many hard links

>
> Do you really want me to include this patch?  I think you may want to
> rethink how you want to handle this case....

I do, because when errors=remount-ro, this function emmits lots of
"journal has aborted" errors, which are of no useful to anyone...

>
>                                                - Ted
>
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 2/2] ext4: handle errors in ext4_clear_blocks()
  2011-03-21  4:23   ` Amir Goldstein
@ 2011-03-21  5:18     ` Amir Goldstein
  0 siblings, 0 replies; 5+ messages in thread
From: Amir Goldstein @ 2011-03-21  5:18 UTC (permalink / raw)
  To: Ted Ts'o; +Cc: Ext4 Developers List

On Mon, Mar 21, 2011 at 6:23 AM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Mon, Mar 21, 2011 at 3:37 AM, Ted Ts'o <tytso@mit.edu> wrote:
>> On Tue, Mar 01, 2011 at 02:28:26PM +0200, Amir Goldstein wrote:
>>> +out_err:
>>> +     ext4_std_error(inode->i_sb, err);
>>> +     return err;
>>
>> Umm, you do realize this function will mark the file system as dirty,
>> and possibly remount the file system read-only, or panic the system,
>> right?
>
> I am counting on that. In fact I think we should not allow snapshots
> and errors=continue,
> otherwise we just as good (bad) as LVM snapshot that vanishes when it
> runs out of space.
>
>>
>> That's appropriate for normal journal failures (since there's no
>> recovering from them), but if you're proposing that a simple lack of
>> free blocks when doing a COW operation will result in a ENOSPC, all a
>> malicious userspace application needs to do to potentially bring the
>> system down is to attempt to unlink or truncate a file while COW
>> snapshots are enabled.....

And FYI, unlink/truncate of large file doesn't take up a lot of disk space,
the data blocks are moved to snapshot and only metadata needs to be
COWed, which should be accounted for by the snapshot disk space
reservation.

The worst DoS, which a malicious user space application can do is to
take up it's own disk quota X number of retained snapshots.
This is something that systems administrators have to take into account
when mixing snapshots and quotas.
Just creating new files and deleting them, does not make snapshot disk
usage grow.


>
> It's not that easy to cause ENOSPC during COW.
> s_snapshot_r_blocks has blocks reserved for COW,
> but I do want the fs to freeze if that reservation somehow runs out.
> The reservation is based on metadata size estimation, calculated
> for no of used inodes no of dirs and no of used blocks. It may fall
> short when there are very many hard links or very long file names,
> resulting in lots of directory blocks and when all of them are COWed.
>
> short in many hard links
>
>>
>> Do you really want me to include this patch?  I think you may want to
>> rethink how you want to handle this case....
>
> I do, because when errors=remount-ro, this function emmits lots of
> "journal has aborted" errors, which are of no useful to anyone...
>
>>
>>                                                - Ted
>>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2011-03-21  5:18 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-03-01 12:28 [PATCH 2/2] ext4: handle errors in ext4_clear_blocks() Amir Goldstein
2011-03-21  1:32 ` Ted Ts'o
2011-03-21  1:37 ` Ted Ts'o
2011-03-21  4:23   ` Amir Goldstein
2011-03-21  5:18     ` Amir Goldstein

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).