linux-ext4.vger.kernel.org archive mirror
* [PATCH] ext4: ensure LARGE_FILE feature when mounting delalloc
@ 2014-10-01 21:33 Eric Sandeen
  2014-10-02  1:26 ` Andreas Dilger
  2014-10-02 15:28 ` [PATCH] ext4: fix reservation overflow in ext4_da_write_begin Eric Sandeen
  0 siblings, 2 replies; 6+ messages in thread
From: Eric Sandeen @ 2014-10-01 21:33 UTC
  To: ext4 development

Delalloc write journal reservations only reserve 1 credit,
to update the inode if necessary.  However, it may happen
once in a filesystem's lifetime that a file will cross
the 2G threshold, and require the LARGE_FILE feature to
be set in the superblock as well, if it was not set already.

This overruns the transaction reservation, and can be
demonstrated simply on any ext4 filesystem without the LARGE_FILE
feature already set:

dd if=/dev/zero of=testfile bs=1 seek=2147483646 count=1 \
	conv=notrunc of=testfile
sync
dd if=/dev/zero of=testfile bs=1 seek=2147483647 count=1 \
	conv=notrunc of=testfile

leads to:

EXT4-fs: ext4_do_update_inode:4296: aborting transaction: error 28 in __ext4_handle_dirty_super
EXT4-fs error (device loop0) in ext4_do_update_inode:4301: error 28
EXT4-fs error (device loop0) in ext4_reserve_inode_write:4757: Readonly filesystem
EXT4-fs error (device loop0) in ext4_dirty_inode:4876: error 28
EXT4-fs error (device loop0) in ext4_da_write_end:2685: error 28

It simplifies things if we ensure that when we are running
with delalloc, we have LARGE_FILE set already; that way we
don't have to potentially set it later during a file write.

For any fs of sufficient size, LARGE_FILE is usually set
simply due to the size of the resize inode.  And for ext4,
HUGE_FILE is set by default.

LARGE_FILE is a decades-old compatibility flag, so at this
point there is little risk of backwards compatibility problems
by enabling it when the filesystem is mounted as ext4.

So just set LARGE_FILE if we are mounted delalloc, if it's
not set already, and be done with it.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
--- 

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 0b28b36..8e56d7e 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -3576,6 +3576,20 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
 			clear_opt(sb, DELALLOC);
 	}
 
+	/*
+	 * Adding the LARGE_FILES feature to the superblock adds
+	 * unnecessary complication to journal credit calculations
+	 * when delalloc is enabled.  This is a decades-old feature,
+	 * so just enable it now to simplify things.
+	 */
+	if (test_opt(sb, DELALLOC) && !(sb->s_flags & MS_RDONLY) &&
+	    EXT4_HAS_COMPAT_FEATURE(sb, EXT4_FEATURE_COMPAT_HAS_JOURNAL) &&
+	    !EXT4_HAS_RO_COMPAT_FEATURE(sb, EXT4_FEATURE_RO_COMPAT_LARGE_FILE)) {
+		ext4_update_dynamic_rev(sb);
+		EXT4_SET_RO_COMPAT_FEATURE(sb,
+					   EXT4_FEATURE_RO_COMPAT_LARGE_FILE);
+	}
+
 	sb->s_flags = (sb->s_flags & ~MS_POSIXACL) |
 		(test_opt(sb, POSIX_ACL) ? MS_POSIXACL : 0);
 


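The dd reproduction in the message above can also be sketched as a small
userspace C helper.  This is only an approximation under stated
assumptions: write_byte_at() and cross_boundary() are hypothetical names
that do not appear in the thread, and fsync() on the one file stands in
for the system-wide sync that the dd sequence uses.

```c
#define _FILE_OFFSET_BITS 64	/* keep off_t 64-bit on 32-bit builds */
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write one zero byte at `offset`, roughly what
 * "dd bs=1 seek=<offset> count=1 conv=notrunc" does.
 * Returns 0 on success, -1 on failure.
 */
static int write_byte_at(int fd, off_t offset)
{
	const char zero = 0;

	return pwrite(fd, &zero, 1, offset) == 1 ? 0 : -1;
}

/*
 * Hypothetical helper: write a byte at `last_small_offset`, flush the
 * file, then write the next byte.  With last_small_offset = 2147483646
 * the second write grows the file past 0x7fffffff bytes.
 * Returns 0 if every step succeeded, -1 otherwise.
 */
static int cross_boundary(const char *path, off_t last_small_offset)
{
	int fd = open(path, O_WRONLY | O_CREAT, 0644);
	int ret = -1;

	if (fd < 0)
		return -1;

	if (write_byte_at(fd, last_small_offset) == 0 &&
	    fsync(fd) == 0 &&
	    write_byte_at(fd, last_small_offset + 1) == 0)
		ret = 0;

	if (close(fd) != 0)
		ret = -1;
	return ret;
}
```

Calling cross_boundary("testfile", 2147483646) on an ext4 filesystem
mounted with delalloc and without LARGE_FILE set mirrors the two dd
invocations; whether the failure shows up as a write error or only in
the kernel log is not something this sketch checks.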

* Re: [PATCH] ext4: ensure LARGE_FILE feature when mounting delalloc
  2014-10-01 21:33 [PATCH] ext4: ensure LARGE_FILE feature when mounting delalloc Eric Sandeen
@ 2014-10-02  1:26 ` Andreas Dilger
  2014-10-02  2:15   ` Eric Sandeen
  2014-10-02 15:28 ` [PATCH] ext4: fix reservation overflow in ext4_da_write_begin Eric Sandeen
  1 sibling, 1 reply; 6+ messages in thread
From: Andreas Dilger @ 2014-10-02  1:26 UTC
  To: Eric Sandeen; +Cc: ext4 development


On Oct 1, 2014, at 3:33 PM, Eric Sandeen <sandeen@redhat.com> wrote:
> Delalloc write journal reservations only reserve 1 credit,
> to update the inode if necessary.  However, it may happen
> once in a filesystem's lifetime that a file will cross
> the 2G threshold, and require the LARGE_FILE feature to
> be set in the superblock as well, if it was not set already.
> 
> This overruns the transaction reservation, and can be
> demonstrated simply on any ext4 filesystem without the LARGE_FILE
> feature already set:
> 
> dd if=/dev/zero of=testfile bs=1 seek=2147483646 count=1 \
> 	conv=notrunc of=testfile
> sync
> dd if=/dev/zero of=testfile bs=1 seek=2147483647 count=1 \
> 	conv=notrunc of=testfile
> 
> leads to:
> 
> EXT4-fs: ext4_do_update_inode:4296: aborting transaction: error 28 in __ext4_handle_dirty_super
> EXT4-fs error (device loop0) in ext4_do_update_inode:4301: error 28
> EXT4-fs error (device loop0) in ext4_reserve_inode_write:4757: Readonly filesystem
> EXT4-fs error (device loop0) in ext4_dirty_inode:4876: error 28
> EXT4-fs error (device loop0) in ext4_da_write_end:2685: error 28
> 
> It simplifies things if we ensure that when we are running
> with delalloc, we have LARGE_FILE set already; that way we
> don't have to potentially set it later during a file write.
> 
> For any fs of sufficient size, LARGE_FILE is usually set
> simply due to the size of the resize inode.  And for ext4,
> HUGE_FILE is set by default.
> 
> LARGE_FILE is a decades-old compatibility flag, so at this
> point there is little risk of backwards compatibility problems
> by enabling it when the filesystem is mounted as ext4.
> 
> So just set LARGE_FILE if we are mounted delalloc, if it's
> not set already, and be done with it.
> 
> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
> --- 
> 
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 0b28b36..8e56d7e 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -3576,6 +3576,20 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
> 			clear_opt(sb, DELALLOC);
> 	}
> 
> +	/*
> +	 * Adding the LARGE_FILES feature to the superblock adds
> +	 * unnecessary complication to journal credit calculations
> +	 * when delalloc is enabled.  This is a decades-old feature,
> +	 * so just enable it now to simplify things.
> +	 */
> +	if (test_opt(sb, DELALLOC) && !(sb->s_flags & MS_RDONLY) &&
> +	    EXT4_HAS_COMPAT_FEATURE(sb, EXT4_FEATURE_COMPAT_HAS_JOURNAL) &&
> +	    !EXT4_HAS_RO_COMPAT_FEATURE(sb, EXT4_FEATURE_RO_COMPAT_LARGE_FILE)) {
> +		ext4_update_dynamic_rev(sb);
> +		EXT4_SET_RO_COMPAT_FEATURE(sb,
> +					   EXT4_FEATURE_RO_COMPAT_LARGE_FILE);

This sets the superblock flag, but doesn't actually mark the superblock
dirty.  Later in ext4_fill_super() it is possible that this buffer_head
is discarded without writing it out:

        if (sb->s_blocksize != blocksize) {
                :
                :
                brelse(bh);

While this isn't completely fatal (the next mount would enable this
flag again), it could cause some errors to appear in e2fsck if large
files are created without the large_file feature in the superblock.
It would probably be safer to mark the superblock dirty in this case
so that it is written out.  No need to sync it, I think:

                ext4_commit_super(sb, 0);

Also, it looks like it is possible to enable delalloc via remount, so
this feature check/set should also be added there?

Cheers, Andreas

> +	}
> +
> 	sb->s_flags = (sb->s_flags & ~MS_POSIXACL) |
> 		(test_opt(sb, POSIX_ACL) ? MS_POSIXACL : 0);
> 
> 

* Re: [PATCH] ext4: ensure LARGE_FILE feature when mounting delalloc
  2014-10-02  1:26 ` Andreas Dilger
@ 2014-10-02  2:15   ` Eric Sandeen
  0 siblings, 0 replies; 6+ messages in thread
From: Eric Sandeen @ 2014-10-02  2:15 UTC
  To: Andreas Dilger; +Cc: ext4 development

On 10/1/14 8:26 PM, Andreas Dilger wrote:
> On Oct 1, 2014, at 3:33 PM, Eric Sandeen <sandeen@redhat.com> wrote:
>> Delalloc write journal reservations only reserve 1 credit,
>> to update the inode if necessary.  However, it may happen
>> once in a filesystem's lifetime that a file will cross
>> the 2G threshold, and require the LARGE_FILE feature to
>> be set in the superblock as well, if it was not set already.
>>
>> This overruns the transaction reservation, and can be
>> demonstrated simply on any ext4 filesystem without the LARGE_FILE
>> feature already set:
>>
>> dd if=/dev/zero of=testfile bs=1 seek=2147483646 count=1 \
>> 	conv=notrunc of=testfile
>> sync
>> dd if=/dev/zero of=testfile bs=1 seek=2147483647 count=1 \
>> 	conv=notrunc of=testfile
>>
>> leads to:
>>
>> EXT4-fs: ext4_do_update_inode:4296: aborting transaction: error 28 in __ext4_handle_dirty_super
>> EXT4-fs error (device loop0) in ext4_do_update_inode:4301: error 28
>> EXT4-fs error (device loop0) in ext4_reserve_inode_write:4757: Readonly filesystem
>> EXT4-fs error (device loop0) in ext4_dirty_inode:4876: error 28
>> EXT4-fs error (device loop0) in ext4_da_write_end:2685: error 28
>>
>> It simplifies things if we ensure that when we are running
>> with delalloc, we have LARGE_FILE set already; that way we
>> don't have to potentially set it later during a file write.
>>
>> For any fs of sufficient size, LARGE_FILE is usually set
>> simply due to the size of the resize inode.  And for ext4,
>> HUGE_FILE is set by default.
>>
>> LARGE_FILE is a decades-old compatibility flag, so at this
>> point there is little risk of backwards compatibility problems
>> by enabling it when the filesystem is mounted as ext4.
>>
>> So just set LARGE_FILE if we are mounted delalloc, if it's
>> not set already, and be done with it.
>>
>> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
>> --- 
>>
>> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
>> index 0b28b36..8e56d7e 100644
>> --- a/fs/ext4/super.c
>> +++ b/fs/ext4/super.c
>> @@ -3576,6 +3576,20 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
>> 			clear_opt(sb, DELALLOC);
>> 	}
>>
>> +	/*
>> +	 * Adding the LARGE_FILES feature to the superblock adds
>> +	 * unnecessary complication to journal credit calculations
>> +	 * when delalloc is enabled.  This is a decades-old feature,
>> +	 * so just enable it now to simplify things.
>> +	 */
>> +	if (test_opt(sb, DELALLOC) && !(sb->s_flags & MS_RDONLY) &&
>> +	    EXT4_HAS_COMPAT_FEATURE(sb, EXT4_FEATURE_COMPAT_HAS_JOURNAL) &&
>> +	    !EXT4_HAS_RO_COMPAT_FEATURE(sb, EXT4_FEATURE_RO_COMPAT_LARGE_FILE)) {
>> +		ext4_update_dynamic_rev(sb);
>> +		EXT4_SET_RO_COMPAT_FEATURE(sb,
>> +					   EXT4_FEATURE_RO_COMPAT_LARGE_FILE);
> 
> This sets the superblock flag, but doesn't actually mark the superblock
> dirty.  Later in ext4_fill_super() it is possible that this buffer_head
> is discarded without writing it out:
> 
>         if (sb->s_blocksize != blocksize) {
>                 :
>                 :
>                 brelse(bh);

sorry, I missed this; skipped to the end too fast.

> While this isn't completely fatal (the next mount would enable this
> flag again), it could cause some errors to appear in e2fsck if large
> files are created without the large_file feature in the superblock.
> It would probably be safer to mark the superblock dirty in this case
> so that it is written out.  No need to sync it, I think:
> 
>                 ext4_commit_super(sb, 0);
> 
> Also, it looks like it is possible to enable delalloc via remount, so
> this feature check/set should also be added there?

oh, bleah.  I guess so.

Thanks for the review, will send V2.

-Eric

> Cheers, Andreas
> 



* [PATCH] ext4: fix reservation overflow in ext4_da_write_begin
  2014-10-01 21:33 [PATCH] ext4: ensure LARGE_FILE feature when mounting delalloc Eric Sandeen
  2014-10-02  1:26 ` Andreas Dilger
@ 2014-10-02 15:28 ` Eric Sandeen
  2014-10-02 21:00   ` Andreas Dilger
  1 sibling, 1 reply; 6+ messages in thread
From: Eric Sandeen @ 2014-10-02 15:28 UTC
  To: ext4 development

Delalloc write journal reservations only reserve 1 credit,
to update the inode if necessary.  However, it may happen
once in a filesystem's lifetime that a file will cross
the 2G threshold, and require the LARGE_FILE feature to
be set in the superblock as well, if it was not set already.

This overruns the transaction reservation, and can be
demonstrated simply on any ext4 filesystem without the LARGE_FILE
feature already set:

dd if=/dev/zero of=testfile bs=1 seek=2147483646 count=1 \
	conv=notrunc of=testfile
sync
dd if=/dev/zero of=testfile bs=1 seek=2147483647 count=1 \
	conv=notrunc of=testfile

leads to:

EXT4-fs: ext4_do_update_inode:4296: aborting transaction: error 28 in __ext4_handle_dirty_super
EXT4-fs error (device loop0) in ext4_do_update_inode:4301: error 28
EXT4-fs error (device loop0) in ext4_reserve_inode_write:4757: Readonly filesystem
EXT4-fs error (device loop0) in ext4_dirty_inode:4876: error 28
EXT4-fs error (device loop0) in ext4_da_write_end:2685: error 28

Adjust the number of credits based on whether the flag is
already set, and whether the current write may extend past the
LARGE_FILE limit.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
--- 

Ok, how's this ... I do like this a lot better than the set-flag-on-
mount-or-remount, which started to get a bit icky.


diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 3aa26e9..8d362c2 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2515,6 +2515,20 @@ static int ext4_nonda_switch(struct super_block *sb)
 	return 0;
 }
 
+/* We always reserve for an inode update; the superblock could be there too */
+static int ext4_da_write_credits(struct inode *inode, loff_t pos, unsigned len)
+{
+	if (EXT4_HAS_RO_COMPAT_FEATURE(inode->i_sb,
+                                EXT4_FEATURE_RO_COMPAT_LARGE_FILE))
+		return 1;
+
+	if (pos + len <= 0x7fffffffULL)
+		return 1;
+
+	/* We might need to update the superblock to set LARGE_FILE */
+	return 2;
+}
+
 static int ext4_da_write_begin(struct file *file, struct address_space *mapping,
 			       loff_t pos, unsigned len, unsigned flags,
 			       struct page **pagep, void **fsdata)
@@ -2565,7 +2579,8 @@ retry_grab:
 	 * of file which has an already mapped buffer.
 	 */
 retry_journal:
-	handle = ext4_journal_start(inode, EXT4_HT_WRITE_PAGE, 1);
+	handle = ext4_journal_start(inode, EXT4_HT_WRITE_PAGE,
+			ext4_da_write_credits(inode, pos, len));
 	if (IS_ERR(handle)) {
 		page_cache_release(page);
 		return PTR_ERR(handle);


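The credit calculation in the patch above can be checked against the dd
reproduction with a small userspace model.  This is a sketch, not the
kernel code: da_write_credits() and LARGE_FILE_LIMIT are hypothetical
names, and the superblock feature test is replaced by a plain boolean
argument.

```c
#include <stdbool.h>
#include <stdint.h>

/* Largest size a file may reach without needing LARGE_FILE. */
#define LARGE_FILE_LIMIT 0x7fffffffULL

/*
 * Userspace model of the helper added by the patch: one credit for the
 * inode update, plus one more when the write may push the file past
 * the limit while the feature flag is still clear.
 */
static int da_write_credits(bool has_large_file, uint64_t pos,
			    unsigned int len)
{
	if (has_large_file)
		return 1;

	if (pos + len <= LARGE_FILE_LIMIT)
		return 1;

	/* The superblock may need updating to set LARGE_FILE. */
	return 2;
}
```

With the flag clear, the first dd write (pos 2147483646, len 1) ends
exactly at 0x7fffffff and still needs one credit; the second write
(pos 2147483647, len 1) ends at 0x80000000 and needs two.  Once the
flag is set, every write is back to one credit.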

* Re: [PATCH] ext4: fix reservation overflow in ext4_da_write_begin
  2014-10-02 15:28 ` [PATCH] ext4: fix reservation overflow in ext4_da_write_begin Eric Sandeen
@ 2014-10-02 21:00   ` Andreas Dilger
  2014-10-11 23:52     ` Theodore Ts'o
  0 siblings, 1 reply; 6+ messages in thread
From: Andreas Dilger @ 2014-10-02 21:00 UTC
  To: Eric Sandeen; +Cc: ext4 development


On Oct 2, 2014, at 9:28 AM, Eric Sandeen <sandeen@redhat.com> wrote:
> Delalloc write journal reservations only reserve 1 credit,
> to update the inode if necessary.  However, it may happen
> once in a filesystem's lifetime that a file will cross
> the 2G threshold, and require the LARGE_FILE feature to
> be set in the superblock as well, if it was not set already.
> 
> This overruns the transaction reservation, and can be
> demonstrated simply on any ext4 filesystem without the LARGE_FILE
> feature already set:
> 
> dd if=/dev/zero of=testfile bs=1 seek=2147483646 count=1 \
> 	conv=notrunc of=testfile
> sync
> dd if=/dev/zero of=testfile bs=1 seek=2147483647 count=1 \
> 	conv=notrunc of=testfile
> 
> leads to:
> 
> EXT4-fs: ext4_do_update_inode:4296: aborting transaction: error 28 in __ext4_handle_dirty_super
> EXT4-fs error (device loop0) in ext4_do_update_inode:4301: error 28
> EXT4-fs error (device loop0) in ext4_reserve_inode_write:4757: Readonly filesystem
> EXT4-fs error (device loop0) in ext4_dirty_inode:4876: error 28
> EXT4-fs error (device loop0) in ext4_da_write_end:2685: error 28
> 
> Adjust the number of credits based on whether the flag is
> already set, and whether the current write may extend past the
> LARGE_FILE limit.
> 
> Signed-off-by: Eric Sandeen <sandeen@redhat.com>

Reviewed-by: Andreas Dilger <adilger@dilger.ca>

> --- 
> 
> Ok, how's this ... I do like this a lot better than the set-flag-on-
> mount-or-remount, which started to get a bit icky.
> 
> 
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 3aa26e9..8d362c2 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -2515,6 +2515,20 @@ static int ext4_nonda_switch(struct super_block *sb)
> 	return 0;
> }
> 
> +/* We always reserve for an inode update; the superblock could be there too */
> +static int ext4_da_write_credits(struct inode *inode, loff_t pos, unsigned len)
> +{
> +	if (EXT4_HAS_RO_COMPAT_FEATURE(inode->i_sb,

This could be marked "likely()" I suspect, but not critical.

> +                                EXT4_FEATURE_RO_COMPAT_LARGE_FILE))
> +		return 1;
> +
> +	if (pos + len <= 0x7fffffffULL)
> +		return 1;
> +
> +	/* We might need to update the superblock to set LARGE_FILE */
> +	return 2;
> +}
> +
> static int ext4_da_write_begin(struct file *file, struct address_space *mapping,
> 			       loff_t pos, unsigned len, unsigned flags,
> 			       struct page **pagep, void **fsdata)
> @@ -2565,7 +2579,8 @@ retry_grab:
> 	 * of file which has an already mapped buffer.
> 	 */
> retry_journal:
> -	handle = ext4_journal_start(inode, EXT4_HT_WRITE_PAGE, 1);
> +	handle = ext4_journal_start(inode, EXT4_HT_WRITE_PAGE,
> +			ext4_da_write_credits(inode, pos, len));
> 	if (IS_ERR(handle)) {
> 		page_cache_release(page);
> 		return PTR_ERR(handle);
> 


Cheers, Andreas


* Re: [PATCH] ext4: fix reservation overflow in ext4_da_write_begin
  2014-10-02 21:00   ` Andreas Dilger
@ 2014-10-11 23:52     ` Theodore Ts'o
  0 siblings, 0 replies; 6+ messages in thread
From: Theodore Ts'o @ 2014-10-11 23:52 UTC
  To: Andreas Dilger; +Cc: Eric Sandeen, ext4 development

On Thu, Oct 02, 2014 at 03:00:23PM -0600, Andreas Dilger wrote:
> On Oct 2, 2014, at 9:28 AM, Eric Sandeen <sandeen@redhat.com> wrote:
> > Delalloc write journal reservations only reserve 1 credit,
> > to update the inode if necessary.  However, it may happen
> > once in a filesystem's lifetime that a file will cross
> > the 2G threshold, and require the LARGE_FILE feature to
> > be set in the superblock as well, if it was not set already.
> > 
> > This overruns the transaction reservation, and can be
> > demonstrated simply on any ext4 filesystem without the LARGE_FILE
> > feature already set:
> > 
> > dd if=/dev/zero of=testfile bs=1 seek=2147483646 count=1 \
> > 	conv=notrunc of=testfile
> > sync
> > dd if=/dev/zero of=testfile bs=1 seek=2147483647 count=1 \
> > 	conv=notrunc of=testfile
> > 
> > leads to:
> > 
> > EXT4-fs: ext4_do_update_inode:4296: aborting transaction: error 28 in __ext4_handle_dirty_super
> > EXT4-fs error (device loop0) in ext4_do_update_inode:4301: error 28
> > EXT4-fs error (device loop0) in ext4_reserve_inode_write:4757: Readonly filesystem
> > EXT4-fs error (device loop0) in ext4_dirty_inode:4876: error 28
> > EXT4-fs error (device loop0) in ext4_da_write_end:2685: error 28
> > 
> > Adjust the number of credits based on whether the flag is
> > already set, and whether the current write may extend past the
> > LARGE_FILE limit.
> > 
> > Signed-off-by: Eric Sandeen <sandeen@redhat.com>
> 
> Reviewed-by: Andreas Dilger <adilger@dilger.ca>

Applied, thanks.  I added the likely() qualifier per Andreas'
suggestion.

						- Ted
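The committed version is not shown in this thread, so the following is
only a sketch of what a likely() hint does.  In the kernel, likely(x)
is defined as __builtin_expect(!!(x), 1); the macro is reproduced here
for userspace and assumes GCC or Clang.  credits_with_hint() and its
boolean arguments are hypothetical stand-ins for the real feature test
and offset check.

```c
#include <stdbool.h>

/* Same definition the kernel uses; requires GCC or Clang. */
#define likely(x) __builtin_expect(!!(x), 1)

/*
 * The hint tells the compiler which branch to lay out as the fast
 * path.  It does not change the value of the condition, so the
 * result is the same as without it.
 */
static int credits_with_hint(bool has_large_file, bool crosses_limit)
{
	if (likely(has_large_file))
		return 1;

	return crosses_limit ? 2 : 1;
}
```

Because LARGE_FILE is already set on most filesystems, the early
return is the common case, which is why the hint goes on that test.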
