linux-ext4.vger.kernel.org archive mirror
* Re: e2fsck max blocks for huge non-extent file
       [not found] <CAE2LqHL6uY=Sq2+aVtW-Lkbu9mvjFkaNqLaDA8Bkpmvx9AjHBg@mail.gmail.com>
@ 2025-01-13 16:33 ` Theodore Ts'o
  2025-01-13 18:35   ` Darrick J. Wong
  0 siblings, 1 reply; 4+ messages in thread
From: Theodore Ts'o @ 2025-01-13 16:33 UTC (permalink / raw)
  To: Catalin Patulea; +Cc: linux-ext4, Kazuya Mio

On Mon, Jan 13, 2025 at 12:49:19AM -0500, Catalin Patulea wrote:
> 
> I have an ext3 filesystem on which I manually enabled huge_file
> (files >2 TB) using tune2fs; then created a 3 TB file (backup image
> of another disk).  Now, I am running e2fsck and it reports errors:

Hmm, it looks like this has been broken for a while.  I've done a
quick look, and it appears this has been the case since e2fsprogs
1.28 and this commit:

commit da307041e75bdf3b24c1eb43132a4f9d8a1b3844
Author: Theodore Ts'o <tytso@mit.edu>
Date:   Tue May 21 21:19:14 2002 -0400

    Check for inodes which are too big (either too many blocks, or
    would cause i_size to be too big), and offer to truncate the inode.
    Remove old bogus i_size checks.
    
    Add test case which tests e2fsck's handling of large sparse files.
    Older e2fsck with the old(er) bogus i_size checks didn't handle
    this correctly.

I think no one noticed since supporting files this large without
extents is so inefficient and painful that in practice anyone working
with such files would be using ext4, and not a really ancient ext3
file system.

The fix might be as simple as this, but I haven't had a chance to test
it and do appropriate regression tests....

diff --git a/e2fsck/pass1.c b/e2fsck/pass1.c
index eb73922d3..e460a75f4 100644
--- a/e2fsck/pass1.c
+++ b/e2fsck/pass1.c
@@ -3842,7 +3842,7 @@ static int process_block(ext2_filsys fs,
 		problem = PR_1_TOOBIG_DIR;
 	if (p->is_dir && p->num_blocks + 1 >= p->max_blocks)
 		problem = PR_1_TOOBIG_DIR;
-	if (p->is_reg && p->num_blocks + 1 >= p->max_blocks)
+	if (p->is_reg && p->num_blocks + 1 >= 1U << 31)
 		problem = PR_1_TOOBIG_REG;
 	if (!p->is_dir && !p->is_reg && blockcnt > 0)
 		problem = PR_1_TOOBIG_SYMLINK;


						- Ted


* Re: e2fsck max blocks for huge non-extent file
From: Darrick J. Wong @ 2025-01-13 18:35 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: Catalin Patulea, linux-ext4, Kazuya Mio

On Mon, Jan 13, 2025 at 11:33:45AM -0500, Theodore Ts'o wrote:
> On Mon, Jan 13, 2025 at 12:49:19AM -0500, Catalin Patulea wrote:
> > 
> > I have an ext3 filesystem on which I manually enabled huge_file
> > (files >2 TB) using tune2fs; then created a 3 TB file (backup image
> > of another disk).  Now, I am running e2fsck and it reports errors:
> 
> Hmm, it looks like this has been broken for a while.  I've done a
> quick look, and it appears this has been the case since e2fsprogs
> 1.28 and this commit:
> 
> commit da307041e75bdf3b24c1eb43132a4f9d8a1b3844
> Author: Theodore Ts'o <tytso@mit.edu>
> Date:   Tue May 21 21:19:14 2002 -0400
> 
>     Check for inodes which are too big (either too many blocks, or
>     would cause i_size to be too big), and offer to truncate the inode.
>     Remove old bogus i_size checks.
>     
>     Add test case which tests e2fsck's handling of large sparse files.
>     Older e2fsck with the old(er) bogus i_size checks didn't handle
>     this correctly.
> 
> I think no one noticed since supporting files this large without
> extents is so inefficient and painful that in practice anyone working
> with such files would be using ext4, and not a really ancient ext3
> file system.
> 
> The fix might be as simple as this, but I haven't had a chance to test
> it and do appropriate regression tests....
> 
> diff --git a/e2fsck/pass1.c b/e2fsck/pass1.c
> index eb73922d3..e460a75f4 100644
> --- a/e2fsck/pass1.c
> +++ b/e2fsck/pass1.c
> @@ -3842,7 +3842,7 @@ static int process_block(ext2_filsys fs,
>  		problem = PR_1_TOOBIG_DIR;
>  	if (p->is_dir && p->num_blocks + 1 >= p->max_blocks)
>  		problem = PR_1_TOOBIG_DIR;
> -	if (p->is_reg && p->num_blocks + 1 >= p->max_blocks)
> +	if (p->is_reg && p->num_blocks + 1 >= 1U << 31)

Hmm -- num_blocks is ... the number of "extent records", right?  And on
a !extents file, each block mapped by an {in,}direct block counts as a
separate "extent record", right?

In that case, I think (1U<<31) isn't quite right, because the very large
file could have an ACL block, or (shudder) a "hurd translator block".
So that's (1U<<31) + 2 for !extents files.

For extents files, shouldn't this be (1U<<48) + 2?  Since you /could/
create a horrifyingly large extent tree with a hojillion little
fragments, right?  Even if it took a million years to create such a
monster? :)

--D

>  		problem = PR_1_TOOBIG_REG;
>  	if (!p->is_dir && !p->is_reg && blockcnt > 0)
>  		problem = PR_1_TOOBIG_SYMLINK;
> 
> 
> 						- Ted
> 


* Re: e2fsck max blocks for huge non-extent file
From: Theodore Ts'o @ 2025-01-13 19:26 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Catalin Patulea, linux-ext4, Kazuya Mio

On Mon, Jan 13, 2025 at 10:35:17AM -0800, Darrick J. Wong wrote:
> 
> Hmm -- num_blocks is ... the number of "extent records", right?  And on
> a !extents file, each block mapped by an {in,}direct block counts as a
> separate "extent record", right?
> 
> In that case, I think (1U<<31) isn't quite right, because the very large
> file could have an ACL block, or (shudder) a "hurd translator block".
> So that's (1U<<31) + 2 for !extents files.
> 
> For extents files, shouldn't this be (1U<<48) + 2?  Since you /could/
> create a horrifyingly large extent tree with a hojillion little
> fragments, right?  Even if it took a million years to create such a
> monster? :)

The code paths in question are only used for indirect mapped files.
The logic for handling extent-mapped files is check_blocks_extents()
in modern versions of e2fsprogs, which is why Catalin was only seeing
this for an ext3 file system that had huge_file enabled.

You're right though that we shouldn't be using num_blocks at all for
testing for regular files or directories that are too big, since
num_blocks includes extended attribute blocks, the ind/dind/tind
blocks, etc.  We do care about num_blocks being too big for the
!huge_file case, since for !huge_file file systems i_blocks is
denominated in 512-byte units and is only 32 bits wide.  So in that
case, we *do* care about the size of the file, including metadata
blocks, being no more than 2 TiB.

						- Ted



* Re: e2fsck max blocks for huge non-extent file
From: Catalin Patulea @ 2025-01-17  3:26 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: Darrick J. Wong, linux-ext4, Kazuya Mio

On Mon, Jan 13, 2025 at 11:33 AM Theodore Ts'o <tytso@mit.edu> wrote:
> The fix might be as simple as this, but I haven't had a chance to test
> it and do appropriate regression tests....
>
> diff --git a/e2fsck/pass1.c b/e2fsck/pass1.c
> index eb73922d3..e460a75f4 100644
> --- a/e2fsck/pass1.c
> +++ b/e2fsck/pass1.c
> @@ -3842,7 +3842,7 @@ static int process_block(ext2_filsys fs,
>                 problem = PR_1_TOOBIG_DIR;
>         if (p->is_dir && p->num_blocks + 1 >= p->max_blocks)
>                 problem = PR_1_TOOBIG_DIR;
> -       if (p->is_reg && p->num_blocks + 1 >= p->max_blocks)
> +       if (p->is_reg && p->num_blocks + 1 >= 1U << 31)
>                 problem = PR_1_TOOBIG_REG;
>         if (!p->is_dir && !p->is_reg && blockcnt > 0)
>                 problem = PR_1_TOOBIG_SYMLINK;
I can confirm that with this patch, e2fsck passes on the test image
created as shown in my original email (dd if=/dev/zero ...). I also
confirm 'make check' passes (390 tests succeeded).

Do you have any thoughts on what a practical regression test would
look like? My repro instructions require 2.1 TB of physical disk space
and root access, which I am guessing is out of the question. For my
local tests I have been using 'qemu-nbd' and QCOW2 images to reduce
the disk space requirements, but that still requires root and ~30
minutes of runtime, which seems impractical.

> ind/dind/tind blocks, etc.  We do care about num_blocks being too big
> for the !huge_file case, since for !huge_file file systems i_blocks is
> denominated in 512-byte units and is only 32 bits wide.  So in that
> case, we *do* care about the size of the file, including metadata
> blocks, being no more than 2 TiB.
In the proposed patch, "p->num_blocks + 1 >= 1U << 31" caps the count
at 2^31 512-byte blocks; would that limit file size to 1 TB?




