linux-ext4.vger.kernel.org archive mirror
From: "Darrick J. Wong" <djwong@kernel.org>
To: Theodore Ts'o <tytso@mit.edu>
Cc: Catalin Patulea <cronos586@gmail.com>,
	linux-ext4@vger.kernel.org, Kazuya Mio <k-mio@sx.jp.nec.com>
Subject: Re: e2fsck max blocks for huge non-extent file
Date: Mon, 13 Jan 2025 10:35:17 -0800
Message-ID: <20250113183517.GC6152@frogsfrogsfrogs>
In-Reply-To: <20250113163345.GO1284777@mit.edu>

On Mon, Jan 13, 2025 at 11:33:45AM -0500, Theodore Ts'o wrote:
> On Mon, Jan 13, 2025 at 12:49:19AM -0500, Catalin Patulea wrote:
> > 
> > I have an ext3 filesystem on which I manually enabled huge_file
> > (files >2 TB) using tune2fs; then created a 3 TB file (backup image
> > of another disk).  Now, I am running e2fsck and it reports errors:
> 
> Hmm, it looks like this has been broken for a while.  I've done a
> quick look, and it appears this has been the case since e2fsprogs
> 1.28 and this commit:
> 
> commit da307041e75bdf3b24c1eb43132a4f9d8a1b3844
> Author: Theodore Ts'o <tytso@mit.edu>
> Date:   Tue May 21 21:19:14 2002 -0400
> 
>     Check for inodes which are too big (either too many blocks, or
>     would cause i_size to be too big), and offer to truncate the inode.
>     Remove old bogus i_size checks.
>     
>     Add test case which tests e2fsck's handling of large sparse files.
>     Older e2fsck with the old(er) bogus i_size checks didn't handle
>     this correctly.
> 
> I think no one noticed since trying to support files this large on a
> non-extent file is so inefficient and painful that in practice anyone
> trying to use files this large would be using ext4, and not a really
> ancient ext3 file system.
> 
> The fix might be as simple as this, but I haven't had a chance to test
> it and do appropriate regression tests....
> 
> diff --git a/e2fsck/pass1.c b/e2fsck/pass1.c
> index eb73922d3..e460a75f4 100644
> --- a/e2fsck/pass1.c
> +++ b/e2fsck/pass1.c
> @@ -3842,7 +3842,7 @@ static int process_block(ext2_filsys fs,
>  		problem = PR_1_TOOBIG_DIR;
>  	if (p->is_dir && p->num_blocks + 1 >= p->max_blocks)
>  		problem = PR_1_TOOBIG_DIR;
> -	if (p->is_reg && p->num_blocks + 1 >= p->max_blocks)
> +	if (p->is_reg && p->num_blocks + 1 >= 1U << 31)

Hmm -- num_blocks is ... the number of "extent records", right?  And on
a !extents file, each block mapped by an {in,}direct block counts as a
separate "extent record", right?

In that case, I think (1U<<31) isn't quite right, because the very large
file could have an ACL block, or (shudder) a "hurd translator block".
So that's (1U<<31) + 2 for !extents files.

For extents files, shouldn't this be (1U<<48) + 2?  Since you /could/
create a horrifyingly large extent tree with a hojillion little
fragments, right?  Even if it took a million years to create such a
monster? :)
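
To make the arithmetic above concrete, here is a minimal sketch of the
two suggested limits.  The helper names (toobig_limit, reg_file_too_big)
and the EXTRA_METADATA_BLOCKS constant are hypothetical and do not exist
in e2fsprogs; the comparison simply reuses the "num_blocks + 1 >= limit"
form from the quoted patch, and none of this has been tested against
e2fsck itself.  Note that the extents case needs a 64-bit constant,
since 1U << 48 does not fit in a 32-bit unsigned int.

```c
#include <stdint.h>

/* Hypothetical names for illustration only; not part of e2fsprogs. */
#define EXTRA_METADATA_BLOCKS 2	/* one ACL block + one hurd translator block */

/*
 * Limit suggested in the reply above: 2^31 mapped blocks for a
 * non-extent file, 2^48 for an extent-mapped file, plus two blocks
 * that are not part of the file's data mapping.
 */
static uint64_t toobig_limit(int uses_extents)
{
	/* UINT64_C keeps the shift in 64 bits; 1U << 48 would overflow. */
	uint64_t logical = uses_extents ? (UINT64_C(1) << 48)
					: (UINT64_C(1) << 31);

	return logical + EXTRA_METADATA_BLOCKS;
}

/* Same comparison shape as the quoted patch, with the limit swapped in. */
static int reg_file_too_big(uint64_t num_blocks, int uses_extents)
{
	return num_blocks + 1 >= toobig_limit(uses_extents);
}
```

With these assumptions, a non-extent file would trip the check only once
num_blocks + 1 reaches (1U<<31) + 2, rather than at (1U<<31) as in the
quoted patch.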

--D

>  		problem = PR_1_TOOBIG_REG;
>  	if (!p->is_dir && !p->is_reg && blockcnt > 0)
>  		problem = PR_1_TOOBIG_SYMLINK;
> 
> 
> 						- Ted
> 
