public inbox for linux-ext4@vger.kernel.org
From: Baokun Li <libaokun@linux.alibaba.com>
To: Theodore Tso <tytso@mit.edu>
Cc: Ext4 Developers List <linux-ext4@vger.kernel.org>,
	libaokun@linux.alibaba.com
Subject: Re: [PATCH] Fix default orphan file size calculations
Date: Thu, 12 Mar 2026 12:58:28 +0800
Message-ID: <bb18fb00-3875-458d-9cf9-589d9cf56aef@linux.alibaba.com>
In-Reply-To: <20260311172755.GA74864@macsyma-wired.lan>

Hi Ted,

On 3/12/26 1:27 AM, Theodore Tso wrote:
> On Thu, Mar 12, 2026 at 12:27:07AM +0800, Baokun Li wrote:
>> In fact, the patch that introduced it was not the latest version —
>> there was a later v3 that is consistent with the upstream kernel:
> Hmm, I wonder why b4 didn't pick up the newer version of the patch.
> Maybe I screwed up and missed the -c option to "b4 am -c".
>
> The main difference between my proposed fix and your v3 patch is that
> if the user doesn't specify an explicit orphan file size, with the v3
> patch, it might max out to 8MB when the block size is 64k.  With my
> fix, it will max out to 2MB in those situations.  The user can
> explicitly specify an orphan file size of 8MB, but it makes the default
> to be 2MB.

Yes, but older mkfs versions with the previous default settings could
create an orphan file of up to 512 blocks. On 64 KB page-size systems,
that means a 64 KB-block ext4 fs with a 32 MB orphan file, so capping the
maximum size at 8 MB would break forward compatibility.

This is also why kernel commit 7c11c56eb32e ("ext4: align max orphan file
size with e2fsprogs limit") changed the limit to 512 blocks instead.
That at least preserves compatibility for filesystems created with the
default mkfs options.
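
To make the arithmetic concrete, here is a small sketch of how a 512-block
orphan file compares against an 8 MB byte cap at different block sizes.
The 512-block and 8 MB figures are the ones from this thread; the function
and constant names are made up for illustration and are not kernel or
e2fsprogs symbols.

```python
# Illustrative only: names below are hypothetical, not real kernel or
# e2fsprogs identifiers. The 512-block limit and the 8 MB cap are the
# figures discussed in this thread.

KIB = 1024
MIB = 1024 * KIB

MAX_ORPHAN_BLOCKS = 512   # block-count limit discussed in the thread
BYTE_CAP = 8 * MIB        # byte-based cap discussed in the thread


def orphan_file_bytes(blocks: int, block_size: int) -> int:
    """Size in bytes of an orphan file with `blocks` blocks."""
    return blocks * block_size


def fits_byte_cap(blocks: int, block_size: int, cap: int = BYTE_CAP) -> bool:
    """True if an orphan file of `blocks` blocks stays within `cap` bytes."""
    return orphan_file_bytes(blocks, block_size) <= cap


if __name__ == "__main__":
    for block_size in (4 * KIB, 16 * KIB, 64 * KIB):
        size = orphan_file_bytes(MAX_ORPHAN_BLOCKS, block_size)
        ok = fits_byte_cap(MAX_ORPHAN_BLOCKS, block_size)
        print(f"{block_size // KIB} KB blocks: "
              f"{size // MIB} MB, within 8 MB cap: {ok}")
```

With 64 KB blocks, 512 blocks comes to 32 MB, which is the case an 8 MB
byte cap would reject.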

>
> In retrospect, I think we went wrong when we capped the orphan file in
> terms of bytes instead of blocks in the kernel.  When the block size
> is 64k, 8MB is only 32 blocks.  When the block size is 4k, the orphan
> file size can be up to 512 blocks.  And the scalability is really a
> function of the number of blocks, not the number of bytes, since with
> the orphan file, we use a hash that maps the cpu number to a logical
> block number in the orphan file.

Yes, limiting it by the number of blocks is simpler, and that is exactly
what kernel commit 7c11c56eb32e does.
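
As a rough illustration of why the block count, rather than the byte size,
is what matters for scalability: if each CPU is mapped to one orphan file
block, the number of CPUs sharing a block depends only on how many blocks
exist. The mapping below is a plain modulo chosen as an assumption for the
sketch; it is not the kernel's actual hash, and the function names are
hypothetical.

```python
from collections import Counter


def block_for_cpu(cpu: int, nr_blocks: int) -> int:
    """Hypothetical CPU-to-block mapping (plain modulo, an assumption)."""
    return cpu % nr_blocks


def max_cpus_per_block(nr_cpus: int, nr_blocks: int) -> int:
    """Worst-case number of CPUs that land on the same orphan block."""
    counts = Counter(block_for_cpu(cpu, nr_blocks) for cpu in range(nr_cpus))
    return max(counts.values())


if __name__ == "__main__":
    # Block size never enters the calculation; only the block count does.
    for nr_blocks in (32, 128, 512):
        worst = max_cpus_per_block(256, nr_blocks)
        print(f"{nr_blocks} blocks, 256 CPUs: up to {worst} CPUs per block")
```

Under this assumed mapping, 256 CPUs over 32 blocks put 8 CPUs on each
block, while 512 blocks give every CPU a block of its own.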

>
> Now, I suspect 32 blocks is plenty most of the time,
> since it's unlikely we'll have that many running processes trying to
> truncate files or something else that requires adding the inode to the
> orphan file.
>
> So I thought about just using a default orphan inode size of 32 file
> system blocks.  I also thought about changing the kernel to allow size
> of the orphan file to be say, up to 256 blocks.  And also maybe
> allowing mke2fs and tune2fs to accept an extended option
> orphan_file_blocks which takes an argument denominated in blocks.
>
> Ultimately, though, *most* of the time, consuming 512MB on the orphan
> file inode is not a problem if the file system is say, 2TB.  So I
> decided it wasn't
> worth the effort to change how things worked.  But if we were starting
> from scratch, I think we would have been better off doing things in
> terms of blocks, instead of bytes.
>
> But maybe we should go and make that change.  What do folks think?
>
I’d prefer reverting e2fsprogs commit 6f03c698ef53 and taking v3 instead.
It looks like the simplest fix for now, and it should still preserve some
compatibility.


Regards,
Baokun



Thread overview: 5+ messages
2026-03-11 14:11 [PATCH] Fix default orphan file size calculations Theodore Ts'o
2026-03-11 15:01 ` Andreas Dilger
2026-03-11 16:27 ` Baokun Li
2026-03-11 17:27   ` Theodore Tso
2026-03-12  4:58     ` Baokun Li [this message]
