public inbox for linux-ext4@vger.kernel.org
From: "Theodore Tso" <tytso@mit.edu>
To: Daniel Tang <danielzgtg.opensource@gmail.com>
Cc: linux-ext4@vger.kernel.org, "Darrick J. Wong" <djwong@kernel.org>
Subject: Re: [PATCH e2fsprogs] e2fsck: preen inline data no attr
Date: Sun, 8 Mar 2026 00:04:03 -0500
Message-ID: <20260308050403.GA61017@macsyma.local>
In-Reply-To: <25105329.ouqheUzb2q@daniel-desktop3>

On Sat, Mar 07, 2026 at 08:42:04PM -0500, Daniel Tang wrote:
> 
> 
> > More importantly, information about the source of the inconsistency
> > report would be written to the superblock
> 
> What could have the opportunity to write anything to the superblock?
> Before a panic, there's no inconsistency.

I don't know that.  It's possible that the kernel could have detected
an inconsistency, and if you've left the default errors=continue
behavior, this will cause an EXT4-fs error log, but if people don't pay
attention to the console log messages, they might not realize it.
That's why I needed to rule it out.

> After a panic, Linux would
> say "not syncing", or after a panic, hardware stops before new writes
> can reach the disk. systemd, as shown by `systemd-analyze plot` runs
> fsck before attempting any `.mount`.

The problem is that *if* the file system does not have ERROR_FS set
before the crash, *then* the fsck log that you sent me can't
*possibly* have been the first fsck run after the crash.

That's because this message:

root@daniel-tablet1:~# fsck.ext4 -p /dev/nvme0n1p7 # For example
/dev/nvme0n1p7 contains a file system with errors, check forced.

... means that the ERROR_FS bit has already been set.  And the only
two entities that could have set that bit are (a) the kernel, or (b)
fsck.ext4.
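
To make "the ERROR_FS bit" concrete, here is a minimal Python sketch that
reads the state field from a raw superblock.  It assumes the standard
ext2/3/4 on-disk layout (primary superblock at byte offset 1024, s_magic
at offset 0x38 and s_state at offset 0x3A, both little-endian 16-bit).
The function names are made up for this example; `dumpe2fs -h` is the
supported way to look at the same field.

```python
import struct

SUPERBLOCK_OFFSET = 1024   # primary superblock starts 1 KiB into the device
S_MAGIC_OFFSET = 0x38      # s_magic, little-endian u16
S_STATE_OFFSET = 0x3A      # s_state, little-endian u16
EXT_MAGIC = 0xEF53         # magic number shared by ext2/3/4
VALID_FS = 0x0001          # file system was cleanly unmounted
ERROR_FS = 0x0002          # errors were detected


def parse_state(sb: bytes) -> int:
    """Return s_state from raw superblock bytes (hypothetical helper)."""
    (magic,) = struct.unpack_from("<H", sb, S_MAGIC_OFFSET)
    if magic != EXT_MAGIC:
        raise ValueError(f"bad magic 0x{magic:04x}, not an ext2/3/4 superblock")
    (state,) = struct.unpack_from("<H", sb, S_STATE_OFFSET)
    return state


def has_error_fs(device: str) -> bool:
    """Read the primary superblock of `device` and test the ERROR_FS bit."""
    with open(device, "rb") as f:
        f.seek(SUPERBLOCK_OFFSET)
        return bool(parse_state(f.read(1024)) & ERROR_FS)
```

Reading a block device this way usually needs root, and it should only be
done on an unmounted or read-only file system if the result is meant to
be trusted.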

Now yes, most distributions will run fsck before mounting.  But
normally, all fsck.ext4 will do is replay the journal (if necessary),
and then perform basic sanity checks on the superblock.  It doesn't do
a full check of the file system unless (1) something is obviously
wrong with the superblock, (2) the ERROR_FS bit is set (check forced
message above), or (3) the user has explicitly requested a full fsck
by running fsck with the -f flag.
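
The three cases above can be written down as a small decision function.
This is only a sketch of the logic as described in this message, not
e2fsck's actual code; the function and argument names are hypothetical.

```python
from typing import Optional


def full_check_reason(superblock_sane: bool,
                      error_fs: bool,
                      force_flag: bool) -> Optional[str]:
    """Return why a full check would run, or None for the quick path.

    Encodes only the three cases listed in the message above.
    """
    if not superblock_sane:
        return "superblock looks wrong"
    if error_fs:
        return "ERROR_FS set (check forced)"
    if force_flag:
        return "-f given"
    # Otherwise: journal replay (if needed) plus basic superblock checks.
    return None
```

With this model, a log that starts with "check forced" corresponds to the
second branch, which is only reachable if something set ERROR_FS earlier.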

So if the theory that the ERROR_FS bit was not set by the kernel is
correct, then there must be an fsck run where (a) the "check forced"
message is not present (so the ERROR_FS bit is not yet set), and (b)
the fsck log shows that the fsck ran into some kind of major
difficulty or something obviously wrong with the file system leading
to the ERROR_FS bit being set.  It was this log that I was asking if
you could find, since the one that you sent me had the "check forced"
message, meaning ERROR_FS was already set.

> Inline data is for mostly-reading
> 30,000 mostly-small Javascript files totalling 100 MiB.

You actually have a lot of Javascript files which are smaller than 160
bytes or so?  That's.... surprising.
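
One way to check how many files could actually benefit is to count them.
The sketch below walks a tree and counts regular files with a given
suffix that are at or below a size limit.  The 160-byte default is taken
from the rough figure above and is an assumption; the real inline data
limit depends on the inode size and on what other extended attributes
the inode carries.  The function name is made up for this example.

```python
import os
import stat


def count_small_files(root: str, suffix: str = ".js", limit: int = 160):
    """Return (small, total) counts of regular files under `root`.

    `small` counts files whose size is <= `limit` bytes.
    """
    small = total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(suffix):
                continue
            try:
                st = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if not stat.S_ISREG(st.st_mode):
                continue  # ignore symlinks, sockets, and so on
            total += 1
            if st.st_size <= limit:
                small += 1
    return small, total
```

If `small` is a small fraction of `total`, inline_data is probably not
buying much for that tree.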

>  Fast commit is
> for monthly-apt-upgrading 250,000-max (TeX Live) 300-average
> (google-chrome-stable) files totaling 64 GiB-max 2 GiB-average.

Fast commit only happens if you have workloads where you need fsync(2)
to be fast, and you don't mind writing some extra blocks 5 seconds
later when the large (non-fast) journal commit takes place.
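
Since the benefit is about fsync(2) latency, that is the thing to
measure.  Here is a minimal sketch that writes small files in a directory
and times each fsync call.  The file names, sizes, and counts are
arbitrary choices for illustration; run it on the same file system with
and without fast_commit to get a rough comparison.

```python
import os
import time


def time_fsyncs(directory: str, count: int = 100, size: int = 4096) -> list:
    """Write `count` files of `size` bytes, fsync each, return seconds per fsync."""
    payload = b"x" * size
    timings = []
    for i in range(count):
        path = os.path.join(directory, f"fsync-probe-{i}.tmp")
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
        try:
            os.write(fd, payload)
            start = time.perf_counter()
            os.fsync(fd)  # only the fsync itself is timed
            timings.append(time.perf_counter() - start)
        finally:
            os.close(fd)
            os.unlink(path)  # clean up the probe file
    return timings
```

A microbenchmark like this says nothing about total throughput; it only
shows whether individual fsync calls get cheaper.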

> * apt-get is 7% (48.018 s) faster with fast_commit writes

That's.... surprising.

Ah... looking at what you were doing, it appears you were setting
fast_commit not on the root file system itself, but on the file system
where /var/lib/containerd is located.  I'm going to guess that fsync's
being issued by apt are getting amplified by whatever
docker/containerd is doing with the writes plus fsync's to the
writeable image layer file.

If that's your actual use case, try installing the apt-eatmydata
package in your Dockerfile, and/or before you do this:

# time docker run --rm -v /run/archives:/mnt ubuntu:24.04 bash -c 'dpkg --force-all -i /mnt/*.deb ; sync'

If you were just using this as a proxy for your real world use
case.... I'd suggest finding a different way of measuring the benefits
of fast commit.  That is, if what you *really* care about is running
"apt-get" on a real, bare metal system (and not in your containers),
then you need to measure that.  Trying to use docker run as a proxy is
going to be misleading.

But if you are really trying to improve container build times, see this article:

https://blog.kronis.dev/blog/increase-container-build-speeds-when-you-use-apt

It uses the eatmydata package directly, instead of using
"apt-eatmydata", but it basically points out why having apt issue a
huge number of fsyncs on what are generally disposable images in a CI
run, or while building new docker images, is just a waste of resources.

Cheers,

						- Ted

Thread overview: 7+ messages
2026-03-04 13:56 [PATCH e2fsprogs] e2fsck: preen inline data no attr Daniel Tang
2026-03-06 11:16 ` Andreas Dilger
2026-03-06 15:51 ` Theodore Tso
2026-03-06 18:03   ` Daniel Tang
2026-03-06 22:23     ` Theodore Tso
2026-03-08  1:42       ` Daniel Tang
2026-03-08  5:04         ` Theodore Tso [this message]
