public inbox for kdevops@lists.linux.dev
* ext4 v6.15-rc2 baseline
@ 2025-04-16 17:56 Luis Chamberlain
  2025-04-16 23:34 ` Theodore Ts'o
  0 siblings, 1 reply; 16+ messages in thread
From: Luis Chamberlain @ 2025-04-16 17:56 UTC (permalink / raw)
  To: tytso, adilger.kernel, linux-ext4; +Cc: Luis Chamberlain, kdevops, dave, jack

ext4 developers,

kdevops has run fstests on v6.15-rc2 across the different ext4 profiles
it currently defines, and the results are below.

The profiles which kdevops currently supports are:

 - ext4_defaults
 - ext4_1k
 - ext4_2k
 - ext4_4k
 - ext4_advanced_features
 - ext4_bigalloc16k_4k
 - ext4_bigalloc32k_4k
 - ext4_bigalloc64k_4k
 - ext4_bigalloc1024k_4k
 - ext4_bigalloc2048k_4k

These are defined in the ext4 jinja2 template in kdevops [0] and described
in the ext4 Kconfig [1]. Adding support for more profiles is just a matter
of editing these two files; please feel free to send a patch if you'd like
kdevops to test more profiles. A full tarball of the fstests results is
available in kdevops-results-archive [2]. Since we leverage git-lfs, you can
opt to download only this single tarball as follows:

GIT_LFS_SKIP_SMUDGE=1 git clone https://github.com/linux-kdevops/kdevops-results-archive.git
cd kdevops-results-archive
git lfs fetch --include "fstests/gh/linux-ext4-kpd/20250415/0001/linux-6-15-rc2/8ffd015db85f.xz"
git lfs checkout "fstests/gh/linux-ext4-kpd/20250415/0001/linux-6-15-rc2/8ffd015db85f.xz"

A few questions:

 - Is this useful information?
 - Do you want results for each rc release posted to the mailing list?

[0] https://github.com/linux-kdevops/kdevops/blob/main/playbooks/roles/fstests/templates/ext4/ext4.config
[1] https://github.com/linux-kdevops/kdevops/blob/main/workflows/fstests/ext4/Kconfig
[2] https://github.com/linux-kdevops/kdevops-results-archive/commit/a74831cc4300e702eef9bafd31cc5dc4b8dda5e8

  workflow: fstests
  tree: linux
  ref: 8ffd015db85f
  test number: 0001

Detailed test report:

KERNEL:    6.15.0-rc2-g8ffd015db85f
CPUS:      8
MEMORY:    4 GiB

ext4_defaults: 793 tests, 20 failures, 271 skipped, 10397 seconds
  Failures: ext4/034 ext4/055 generic/082 generic/219 generic/223
    generic/230 generic/231 generic/232 generic/233 generic/235
    generic/270 generic/381 generic/382 generic/566 generic/587
    generic/600 generic/601 generic/681 generic/682 generic/741
ext4_1k: 793 tests, 19 failures, 326 skipped, 10898 seconds
  Failures: ext4/034 ext4/055 generic/082 generic/219 generic/223
    generic/230 generic/231 generic/232 generic/233 generic/235
    generic/381 generic/382 generic/566 generic/587 generic/600
    generic/601 generic/681 generic/682 generic/741
ext4_2k: 793 tests, 19 failures, 323 skipped, 9737 seconds
  Failures: ext4/034 ext4/055 generic/082 generic/219 generic/223
    generic/230 generic/231 generic/232 generic/233 generic/235
    generic/381 generic/382 generic/566 generic/587 generic/600
    generic/601 generic/681 generic/682 generic/741
ext4_4k: 793 tests, 19 failures, 320 skipped, 9026 seconds
  Failures: ext4/034 ext4/055 generic/082 generic/219 generic/223
    generic/230 generic/231 generic/232 generic/233 generic/235
    generic/381 generic/382 generic/566 generic/587 generic/600
    generic/601 generic/681 generic/682 generic/741
ext4_bigalloc2048k_4k: 793 tests, 43 failures, 357 skipped, 7481 seconds
  Failures: ext4/033 ext4/034 ext4/045 ext4/055 generic/075
    generic/082 generic/091 generic/112 generic/127 generic/219
    generic/230 generic/231 generic/232 generic/233 generic/234
    generic/235 generic/251 generic/263 generic/280 generic/365
    generic/381 generic/382 generic/435 generic/471 generic/566
    generic/587 generic/600 generic/601 generic/614 generic/629
    generic/634 generic/635 generic/643 generic/645 generic/676
    generic/681 generic/682 generic/698 generic/732 generic/736
    generic/738 generic/741 generic/754
ext4_bigalloc1024k_4k: 793 tests, 39 failures, 350 skipped, 7800 seconds
  Failures: ext4/033 ext4/034 ext4/045 ext4/055 generic/075
    generic/082 generic/091 generic/112 generic/127 generic/219
    generic/230 generic/231 generic/232 generic/233 generic/234
    generic/235 generic/251 generic/263 generic/280 generic/365
    generic/381 generic/382 generic/435 generic/566 generic/587
    generic/600 generic/601 generic/614 generic/629 generic/634
    generic/635 generic/643 generic/681 generic/682 generic/698
    generic/732 generic/738 generic/741 generic/754
ext4_bigalloc32k_4k: 793 tests, 27 failures, 350 skipped, 8434 seconds
  Failures: ext4/033 ext4/034 ext4/055 generic/075 generic/082
    generic/091 generic/112 generic/127 generic/219 generic/223
    generic/230 generic/231 generic/232 generic/233 generic/234
    generic/235 generic/263 generic/280 generic/381 generic/382
    generic/566 generic/587 generic/600 generic/601 generic/681
    generic/682 generic/741
ext4_bigalloc64k_4k: 793 tests, 27 failures, 350 skipped, 8575 seconds
  Failures: ext4/033 ext4/034 ext4/055 generic/075 generic/082
    generic/091 generic/112 generic/127 generic/219 generic/223
    generic/230 generic/231 generic/232 generic/233 generic/234
    generic/235 generic/263 generic/280 generic/381 generic/382
    generic/566 generic/587 generic/600 generic/601 generic/681
    generic/682 generic/741
ext4_bigalloc16k_4k: 793 tests, 27 failures, 350 skipped, 8755 seconds
  Failures: ext4/033 ext4/034 ext4/055 generic/075 generic/082
    generic/091 generic/112 generic/127 generic/219 generic/223
    generic/230 generic/231 generic/232 generic/233 generic/234
    generic/235 generic/263 generic/280 generic/381 generic/382
    generic/566 generic/587 generic/600 generic/601 generic/681
    generic/682 generic/741
ext4_advanced_features: 793 tests, 21 failures, 279 skipped, 10373 seconds
  Failures: ext4/034 ext4/055 generic/082 generic/219 generic/223
    generic/230 generic/231 generic/232 generic/233 generic/235
    generic/270 generic/381 generic/382 generic/477 generic/566
    generic/587 generic/600 generic/601 generic/681 generic/682
    generic/741
Totals: 7930 tests, 3276 skipped, 261 failures, 0 errors, 82423s
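
The per-profile lines above can be cross-checked against the Totals line
with a few lines of Python. This is only a sketch: the regular expression
and the parse_summary() helper are illustrative names, not part of kdevops,
and the parser assumes exactly the "profile: N tests, N failures, N skipped,
N seconds" layout used in this report.

```python
import re

# Per-profile summary lines copied from the report above.
REPORT = """\
ext4_defaults: 793 tests, 20 failures, 271 skipped, 10397 seconds
ext4_1k: 793 tests, 19 failures, 326 skipped, 10898 seconds
ext4_2k: 793 tests, 19 failures, 323 skipped, 9737 seconds
ext4_4k: 793 tests, 19 failures, 320 skipped, 9026 seconds
ext4_bigalloc2048k_4k: 793 tests, 43 failures, 357 skipped, 7481 seconds
ext4_bigalloc1024k_4k: 793 tests, 39 failures, 350 skipped, 7800 seconds
ext4_bigalloc32k_4k: 793 tests, 27 failures, 350 skipped, 8434 seconds
ext4_bigalloc64k_4k: 793 tests, 27 failures, 350 skipped, 8575 seconds
ext4_bigalloc16k_4k: 793 tests, 27 failures, 350 skipped, 8755 seconds
ext4_advanced_features: 793 tests, 21 failures, 279 skipped, 10373 seconds
"""

# Hypothetical pattern for one summary line; the failures field is optional
# because clean profiles in other reports omit it.
LINE_RE = re.compile(
    r"^(?P<profile>\S+): (?P<tests>\d+) tests, "
    r"(?:(?P<failures>\d+) failures, )?"
    r"(?P<skipped>\d+) skipped, (?P<seconds>\d+) seconds$"
)


def parse_summary(text):
    """Return one dict per profile line that matches LINE_RE."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            rows.append({
                "profile": m["profile"],
                "tests": int(m["tests"]),
                "failures": int(m["failures"] or 0),
                "skipped": int(m["skipped"]),
            })
    return rows


rows = parse_summary(REPORT)
totals = {k: sum(r[k] for r in rows) for k in ("tests", "failures", "skipped")}
print(len(rows), totals)
```

The summed test, failure and skip counts agree with the Totals line
(7930 tests, 261 failures, 3276 skipped).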

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: ext4 v6.15-rc2 baseline
  2025-04-16 17:56 ext4 v6.15-rc2 baseline Luis Chamberlain
@ 2025-04-16 23:34 ` Theodore Ts'o
  2025-04-17 16:38   ` Darrick J. Wong
  2025-04-17 16:49   ` Theodore Ts'o
  0 siblings, 2 replies; 16+ messages in thread
From: Theodore Ts'o @ 2025-04-16 23:34 UTC (permalink / raw)
  To: Luis Chamberlain; +Cc: adilger.kernel, linux-ext4, kdevops, dave, jack

[-- Attachment #1: Type: text/plain, Size: 1441 bytes --]

On Wed, Apr 16, 2025 at 10:56:17AM -0700, Luis Chamberlain wrote:
> ext4 developers,
> 
> kdevops has run fstests on v6.15-rc2 across the different ext4 profiles
> it currently defines, and the results are below.

Hmm, there are quite a lot of failures that aren't in my baseline.  In
particular, I work very hard to make sure the 4k profile is clean, and
as you can see in the attached file, it is.  But here's a short
summary (for the full set, including the versions used for the full
test run, see the attached file.)

ext4/4k: 587 tests, 55 skipped, 5340 seconds
ext4/1k: 581 tests, 59 skipped, 5700 seconds
ext4/ext3: 579 tests, 1 failures, 149 skipped, 4715 seconds
  Failures: ext4/028
ext4/encrypt: 562 tests, 175 skipped, 2982 seconds
ext4/nojournal: 579 tests, 127 skipped, 3955 seconds
   ...

I'll have to take a look at your test results tarball (I assume it
includes the NNN.out.bad and NNN.full files, right?) to see what's
going on.

There are some exclude files[1][2] which I use to reduce noise, but
they don't seem to explain many of the failures you have reported.

[1] https://github.com/tytso/xfstests-bld/blob/master/test-appliance/files/root/fs/global_exclude
[2] https://github.com/tytso/xfstests-bld/blob/master/test-appliance/files/root/fs/ext4/exclude

>  - Is this useful information?

Maybe; the question is why are your results so different from my results.

						- Ted

[-- Attachment #2: ext4-baseline-6.15-rc2 --]
[-- Type: text/plain, Size: 2101 bytes --]

TESTRUNID: ltm-20250414133140
KERNEL:    kernel 6.15.0-rc2-xfstests #22 SMP PREEMPT_DYNAMIC Mon Apr 14 12:18:46 EDT 2025 x86_64
CMDLINE:   --kernel gs://gce-xfstests/kernel.deb -c ext4/all -g auto
CPUS:      2
MEM:       7680

ext4/4k: 587 tests, 55 skipped, 5340 seconds
ext4/1k: 581 tests, 59 skipped, 5700 seconds
ext4/ext3: 579 tests, 1 failures, 149 skipped, 4715 seconds
  Failures: ext4/028
ext4/encrypt: 562 tests, 175 skipped, 2982 seconds
ext4/nojournal: 579 tests, 127 skipped, 3955 seconds
ext4/ext3conv: 584 tests, 57 skipped, 5164 seconds
ext4/adv: 580 tests, 2 failures, 63 skipped, 4873 seconds
  Failures: generic/757 generic/764
ext4/dioread_nolock: 585 tests, 55 skipped, 5538 seconds
ext4/data_journal: 592 tests, 8 failures, 1 errors, 135 skipped, 4464 seconds
  Failures: generic/127
  Flaky: generic/032: 20% (1/5)   generic/475: 40% (2/5)
  Errors: generic/475
ext4/bigalloc_4k: 558 tests, 1 failures, 58 skipped, 5128 seconds
  Flaky: generic/234: 20% (1/5)
ext4/bigalloc_1k: 559 tests, 69 skipped, 4965 seconds
ext4/dax: 571 tests, 2 failures, 160 skipped, 3111 seconds
  Failures: generic/344 generic/363
Totals: 6941 tests, 1162 skipped, 34 failures, 1 errors, 52426s

FSTESTIMG: gce-xfstests/xfstests-amd64-202504110828
FSTESTPRJ: gce-xfstests
FSTESTVER: blktests 236edfd (Tue, 18 Mar 2025 12:56:26 +0900)
FSTESTVER: fio  fio-3.39 (Tue, 18 Feb 2025 08:36:57 -0700)
FSTESTVER: fsverity v1.6-2-gee7d74d (Mon, 17 Feb 2025 11:41:58 -0800)
FSTESTVER: ima-evm-utils v1.5 (Mon, 6 Mar 2023 07:40:07 -0500)
FSTESTVER: libaio   libaio-0.3.108-82-gb8eadc9 (Thu, 2 Jun 2022 13:33:11 +0200)
FSTESTVER: ltp  20250130-195-ge2bbba0c1 (Fri, 11 Apr 2025 18:06:15 +0800)
FSTESTVER: quota  v4.05-69-g68952f1 (Mon, 7 Oct 2024 15:45:56 -0400)
FSTESTVER: util-linux v2.41 (Tue, 18 Mar 2025 13:50:51 +0100)
FSTESTVER: xfsprogs v6.13.0-2-gf0d16c9e (Tue, 1 Apr 2025 20:23:42 -0400)
FSTESTVER: xfstests-bld 42bcd9aa (Wed, 9 Apr 2025 07:51:57 -0400)
FSTESTVER: xfstests v2025.03.30-11-g344015670 (Mon, 31 Mar 2025 13:50:06 -0400)
FSTESTVER: zz_build-distro bookworm
FSTESTSET: -g auto
FSTESTOPT: aex


* Re: ext4 v6.15-rc2 baseline
  2025-04-16 23:34 ` Theodore Ts'o
@ 2025-04-17 16:38   ` Darrick J. Wong
  2025-04-17 18:37     ` Theodore Ts'o
  2025-04-17 16:49   ` Theodore Ts'o
  1 sibling, 1 reply; 16+ messages in thread
From: Darrick J. Wong @ 2025-04-17 16:38 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Luis Chamberlain, adilger.kernel, linux-ext4, kdevops, dave, jack

On Wed, Apr 16, 2025 at 06:34:15PM -0500, Theodore Ts'o wrote:
> TESTRUNID: ltm-20250414133140
> KERNEL:    kernel 6.15.0-rc2-xfstests #22 SMP PREEMPT_DYNAMIC Mon Apr 14 12:18:46 EDT 2025 x86_64
> CMDLINE:   --kernel gs://gce-xfstests/kernel.deb -c ext4/all -g auto
> CPUS:      2
> MEM:       7680
> 
> ext4/4k: 587 tests, 55 skipped, 5340 seconds

Hum.  My tests show:

Kernel: 6.15.0-rc2-xfsx
Format: mkfs.ext4
Mount: -o acl,user_xattr
Testing: -r -g all -x dangerous_fuzzers,recoveryloop,broken,deprecated
Failures: generic/045, generic/046, ext4/043, ext4/053, generic/697, generic/633, generic/696, generic/044
Status: Failed 8 of 553 (1.4%) tests, covering 40% of 2,077 in 2.2 hours.

mke2fs.conf has:

[defaults]
	base_features = sparse_super,filetype,resize_inode,dir_index,ext_attr
	default_mntopts = acl,user_xattr,block_validity
	enable_periodic_fsck = 0
	blocksize = 4096
	cluster_size = 32768
	inode_size = 256
	inode_ratio = 16384

[fs_types]
	ext2 = {
		inode_size = 128
	}
	ext3 = {
		features = has_journal
		inode_size = 128
	}
	ext4 = {
		features = has_journal,extent,huge_file,flex_bg,uninit_bg,dir_nlink,extra_isize,64bit,metadata_csum,quota
		inode_size = 256
		quotatype=usrquota:grpquota:prjquota:
	}


generic/04[456] fail with a bunch of:

--- /run/fstests/bin/tests/generic/045.out	2025-01-30 10:00:16.774276934 -0800
+++ /var/tmp/fstests/generic/045.out.bad	2025-04-16 14:44:41.434069835 -0700
@@ -1 +1,1000 @@
 QA output created by 045
+corrupt file /opt/1 - non-zero size but no extents
+corrupt file /opt/2 - non-zero size but no extents
+corrupt file /opt/3 - non-zero size but no extents
+corrupt file /opt/4 - non-zero size but no extents

ext4/043 seems to fail because it tries to create 128b inodes with
project ids and fails.

ext4/053 I suspect fails because built-in quota conflicts with the quota
mount options.

generic/{633,697,696} fail with:

--- /run/fstests/bin/tests/generic/697.out	2025-01-30 10:00:16.953276275 -0800
+++ /var/tmp/fstests/generic/697.out.bad	2025-04-16 15:54:39.173837150 -0700
@@ -1,2 +1,4 @@
 QA output created by 697
+utils.c: 928: openat_tmpfile_supported - Invalid argument - failure: create
+utils.c: 928: openat_tmpfile_supported - Invalid argument - failure: create
 Silence is golden

No idea what that's about.

--D


* Re: ext4 v6.15-rc2 baseline
  2025-04-16 23:34 ` Theodore Ts'o
  2025-04-17 16:38   ` Darrick J. Wong
@ 2025-04-17 16:49   ` Theodore Ts'o
  2025-04-17 20:35     ` Luis Chamberlain
  1 sibling, 1 reply; 16+ messages in thread
From: Theodore Ts'o @ 2025-04-17 16:49 UTC (permalink / raw)
  To: Luis Chamberlain; +Cc: adilger.kernel, linux-ext4, kdevops, dave, jack

On Wed, Apr 16, 2025 at 06:34:15PM -0500, Theodore Ts'o wrote:

> >  - Is this useful information?
> 
> Maybe; the question is why are your results so different from my results.

It looks like the problem is that your kernel config doesn't enable
CONFIG_QFMT_V2.  As a result, the quota feature is not supported in
the kernel under test.   From ext4/033.full file:

mount: /media/scratch: wrong fs type, bad option, bad superblock on /dev/loop5, missing codepage or helper program, or other error.
       dmesg(1) may have more information after failed mount system call.
mount -o acl,user_xattr -o dioread_nolock,nodelalloc /dev/loop5 /media/scratch failed

And from the ext4/034.dmesg file:

[  297.969763] EXT4-fs (loop5): The kernel was not built with CONFIG_QUOTA and CONFIG_QFMT_V2

Cheers,

						- Ted
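
A quick way to catch this class of problem before a run is to scan the
kernel config text for the two options named in the dmesg line. The sketch
below is an assumption-laden illustration, not something kdevops or
xfstests-bld ships: missing_options() is a hypothetical helper, and it
treats both =y and =m as "enabled".

```python
# Options named in the EXT4-fs dmesg message quoted above.
REQUIRED = ("CONFIG_QUOTA", "CONFIG_QFMT_V2")


def missing_options(config_text, required=REQUIRED):
    """Return the required options that are not set to y or m."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Skip comments such as "# CONFIG_FOO is not set" and blank lines.
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if value in ("y", "m"):
            enabled.add(name)
    return [opt for opt in required if opt not in enabled]


# Made-up config fragment, for illustration only.
sample = """\
CONFIG_EXT4_FS=y
CONFIG_QUOTA=y
# CONFIG_QFMT_V2 is not set
"""
print(missing_options(sample))
```

In practice the text would come from the .config used to build the kernel
under test.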


* Re: ext4 v6.15-rc2 baseline
  2025-04-17 16:38   ` Darrick J. Wong
@ 2025-04-17 18:37     ` Theodore Ts'o
  2025-04-17 20:56       ` Luis Chamberlain
  0 siblings, 1 reply; 16+ messages in thread
From: Theodore Ts'o @ 2025-04-17 18:37 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Luis Chamberlain, adilger.kernel, linux-ext4, kdevops, dave, jack


On Thu, Apr 17, 2025 at 09:38:20AM -0700, Darrick J. Wong wrote:
> 
> generic/04[456] fail with a bunch of...

Yeah, this is known.   I have an ext4-specific exclude file:

// generic/04[456] tests how truncate and delayed allocation works
// ext4 uses the data=ordered to avoid exposing stale data, and
// so it uses a different mechanism than xfs.  So these tests will fail
generic/044
generic/045
generic/046
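
As an illustration of how such an exclude list filters a failure report,
here is a small Python sketch. It assumes the format is simply one test
name per line with "//" starting a comment, as in the excerpt above;
parse_exclude() is a hypothetical helper and not how xfstests-bld itself
processes the file.

```python
def parse_exclude(text):
    """Return the set of test names, ignoring // comments and blank lines."""
    excluded = set()
    for line in text.splitlines():
        name = line.split("//", 1)[0].strip()
        if name:
            excluded.add(name)
    return excluded


# Exclude entries as quoted above.
EXCLUDE = """\
// generic/04[456] tests how truncate and delayed allocation works
// ext4 uses the data=ordered to avoid exposing stale data, and
// so it uses a different mechanism than xfs.  So these tests will fail
generic/044
generic/045
generic/046
"""

# Failures from the report earlier in this thread.
failures = [
    "generic/045", "generic/046", "ext4/043", "ext4/053",
    "generic/697", "generic/633", "generic/696", "generic/044",
]

excluded = parse_exclude(EXCLUDE)
unexpected = [t for t in failures if t not in excluded]
print(unexpected)
```

With these three entries applied, five of the eight reported failures
remain to be explained.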


> ext4/043 seems to fail because it tries to create 128b inodes with
> project ids and fails.

Yeah, I don't enable project id quotas by default in my test setups.

And _scratch_mkfs will fall back to using just the test's mkfs options,
so if -O quota,project is specified in MKFS_OPTIONS, then the fallback works:

Start test timestamps with 128 inode size one device /dev/vdc
** mkfs failed with extra mkfs options added to "-q -O quota,project" by test 043 **
** attempting to mkfs using only test 043 options: -I 128 **

I suppose we could explicitly add something like -O ^project to the
test, but enabling -O project isn't in the default e2fsprogs
mke2fs.conf, and there are probably all sorts of oddball mke2fs.conf
configurations that might cause tests to fail.


> ext4/053 I suspect fails because built-in quota conflicts with the quota
> mount options.

Hmm, I can't reproduce this with "kvm-xfstests -c ext4/quota
ext4/053", which will configure xfstests with:

MKFS_OPTIONS  -- -F -q -O quota,project /dev/vdc
MOUNT_OPTIONS -- -o acl,user_xattr -o block_validity /dev/vdc /vdc

Can you send me the out.bad and full files for that test?

Hmm... maybe this is another one of these "it fails if a non-standard
mke2fs.conf is used" cases, although I don't see how.


> generic/{633,697,696} fail with:
> 
> --- /run/fstests/bin/tests/generic/697.out	2025-01-30 10:00:16.953276275 -0800
> +++ /var/tmp/fstests/generic/697.out.bad	2025-04-16 15:54:39.173837150 -0700
> @@ -1,2 +1,4 @@
>  QA output created by 697
> +utils.c: 928: openat_tmpfile_supported - Invalid argument - failure: create
> +utils.c: 928: openat_tmpfile_supported - Invalid argument - failure: create
>  Silence is golden
> 
> No idea what that's about.

I don't have any idea either.  I assume there's nothing in the dmesg
for that test?  Those tests are passing for me, so I got nothing.

						- Ted


* Re: ext4 v6.15-rc2 baseline
  2025-04-17 16:49   ` Theodore Ts'o
@ 2025-04-17 20:35     ` Luis Chamberlain
  2025-04-18  1:42       ` Luis Chamberlain
  0 siblings, 1 reply; 16+ messages in thread
From: Luis Chamberlain @ 2025-04-17 20:35 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: adilger.kernel, linux-ext4, kdevops, dave, jack

On Thu, Apr 17, 2025 at 11:49:07AM -0500, Theodore Ts'o wrote:
> On Wed, Apr 16, 2025 at 06:34:15PM -0500, Theodore Ts'o wrote:
> 
> > >  - Is this useful information?
> > 
> > Maybe; the question is why are your results so different from my results.
> 
> It looks like the problem is that your kernel config doesn't enable
> CONFIG_QFMT_V2.  As a result, the quota feature is not supported in
> the kernel under test.   From ext4/033.full file:
> 
> mount: /media/scratch: wrong fs type, bad option, bad superblock on /dev/loop5, missing codepage or helper program, or other error.
>        dmesg(1) may have more information after failed mount system call.
> mount -o acl,user_xattr -o dioread_nolock,nodelalloc /dev/loop5 /media/scratch failed
> 
> And from the ext4/034.dmesg file:
> 
> [  297.969763] EXT4-fs (loop5): The kernel was not built with CONFIG_QUOTA and CONFIG_QFMT_V2

Let's see what happens when I enable quotas; pushed.

  Luis


* Re: ext4 v6.15-rc2 baseline
  2025-04-17 18:37     ` Theodore Ts'o
@ 2025-04-17 20:56       ` Luis Chamberlain
  2025-04-19 18:22         ` Theodore Ts'o
  0 siblings, 1 reply; 16+ messages in thread
From: Luis Chamberlain @ 2025-04-17 20:56 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Darrick J. Wong, adilger.kernel, linux-ext4, kdevops, dave, jack

On Thu, Apr 17, 2025 at 01:37:11PM -0500, Theodore Ts'o wrote:
> 
> On Thu, Apr 17, 2025 at 09:38:20AM -0700, Darrick J. Wong wrote:
> > 
> > generic/04[456] fail with a bunch of...
> 
> Yeah, this is known.   I have an ext4-specific exclude file:
> 
> // generic/04[456] tests how truncate and delayed allocation works
> // ext4 uses the data=ordered to avoid exposing stale data, and
> // so it uses a different mechanism than xfs.  So these tests will fail
> generic/044
> generic/045
> generic/046

Perhaps something like (not tested):

From a9386348701e387942e3eaaef8ee9daac8ace16a Mon Sep 17 00:00:00 2001
From: Luis Chamberlain <mcgrof@kernel.org>
Date: Thu, 17 Apr 2025 13:54:25 -0700
Subject: [PATCH] ext4: add ordered requirement for generic/04[456]

generic/04[456] tests how truncate and delayed allocation works.
ext4 uses the data=ordered to avoid exposing stale data, and
so it uses a different mechanism than xfs. So these tests will fail
on it.

Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 common/rc         | 19 +++++++++++++++++++
 tests/generic/044 |  1 +
 tests/generic/045 |  1 +
 tests/generic/046 |  1 +
 4 files changed, 22 insertions(+)

diff --git a/common/rc b/common/rc
index 9bed6dad9303..dd640c70f428 100644
--- a/common/rc
+++ b/common/rc
@@ -4495,6 +4495,25 @@ _exclude_test_mount_option()
 	_exclude_mount_option "$TEST_FS_MOUNT_OPTS" $@
 }
 
+_require_scratch_mount_ordered()
+{
+	[ "$FSTYP" = "ext4" ] || return
+
+	_require_scratch
+
+	local ordered_set=false
+	for opt in $(_normalize_mount_options "$MOUNT_OPTIONS"); do
+		case "$opt" in
+			data=ordered)
+				ordered_set=true
+				break
+				;;
+		esac
+	done
+
+	$ordered_set || _notrun "Test requires ext4 with data=ordered mount option"
+}
+
 _require_atime()
 {
 	_exclude_scratch_mount_option "noatime"
diff --git a/tests/generic/044 b/tests/generic/044
index 5d21875cf772..b596f66d07e8 100755
--- a/tests/generic/044
+++ b/tests/generic/044
@@ -19,6 +19,7 @@ _require_xfs_io_command "fiemap"
 _scratch_mkfs >/dev/null 2>&1
 _require_metadata_journaling $SCRATCH_DEV
 _scratch_mount
+_require_scratch_mount_ordered
 
 # create files
 i=1;
diff --git a/tests/generic/045 b/tests/generic/045
index 9904142f89ac..3ee59642239c 100755
--- a/tests/generic/045
+++ b/tests/generic/045
@@ -19,6 +19,7 @@ _require_xfs_io_command "fiemap"
 _scratch_mkfs >/dev/null 2>&1
 _require_metadata_journaling $SCRATCH_DEV
 _scratch_mount
+_require_scratch_mount_ordered
 
 # create files
 i=1;
diff --git a/tests/generic/046 b/tests/generic/046
index 5ed60c762fe8..9e77bd9573af 100755
--- a/tests/generic/046
+++ b/tests/generic/046
@@ -19,6 +19,7 @@ _require_xfs_io_command "fiemap"
 _scratch_mkfs >/dev/null 2>&1
 _require_metadata_journaling $SCRATCH_DEV
 _scratch_mount
+_require_scratch_mount_ordered
 
 # create files
 i=1;
-- 
2.47.2


* Re: ext4 v6.15-rc2 baseline
  2025-04-17 20:35     ` Luis Chamberlain
@ 2025-04-18  1:42       ` Luis Chamberlain
  2025-04-18  3:56         ` Theodore Ts'o
  0 siblings, 1 reply; 16+ messages in thread
From: Luis Chamberlain @ 2025-04-18  1:42 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: adilger.kernel, linux-ext4, kdevops, dave, jack

On Thu, Apr 17, 2025 at 01:35:28PM -0700, Luis Chamberlain wrote:
> On Thu, Apr 17, 2025 at 11:49:07AM -0500, Theodore Ts'o wrote:
> > On Wed, Apr 16, 2025 at 06:34:15PM -0500, Theodore Ts'o wrote:
> > 
> > > >  - Is this useful information?
> > > 
> > > Maybe; the question is why are your results so different from my results.
> > 
> > It looks like the problem is that your kernel config doesn't enable
> > CONFIG_QFMT_V2.  As a result, the quota feature is not supported in
> > the kernel under test.   From ext4/033.full file:
> > 
> > mount: /media/scratch: wrong fs type, bad option, bad superblock on /dev/loop5, missing codepage or helper program, or other error.
> >        dmesg(1) may have more information after failed mount system call.
> > mount -o acl,user_xattr -o dioread_nolock,nodelalloc /dev/loop5 /media/scratch failed
> > 
> > And from the ext4/034.dmesg file:
> > 
> > [  297.969763] EXT4-fs (loop5): The kernel was not built with CONFIG_QUOTA and CONFIG_QFMT_V2
> 
> Let's see what happens when I enable quotas; pushed.

That helped: it brought the test failures down from 261 to 170, with the
success rate improving from 55.4% to 57.9%.
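
For reference, both percentages can be reproduced from the Totals lines of
the two runs if "success rate" is taken to mean tests that neither failed
nor were skipped, divided by all tests. That definition is an assumption
that happens to match the figures; pass_rate() is an illustrative helper.

```python
def pass_rate(tests, skipped, failures):
    """Percentage of tests that ran and passed, out of all tests."""
    passed = tests - skipped - failures
    return 100.0 * passed / tests


# Totals from the first run and from the run with quota enabled.
before = pass_rate(tests=7930, skipped=3276, failures=261)
after = pass_rate(tests=7930, skipped=3171, failures=170)
print(f"{before:.1f}% -> {after:.1f}%")  # prints "55.4% -> 57.9%"
```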

Dashboards for both results, the old one:

https://kdevops.org/ext4/v6.15-rc2-a74831cc.html

The latest one:

https://kdevops.org/ext4/v6.15-rc2.html

Detailed test results:

https://github.com/linux-kdevops/kdevops-results-archive/commit/a051dea3db9fcc7e164c1d027264e181b68833e0

And so the new file name for results is

fstests/gh/linux-ext4-kpd/20250417/0001/linux-6-15-rc2/8ffd015db85f.xz

Detailed test results below:

KERNEL:    6.15.0-rc2-g8ffd015db85f
CPUS:      8

ext4_defaults: 793 tests, 2 failures, 259 skipped, 10521 seconds
  Failures: generic/223 generic/741
ext4_4k: 793 tests, 2 failures, 308 skipped, 9837 seconds
  Failures: generic/223 generic/741
ext4_2k: 793 tests, 2 failures, 311 skipped, 10017 seconds
  Failures: generic/223 generic/741
ext4_advanced_features: 793 tests, 3 failures, 267 skipped, 10416 seconds
  Failures: generic/223 generic/477 generic/741
ext4_1k: 793 tests, 2 failures, 314 skipped, 10813 seconds
  Failures: generic/223 generic/741
ext4_bigalloc16k_4k: 793 tests, 26 failures, 341 skipped, 8856 seconds
  Failures: ext4/033 generic/075 generic/082 generic/091 generic/112
    generic/127 generic/219 generic/223 generic/230 generic/231
    generic/232 generic/233 generic/234 generic/235 generic/263
    generic/280 generic/381 generic/382 generic/566 generic/587
    generic/600 generic/601 generic/681 generic/682 generic/691
    generic/741
ext4_bigalloc32k_4k: 793 tests, 26 failures, 341 skipped, 8678 seconds
  Failures: ext4/033 generic/075 generic/082 generic/091 generic/112
    generic/127 generic/219 generic/223 generic/230 generic/231
    generic/232 generic/233 generic/234 generic/235 generic/263
    generic/280 generic/381 generic/382 generic/566 generic/587
    generic/600 generic/601 generic/681 generic/682 generic/691
    generic/741
ext4_bigalloc64k_4k: 793 tests, 26 failures, 341 skipped, 8554 seconds
  Failures: ext4/033 generic/075 generic/082 generic/091 generic/112
    generic/127 generic/219 generic/223 generic/230 generic/231
    generic/232 generic/233 generic/234 generic/235 generic/263
    generic/280 generic/381 generic/382 generic/566 generic/587
    generic/600 generic/601 generic/681 generic/682 generic/691
    generic/741
ext4_bigalloc1024k_4k: 793 tests, 38 failures, 341 skipped, 8019 seconds
  Failures: ext4/033 ext4/045 generic/075 generic/082 generic/091
    generic/112 generic/127 generic/219 generic/230 generic/231
    generic/232 generic/233 generic/234 generic/235 generic/251
    generic/263 generic/280 generic/365 generic/381 generic/382
    generic/435 generic/566 generic/587 generic/600 generic/601
    generic/614 generic/629 generic/634 generic/635 generic/643
    generic/681 generic/682 generic/691 generic/698 generic/732
    generic/738 generic/741 generic/754
ext4_bigalloc2048k_4k: 793 tests, 43 failures, 348 skipped, 7961 seconds
  Failures: ext4/033 ext4/045 generic/075 generic/082 generic/091
    generic/112 generic/127 generic/219 generic/230 generic/231
    generic/232 generic/233 generic/234 generic/235 generic/251
    generic/263 generic/280 generic/365 generic/381 generic/382
    generic/435 generic/471 generic/566 generic/587 generic/600
    generic/601 generic/603 generic/614 generic/629 generic/634
    generic/635 generic/643 generic/645 generic/676 generic/681
    generic/682 generic/691 generic/698 generic/732 generic/736
    generic/738 generic/741 generic/754
Totals: 7930 tests, 3171 skipped, 170 failures, 0 errors, 83862s

  Luis
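
To see what enabling quota changed for a single profile, the two failure
lists can be compared as sets. The sketch below uses the ext4_defaults
lists from the first report and from this one; the variable names are
illustrative only.

```python
# ext4_defaults failures from the first run (quota disabled in the kernel).
before = set("""
ext4/034 ext4/055 generic/082 generic/219 generic/223
generic/230 generic/231 generic/232 generic/233 generic/235
generic/270 generic/381 generic/382 generic/566 generic/587
generic/600 generic/601 generic/681 generic/682 generic/741
""".split())

# ext4_defaults failures from this run (quota enabled).
after = set("generic/223 generic/741".split())

fixed = sorted(before - after)   # failed before, no longer failing
still = sorted(before & after)   # failing in both runs
new = sorted(after - before)     # regressions introduced by the change

print(len(fixed), still, new)
```

For ext4_defaults this leaves only generic/223 and generic/741, with no
new failures.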


* Re: ext4 v6.15-rc2 baseline
  2025-04-18  1:42       ` Luis Chamberlain
@ 2025-04-18  3:56         ` Theodore Ts'o
  2025-04-18 19:08           ` Luis Chamberlain
  0 siblings, 1 reply; 16+ messages in thread
From: Theodore Ts'o @ 2025-04-18  3:56 UTC (permalink / raw)
  To: Luis Chamberlain; +Cc: adilger.kernel, linux-ext4, kdevops, dave, jack

On Thu, Apr 17, 2025 at 06:42:25PM -0700, Luis Chamberlain wrote:
> 
> ext4_defaults: 793 tests, 2 failures, 259 skipped, 10521 seconds
>   Failures: generic/223 generic/741

generic/223 is excluded in my tests.  From [1]:

// generic/223 tests file alignment, which works on ext4 only by
// accident because we're not RAID stripe aware yet, and works at all
// because we have bias towards aligning on power-of-two block numbers.
// It is a flaky test for some configurations, so skip it.
generic/223

[1] https://github.com/tytso/xfstests-bld/blob/master/test-appliance/files/root/fs/ext4/exclude

generic/741 looks like some kind of device-mapper setup problem.  From
741.out.bad:

device-mapper: remove ioctl on flakey-test  failed: No such device or address
Command failed.

There's nothing interesting in generic/741, but all I can tell you is,
"it works for me"(tm).

Ran: generic/741
Passed all 1 tests


> ext4_bigalloc16k_4k: 793 tests, 26 failures, 341 skipped, 8856 seconds
>   Failures: ext4/033 generic/075 generic/082 generic/091 generic/112
>     generic/127 generic/219 generic/223 generic/230 generic/231
>     generic/232 generic/233 generic/234 generic/235 generic/263
>     generic/280 generic/381 generic/382 generic/566 generic/587
>     generic/600 generic/601 generic/681 generic/682 generic/691
>     generic/741

Hmm, some of these are because there are a bunch of tests that don't
work well when the allocation cluster size != the file system block size.
See [2] for the tests that I exclude.  These are fundamentally test
bugs that just don't work for bigalloc's clustered allocation.

[2] https://github.com/tytso/xfstests-bld/blob/master/test-appliance/files/root/fs/ext4/cfg/bigalloc_4k.exclude

As for the rest of the bigalloc failures, some of them are hard to
diagnose because you're not saving all of the test artifacts.  In
particular, the tests which run fsx create ${seq}.*.fsx{good,bad,log}
files.  My test appliance saves them, because they are super helpful
when debugging a test failure.  kdevops apparently doesn't.

What I do is save the entire results directory, although by default I
truncate any test artifacts from passing tests to 31k (this amount is
configurable via a command line option to gce-xfstests).  This is
important because some of the artifact files are super verbose, and if you
save them all, the time to run xz on the tar file takes forever.  But
if the tests fail, they are *super* useful.
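
The policy described above (keep everything, but cap artifacts of passing
tests) could be sketched roughly as follows. This is not the gce-xfstests
implementation: truncate_artifacts() is a hypothetical helper, the
results/<group>/<seq>.<suffix> layout is assumed, and "31k" is assumed to
mean 31 KiB.

```python
import os
import tempfile

LIMIT = 31 * 1024  # assumption: "31k" means 31 KiB


def truncate_artifacts(results_dir, passed_tests, limit=LIMIT):
    """Truncate oversized artifacts of passing tests; return the cut paths."""
    cut = []
    for root, _dirs, files in os.walk(results_dir):
        for name in files:
            seq = name.split(".", 1)[0]            # "127.full" -> "127"
            test_id = f"{os.path.basename(root)}/{seq}"
            path = os.path.join(root, name)
            if test_id in passed_tests and os.path.getsize(path) > limit:
                os.truncate(path, limit)
                cut.append(path)
    return cut


# Demo on a throwaway directory: 001 passed, 127 failed.
with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "generic"))
    for seq in ("001", "127"):
        with open(os.path.join(tmp, "generic", f"{seq}.full"), "wb") as f:
            f.write(b"x" * (LIMIT + 100))
    truncate_artifacts(tmp, passed_tests={"generic/001"})
    sizes = {
        seq: os.path.getsize(os.path.join(tmp, "generic", f"{seq}.full"))
        for seq in ("001", "127")
    }
print(sizes)
```

Artifacts of the failing test are left untouched, which is the point:
they are the ones needed for debugging.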

For the other bigalloc failures, I have a suspicion --- how big are the
TEST and SCRATCH devices that you are using?  By default, most of my
test scenarios use a "small" config which is 5G.  But for the bigalloc
tests, for the 4k block / 64k cluster size, the device needs to be at
least 20G or some of the tests will fail with ENOSPC.

Cheers,

						- Ted


* Re: ext4 v6.15-rc2 baseline
  2025-04-18  3:56         ` Theodore Ts'o
@ 2025-04-18 19:08           ` Luis Chamberlain
  2025-04-19 18:36             ` Theodore Ts'o
  0 siblings, 1 reply; 16+ messages in thread
From: Luis Chamberlain @ 2025-04-18 19:08 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: adilger.kernel, linux-ext4, kdevops, dave, jack

On Thu, Apr 17, 2025 at 10:56:23PM -0500, Theodore Ts'o wrote:
> On Thu, Apr 17, 2025 at 06:42:25PM -0700, Luis Chamberlain wrote:
> > 
> > ext4_defaults: 793 tests, 2 failures, 259 skipped, 10521 seconds
> >   Failures: generic/223 generic/741
> 
> generic/223 is excluded in my tests.  From [1]:
> 
> // generic/223 tests file alignment, which works on ext4 only by
> // accident because we're not RAID stripe aware yet, and works at all
> // because we have bias towards aligning on power-of-two block numbers.
> // It is a flaky test for some configurations, so skip it.
> generic/223
> 
> [1] https://github.com/tytso/xfstests-bld/blob/master/test-appliance/files/root/fs/ext4/exclude

Why not just add a hook to the test to skip it upstream?

> > ext4_bigalloc16k_4k: 793 tests, 26 failures, 341 skipped, 8856 seconds
> >   Failures: ext4/033 generic/075 generic/082 generic/091 generic/112
> >     generic/127 generic/219 generic/223 generic/230 generic/231
> >     generic/232 generic/233 generic/234 generic/235 generic/263
> >     generic/280 generic/381 generic/382 generic/566 generic/587
> >     generic/600 generic/601 generic/681 generic/682 generic/691
> >     generic/741
> 
> Hmm, some of these are because there are a bunch of tests that don't
> work well when the allocation cluster size != the file system block size.

We experienced a lot of test bugs for LBS but we addressed them.

> See [2] for the tests that I exclude.  These are fundamentally test
> bugs that just don't work for bigalloc's clustered allocation.

Absolutely all of these are test bugs? And they can't be fixed to
test bigalloc?

> [2] https://github.com/tytso/xfstests-bld/blob/master/test-appliance/files/root/fs/ext4/cfg/bigalloc_4k.exclude
> 
> As far as the rest of the bigalloc failures, some of them are hard to
> tell because you're not saving all of the test artifacts.  In
> particular, the tests which run fsx create ${seq}.*.fsx{good,bad,log}
> files.  My test appliance saves them, because they are super helpful
> when debugging a test failure.  kdevops apparently doesn't.

Patch posted.

> What I do is save the entire results directory, 

In our experience, test bugs sometimes create TFB files (too f big),
and earlier it wasn't clear whether we had to be conservative about space.
We now have a solution in place so that we don't have to care about space
for results, but in practice TFB files also stall CIs and networks, etc.
And so TFB files are ignored.

If *.fsx{good,bad,log} won't ever be TFB, then we'll be good. Especially
since we can scale for archiving now.

> although by default I
> truncate any test artifacts from passing tests to 31k (this amount is
> configurable via a command line option to gce-xfstests).  This is
> important because some of the artifact files are super verbose, and if you
> save them all, the time to run xz on the tar file takes forever.  But
> if the tests fail, they are *super* useful.

Right, same experience here. We call these TFB files. And we have a size
threshold too.
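
As a rough illustration of the size-threshold idea (a sketch only, not
what kdevops or gce-xfstests actually run; the function name, the
<results_dir>/<group>/<seq>.<suffix> layout and the way failed tests are
passed in are all assumptions made up for this example):

```python
import os

DEFAULT_LIMIT = 31 * 1024  # bytes; mirrors the 31k default mentioned above


def truncate_passing_artifacts(results_dir, failed_tests, limit=DEFAULT_LIMIT):
    """Truncate oversized artifacts that do not belong to a failed test.

    Hypothetical layout: artifacts live at <results_dir>/<group>/<seq>.<suffix>,
    e.g. generic/127.full, and failed_tests holds names like "generic/127".
    Returns the sorted list of files that were truncated.
    """
    truncated = []
    for root, _dirs, files in os.walk(results_dir):
        group = os.path.relpath(root, results_dir)
        for name in files:
            seq = name.split(".", 1)[0]
            if f"{group}/{seq}" in failed_tests:
                continue  # keep everything from failing tests intact
            path = os.path.join(root, name)
            if os.path.getsize(path) > limit:
                os.truncate(path, limit)
                truncated.append(path)
    return sorted(truncated)
```

With something like this, artifacts from failing tests (for example the
fsx logs) are kept whole, while anything else is capped at the limit
before the results get compressed.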

> For the other bigalloc failures, I have a suspicion --- how big are the
> TEST and SCRATCH devices that you are using?  By default, most of my
> test scenarios use a "small" config which is 5G.  But for the bigalloc
> tests, for the 4k block / 64k cluster size, the device needs to be at
> least 20G or some of the tests will fail with ENOSPC.

They are 20GiB. This is configurable via CONFIG_FSTESTS_SPARSE_FILE_SIZE.

  Luis

* Re: ext4 v6.15-rc2 baseline
  2025-04-17 20:56       ` Luis Chamberlain
@ 2025-04-19 18:22         ` Theodore Ts'o
  2025-04-21 15:54           ` Darrick J. Wong
  0 siblings, 1 reply; 16+ messages in thread
From: Theodore Ts'o @ 2025-04-19 18:22 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: Darrick J. Wong, adilger.kernel, linux-ext4, kdevops, dave, jack

On Thu, Apr 17, 2025 at 01:56:29PM -0700, Luis Chamberlain wrote:
> 
> Perhaps something like (not tested):
> 
> From a9386348701e387942e3eaaef8ee9daac8ace16a Mon Sep 17 00:00:00 2001
> From: Luis Chamberlain <mcgrof@kernel.org>
> Date: Thu, 17 Apr 2025 13:54:25 -0700
> Subject: [PATCH] ext4: add ordered requirement for generic/04[456]
> 
> generic/04[456] test how truncate and delayed allocation work.
> ext4 uses data=ordered to avoid exposing stale data, and
> so it uses a different mechanism than xfs. So these tests will fail
> on it.

No, you misunderstand the problem.  The generic/04[456] tests are
checking for a specific implementation detail in how xfs works to
prevent stale data from being exposed after a crash.  Ext4 has a
different method for achieving the same goal, using data=ordered,
which is the default.  So checking for data=ordered isn't necessary,
because it is the default.  But how it achieves things means that
these tests, which test for a specific implementation, don't work.

Fundamentally, these tests check what happens when you are writing to
a file and the file system is shut down (simulating a power failure).
Exactly how this is handled is not guaranteed by POSIX, so testing for a
specific behaviour is, in my opinion, not really that great of an idea.
In any case, the fact that we don't do exactly what these tests are
expecting is not a problem as far as I'm concerned, and so we skip
them.

Cheers,

						- Ted

* Re: ext4 v6.15-rc2 baseline
  2025-04-18 19:08           ` Luis Chamberlain
@ 2025-04-19 18:36             ` Theodore Ts'o
  2025-04-20  3:39               ` Theodore Ts'o
  0 siblings, 1 reply; 16+ messages in thread
From: Theodore Ts'o @ 2025-04-19 18:36 UTC (permalink / raw)
  To: Luis Chamberlain; +Cc: adilger.kernel, linux-ext4, kdevops, dave, jack

On Fri, Apr 18, 2025 at 12:08:17PM -0700, Luis Chamberlain wrote:
> > [1] https://github.com/tytso/xfstests-bld/blob/master/test-appliance/files/root/fs/ext4/exclude
> 
> Why not just add a hook to the test to skip it upstream?

Quite a few years ago, the upstream xfstests-bld maintainer at the
time was very much against adding these sorts of exceptions.

Instead of trying to persuade upstream about these sorts of changes,
it was just simpler for me to exclude them in my test runner.  It's
for similar reasons that I still have some out of tree patches.  The
standards of patch review of patches from some folks such as myself
are *substantially* higher than say, those of parallel check patches,
where xfstests for-next was broken for three months.

If upstream was more willing to take patches that I find useful, I'd
certainly send them upstream.  But it's been painful.

> > Hmm, some of these are because there are a bunch of tests that don't
> > work well when the allocation cluster size != the file system block size.
> 
> We experienced a lot of test bugs for LBS but we addressed them.

If I recall correctly, upstream was hostile to the bigalloc changes a
while back, but that was many years ago. 

> 
> > See [2] for the tests that I exclude.  These are fundamentally test
> > bugs that just don't work for bigalloc's clustered allocation.
> 
> Absolutely all of these are test bugs? And they can't be fixed to
> test bigalloc?

The ones in [2] are test bugs, and *why* they are test bugs is
clearly documented in the exclude file.  If I had confidence that
upstream would accept them, I could work on it in my copious spare
time.  But it's *way* simpler for me to exclude them in my test
runner, as opposed to trying to get changes upstream in xfstests.

If other people want to try to get changes upstream, please be my
guest.  :-)

						- Ted

* Re: ext4 v6.15-rc2 baseline
  2025-04-19 18:36             ` Theodore Ts'o
@ 2025-04-20  3:39               ` Theodore Ts'o
  0 siblings, 0 replies; 16+ messages in thread
From: Theodore Ts'o @ 2025-04-20  3:39 UTC (permalink / raw)
  To: Luis Chamberlain; +Cc: adilger.kernel, linux-ext4, kdevops, dave, jack

On Sat, Apr 19, 2025 at 01:36:41PM -0500, Theodore Ts'o wrote:
> 
> Quite a few years ago, the upstream xfstests-bld maintainer at the
> time was very much against adding these sorts of exceptions.

Typo, this should have read, "Quite a few years ago, the upstream
XFSTESTS maintainer at the time...."

Again, if someone would like to work with trying to get changes
upstream so we don't need as much of an exclude file, help would
certainly be appreciated.  Or we can just keep using an exclude file.
Whatever.

                                        - Ted

* Re: ext4 v6.15-rc2 baseline
  2025-04-19 18:22         ` Theodore Ts'o
@ 2025-04-21 15:54           ` Darrick J. Wong
  2025-04-21 16:29             ` Theodore Ts'o
  0 siblings, 1 reply; 16+ messages in thread
From: Darrick J. Wong @ 2025-04-21 15:54 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Luis Chamberlain, adilger.kernel, linux-ext4, kdevops, dave, jack

On Sat, Apr 19, 2025 at 01:22:49PM -0500, Theodore Ts'o wrote:
> On Thu, Apr 17, 2025 at 01:56:29PM -0700, Luis Chamberlain wrote:
> > 
> > Perhaps something like (not tested):
> > 
> > From a9386348701e387942e3eaaef8ee9daac8ace16a Mon Sep 17 00:00:00 2001
> > From: Luis Chamberlain <mcgrof@kernel.org>
> > Date: Thu, 17 Apr 2025 13:54:25 -0700
> > Subject: [PATCH] ext4: add ordered requirement for generic/04[456]
> > 
> > generic/04[456] test how truncate and delayed allocation work.
> > ext4 uses data=ordered to avoid exposing stale data, and
> > so it uses a different mechanism than xfs. So these tests will fail
> > on it.
> 
> No, you misunderstand the problem.  The generic/04[456] tests are
> checking for a specific implementation detail in how xfs works to
> prevent stale data from being exposed after a crash.  Ext4 has a
> different method for achieving the same goal, using data=ordered,
> which is the default.  So checking for data=ordered isn't necessary,
> because it is the default.  But how it achieves things means that
> these tests, which test for a specific implementation, don't work.
> 
> Fundamentally, these tests check what happens when you are writing to
> a file and the file system is shut down (simulating a power failure).
> Exactly how this is handled is not guaranteed by POSIX, so testing for a
> specific behaviour is, in my opinion, not really that great of an idea.
> In any case, the fact that we don't do exactly what these tests are
> expecting is not a problem as far as I'm concerned, and so we skip
> them.

I might be wading in deeper than I know, but it seems to me that
after a crash recovery it's not great to see 64k files with no blocks
allocated to them at all.  That probably falls into "fs crash behavior
isn't guaranteed by POSIX", but if that's the case then these three
tests (generic/044-046) should _exclude_fs ext3 ext4 and explain why.

(I don't care about the others whining about _exclude_fs-- if you make
the design decision that the current ext4 behavior is good enough, then
the test cannot ever be satisfied so let's capture that in the test
itself, not in everyone's scattered exclusion lists.)

--D

> Cheers,
> 
> 						- Ted

* Re: ext4 v6.15-rc2 baseline
  2025-04-21 15:54           ` Darrick J. Wong
@ 2025-04-21 16:29             ` Theodore Ts'o
  2025-04-21 16:47               ` Darrick J. Wong
  0 siblings, 1 reply; 16+ messages in thread
From: Theodore Ts'o @ 2025-04-21 16:29 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Luis Chamberlain, adilger.kernel, linux-ext4, kdevops, dave, jack

On Mon, Apr 21, 2025 at 08:54:33AM -0700, Darrick J. Wong wrote:
> 
> I might be wading in deeper than I know, but it seems to me that
> after a crash recovery it's not great to see 64k files with no blocks
> allocated to them at all.

Well, what ext4 in no dioread_nolock mode will do is to allocate
blocks marked as uninitialized, and then write the data blocks, and
then change them to be marked as initialized.  So it's not that there
are no blocks allocated at all; but that there are blocks allocated
but attempts to read from the file will return all zeros.

This is non-ideal, but my main concern is a performance issue, not a
correctness one.  We're modifying the metadata blocks twice, and while
most of the time the two modifications happen within a single
transaction (so the user won't actually see the zero blocks after the
crash _most_ of the time), the extra journal handles mean extra CPU
and extra jbd2 spinlocks getting taken and released.

So it's on my todo list to fix, in my copious spare time.....

> (I don't care about the others whining about _exclude_fs-- if
> you make the design decision that the current ext4 behavior is
> good enough, then the test cannot ever be satisfied so let's
> capture that in the test itself, not in everyone's scattered
> exclusion lists.)

Fair enough, I can try, and see if we get people attempting to NACK
the changes this time around.  Support beating back the whiners would
be appreciated.

I can also see if Luis's LBS changes might make it easier to deal with the
bigalloc test bugs.  It will mean exposing the concept of cluster
allocation size (as distinct from block size) to the core xfstests
infrastructure, and again, we can see if people try to NACK the
changes.  This will require a bit more work, however, as this is a big
difference between XFS's LBS feature and ext4's bigalloc feature.

						- Ted

* Re: ext4 v6.15-rc2 baseline
  2025-04-21 16:29             ` Theodore Ts'o
@ 2025-04-21 16:47               ` Darrick J. Wong
  0 siblings, 0 replies; 16+ messages in thread
From: Darrick J. Wong @ 2025-04-21 16:47 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Luis Chamberlain, adilger.kernel, linux-ext4, kdevops, dave, jack

On Mon, Apr 21, 2025 at 11:29:52AM -0500, Theodore Ts'o wrote:
> On Mon, Apr 21, 2025 at 08:54:33AM -0700, Darrick J. Wong wrote:
> > 
> > I might be wading in deeper than I know, but it seems to me that
> > after a crash recovery it's not great to see 64k files with no blocks
> > allocated to them at all.
> 
> Well, what ext4 in no dioread_nolock mode will do is to allocate
> blocks marked as uninitialized, and then write the data blocks, and
> then change them to be marked as initialized.  So it's not that there
> are no blocks allocated at all; but that there are blocks allocated
> but attempts to read from the file will return all zeros.

But that's not what I see -- on my system, I get files with i_size ==
65536, but no mappings at all:

--- /run/fstests/bin/tests/generic/044.out      2025-04-17 14:52:53.521658441 -0700
+++ /var/tmp/fstests/generic/044.out.bad        2025-04-21 08:46:15.328757541 -0700
@@ -1 +1,95 @@
 QA output created by 044
+corrupt file /opt/906 - non-zero size but no extents
+corrupt file /opt/907 - non-zero size but no extents

# mount /opt/
# ls /opt/906
-rw------- 1 root root 65536 Apr 21 08:45 /opt/906
# filefrag -v !$
filefrag -v /opt/906
Filesystem type is: ef53
File size of /opt/906 is 65536 (16 blocks of 4096 bytes)
/opt/906: 0 extents found

...unless ext4 is removing those unwritten blocks during recovery?
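
For scanning a whole directory for such files, something along these
lines might do as a first pass.  It is only a sketch: it looks at
st_blocks from stat(2) rather than asking for the extent map the way
filefrag does, so it approximates the "non-zero size but no extents"
check from the test, and a purely sparse file created with truncate
would match as well.  The function names are made up for this example.

```python
import os


def looks_unmapped(size, blocks):
    """True if a file reports a non-zero size but zero allocated blocks."""
    return size > 0 and blocks == 0


def find_unmapped_files(top):
    """Walk 'top' and return regular files with size > 0 but st_blocks == 0."""
    hits = []
    for root, _dirs, files in os.walk(top):
        for name in files:
            path = os.path.join(root, name)
            st = os.lstat(path)
            # st_blocks is counted in 512-byte units and is 0 when nothing
            # has been allocated to the file.
            if os.path.isfile(path) and looks_unmapped(st.st_size, st.st_blocks):
                hits.append(path)
    return sorted(hits)
```

Running find_unmapped_files("/opt") after the crash recovery would then
list candidates like /opt/906 to look at more closely with filefrag -v.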

> This is non-ideal, but my main concern is a performance issue, not a
> correctness one.  We're modifying the metadata blocks twice, and while
> most of the time the two modifications happen within a single
> transaction (so the user won't actually see the zero blocks after the
> crash _most_ of the time), the extra journal handles mean extra CPU
> and extra jbd2 spinlocks getting taken and released.
> 
> So it's on my todo list to fix, in my copious spare time.....
> 
> > (I don't care about the others whining about _exclude_fs-- if
> > you make the design decision that the current ext4 behavior is
> > good enough, then the test cannot ever be satisfied so let's
> > capture that in the test itself, not in everyone's scattered
> > exclusion lists.)
> 
> Fair enough, I can try, and see if we get people attempting to NACK
> the changes this time around.  Support beating back the whiners would
> be appreciated.

Ok, I'll chime in whenever I see patches. :)

> I can also see if Luis's LBS changes might make it easier to deal with the
> bigalloc test bugs.  It will mean exposing the concept of cluster
> allocation size (as distinct from block size) to the core xfstests
> infrastructure, and again, we can see if people try to NACK the
> changes.  This will require a bit more work, however, as this is a big
> difference between XFS's LBS feature and ext4's bigalloc feature.

That shouldn't be a problem; _xfs_get_file_block_size has returned the
allocation unit size for XFS files for quite some time, despite being
badly named.

--D

> 
> 						- Ted

end of thread, other threads:[~2025-04-21 16:48 UTC | newest]

Thread overview: 16+ messages
2025-04-16 17:56 ext4 v6.15-rc2 baseline Luis Chamberlain
2025-04-16 23:34 ` Theodore Ts'o
2025-04-17 16:38   ` Darrick J. Wong
2025-04-17 18:37     ` Theodore Ts'o
2025-04-17 20:56       ` Luis Chamberlain
2025-04-19 18:22         ` Theodore Ts'o
2025-04-21 15:54           ` Darrick J. Wong
2025-04-21 16:29             ` Theodore Ts'o
2025-04-21 16:47               ` Darrick J. Wong
2025-04-17 16:49   ` Theodore Ts'o
2025-04-17 20:35     ` Luis Chamberlain
2025-04-18  1:42       ` Luis Chamberlain
2025-04-18  3:56         ` Theodore Ts'o
2025-04-18 19:08           ` Luis Chamberlain
2025-04-19 18:36             ` Theodore Ts'o
2025-04-20  3:39               ` Theodore Ts'o
