* [PATCHSET v2] fstests: random fixes for v2025.02.02
@ 2025-02-04 21:22 Darrick J. Wong
2025-02-04 21:22 ` [PATCH 01/34] generic/476: fix fsstress process management Darrick J. Wong
` (34 more replies)
0 siblings, 35 replies; 68+ messages in thread
From: Darrick J. Wong @ 2025-02-04 21:22 UTC (permalink / raw)
To: zlang, djwong; +Cc: dchinner, joannelkoong, fstests, linux-xfs
Hi all,
This is an unusually large set of bug fixes for fstests. The first 20
patches in this patchset are corrections for that RFC series.
The most significant change is that I made ./check run each test with
its own Unix process session id. This means that a test can use pkill
to kill all of its own subprocesses, without killing anyone else's
subprocesses. I wasn't completely sold on that approach, but it works
for me. A better approach is to run each test in a separate pid and
mount namespace, but then kernel support for that becomes a hard
requirement. Both approaches seems to work for check and
check-parallel, though I've not tested that all that much.
Note: I am /not/ happy about Dave's RFC going straight to for-next
without even a complete review right before everyone went on PTO for
several weeks for xmas/solar new year. But in the interests of getting
QA back on line for myself and everyone else who's having problems, here
it is.
At this point this series has accumulated even more bug fixes: an
unexplained insertion of -R in a umount command; fixes for the new
generic/759 hugepages IO test, fixes to the pptr size calculations in
common/fuzzy, memory consumption reductions in scrub fsstress tests,
fixes for excessive fsstress ops scaling, and some cleanups to how we
locate the fsstress and fsx binaries.
If you're going to start using this code, I strongly recommend pulling
from my git trees, which are linked below.
With a bit of luck, this should all go splendidly.
Comments and questions are, as always, welcome.
--D
kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=random-fixes
xfsprogs git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=random-fixes
fstests git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfstests-dev.git/log/?h=random-fixes
---
Commits in this patchset:
* generic/476: fix fsstress process management
* metadump: make non-local function variables more obvious
* metadump: fix cleanup for v1 metadump testing
* generic/019: don't fail if fio crashes while shutting down
* fuzzy: do not set _FSSTRESS_PID when exercising fsx
* common/rc: revert recursive unmount in _clear_mount_stack
* common/dump: don't replace pids arbitrarily
* common/populate: correct the parent pointer name creation formulae
* generic/759,760: fix MADV_COLLAPSE detection and inclusion
* generic/759,760: skip test if we can't set up a hugepage for IO
* common/rc: create a wrapper for the su command
* fuzzy: kill subprocesses with SIGPIPE, not SIGINT
* common/rc: hoist pkill to a helper function
* common: fix pkill by running test program in a separate session
* check: run tests in a private pid/mount namespace
* check: deprecate using process sessions to isolate test instances
* common/rc: don't copy fsstress to $TEST_DIR
* unmount: resume logging of stdout and stderr for filtering
* mkfs: don't hardcode log size
* common/rc: return mount_ret in _try_scratch_mount
* preamble: fix missing _kill_fsstress
* generic/650: revert SOAK DURATION changes
* generic/032: fix pinned mount failure
* fuzzy: stop __stress_scrub_fsx_loop if fsx fails
* fuzzy: don't use readarray for xfsfind output
* fuzzy: always stop the scrub fsstress loop on error
* fuzzy: port fsx and fsstress loop to use --duration
* fix _require_scratch_duperemove ordering
* fsstress: fix a memory leak
* fsx: fix leaked log file pointer
* misc: don't put nr_cpus into the fsstress -n argument
* common/config: add $here to FSSTRESS_PROG
* config: add FSX_PROG variable
* build: initialize stack variables to zero by default
---
check | 89 +++++++++++++++++++++++++------
common/config | 5 +-
common/dump | 1
common/fuzzy | 100 ++++++++++++++++++++++-------------
common/metadump | 42 +++++++--------
common/populate | 13 +++--
common/preamble | 2 -
common/quota | 2 -
common/rc | 141 ++++++++++++++++++++++++++++++++++++++++++-------
common/reflink | 6 +-
configure.ac | 1
include/builddefs.in | 3 +
ltp/fsstress.c | 1
ltp/fsx.c | 8 ++-
m4/package_libcdev.m4 | 14 +++++
src/nsexec.c | 18 +++++-
src/xfsfind.c | 14 ++++-
tests/generic/019 | 2 -
tests/generic/032 | 9 +++
tests/generic/050 | 2 -
tests/generic/075 | 2 -
tests/generic/085 | 2 -
tests/generic/093 | 2 -
tests/generic/112 | 2 -
tests/generic/125 | 2 -
tests/generic/127 | 16 +++---
tests/generic/128 | 2 -
tests/generic/193 | 36 ++++++-------
tests/generic/230 | 14 ++---
tests/generic/231 | 4 +
tests/generic/233 | 2 -
tests/generic/270 | 12 ++--
tests/generic/310 | 6 +-
tests/generic/314 | 2 -
tests/generic/327 | 2 -
tests/generic/328 | 4 +
tests/generic/355 | 2 -
tests/generic/361 | 4 +
tests/generic/453 | 6 +-
tests/generic/455 | 2 -
tests/generic/456 | 2 -
tests/generic/457 | 2 -
tests/generic/469 | 2 -
tests/generic/476 | 4 +
tests/generic/499 | 2 -
tests/generic/504 | 15 +++++
tests/generic/511 | 2 -
tests/generic/514 | 2 -
tests/generic/530 | 6 --
tests/generic/531 | 6 --
tests/generic/561 | 2 -
tests/generic/573 | 2 -
tests/generic/590 | 2 -
tests/generic/600 | 2 -
tests/generic/601 | 2 -
tests/generic/603 | 10 ++-
tests/generic/642 | 2 -
tests/generic/650 | 6 +-
tests/generic/673 | 2 -
tests/generic/674 | 2 -
tests/generic/675 | 2 -
tests/generic/680 | 2 -
tests/generic/681 | 2 -
tests/generic/682 | 2 -
tests/generic/683 | 2 -
tests/generic/684 | 2 -
tests/generic/685 | 2 -
tests/generic/686 | 2 -
tests/generic/687 | 2 -
tests/generic/688 | 2 -
tests/generic/691 | 8 +--
tests/generic/721 | 10 ++-
tests/generic/726 | 2 -
tests/generic/727 | 2 -
tests/generic/746 | 2 -
tests/generic/750 | 2 -
tests/generic/759 | 7 +-
tests/generic/760 | 7 +-
tests/xfs/149 | 2 -
tests/xfs/501 | 2 -
tests/xfs/502 | 2 -
tests/xfs/530 | 2 -
tests/xfs/720 | 2 -
tests/xfs/795 | 2 -
tests/xfs/803 | 2 -
tools/run_seq_pidns | 28 ++++++++++
tools/run_seq_setsid | 22 ++++++++
87 files changed, 546 insertions(+), 250 deletions(-)
create mode 100755 tools/run_seq_pidns
create mode 100755 tools/run_seq_setsid
^ permalink raw reply [flat|nested] 68+ messages in thread
* [PATCH 01/34] generic/476: fix fsstress process management
2025-02-04 21:22 [PATCHSET v2] fstests: random fixes for v2025.02.02 Darrick J. Wong
@ 2025-02-04 21:22 ` Darrick J. Wong
2025-02-04 21:22 ` [PATCH 02/34] metadump: make non-local function variables more obvious Darrick J. Wong
` (33 subsequent siblings)
34 siblings, 0 replies; 68+ messages in thread
From: Darrick J. Wong @ 2025-02-04 21:22 UTC (permalink / raw)
To: zlang, djwong; +Cc: dchinner, fstests, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
generic/476 never used to run fsstress in the background, but
8973af00ec212f made it do that. This is incorrect, because now 476 runs
for three seconds (i.e. long enough to fall out the bottom of the test
and end up in _cleanup), ignoring any SOAK_DURATION/TIME_FACTOR
settings.
Cc: <fstests@vger.kernel.org> # v2024.12.08
Fixes: 8973af00ec212f ("fstests: cleanup fsstress process management")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
---
tests/generic/476 | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tests/generic/476 b/tests/generic/476
index 769b3745f75689..4537fcd77d2f07 100755
--- a/tests/generic/476
+++ b/tests/generic/476
@@ -22,7 +22,7 @@ nr_ops=$((25000 * nr_cpus * TIME_FACTOR))
fsstress_args=(-w -d $SCRATCH_MNT -n $nr_ops -p $nr_cpus)
test -n "$SOAK_DURATION" && fsstress_args+=(--duration="$SOAK_DURATION")
-_run_fsstress_bg "${fsstress_args[@]}"
+_run_fsstress "${fsstress_args[@]}"
# success, all done
status=0
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 02/34] metadump: make non-local function variables more obvious
2025-02-04 21:22 [PATCHSET v2] fstests: random fixes for v2025.02.02 Darrick J. Wong
2025-02-04 21:22 ` [PATCH 01/34] generic/476: fix fsstress process management Darrick J. Wong
@ 2025-02-04 21:22 ` Darrick J. Wong
2025-02-04 21:23 ` [PATCH 03/34] metadump: fix cleanup for v1 metadump testing Darrick J. Wong
` (32 subsequent siblings)
34 siblings, 0 replies; 68+ messages in thread
From: Darrick J. Wong @ 2025-02-04 21:22 UTC (permalink / raw)
To: zlang, djwong; +Cc: dchinner, fstests, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
In _xfs_verify_metadump_v2(), we want to set up some loop devices,
record the names of those loop devices, and then leave the variables in
the global namespace so that _xfs_cleanup_verify_metadump can dispose of
them.
Elsewhere in fstests the convention for global variables is to put them
in all caps to make it obvious that they're global and not local
variables, so do that here too.
Cc: <fstests@vger.kernel.org> # v2024.12.08
Fixes: ce79de11337e38 ("fstests: clean up loop device instantiation")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
---
common/metadump | 28 +++++++++++++---------------
1 file changed, 13 insertions(+), 15 deletions(-)
diff --git a/common/metadump b/common/metadump
index 3aa7adf76da4bb..493f8b6379dc0b 100644
--- a/common/metadump
+++ b/common/metadump
@@ -25,8 +25,8 @@ _xfs_cleanup_verify_metadump()
test -n "$XFS_METADUMP_FILE" && rm -f "$XFS_METADUMP_FILE"
if [ -n "$XFS_METADUMP_IMG" ]; then
- [ -b "$md_data_loop_dev" ] && _destroy_loop_device $md_data_loop_dev
- [ -b "$md_log_loop_dev" ] && _destroy_loop_device $md_log_loop_dev
+ [ -b "$METADUMP_DATA_LOOP_DEV" ] && _destroy_loop_device $METADUMP_DATA_LOOP_DEV
+ [ -b "$METADUMP_LOG_LOOP_DEV" ] && _destroy_loop_device $METADUMP_LOG_LOOP_DEV
for img in "$XFS_METADUMP_IMG"*; do
test -e "$img" && rm -f "$img"
done
@@ -100,9 +100,7 @@ _xfs_verify_metadump_v2()
local metadump_file="$XFS_METADUMP_FILE"
local version="-v 2"
local data_img="$XFS_METADUMP_IMG.data"
- local data_loop
local log_img=""
- local log_loop
# Capture metadump, which creates metadump_file
_scratch_xfs_metadump $metadump_file $metadump_args $version
@@ -118,27 +116,27 @@ _xfs_verify_metadump_v2()
_scratch_xfs_mdrestore $metadump_file
# Create loopdev for data device so we can mount the fs
- md_data_loop_dev=$(_create_loop_device $data_img)
+ METADUMP_DATA_LOOP_DEV=$(_create_loop_device $data_img)
# Create loopdev for log device if we recovered anything
- test -s "$log_img" && md_log_loop_dev=$(_create_loop_device $log_img)
+ test -s "$log_img" && METADUMP_LOG_LOOP_DEV=$(_create_loop_device $log_img)
# Mount fs, run an extra test, fsck, and unmount
- SCRATCH_DEV=$md_data_loop_dev SCRATCH_LOGDEV=$md_log_loop_dev _scratch_mount
+ SCRATCH_DEV=$METADUMP_DATA_LOOP_DEV SCRATCH_LOGDEV=$METADUMP_LOG_LOOP_DEV _scratch_mount
if [ -n "$extra_test" ]; then
- SCRATCH_DEV=$md_data_loop_dev SCRATCH_LOGDEV=$md_log_loop_dev $extra_test
+ SCRATCH_DEV=$METADUMP_DATA_LOOP_DEV SCRATCH_LOGDEV=$METADUMP_LOG_LOOP_DEV $extra_test
fi
- SCRATCH_DEV=$md_data_loop_dev SCRATCH_LOGDEV=$md_log_loop_dev _check_xfs_scratch_fs
- _unmount $md_data_loop_dev
+ SCRATCH_DEV=$METADUMP_DATA_LOOP_DEV SCRATCH_LOGDEV=$METADUMP_LOG_LOOP_DEV _check_xfs_scratch_fs
+ _unmount $METADUMP_DATA_LOOP_DEV
# Tear down what we created
- if [ -b "$md_log_loop_dev" ]; then
- _destroy_loop_device $md_log_loop_dev
- unset md_log_loop_dev
+ if [ -b "$METADUMP_LOG_LOOP_DEV" ]; then
+ _destroy_loop_device $METADUMP_LOG_LOOP_DEV
+ unset METADUMP_LOG_LOOP_DEV
rm -f $log_img
fi
- _destroy_loop_device $md_data_loop_dev
- unset md_data_loop_dev
+ _destroy_loop_device $METADUMP_DATA_LOOP_DEV
+ unset METADUMP_DATA_LOOP_DEV
rm -f $data_img
}
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 03/34] metadump: fix cleanup for v1 metadump testing
2025-02-04 21:22 [PATCHSET v2] fstests: random fixes for v2025.02.02 Darrick J. Wong
2025-02-04 21:22 ` [PATCH 01/34] generic/476: fix fsstress process management Darrick J. Wong
2025-02-04 21:22 ` [PATCH 02/34] metadump: make non-local function variables more obvious Darrick J. Wong
@ 2025-02-04 21:23 ` Darrick J. Wong
2025-02-04 21:23 ` [PATCH 04/34] generic/019: don't fail if fio crashes while shutting down Darrick J. Wong
` (31 subsequent siblings)
34 siblings, 0 replies; 68+ messages in thread
From: Darrick J. Wong @ 2025-02-04 21:23 UTC (permalink / raw)
To: zlang, djwong; +Cc: dchinner, fstests, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
In commit ce79de11337e38, the metadump v2 tests were updated to leave
the names of loop devices in some global variables so that the cleanup
method can find them and remove the loop devices. Inexplicably, the
metadump v1 test function was not upgraded. Do so now.
Cc: <fstests@vger.kernel.org> # v2024.12.08
Fixes: ce79de11337e38 ("fstests: clean up loop device instantiation")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
---
common/metadump | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/common/metadump b/common/metadump
index 493f8b6379dc0b..a4ec9a7f921acf 100644
--- a/common/metadump
+++ b/common/metadump
@@ -61,7 +61,6 @@ _xfs_verify_metadump_v1()
local metadump_file="$XFS_METADUMP_FILE"
local version=""
local data_img="$XFS_METADUMP_IMG.data"
- local data_loop
# Force v1 if we detect v2 support
if [[ $MAX_XFS_METADUMP_FORMAT > 1 ]]; then
@@ -75,18 +74,19 @@ _xfs_verify_metadump_v1()
SCRATCH_DEV=$data_img _scratch_xfs_mdrestore $metadump_file
# Create loopdev for data device so we can mount the fs
- data_loop=$(_create_loop_device $data_img)
+ METADUMP_DATA_LOOP_DEV=$(_create_loop_device $data_img)
# Mount fs, run an extra test, fsck, and unmount
- SCRATCH_DEV=$data_loop _scratch_mount
+ SCRATCH_DEV=$METADUMP_DATA_LOOP_DEV _scratch_mount
if [ -n "$extra_test" ]; then
- SCRATCH_DEV=$data_loop $extra_test
+ SCRATCH_DEV=$METADUMP_DATA_LOOP_DEV $extra_test
fi
- SCRATCH_DEV=$data_loop _check_xfs_scratch_fs
- SCRATCH_DEV=$data_loop _scratch_unmount
+ SCRATCH_DEV=$METADUMP_DATA_LOOP_DEV _check_xfs_scratch_fs
+ _unmount $METADUMP_DATA_LOOP_DEV
# Tear down what we created
- _destroy_loop_device $data_loop
+ _destroy_loop_device $METADUMP_DATA_LOOP_DEV
+ unset METADUMP_DATA_LOOP_DEV
rm -f $data_img
}
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 04/34] generic/019: don't fail if fio crashes while shutting down
2025-02-04 21:22 [PATCHSET v2] fstests: random fixes for v2025.02.02 Darrick J. Wong
` (2 preceding siblings ...)
2025-02-04 21:23 ` [PATCH 03/34] metadump: fix cleanup for v1 metadump testing Darrick J. Wong
@ 2025-02-04 21:23 ` Darrick J. Wong
2025-02-04 21:23 ` [PATCH 05/34] fuzzy: do not set _FSSTRESS_PID when exercising fsx Darrick J. Wong
` (30 subsequent siblings)
34 siblings, 0 replies; 68+ messages in thread
From: Darrick J. Wong @ 2025-02-04 21:23 UTC (permalink / raw)
To: zlang, djwong; +Cc: dchinner, fstests, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
My system (Debian 12) has fio 3.33. Once in a while, fio crashes while
shutting down after it receives a SIGBUS on account of the filesystem
going down. This causes the test to fail with:
generic/019 - output mismatch (see /var/tmp/fstests/generic/019.out.bad)
--- tests/generic/019.out 2024-02-28 16:20:24.130889521 -0800
+++ /var/tmp/fstests/generic/019.out.bad 2025-01-03 15:00:35.903564431 -0800
@@ -5,5 +5,6 @@
Start fio..
Force SCRATCH_DEV device failure
+/tmp/fstests/tests/generic/019: line 112: 90841 Segmentation fault $FIO_PROG $fio_config >> $seqres.full 2>&1
Make SCRATCH_DEV device operable again
Disallow global fail_make_request feature
...
(Run 'diff -u /tmp/fstests/tests/generic/019.out /var/tmp/fstests/generic/019.out.bad' to see the entire diff)
because the wait command will dutifully report fatal signals that kill
the fio process. Unfortunately, a core dump shows that we blew up in
some library's exit handler somewhere:
(gdb) where
#0 unlink_chunk (p=p@entry=0x55b31cb9a430, av=0x7f8b4475ec60 <main_arena>) at ./malloc/malloc.c:1628
#1 0x00007f8b446222ff in _int_free (av=0x7f8b4475ec60 <main_arena>, p=0x55b31cb9a430, have_lock=<optimized out>, have_lock@entry=0) at ./malloc/malloc.c:4603
#2 0x00007f8b44624f1f in __GI___libc_free (mem=<optimized out>) at ./malloc/malloc.c:3385
#3 0x00007f8b3a71cf0e in ?? () from /lib/x86_64-linux-gnu/libtasn1.so.6
#4 0x00007f8b4426447c in ?? () from /lib/x86_64-linux-gnu/libgnutls.so.30
#5 0x00007f8b4542212a in _dl_call_fini (closure_map=closure_map@entry=0x7f8b44465620) at ./elf/dl-call_fini.c:43
#6 0x00007f8b4542581e in _dl_fini () at ./elf/dl-fini.c:114
#7 0x00007f8b445ca55d in __run_exit_handlers (status=0, listp=0x7f8b4475e820 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true)
at ./stdlib/exit.c:116
#8 0x00007f8b445ca69a in __GI_exit (status=<optimized out>) at ./stdlib/exit.c:146
#9 0x00007f8b445b3251 in __libc_start_call_main (main=main@entry=0x55b319278e10 <main>, argc=argc@entry=2, argv=argv@entry=0x7ffec6f8b468) at ../sysdeps/nptl/libc_start_call_main.h:74
#10 0x00007f8b445b3305 in __libc_start_main_impl (main=0x55b319278e10 <main>, argc=2, argv=0x7ffec6f8b468, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>,
stack_end=0x7ffec6f8b458) at ../csu/libc-start.c:360
#11 0x000055b319278ed1 in _start ()
This isn't a filesystem failure, so mask this by shovelling the output
to seqres.full.
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
---
tests/generic/019 | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tests/generic/019 b/tests/generic/019
index bed916b53f98e5..676ff27dab8062 100755
--- a/tests/generic/019
+++ b/tests/generic/019
@@ -109,7 +109,7 @@ _workout()
_fail "failed: still able to perform integrity fsync on $SCRATCH_MNT"
_kill_fsstress
- wait $fio_pid
+ wait $fio_pid &>> $seqres.full # old fio can crash on EIO, ignore segfault reporting
unset fio_pid
# We expect that broken FS still can be umounted
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 05/34] fuzzy: do not set _FSSTRESS_PID when exercising fsx
2025-02-04 21:22 [PATCHSET v2] fstests: random fixes for v2025.02.02 Darrick J. Wong
` (3 preceding siblings ...)
2025-02-04 21:23 ` [PATCH 04/34] generic/019: don't fail if fio crashes while shutting down Darrick J. Wong
@ 2025-02-04 21:23 ` Darrick J. Wong
2025-02-04 21:23 ` [PATCH 06/34] common/rc: revert recursive unmount in _clear_mount_stack Darrick J. Wong
` (29 subsequent siblings)
34 siblings, 0 replies; 68+ messages in thread
From: Darrick J. Wong @ 2025-02-04 21:23 UTC (permalink / raw)
To: zlang, djwong; +Cc: dchinner, fstests, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
If we're not running fsstress as the scrub exerciser, don't set
_FSSTRESS_PID because the _kill_fsstress call in the cleanup function
will think that it has to wait for a nonexistant fsstress process.
This fixes the problem of xfs/565 runtime increasing from 30s to 800s
because it tries to kill a nonexistent "565.fsstress" process and then
waits for the fsx loop control process, which hasn't been sent any
signals.
Cc: <fstests@vger.kernel.org> # v2024.12.08
Fixes: 8973af00ec212f ("fstests: cleanup fsstress process management")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
---
common/fuzzy | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/common/fuzzy b/common/fuzzy
index 534e91dedbbb43..0a2d91542b561e 100644
--- a/common/fuzzy
+++ b/common/fuzzy
@@ -1392,7 +1392,11 @@ _scratch_xfs_stress_scrub() {
"__stress_scrub_${exerciser}_loop" "$end" "$runningfile" \
"$remount_period" "$stress_tgt" &
- _FSSTRESS_PID=$!
+ # The loop is a background process, so _FSSTRESS_PID is set in that
+ # child. Unfortunately, this process doesn't know about it. Therefore
+ # we need to set _FSSTRESS_PID ourselves so that cleanup tries to kill
+ # fsstress.
+ test "${exerciser}" = "fsstress" && _FSSTRESS_PID=$!
if [ -n "$freeze" ]; then
__stress_scrub_freeze_loop "$end" "$runningfile" &
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 06/34] common/rc: revert recursive unmount in _clear_mount_stack
2025-02-04 21:22 [PATCHSET v2] fstests: random fixes for v2025.02.02 Darrick J. Wong
` (4 preceding siblings ...)
2025-02-04 21:23 ` [PATCH 05/34] fuzzy: do not set _FSSTRESS_PID when exercising fsx Darrick J. Wong
@ 2025-02-04 21:23 ` Darrick J. Wong
2025-02-05 0:07 ` Dave Chinner
2025-02-04 21:24 ` [PATCH 07/34] common/dump: don't replace pids arbitrarily Darrick J. Wong
` (28 subsequent siblings)
34 siblings, 1 reply; 68+ messages in thread
From: Darrick J. Wong @ 2025-02-04 21:23 UTC (permalink / raw)
To: zlang, djwong; +Cc: fstests, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Zorro complained about the following regression in generic/589 that I
can't reproduce here on Debian 12:
> --- /dev/fd/63 2025-01-08 19:53:49.621421512 -0500
> +++ generic/589.out.bad 2025-01-08 19:53:49.244099127 -0500
> @@ -30,6 +30,7 @@
> mpC SCRATCH_DEV
> mpC SCRATCH_DEV
> ======
> +umount: /mnt/xfstests/test/589-dst: not mounted
>
> It fails on generic/589 -> end_test -> _clear_mount_stack
>
> ...
> + end_test
> + _clear_mount_stack
> + '[' -n '/mnt/test/589-dst/201227_mpC /mnt/test/589-src/201227_mpA /mnt/test/589-dst /mnt/test/589-dst /mnt/test/589-src' ']'
> + _unmount -R /mnt/test/589-dst/201227_mpC /mnt/test/589-src/201227_mpA /mnt/test/589-dst /mnt/test/589-dst /mnt/test/589-src
> + local outlog=/tmp/201227.201227.umount
> + local errlog=/tmp/201227.201227.umount.err
> + rm -f /tmp/201227.201227.umount /tmp/201227.201227.umount.err
> + /usr/bin/umount -R /mnt/test/589-dst/201227_mpC /mnt/test/589-src/201227_mpA /mnt/test/589-dst /mnt/test/589-dst /mnt/test/589-src
> + local res=1
> + '[' -s /tmp/201227.201227.umount ']'
> + '[' -s /tmp/201227.201227.umount.err ']'
> + cat /tmp/201227.201227.umount.err
> + cat /tmp/201227.201227.umount.err
> umount: /mnt/test/589-dst: not mounted
>
> The _get_mount already save all mount points into MOUNTED_POINT_STACK,
> MOUNTED_POINT_STACK="/mnt/test/589-dst/201227_mpC /mnt/test/589-src/201227_mpA /mnt/test/589-dst /mnt/test/589-dst /mnt/test/589-src"
>
> '-/mnt/test /dev/sda5 xfs rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=64k,sunit=128,swidth=256,noquota
> |-/mnt/test/589-src /dev/sda6 xfs rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=64k,sunit=128,swidth=256,noquota
> | '-/mnt/test/589-src/223262_mpA /dev/sda6 xfs rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=64k,sunit=128,swidth=256,noquota
> '-/mnt/test/589-dst /dev/sda6 xfs rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=64k,sunit=128,swidth=256,noquota
> |-/mnt/test/589-dst /dev/sda6 xfs rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=64k,sunit=128,swidth=256,noquota
> | '-/mnt/test/589-dst/223262_mpC /dev/sda6 xfs rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=64k,sunit=128,swidth=256,noquota
> '-/mnt/test/589-dst/223262_mpC /dev/sda6 xfs rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=64k,sunit=128,swidth=256,noquota
>
> Orignal _clear_mount_stack trys to umount all of them. But Dave gave -R option to
> umount command, so
> `umount -R /mnt/test/589-dst/201227_mpC` and `umount -R /mnt/test/589-src/201227_mpA`
> already umount /mnt/test/589-dst and /mnt/test/589-src. recursively.
> Then `umount -R /mnt/test/589-dst` shows "not mounted".
I /think/ this is a result of commit 4c6bc4565105e6 performing this
"conversion" in _clear_mount_stack:
- $UMOUNT_PROG $MOUNTED_POINT_STACK
+ _unmount -R $MOUNTED_POINT_STACK
This is not a 1:1 conversion here -- previously there was no
-R(ecursive) unmount option, and now there is. It looks as though
umount parses /proc/self/mountinfo to figure out what to unmount, and
maybe on Zorro's system it balks if the argument passed is not present
in that file? Debian 12's umount doesn't care.
Regardless, there was no justification for this change in behavior that
was buried in what is otherwise a hoist patch, so revert it. The author
can resubmit the change with proper documentation.
Cc: <fstests@vger.kernel.org> # v2024.12.08
Fixes: 4c6bc4565105e6 ("fstests: clean up mount and unmount operations")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
common/rc | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/common/rc b/common/rc
index 4658e3b8be747f..07646927bad523 100644
--- a/common/rc
+++ b/common/rc
@@ -334,7 +334,7 @@ _put_mount()
_clear_mount_stack()
{
if [ -n "$MOUNTED_POINT_STACK" ]; then
- _unmount -R $MOUNTED_POINT_STACK
+ _unmount $MOUNTED_POINT_STACK
fi
MOUNTED_POINT_STACK=""
}
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 07/34] common/dump: don't replace pids arbitrarily
2025-02-04 21:22 [PATCHSET v2] fstests: random fixes for v2025.02.02 Darrick J. Wong
` (5 preceding siblings ...)
2025-02-04 21:23 ` [PATCH 06/34] common/rc: revert recursive unmount in _clear_mount_stack Darrick J. Wong
@ 2025-02-04 21:24 ` Darrick J. Wong
2025-02-05 0:09 ` Dave Chinner
2025-02-04 21:24 ` [PATCH 08/34] common/populate: correct the parent pointer name creation formulae Darrick J. Wong
` (27 subsequent siblings)
34 siblings, 1 reply; 68+ messages in thread
From: Darrick J. Wong @ 2025-02-04 21:24 UTC (permalink / raw)
To: zlang, djwong; +Cc: fstests, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
In the next patch we'll run tests in a pid namespace, which means that
the test process will have a very low pid. This low pid situation
causes problems with the dump tests because they unconditionally replace
it with the word "PID", which causes unnecessary test failures.
Initially I was going to fix it by bracketing the regexp with a
whitespace/punctuation/eol/sol detector, but then I decided to remove it
see how many sheep came barreling through. None did, so I removed it
entirely. The commit adding it (linked below) was not insightful at
all.
Fixes: 19beb54c96e363 ("Extra filtering as part of IRIX/Linux xfstests reconciliation for dump.")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
common/dump | 1 -
1 file changed, 1 deletion(-)
diff --git a/common/dump b/common/dump
index 3761c16100d808..6dcd6250c33147 100644
--- a/common/dump
+++ b/common/dump
@@ -907,7 +907,6 @@ _dir_filter()
-e "s#$SCRATCH_MNT#SCRATCH_MNT#g" \
-e "s#$dump_sdir#DUMP_SUBDIR#g" \
-e "s#$restore_sdir#RESTORE_SUBDIR#g" \
- -e "s#$$#PID#g" \
-e "/Only in SCRATCH_MNT: .use_space/d" \
-e "s#$RESULT_DIR/##g" \
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 08/34] common/populate: correct the parent pointer name creation formulae
2025-02-04 21:22 [PATCHSET v2] fstests: random fixes for v2025.02.02 Darrick J. Wong
` (6 preceding siblings ...)
2025-02-04 21:24 ` [PATCH 07/34] common/dump: don't replace pids arbitrarily Darrick J. Wong
@ 2025-02-04 21:24 ` Darrick J. Wong
2025-02-04 21:24 ` [PATCH 09/34] generic/759,760: fix MADV_COLLAPSE detection and inclusion Darrick J. Wong
` (26 subsequent siblings)
34 siblings, 0 replies; 68+ messages in thread
From: Darrick J. Wong @ 2025-02-04 21:24 UTC (permalink / raw)
To: zlang, djwong; +Cc: fstests, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
The formulae used to compute the number of parent pointers that we have
to create in a child file in order to generate a particular xattr
structure are not even close to correct -- the first one needs a bit of
adjustment, but the second one is way off and creates far too many
files.
Fix the computation, and document where the magic numbers come from.
Cc: <fstests@vger.kernel.org> # v2024.06.27
Fixes: 0c02207d61af9a ("populate: create hardlinks for parent pointers")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
common/populate | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/common/populate b/common/populate
index 4cf9c0691956a3..c907e04efd0ea9 100644
--- a/common/populate
+++ b/common/populate
@@ -473,13 +473,18 @@ _scratch_xfs_populate() {
__populate_create_dir "${SCRATCH_MNT}/PPTRS" 1 '' \
--hardlink --format "two_%d"
- # Create one xattr leaf block of parent pointers
- nr="$((blksz * 2 / 16))"
+ # Create one xattr leaf block of parent pointers. The name is
+ # 8 bytes and, the handle is 12 bytes, which rounds up to 24
+ # bytes per record, plus xattr structure overhead.
+ nr="$((blksz / 24))"
__populate_create_dir "${SCRATCH_MNT}/PPTRS" ${nr} '' \
--hardlink --format "many%04d"
- # Create multiple xattr leaf blocks of large parent pointers
- nr="$((blksz * 16 / 16))"
+ # Create multiple xattr leaf blocks of large parent pointers.
+ # The name is 256 bytes and the handle is 12 bytes, which
+ # rounds up to 272 bytes per record, plus xattr structure
+ # overhead.
+ nr="$((blksz * 2 / 272))"
__populate_create_dir "${SCRATCH_MNT}/PPTRS" ${nr} '' \
--hardlink --format "y%0254d"
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 09/34] generic/759,760: fix MADV_COLLAPSE detection and inclusion
2025-02-04 21:22 [PATCHSET v2] fstests: random fixes for v2025.02.02 Darrick J. Wong
` (7 preceding siblings ...)
2025-02-04 21:24 ` [PATCH 08/34] common/populate: correct the parent pointer name creation formulae Darrick J. Wong
@ 2025-02-04 21:24 ` Darrick J. Wong
2025-02-04 21:24 ` [PATCH 10/34] generic/759,760: skip test if we can't set up a hugepage for IO Darrick J. Wong
` (25 subsequent siblings)
34 siblings, 0 replies; 68+ messages in thread
From: Darrick J. Wong @ 2025-02-04 21:24 UTC (permalink / raw)
To: zlang, djwong; +Cc: joannelkoong, fstests, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
On systems with "old" C libraries such as glibc 2.36 in Debian 12, the
MADV_COLLAPSE flag might not be defined in any of the header files
pulled in by sys/mman.h, which means that the fsx binary might not get
built with any of the MADV_COLLAPSE code. If the kernel supports THP,
the test will fail with:
> QA output created by 760
> fsx -N 10000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W -h
> -fsx -N 10000 -o 8192 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W -h
> -fsx -N 10000 -o 128000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W -h
> +mapped writes DISABLED
> +MADV_COLLAPSE not supported. Can't support -h
Fix both tests to detect fsx binaries that don't support MADV_COLLAPSE,
then fix fsx.c to include the mman.h from the kernel headers (aka
linux/mman.h) so that we can actually test IOs to and from THPs if the
kernel is newer than the rest of userspace.
Cc: <fstests@vger.kernel.org> # v2025.02.02
Cc: joannelkoong@gmail.com
Fixes: 627289232371e3 ("generic: add tests for read/writes from hugepages-backed buffers")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
common/rc | 5 +++++
ltp/fsx.c | 1 +
tests/generic/759 | 1 +
tests/generic/760 | 1 +
4 files changed, 8 insertions(+)
diff --git a/common/rc b/common/rc
index 07646927bad523..b7736173e6e839 100644
--- a/common/rc
+++ b/common/rc
@@ -4976,6 +4976,11 @@ _get_page_size()
echo $(getconf PAGE_SIZE)
}
+_require_hugepage_fsx()
+{
+ $here/ltp/fsx -N 0 -h $TEST_DIR 2>&1 | grep -q 'MADV_COLLAPSE not supported' && \
+ _notrun "fsx binary does not support MADV_COLLAPSE"
+}
run_fsx()
{
diff --git a/ltp/fsx.c b/ltp/fsx.c
index 634c496ffe9317..cf9502a74c17a7 100644
--- a/ltp/fsx.c
+++ b/ltp/fsx.c
@@ -20,6 +20,7 @@
#include <strings.h>
#include <sys/file.h>
#include <sys/mman.h>
+#include <linux/mman.h>
#include <sys/uio.h>
#include <stdbool.h>
#ifdef HAVE_ERR_H
diff --git a/tests/generic/759 b/tests/generic/759
index 6c74478aa8a0e0..a7dec155056abc 100755
--- a/tests/generic/759
+++ b/tests/generic/759
@@ -13,6 +13,7 @@ _begin_fstest rw auto quick
_require_test
_require_thp
+_require_hugepage_fsx
run_fsx -N 10000 -l 500000 -h
run_fsx -N 10000 -o 8192 -l 500000 -h
diff --git a/tests/generic/760 b/tests/generic/760
index c71a196222ad3b..4781a8d1eec4ec 100755
--- a/tests/generic/760
+++ b/tests/generic/760
@@ -14,6 +14,7 @@ _begin_fstest rw auto quick
_require_test
_require_odirect
_require_thp
+_require_hugepage_fsx
psize=`$here/src/feature -s`
bsize=`$here/src/min_dio_alignment $TEST_DIR $TEST_DEV`
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 10/34] generic/759,760: skip test if we can't set up a hugepage for IO
2025-02-04 21:22 [PATCHSET v2] fstests: random fixes for v2025.02.02 Darrick J. Wong
` (8 preceding siblings ...)
2025-02-04 21:24 ` [PATCH 09/34] generic/759,760: fix MADV_COLLAPSE detection and inclusion Darrick J. Wong
@ 2025-02-04 21:24 ` Darrick J. Wong
2025-02-05 18:06 ` Darrick J. Wong
2025-02-04 21:25 ` [PATCH 11/34] common/rc: create a wrapper for the su command Darrick J. Wong
` (24 subsequent siblings)
34 siblings, 1 reply; 68+ messages in thread
From: Darrick J. Wong @ 2025-02-04 21:24 UTC (permalink / raw)
To: zlang, djwong; +Cc: joannelkoong, fstests, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
On an arm64 VM with 64k base pages and a paltry 8G of RAM, this test
will frequently fail like this:
> QA output created by 759
> fsx -N 10000 -l 500000 -h
> -fsx -N 10000 -o 8192 -l 500000 -h
> -fsx -N 10000 -o 128000 -l 500000 -h
> +Seed set to 1
> +madvise collapse for buf: Cannot allocate memory
> +init_hugepages_buf failed for good_buf: Cannot allocate memory
This system has a 512MB hugepage size, which means that there's a good
chance that memory is so fragmented that we won't be able to create a
huge page (in 1/16th the available DRAM). Create a _run_hugepage_fsx
helper that will detect this situation at the start of the test and skip
it, having refactored run_fsx into a properly namespaced version that
won't exit the test on failure.
Cc: <fstests@vger.kernel.org> # v2025.02.02
Cc: joannelkoong@gmail.com
Fixes: 627289232371e3 ("generic: add tests for read/writes from hugepages-backed buffers")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
common/rc | 34 ++++++++++++++++++++++++++++++----
ltp/fsx.c | 6 ++++--
tests/generic/759 | 6 +++---
tests/generic/760 | 6 +++---
4 files changed, 40 insertions(+), 12 deletions(-)
diff --git a/common/rc b/common/rc
index b7736173e6e839..4005db776309f3 100644
--- a/common/rc
+++ b/common/rc
@@ -4982,20 +4982,46 @@ _require_hugepage_fsx()
_notrun "fsx binary does not support MADV_COLLAPSE"
}
-run_fsx()
+_run_fsx()
{
- echo fsx $@
+ echo "fsx $*"
local args=`echo $@ | sed -e "s/ BSIZE / $bsize /g" -e "s/ PSIZE / $psize /g"`
set -- $here/ltp/fsx $args $FSX_AVOID $TEST_DIR/junk
echo "$@" >>$seqres.full
rm -f $TEST_DIR/junk
"$@" 2>&1 | tee -a $seqres.full >$tmp.fsx
- if [ ${PIPESTATUS[0]} -ne 0 ]; then
+ local res=${PIPESTATUS[0]}
+ if [ $res -ne 0 ]; then
cat $tmp.fsx
rm -f $tmp.fsx
- exit 1
+ return $res
fi
rm -f $tmp.fsx
+ return 0
+}
+
+# Run fsx with -h(ugepage buffers). If we can't set up a hugepage then skip
+# the test, but if any other error occurs then exit the test.
+_run_hugepage_fsx() {
+ _run_fsx "$@" -h &> $tmp.hugepage_fsx
+ local res=$?
+ if [ $res -eq 103 ]; then
+ # According to the MADV_COLLAPSE manpage, these three errors
+ # can happen if the kernel could not collapse a collection of
+ # pages into a single huge page.
+ grep -q -E ' for hugebuf: (Cannot allocate memory|Device or resource busy|Resource temporarily unavailable)' $tmp.hugepage_fsx && \
+ _notrun "Could not set up huge page for test"
+ fi
+ cat $tmp.hugepage_fsx
+ rm -f $tmp.hugepage_fsx
+ test $res -ne 0 && exit 1
+ return 0
+}
+
+# run fsx or exit the test
+run_fsx()
+{
+ _run_fsx || exit 1
}
_require_statx()
diff --git a/ltp/fsx.c b/ltp/fsx.c
index cf9502a74c17a7..d1b0f245582b31 100644
--- a/ltp/fsx.c
+++ b/ltp/fsx.c
@@ -2974,13 +2974,15 @@ init_hugepages_buf(unsigned len, int hugepage_size, int alignment, long *buf_siz
ret = posix_memalign(&buf, hugepage_size, size);
if (ret) {
- prterr("posix_memalign for buf");
+ /* common/rc greps this error message */
+ prterr("posix_memalign for hugebuf");
return NULL;
}
memset(buf, '\0', size);
ret = madvise(buf, size, MADV_COLLAPSE);
if (ret) {
- prterr("madvise collapse for buf");
+ /* common/rc greps this error message */
+ prterr("madvise collapse for hugebuf");
free(buf);
return NULL;
}
diff --git a/tests/generic/759 b/tests/generic/759
index a7dec155056abc..49c02214559a55 100755
--- a/tests/generic/759
+++ b/tests/generic/759
@@ -15,9 +15,9 @@ _require_test
_require_thp
_require_hugepage_fsx
-run_fsx -N 10000 -l 500000 -h
-run_fsx -N 10000 -o 8192 -l 500000 -h
-run_fsx -N 10000 -o 128000 -l 500000 -h
+_run_hugepage_fsx -N 10000 -l 500000
+_run_hugepage_fsx -N 10000 -o 8192 -l 500000
+_run_hugepage_fsx -N 10000 -o 128000 -l 500000
status=0
exit
diff --git a/tests/generic/760 b/tests/generic/760
index 4781a8d1eec4ec..f270636e56a377 100755
--- a/tests/generic/760
+++ b/tests/generic/760
@@ -19,9 +19,9 @@ _require_hugepage_fsx
psize=`$here/src/feature -s`
bsize=`$here/src/min_dio_alignment $TEST_DIR $TEST_DEV`
-run_fsx -N 10000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W -h
-run_fsx -N 10000 -o 8192 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W -h
-run_fsx -N 10000 -o 128000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W -h
+_run_hugepage_fsx -N 10000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W
+_run_hugepage_fsx -N 10000 -o 8192 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W
+_run_hugepage_fsx -N 10000 -o 128000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W
status=0
exit
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 11/34] common/rc: create a wrapper for the su command
2025-02-04 21:22 [PATCHSET v2] fstests: random fixes for v2025.02.02 Darrick J. Wong
` (9 preceding siblings ...)
2025-02-04 21:24 ` [PATCH 10/34] generic/759,760: skip test if we can't set up a hugepage for IO Darrick J. Wong
@ 2025-02-04 21:25 ` Darrick J. Wong
2025-02-05 0:14 ` Dave Chinner
2025-02-04 21:25 ` [PATCH 12/34] fuzzy: kill subprocesses with SIGPIPE, not SIGINT Darrick J. Wong
` (23 subsequent siblings)
34 siblings, 1 reply; 68+ messages in thread
From: Darrick J. Wong @ 2025-02-04 21:25 UTC (permalink / raw)
To: zlang, djwong; +Cc: fstests, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Create a _su wrapper around the su command so that the next patch can
fix all the pkill isolation code so that this test can only kill
processes started for this test.
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
common/rc | 9 +++++++--
tests/generic/093 | 2 +-
tests/generic/125 | 2 +-
tests/generic/128 | 2 +-
tests/generic/193 | 36 ++++++++++++++++++------------------
tests/generic/230 | 14 +++++++-------
tests/generic/231 | 2 +-
tests/generic/233 | 2 +-
tests/generic/270 | 4 ++--
tests/generic/314 | 2 +-
tests/generic/327 | 2 +-
tests/generic/328 | 4 ++--
tests/generic/355 | 2 +-
tests/generic/453 | 6 +++---
tests/generic/514 | 2 +-
tests/generic/573 | 2 +-
tests/generic/600 | 2 +-
tests/generic/601 | 2 +-
tests/generic/603 | 10 +++++-----
tests/generic/673 | 2 +-
tests/generic/674 | 2 +-
tests/generic/675 | 2 +-
tests/generic/680 | 2 +-
tests/generic/681 | 2 +-
tests/generic/682 | 2 +-
tests/generic/683 | 2 +-
tests/generic/684 | 2 +-
tests/generic/685 | 2 +-
tests/generic/686 | 2 +-
tests/generic/687 | 2 +-
tests/generic/688 | 2 +-
tests/generic/691 | 8 ++++----
tests/generic/721 | 10 +++++-----
tests/generic/726 | 2 +-
tests/generic/727 | 2 +-
tests/xfs/720 | 2 +-
tests/xfs/795 | 2 +-
tests/xfs/803 | 2 +-
38 files changed, 82 insertions(+), 77 deletions(-)
diff --git a/common/rc b/common/rc
index 4005db776309f3..9a6f7dce613e62 100644
--- a/common/rc
+++ b/common/rc
@@ -2726,6 +2726,11 @@ _require_user_exists()
[ "$?" == "0" ] || _notrun "$user user not defined."
}
+_su()
+{
+ su "$@"
+}
+
# check if a user exists and is able to execute commands.
# Uses 'fsgqa' user as default.
#
@@ -2736,7 +2741,7 @@ _require_user()
qa_user=$1
fi
_require_user_exists $qa_user
- echo /bin/true | su $qa_user
+ echo /bin/true | _su $qa_user
[ "$?" == "0" ] || _notrun "$qa_user cannot execute commands."
}
@@ -2798,7 +2803,7 @@ s,^\s*$,,;
_user_do()
{
- echo $1 | su -s /bin/bash $qa_user 2>&1 | _filter_user_do
+ echo $1 | _su -s /bin/bash $qa_user 2>&1 | _filter_user_do
}
_require_xfs_io_command()
diff --git a/tests/generic/093 b/tests/generic/093
index c4e866da1c18eb..047cc821e0e608 100755
--- a/tests/generic/093
+++ b/tests/generic/093
@@ -62,7 +62,7 @@ chmod ugo+w $TEST_DIR
# don't use $here/src/writemod, as we're running it as a regular user, and
# $here may contain path component that a regular user doesn't have search
# permission
-su $qa_user -c "src/writemod $file" | filefilter
+_su $qa_user -c "src/writemod $file" | filefilter
cat $file
# success, all done
diff --git a/tests/generic/125 b/tests/generic/125
index e2bc5fa125da6e..011a51e8fa21c2 100755
--- a/tests/generic/125
+++ b/tests/generic/125
@@ -32,7 +32,7 @@ chmod a+rw $TESTFILE
# don't use $here/src/ftrunc, as we're running it as a regular user, and $here
# may contain path component that a regular user doesn't have search permission
-su $qa_user -c "./src/ftrunc -f $TESTFILE"
+_su $qa_user -c "./src/ftrunc -f $TESTFILE"
if [ "$?" != "0" ]; then
echo src/ftrunc returned non 0 status!
diff --git a/tests/generic/128 b/tests/generic/128
index f931ca0639687b..5a497cdea7382b 100755
--- a/tests/generic/128
+++ b/tests/generic/128
@@ -26,7 +26,7 @@ cp "$(type -P ls)" $SCRATCH_MNT
chmod 700 $SCRATCH_MNT/nosuid
chmod 4755 $SCRATCH_MNT/ls
-su -s/bin/bash - $qa_user -c "$SCRATCH_MNT/ls $SCRATCH_MNT/nosuid >/dev/null 2>&1"
+_su -s/bin/bash - $qa_user -c "$SCRATCH_MNT/ls $SCRATCH_MNT/nosuid >/dev/null 2>&1"
if [ $? -eq 0 ] ; then
echo "Error: we shouldn't be able to ls the directory"
fi
diff --git a/tests/generic/193 b/tests/generic/193
index d251d3a5c4f8ee..ba557428f77b9c 100755
--- a/tests/generic/193
+++ b/tests/generic/193
@@ -69,17 +69,17 @@ echo
_create_files
echo "user: chown root owned file to qa_user (should fail)"
-su ${qa_user} -c "chown ${qa_user} $test_root" 2>&1 | _filter_files
+_su ${qa_user} -c "chown ${qa_user} $test_root" 2>&1 | _filter_files
echo "user: chown root owned file to root (should fail)"
-su ${qa_user} -c "chown root $test_root" 2>&1 | _filter_files
+_su ${qa_user} -c "chown root $test_root" 2>&1 | _filter_files
echo "user: chown qa_user owned file to qa_user (should succeed)"
-su ${qa_user} -c "chown ${qa_user} $test_user"
+_su ${qa_user} -c "chown ${qa_user} $test_user"
# this would work without _POSIX_CHOWN_RESTRICTED
echo "user: chown qa_user owned file to root (should fail)"
-su ${qa_user} -c "chown root $test_user" 2>&1 | _filter_files
+_su ${qa_user} -c "chown root $test_user" 2>&1 | _filter_files
_cleanup_files
@@ -93,19 +93,19 @@ echo
_create_files
echo "user: chgrp root owned file to root (should fail)"
-su ${qa_user} -c "chgrp root $test_root" 2>&1 | _filter_files
+_su ${qa_user} -c "chgrp root $test_root" 2>&1 | _filter_files
echo "user: chgrp qa_user owned file to root (should fail)"
-su ${qa_user} -c "chgrp root $test_user" 2>&1 | _filter_files
+_su ${qa_user} -c "chgrp root $test_user" 2>&1 | _filter_files
echo "user: chgrp root owned file to qa_user (should fail)"
-su ${qa_user} -c "chgrp ${qa_user} $test_root" 2>&1 | _filter_files
+_su ${qa_user} -c "chgrp ${qa_user} $test_root" 2>&1 | _filter_files
echo "user: chgrp qa_user owned file to qa_user (should succeed)"
-su ${qa_user} -c "chgrp ${qa_user} $test_user"
+_su ${qa_user} -c "chgrp ${qa_user} $test_user"
#echo "user: chgrp qa_user owned file to secondary group (should succeed)"
-#su ${qa_user} -c "chgrp ${group2} $test_user"
+#_su ${qa_user} -c "chgrp ${group2} $test_user"
_cleanup_files
@@ -119,10 +119,10 @@ echo
_create_files
echo "user: chmod a+r on qa_user owned file (should succeed)"
-su ${qa_user} -c "chmod a+r $test_user"
+_su ${qa_user} -c "chmod a+r $test_user"
echo "user: chmod a+r on root owned file (should fail)"
-su ${qa_user} -c "chmod a+r $test_root" 2>&1 | _filter_files
+_su ${qa_user} -c "chmod a+r $test_root" 2>&1 | _filter_files
#
# Setup a file owned by the qa_user, but with a group ID that
@@ -143,7 +143,7 @@ chown ${qa_user}:root $test_user
chmod g+s $test_user
# and let the qa_user change permission bits
-su ${qa_user} -c "chmod a+w $test_user"
+_su ${qa_user} -c "chmod a+w $test_user"
stat -c '%A' $test_user
#
@@ -220,7 +220,7 @@ echo "with no exec perm"
echo frobnozzle >> $test_user
chmod ug+s $test_user
echo -n "before: "; stat -c '%A' $test_user
-su ${qa_user} -c "echo > $test_user"
+_su ${qa_user} -c "echo > $test_user"
echo -n "after: "; stat -c '%A' $test_user
echo "with user exec perm"
@@ -228,7 +228,7 @@ echo frobnozzle >> $test_user
chmod ug+s $test_user
chmod u+x $test_user
echo -n "before: "; stat -c '%A' $test_user
-su ${qa_user} -c "echo > $test_user"
+_su ${qa_user} -c "echo > $test_user"
echo -n "after: "; stat -c '%A' $test_user
echo "with group exec perm"
@@ -237,7 +237,7 @@ chmod ug+s $test_user
chmod g+x $test_user
chmod u-x $test_user
echo -n "before: "; stat -c '%A' $test_user
-su ${qa_user} -c "echo > $test_user"
+_su ${qa_user} -c "echo > $test_user"
echo -n "after: "; stat -c '%A' $test_user
echo "with user+group exec perm"
@@ -245,7 +245,7 @@ echo frobnozzle >> $test_user
chmod ug+s $test_user
chmod ug+x $test_user
echo -n "before: "; stat -c '%A' $test_user
-su ${qa_user} -c "echo > $test_user"
+_su ${qa_user} -c "echo > $test_user"
echo -n "after: "; stat -c '%A' $test_user
#
@@ -258,10 +258,10 @@ echo
_create_files
echo "user: touch qa_user file (should succeed)"
-su ${qa_user} -c "touch $test_user"
+_su ${qa_user} -c "touch $test_user"
echo "user: touch root file (should fail)"
-su ${qa_user} -c "touch $test_root" 2>&1 | _filter_files
+_su ${qa_user} -c "touch $test_root" 2>&1 | _filter_files
_cleanup_files
diff --git a/tests/generic/230 b/tests/generic/230
index ba95fbe725ad28..18e20f4b2e9439 100755
--- a/tests/generic/230
+++ b/tests/generic/230
@@ -33,13 +33,13 @@ test_enforcement()
echo "--- initiating IO..." >>$seqres.full
# Firstly fit below block soft limit
echo "Write 225 blocks..."
- su $qa_user -c "$XFS_IO_PROG -c 'pwrite 0 $((225 * $BLOCK_SIZE))' -c fsync \
+ _su $qa_user -c "$XFS_IO_PROG -c 'pwrite 0 $((225 * $BLOCK_SIZE))' -c fsync \
$SCRATCH_MNT/file1" 2>&1 >>$seqres.full | \
_filter_xfs_io_error | tee -a $seqres.full
repquota -$type $SCRATCH_MNT | grep -v "^root" >>$seqres.full 2>&1
# Secondly overcome block soft limit
echo "Rewrite 250 blocks plus 1 byte..."
- su $qa_user -c "$XFS_IO_PROG -c 'pwrite 0 $((250 * $BLOCK_SIZE + 1))' -c fsync \
+ _su $qa_user -c "$XFS_IO_PROG -c 'pwrite 0 $((250 * $BLOCK_SIZE + 1))' -c fsync \
$SCRATCH_MNT/file1" 2>&1 >>$seqres.full | \
_filter_xfs_io_error | tee -a $seqres.full
repquota -$type $SCRATCH_MNT | grep -v "^root" >>$seqres.full 2>&1
@@ -49,14 +49,14 @@ test_enforcement()
# scratch device that may leed to grace time exceed.
setquota -$type $qa_user -T $grace $grace $SCRATCH_MNT 2>/dev/null
echo "Write 250 blocks..."
- su $qa_user -c "$XFS_IO_PROG -c 'pwrite 0 $((250 * $BLOCK_SIZE))' -c fsync \
+ _su $qa_user -c "$XFS_IO_PROG -c 'pwrite 0 $((250 * $BLOCK_SIZE))' -c fsync \
$SCRATCH_MNT/file2" 2>&1 >>$seqres.full | \
_filter_xfs_io_error | tee -a $seqres.full
repquota -$type $SCRATCH_MNT | grep -v "^root" >>$seqres.full 2>&1
# Now sleep for grace time and check that softlimit got enforced
sleep $((grace+1))
echo "Write 1 block..."
- su $qa_user -c "$XFS_IO_PROG -c 'truncate 0' -c 'pwrite 0 $BLOCK_SIZE' \
+ _su $qa_user -c "$XFS_IO_PROG -c 'truncate 0' -c 'pwrite 0 $BLOCK_SIZE' \
$SCRATCH_MNT/file2" 2>&1 >>$seqres.full | \
_filter_xfs_io_error | tee -a $seqres.full
repquota -$type $SCRATCH_MNT | grep -v "^root" >>$seqres.full 2>&1
@@ -65,7 +65,7 @@ test_enforcement()
# space reservations on XFS
setquota -$type $qa_user 0 0 3 5 $SCRATCH_MNT
echo "Touch 3+4"
- su $qa_user -c "touch $SCRATCH_MNT/file3 $SCRATCH_MNT/file4" \
+ _su $qa_user -c "touch $SCRATCH_MNT/file3 $SCRATCH_MNT/file4" \
2>&1 >>$seqres.full | _filter_scratch | tee -a $seqres.full
repquota -$type $SCRATCH_MNT | grep -v "^root" >>$seqres.full 2>&1
# Try to exceed inode hardlimit
@@ -74,14 +74,14 @@ test_enforcement()
# scratch device that may leed to grace time exceed.
setquota -$type $qa_user -T $grace $grace $SCRATCH_MNT 2>/dev/null
echo "Touch 5+6"
- su $qa_user -c "touch $SCRATCH_MNT/file5 $SCRATCH_MNT/file6" \
+ _su $qa_user -c "touch $SCRATCH_MNT/file5 $SCRATCH_MNT/file6" \
2>&1 >>$seqres.full | _filter_scratch | tee -a $seqres.full
repquota -$type $SCRATCH_MNT | grep -v "^root" >>$seqres.full 2>&1
# Wait and check grace time enforcement
rm -f $SCRATCH_MNT/file5 >>$seqres.full 2>&1
sleep $((grace+1))
echo "Touch 5"
- su $qa_user -c "touch $SCRATCH_MNT/file5" 2>&1 >>$seqres.full |
+ _su $qa_user -c "touch $SCRATCH_MNT/file5" 2>&1 >>$seqres.full |
_filter_scratch | tee -a $seqres.full
repquota -$type $SCRATCH_MNT | grep -v "^root" >>$seqres.full 2>&1
echo "--- completed IO ($type)" >>$seqres.full
diff --git a/tests/generic/231 b/tests/generic/231
index ef3ea45d45d736..8dda926d875e88 100755
--- a/tests/generic/231
+++ b/tests/generic/231
@@ -24,7 +24,7 @@ _fsx()
for (( i = 1; i <= $tasks; i++ )); do
SEED=$RANDOM
echo "ltp/fsx $FSX_ARGS -S $SEED $SCRATCH_MNT/fsx_file$i" >>$seqres.full
- su $qa_user -c "ltp/fsx $FSX_ARGS -S $SEED \
+ _su $qa_user -c "ltp/fsx $FSX_ARGS -S $SEED \
$FSX_AVOID $SCRATCH_MNT/fsx_file$i" >$tmp.output$i 2>&1 &
done
diff --git a/tests/generic/233 b/tests/generic/233
index b4c804ff217fbb..df67b39092b7cb 100755
--- a/tests/generic/233
+++ b/tests/generic/233
@@ -43,7 +43,7 @@ _fsstress()
# temporarily.
ulimit -l unlimited
echo "fsstress $args" >> $seqres.full
- if ! su $qa_user -c "$FSSTRESS_PROG $args" | tee -a $seqres.full | _filter_num
+ if ! _su $qa_user -c "$FSSTRESS_PROG $args" | tee -a $seqres.full | _filter_num
then
echo " fsstress $args returned $?"
cat $tmp.out | tee -a $seqres.full
diff --git a/tests/generic/270 b/tests/generic/270
index 342ac8b5d3d963..d74971bb535239 100755
--- a/tests/generic/270
+++ b/tests/generic/270
@@ -37,14 +37,14 @@ _workout()
# io_uring_queue_init fail on ENOMEM, set max locked memory to unlimited
# temporarily.
ulimit -l unlimited
- su $qa_user -c "$_FSSTRESS_PROG $args" > /dev/null 2>&1 &
+ _su $qa_user -c "$_FSSTRESS_PROG $args" > /dev/null 2>&1 &
_FSSTRESS_PID=$!
echo "Run dd writers in parallel"
for ((i=0; i < num_iterations; i++))
do
# File will be opened with O_TRUNC each time
- su $qa_user -c "dd if=/dev/zero \
+ _su $qa_user -c "dd if=/dev/zero \
of=$SCRATCH_MNT/SPACE_CONSUMER bs=1M " \
>> $seqres.full 2>&1
sleep $enospc_time
diff --git a/tests/generic/314 b/tests/generic/314
index 5fbc6424de3ab4..65f7f9d9619f2e 100755
--- a/tests/generic/314
+++ b/tests/generic/314
@@ -27,7 +27,7 @@ chown $qa_user:12345 $TEST_DIR/$seq-dir
chmod 2775 $TEST_DIR/$seq-dir
# Make subdir
-su $qa_user -c "umask 022; mkdir $TEST_DIR/$seq-dir/subdir"
+_su $qa_user -c "umask 022; mkdir $TEST_DIR/$seq-dir/subdir"
# Subdir should have inherited sgid
_ls_l $TEST_DIR/$seq-dir/ | grep -v total | _filter_test_dir | awk '{print $1,$NF}'
diff --git a/tests/generic/327 b/tests/generic/327
index 2323e1e6a12ec8..18cfcd1f655bd7 100755
--- a/tests/generic/327
+++ b/tests/generic/327
@@ -47,7 +47,7 @@ _report_quota_blocks $SCRATCH_MNT
echo "Try to reflink again"
touch $testdir/file3
chown $qa_user $testdir/file3
-su $qa_user -c "cp --reflink=always -f $testdir/file1 $testdir/file3" 2>&1 | _filter_scratch
+_su $qa_user -c "cp --reflink=always -f $testdir/file1 $testdir/file3" 2>&1 | _filter_scratch
_report_quota_blocks $SCRATCH_MNT
# success, all done
diff --git a/tests/generic/328 b/tests/generic/328
index 7071ded259ddb3..fa33bdb78dba12 100755
--- a/tests/generic/328
+++ b/tests/generic/328
@@ -47,12 +47,12 @@ setquota -u $qa_user 0 1024 0 0 $SCRATCH_MNT
_report_quota_blocks $SCRATCH_MNT
echo "Try to dio write the whole file"
-su $qa_user -c '$XFS_IO_PROG -d -c "pwrite 0 '$((sz+65536))'" '$testdir'/file1' 2>&1 >> $seqres.full | \
+_su $qa_user -c '$XFS_IO_PROG -d -c "pwrite 0 '$((sz+65536))'" '$testdir'/file1' 2>&1 >> $seqres.full | \
_filter_xfs_io_error
_report_quota_blocks $SCRATCH_MNT
echo "Try to write the whole file"
-su $qa_user -c '$XFS_IO_PROG -c "pwrite 0 '$((sz+65536))'" '$testdir'/file1' 2>&1 >> $seqres.full | \
+_su $qa_user -c '$XFS_IO_PROG -c "pwrite 0 '$((sz+65536))'" '$testdir'/file1' 2>&1 >> $seqres.full | \
_filter_xfs_io_error
_report_quota_blocks $SCRATCH_MNT
diff --git a/tests/generic/355 b/tests/generic/355
index b0f8fc45d2b5f4..6b4f7ebae86405 100755
--- a/tests/generic/355
+++ b/tests/generic/355
@@ -22,7 +22,7 @@ rm -f $testfile
do_io()
{
- su $qa_user -c "$XFS_IO_PROG -d -c 'pwrite 0 4k' $testfile" \
+ _su $qa_user -c "$XFS_IO_PROG -d -c 'pwrite 0 4k' $testfile" \
>>$seqres.full
}
diff --git a/tests/generic/453 b/tests/generic/453
index b7e686f37100da..04945ad1085b2d 100755
--- a/tests/generic/453
+++ b/tests/generic/453
@@ -196,7 +196,7 @@ setf "job offer\xef\xb9\x92pdf" "small full stop"
setf "job offer\xef\xbc\x8epdf" "fullwidth full stop"
setf "job offer\xdc\x81pdf" "syriac supralinear full stop"
setf "job offer\xdc\x82pdf" "syriac sublinear full stop"
-setf "job offer\xea\x93\xb8pdf" "lisu letter tone mya ti"
+setf "job offer\xea\x93\xb8pdf" "li_su letter tone mya ti"
setf "job offer.pdf" "actual period"
# encoding hidden tag characters in filenames to create confusing names
@@ -270,7 +270,7 @@ testf "job offer\xef\xb9\x92pdf" "small full stop"
testf "job offer\xef\xbc\x8epdf" "fullwidth full stop"
testf "job offer\xdc\x81pdf" "syriac supralinear full stop"
testf "job offer\xdc\x82pdf" "syriac sublinear full stop"
-testf "job offer\xea\x93\xb8pdf" "lisu letter tone mya ti"
+testf "job offer\xea\x93\xb8pdf" "li_su letter tone mya ti"
testf "job offer.pdf" "actual period"
testf "llamapirate\xf3\xa0\x80\x81\xf3\xa0\x81\x94\xf3\xa0\x81\xa8\xf3\xa0\x81\xa5\xf3\xa0\x80\xa0\xf3\xa0\x81\xb3\xf3\xa0\x81\xa1\xf3\xa0\x81\xac\xf3\xa0\x81\xa5\xf3\xa0\x81\xb3\xf3\xa0\x80\xa0\xf3\xa0\x81\xa6\xf3\xa0\x81\xaf\xf3\xa0\x81\xb2\xf3\xa0\x80\xa0\xf3\xa0\x81\x93\xf3\xa0\x81\xa5\xf3\xa0\x81\xa1\xf3\xa0\x81\xb4\xf3\xa0\x81\xb4\xf3\xa0\x81\xac\xf3\xa0\x81\xa5\xf3\xa0\x80\xa0\xf3\xa0\x81\xb7\xf3\xa0\x81\xa5\xf3\xa0\x81\xb2\xf3\xa0\x81\xa5\xf3\xa0\x80\xa0\xf3\xa0\x81\x95\xf3\xa0\x81\x93\xf3\xa0\x81\x84\xf3\xa0\x80\xa0\xf3\xa0\x80\xb1\xf3\xa0\x80\xb2\xf3\xa0\x80\xb0\xf3\xa0\x80\xb0\xf3\xa0\x80\xb0\xf3\xa0\x80\xb0\xf3\xa0\x81\xbf"
@@ -307,7 +307,7 @@ if _check_xfs_scrub_does_unicode "$SCRATCH_MNT" "$SCRATCH_DEV"; then
grep -q "job offer.xef.xbc.x8epdf" $tmp.scrub || echo "No complaints about fullwidth full stop?"
grep -q "job offer.xdc.x81pdf" $tmp.scrub || echo "No complaints about syriac supralinear full stop?"
grep -q "job offer.xdc.x82pdf" $tmp.scrub || echo "No complaints about syriac sublinear full stop?"
- grep -q "job offer.xea.x93.xb8pdf" $tmp.scrub || echo "No complaints about lisu letter tone mya ti?"
+ grep -q "job offer.xea.x93.xb8pdf" $tmp.scrub || echo "No complaints about li_su letter tone mya ti?"
grep -q "job offer.*could be confused with" $tmp.scrub || echo "No complaints about confusing job offers?"
grep -q "job offer.xe2.x80.xa4.xe2.x80.x8dpdf" $tmp.scrub || echo "No complaints about one dot leader with invisible space?"
grep -q "llamapirate" $tmp.scrub || echo "No complaints about hidden llm instructions in filenames?"
diff --git a/tests/generic/514 b/tests/generic/514
index 7f3d9c16cb70a4..a2086a255c77c6 100755
--- a/tests/generic/514
+++ b/tests/generic/514
@@ -21,7 +21,7 @@ _scratch_mount
chmod a+rwx $SCRATCH_MNT
$XFS_IO_PROG -f -c "pwrite -S 0x18 0 1m" $SCRATCH_MNT/foo >>$seqres.full
-su -s/bin/bash - $qa_user -c "ulimit -f 64 ; $XFS_IO_PROG -f -c \"reflink $SCRATCH_MNT/foo\" $SCRATCH_MNT/bar" >> $seqres.full 2>&1
+_su -s/bin/bash - $qa_user -c "ulimit -f 64 ; $XFS_IO_PROG -f -c \"reflink $SCRATCH_MNT/foo\" $SCRATCH_MNT/bar" >> $seqres.full 2>&1
sz="$(_get_filesize $SCRATCH_MNT/bar)"
if [ "$sz" -ne 0 ] && [ "$sz" -ne 65536 ]; then
diff --git a/tests/generic/573 b/tests/generic/573
index b310fccbddda56..d3f3296cb6bafa 100755
--- a/tests/generic/573
+++ b/tests/generic/573
@@ -56,7 +56,7 @@ $CHATTR_PROG -i $fsv_file
_fsv_scratch_begin_subtest "FS_IOC_MEASURE_VERITY doesn't require root"
_fsv_create_enable_file $fsv_file >> $seqres.full
chmod 444 $fsv_file
-su $qa_user -c "$FSVERITY_PROG measure $fsv_file" >> $seqres.full
+_su $qa_user -c "$FSVERITY_PROG measure $fsv_file" >> $seqres.full
# success, all done
status=0
diff --git a/tests/generic/600 b/tests/generic/600
index 43f75376a10efc..31c832251ebb6f 100755
--- a/tests/generic/600
+++ b/tests/generic/600
@@ -33,7 +33,7 @@ setquota -t -u 0 1 $SCRATCH_MNT
# Soft inode limit 1, hard limit 5
setquota -u $qa_user 0 0 1 5 $SCRATCH_MNT
# Run qa user over soft limit and go over grace period
-su $qa_user -c "touch $SCRATCH_MNT/file1 $SCRATCH_MNT/file2"
+_su $qa_user -c "touch $SCRATCH_MNT/file1 $SCRATCH_MNT/file2"
sleep 3
# Extend grace to now + 100s
now=`date +%s`
diff --git a/tests/generic/601 b/tests/generic/601
index 78b6a4aaa13748..320ac0c758b766 100755
--- a/tests/generic/601
+++ b/tests/generic/601
@@ -42,7 +42,7 @@ $XFS_QUOTA_PROG -x -c "timer -u -i -d 1" $SCRATCH_MNT
# Soft inode limit 1, hard limit 5
$XFS_QUOTA_PROG -x -c "limit -u isoft=1 ihard=5 $qa_user" $SCRATCH_MNT
# Run qa user over soft limit and go over grace period
-su $qa_user -c "touch $SCRATCH_MNT/file1 $SCRATCH_MNT/file2"
+_su $qa_user -c "touch $SCRATCH_MNT/file1 $SCRATCH_MNT/file2"
sleep 3
# Extend grace to now + 100s
now=`date +%s`
diff --git a/tests/generic/603 b/tests/generic/603
index 2a75cf9e022750..b199f801a8f03f 100755
--- a/tests/generic/603
+++ b/tests/generic/603
@@ -66,13 +66,13 @@ test_grace()
echo "--- Test block quota ---"
# Firstly fit below block soft limit
echo "Write 225 blocks..."
- su $qa_user -c "$XFS_IO_PROG -c 'pwrite 0 $((225 * $BLOCK_SIZE))' \
+ _su $qa_user -c "$XFS_IO_PROG -c 'pwrite 0 $((225 * $BLOCK_SIZE))' \
-c fsync $dir/file1" 2>&1 >>$seqres.full | \
_filter_xfs_io_error | tee -a $seqres.full
repquota -v -$type $SCRATCH_MNT | grep -v "^root" >>$seqres.full 2>&1
# Secondly overcome block soft limit
echo "Rewrite 250 blocks plus 1 byte, over the block softlimit..."
- su $qa_user -c "$XFS_IO_PROG -c 'pwrite 0 $((250 * $BLOCK_SIZE + 1))' \
+ _su $qa_user -c "$XFS_IO_PROG -c 'pwrite 0 $((250 * $BLOCK_SIZE + 1))' \
-c fsync $dir/file1" 2>&1 >>$seqres.full | \
_filter_xfs_io_error | tee -a $seqres.full
repquota -v -$type $SCRATCH_MNT | grep -v "^root" >>$seqres.full 2>&1
@@ -81,7 +81,7 @@ test_grace()
# Now sleep enough grace time and check that softlimit got enforced
sleep $((bgrace + 1))
echo "Try to write 1 one more block after grace..."
- su $qa_user -c "$XFS_IO_PROG -c 'truncate 0' -c 'pwrite 0 $BLOCK_SIZE' \
+ _su $qa_user -c "$XFS_IO_PROG -c 'truncate 0' -c 'pwrite 0 $BLOCK_SIZE' \
$dir/file2" 2>&1 >>$seqres.full | _filter_xfs_io_error | \
filter_enospc_edquot $type | tee -a $seqres.full
repquota -v -$type $SCRATCH_MNT | grep -v "^root" >>$seqres.full 2>&1
@@ -91,7 +91,7 @@ test_grace()
# space reservations on XFS
setquota -$type $qa_user 0 0 3 100 $SCRATCH_MNT
echo "Create 2 more files, over the inode softlimit..."
- su $qa_user -c "touch $dir/file3 $dir/file4" 2>&1 >>$seqres.full | \
+ _su $qa_user -c "touch $dir/file3 $dir/file4" 2>&1 >>$seqres.full | \
_filter_scratch | tee -a $seqres.full
repquota -v -$type $SCRATCH_MNT | grep -v "^root" >>$seqres.full 2>&1
# Reset grace time here, make below grace time test more accurate
@@ -99,7 +99,7 @@ test_grace()
# Wait and check grace time enforcement
sleep $((igrace+1))
echo "Try to create one more inode after grace..."
- su $qa_user -c "touch $dir/file5" 2>&1 >>$seqres.full | \
+ _su $qa_user -c "touch $dir/file5" 2>&1 >>$seqres.full | \
filter_enospc_edquot $type | _filter_scratch | \
tee -a $seqres.full
repquota -v -$type $SCRATCH_MNT | grep -v "^root" >>$seqres.full 2>&1
diff --git a/tests/generic/673 b/tests/generic/673
index 8f6def9c78881a..6c54ade81f0cec 100755
--- a/tests/generic/673
+++ b/tests/generic/673
@@ -39,7 +39,7 @@ commit_and_check() {
local cmd="$XFS_IO_PROG -c 'reflink $SCRATCH_MNT/b 0 0 1m' $SCRATCH_MNT/a"
if [ -n "$user" ]; then
- su - "$user" -c "$cmd" >> $seqres.full
+ _su - "$user" -c "$cmd" >> $seqres.full
else
$SHELL -c "$cmd" >> $seqres.full
fi
diff --git a/tests/generic/674 b/tests/generic/674
index 1b711f27f39ed1..41fbdeb7d9eb17 100755
--- a/tests/generic/674
+++ b/tests/generic/674
@@ -42,7 +42,7 @@ commit_and_check() {
local cmd="$XFS_IO_PROG -c 'dedupe $SCRATCH_MNT/b 0 0 1m' $SCRATCH_MNT/a"
if [ -n "$user" ]; then
- su - "$user" -c "$cmd" >> $seqres.full
+ _su - "$user" -c "$cmd" >> $seqres.full
else
$SHELL -c "$cmd" >> $seqres.full
fi
diff --git a/tests/generic/675 b/tests/generic/675
index e66de84b546a25..87dfbdfe278dd2 100755
--- a/tests/generic/675
+++ b/tests/generic/675
@@ -44,7 +44,7 @@ commit_and_check() {
local cmd="$XFS_IO_PROG -c 'reflink $SCRATCH_MNT/b 0 0 1m' $SCRATCH_MNT/a"
if [ -n "$user" ]; then
- su - "$user" -c "$cmd" >> $seqres.full
+ _su - "$user" -c "$cmd" >> $seqres.full
else
$SHELL -c "$cmd" >> $seqres.full
fi
diff --git a/tests/generic/680 b/tests/generic/680
index 07048db5cc39a9..1a418fa3b61b0b 100755
--- a/tests/generic/680
+++ b/tests/generic/680
@@ -38,7 +38,7 @@ chmod 0644 $localfile
cp $here/src/splice2pipe $tmp.splice2pipe
# Test unprivileged user's privilege escalation
echo "Test unprivileged user:"
-su ${qa_user} -c "$tmp.splice2pipe $localfile 1 AAAAAAAABBBBBBBB"
+_su ${qa_user} -c "$tmp.splice2pipe $localfile 1 AAAAAAAABBBBBBBB"
_hexdump $localfile
# success, all done
diff --git a/tests/generic/681 b/tests/generic/681
index aef54205d26f3a..dc4252013fc058 100755
--- a/tests/generic/681
+++ b/tests/generic/681
@@ -55,7 +55,7 @@ ls -sld $scratchdir >> $seqres.full
echo "fail quota" >> $seqres.full
for ((i = 0; i < dirents; i++)); do
name=$(printf "y%0254d" $i)
- su - "$qa_user" -c "ln $scratchfile $scratchdir/$name" 2>&1 | \
+ _su - "$qa_user" -c "ln $scratchfile $scratchdir/$name" 2>&1 | \
_filter_scratch | sed -e 's/y[0-9]*/yXXX/g'
test "${PIPESTATUS[0]}" -ne 0 && break
done
diff --git a/tests/generic/682 b/tests/generic/682
index 3572af173cbe63..6914a549dc0975 100755
--- a/tests/generic/682
+++ b/tests/generic/682
@@ -65,7 +65,7 @@ echo "fail quota" >> $seqres.full
for ((i = 0; i < dirents; i++)); do
name=$(printf "y%0254d" $i)
ln $scratchfile $stagedir/$name
- su - "$qa_user" -c "mv $stagedir/$name $scratchdir/$name" 2>&1 | \
+ _su - "$qa_user" -c "mv $stagedir/$name $scratchdir/$name" 2>&1 | \
_filter_scratch | _filter_mv_output
test "${PIPESTATUS[0]}" -ne 0 && break
done
diff --git a/tests/generic/683 b/tests/generic/683
index cc9a9786bde4bf..883905da5f9082 100755
--- a/tests/generic/683
+++ b/tests/generic/683
@@ -49,7 +49,7 @@ commit_and_check() {
local cmd="$XFS_IO_PROG -c '$command $start $end' $junk_file"
if [ -n "$user" ]; then
- su - "$user" -c "$cmd" >> $seqres.full
+ _su - "$user" -c "$cmd" >> $seqres.full
else
$SHELL -c "$cmd" >> $seqres.full
fi
diff --git a/tests/generic/684 b/tests/generic/684
index 2ca036fe518050..9cdfe4ab4f4463 100755
--- a/tests/generic/684
+++ b/tests/generic/684
@@ -49,7 +49,7 @@ commit_and_check() {
local cmd="$XFS_IO_PROG -c '$command $start $end' $junk_file"
if [ -n "$user" ]; then
- su - "$user" -c "$cmd" >> $seqres.full
+ _su - "$user" -c "$cmd" >> $seqres.full
else
$SHELL -c "$cmd" >> $seqres.full
fi
diff --git a/tests/generic/685 b/tests/generic/685
index de07a798a68594..5567255032e1c7 100755
--- a/tests/generic/685
+++ b/tests/generic/685
@@ -49,7 +49,7 @@ commit_and_check() {
local cmd="$XFS_IO_PROG -c '$command $start $end' $junk_file"
if [ -n "$user" ]; then
- su - "$user" -c "$cmd" >> $seqres.full
+ _su - "$user" -c "$cmd" >> $seqres.full
else
$SHELL -c "$cmd" >> $seqres.full
fi
diff --git a/tests/generic/686 b/tests/generic/686
index fc6761fe61a91e..a3fa8e060aeab6 100755
--- a/tests/generic/686
+++ b/tests/generic/686
@@ -49,7 +49,7 @@ commit_and_check() {
local cmd="$XFS_IO_PROG -c '$command $start $end' $junk_file"
if [ -n "$user" ]; then
- su - "$user" -c "$cmd" >> $seqres.full
+ _su - "$user" -c "$cmd" >> $seqres.full
else
$SHELL -c "$cmd" >> $seqres.full
fi
diff --git a/tests/generic/687 b/tests/generic/687
index 82dce88b85ef6d..0c4b09d29fe5e6 100755
--- a/tests/generic/687
+++ b/tests/generic/687
@@ -49,7 +49,7 @@ commit_and_check() {
local cmd="$XFS_IO_PROG -c '$command $start $end' $junk_file"
if [ -n "$user" ]; then
- su - "$user" -c "$cmd" >> $seqres.full
+ _su - "$user" -c "$cmd" >> $seqres.full
else
$SHELL -c "$cmd" >> $seqres.full
fi
diff --git a/tests/generic/688 b/tests/generic/688
index e491d5cf30af23..77db29461415b3 100755
--- a/tests/generic/688
+++ b/tests/generic/688
@@ -51,7 +51,7 @@ commit_and_check() {
local cmd="$XFS_IO_PROG -c 'falloc 0 64k' $junk_file"
if [ -n "$user" ]; then
- su - "$user" -c "$cmd" >> $seqres.full
+ _su - "$user" -c "$cmd" >> $seqres.full
else
$SHELL -c "$cmd" >> $seqres.full
fi
diff --git a/tests/generic/691 b/tests/generic/691
index f33d6edf18a1d1..30ae4a1e384a05 100755
--- a/tests/generic/691
+++ b/tests/generic/691
@@ -75,11 +75,11 @@ exercise()
setquota -${type} -t 86400 86400 $SCRATCH_MNT
repquota -v -${type} $SCRATCH_MNT | grep -v "^root" >>$seqres.full 2>&1
# Exceed the soft quota limit a bit at first
- su $qa_user -c "$XFS_IO_PROG -f -t -c 'pwrite 0 2m' -c fsync ${file}.0" >>$seqres.full
+ _su $qa_user -c "$XFS_IO_PROG -f -t -c 'pwrite 0 2m' -c fsync ${file}.0" >>$seqres.full
# Write more data more times under soft quota limit exhausted condition,
# but not reach hard limit. To make sure each write won't trigger EDQUOT.
for ((i=1; i<=100; i++));do
- su "$qa_user" -c "$XFS_IO_PROG -f -c 'pwrite 0 1m' -c fsync ${file}.$i" >>$seqres.full
+ _su "$qa_user" -c "$XFS_IO_PROG -f -c 'pwrite 0 1m' -c fsync ${file}.$i" >>$seqres.full
if [ $? -ne 0 ];then
echo "Unexpected error (type=$type)!"
break
@@ -89,9 +89,9 @@ exercise()
# As we've tested soft limit, now exceed the hard limit and give it a
# test in passing.
- su $qa_user -c "$XFS_IO_PROG -f -t -c 'pwrite 0 100m' -c fsync ${file}.hard.0" 2>&1 >/dev/null | filter_quota $type
+ _su $qa_user -c "$XFS_IO_PROG -f -t -c 'pwrite 0 100m' -c fsync ${file}.hard.0" 2>&1 >/dev/null | filter_quota $type
for ((i=1; i<=10; i++));do
- su "$qa_user" -c "$XFS_IO_PROG -f -c 'pwrite 0 1m' -c fsync ${file}.hard.$i" 2>&1 | filter_quota $type
+ _su "$qa_user" -c "$XFS_IO_PROG -f -c 'pwrite 0 1m' -c fsync ${file}.hard.$i" 2>&1 | filter_quota $type
done
}
diff --git a/tests/generic/721 b/tests/generic/721
index a9565f18917831..75d5063c2d1701 100755
--- a/tests/generic/721
+++ b/tests/generic/721
@@ -52,7 +52,7 @@ cmd="$XFS_IO_PROG \
-c 'pwrite -S 0x60 44k 55k -b 1m' \
-c 'commitupdate -q' \
\"$dir/a\""
-su -s /bin/bash -c "$cmd" $qa_user 2>&1 | _filter_xfs_io | _filter_test_dir
+_su -s /bin/bash -c "$cmd" $qa_user 2>&1 | _filter_xfs_io | _filter_test_dir
filesnap "after commit" $dir/a
echo
@@ -67,7 +67,7 @@ cmd="$XFS_IO_PROG \
-c 'pwrite -S 0x60 0 55k' \
-c 'commitupdate -q' \
\"$dir/a\""
-su -s /bin/bash -c "$cmd" $qa_user 2>&1 | _filter_xfs_io | _filter_test_dir
+_su -s /bin/bash -c "$cmd" $qa_user 2>&1 | _filter_xfs_io | _filter_test_dir
filesnap "after shorten commit" $dir/a
echo
@@ -81,7 +81,7 @@ cmd="$XFS_IO_PROG \
-c \"pwrite -S 0x60 0 $(( (blksz * nrblks) + 37373 ))\" \
-c 'commitupdate -q' \
\"$dir/a\""
-su -s /bin/bash -c "$cmd" $qa_user 2>&1 | _filter_xfs_io | _filter_test_dir
+_su -s /bin/bash -c "$cmd" $qa_user 2>&1 | _filter_xfs_io | _filter_test_dir
filesnap "after lengthen commit" $dir/a
echo
@@ -96,7 +96,7 @@ cmd="$XFS_IO_PROG \
-c 'pwrite -S 0x60 44k 55k -b 1m' \
-c 'cancelupdate' \
\"$dir/a\""
-su -s /bin/bash -c "$cmd" $qa_user 2>&1 | _filter_xfs_io | _filter_test_dir
+_su -s /bin/bash -c "$cmd" $qa_user 2>&1 | _filter_xfs_io | _filter_test_dir
filesnap "after cancel" $dir/a
echo
@@ -115,7 +115,7 @@ cmd="$XFS_IO_PROG \
-c 'pwrite -S 0x61 22k 11k -b 1m' \
-c 'commitupdate -q' \
\"$dir/a\""
-su -s /bin/bash -c "$cmd" $qa_user 2>&1 | _filter_xfs_io | _filter_test_dir
+_su -s /bin/bash -c "$cmd" $qa_user 2>&1 | _filter_xfs_io | _filter_test_dir
filesnap "after fail commit" $dir/a
echo
diff --git a/tests/generic/726 b/tests/generic/726
index 131ac5b503e1a4..d2a2a2cebe522e 100755
--- a/tests/generic/726
+++ b/tests/generic/726
@@ -46,7 +46,7 @@ commit_and_check() {
local cmd="$XFS_IO_PROG -c 'startupdate' -c 'pwrite -S 0x57 0 1m' -c 'commitupdate' $SCRATCH_MNT/a"
if [ -n "$user" ]; then
- su - "$user" -c "$cmd" >> $seqres.full
+ _su - "$user" -c "$cmd" >> $seqres.full
else
$SHELL -c "$cmd" >> $seqres.full
fi
diff --git a/tests/generic/727 b/tests/generic/727
index ee7ed9760a165a..9551e47cb13b06 100755
--- a/tests/generic/727
+++ b/tests/generic/727
@@ -54,7 +54,7 @@ commit_and_check() {
local cmd="$XFS_IO_PROG -c 'startupdate' -c 'pwrite -S 0x57 0 1m' -c 'commitupdate' $SCRATCH_MNT/a"
if [ -n "$user" ]; then
- su - "$user" -c "$cmd" >> $seqres.full
+ _su - "$user" -c "$cmd" >> $seqres.full
else
$SHELL -c "$cmd" >> $seqres.full
fi
diff --git a/tests/xfs/720 b/tests/xfs/720
index f928cc43d3bc54..68a6c7f6e2d584 100755
--- a/tests/xfs/720
+++ b/tests/xfs/720
@@ -61,7 +61,7 @@ $XFS_QUOTA_PROG -x -c 'report -u' $SCRATCH_MNT >> $seqres.full
# Fail at appending the file as qa_user to ensure quota enforcement works
echo "fail quota" >> $seqres.full
-su - "$qa_user" -c "$XFS_IO_PROG -c 'pwrite 10g 1' $scratchfile" >> $seqres.full
+_su - "$qa_user" -c "$XFS_IO_PROG -c 'pwrite 10g 1' $scratchfile" >> $seqres.full
$XFS_QUOTA_PROG -x -c 'report -u' $SCRATCH_MNT >> $seqres.full
# success, all done
diff --git a/tests/xfs/795 b/tests/xfs/795
index 5a67f02ec92eca..217f96092a4c42 100755
--- a/tests/xfs/795
+++ b/tests/xfs/795
@@ -63,7 +63,7 @@ $XFS_QUOTA_PROG -x -c 'report -u' $SCRATCH_MNT >> $seqres.full
echo "fail quota" >> $seqres.full
for ((i = 0; i < dirents; i++)); do
name=$(printf "y%0254d" $i)
- su - "$qa_user" -c "ln $scratchfile $scratchdir/$name" 2>&1 | \
+ _su - "$qa_user" -c "ln $scratchfile $scratchdir/$name" 2>&1 | \
_filter_scratch | sed -e 's/y[0-9]*/yXXX/g'
test "${PIPESTATUS[0]}" -ne 0 && break
done
diff --git a/tests/xfs/803 b/tests/xfs/803
index e3277181b41af0..4e5a58c4a3328e 100755
--- a/tests/xfs/803
+++ b/tests/xfs/803
@@ -88,7 +88,7 @@ echo set too long value
$XFS_IO_PROG -c "setfsprops $propname=$longpropval" $TEST_DIR
echo not enough permissions
-su - "$qa_user" -c "$XFS_IO_PROG -c \"setfsprops $propname=$propval\" $TEST_DIR" 2>&1 | _filter_test_dir
+_su - "$qa_user" -c "$XFS_IO_PROG -c \"setfsprops $propname=$propval\" $TEST_DIR" 2>&1 | _filter_test_dir
echo "*** DB TEST ***"
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 12/34] fuzzy: kill subprocesses with SIGPIPE, not SIGINT
2025-02-04 21:22 [PATCHSET v2] fstests: random fixes for v2025.02.02 Darrick J. Wong
` (10 preceding siblings ...)
2025-02-04 21:25 ` [PATCH 11/34] common/rc: create a wrapper for the su command Darrick J. Wong
@ 2025-02-04 21:25 ` Darrick J. Wong
2025-02-05 0:16 ` Dave Chinner
2025-02-04 21:25 ` [PATCH 13/34] common/rc: hoist pkill to a helper function Darrick J. Wong
` (22 subsequent siblings)
34 siblings, 1 reply; 68+ messages in thread
From: Darrick J. Wong @ 2025-02-04 21:25 UTC (permalink / raw)
To: zlang, djwong; +Cc: fstests, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
The next patch in this series fixes various issues with the recently
added fstests process isolation scheme by running each new process in a
separate process group session. Unfortunately, the processes in the
session are created with SIGINT ignored by default because they are not
attached to the controlling terminal. Therefore, switch the kill signal
to SIGPIPE because that is usually fatal and not masked by default.
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
common/fuzzy | 13 ++++++-------
1 file changed, 6 insertions(+), 7 deletions(-)
diff --git a/common/fuzzy b/common/fuzzy
index 0a2d91542b561e..6d390d4efbd3da 100644
--- a/common/fuzzy
+++ b/common/fuzzy
@@ -891,7 +891,7 @@ __stress_xfs_scrub_loop() {
local runningfile="$2"
local scrub_startat="$3"
shift; shift; shift
- local sigint_ret="$(( $(kill -l SIGINT) + 128 ))"
+ local signal_ret="$(( $(kill -l SIGPIPE) + 128 ))"
local scrublog="$tmp.scrub"
while __stress_scrub_running "$scrub_startat" "$runningfile"; do
@@ -902,7 +902,7 @@ __stress_xfs_scrub_loop() {
_scratch_scrub "$@" &> $scrublog
res=$?
if [ "$res" -eq "$sigint_ret" ]; then
- # Ignore SIGINT because the cleanup function sends
+ # Ignore SIGPIPE because the cleanup function sends
# that to terminate xfs_scrub
res=0
fi
@@ -1173,13 +1173,11 @@ _scratch_xfs_stress_scrub_cleanup() {
rm -f "$runningfile"
echo "Cleaning up scrub stress run at $(date)" >> $seqres.full
- # Send SIGINT so that bash won't print a 'Terminated' message that
- # distorts the golden output.
echo "Killing stressor processes at $(date)" >> $seqres.full
_kill_fsstress
- pkill -INT --parent $$ xfs_io >> $seqres.full 2>&1
- pkill -INT --parent $$ fsx >> $seqres.full 2>&1
- pkill -INT --parent $$ xfs_scrub >> $seqres.full 2>&1
+ _pkill --echo -PIPE xfs_io >> $seqres.full 2>&1
+ _pkill --echo -PIPE fsx >> $seqres.full 2>&1
+ _pkill --echo -PIPE xfs_scrub >> $seqres.full 2>&1
# Tests are not allowed to exit with the scratch fs frozen. If we
# started a fs freeze/thaw background loop, wait for that loop to exit
@@ -1209,6 +1207,7 @@ _scratch_xfs_stress_scrub_cleanup() {
# Wait for the remaining children to exit.
echo "Waiting for children to exit at $(date)" >> $seqres.full
wait
+ echo "Children exited as of $(date)" >> $seqres.full
# Ensure the scratch fs is also writable before we exit.
if [ -n "$__SCRUB_STRESS_REMOUNT_LOOP" ]; then
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 13/34] common/rc: hoist pkill to a helper function
2025-02-04 21:22 [PATCHSET v2] fstests: random fixes for v2025.02.02 Darrick J. Wong
` (11 preceding siblings ...)
2025-02-04 21:25 ` [PATCH 12/34] fuzzy: kill subprocesses with SIGPIPE, not SIGINT Darrick J. Wong
@ 2025-02-04 21:25 ` Darrick J. Wong
2025-02-05 0:17 ` Dave Chinner
2025-02-04 21:25 ` [PATCH 14/34] common: fix pkill by running test program in a separate session Darrick J. Wong
` (21 subsequent siblings)
34 siblings, 1 reply; 68+ messages in thread
From: Darrick J. Wong @ 2025-02-04 21:25 UTC (permalink / raw)
To: zlang, djwong; +Cc: fstests, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Create a helper function to wrap pkill in preparation for the next
patch.
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
common/rc | 8 +++++++-
tests/generic/310 | 6 +++---
tests/generic/561 | 2 +-
3 files changed, 11 insertions(+), 5 deletions(-)
diff --git a/common/rc b/common/rc
index 9a6f7dce613e62..2b56d6de9c9cb1 100644
--- a/common/rc
+++ b/common/rc
@@ -30,6 +30,12 @@ _test_sync()
_sync_fs $TEST_DIR
}
+# Kill only the processes started by this test.
+_pkill()
+{
+ pkill "$@"
+}
+
# Common execution handling for fsstress invocation.
#
# We need per-test fsstress binaries because of the way fsstress forks and
@@ -69,7 +75,7 @@ _kill_fsstress()
if [ -n "$_FSSTRESS_PID" ]; then
# use SIGPIPE to avoid "Killed" messages from bash
echo "killing $_FSSTRESS_BIN" >> $seqres.full
- pkill -PIPE $_FSSTRESS_BIN >> $seqres.full 2>&1
+ _pkill -PIPE $_FSSTRESS_BIN >> $seqres.full 2>&1
_wait_for_fsstress
return $?
fi
diff --git a/tests/generic/310 b/tests/generic/310
index 52babfdc803a21..570cc5f3859548 100755
--- a/tests/generic/310
+++ b/tests/generic/310
@@ -29,7 +29,7 @@ _begin_fstest auto
# Override the default cleanup function.
_cleanup()
{
- pkill -9 $seq.t_readdir > /dev/null 2>&1
+ _pkill -9 $seq.t_readdir > /dev/null 2>&1
wait
rm -rf $TEST_DIR/tmp
rm -f $tmp.*
@@ -83,7 +83,7 @@ _test_read()
{
$TEST_DIR/$seq.t_readdir_1 $SEQ_DIR > /dev/null 2>&1 &
sleep $RUN_TIME
- pkill -PIPE $seq.t_readdir_1
+ _pkill -PIPE $seq.t_readdir_1
wait
check_kernel_bug
@@ -97,7 +97,7 @@ _test_lseek()
$TEST_DIR/$seq.t_readdir_2 $SEQ_DIR > /dev/null 2>&1 &
readdir_pid=$!
sleep $RUN_TIME
- pkill -PIPE $seq.t_readdir_2
+ _pkill -PIPE $seq.t_readdir_2
wait
check_kernel_bug
diff --git a/tests/generic/561 b/tests/generic/561
index afe727ac56cbd9..b260aaf16c9256 100755
--- a/tests/generic/561
+++ b/tests/generic/561
@@ -40,7 +40,7 @@ function end_test()
# stop duperemove running
if [ -e $dupe_run ]; then
rm -f $dupe_run
- pkill $dedup_bin >/dev/null 2>&1
+ _pkill $dedup_bin >/dev/null 2>&1
wait $dedup_pids
rm -f $dedup_prog
fi
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 14/34] common: fix pkill by running test program in a separate session
2025-02-04 21:22 [PATCHSET v2] fstests: random fixes for v2025.02.02 Darrick J. Wong
` (12 preceding siblings ...)
2025-02-04 21:25 ` [PATCH 13/34] common/rc: hoist pkill to a helper function Darrick J. Wong
@ 2025-02-04 21:25 ` Darrick J. Wong
2025-02-05 0:23 ` Dave Chinner
2025-02-04 21:26 ` [PATCH 15/34] check: run tests in a private pid/mount namespace Darrick J. Wong
` (20 subsequent siblings)
34 siblings, 1 reply; 68+ messages in thread
From: Darrick J. Wong @ 2025-02-04 21:25 UTC (permalink / raw)
To: zlang, djwong; +Cc: fstests, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Run each test program with a separate session id so that we can tell
pkill to kill all processes of a given name, but only within our own
session id. This /should/ suffice to run multiple fstests on the same
machine without one instance shooting down processes of another
instance.
This fixes a general problem with using "pkill --parent" -- if the
process being targeted is not a direct descendant of the bash script
calling pkill, then pkill will not do anything. The scrub stress tests
make use of multiple background subshells, which is how a ^C in the
parent process fails to result in fsx/fsstress being killed.
This is necessary to fix SOAK_DURATION runtime constraints for all the
scrub stress tests. However, there is a cost -- the test program no
longer runs with the same controlling tty as ./check, which means that
^Z doesn't work and SIGINT/SIGQUIT are set to SIG_IGN. IOWs, if a test
wants to kill its subprocesses, it must use another signal such as
SIGPIPE. Fortunately, bash doesn't whine about children dying due to
fatal signals if the children run in a different session id.
Unfortunately we have to let it run the test as a background process for
bash to handle SIGINT, and SIGSTOP no longer works properly.
This solution is a bit crap, and I have a better solution for it in the
next patch. Unfortunately, that solution adds new minimum requirements
for running fstests and removing previously supported configurations
abruptly during a bug fix is not appropriate behavior.
I also explored alternate designs, and this was the least unsatisfying:
a) Setting the process group didn't work because background subshells
are assigned a new group id.
b) Constraining the pkill/pgrep search to a cgroup could work, but it
seems that procps has only recently (~2023) gained the ability to filter
on a cgroup. Furthermore, we'd have to set up a cgroup in which to run
the fstest. The last decade has been rife with user bug reports
complaining about chaos resulting from multiple pieces of software (e.g.
Docker, systemd, etc.) deciding that they own the entire cgroup
structure without having any means to enforce that statement. We should
not wade into that mess.
c) Putting test subprocesses in a systemd sub-scope and telling systemd
to kill the sub-scope could work because ./check can already use it to
ensure that all child processes of a test are killed. However, this is
an *optional* feature, which means that we'd have to require systemd.
d) Constraining the pkill/pgrep search to a particular pid namespace
could work, but we already have tests that set up their own mount
namespaces, which means the constrained pgrep will not find all child
processes of a test. Though this hasn't been born out through testing?
e) Constraining to any other type of namespace (uts, pid, etc) might not
work because those namespaces might not be enabled. However, combining
a private pid and mount namespace to isolate tests from each other seems
to work better than session ids. For older systems, however, we will
need this as a (probably deprecated) fallback.
f) Revert check-parallel and go back to one fstests instance per system.
Zorro already chose not to revert.
So. Change _run_seq to create a the ./$seq process with a new session
id, update _su calls to use the same session as the parent test, update
all the pkill sites to use a wrapper so that we only target processes
created by *this* instance of fstests, and update SIGINT to SIGPIPE.
Cc: <fstests@vger.kernel.org> # v2024.12.08
Fixes: 8973af00ec212f ("fstests: cleanup fsstress process management")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
check | 40 ++++++++++++++++++++++++++++++++++------
common/rc | 6 ++++--
tools/run_seq_setsid | 22 ++++++++++++++++++++++
3 files changed, 60 insertions(+), 8 deletions(-)
create mode 100755 tools/run_seq_setsid
diff --git a/check b/check
index 5cb4e7eb71ce07..d854515ed1aac5 100755
--- a/check
+++ b/check
@@ -698,18 +698,46 @@ _adjust_oom_score -500
# systemd doesn't automatically remove transient scopes that fail to terminate
# when systemd tells them to terminate (e.g. programs stuck in D state when
# systemd sends SIGKILL), so we use reset-failed to tear down the scope.
+#
+# Use setsid to run the test program with a separate session id so that we
+# can pkill only the processes started by this test.
_run_seq() {
- local cmd=(bash -c "test -w ${OOM_SCORE_ADJ} && echo 250 > ${OOM_SCORE_ADJ}; exec ./$seq")
+ local res
+ unset CHILDPID
+ unset FSTESTS_ISOL # set by tools/run_seq_*
if [ -n "${HAVE_SYSTEMD_SCOPES}" ]; then
local unit="$(systemd-escape "fs$seq").scope"
systemctl reset-failed "${unit}" &> /dev/null
- systemd-run --quiet --unit "${unit}" --scope "${cmd[@]}"
+ systemd-run --quiet --unit "${unit}" --scope \
+ ./tools/run_seq_setsid "./$seq" &
+ CHILDPID=$!
+ wait
res=$?
+ unset CHILDPID
systemctl stop "${unit}" &> /dev/null
- return "${res}"
else
- "${cmd[@]}"
+ # bash won't run the SIGINT trap handler while there are
+ # foreground children in a separate session, so we must run
+ # the test in the background and wait for it.
+ ./tools/run_seq_setsid "./$seq" &
+ CHILDPID=$!
+ wait
+ res=$?
+ unset CHILDPID
+ fi
+
+ return $res
+}
+
+_kill_seq() {
+ if [ -n "$CHILDPID" ]; then
+ # SIGPIPE will kill all the children (including fsstress)
+ # without bash logging fatal signal termination messages to the
+ # console
+ pkill -PIPE --session "$CHILDPID"
+ wait
+ unset CHILDPID
fi
}
@@ -718,9 +746,9 @@ _prepare_test_list
fstests_start_time="$(date +"%F %T")"
if $OPTIONS_HAVE_SECTIONS; then
- trap "_summary; exit \$status" 0 1 2 3 15
+ trap "_kill_seq; _summary; exit \$status" 0 1 2 3 15
else
- trap "_wrapup; exit \$status" 0 1 2 3 15
+ trap "_kill_seq; _wrapup; exit \$status" 0 1 2 3 15
fi
function run_section()
diff --git a/common/rc b/common/rc
index 2b56d6de9c9cb1..bda80995f8dd55 100644
--- a/common/rc
+++ b/common/rc
@@ -33,7 +33,7 @@ _test_sync()
# Kill only the processes started by this test.
_pkill()
{
- pkill "$@"
+ pkill --session 0 "$@"
}
# Common execution handling for fsstress invocation.
@@ -2732,9 +2732,11 @@ _require_user_exists()
[ "$?" == "0" ] || _notrun "$user user not defined."
}
+# Run all non-root processes in the same session as the root. Believe it or
+# not, passing $SHELL in this manner works both for "su" and "su -c cmd".
_su()
{
- su "$@"
+ su --session-command $SHELL "$@"
}
# check if a user exists and is able to execute commands.
diff --git a/tools/run_seq_setsid b/tools/run_seq_setsid
new file mode 100755
index 00000000000000..5938f80e689255
--- /dev/null
+++ b/tools/run_seq_setsid
@@ -0,0 +1,22 @@
+#!/bin/bash
+
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2025 Oracle. All Rights Reserved.
+#
+# Try starting things in a new process session so that test processes have
+# something with which to filter only their own subprocesses.
+
+if [ -n "${FSTESTS_ISOL}" ]; then
+ # Allow the test to become a target of the oom killer
+ oom_knob="/proc/self/oom_score_adj"
+ test -w "${oom_knob}" && echo 250 > "${oom_knob}"
+
+ exec "$@"
+fi
+
+if [ -z "$1" ] || [ "$1" = "--help" ]; then
+ echo "Usage: $0 command [args...]"
+ exit 1
+fi
+
+FSTESTS_ISOL=setsid exec setsid "$0" "$@"
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 15/34] check: run tests in a private pid/mount namespace
2025-02-04 21:22 [PATCHSET v2] fstests: random fixes for v2025.02.02 Darrick J. Wong
` (13 preceding siblings ...)
2025-02-04 21:25 ` [PATCH 14/34] common: fix pkill by running test program in a separate session Darrick J. Wong
@ 2025-02-04 21:26 ` Darrick J. Wong
2025-02-05 0:37 ` Dave Chinner
2025-02-04 21:26 ` [PATCH 16/34] check: deprecate using process sessions to isolate test instances Darrick J. Wong
` (19 subsequent siblings)
34 siblings, 1 reply; 68+ messages in thread
From: Darrick J. Wong @ 2025-02-04 21:26 UTC (permalink / raw)
To: zlang, djwong; +Cc: fstests, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
As mentioned in the previous patch, trying to isolate processes from
separate test instances through the use of distinct Unix process
sessions is annoying due to the many complications with signal handling.
Instead, we could just use nsexec to run the test program with a private
pid namespace so that each test instance can only see its own processes;
and private mount namespace so that tests writing to /tmp cannot clobber
other tests or the stuff running on the main system.
However, it's not guaranteed that a particular kernel has pid and mount
namespaces enabled. Mount (2.4.19) and pid (2.6.24) namespaces have
been around for a long time, but there's no hard requirement for the
latter to be enabled in the kernel. Therefore, this bugfix slips
namespace support in alongside the session id thing.
Declaring CONFIG_PID_NS=n a deprecated configuration and removing
support should be a separate conversation, not something that I have to
do in a bug fix to get mainline QA back up.
Cc: <fstests@vger.kernel.org> # v2024.12.08
Fixes: 8973af00ec212f ("fstests: cleanup fsstress process management")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
check | 34 +++++++++++++++++++++++-----------
common/rc | 12 ++++++++++--
src/nsexec.c | 18 +++++++++++++++---
tests/generic/504 | 15 +++++++++++++--
tools/run_seq_pidns | 28 ++++++++++++++++++++++++++++
5 files changed, 89 insertions(+), 18 deletions(-)
create mode 100755 tools/run_seq_pidns
diff --git a/check b/check
index d854515ed1aac5..24b21cf139f927 100755
--- a/check
+++ b/check
@@ -674,6 +674,11 @@ _stash_test_status() {
esac
}
+# Can we run in a private pid/mount namespace?
+HAVE_PRIVATENS=
+./tools/run_seq_pidns bash -c "exit 77"
+test $? -eq 77 && HAVE_PRIVATENS=yes
+
# Can we run systemd scopes?
HAVE_SYSTEMD_SCOPES=
systemctl reset-failed "fstests-check" &>/dev/null
@@ -691,22 +696,29 @@ _adjust_oom_score -500
# the system runs out of memory it'll be the test that gets killed and not the
# test framework. The test is run in a separate process without any of our
# functions, so we open-code adjusting the OOM score.
-#
-# If systemd is available, run the entire test script in a scope so that we can
-# kill all subprocesses of the test if it fails to clean up after itself. This
-# is essential for ensuring that the post-test unmount succeeds. Note that
-# systemd doesn't automatically remove transient scopes that fail to terminate
-# when systemd tells them to terminate (e.g. programs stuck in D state when
-# systemd sends SIGKILL), so we use reset-failed to tear down the scope.
-#
-# Use setsid to run the test program with a separate session id so that we
-# can pkill only the processes started by this test.
_run_seq() {
local res
unset CHILDPID
unset FSTESTS_ISOL # set by tools/run_seq_*
- if [ -n "${HAVE_SYSTEMD_SCOPES}" ]; then
+ if [ -n "${HAVE_PRIVATENS}" ]; then
+ # If pid and mount namespaces are available, run the whole test
+ # inside them so that the test cannot access any process or
+ # /tmp contents that it does not itself create. The ./$seq
+ # process is considered the "init" process of the pid
+ # namespace, so all subprocesses will be sent SIGKILL when it
+ # terminates.
+ ./tools/run_seq_pidns "./$seq"
+ res=$?
+ elif [ -n "${HAVE_SYSTEMD_SCOPES}" ]; then
+ # If systemd is available, run the entire test script in a
+ # scope so that we can kill all subprocesses of the test if it
+ # fails to clean up after itself. This is essential for
+ # ensuring that the post-test unmount succeeds. Note that
+ # systemd doesn't automatically remove transient scopes that
+ # fail to terminate when systemd tells them to terminate (e.g.
+ # programs stuck in D state when systemd sends SIGKILL), so we
+ # use reset-failed to tear down the scope.
local unit="$(systemd-escape "fs$seq").scope"
systemctl reset-failed "${unit}" &> /dev/null
systemd-run --quiet --unit "${unit}" --scope \
diff --git a/common/rc b/common/rc
index bda80995f8dd55..25900533acb974 100644
--- a/common/rc
+++ b/common/rc
@@ -33,7 +33,11 @@ _test_sync()
# Kill only the processes started by this test.
_pkill()
{
- pkill --session 0 "$@"
+ if [ "$FSTESTS_ISOL" = "setsid" ]; then
+ pkill --session 0 "$@"
+ else
+ pkill "$@"
+ fi
}
# Common execution handling for fsstress invocation.
@@ -2736,7 +2740,11 @@ _require_user_exists()
# not, passing $SHELL in this manner works both for "su" and "su -c cmd".
_su()
{
- su --session-command $SHELL "$@"
+ if [ "$FSTESTS_ISOL" = "setsid" ]; then
+ su --session-command $SHELL "$@"
+ else
+ su "$@"
+ fi
}
# check if a user exists and is able to execute commands.
diff --git a/src/nsexec.c b/src/nsexec.c
index 750d52df129716..5c0bc922153514 100644
--- a/src/nsexec.c
+++ b/src/nsexec.c
@@ -54,6 +54,7 @@ usage(char *pname)
fpe(" If -M or -G is specified, -U is required\n");
fpe("-s Set uid/gid to 0 in the new user namespace\n");
fpe("-v Display verbose messages\n");
+ fpe("-z Return child's return value\n");
fpe("\n");
fpe("Map strings for -M and -G consist of records of the form:\n");
fpe("\n");
@@ -144,6 +145,8 @@ int
main(int argc, char *argv[])
{
int flags, opt;
+ int return_child_status = 0;
+ int child_status = 0;
pid_t child_pid;
struct child_args args;
char *uid_map, *gid_map;
@@ -161,7 +164,7 @@ main(int argc, char *argv[])
setid = 0;
gid_map = NULL;
uid_map = NULL;
- while ((opt = getopt(argc, argv, "+imnpuUM:G:vs")) != -1) {
+ while ((opt = getopt(argc, argv, "+imnpuUM:G:vsz")) != -1) {
switch (opt) {
case 'i': flags |= CLONE_NEWIPC; break;
case 'm': flags |= CLONE_NEWNS; break;
@@ -173,6 +176,7 @@ main(int argc, char *argv[])
case 'G': gid_map = optarg; break;
case 'U': flags |= CLONE_NEWUSER; break;
case 's': setid = 1; break;
+ case 'z': return_child_status = 1; break;
default: usage(argv[0]);
}
}
@@ -229,11 +233,19 @@ main(int argc, char *argv[])
close(args.pipe_fd[1]);
- if (waitpid(child_pid, NULL, 0) == -1) /* Wait for child */
+ if (waitpid(child_pid, &child_status, 0) == -1) /* Wait for child */
errExit("waitpid");
if (verbose)
- printf("%s: terminating\n", argv[0]);
+ printf("%s: terminating %d\n", argv[0], WEXITSTATUS(child_status));
+
+ if (return_child_status) {
+ if (WIFEXITED(child_status))
+ exit(WEXITSTATUS(child_status));
+ if (WIFSIGNALED(child_status))
+ exit(WTERMSIG(child_status) + 128); /* like sh */
+ exit(EXIT_FAILURE);
+ }
exit(EXIT_SUCCESS);
}
diff --git a/tests/generic/504 b/tests/generic/504
index 271c040e7b842a..96f18a0bbc7ba2 100755
--- a/tests/generic/504
+++ b/tests/generic/504
@@ -18,7 +18,7 @@ _cleanup()
{
exec {test_fd}<&-
cd /
- rm -f $tmp.*
+ rm -r -f $tmp.*
}
# Import common functions.
@@ -35,13 +35,24 @@ echo inode $tf_inode >> $seqres.full
# Create new fd by exec
exec {test_fd}> $testfile
-# flock locks the fd then exits, we should see the lock info even the owner is dead
+# flock locks the fd then exits, we should see the lock info even the owner is
+# dead. If we're using pid namespace isolation we have to move /proc so that
+# we can access the /proc/locks from the init_pid_ns.
+if [ "$FSTESTS_ISOL" = "privatens" ]; then
+ move_proc="$tmp.procdir"
+ mkdir -p "$move_proc"
+ mount --move /proc "$move_proc"
+fi
flock -x $test_fd
cat /proc/locks >> $seqres.full
# Checking
grep -q ":$tf_inode " /proc/locks || echo "lock info not found"
+if [ -n "$move_proc" ]; then
+ mount --move "$move_proc" /proc
+fi
+
# success, all done
status=0
echo "Silence is golden"
diff --git a/tools/run_seq_pidns b/tools/run_seq_pidns
new file mode 100755
index 00000000000000..df94974ab30c3c
--- /dev/null
+++ b/tools/run_seq_pidns
@@ -0,0 +1,28 @@
+#!/bin/bash
+
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2025 Oracle. All Rights Reserved.
+#
+# Try starting things in a private pid/mount namespace with a private /tmp
+# and /proc so that child process trees cannot interfere with each other.
+
+if [ -n "${FSTESTS_ISOL}" ]; then
+ for path in /proc /tmp; do
+ mount --make-private "$path"
+ done
+ mount -t proc proc /proc
+ mount -t tmpfs tmpfs /tmp
+
+ # Allow the test to become a target of the oom killer
+ oom_knob="/proc/self/oom_score_adj"
+ test -w "${oom_knob}" && echo 250 > "${oom_knob}"
+
+ exec "$@"
+fi
+
+if [ -z "$1" ] || [ "$1" = "--help" ]; then
+ echo "Usage: $0 command [args...]"
+ exit 1
+fi
+
+FSTESTS_ISOL=privatens exec "$(dirname "$0")/../src/nsexec" -z -m -p "$0" "$@"
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 16/34] check: deprecate using process sessions to isolate test instances
2025-02-04 21:22 [PATCHSET v2] fstests: random fixes for v2025.02.02 Darrick J. Wong
` (14 preceding siblings ...)
2025-02-04 21:26 ` [PATCH 15/34] check: run tests in a private pid/mount namespace Darrick J. Wong
@ 2025-02-04 21:26 ` Darrick J. Wong
2025-02-05 0:38 ` Dave Chinner
2025-02-04 21:26 ` [PATCH 17/34] common/rc: don't copy fsstress to $TEST_DIR Darrick J. Wong
` (18 subsequent siblings)
34 siblings, 1 reply; 68+ messages in thread
From: Darrick J. Wong @ 2025-02-04 21:26 UTC (permalink / raw)
To: zlang, djwong; +Cc: fstests, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
As I've noted elsewhere, the use of process session ids to "isolate"
test instances from killing each other is kind of hacky and creates
other weird side effects. I'd rather everyone use the new code that
runs everything in proper isolation with private pid and mount
namespaces, but I don't know how many people this would break were it a
hard dependency.
Deprecate the process session handling immediately with a warning that
we're going to rip it out in a year.
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
check | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/check b/check
index 24b21cf139f927..5beb10612b75fb 100755
--- a/check
+++ b/check
@@ -692,6 +692,15 @@ function _adjust_oom_score() {
}
_adjust_oom_score -500
+warn_deprecated_sessionid()
+{
+ if [ -z "$WARNED_DEPRECATED_SESSIONID" ]; then
+ echo "WARNING: Running fstests without private pid/mount namespace"
+ echo "support is deprecated and will be removed in February 2026."
+ WARNED_DEPRECATED_SESSIONID=1
+ fi
+}
+
# ...and make the tests themselves somewhat more attractive to it, so that if
# the system runs out of memory it'll be the test that gets killed and not the
# test framework. The test is run in a separate process without any of our
@@ -891,6 +900,8 @@ function run_section()
seqres="$check"
_check_test_fs
+ test -n "$HAVE_PRIVATENS" || warn_deprecated_sessionid
+
loop_status=() # track rerun-on-failure state
local tc_status ix
local -a _list=( $list )
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 17/34] common/rc: don't copy fsstress to $TEST_DIR
2025-02-04 21:22 [PATCHSET v2] fstests: random fixes for v2025.02.02 Darrick J. Wong
` (15 preceding siblings ...)
2025-02-04 21:26 ` [PATCH 16/34] check: deprecate using process sessions to isolate test instances Darrick J. Wong
@ 2025-02-04 21:26 ` Darrick J. Wong
2025-02-04 21:27 ` [PATCH 18/34] unmount: resume logging of stdout and stderr for filtering Darrick J. Wong
` (17 subsequent siblings)
34 siblings, 0 replies; 68+ messages in thread
From: Darrick J. Wong @ 2025-02-04 21:26 UTC (permalink / raw)
To: zlang, djwong; +Cc: dchinner, fstests, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Now that we can pkill only processes that were started by this test, we
don't need to copy the fsstress binary to $TEST_DIR to avoid killing the
wrong program instances. This avoids a whole slew of ETXTBSY problems
with scrub stress tests that run multiple copies of fsstress in the
background.
Revert most of the changes to generic/270, because it wants to do
something fancy with the fsstress binary, so it needs to control the
process directly.
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
---
common/rc | 13 ++++---------
tests/generic/270 | 10 ++++++----
2 files changed, 10 insertions(+), 13 deletions(-)
diff --git a/common/rc b/common/rc
index 25900533acb974..6a1d4fd712bb40 100644
--- a/common/rc
+++ b/common/rc
@@ -54,11 +54,9 @@ _pkill()
# task name to kill.
#
# If tasks want to start fsstress themselves (e.g. under a different uid) then
-# they can set up _FSSTRESS_BIN and record _FSSTRESS_PID themselves. Then if the
-# test is killed then it will get cleaned up automatically.
+# they can record _FSSTRESS_PID themselves. Then if the test is killed then it
+# will get cleaned up automatically.
-_FSSTRESS_BIN="$seq.fsstress"
-_FSSTRESS_PROG="$TEST_DIR/$seq.fsstress"
_FSSTRESS_PID=""
_wait_for_fsstress()
{
@@ -69,7 +67,6 @@ _wait_for_fsstress()
ret=$?
unset _FSSTRESS_PID
fi
- rm -f $_FSSTRESS_PROG
return $ret
}
@@ -78,8 +75,7 @@ _kill_fsstress()
{
if [ -n "$_FSSTRESS_PID" ]; then
# use SIGPIPE to avoid "Killed" messages from bash
- echo "killing $_FSSTRESS_BIN" >> $seqres.full
- _pkill -PIPE $_FSSTRESS_BIN >> $seqres.full 2>&1
+ _pkill --echo -PIPE fsstress >> $seqres.full 2>&1
_wait_for_fsstress
return $?
fi
@@ -87,8 +83,7 @@ _kill_fsstress()
_run_fsstress_bg()
{
- cp -f $FSSTRESS_PROG $_FSSTRESS_PROG
- $_FSSTRESS_PROG $FSSTRESS_AVOID "$@" >> $seqres.full 2>&1 &
+ $FSSTRESS_PROG $FSSTRESS_AVOID "$@" >> $seqres.full 2>&1 &
_FSSTRESS_PID=$!
}
diff --git a/tests/generic/270 b/tests/generic/270
index d74971bb535239..ce51592004fe77 100755
--- a/tests/generic/270
+++ b/tests/generic/270
@@ -28,8 +28,8 @@ _workout()
args=`_scale_fsstress_args -p128 -n999999999 -f setattr=1 $FSSTRESS_AVOID -d $out`
echo "fsstress $args" >> $seqres.full
# Grant chown capability
- cp $FSSTRESS_PROG $_FSSTRESS_PROG
- $SETCAP_PROG cap_chown=epi $_FSSTRESS_PROG
+ cp $FSSTRESS_PROG $tmp.fsstress.bin
+ $SETCAP_PROG cap_chown=epi $tmp.fsstress.bin
# io_uring accounts memory it needs under the rlimit memlocked option,
# which can be quite low on some setups (especially 64K pagesize). root
@@ -37,7 +37,7 @@ _workout()
# io_uring_queue_init fail on ENOMEM, set max locked memory to unlimited
# temporarily.
ulimit -l unlimited
- _su $qa_user -c "$_FSSTRESS_PROG $args" > /dev/null 2>&1 &
+ _su $qa_user -c "$tmp.fsstress.bin $args" > /dev/null 2>&1 &
_FSSTRESS_PID=$!
echo "Run dd writers in parallel"
@@ -50,7 +50,9 @@ _workout()
sleep $enospc_time
done
- _kill_fsstress
+ _pkill -PIPE -f fsstress
+ pidwait $_FSSTRESS_PID
+ return 0
}
_require_quota
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 18/34] unmount: resume logging of stdout and stderr for filtering
2025-02-04 21:22 [PATCHSET v2] fstests: random fixes for v2025.02.02 Darrick J. Wong
` (16 preceding siblings ...)
2025-02-04 21:26 ` [PATCH 17/34] common/rc: don't copy fsstress to $TEST_DIR Darrick J. Wong
@ 2025-02-04 21:27 ` Darrick J. Wong
2025-02-04 21:27 ` [PATCH 19/34] mkfs: don't hardcode log size Darrick J. Wong
` (16 subsequent siblings)
34 siblings, 0 replies; 68+ messages in thread
From: Darrick J. Wong @ 2025-02-04 21:27 UTC (permalink / raw)
To: zlang, djwong; +Cc: dchinner, fstests, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
There's a number of places where a test program calls a variant of
_unmount but then pipes the output through a _filter script or
something. The new _unmount helper redirects stdout and stderr to
seqres.full, which means that those error messages (some of which are
encoded in the golden outputs) are suppressed. This leads to test
regressions in generic/050 and other places, so let's resume logging.
This also undoes all the changes that removed /dev/null redirection of
unmount calls.
Cc: <fstests@vger.kernel.org> # v2024.12.08
Fixes: 4c6bc4565105e6 ("fstests: clean up mount and unmount operations")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
---
check | 10 ++++++++--
common/quota | 2 +-
common/rc | 21 +++++++++++++++++++--
tests/generic/050 | 2 +-
tests/generic/085 | 2 +-
tests/generic/361 | 4 ++--
tests/generic/590 | 2 +-
tests/generic/746 | 2 +-
tests/xfs/149 | 2 +-
tests/xfs/530 | 2 +-
10 files changed, 36 insertions(+), 13 deletions(-)
diff --git a/check b/check
index 5beb10612b75fb..bc04a50810b9df 100755
--- a/check
+++ b/check
@@ -1048,8 +1048,8 @@ function run_section()
if [ $sts -ne 0 ]; then
_dump_err_cont "[failed, exit status $sts]"
- _test_unmount 2>> $seqres.full
- _scratch_unmount 2>> $seqres.full
+ _test_unmount 2> /dev/null
+ _scratch_unmount 2> /dev/null
rm -f ${RESULT_DIR}/require_test*
rm -f ${RESULT_DIR}/require_scratch*
# Even though we failed, there may be something interesting in
@@ -1135,6 +1135,12 @@ function run_section()
_stash_test_status "$seqnum" "$tc_status"
done
+ # Reset these three variables so that unmount output doesn't get
+ # written to $seqres.full of the last test to run.
+ seq="check.$$"
+ check="$RESULT_BASE/check"
+ seqres="$check"
+
sect_stop=`_wallclock`
interrupt=false
_wrapup
diff --git a/common/quota b/common/quota
index 8688116c6547a9..4dad9b79a27a7f 100644
--- a/common/quota
+++ b/common/quota
@@ -274,7 +274,7 @@ _choose_prid()
_qmount()
{
- _scratch_unmount
+ _scratch_unmount >/dev/null 2>&1
_try_scratch_mount || _fail "qmount failed"
# xfs doesn't need these setups and quotacheck even fails on xfs
# redirect the output to $seqres.full for debug purpose and ignore results
diff --git a/common/rc b/common/rc
index 6a1d4fd712bb40..b78d903fe34ba3 100644
--- a/common/rc
+++ b/common/rc
@@ -479,11 +479,28 @@ _scratch_mount_idmapped()
}
# Unmount the filesystem based on the directory or device passed.
+# Log everything that happens to seqres.full, and use BASHPID because
+# background subshells have the same $$ as the parent but not the same
+# $BASHPID.
_unmount()
{
- local args="$*"
+ local outlog="$tmp.$BASHPID.umount"
+ local errlog="$tmp.$BASHPID.umount.err"
- $UMOUNT_PROG $args >> $seqres.full 2>&1
+ rm -f "$outlog" "$errlog"
+ $UMOUNT_PROG "$@" 2> "$errlog" > "$outlog"
+ local res="${PIPESTATUS[0]}"
+
+ if [ -s "$outlog" ]; then
+ cat "$outlog" >> $seqres.full
+ cat "$outlog"
+ fi
+ if [ -s "$errlog" ]; then
+ cat "$errlog" >> $seqres.full
+ >&2 cat "$errlog"
+ fi
+ rm -f "$outlog" "$errlog"
+ return $res
}
_scratch_unmount()
diff --git a/tests/generic/050 b/tests/generic/050
index 8e9456db279003..affb072df5969f 100755
--- a/tests/generic/050
+++ b/tests/generic/050
@@ -89,7 +89,7 @@ _try_scratch_mount 2>&1 | _filter_ro_mount | _filter_scratch
# expects an error, so open code the unmount
echo "unmounting read-only filesystem"
-$UMOUNT_PROG $SCRATCH_DEV 2>&1 | _filter_scratch | _filter_ending_dot
+_scratch_unmount 2>&1 | _filter_scratch | _filter_ending_dot
#
# This is the way out if the underlying device really is read-only.
diff --git a/tests/generic/085 b/tests/generic/085
index 7671a36ab9524f..d3fa10be9ccace 100755
--- a/tests/generic/085
+++ b/tests/generic/085
@@ -29,7 +29,7 @@ cleanup_dmdev()
fi
# in case it's still suspended and/or mounted
$DMSETUP_PROG resume $lvdev >> $seqres.full 2>&1
- _unmount -q $SCRATCH_MNT
+ _unmount -q $SCRATCH_MNT >/dev/null 2>&1
_dmsetup_remove $node
}
diff --git a/tests/generic/361 b/tests/generic/361
index e2b7984361e87c..b584af47540020 100755
--- a/tests/generic/361
+++ b/tests/generic/361
@@ -16,7 +16,7 @@ _begin_fstest auto quick
# Override the default cleanup function.
_cleanup()
{
- _unmount $fs_mnt
+ _unmount $fs_mnt &>> /dev/null
[ -n "$loop_dev" ] && _destroy_loop_device $loop_dev
cd /
rm -f $tmp.*
@@ -54,7 +54,7 @@ $XFS_IO_PROG -fc "pwrite 0 520m" $fs_mnt/testfile >>$seqres.full 2>&1
# remount should not hang
$MOUNT_PROG -o remount,ro $fs_mnt >>$seqres.full 2>&1
-_unmount $fs_mnt
+_unmount $fs_mnt &>/dev/null
_destroy_loop_device $loop_dev
unset loop_dev
diff --git a/tests/generic/590 b/tests/generic/590
index 1adeef4c2ad52c..ba1337a856f15d 100755
--- a/tests/generic/590
+++ b/tests/generic/590
@@ -15,7 +15,7 @@ _begin_fstest auto prealloc preallocrw
# Override the default cleanup function.
_cleanup()
{
- _scratch_unmount
+ _scratch_unmount &>/dev/null
[ -n "$loop_dev" ] && _destroy_loop_device $loop_dev
cd /
rm -f $tmp.*
diff --git a/tests/generic/746 b/tests/generic/746
index ba8ed25e845776..6f02b1cc354782 100755
--- a/tests/generic/746
+++ b/tests/generic/746
@@ -223,7 +223,7 @@ while read line; do
done < $fiemap_after
echo "done."
-_unmount $loop_mnt
+_unmount $loop_mnt &>/dev/null
_destroy_loop_device $loop_dev
unset loop_dev
diff --git a/tests/xfs/149 b/tests/xfs/149
index 9a96f82ede1761..28dfc7f04c1773 100755
--- a/tests/xfs/149
+++ b/tests/xfs/149
@@ -22,7 +22,7 @@ loop_symlink=$TEST_DIR/loop_symlink.$$
# Override the default cleanup function.
_cleanup()
{
- _unmount $mntdir
+ _unmount $mntdir &>/dev/null
[ -n "$loop_dev" ] && _destroy_loop_device $loop_dev
rmdir $mntdir
rm -f $loop_symlink
diff --git a/tests/xfs/530 b/tests/xfs/530
index d0d0e2665070f8..95ab32f1e1f828 100755
--- a/tests/xfs/530
+++ b/tests/xfs/530
@@ -116,7 +116,7 @@ done
echo "Check filesystem"
_check_scratch_fs
-_scratch_unmount
+_scratch_unmount &> /dev/null
_destroy_loop_device $rt_loop_dev
unset rt_loop_dev
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 19/34] mkfs: don't hardcode log size
2025-02-04 21:22 [PATCHSET v2] fstests: random fixes for v2025.02.02 Darrick J. Wong
` (17 preceding siblings ...)
2025-02-04 21:27 ` [PATCH 18/34] unmount: resume logging of stdout and stderr for filtering Darrick J. Wong
@ 2025-02-04 21:27 ` Darrick J. Wong
2025-02-05 0:40 ` Dave Chinner
2025-02-04 21:27 ` [PATCH 20/34] common/rc: return mount_ret in _try_scratch_mount Darrick J. Wong
` (15 subsequent siblings)
34 siblings, 1 reply; 68+ messages in thread
From: Darrick J. Wong @ 2025-02-04 21:27 UTC (permalink / raw)
To: zlang, djwong; +Cc: fstests, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Commit 000813899afb46 hardcoded a log size of 256MB into xfs/501,
xfs/502, and generic/530. This seems to be an attempt to reduce test
run times by increasing the log size so that more background threads can
run in parallel. Unfortunately, this breaks a couple of my test
configurations:
- External logs smaller than 256MB
- Internal logs where the AG size is less than 256MB
For example, here's seqres.full from a failed xfs/501 invocation:
** mkfs failed with extra mkfs options added to " -m metadir=2,autofsck=1,uquota,gquota,pquota, -d rtinherit=1," by test 501 **
** attempting to mkfs using only test 501 options: -l size=256m **
size 256m specified for log subvolume is too large, maximum is 32768 blocks
<snip>
mount -ortdev=/dev/sdb4 -ologdev=/dev/sdb2 /dev/sda4 /opt failed
umount: /dev/sda4: not mounted.
Note that there's some formatting error here, so we jettison the entire
rt configuration to force the log size option, but then mount fails
because we didn't edit out the rtdev option there too.
Fortunately, mkfs.xfs already /has/ a few options to try to improve
parallelism in the filesystem by avoiding contention on the log grant
heads by scaling up the log size. These options are aware of log and AG
size constraints so they won't conflict with other geometry options.
Use them.
Cc: <fstests@vger.kernel.org> # v2024.12.08
Fixes: 000813899afb46 ("fstests: scale some tests for high CPU count sanity")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
common/rc | 27 +++++++++++++++++++++++++++
tests/generic/530 | 6 +-----
tests/generic/531 | 6 +-----
tests/xfs/501 | 2 +-
tests/xfs/502 | 2 +-
5 files changed, 31 insertions(+), 12 deletions(-)
diff --git a/common/rc b/common/rc
index b78d903fe34ba3..03603a5198e3b6 100644
--- a/common/rc
+++ b/common/rc
@@ -688,6 +688,33 @@ _test_cycle_mount()
_test_mount
}
+# Are there mkfs options to try to improve concurrency?
+_scratch_mkfs_concurrency_options()
+{
+ local nr_cpus=$(getconf _NPROCESSORS_CONF)
+
+ case "$FSTYP" in
+ xfs)
+ # If any concurrency options are already specified, don't
+ # compute our own conflicting ones.
+ echo "$SCRATCH_OPTIONS $MKFS_OPTIONS" | \
+ grep -q 'concurrency=' &&
+ return
+
+ local sections=(d r)
+
+ # -l concurrency does not work with external logs
+ test _has_logdev || sections+=(l)
+
+ for section in "${sections[@]}"; do
+ $MKFS_XFS_PROG -$section concurrency=$nr_cpus 2>&1 | \
+ grep -q "unknown option -$section" ||
+ echo "-$section concurrency=$nr_cpus "
+ done
+ ;;
+ esac
+}
+
_scratch_mkfs_options()
{
_scratch_options mkfs
diff --git a/tests/generic/530 b/tests/generic/530
index f2513156a920e8..b2aab4f354e9f3 100755
--- a/tests/generic/530
+++ b/tests/generic/530
@@ -25,11 +25,7 @@ _require_test_program "t_open_tmpfiles"
# For XFS, pushing 50000 unlinked inode inactivations through a small xfs log
# can result in bottlenecks on the log grant heads, so try to make the log
# larger to reduce runtime.
-if [ "$FSTYP" = "xfs" ] && ! _has_logdev; then
- _scratch_mkfs "-l size=256m" >> $seqres.full 2>&1
-else
- _scratch_mkfs >> $seqres.full 2>&1
-fi
+_scratch_mkfs $(_scratch_mkfs_concurrency_options) >> $seqres.full 2>&1
_scratch_mount
# Set ULIMIT_NOFILE to min(file-max / 2, 50000 files per LOAD_FACTOR)
diff --git a/tests/generic/531 b/tests/generic/531
index ed6c3f91153ecc..07dffd9fd5108d 100755
--- a/tests/generic/531
+++ b/tests/generic/531
@@ -23,11 +23,7 @@ _require_test_program "t_open_tmpfiles"
# On high CPU count machines, this runs a -lot- of create and unlink
# concurrency. Set the filesytsem up to handle this.
-if [ $FSTYP = "xfs" ]; then
- _scratch_mkfs "-d agcount=32" >> $seqres.full 2>&1
-else
- _scratch_mkfs >> $seqres.full 2>&1
-fi
+_scratch_mkfs $(_scratch_mkfs_concurrency_options) >> $seqres.full 2>&1
_scratch_mount
# Try to load up all the CPUs, two threads per CPU.
diff --git a/tests/xfs/501 b/tests/xfs/501
index 678c51b52948c5..c62ec443848797 100755
--- a/tests/xfs/501
+++ b/tests/xfs/501
@@ -33,7 +33,7 @@ _require_xfs_sysfs debug/log_recovery_delay
_require_scratch
_require_test_program "t_open_tmpfiles"
-_scratch_mkfs "-l size=256m" >> $seqres.full 2>&1
+_scratch_mkfs $(_scratch_mkfs_concurrency_options) >> $seqres.full 2>&1
_scratch_mount
# Set ULIMIT_NOFILE to min(file-max / 2, 30000 files per LOAD_FACTOR)
diff --git a/tests/xfs/502 b/tests/xfs/502
index 10b0017f6b2eb2..ebfb5ce883c378 100755
--- a/tests/xfs/502
+++ b/tests/xfs/502
@@ -23,7 +23,7 @@ _require_xfs_io_error_injection "iunlink_fallback"
_require_scratch
_require_test_program "t_open_tmpfiles"
-_scratch_mkfs "-l size=256m" | _filter_mkfs 2> $tmp.mkfs > /dev/null
+_scratch_mkfs $(_scratch_mkfs_concurrency_options) | _filter_mkfs 2> $tmp.mkfs > /dev/null
cat $tmp.mkfs >> $seqres.full
. $tmp.mkfs
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 20/34] common/rc: return mount_ret in _try_scratch_mount
2025-02-04 21:22 [PATCHSET v2] fstests: random fixes for v2025.02.02 Darrick J. Wong
` (18 preceding siblings ...)
2025-02-04 21:27 ` [PATCH 19/34] mkfs: don't hardcode log size Darrick J. Wong
@ 2025-02-04 21:27 ` Darrick J. Wong
2025-02-05 0:42 ` Dave Chinner
2025-02-04 21:27 ` [PATCH 21/34] preamble: fix missing _kill_fsstress Darrick J. Wong
` (14 subsequent siblings)
34 siblings, 1 reply; 68+ messages in thread
From: Darrick J. Wong @ 2025-02-04 21:27 UTC (permalink / raw)
To: zlang, djwong; +Cc: fstests, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Make the _try_scratch_mount and _test_mount helpers return the exit code
from mount, not _prepare_for_eio_shutdown.
Cc: <fstests@vger.kernel.org> # v2024.12.08
Fixes: 1a49022fab9b4d ("fstests: always use fail-at-unmount semantics for XFS")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
common/rc | 2 ++
1 file changed, 2 insertions(+)
diff --git a/common/rc b/common/rc
index 03603a5198e3b6..56b4e7e018a8e0 100644
--- a/common/rc
+++ b/common/rc
@@ -440,6 +440,7 @@ _try_scratch_mount()
[ $mount_ret -ne 0 ] && return $mount_ret
_idmapped_mount $SCRATCH_DEV $SCRATCH_MNT
_prepare_for_eio_shutdown $SCRATCH_DEV
+ return $mount_ret
}
# mount scratch device with given options and _fail if mount fails
@@ -657,6 +658,7 @@ _test_mount()
[ $mount_ret -ne 0 ] && return $mount_ret
_idmapped_mount $TEST_DEV $TEST_DIR
_prepare_for_eio_shutdown $TEST_DEV
+ return $mount_ret
}
_test_unmount()
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 21/34] preamble: fix missing _kill_fsstress
2025-02-04 21:22 [PATCHSET v2] fstests: random fixes for v2025.02.02 Darrick J. Wong
` (19 preceding siblings ...)
2025-02-04 21:27 ` [PATCH 20/34] common/rc: return mount_ret in _try_scratch_mount Darrick J. Wong
@ 2025-02-04 21:27 ` Darrick J. Wong
2025-02-04 21:28 ` [PATCH 22/34] generic/650: revert SOAK DURATION changes Darrick J. Wong
` (13 subsequent siblings)
34 siblings, 0 replies; 68+ messages in thread
From: Darrick J. Wong @ 2025-02-04 21:27 UTC (permalink / raw)
To: zlang, djwong; +Cc: dchinner, fstests, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Commit 8973af00ec212f added a _kill_fsstress to the standard _cleanup
function. However, if something breaks during test program
initialization before it gets to sourcing common/rc, then you get
failures that look like this:
--- /tmp/fstests/tests/generic/556.out 2024-09-25 12:09:52.938797554 -0700
+++ /var/tmp/fstests/generic/556.out.bad 2025-01-04 22:34:01.268327003 -0800
@@ -1,16 +1,3 @@
QA output created by 556
-SCRATCH_MNT/basic Casefold
-SCRATCH_MNT/basic
-SCRATCH_MNT/casefold_flag_removal Casefold
-SCRATCH_MNT/casefold_flag_removal Casefold
-SCRATCH_MNT/flag_inheritance/d1/d2/d3 Casefold
-SCRATCH_MNT/symlink/ind1/TARGET
-mv: 'SCRATCH_MNT/rename/rename' and 'SCRATCH_MNT/rename/RENAME' are the same file
-# file: SCRATCH_MNT/xattrs/x
-user.foo="bar"
-
-# file: SCRATCH_MNT/xattrs/x/f1
-user.foo="bar"
-
-touch: 'SCRATCH_MNT/strict/corac'$'\314\247\303': Invalid argument
-touch: 'SCRATCH_MNT/strict/cora'$'\303\247\303': Invalid argument
+./tests/generic/556: 108: common/config: Syntax error: "&" unexpected
+./tests/generic/556: 10: _kill_fsstress: not found
It's that last line that's unnecessary. Fix this by checking for the
presence of a _kill_fsstress before invoking it.
Cc: <fstests@vger.kernel.org> # v2024.12.08
Fixes: 8973af00ec212f ("fstests: cleanup fsstress process management")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
---
common/preamble | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/common/preamble b/common/preamble
index 78e45d522f482c..0c9ee2e0377dd5 100644
--- a/common/preamble
+++ b/common/preamble
@@ -7,7 +7,7 @@
# Standard cleanup function. Individual tests can override this.
_cleanup()
{
- _kill_fsstress
+ command -v _kill_fsstress &>/dev/null && _kill_fsstress
cd /
rm -r -f $tmp.*
}
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 22/34] generic/650: revert SOAK DURATION changes
2025-02-04 21:22 [PATCHSET v2] fstests: random fixes for v2025.02.02 Darrick J. Wong
` (20 preceding siblings ...)
2025-02-04 21:27 ` [PATCH 21/34] preamble: fix missing _kill_fsstress Darrick J. Wong
@ 2025-02-04 21:28 ` Darrick J. Wong
2025-02-05 0:43 ` Dave Chinner
2025-02-04 21:28 ` [PATCH 23/34] generic/032: fix pinned mount failure Darrick J. Wong
` (12 subsequent siblings)
34 siblings, 1 reply; 68+ messages in thread
From: Darrick J. Wong @ 2025-02-04 21:28 UTC (permalink / raw)
To: zlang, djwong; +Cc: fstests, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Prior to commit 8973af00ec21, in the absence of an explicit
SOAK_DURATION, this test would run 2500 fsstress operations each of ten
times through the loop body. On the author's machines, this kept the
runtime to about 30s total. Oddly, this was changed to 30s per loop
body with no specific justification in the middle of an fsstress process
management change.
On the author's machine, this explodes the runtime from ~30s to 420s.
Put things back the way they were.
Cc: <fstests@vger.kernel.org> # v2024.12.08
Fixes: 8973af00ec212f ("fstests: cleanup fsstress process management")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
tests/generic/650 | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/tests/generic/650 b/tests/generic/650
index 60f86fdf518961..2e051b73156842 100755
--- a/tests/generic/650
+++ b/tests/generic/650
@@ -68,11 +68,11 @@ test "$nr_cpus" -gt 1024 && nr_cpus="$nr_hotplug_cpus"
fsstress_args+=(-p $nr_cpus)
if [ -n "$SOAK_DURATION" ]; then
test "$SOAK_DURATION" -lt 10 && SOAK_DURATION=10
+ fsstress_args+=(--duration="$((SOAK_DURATION / 10))")
else
- # run for 30s per iteration max
- SOAK_DURATION=300
+ # run for 3s per iteration max for a default runtime of ~30s.
+ fsstress_args+=(--duration=3)
fi
-fsstress_args+=(--duration="$((SOAK_DURATION / 10))")
nr_ops=$((2500 * TIME_FACTOR))
fsstress_args+=(-n $nr_ops)
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 23/34] generic/032: fix pinned mount failure
2025-02-04 21:22 [PATCHSET v2] fstests: random fixes for v2025.02.02 Darrick J. Wong
` (21 preceding siblings ...)
2025-02-04 21:28 ` [PATCH 22/34] generic/650: revert SOAK DURATION changes Darrick J. Wong
@ 2025-02-04 21:28 ` Darrick J. Wong
2025-02-04 21:28 ` [PATCH 24/34] fuzzy: stop __stress_scrub_fsx_loop if fsx fails Darrick J. Wong
` (11 subsequent siblings)
34 siblings, 0 replies; 68+ messages in thread
From: Darrick J. Wong @ 2025-02-04 21:28 UTC (permalink / raw)
To: zlang, djwong; +Cc: dchinner, fstests, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
generic/032 now periodically fails with:
--- /tmp/fstests/tests/generic/032.out 2025-01-05 11:42:14.427388698 -0800
+++ /var/tmp/fstests/generic/032.out.bad 2025-01-06 18:20:17.122818195 -0800
@@ -1,5 +1,7 @@
QA output created by 032
100 iterations
-000000 cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd >................<
-*
-100000
+umount: /opt: target is busy.
+mount: /opt: /dev/sda4 already mounted on /opt.
+ dmesg(1) may have more information after failed mount system call.
+cycle mount failed
+(see /var/tmp/fstests/generic/032.full for details)
The root cause of this regression is the _syncloop subshell. This
background process runs _scratch_sync, which is actually an xfs_io
process that calls syncfs on the scratch mount.
Unfortunately, while the test kills the _syncloop subshell, it doesn't
actually kill the xfs_io process. If the xfs_io process is in D state
running the syncfs, it won't react to the signal, but it will pin the
mount. Then the _scratch_cycle_mount fails because the mount is pinned.
Prior to commit 8973af00ec212f the _syncloop ran sync(1) which avoided
pinning the scratch filesystem.
Fix this by pgrepping for the xfs_io process and killing and waiting for
it if necessary.
Cc: <fstests@vger.kernel.org> # v2024.12.08
Fixes: 8973af00ec212f ("fstests: cleanup fsstress process management")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
---
common/rc | 10 ++++++++++
tests/generic/032 | 9 +++++++++
2 files changed, 19 insertions(+)
diff --git a/common/rc b/common/rc
index 56b4e7e018a8e0..46cd1ed16892e6 100644
--- a/common/rc
+++ b/common/rc
@@ -40,6 +40,16 @@ _pkill()
fi
}
+# Find only the test processes started by this test
+_pgrep()
+{
+ if [ "$FSTESTS_ISOL" = "setsid" ]; then
+ pgrep --session 0 "$@"
+ else
+ pgrep "$@"
+ fi
+}
+
# Common execution handling for fsstress invocation.
#
# We need per-test fsstress binaries because of the way fsstress forks and
diff --git a/tests/generic/032 b/tests/generic/032
index 30290c7225a2fa..48d594fe9315b8 100755
--- a/tests/generic/032
+++ b/tests/generic/032
@@ -81,6 +81,15 @@ echo $iters iterations
kill $syncpid
wait
+# The xfs_io instance started by _scratch_sync could be stuck in D state when
+# the subshell running _syncloop & is killed. That xfs_io process pins the
+# mount so we must kill it and wait for it to die before cycling the mount.
+dead_syncfs_pid=$(_pgrep xfs_io)
+if [ -n "$dead_syncfs_pid" ]; then
+ _pkill xfs_io
+ wait $dead_syncfs_pid
+fi
+
# clear page cache and dump the file
_scratch_cycle_mount
_hexdump $SCRATCH_MNT/file
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 24/34] fuzzy: stop __stress_scrub_fsx_loop if fsx fails
2025-02-04 21:22 [PATCHSET v2] fstests: random fixes for v2025.02.02 Darrick J. Wong
` (22 preceding siblings ...)
2025-02-04 21:28 ` [PATCH 23/34] generic/032: fix pinned mount failure Darrick J. Wong
@ 2025-02-04 21:28 ` Darrick J. Wong
2025-02-05 0:44 ` Dave Chinner
2025-02-04 21:28 ` [PATCH 25/34] fuzzy: don't use readarray for xfsfind output Darrick J. Wong
` (10 subsequent siblings)
34 siblings, 1 reply; 68+ messages in thread
From: Darrick J. Wong @ 2025-02-04 21:28 UTC (permalink / raw)
To: zlang, djwong; +Cc: fstests, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Stop the fsx scrub stress loop if fsx returns a nonzero error code.
Cc: <fstests@vger.kernel.org> # v2023.01.15
Fixes: a2056ca8917bc8 ("fuzzy: enhance scrub stress testing to use fsx")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
common/fuzzy | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/common/fuzzy b/common/fuzzy
index 6d390d4efbd3da..9d5f01fb087033 100644
--- a/common/fuzzy
+++ b/common/fuzzy
@@ -942,6 +942,7 @@ __stress_scrub_fsx_loop() {
local remount_period="$3"
local stress_tgt="$4" # ignored
local focus=(-q -X) # quiet, validate file contents
+ local res
# As of November 2022, 2 million fsx ops should be enough to keep
# any filesystem busy for a couple of hours.
@@ -993,7 +994,9 @@ __stress_scrub_fsx_loop() {
# Need to recheck running conditions if we cleared anything
__stress_scrub_clean_scratch && continue
$here/ltp/fsx $args >> $seqres.full
- echo "fsx exits with $? at $(date)" >> $seqres.full
+ res=$?
+ echo "fsx exits with $res at $(date)" >> $seqres.full
+ test "$res" -ne 0 && break
done
rm -f "$runningfile"
}
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 25/34] fuzzy: don't use readarray for xfsfind output
2025-02-04 21:22 [PATCHSET v2] fstests: random fixes for v2025.02.02 Darrick J. Wong
` (23 preceding siblings ...)
2025-02-04 21:28 ` [PATCH 24/34] fuzzy: stop __stress_scrub_fsx_loop if fsx fails Darrick J. Wong
@ 2025-02-04 21:28 ` Darrick J. Wong
2025-02-04 21:29 ` [PATCH 26/34] fuzzy: always stop the scrub fsstress loop on error Darrick J. Wong
` (9 subsequent siblings)
34 siblings, 0 replies; 68+ messages in thread
From: Darrick J. Wong @ 2025-02-04 21:28 UTC (permalink / raw)
To: zlang, djwong; +Cc: fstests, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Some of the scrub stress tests (e.g. xfs/796) walk the directory tree to
find filepaths to scrub, and load the entire list of paths into a bash
array. On a large filesystem or a long-running test this is hugely
wasteful of memory because we use each path exactly once.
Fix __stress_one_scrub_loop to avoid this by reading lines directly from
the output of the xfsfind utility. However, we play some games with fd
77 so that the processes in the loop body will use the same stdin as the
test and /not/ the piped stdout of xfsfind.
To avoid read(1) becoming confused by newlines in the file paths, adapt
xfsfind to print nulls between pathnames, and the bash code to recognize
them.
This was a debugging patch while I was trying to figure out why xfs/286
and other scrub soak tests started OOMing after the v2024.12.08 changes,
though in the end the OOMs were the result of memory leaks in fsstress.
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
common/fuzzy | 22 +++++++++++++---------
src/xfsfind.c | 14 +++++++++++---
2 files changed, 24 insertions(+), 12 deletions(-)
diff --git a/common/fuzzy b/common/fuzzy
index 9d5f01fb087033..41547add83121a 100644
--- a/common/fuzzy
+++ b/common/fuzzy
@@ -861,14 +861,14 @@ __stress_one_scrub_loop() {
;;
esac
- local target_cmd=(echo "$scrub_tgt")
+ local target_cmd=(echo -en "$scrub_tgt\0")
case "$scrub_tgt" in
- "%file%") target_cmd=($here/src/xfsfind -q "$SCRATCH_MNT");;
- "%attrfile%") target_cmd=($here/src/xfsfind -qa "$SCRATCH_MNT");;
- "%datafile%") target_cmd=($here/src/xfsfind -qb "$SCRATCH_MNT");;
- "%dir%") target_cmd=($here/src/xfsfind -qd "$SCRATCH_MNT");;
- "%regfile%") target_cmd=($here/src/xfsfind -qr "$SCRATCH_MNT");;
- "%cowfile%") target_cmd=($here/src/xfsfind -qs "$SCRATCH_MNT");;
+ "%file%") target_cmd=($here/src/xfsfind -0q "$SCRATCH_MNT");;
+ "%attrfile%") target_cmd=($here/src/xfsfind -0qa "$SCRATCH_MNT");;
+ "%datafile%") target_cmd=($here/src/xfsfind -0qb "$SCRATCH_MNT");;
+ "%dir%") target_cmd=($here/src/xfsfind -0qd "$SCRATCH_MNT");;
+ "%regfile%") target_cmd=($here/src/xfsfind -0qr "$SCRATCH_MNT");;
+ "%cowfile%") target_cmd=($here/src/xfsfind -0qs "$SCRATCH_MNT");;
esac
while __stress_scrub_running "$scrub_startat" "$runningfile"; do
@@ -876,12 +876,16 @@ __stress_one_scrub_loop() {
done
while __stress_scrub_running "$end" "$runningfile"; do
- readarray -t fnames < <("${target_cmd[@]}" 2>> $seqres.full)
- for fname in "${fnames[@]}"; do
+ # Attach the stdout of xfsfind to fd 77 so that we can read
+ # pathnames from that file descriptor without passing the pipe
+ # to the loop body as stdin.
+ exec 77< <("${target_cmd[@]}" 2>> $seqres.full)
+ while read -u 77 -d '' fname; do
$XFS_IO_PROG -x "${xfs_io_args[@]}" "$fname" 2>&1 | \
__stress_scrub_filter_output "${extra_filters[@]}"
__stress_scrub_running "$end" "$runningfile" || break
done
+ exec 77<&-
done
}
diff --git a/src/xfsfind.c b/src/xfsfind.c
index c81deaf64f57e9..2043d01ded3210 100644
--- a/src/xfsfind.c
+++ b/src/xfsfind.c
@@ -20,6 +20,7 @@ static int want_dir;
static int want_regfile;
static int want_sharedfile;
static int report_errors = 1;
+static int print0;
static int
check_datafile(
@@ -115,6 +116,7 @@ print_help(
printf("\n");
printf("Print all file paths matching any of the given predicates.\n");
printf("\n");
+ printf("-0 Print nulls between paths instead of newlines.\n");
printf("-a Match files with xattrs.\n");
printf("-b Match files with data blocks.\n");
printf("-d Match directories.\n");
@@ -208,8 +210,13 @@ visit(
out_fd:
close(fd);
out:
- if (printme)
- printf("%s\n", path);
+ if (printme) {
+ if (print0)
+ printf("%s%c", path, 0);
+ else
+ printf("%s\n", path);
+ fflush(stdout);
+ }
return retval;
}
@@ -236,8 +243,9 @@ main(
int c;
int ret;
- while ((c = getopt(argc, argv, "abdqrs")) >= 0) {
+ while ((c = getopt(argc, argv, "0abdqrs")) >= 0) {
switch (c) {
+ case '0': print0 = 1; break;
case 'a': want_attrfile = 1; break;
case 'b': want_datafile = 1; break;
case 'd': want_dir = 1; break;
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 26/34] fuzzy: always stop the scrub fsstress loop on error
2025-02-04 21:22 [PATCHSET v2] fstests: random fixes for v2025.02.02 Darrick J. Wong
` (24 preceding siblings ...)
2025-02-04 21:28 ` [PATCH 25/34] fuzzy: don't use readarray for xfsfind output Darrick J. Wong
@ 2025-02-04 21:29 ` Darrick J. Wong
2025-02-05 0:45 ` Dave Chinner
2025-02-04 21:29 ` [PATCH 27/34] fuzzy: port fsx and fsstress loop to use --duration Darrick J. Wong
` (8 subsequent siblings)
34 siblings, 1 reply; 68+ messages in thread
From: Darrick J. Wong @ 2025-02-04 21:29 UTC (permalink / raw)
To: zlang, djwong; +Cc: fstests, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Always abort the non-remount scrub fsstress loop in
__stress_scrub_fsstress_loop if fsstress returns a nonzero exit code.
Cc: <fstests@vger.kernel.org> # v2023.01.15
Fixes: 20df87599f66d0 ("fuzzy: make scrub stress loop control more robust")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
common/fuzzy | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/common/fuzzy b/common/fuzzy
index 41547add83121a..8afa4d35759f62 100644
--- a/common/fuzzy
+++ b/common/fuzzy
@@ -1012,6 +1012,7 @@ __stress_scrub_fsstress_loop() {
local remount_period="$3"
local stress_tgt="$4"
local focus=()
+ local res
case "$stress_tgt" in
"parent")
@@ -1143,7 +1144,9 @@ __stress_scrub_fsstress_loop() {
# Need to recheck running conditions if we cleared anything
__stress_scrub_clean_scratch && continue
_run_fsstress $args >> $seqres.full
- echo "fsstress exits with $? at $(date)" >> $seqres.full
+ res=$?
+ echo "$mode fsstress exits with $res at $(date)" >> $seqres.full
+ [ "$res" -ne 0 ] && break;
done
rm -f "$runningfile"
}
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 27/34] fuzzy: port fsx and fsstress loop to use --duration
2025-02-04 21:22 [PATCHSET v2] fstests: random fixes for v2025.02.02 Darrick J. Wong
` (25 preceding siblings ...)
2025-02-04 21:29 ` [PATCH 26/34] fuzzy: always stop the scrub fsstress loop on error Darrick J. Wong
@ 2025-02-04 21:29 ` Darrick J. Wong
2025-02-05 0:50 ` Dave Chinner
2025-02-04 21:29 ` [PATCH 28/34] fix _require_scratch_duperemove ordering Darrick J. Wong
` (7 subsequent siblings)
34 siblings, 1 reply; 68+ messages in thread
From: Darrick J. Wong @ 2025-02-04 21:29 UTC (permalink / raw)
To: zlang, djwong; +Cc: fstests, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Quite a while ago, I added --duration= arguments to fsx and fsstress,
and apparently I forgot to update the scrub stress loops to use them.
Replace the usage of timeout(1) for the remount_period versions of the
loop to clean up that code; and convert the non-remount loop so that
they don't run over time.
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
common/fuzzy | 47 +++++++++++++++++++++++++++++------------------
1 file changed, 29 insertions(+), 18 deletions(-)
diff --git a/common/fuzzy b/common/fuzzy
index 8afa4d35759f62..f676614d8343f7 100644
--- a/common/fuzzy
+++ b/common/fuzzy
@@ -939,6 +939,22 @@ __stress_scrub_clean_scratch() {
return 0
}
+# Compute a --duration= interval for fsx and fsstress
+___stress_scrub_duration()
+{
+ local end="$1"
+ local remount_period="$2"
+ local now="$(date +%s)"
+ local delta="$((end - now))"
+
+ test "$delta" -lt 0 && delta=0
+
+ test -n "$remount_period" && test "$remount_period" -lt "$delta" && \
+ delta="$remount_period"
+
+ echo "--duration=$delta"
+}
+
# Run fsx while we're testing online fsck.
__stress_scrub_fsx_loop() {
local end="$1"
@@ -946,11 +962,9 @@ __stress_scrub_fsx_loop() {
local remount_period="$3"
local stress_tgt="$4" # ignored
local focus=(-q -X) # quiet, validate file contents
+ local duration
local res
- # As of November 2022, 2 million fsx ops should be enough to keep
- # any filesystem busy for a couple of hours.
- focus+=(-N 2000000)
focus+=(-o $((128000 * LOAD_FACTOR)) )
focus+=(-l $((600000 * LOAD_FACTOR)) )
@@ -965,17 +979,12 @@ __stress_scrub_fsx_loop() {
# anything.
test "$mode" = "rw" && __stress_scrub_clean_scratch && continue
- timeout -s TERM "$remount_period" $here/ltp/fsx \
- $args $rw_arg >> $seqres.full
+ duration=$(___stress_scrub_duration "$end" "$remount_period")
+ $here/ltp/fsx $duration $args $rw_arg >> $seqres.full
res=$?
echo "$mode fsx exits with $res at $(date)" >> $seqres.full
- if [ "$res" -ne 0 ] && [ "$res" -ne 124 ]; then
- # Stop if fsx returns error. Mask off
- # the magic code 124 because that is how the
- # timeout(1) program communicates that we ran
- # out of time.
- break;
- fi
+ test "$res" -ne 0 && break
+
if [ "$mode" = "rw" ]; then
mode="ro"
rw_arg="-t 0 -w 0 -FHzCIJBE0"
@@ -997,7 +1006,8 @@ __stress_scrub_fsx_loop() {
while __stress_scrub_running "$end" "$runningfile"; do
# Need to recheck running conditions if we cleared anything
__stress_scrub_clean_scratch && continue
- $here/ltp/fsx $args >> $seqres.full
+ duration=$(___stress_scrub_duration "$end" "$remount_period")
+ $here/ltp/fsx $duration $args >> $seqres.full
res=$?
echo "fsx exits with $res at $(date)" >> $seqres.full
test "$res" -ne 0 && break
@@ -1013,6 +1023,7 @@ __stress_scrub_fsstress_loop() {
local stress_tgt="$4"
local focus=()
local res
+ local duration
case "$stress_tgt" in
"parent")
@@ -1102,9 +1113,7 @@ __stress_scrub_fsstress_loop() {
;;
esac
- # As of March 2022, 2 million fsstress ops should be enough to keep
- # any filesystem busy for a couple of hours.
- local args=$(_scale_fsstress_args -p 4 -d $SCRATCH_MNT -n 2000000 "${focus[@]}")
+ local args=$(_scale_fsstress_args -p 4 -d $SCRATCH_MNT "${focus[@]}")
echo "Running $FSSTRESS_PROG $args" >> $seqres.full
if [ -n "$remount_period" ]; then
@@ -1115,7 +1124,8 @@ __stress_scrub_fsstress_loop() {
# anything.
test "$mode" = "rw" && __stress_scrub_clean_scratch && continue
- _run_fsstress_bg $args $rw_arg >> $seqres.full
+ duration=$(___stress_scrub_duration "$end" "$remount_period")
+ _run_fsstress_bg $duration $args $rw_arg >> $seqres.full
sleep $remount_period
_kill_fsstress
res=$?
@@ -1143,7 +1153,8 @@ __stress_scrub_fsstress_loop() {
while __stress_scrub_running "$end" "$runningfile"; do
# Need to recheck running conditions if we cleared anything
__stress_scrub_clean_scratch && continue
- _run_fsstress $args >> $seqres.full
+ duration=$(___stress_scrub_duration "$end" "$remount_period")
+ _run_fsstress $duration $args >> $seqres.full
res=$?
echo "$mode fsstress exits with $res at $(date)" >> $seqres.full
[ "$res" -ne 0 ] && break;
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 28/34] fix _require_scratch_duperemove ordering
2025-02-04 21:22 [PATCHSET v2] fstests: random fixes for v2025.02.02 Darrick J. Wong
` (26 preceding siblings ...)
2025-02-04 21:29 ` [PATCH 27/34] fuzzy: port fsx and fsstress loop to use --duration Darrick J. Wong
@ 2025-02-04 21:29 ` Darrick J. Wong
2025-02-05 0:51 ` Dave Chinner
2025-02-04 21:29 ` [PATCH 29/34] fsstress: fix a memory leak Darrick J. Wong
` (6 subsequent siblings)
34 siblings, 1 reply; 68+ messages in thread
From: Darrick J. Wong @ 2025-02-04 21:29 UTC (permalink / raw)
To: zlang, djwong; +Cc: fstests, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Zorro complained that generic/559 stopped working, and I noticed that
the duperemove invocation in the _require_scratch_duperemove function
would fail with:
Error 2: No such file or directory while getting path to file /opt/file1. Skipping.
Error 2: No such file or directory while getting path to file /opt/file2. Skipping.
No dedupe candidates found.
Gathering file list...
The cause of this is the incorrect placement of _require_scratch_dedupe
after a _scratch_mount. _require_scratch_dedupe formats, mounts, tests,
and unmounts the scratch filesystem, which means that it should not come
between a _scratch_mount call and code that uses $SCRATCH_MNT.
Cc: <fstests@vger.kernel.org> # v2024.12.22
Fixes: 3b9f5fc7d7d853 ("common: call _require_scratch_dedupe from _require_scratch_duperemove")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
common/reflink | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/common/reflink b/common/reflink
index 9177c45e70bb37..757f06c1c69fa7 100644
--- a/common/reflink
+++ b/common/reflink
@@ -80,11 +80,11 @@ _require_scratch_duperemove()
{
_require_scratch
_require_command "$DUPEREMOVE_PROG" duperemove
-
- _scratch_mkfs > /dev/null
- _scratch_mount
_require_scratch_dedupe
+ _scratch_mkfs > /dev/null
+ _scratch_mount
+
dd if=/dev/zero of="$SCRATCH_MNT/file1" bs=128k count=1 >& /dev/null
dd if=/dev/zero of="$SCRATCH_MNT/file2" bs=128k count=1 >& /dev/null
if ! "$DUPEREMOVE_PROG" -d "$SCRATCH_MNT/file1" \
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 29/34] fsstress: fix a memory leak
2025-02-04 21:22 [PATCHSET v2] fstests: random fixes for v2025.02.02 Darrick J. Wong
` (27 preceding siblings ...)
2025-02-04 21:29 ` [PATCH 28/34] fix _require_scratch_duperemove ordering Darrick J. Wong
@ 2025-02-04 21:29 ` Darrick J. Wong
2025-02-05 0:54 ` Dave Chinner
2025-02-04 21:30 ` [PATCH 30/34] fsx: fix leaked log file pointer Darrick J. Wong
` (5 subsequent siblings)
34 siblings, 1 reply; 68+ messages in thread
From: Darrick J. Wong @ 2025-02-04 21:29 UTC (permalink / raw)
To: zlang, djwong; +Cc: fstests, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Someone forgot to free the iovec that readv_f allocates.
Fixes: 80499d8f5f251e ("fsstress: new writev and readv operations test")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
ltp/fsstress.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/ltp/fsstress.c b/ltp/fsstress.c
index 9e8eaa6d8656da..14c29921e8b0f8 100644
--- a/ltp/fsstress.c
+++ b/ltp/fsstress.c
@@ -4652,6 +4652,7 @@ readv_f(opnum_t opno, long r)
}
e = readv(fd, iov, iovcnt) < 0 ? errno : 0;
+ free(iov);
free(buf);
if (v)
printf("%d/%lld: readv %s%s [%lld,%d,%d] %d\n",
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 30/34] fsx: fix leaked log file pointer
2025-02-04 21:22 [PATCHSET v2] fstests: random fixes for v2025.02.02 Darrick J. Wong
` (28 preceding siblings ...)
2025-02-04 21:29 ` [PATCH 29/34] fsstress: fix a memory leak Darrick J. Wong
@ 2025-02-04 21:30 ` Darrick J. Wong
2025-02-05 0:57 ` Dave Chinner
2025-02-04 21:30 ` [PATCH 31/34] misc: don't put nr_cpus into the fsstress -n argument Darrick J. Wong
` (4 subsequent siblings)
34 siblings, 1 reply; 68+ messages in thread
From: Darrick J. Wong @ 2025-02-04 21:30 UTC (permalink / raw)
To: zlang, djwong; +Cc: fstests, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Fix a resource leaks in fsx, where we fail to close the fsx logfile,
because the C library could have some buffered contents that aren't
flushed when the program terminates. glibc seems to do this for us, but
I wouldn't be so sure about the others.
Fixes: 3f742550dfed84 ("fsx: add support for recording operations to a file")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
ltp/fsx.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/ltp/fsx.c b/ltp/fsx.c
index d1b0f245582b31..163b9453b5418b 100644
--- a/ltp/fsx.c
+++ b/ltp/fsx.c
@@ -3489,6 +3489,7 @@ main(int argc, char **argv)
if (recordops)
logdump();
+ fclose(fsxlogf);
exit(0);
return 0;
}
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 31/34] misc: don't put nr_cpus into the fsstress -n argument
2025-02-04 21:22 [PATCHSET v2] fstests: random fixes for v2025.02.02 Darrick J. Wong
` (29 preceding siblings ...)
2025-02-04 21:30 ` [PATCH 30/34] fsx: fix leaked log file pointer Darrick J. Wong
@ 2025-02-04 21:30 ` Darrick J. Wong
2025-02-05 1:00 ` Dave Chinner
2025-02-04 21:30 ` [PATCH 32/34] common/config: add $here to FSSTRESS_PROG Darrick J. Wong
` (3 subsequent siblings)
34 siblings, 1 reply; 68+ messages in thread
From: Darrick J. Wong @ 2025-02-04 21:30 UTC (permalink / raw)
To: zlang, djwong; +Cc: fstests, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
fsstress -n is the number of fs operations per process, not the total
number of operations. There's no need to factor nr_cpus into the -n
argument because that causes excess runtime as core count increases.
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
tests/generic/476 | 2 +-
tests/generic/642 | 2 +-
tests/generic/750 | 2 +-
3 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/tests/generic/476 b/tests/generic/476
index 4537fcd77d2f07..13979127245c77 100755
--- a/tests/generic/476
+++ b/tests/generic/476
@@ -18,7 +18,7 @@ _scratch_mkfs > $seqres.full 2>&1
_scratch_mount >> $seqres.full 2>&1
nr_cpus=$((LOAD_FACTOR * 4))
-nr_ops=$((25000 * nr_cpus * TIME_FACTOR))
+nr_ops=$((25000 * TIME_FACTOR))
fsstress_args=(-w -d $SCRATCH_MNT -n $nr_ops -p $nr_cpus)
test -n "$SOAK_DURATION" && fsstress_args+=(--duration="$SOAK_DURATION")
diff --git a/tests/generic/642 b/tests/generic/642
index 4b92a9c181d49c..3c418aaa32bd23 100755
--- a/tests/generic/642
+++ b/tests/generic/642
@@ -20,7 +20,7 @@ _scratch_mkfs > $seqres.full 2>&1
_scratch_mount >> $seqres.full 2>&1
nr_cpus=$((LOAD_FACTOR * 4))
-nr_ops=$((70000 * nr_cpus * TIME_FACTOR))
+nr_ops=$((70000 * TIME_FACTOR))
args=('-z' '-S' 'c')
diff --git a/tests/generic/750 b/tests/generic/750
index 5c54a5c7888f1d..a0828b50f3c7e4 100755
--- a/tests/generic/750
+++ b/tests/generic/750
@@ -37,7 +37,7 @@ _scratch_mkfs > $seqres.full 2>&1
_scratch_mount >> $seqres.full 2>&1
nr_cpus=$((LOAD_FACTOR * 4))
-nr_ops=$((25000 * nr_cpus * TIME_FACTOR))
+nr_ops=$((25000 * TIME_FACTOR))
fsstress_args=(-w -d $SCRATCH_MNT -n $nr_ops -p $nr_cpus)
test -n "$SOAK_DURATION" && fsstress_args+=(--duration="$SOAK_DURATION")
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 32/34] common/config: add $here to FSSTRESS_PROG
2025-02-04 21:22 [PATCHSET v2] fstests: random fixes for v2025.02.02 Darrick J. Wong
` (30 preceding siblings ...)
2025-02-04 21:30 ` [PATCH 31/34] misc: don't put nr_cpus into the fsstress -n argument Darrick J. Wong
@ 2025-02-04 21:30 ` Darrick J. Wong
2025-02-05 1:00 ` Dave Chinner
2025-02-04 21:30 ` [PATCH 33/34] config: add FSX_PROG variable Darrick J. Wong
` (2 subsequent siblings)
34 siblings, 1 reply; 68+ messages in thread
From: Darrick J. Wong @ 2025-02-04 21:30 UTC (permalink / raw)
To: zlang, djwong; +Cc: fstests, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
In general we're supposed to specify full paths to fstests binaries with
$here so that subtests can change the current working directory without
issues.
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
common/config | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/common/config b/common/config
index 77f3fc153eb731..ae9aa3f4b0b8fc 100644
--- a/common/config
+++ b/common/config
@@ -128,7 +128,7 @@ export MOUNT_PROG="$(type -P mount)"
export UMOUNT_PROG="$(type -P umount)"
[ "$UMOUNT_PROG" = "" ] && _fatal "umount not found"
-export FSSTRESS_PROG="./ltp/fsstress"
+export FSSTRESS_PROG="$here/ltp/fsstress"
[ ! -x $FSSTRESS_PROG ] && _fatal "fsstress not found or executable"
export PERL_PROG="$(type -P perl)"
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 33/34] config: add FSX_PROG variable
2025-02-04 21:22 [PATCHSET v2] fstests: random fixes for v2025.02.02 Darrick J. Wong
` (31 preceding siblings ...)
2025-02-04 21:30 ` [PATCH 32/34] common/config: add $here to FSSTRESS_PROG Darrick J. Wong
@ 2025-02-04 21:30 ` Darrick J. Wong
2025-02-05 1:04 ` Dave Chinner
2025-02-04 21:31 ` [PATCH 34/34] build: initialize stack variables to zero by default Darrick J. Wong
2025-02-05 12:46 ` [PATCHSET v2] fstests: random fixes for v2025.02.02 Amir Goldstein
34 siblings, 1 reply; 68+ messages in thread
From: Darrick J. Wong @ 2025-02-04 21:30 UTC (permalink / raw)
To: zlang, djwong; +Cc: fstests, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Replace the open-coded $here/ltp/fsx and ./ltp/fsx variants with a
single variable.
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
common/config | 3 +++
common/fuzzy | 6 +++---
common/rc | 4 ++--
tests/generic/075 | 2 +-
tests/generic/112 | 2 +-
tests/generic/127 | 16 ++++++++--------
tests/generic/231 | 4 ++--
tests/generic/455 | 2 +-
tests/generic/456 | 2 +-
tests/generic/457 | 2 +-
tests/generic/469 | 2 +-
tests/generic/499 | 2 +-
tests/generic/511 | 2 +-
13 files changed, 26 insertions(+), 23 deletions(-)
diff --git a/common/config b/common/config
index ae9aa3f4b0b8fc..193b7af432dc2b 100644
--- a/common/config
+++ b/common/config
@@ -131,6 +131,9 @@ export UMOUNT_PROG="$(type -P umount)"
export FSSTRESS_PROG="$here/ltp/fsstress"
[ ! -x $FSSTRESS_PROG ] && _fatal "fsstress not found or executable"
+export FSX_PROG="$here/ltp/fsx"
+[ ! -x $FSX_PROG ] && _fatal "fsx not found or executable"
+
export PERL_PROG="$(type -P perl)"
[ "$PERL_PROG" = "" ] && _fatal "perl not found"
diff --git a/common/fuzzy b/common/fuzzy
index f676614d8343f7..67f506a97d9845 100644
--- a/common/fuzzy
+++ b/common/fuzzy
@@ -969,7 +969,7 @@ __stress_scrub_fsx_loop() {
focus+=(-l $((600000 * LOAD_FACTOR)) )
local args="$FSX_AVOID ${focus[@]} ${SCRATCH_MNT}/fsx.$seq"
- echo "Running $here/ltp/fsx $args" >> $seqres.full
+ echo "Running $FSX_PROG $args" >> $seqres.full
if [ -n "$remount_period" ]; then
local mode="rw"
@@ -980,7 +980,7 @@ __stress_scrub_fsx_loop() {
test "$mode" = "rw" && __stress_scrub_clean_scratch && continue
duration=$(___stress_scrub_duration "$end" "$remount_period")
- $here/ltp/fsx $duration $args $rw_arg >> $seqres.full
+ $FSX_PROG $duration $args $rw_arg >> $seqres.full
res=$?
echo "$mode fsx exits with $res at $(date)" >> $seqres.full
test "$res" -ne 0 && break
@@ -1007,7 +1007,7 @@ __stress_scrub_fsx_loop() {
# Need to recheck running conditions if we cleared anything
__stress_scrub_clean_scratch && continue
duration=$(___stress_scrub_duration "$end" "$remount_period")
- $here/ltp/fsx $duration $args >> $seqres.full
+ $FSX_PROG $duration $args >> $seqres.full
res=$?
echo "fsx exits with $res at $(date)" >> $seqres.full
test "$res" -ne 0 && break
diff --git a/common/rc b/common/rc
index 46cd1ed16892e6..56b74287b6daa1 100644
--- a/common/rc
+++ b/common/rc
@@ -5050,7 +5050,7 @@ _get_page_size()
_require_hugepage_fsx()
{
- $here/ltp/fsx -N 0 -h $TEST_DIR 2>&1 | grep -q 'MADV_COLLAPSE not supported' && \
+ $FSX_PROG -N 0 -h $TEST_DIR 2>&1 | grep -q 'MADV_COLLAPSE not supported' && \
_notrun "fsx binary does not support MADV_COLLAPSE"
}
@@ -5058,7 +5058,7 @@ _run_fsx()
{
echo "fsx $*"
local args=`echo $@ | sed -e "s/ BSIZE / $bsize /g" -e "s/ PSIZE / $psize /g"`
- set -- $here/ltp/fsx $args $FSX_AVOID $TEST_DIR/junk
+ set -- $FSX_PROG $args $FSX_AVOID $TEST_DIR/junk
echo "$@" >>$seqres.full
rm -f $TEST_DIR/junk
"$@" 2>&1 | tee -a $seqres.full >$tmp.fsx
diff --git a/tests/generic/075 b/tests/generic/075
index 8db58e866d0d05..94a96e3341d3e7 100755
--- a/tests/generic/075
+++ b/tests/generic/075
@@ -53,7 +53,7 @@ _do_test()
# This cd and use of -P gets full debug on "$RESULT_DIR" (not TEST_DEV)
cd $out
- if ! $here/ltp/fsx $_param -P "$RESULT_DIR" $seq.$_n $FSX_AVOID &>/dev/null
+ if ! $FSX_PROG $_param -P "$RESULT_DIR" $seq.$_n $FSX_AVOID &>/dev/null
then
echo " fsx ($_param) failed, $? - compare $seqres.$_n.{good,bad,fsxlog}"
mv $out/$seq.$_n $seqres.$_n.full
diff --git a/tests/generic/112 b/tests/generic/112
index 0084b555a7f5e8..c171750055b1ee 100755
--- a/tests/generic/112
+++ b/tests/generic/112
@@ -53,7 +53,7 @@ _do_test()
# This cd and use of -P gets full debug on "$RESULT_DIR" (not TEST_DEV)
cd $out
- if ! $here/ltp/fsx $_param -P "$RESULT_DIR" $FSX_AVOID $seq.$_n &>/dev/null
+ if ! $FSX_PROG $_param -P "$RESULT_DIR" $FSX_AVOID $seq.$_n &>/dev/null
then
echo " fsx ($_param) returned $? - see $seq.$_n.full"
mv "$RESULT_DIR"/$seq.$_n.fsxlog $seqres.$_n.full
diff --git a/tests/generic/127 b/tests/generic/127
index 985c99cfb5bef7..fcd535b46075e2 100755
--- a/tests/generic/127
+++ b/tests/generic/127
@@ -31,9 +31,9 @@ FSX_ARGS="-q -l $FSX_FILE_SIZE -o 65536 -S 191110531 -N 100000"
_fsx_lite_nommap()
{
dd if=/dev/zero of=$TEST_DIR/fsx_lite_nommap bs=${FSX_FILE_SIZE} count=1 > /dev/null 2>&1
- if ! ltp/fsx $FSX_ARGS -L -R -W $FSX_AVOID $TEST_DIR/fsx_lite_nommap > $tmp.output 2>&1
+ if ! $FSX_PROG $FSX_ARGS -L -R -W $FSX_AVOID $TEST_DIR/fsx_lite_nommap > $tmp.output 2>&1
then
- echo "ltp/fsx $FSX_ARGS -L -R -W $TEST_DIR/fsx_lite_nommap"
+ echo "$FSX_PROG $FSX_ARGS -L -R -W $TEST_DIR/fsx_lite_nommap"
cat $tmp.output
return 1
fi
@@ -44,9 +44,9 @@ _fsx_lite_nommap()
_fsx_lite_mmap()
{
dd if=/dev/zero of=$TEST_DIR/fsx_lite_mmap bs=${FSX_FILE_SIZE} count=1 > /dev/null 2>&1
- if ! ltp/fsx $FSX_ARGS -L $FSX_AVOID $TEST_DIR/fsx_lite_mmap > $tmp.output 2>&1
+ if ! $FSX_PROG $FSX_ARGS -L $FSX_AVOID $TEST_DIR/fsx_lite_mmap > $tmp.output 2>&1
then
- echo "ltp/fsx $FSX_ARGS -L fsx_lite_mmap"
+ echo "$FSX_PROG $FSX_ARGS -L fsx_lite_mmap"
cat $tmp.output
return 1
fi
@@ -58,9 +58,9 @@ _fsx_std_nommap()
{
local fname="$TEST_DIR/$1"
- if ! ltp/fsx $FSX_ARGS -R -W $FSX_AVOID $fname > $tmp.output 2>&1
+ if ! $FSX_PROG $FSX_ARGS -R -W $FSX_AVOID $fname > $tmp.output 2>&1
then
- echo "ltp/fsx $FSX_ARGS -R -W fsx_std_nommap"
+ echo "$FSX_PROG $FSX_ARGS -R -W fsx_std_nommap"
cat $tmp.output
return 1
fi
@@ -72,9 +72,9 @@ _fsx_std_mmap()
{
local fname="$TEST_DIR/$1"
- if ! ltp/fsx $FSX_ARGS $FSX_AVOID $fname > $tmp.output 2>&1
+ if ! $FSX_PROG $FSX_ARGS $FSX_AVOID $fname > $tmp.output 2>&1
then
- echo "ltp/fsx $FSX_ARGS fsx_std_mmap"
+ echo "$FSX_PROG $FSX_ARGS fsx_std_mmap"
cat $tmp.output
return 1
fi
diff --git a/tests/generic/231 b/tests/generic/231
index 8dda926d875e88..b598a5d568bdf6 100755
--- a/tests/generic/231
+++ b/tests/generic/231
@@ -23,8 +23,8 @@ _fsx()
echo "=== FSX Standard Mode, Memory Mapping, $tasks Tasks ==="
for (( i = 1; i <= $tasks; i++ )); do
SEED=$RANDOM
- echo "ltp/fsx $FSX_ARGS -S $SEED $SCRATCH_MNT/fsx_file$i" >>$seqres.full
- _su $qa_user -c "ltp/fsx $FSX_ARGS -S $SEED \
+ echo "$FSX_PROG $FSX_ARGS -S $SEED $SCRATCH_MNT/fsx_file$i" >>$seqres.full
+ _su $qa_user -c "$FSX_PROG $FSX_ARGS -S $SEED \
$FSX_AVOID $SCRATCH_MNT/fsx_file$i" >$tmp.output$i 2>&1 &
done
diff --git a/tests/generic/455 b/tests/generic/455
index 31f84346daecde..a70da042f07a29 100755
--- a/tests/generic/455
+++ b/tests/generic/455
@@ -79,7 +79,7 @@ FSX_OPTS="-N $NUM_OPS -d -P $SANITY_DIR -i $LOGWRITES_DMDEV"
seeds=(0 0 0 0)
# Run fsx for a while
for j in `seq 0 $((NUM_FILES-1))`; do
- run_check $here/ltp/fsx $FSX_OPTS $FSX_AVOID -S ${seeds[$j]} -j $j $SCRATCH_MNT/testfile$j &
+ run_check $FSX_PROG $FSX_OPTS $FSX_AVOID -S ${seeds[$j]} -j $j $SCRATCH_MNT/testfile$j &
done
wait
diff --git a/tests/generic/456 b/tests/generic/456
index 0123508ce15076..32afa398f11ccc 100755
--- a/tests/generic/456
+++ b/tests/generic/456
@@ -53,7 +53,7 @@ write 0x3e5ec 0x1a14 0x21446
zero_range 0x20fac 0x6d9c 0x40000 keep_size
mapwrite 0x216ad 0x274f 0x40000
EOF
-run_check $here/ltp/fsx -d --replay-ops $fsxops $SCRATCH_MNT/testfile
+run_check $FSX_PROG -d --replay-ops $fsxops $SCRATCH_MNT/testfile
_flakey_drop_and_remount
_unmount_flakey
diff --git a/tests/generic/457 b/tests/generic/457
index defa73cfb5dfcd..b59cad5672f407 100755
--- a/tests/generic/457
+++ b/tests/generic/457
@@ -82,7 +82,7 @@ FSX_OPTS="-N $NUM_OPS -d -k -P $SANITY_DIR -i $LOGWRITES_DMDEV"
for j in `seq 0 $((NUM_FILES-1))`; do
# clone the clone from prev iteration which may have already mutated
_cp_reflink $SCRATCH_MNT/testfile$((j-1)) $SCRATCH_MNT/testfile$j
- run_check $here/ltp/fsx $FSX_OPTS $FSX_AVOID -S 0 -j $j $SCRATCH_MNT/testfile$j &
+ run_check $FSX_PROG $FSX_OPTS $FSX_AVOID -S 0 -j $j $SCRATCH_MNT/testfile$j &
done
wait
diff --git a/tests/generic/469 b/tests/generic/469
index 1352c3243f6df6..f7718185b7114a 100755
--- a/tests/generic/469
+++ b/tests/generic/469
@@ -35,7 +35,7 @@ _require_xfs_io_command "fzero"
run_fsx()
{
- $here/ltp/fsx $2 --replay-ops $1 $file 2>&1 | tee -a $seqres.full >$tmp.fsx
+ $FSX_PROG $2 --replay-ops $1 $file 2>&1 | tee -a $seqres.full >$tmp.fsx
if [ ${PIPESTATUS[0]} -ne 0 ]; then
cat $tmp.fsx
exit 1
diff --git a/tests/generic/499 b/tests/generic/499
index 9145c1c5dfdc72..868413bad698ea 100755
--- a/tests/generic/499
+++ b/tests/generic/499
@@ -35,7 +35,7 @@ ENDL
victim=$SCRATCH_MNT/a
touch $victim
-$here/ltp/fsx --replay-ops $tmp.fsxops $victim > $tmp.output 2>&1 || cat $tmp.output
+$FSX_PROG --replay-ops $tmp.fsxops $victim > $tmp.output 2>&1 || cat $tmp.output
echo "Silence is golden"
status=0
diff --git a/tests/generic/511 b/tests/generic/511
index 7c903fb0280d7f..66557ab9369059 100755
--- a/tests/generic/511
+++ b/tests/generic/511
@@ -32,7 +32,7 @@ ENDL
victim=$SCRATCH_MNT/a
touch $victim
-$here/ltp/fsx --replay-ops $tmp.fsxops $victim > $tmp.output 2>&1 || cat $tmp.output
+$FSX_PROG --replay-ops $tmp.fsxops $victim > $tmp.output 2>&1 || cat $tmp.output
echo "Silence is golden"
status=0
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 34/34] build: initialize stack variables to zero by default
2025-02-04 21:22 [PATCHSET v2] fstests: random fixes for v2025.02.02 Darrick J. Wong
` (32 preceding siblings ...)
2025-02-04 21:30 ` [PATCH 33/34] config: add FSX_PROG variable Darrick J. Wong
@ 2025-02-04 21:31 ` Darrick J. Wong
2025-02-05 1:05 ` Dave Chinner
2025-02-05 12:46 ` [PATCHSET v2] fstests: random fixes for v2025.02.02 Amir Goldstein
34 siblings, 1 reply; 68+ messages in thread
From: Darrick J. Wong @ 2025-02-04 21:31 UTC (permalink / raw)
To: zlang, djwong; +Cc: fstests, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Newer versions of gcc and clang can include the ability to zero stack
variables by default. Let's enable it so that we (a) reduce the risk of
writing stack contents to disk somewhere and (b) try to reduce
unpredictable program behavior based on random stack contents. The
kernel added this 6 years ago, so I think it's mature enough for
fstests.
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
configure.ac | 1 +
include/builddefs.in | 3 ++-
m4/package_libcdev.m4 | 14 ++++++++++++++
3 files changed, 17 insertions(+), 1 deletion(-)
diff --git a/configure.ac b/configure.ac
index c81411e735a90d..f3c8c643f0eb9e 100644
--- a/configure.ac
+++ b/configure.ac
@@ -72,6 +72,7 @@ AC_HAVE_NFTW
AC_HAVE_RLIMIT_NOFILE
AC_NEED_INTERNAL_XFS_IOC_EXCHANGE_RANGE
AC_HAVE_FICLONE
+AC_HAVE_TRIVIAL_AUTO_VAR_INIT
AC_CHECK_FUNCS([renameat2])
AC_CHECK_FUNCS([reallocarray])
diff --git a/include/builddefs.in b/include/builddefs.in
index 7274cde8d0814c..5b5864278682a4 100644
--- a/include/builddefs.in
+++ b/include/builddefs.in
@@ -76,6 +76,7 @@ NEED_INTERNAL_XFS_IOC_EXCHANGE_RANGE = @need_internal_xfs_ioc_exchange_range@
HAVE_FICLONE = @have_ficlone@
GCCFLAGS = -funsigned-char -fno-strict-aliasing -Wall
+SANITIZER_CFLAGS += @autovar_init_cflags@
ifeq ($(PKG_PLATFORM),linux)
PCFLAGS = -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 $(GCCFLAGS)
@@ -90,7 +91,7 @@ GCFLAGS = $(OPTIMIZER) $(DEBUG) $(CPPFLAGS) \
-I$(TOPDIR)/include -DVERSION=\"$(PKG_VERSION)\"
# Global, Platform, Local CFLAGS
-CFLAGS += $(GCFLAGS) $(PCFLAGS) $(LCFLAGS)
+CFLAGS += $(GCFLAGS) $(PCFLAGS) $(LCFLAGS) $(SANITIZER_CFLAGS)
include $(TOPDIR)/include/buildmacros
diff --git a/m4/package_libcdev.m4 b/m4/package_libcdev.m4
index a0d50f4d9b68e4..ed8fe6e32ae00a 100644
--- a/m4/package_libcdev.m4
+++ b/m4/package_libcdev.m4
@@ -72,3 +72,17 @@ AC_DEFUN([AC_HAVE_FICLONE],
AC_MSG_RESULT(yes)],[AC_MSG_RESULT(no)])
AC_SUBST(have_ficlone)
])
+
+# Check if we have -ftrivial-auto-var-init=zero
+AC_DEFUN([AC_HAVE_TRIVIAL_AUTO_VAR_INIT],
+ [ AC_MSG_CHECKING([if C compiler supports zeroing automatic vars])
+ OLD_CFLAGS="$CFLAGS"
+ TEST_CFLAGS="-ftrivial-auto-var-init=zero"
+ CFLAGS="$CFLAGS $TEST_CFLAGS"
+ AC_LINK_IFELSE([AC_LANG_PROGRAM([])],
+ [AC_MSG_RESULT([yes])]
+ [autovar_init_cflags=$TEST_CFLAGS],
+ [AC_MSG_RESULT([no])])
+ CFLAGS="${OLD_CFLAGS}"
+ AC_SUBST(autovar_init_cflags)
+ ])
^ permalink raw reply related [flat|nested] 68+ messages in thread
* Re: [PATCH 06/34] common/rc: revert recursive unmount in _clear_mount_stack
2025-02-04 21:23 ` [PATCH 06/34] common/rc: revert recursive unmount in _clear_mount_stack Darrick J. Wong
@ 2025-02-05 0:07 ` Dave Chinner
0 siblings, 0 replies; 68+ messages in thread
From: Dave Chinner @ 2025-02-05 0:07 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: zlang, fstests, linux-xfs
On Tue, Feb 04, 2025 at 01:23:51PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> Zorro complained about the following regression in generic/589 that I
> can't reproduce here on Debian 12:
....
> > Orignal _clear_mount_stack trys to umount all of them. But Dave gave -R option to
> > umount command, so
> > `umount -R /mnt/test/589-dst/201227_mpC` and `umount -R /mnt/test/589-src/201227_mpA`
> > already umount /mnt/test/589-dst and /mnt/test/589-src. recursively.
> > Then `umount -R /mnt/test/589-dst` shows "not mounted".
>
> I /think/ this is a result of commit 4c6bc4565105e6 performing this
> "conversion" in _clear_mount_stack:
>
> - $UMOUNT_PROG $MOUNTED_POINT_STACK
> + _unmount -R $MOUNTED_POINT_STACK
>
> This is not a 1:1 conversion here -- previously there was no
> -R(ecursive) unmount option, and now there is. It looks as though
> umount parses /proc/self/mountinfo to figure out what to unmount, and
> maybe on Zorro's system it balks if the argument passed is not present
> in that file? Debian 12's umount doesn't care.
Ah, I think I added that whilst trying to work out the weird
shenanigans that the mount namespace cloning that these tests did
caused. It was cloning a shared mount namespace that many tests
interacted and so any changes to the mount namespace inside that
test were visible everywhere.
If two of these mount namespace tests ran at the same time, they
stepped on each other and they failed to unmount their own stack
cleanly because of "mount busy at unmount" errors. This would then
cascade into other test failures because the test/scratch devices
couldn't be unmounted.
The fix I came up with was to run a recursive unmount for the stack,
which seemed to clear all the mounts the test made regardless of
whatever other test could see them. It didn't through any errors on
my test machines (debian unstable) so I promptly forgot about it.
> Regardless, there was no justification for this change in behavior that
> was buried in what is otherwise a hoist patch, so revert it. The author
> can resubmit the change with proper documentation.
I suspect that changes made later on (i.e. private mount namespaces
per runner instance) fixed the underlying namespace interaction
problem and made the recursive unmount unnecessary. The lack of
errors from it meant I likely forgot about it before I posted the
initial RFC that was merged...
>
> Cc: <fstests@vger.kernel.org> # v2024.12.08
> Fixes: 4c6bc4565105e6 ("fstests: clean up mount and unmount operations")
> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
> ---
> common/rc | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>
> diff --git a/common/rc b/common/rc
> index 4658e3b8be747f..07646927bad523 100644
> --- a/common/rc
> +++ b/common/rc
> @@ -334,7 +334,7 @@ _put_mount()
> _clear_mount_stack()
> {
> if [ -n "$MOUNTED_POINT_STACK" ]; then
> - _unmount -R $MOUNTED_POINT_STACK
> + _unmount $MOUNTED_POINT_STACK
> fi
> MOUNTED_POINT_STACK=""
> }
<tests>
That doesn't appear to cause any problems on my current
check-parallel stack, so I think the above explains how it got
there.
Reviewed-by: Dave Chinner <dchinner@redhat.com>
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 07/34] common/dump: don't replace pids arbitrarily
2025-02-04 21:24 ` [PATCH 07/34] common/dump: don't replace pids arbitrarily Darrick J. Wong
@ 2025-02-05 0:09 ` Dave Chinner
0 siblings, 0 replies; 68+ messages in thread
From: Dave Chinner @ 2025-02-05 0:09 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: zlang, fstests, linux-xfs
On Tue, Feb 04, 2025 at 01:24:07PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> In the next patch we'll run tests in a pid namespace, which means that
> the test process will have a very low pid. This low pid situation
> causes problems with the dump tests because they unconditionally replace
> it with the word "PID", which causes unnecessary test failures.
>
> Initially I was going to fix it by bracketing the regexp with a
> whitespace/punctuation/eol/sol detector, but then I decided to remove it
> see how many sheep came barreling through. None did, so I removed it
> entirely. The commit adding it (linked below) was not insightful at
> all.
>
> Fixes: 19beb54c96e363 ("Extra filtering as part of IRIX/Linux xfstests reconciliation for dump.")
> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
> ---
> common/dump | 1 -
> 1 file changed, 1 deletion(-)
>
>
> diff --git a/common/dump b/common/dump
> index 3761c16100d808..6dcd6250c33147 100644
> --- a/common/dump
> +++ b/common/dump
> @@ -907,7 +907,6 @@ _dir_filter()
> -e "s#$SCRATCH_MNT#SCRATCH_MNT#g" \
> -e "s#$dump_sdir#DUMP_SUBDIR#g" \
> -e "s#$restore_sdir#RESTORE_SUBDIR#g" \
> - -e "s#$$#PID#g" \
> -e "/Only in SCRATCH_MNT: .use_space/d" \
> -e "s#$RESULT_DIR/##g" \
Looks fine, if it causes problems further down the line then we can
implement a better filter.
Reviewed-by: Dave Chinner <dchinner@redhat.com>
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 11/34] common/rc: create a wrapper for the su command
2025-02-04 21:25 ` [PATCH 11/34] common/rc: create a wrapper for the su command Darrick J. Wong
@ 2025-02-05 0:14 ` Dave Chinner
0 siblings, 0 replies; 68+ messages in thread
From: Dave Chinner @ 2025-02-05 0:14 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: zlang, fstests, linux-xfs
On Tue, Feb 04, 2025 at 01:25:09PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> Create a _su wrapper around the su command so that the next patch can
> fix all the pkill isolation code so that this test can only kill
> processes started for this test.
>
> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Looks fine.
Reviewed-by: Dave Chinner <dchinner@redhat.com>
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 12/34] fuzzy: kill subprocesses with SIGPIPE, not SIGINT
2025-02-04 21:25 ` [PATCH 12/34] fuzzy: kill subprocesses with SIGPIPE, not SIGINT Darrick J. Wong
@ 2025-02-05 0:16 ` Dave Chinner
2025-02-05 17:38 ` Darrick J. Wong
0 siblings, 1 reply; 68+ messages in thread
From: Dave Chinner @ 2025-02-05 0:16 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: zlang, fstests, linux-xfs
On Tue, Feb 04, 2025 at 01:25:26PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> The next patch in this series fixes various issues with the recently
> added fstests process isolation scheme by running each new process in a
> separate process group session. Unfortunately, the processes in the
> session are created with SIGINT ignored by default because they are not
> attached to the controlling terminal. Therefore, switch the kill signal
> to SIGPIPE because that is usually fatal and not masked by default.
>
> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
> ---
> common/fuzzy | 13 ++++++-------
> 1 file changed, 6 insertions(+), 7 deletions(-)
Change looks fine, but _pkill is not yet defined. It is introduced
in the next patch "common/rc: hoist pkill to a helper function"
so this needs to be reordered.
With that done, however:
Reviewed-by: Dave Chinner <dchinner@redhat.com>
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 13/34] common/rc: hoist pkill to a helper function
2025-02-04 21:25 ` [PATCH 13/34] common/rc: hoist pkill to a helper function Darrick J. Wong
@ 2025-02-05 0:17 ` Dave Chinner
0 siblings, 0 replies; 68+ messages in thread
From: Dave Chinner @ 2025-02-05 0:17 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: zlang, fstests, linux-xfs
On Tue, Feb 04, 2025 at 01:25:42PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> Create a helper function to wrap pkill in preparation for the next
> patch.
>
> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Looks fine, apart from the ordering issue with the previous patch.
Reviewed-by: Dave Chinner <dchinner@redhat.com>
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 14/34] common: fix pkill by running test program in a separate session
2025-02-04 21:25 ` [PATCH 14/34] common: fix pkill by running test program in a separate session Darrick J. Wong
@ 2025-02-05 0:23 ` Dave Chinner
2025-02-05 17:43 ` Darrick J. Wong
0 siblings, 1 reply; 68+ messages in thread
From: Dave Chinner @ 2025-02-05 0:23 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: zlang, fstests, linux-xfs
On Tue, Feb 04, 2025 at 01:25:57PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> Run each test program with a separate session id so that we can tell
> pkill to kill all processes of a given name, but only within our own
> session id. This /should/ suffice to run multiple fstests on the same
> machine without one instance shooting down processes of another
> instance.
I thought you were going to drop this because pidns stuff available.
Also, because it is only check parallel that needs the pidns
isolation, and I'm not doing that external to check. Hence we can
just get rid of the 'pkill --parent' requirement because the
concurrent tests are already being run in isolated PID namespaces...
Regardless, if ppl still want to both pid session and pidns directly
into check, the code is fine.
Just one little nit:
> diff --git a/tools/run_seq_setsid b/tools/run_seq_setsid
> new file mode 100755
> index 00000000000000..5938f80e689255
> --- /dev/null
> +++ b/tools/run_seq_setsid
> @@ -0,0 +1,22 @@
> +#!/bin/bash
> +
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2025 Oracle. All Rights Reserved.
> +#
> +# Try starting things in a new process session so that test processes have
> +# something with which to filter only their own subprocesses.
> +
> +if [ -n "${FSTESTS_ISOL}" ]; then
> + # Allow the test to become a target of the oom killer
> + oom_knob="/proc/self/oom_score_adj"
> + test -w "${oom_knob}" && echo 250 > "${oom_knob}"
> +
> + exec "$@"
> +fi
> +
> +if [ -z "$1" ] || [ "$1" = "--help" ]; then
> + echo "Usage: $0 command [args...]"
> + exit 1
> +fi
> +
> +FSTESTS_ISOL=setsid exec setsid "$0" "$@"
The wrapper should be called 'run_setsid' because what check is
using it for has nothing to do with what the script actually does.
With that change:
Reviewed-by: Dave Chinner <dchinner@redhat.com>
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 15/34] check: run tests in a private pid/mount namespace
2025-02-04 21:26 ` [PATCH 15/34] check: run tests in a private pid/mount namespace Darrick J. Wong
@ 2025-02-05 0:37 ` Dave Chinner
2025-02-05 18:00 ` Darrick J. Wong
0 siblings, 1 reply; 68+ messages in thread
From: Dave Chinner @ 2025-02-05 0:37 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: zlang, fstests, linux-xfs
On Tue, Feb 04, 2025 at 01:26:13PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> As mentioned in the previous patch, trying to isolate processes from
> separate test instances through the use of distinct Unix process
> sessions is annoying due to the many complications with signal handling.
>
> Instead, we could just use nsexec to run the test program with a private
> pid namespace so that each test instance can only see its own processes;
> and private mount namespace so that tests writing to /tmp cannot clobber
> other tests or the stuff running on the main system.
>
> However, it's not guaranteed that a particular kernel has pid and mount
> namespaces enabled. Mount (2.4.19) and pid (2.6.24) namespaces have
> been around for a long time, but there's no hard requirement for the
> latter to be enabled in the kernel. Therefore, this bugfix slips
> namespace support in alongside the session id thing.
>
> Declaring CONFIG_PID_NS=n a deprecated configuration and removing
> support should be a separate conversation, not something that I have to
> do in a bug fix to get mainline QA back up.
>
> Cc: <fstests@vger.kernel.org> # v2024.12.08
> Fixes: 8973af00ec212f ("fstests: cleanup fsstress process management")
> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
> ---
> check | 34 +++++++++++++++++++++++-----------
> common/rc | 12 ++++++++++--
> src/nsexec.c | 18 +++++++++++++++---
> tests/generic/504 | 15 +++++++++++++--
> tools/run_seq_pidns | 28 ++++++++++++++++++++++++++++
> 5 files changed, 89 insertions(+), 18 deletions(-)
> create mode 100755 tools/run_seq_pidns
Same question as for session ids - is this all really necessary (or
desired) if check-parallel executes check in it's own private PID
namespace?
If so, then the code is fine apart from the same nit about
tools/run_seq_pidns - call it run_pidns because this helper will
also be used by check-parallel to run check in it's own private
mount and PID namespaces...
> diff --git a/tests/generic/504 b/tests/generic/504
> index 271c040e7b842a..96f18a0bbc7ba2 100755
> --- a/tests/generic/504
> +++ b/tests/generic/504
> @@ -18,7 +18,7 @@ _cleanup()
> {
> exec {test_fd}<&-
> cd /
> - rm -f $tmp.*
> + rm -r -f $tmp.*
> }
>
> # Import common functions.
> @@ -35,13 +35,24 @@ echo inode $tf_inode >> $seqres.full
>
> # Create new fd by exec
> exec {test_fd}> $testfile
> -# flock locks the fd then exits, we should see the lock info even the owner is dead
> +# flock locks the fd then exits, we should see the lock info even the owner is
> +# dead. If we're using pid namespace isolation we have to move /proc so that
> +# we can access the /proc/locks from the init_pid_ns.
> +if [ "$FSTESTS_ISOL" = "privatens" ]; then
> + move_proc="$tmp.procdir"
> + mkdir -p "$move_proc"
> + mount --move /proc "$move_proc"
> +fi
> flock -x $test_fd
> cat /proc/locks >> $seqres.full
>
> # Checking
> grep -q ":$tf_inode " /proc/locks || echo "lock info not found"
>
> +if [ -n "$move_proc" ]; then
> + mount --move "$move_proc" /proc
> +fi
> +
> # success, all done
> status=0
> echo "Silence is golden"
Urk. That explains the failure I've noticed but not had time to
debug from check-parallel when using a private pidns. Do you know
why /proc/locks in the overlaid mount does not show the locks taken
from within that namespace? Is that a bug in the namespace/lock
code?
Regardless, the code looks ok so with the helper renamed:
Reviewed-by: Dave Chinner <dchinner@redhat.com>
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 16/34] check: deprecate using process sessions to isolate test instances
2025-02-04 21:26 ` [PATCH 16/34] check: deprecate using process sessions to isolate test instances Darrick J. Wong
@ 2025-02-05 0:38 ` Dave Chinner
0 siblings, 0 replies; 68+ messages in thread
From: Dave Chinner @ 2025-02-05 0:38 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: zlang, fstests, linux-xfs
On Tue, Feb 04, 2025 at 01:26:29PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> As I've noted elsewhere, the use of process session ids to "isolate"
> test instances from killing each other is kind of hacky and creates
> other weird side effects. I'd rather everyone use the new code that
> runs everything in proper isolation with private pid and mount
> namespaces, but I don't know how many people this would break were it a
> hard dependency.
>
> Deprecate the process session handling immediately with a warning that
> we're going to rip it out in a year.
>
> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Ok, that answers some of the questions I raised - the session-id
support is time limited and makes people very aware of it.
Reviewed-by: Dave Chinner <dchinner@redhat.com>
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 19/34] mkfs: don't hardcode log size
2025-02-04 21:27 ` [PATCH 19/34] mkfs: don't hardcode log size Darrick J. Wong
@ 2025-02-05 0:40 ` Dave Chinner
0 siblings, 0 replies; 68+ messages in thread
From: Dave Chinner @ 2025-02-05 0:40 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: zlang, fstests, linux-xfs
On Tue, Feb 04, 2025 at 01:27:15PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> Commit 000813899afb46 hardcoded a log size of 256MB into xfs/501,
> xfs/502, and generic/530. This seems to be an attempt to reduce test
> run times by increasing the log size so that more background threads can
> run in parallel. Unfortunately, this breaks a couple of my test
> configurations:
>
> - External logs smaller than 256MB
> - Internal logs where the AG size is less than 256MB
>
> For example, here's seqres.full from a failed xfs/501 invocation:
>
> ** mkfs failed with extra mkfs options added to " -m metadir=2,autofsck=1,uquota,gquota,pquota, -d rtinherit=1," by test 501 **
> ** attempting to mkfs using only test 501 options: -l size=256m **
> size 256m specified for log subvolume is too large, maximum is 32768 blocks
> <snip>
> mount -ortdev=/dev/sdb4 -ologdev=/dev/sdb2 /dev/sda4 /opt failed
> umount: /dev/sda4: not mounted.
>
> Note that there's some formatting error here, so we jettison the entire
> rt configuration to force the log size option, but then mount fails
> because we didn't edit out the rtdev option there too.
>
> Fortunately, mkfs.xfs already /has/ a few options to try to improve
> parallelism in the filesystem by avoiding contention on the log grant
> heads by scaling up the log size. These options are aware of log and AG
> size constraints so they won't conflict with other geometry options.
>
> Use them.
Good solution to the problem.
Reviewed-by: Dave Chinner <dchinner@redhat.com>
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 20/34] common/rc: return mount_ret in _try_scratch_mount
2025-02-04 21:27 ` [PATCH 20/34] common/rc: return mount_ret in _try_scratch_mount Darrick J. Wong
@ 2025-02-05 0:42 ` Dave Chinner
2025-02-05 18:03 ` Darrick J. Wong
0 siblings, 1 reply; 68+ messages in thread
From: Dave Chinner @ 2025-02-05 0:42 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: zlang, fstests, linux-xfs
On Tue, Feb 04, 2025 at 01:27:31PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> Make the _try_scratch_mount and _test_mount helpers return the exit code
> from mount, not _prepare_for_eio_shutdown.
>
> Cc: <fstests@vger.kernel.org> # v2024.12.08
> Fixes: 1a49022fab9b4d ("fstests: always use fail-at-unmount semantics for XFS")
> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Ack, missed that. Though:
> ---
> common/rc | 2 ++
> 1 file changed, 2 insertions(+)
>
>
> diff --git a/common/rc b/common/rc
> index 03603a5198e3b6..56b4e7e018a8e0 100644
> --- a/common/rc
> +++ b/common/rc
> @@ -440,6 +440,7 @@ _try_scratch_mount()
> [ $mount_ret -ne 0 ] && return $mount_ret
> _idmapped_mount $SCRATCH_DEV $SCRATCH_MNT
> _prepare_for_eio_shutdown $SCRATCH_DEV
> + return $mount_ret
These could just be 'return 0' because we've already checked
for $mount_ret being non-zero.
Regardless, it gives the same result, so:
Reviewed-by: Dave Chinner <dchinner@redhat.com>
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 22/34] generic/650: revert SOAK DURATION changes
2025-02-04 21:28 ` [PATCH 22/34] generic/650: revert SOAK DURATION changes Darrick J. Wong
@ 2025-02-05 0:43 ` Dave Chinner
0 siblings, 0 replies; 68+ messages in thread
From: Dave Chinner @ 2025-02-05 0:43 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: zlang, fstests, linux-xfs
On Tue, Feb 04, 2025 at 01:28:02PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> Prior to commit 8973af00ec21, in the absence of an explicit
> SOAK_DURATION, this test would run 2500 fsstress operations each of ten
> times through the loop body. On the author's machines, this kept the
> runtime to about 30s total. Oddly, this was changed to 30s per loop
> body with no specific justification in the middle of an fsstress process
> management change.
>
> On the author's machine, this explodes the runtime from ~30s to 420s.
> Put things back the way they were.
>
> Cc: <fstests@vger.kernel.org> # v2024.12.08
> Fixes: 8973af00ec212f ("fstests: cleanup fsstress process management")
> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
> ---
> tests/generic/650 | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
>
> diff --git a/tests/generic/650 b/tests/generic/650
> index 60f86fdf518961..2e051b73156842 100755
> --- a/tests/generic/650
> +++ b/tests/generic/650
> @@ -68,11 +68,11 @@ test "$nr_cpus" -gt 1024 && nr_cpus="$nr_hotplug_cpus"
> fsstress_args+=(-p $nr_cpus)
> if [ -n "$SOAK_DURATION" ]; then
> test "$SOAK_DURATION" -lt 10 && SOAK_DURATION=10
> + fsstress_args+=(--duration="$((SOAK_DURATION / 10))")
> else
> - # run for 30s per iteration max
> - SOAK_DURATION=300
> + # run for 3s per iteration max for a default runtime of ~30s.
> + fsstress_args+=(--duration=3)
Works for me.
Reviewed-by: Dave Chinner <dchinner@redhat.com>
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 24/34] fuzzy: stop __stress_scrub_fsx_loop if fsx fails
2025-02-04 21:28 ` [PATCH 24/34] fuzzy: stop __stress_scrub_fsx_loop if fsx fails Darrick J. Wong
@ 2025-02-05 0:44 ` Dave Chinner
0 siblings, 0 replies; 68+ messages in thread
From: Dave Chinner @ 2025-02-05 0:44 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: zlang, fstests, linux-xfs
On Tue, Feb 04, 2025 at 01:28:34PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> Stop the fsx scrub stress loop if fsx returns a nonzero error code.
>
> Cc: <fstests@vger.kernel.org> # v2023.01.15
> Fixes: a2056ca8917bc8 ("fuzzy: enhance scrub stress testing to use fsx")
> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
> ---
> common/fuzzy | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
looks good.
Reviewed-by: Dave Chinner <dchinner@redhat.com>
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 26/34] fuzzy: always stop the scrub fsstress loop on error
2025-02-04 21:29 ` [PATCH 26/34] fuzzy: always stop the scrub fsstress loop on error Darrick J. Wong
@ 2025-02-05 0:45 ` Dave Chinner
0 siblings, 0 replies; 68+ messages in thread
From: Dave Chinner @ 2025-02-05 0:45 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: zlang, fstests, linux-xfs
On Tue, Feb 04, 2025 at 01:29:05PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> Always abort the non-remount scrub fsstress loop in
> __stress_scrub_fsstress_loop if fsstress returns a nonzero exit code.
>
> Cc: <fstests@vger.kernel.org> # v2023.01.15
> Fixes: 20df87599f66d0 ("fuzzy: make scrub stress loop control more robust")
> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
> ---
> common/fuzzy | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
Looks fine.
Reviewed-by: Dave Chinner <dchinner@redhat.com>
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 27/34] fuzzy: port fsx and fsstress loop to use --duration
2025-02-04 21:29 ` [PATCH 27/34] fuzzy: port fsx and fsstress loop to use --duration Darrick J. Wong
@ 2025-02-05 0:50 ` Dave Chinner
2025-02-05 18:08 ` Darrick J. Wong
0 siblings, 1 reply; 68+ messages in thread
From: Dave Chinner @ 2025-02-05 0:50 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: zlang, fstests, linux-xfs
On Tue, Feb 04, 2025 at 01:29:20PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> Quite a while ago, I added --duration= arguments to fsx and fsstress,
> and apparently I forgot to update the scrub stress loops to use them.
> Replace the usage of timeout(1) for the remount_period versions of the
> loop to clean up that code; and convert the non-remount loop so that
> they don't run over time.
....
> @@ -1115,7 +1124,8 @@ __stress_scrub_fsstress_loop() {
> # anything.
> test "$mode" = "rw" && __stress_scrub_clean_scratch && continue
>
> - _run_fsstress_bg $args $rw_arg >> $seqres.full
> + duration=$(___stress_scrub_duration "$end" "$remount_period")
> + _run_fsstress_bg $duration $args $rw_arg >> $seqres.full
> sleep $remount_period
> _kill_fsstress
Why does this need to run fsstress in the background any more? If it
is only going to run for $remount_period, then run it in the
foreground and get rid of the sleep/kill that stopped it after
$remount_period. i.e. doesn't this:
- _run_fsstress_bg $args $rw_arg >> $seqres.full
+ duration=$(___stress_scrub_duration "$end" "$remount_period")
+ _run_fsstress $duration $args $rw_arg >> $seqres.full
- sleep $remount_period
- _kill_fsstress
do the same thing, only cleaner?
-Dave.
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 28/34] fix _require_scratch_duperemove ordering
2025-02-04 21:29 ` [PATCH 28/34] fix _require_scratch_duperemove ordering Darrick J. Wong
@ 2025-02-05 0:51 ` Dave Chinner
0 siblings, 0 replies; 68+ messages in thread
From: Dave Chinner @ 2025-02-05 0:51 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: zlang, fstests, linux-xfs
On Tue, Feb 04, 2025 at 01:29:36PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> Zorro complained that generic/559 stopped working, and I noticed that
> the duperemove invocation in the _require_scratch_duperemove function
> would fail with:
>
> Error 2: No such file or directory while getting path to file /opt/file1. Skipping.
> Error 2: No such file or directory while getting path to file /opt/file2. Skipping.
> No dedupe candidates found.
> Gathering file list...
>
> The cause of this is the incorrect placement of _require_scratch_dedupe
> after a _scratch_mount. _require_scratch_dedupe formats, mounts, tests,
> and unmounts the scratch filesystem, which means that it should not come
> between a _scratch_mount call and code that uses $SCRATCH_MNT.
>
> Cc: <fstests@vger.kernel.org> # v2024.12.22
> Fixes: 3b9f5fc7d7d853 ("common: call _require_scratch_dedupe from _require_scratch_duperemove")
> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Looks good.
Reviewed-by: Dave Chinner <dchinner@redhat.com>
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 29/34] fsstress: fix a memory leak
2025-02-04 21:29 ` [PATCH 29/34] fsstress: fix a memory leak Darrick J. Wong
@ 2025-02-05 0:54 ` Dave Chinner
0 siblings, 0 replies; 68+ messages in thread
From: Dave Chinner @ 2025-02-05 0:54 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: zlang, fstests, linux-xfs
On Tue, Feb 04, 2025 at 01:29:52PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> Someone forgot to free the iovec that readv_f allocates.
>
> Fixes: 80499d8f5f251e ("fsstress: new writev and readv operations test")
> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
> ---
> ltp/fsstress.c | 1 +
> 1 file changed, 1 insertion(+)
>
>
> diff --git a/ltp/fsstress.c b/ltp/fsstress.c
> index 9e8eaa6d8656da..14c29921e8b0f8 100644
> --- a/ltp/fsstress.c
> +++ b/ltp/fsstress.c
> @@ -4652,6 +4652,7 @@ readv_f(opnum_t opno, long r)
> }
>
> e = readv(fd, iov, iovcnt) < 0 ? errno : 0;
> + free(iov);
> free(buf);
> if (v)
> printf("%d/%lld: readv %s%s [%lld,%d,%d] %d\n",
looks good - writev_f() frees the iov, so this now matches.
Reviewed-by: Dave Chinner <dchinner@redhat.com>
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 30/34] fsx: fix leaked log file pointer
2025-02-04 21:30 ` [PATCH 30/34] fsx: fix leaked log file pointer Darrick J. Wong
@ 2025-02-05 0:57 ` Dave Chinner
0 siblings, 0 replies; 68+ messages in thread
From: Dave Chinner @ 2025-02-05 0:57 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: zlang, fstests, linux-xfs
On Tue, Feb 04, 2025 at 01:30:07PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> Fix a resource leaks in fsx, where we fail to close the fsx logfile,
> because the C library could have some buffered contents that aren't
> flushed when the program terminates. glibc seems to do this for us, but
> I wouldn't be so sure about the others.
>
> Fixes: 3f742550dfed84 ("fsx: add support for recording operations to a file")
> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
> ---
> ltp/fsx.c | 1 +
> 1 file changed, 1 insertion(+)
>
>
> diff --git a/ltp/fsx.c b/ltp/fsx.c
> index d1b0f245582b31..163b9453b5418b 100644
> --- a/ltp/fsx.c
> +++ b/ltp/fsx.c
> @@ -3489,6 +3489,7 @@ main(int argc, char **argv)
> if (recordops)
> logdump();
>
> + fclose(fsxlogf);
> exit(0);
> return 0;
> }
Looks fine.
Reviewed-by: Dave Chinner <dchinner@redhat.com>
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 31/34] misc: don't put nr_cpus into the fsstress -n argument
2025-02-04 21:30 ` [PATCH 31/34] misc: don't put nr_cpus into the fsstress -n argument Darrick J. Wong
@ 2025-02-05 1:00 ` Dave Chinner
0 siblings, 0 replies; 68+ messages in thread
From: Dave Chinner @ 2025-02-05 1:00 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: zlang, fstests, linux-xfs
On Tue, Feb 04, 2025 at 01:30:23PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> fsstress -n is the number of fs operations per process, not the total
> number of operations. There's no need to factor nr_cpus into the -n
> argument because that causes excess runtime as core count increases.
>
> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
> ---
> tests/generic/476 | 2 +-
> tests/generic/642 | 2 +-
> tests/generic/750 | 2 +-
> 3 files changed, 3 insertions(+), 3 deletions(-)
Looks good. That'll certainly help keep runtime down on large
machines.
Reviewed-by: Dave Chinner <dchinner@redhat.com>
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 32/34] common/config: add $here to FSSTRESS_PROG
2025-02-04 21:30 ` [PATCH 32/34] common/config: add $here to FSSTRESS_PROG Darrick J. Wong
@ 2025-02-05 1:00 ` Dave Chinner
0 siblings, 0 replies; 68+ messages in thread
From: Dave Chinner @ 2025-02-05 1:00 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: zlang, fstests, linux-xfs
On Tue, Feb 04, 2025 at 01:30:39PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> In general we're supposed to specify full paths to fstests binaries with
> $here so that subtests can change the current working directory without
> issues.
>
> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
> ---
> common/config | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
Makes sense.
Reviewed-by: Dave Chinner <dchinner@redhat.com>
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 33/34] config: add FSX_PROG variable
2025-02-04 21:30 ` [PATCH 33/34] config: add FSX_PROG variable Darrick J. Wong
@ 2025-02-05 1:04 ` Dave Chinner
0 siblings, 0 replies; 68+ messages in thread
From: Dave Chinner @ 2025-02-05 1:04 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: zlang, fstests, linux-xfs
On Tue, Feb 04, 2025 at 01:30:54PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> Replace the open-coded $here/ltp/fsx and ./ltp/fsx variants with a
> single variable.
>
> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
> ---
> common/config | 3 +++
> common/fuzzy | 6 +++---
> common/rc | 4 ++--
> tests/generic/075 | 2 +-
> tests/generic/112 | 2 +-
> tests/generic/127 | 16 ++++++++--------
> tests/generic/231 | 4 ++--
> tests/generic/455 | 2 +-
> tests/generic/456 | 2 +-
> tests/generic/457 | 2 +-
> tests/generic/469 | 2 +-
> tests/generic/499 | 2 +-
> tests/generic/511 | 2 +-
> 13 files changed, 26 insertions(+), 23 deletions(-)
Looks fine.
Reviewed-by: Dave Chinner <dchinner@redhat.com>
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 34/34] build: initialize stack variables to zero by default
2025-02-04 21:31 ` [PATCH 34/34] build: initialize stack variables to zero by default Darrick J. Wong
@ 2025-02-05 1:05 ` Dave Chinner
0 siblings, 0 replies; 68+ messages in thread
From: Dave Chinner @ 2025-02-05 1:05 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: zlang, fstests, linux-xfs
On Tue, Feb 04, 2025 at 01:31:10PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> Newer versions of gcc and clang can include the ability to zero stack
> variables by default. Let's enable it so that we (a) reduce the risk of
> writing stack contents to disk somewhere and (b) try to reduce
> unpredictable program behavior based on random stack contents. The
> kernel added this 6 years ago, so I think it's mature enough for
> fstests.
>
> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
> ---
> configure.ac | 1 +
> include/builddefs.in | 3 ++-
> m4/package_libcdev.m4 | 14 ++++++++++++++
> 3 files changed, 17 insertions(+), 1 deletion(-)
Looks fine.
Reviewed-by: Dave Chinner <dchinner@redhat.com>
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCHSET v2] fstests: random fixes for v2025.02.02
2025-02-04 21:22 [PATCHSET v2] fstests: random fixes for v2025.02.02 Darrick J. Wong
` (33 preceding siblings ...)
2025-02-04 21:31 ` [PATCH 34/34] build: initialize stack variables to zero by default Darrick J. Wong
@ 2025-02-05 12:46 ` Amir Goldstein
34 siblings, 0 replies; 68+ messages in thread
From: Amir Goldstein @ 2025-02-05 12:46 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: zlang, dchinner, joannelkoong, fstests, linux-xfs
On Tue, Feb 4, 2025 at 10:22 PM Darrick J. Wong <djwong@kernel.org> wrote:
>
> Hi all,
>
> This is an unusually large set of bug fixes for fstests. The first 20
> patches in this patchset are corrections for that RFC series.
>
> The most significant change is that I made ./check run each test with
> its own Unix process session id. This means that a test can use pkill
> to kill all of its own subprocesses, without killing anyone else's
> subprocesses. I wasn't completely sold on that approach, but it works
> for me. A better approach is to run each test in a separate pid and
> mount namespace, but then kernel support for that becomes a hard
> requirement. Both approaches seems to work for check and
> check-parallel, though I've not tested that all that much.
>
> Note: I am /not/ happy about Dave's RFC going straight to for-next
> without even a complete review right before everyone went on PTO for
> several weeks for xmas/solar new year. But in the interests of getting
> QA back on line for myself and everyone else who's having problems, here
> it is.
Thank you!!!
Amir.
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 12/34] fuzzy: kill subprocesses with SIGPIPE, not SIGINT
2025-02-05 0:16 ` Dave Chinner
@ 2025-02-05 17:38 ` Darrick J. Wong
0 siblings, 0 replies; 68+ messages in thread
From: Darrick J. Wong @ 2025-02-05 17:38 UTC (permalink / raw)
To: Dave Chinner; +Cc: zlang, fstests, linux-xfs
On Wed, Feb 05, 2025 at 11:16:16AM +1100, Dave Chinner wrote:
> On Tue, Feb 04, 2025 at 01:25:26PM -0800, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> >
> > The next patch in this series fixes various issues with the recently
> > added fstests process isolation scheme by running each new process in a
> > separate process group session. Unfortunately, the processes in the
> > session are created with SIGINT ignored by default because they are not
> > attached to the controlling terminal. Therefore, switch the kill signal
> > to SIGPIPE because that is usually fatal and not masked by default.
> >
> > Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
> > ---
> > common/fuzzy | 13 ++++++-------
> > 1 file changed, 6 insertions(+), 7 deletions(-)
>
> Change looks fine, but _pkill is not yet defined. It is introduced
> in the next patch "common/rc: hoist pkill to a helper function"
> so this needs to be reordered.
Fixed.
> With that done, however:
>
> Reviewed-by: Dave Chinner <dchinner@redhat.com>
Thanks for spotting that, and for the review!
--D
> --
> Dave Chinner
> david@fromorbit.com
>
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 14/34] common: fix pkill by running test program in a separate session
2025-02-05 0:23 ` Dave Chinner
@ 2025-02-05 17:43 ` Darrick J. Wong
0 siblings, 0 replies; 68+ messages in thread
From: Darrick J. Wong @ 2025-02-05 17:43 UTC (permalink / raw)
To: Dave Chinner; +Cc: zlang, fstests, linux-xfs
On Wed, Feb 05, 2025 at 11:23:10AM +1100, Dave Chinner wrote:
> On Tue, Feb 04, 2025 at 01:25:57PM -0800, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> >
> > Run each test program with a separate session id so that we can tell
> > pkill to kill all processes of a given name, but only within our own
> > session id. This /should/ suffice to run multiple fstests on the same
> > machine without one instance shooting down processes of another
> > instance.
>
> I thought you were going to drop this because pidns stuff available.
> Also, because it is only check parallel that needs the pidns
> isolation, and I'm not doing that external to check. Hence we can
> just get rid of the 'pkill --parent' requirement because the
> concurrent tests are already being run in isolated PID namespaces...
>
> Regardless, if ppl still want to both pid session and pidns directly
> into check, the code is fine.
I'll make that clearer in the commit message.
> Just one little nit:
>
> > diff --git a/tools/run_seq_setsid b/tools/run_seq_setsid
> > new file mode 100755
> > index 00000000000000..5938f80e689255
> > --- /dev/null
> > +++ b/tools/run_seq_setsid
> > @@ -0,0 +1,22 @@
> > +#!/bin/bash
> > +
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Copyright (c) 2025 Oracle. All Rights Reserved.
> > +#
> > +# Try starting things in a new process session so that test processes have
> > +# something with which to filter only their own subprocesses.
> > +
> > +if [ -n "${FSTESTS_ISOL}" ]; then
> > + # Allow the test to become a target of the oom killer
> > + oom_knob="/proc/self/oom_score_adj"
> > + test -w "${oom_knob}" && echo 250 > "${oom_knob}"
> > +
> > + exec "$@"
> > +fi
> > +
> > +if [ -z "$1" ] || [ "$1" = "--help" ]; then
> > + echo "Usage: $0 command [args...]"
> > + exit 1
> > +fi
> > +
> > +FSTESTS_ISOL=setsid exec setsid "$0" "$@"
>
> The wrapper should be called 'run_setsid' because what check is
> using it for has nothing to do with what the script actually does.
Done.
> With that change:
>
> Reviewed-by: Dave Chinner <dchinner@redhat.com>
Thank you!
--D
> --
> Dave Chinner
> david@fromorbit.com
>
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 15/34] check: run tests in a private pid/mount namespace
2025-02-05 0:37 ` Dave Chinner
@ 2025-02-05 18:00 ` Darrick J. Wong
2025-02-05 18:19 ` Darrick J. Wong
2025-02-05 21:13 ` Dave Chinner
0 siblings, 2 replies; 68+ messages in thread
From: Darrick J. Wong @ 2025-02-05 18:00 UTC (permalink / raw)
To: Dave Chinner; +Cc: zlang, fstests, linux-xfs
On Wed, Feb 05, 2025 at 11:37:00AM +1100, Dave Chinner wrote:
> On Tue, Feb 04, 2025 at 01:26:13PM -0800, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> >
> > As mentioned in the previous patch, trying to isolate processes from
> > separate test instances through the use of distinct Unix process
> > sessions is annoying due to the many complications with signal handling.
> >
> > Instead, we could just use nsexec to run the test program with a private
> > pid namespace so that each test instance can only see its own processes;
> > and private mount namespace so that tests writing to /tmp cannot clobber
> > other tests or the stuff running on the main system.
> >
> > However, it's not guaranteed that a particular kernel has pid and mount
> > namespaces enabled. Mount (2.4.19) and pid (2.6.24) namespaces have
> > been around for a long time, but there's no hard requirement for the
> > latter to be enabled in the kernel. Therefore, this bugfix slips
> > namespace support in alongside the session id thing.
> >
> > Declaring CONFIG_PID_NS=n a deprecated configuration and removing
> > support should be a separate conversation, not something that I have to
> > do in a bug fix to get mainline QA back up.
> >
> > Cc: <fstests@vger.kernel.org> # v2024.12.08
> > Fixes: 8973af00ec212f ("fstests: cleanup fsstress process management")
> > Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
> > ---
> > check | 34 +++++++++++++++++++++++-----------
> > common/rc | 12 ++++++++++--
> > src/nsexec.c | 18 +++++++++++++++---
> > tests/generic/504 | 15 +++++++++++++--
> > tools/run_seq_pidns | 28 ++++++++++++++++++++++++++++
> > 5 files changed, 89 insertions(+), 18 deletions(-)
> > create mode 100755 tools/run_seq_pidns
>
> Same question as for session ids - is this all really necessary (or
> desired) if check-parallel executes check in it's own private PID
> namespace?
>
> If so, then the code is fine apart from the same nit about
> tools/run_seq_pidns - call it run_pidns because this helper will
> also be used by check-parallel to run check in it's own private
> mount and PID namespaces...
I prefer to name it tools/run_privatens since it creates more than just
a pid namespace. At some point we might even decide to privatize more
namespaces (e.g. do we want a private network namespace for nfs?) and I
don't want this to become lsfmmbpfbbq'd, as it were.
> > diff --git a/tests/generic/504 b/tests/generic/504
> > index 271c040e7b842a..96f18a0bbc7ba2 100755
> > --- a/tests/generic/504
> > +++ b/tests/generic/504
> > @@ -18,7 +18,7 @@ _cleanup()
> > {
> > exec {test_fd}<&-
> > cd /
> > - rm -f $tmp.*
> > + rm -r -f $tmp.*
> > }
> >
> > # Import common functions.
> > @@ -35,13 +35,24 @@ echo inode $tf_inode >> $seqres.full
> >
> > # Create new fd by exec
> > exec {test_fd}> $testfile
> > -# flock locks the fd then exits, we should see the lock info even the owner is dead
> > +# flock locks the fd then exits, we should see the lock info even the owner is
> > +# dead. If we're using pid namespace isolation we have to move /proc so that
> > +# we can access the /proc/locks from the init_pid_ns.
> > +if [ "$FSTESTS_ISOL" = "privatens" ]; then
> > + move_proc="$tmp.procdir"
> > + mkdir -p "$move_proc"
> > + mount --move /proc "$move_proc"
> > +fi
> > flock -x $test_fd
> > cat /proc/locks >> $seqres.full
> >
> > # Checking
> > grep -q ":$tf_inode " /proc/locks || echo "lock info not found"
> >
> > +if [ -n "$move_proc" ]; then
> > + mount --move "$move_proc" /proc
> > +fi
> > +
> > # success, all done
> > status=0
> > echo "Silence is golden"
>
> Urk. That explains the failure I've noticed but not had time to
> debug from check-parallel when using a private pidns. Do you know
> why /proc/locks in the overlaid mount does not show the locks taken
> from within that namespace? Is that a bug in the namespace/lock
> code?
I /think/ this happens because the code in fs/locks.c records the pid of
"flock -x $test_fd" as the owner of the lock. But then flock exits, so
that pid is no longer recorded in the pid_namespace and this code in
locks_translate_pid:
pid = find_pid_ns(fl->flc_pid, &init_pid_ns);
vnr = pid_nr_ns(pid, ns);
returns with vnr == 0, which causes locks_show to skip the lock.
However, the underlying /proc is associated with init_pid_ns, so
locks_translate_pid always returns a nonzero pid. Unfortunately, that
means we can't have tools/run_privatens unmount the /proc it inherits
before mounting the pidns-specific /proc.
I'll note this in the commit message.
> Regardless, the code looks ok so with the helper renamed:
>
> Reviewed-by: Dave Chinner <dchinner@redhat.com>
Thanks!
--D
> --
> Dave Chinner
> david@fromorbit.com
>
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 20/34] common/rc: return mount_ret in _try_scratch_mount
2025-02-05 0:42 ` Dave Chinner
@ 2025-02-05 18:03 ` Darrick J. Wong
0 siblings, 0 replies; 68+ messages in thread
From: Darrick J. Wong @ 2025-02-05 18:03 UTC (permalink / raw)
To: Dave Chinner; +Cc: zlang, fstests, linux-xfs
On Wed, Feb 05, 2025 at 11:42:56AM +1100, Dave Chinner wrote:
> On Tue, Feb 04, 2025 at 01:27:31PM -0800, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> >
> > Make the _try_scratch_mount and _test_mount helpers return the exit code
> > from mount, not _prepare_for_eio_shutdown.
> >
> > Cc: <fstests@vger.kernel.org> # v2024.12.08
> > Fixes: 1a49022fab9b4d ("fstests: always use fail-at-unmount semantics for XFS")
> > Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
>
> Ack, missed that. Though:
>
> > ---
> > common/rc | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> >
> > diff --git a/common/rc b/common/rc
> > index 03603a5198e3b6..56b4e7e018a8e0 100644
> > --- a/common/rc
> > +++ b/common/rc
> > @@ -440,6 +440,7 @@ _try_scratch_mount()
> > [ $mount_ret -ne 0 ] && return $mount_ret
> > _idmapped_mount $SCRATCH_DEV $SCRATCH_MNT
> > _prepare_for_eio_shutdown $SCRATCH_DEV
> > + return $mount_ret
>
> These could just be 'return 0' because we've already checked
> for $mount_ret being non-zero.
Will change.
> Regardless, it gives the same result, so:
>
> Reviewed-by: Dave Chinner <dchinner@redhat.com>
Thanks!
--D
> --
> Dave Chinner
> david@fromorbit.com
>
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 10/34] generic/759,760: skip test if we can't set up a hugepage for IO
2025-02-04 21:24 ` [PATCH 10/34] generic/759,760: skip test if we can't set up a hugepage for IO Darrick J. Wong
@ 2025-02-05 18:06 ` Darrick J. Wong
0 siblings, 0 replies; 68+ messages in thread
From: Darrick J. Wong @ 2025-02-05 18:06 UTC (permalink / raw)
To: zlang; +Cc: joannelkoong, fstests, linux-xfs
On Tue, Feb 04, 2025 at 01:24:54PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> On an arm64 VM with 64k base pages and a paltry 8G of RAM, this test
> will frequently fail like this:
>
> > QA output created by 759
> > fsx -N 10000 -l 500000 -h
> > -fsx -N 10000 -o 8192 -l 500000 -h
> > -fsx -N 10000 -o 128000 -l 500000 -h
> > +Seed set to 1
> > +madvise collapse for buf: Cannot allocate memory
> > +init_hugepages_buf failed for good_buf: Cannot allocate memory
>
> This system has a 512MB hugepage size, which means that there's a good
> chance that memory is so fragmented that we won't be able to create a
> huge page (in 1/16th the available DRAM). Create a _run_hugepage_fsx
> helper that will detect this situation at the start of the test and skip
> it, having refactored run_fsx into a properly namespaced version that
> won't exit the test on failure.
>
> Cc: <fstests@vger.kernel.org> # v2025.02.02
> Cc: joannelkoong@gmail.com
> Fixes: 627289232371e3 ("generic: add tests for read/writes from hugepages-backed buffers")
> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
> ---
> common/rc | 34 ++++++++++++++++++++++++++++++----
> ltp/fsx.c | 6 ++++--
> tests/generic/759 | 6 +++---
> tests/generic/760 | 6 +++---
> 4 files changed, 40 insertions(+), 12 deletions(-)
>
>
> diff --git a/common/rc b/common/rc
> index b7736173e6e839..4005db776309f3 100644
> --- a/common/rc
> +++ b/common/rc
> @@ -4982,20 +4982,46 @@ _require_hugepage_fsx()
> _notrun "fsx binary does not support MADV_COLLAPSE"
> }
>
> -run_fsx()
> +_run_fsx()
> {
> - echo fsx $@
> + echo "fsx $*"
> local args=`echo $@ | sed -e "s/ BSIZE / $bsize /g" -e "s/ PSIZE / $psize /g"`
> set -- $here/ltp/fsx $args $FSX_AVOID $TEST_DIR/junk
> echo "$@" >>$seqres.full
> rm -f $TEST_DIR/junk
> "$@" 2>&1 | tee -a $seqres.full >$tmp.fsx
> - if [ ${PIPESTATUS[0]} -ne 0 ]; then
> + local res=${PIPESTATUS[0]}
> + if [ $res -ne 0 ]; then
> cat $tmp.fsx
> rm -f $tmp.fsx
> - exit 1
> + return $res
> fi
> rm -f $tmp.fsx
> + return 0
> +}
> +
> +# Run fsx with -h(ugepage buffers). If we can't set up a hugepage then skip
> +# the test, but if any other error occurs then exit the test.
> +_run_hugepage_fsx() {
> + _run_fsx "$@" -h &> $tmp.hugepage_fsx
> + local res=$?
> + if [ $res -eq 103 ]; then
> + # According to the MADV_COLLAPSE manpage, these three errors
> + # can happen if the kernel could not collapse a collection of
> + # pages into a single huge page.
> + grep -q -E ' for hugebuf: (Cannot allocate memory|Device or resource busy|Resource temporarily unavailable)' $tmp.hugepage_fsx && \
> + _notrun "Could not set up huge page for test"
> + fi
> + cat $tmp.hugepage_fsx
> + rm -f $tmp.hugepage_fsx
> + test $res -ne 0 && exit 1
> + return 0
> +}
> +
> +# run fsx or exit the test
> +run_fsx()
> +{
> + _run_fsx || exit 1
This of course should read:
_run_fsx "$@" || exit 1
and will be corrected in the next published draft.
--D
> }
>
> _require_statx()
> diff --git a/ltp/fsx.c b/ltp/fsx.c
> index cf9502a74c17a7..d1b0f245582b31 100644
> --- a/ltp/fsx.c
> +++ b/ltp/fsx.c
> @@ -2974,13 +2974,15 @@ init_hugepages_buf(unsigned len, int hugepage_size, int alignment, long *buf_siz
>
> ret = posix_memalign(&buf, hugepage_size, size);
> if (ret) {
> - prterr("posix_memalign for buf");
> + /* common/rc greps this error message */
> + prterr("posix_memalign for hugebuf");
> return NULL;
> }
> memset(buf, '\0', size);
> ret = madvise(buf, size, MADV_COLLAPSE);
> if (ret) {
> - prterr("madvise collapse for buf");
> + /* common/rc greps this error message */
> + prterr("madvise collapse for hugebuf");
> free(buf);
> return NULL;
> }
> diff --git a/tests/generic/759 b/tests/generic/759
> index a7dec155056abc..49c02214559a55 100755
> --- a/tests/generic/759
> +++ b/tests/generic/759
> @@ -15,9 +15,9 @@ _require_test
> _require_thp
> _require_hugepage_fsx
>
> -run_fsx -N 10000 -l 500000 -h
> -run_fsx -N 10000 -o 8192 -l 500000 -h
> -run_fsx -N 10000 -o 128000 -l 500000 -h
> +_run_hugepage_fsx -N 10000 -l 500000
> +_run_hugepage_fsx -N 10000 -o 8192 -l 500000
> +_run_hugepage_fsx -N 10000 -o 128000 -l 500000
>
> status=0
> exit
> diff --git a/tests/generic/760 b/tests/generic/760
> index 4781a8d1eec4ec..f270636e56a377 100755
> --- a/tests/generic/760
> +++ b/tests/generic/760
> @@ -19,9 +19,9 @@ _require_hugepage_fsx
> psize=`$here/src/feature -s`
> bsize=`$here/src/min_dio_alignment $TEST_DIR $TEST_DEV`
>
> -run_fsx -N 10000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W -h
> -run_fsx -N 10000 -o 8192 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W -h
> -run_fsx -N 10000 -o 128000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W -h
> +_run_hugepage_fsx -N 10000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W
> +_run_hugepage_fsx -N 10000 -o 8192 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W
> +_run_hugepage_fsx -N 10000 -o 128000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W
>
> status=0
> exit
>
>
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 27/34] fuzzy: port fsx and fsstress loop to use --duration
2025-02-05 0:50 ` Dave Chinner
@ 2025-02-05 18:08 ` Darrick J. Wong
0 siblings, 0 replies; 68+ messages in thread
From: Darrick J. Wong @ 2025-02-05 18:08 UTC (permalink / raw)
To: Dave Chinner; +Cc: zlang, fstests, linux-xfs
On Wed, Feb 05, 2025 at 11:50:52AM +1100, Dave Chinner wrote:
> On Tue, Feb 04, 2025 at 01:29:20PM -0800, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> >
> > Quite a while ago, I added --duration= arguments to fsx and fsstress,
> > and apparently I forgot to update the scrub stress loops to use them.
> > Replace the usage of timeout(1) for the remount_period versions of the
> > loop to clean up that code; and convert the non-remount loop so that
> > they don't run over time.
>
> ....
>
> > @@ -1115,7 +1124,8 @@ __stress_scrub_fsstress_loop() {
> > # anything.
> > test "$mode" = "rw" && __stress_scrub_clean_scratch && continue
> >
> > - _run_fsstress_bg $args $rw_arg >> $seqres.full
> > + duration=$(___stress_scrub_duration "$end" "$remount_period")
> > + _run_fsstress_bg $duration $args $rw_arg >> $seqres.full
> > sleep $remount_period
> > _kill_fsstress
>
> Why does this need to run fsstress in the background any more? If it
> is only going to run for $remount_period, then run it in the
> foreground and get rid of the sleep/kill that stopped it after
> $remount_period. i.e. doesn't this:
>
> - _run_fsstress_bg $args $rw_arg >> $seqres.full
> + duration=$(___stress_scrub_duration "$end" "$remount_period")
> + _run_fsstress $duration $args $rw_arg >> $seqres.full
> - sleep $remount_period
> - _kill_fsstress
>
> do the same thing, only cleaner?
It does, I was probably just moving too fast with the mechanical changes
when I wrote this. Will fix for the next draft.
--D
>
> -Dave.
> --
> Dave Chinner
> david@fromorbit.com
>
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 15/34] check: run tests in a private pid/mount namespace
2025-02-05 18:00 ` Darrick J. Wong
@ 2025-02-05 18:19 ` Darrick J. Wong
2025-02-05 21:15 ` Dave Chinner
2025-02-05 21:13 ` Dave Chinner
1 sibling, 1 reply; 68+ messages in thread
From: Darrick J. Wong @ 2025-02-05 18:19 UTC (permalink / raw)
To: Dave Chinner; +Cc: zlang, fstests, linux-xfs
On Wed, Feb 05, 2025 at 10:00:48AM -0800, Darrick J. Wong wrote:
> On Wed, Feb 05, 2025 at 11:37:00AM +1100, Dave Chinner wrote:
> > On Tue, Feb 04, 2025 at 01:26:13PM -0800, Darrick J. Wong wrote:
> > > From: Darrick J. Wong <djwong@kernel.org>
> > >
> > > As mentioned in the previous patch, trying to isolate processes from
> > > separate test instances through the use of distinct Unix process
> > > sessions is annoying due to the many complications with signal handling.
> > >
> > > Instead, we could just use nsexec to run the test program with a private
> > > pid namespace so that each test instance can only see its own processes;
> > > and private mount namespace so that tests writing to /tmp cannot clobber
> > > other tests or the stuff running on the main system.
> > >
> > > However, it's not guaranteed that a particular kernel has pid and mount
> > > namespaces enabled. Mount (2.4.19) and pid (2.6.24) namespaces have
> > > been around for a long time, but there's no hard requirement for the
> > > latter to be enabled in the kernel. Therefore, this bugfix slips
> > > namespace support in alongside the session id thing.
> > >
> > > Declaring CONFIG_PID_NS=n a deprecated configuration and removing
> > > support should be a separate conversation, not something that I have to
> > > do in a bug fix to get mainline QA back up.
> > >
> > > Cc: <fstests@vger.kernel.org> # v2024.12.08
> > > Fixes: 8973af00ec212f ("fstests: cleanup fsstress process management")
> > > Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
> > > ---
> > > check | 34 +++++++++++++++++++++++-----------
> > > common/rc | 12 ++++++++++--
> > > src/nsexec.c | 18 +++++++++++++++---
> > > tests/generic/504 | 15 +++++++++++++--
> > > tools/run_seq_pidns | 28 ++++++++++++++++++++++++++++
> > > 5 files changed, 89 insertions(+), 18 deletions(-)
> > > create mode 100755 tools/run_seq_pidns
> >
> > Same question as for session ids - is this all really necessary (or
> > desired) if check-parallel executes check in it's own private PID
> > namespace?
Forgot to respond to this question --
Because check-parallel runs (will run?) each child ./check instance in a
private namepsace, each of those instances will be isolated from the
others. So no, it's probably not absolutely necessary.
However, there are a couple of reasons to let it happen: (a) the private
ns that ./check creates in _run_seq() isolates the actual test code from
its parent ./check process; and (b) the process started by nsexec is
considered to be the "init" process for that namespace, so when it dies,
the kernel will kill -9 all other processes in that namespace, so we
won't have any stray fsstress processes that bleed into the next test.
--D
> > If so, then the code is fine apart from the same nit about
> > tools/run_seq_pidns - call it run_pidns because this helper will
> > also be used by check-parallel to run check in it's own private
> > mount and PID namespaces...
>
> I prefer to name it tools/run_privatens since it creates more than just
> a pid namespace. At some point we might even decide to privatize more
> namespaces (e.g. do we want a private network namespace for nfs?) and I
> don't want this to become lsfmmbpfbbq'd, as it were.
>
> > > diff --git a/tests/generic/504 b/tests/generic/504
> > > index 271c040e7b842a..96f18a0bbc7ba2 100755
> > > --- a/tests/generic/504
> > > +++ b/tests/generic/504
> > > @@ -18,7 +18,7 @@ _cleanup()
> > > {
> > > exec {test_fd}<&-
> > > cd /
> > > - rm -f $tmp.*
> > > + rm -r -f $tmp.*
> > > }
> > >
> > > # Import common functions.
> > > @@ -35,13 +35,24 @@ echo inode $tf_inode >> $seqres.full
> > >
> > > # Create new fd by exec
> > > exec {test_fd}> $testfile
> > > -# flock locks the fd then exits, we should see the lock info even the owner is dead
> > > +# flock locks the fd then exits, we should see the lock info even the owner is
> > > +# dead. If we're using pid namespace isolation we have to move /proc so that
> > > +# we can access the /proc/locks from the init_pid_ns.
> > > +if [ "$FSTESTS_ISOL" = "privatens" ]; then
> > > + move_proc="$tmp.procdir"
> > > + mkdir -p "$move_proc"
> > > + mount --move /proc "$move_proc"
> > > +fi
> > > flock -x $test_fd
> > > cat /proc/locks >> $seqres.full
> > >
> > > # Checking
> > > grep -q ":$tf_inode " /proc/locks || echo "lock info not found"
> > >
> > > +if [ -n "$move_proc" ]; then
> > > + mount --move "$move_proc" /proc
> > > +fi
> > > +
> > > # success, all done
> > > status=0
> > > echo "Silence is golden"
> >
> > Urk. That explains the failure I've noticed but not had time to
> > debug from check-parallel when using a private pidns. Do you know
> > why /proc/locks in the overlaid mount does not show the locks taken
> > from within that namespace? Is that a bug in the namespace/lock
> > code?
>
> I /think/ this happens because the code in fs/locks.c records the pid of
> "flock -x $test_fd" as the owner of the lock. But then flock exits, so
> that pid is no longer recorded in the pid_namespace and this code in
> locks_translate_pid:
>
> pid = find_pid_ns(fl->flc_pid, &init_pid_ns);
> vnr = pid_nr_ns(pid, ns);
>
> returns with vnr == 0, which causes locks_show to skip the lock.
> However, the underlying /proc is associated with init_pid_ns, so
> locks_translate_pid always returns a nonzero pid. Unfortunately, that
> means we can't have tools/run_privatens unmount the /proc it inherits
> before mounting the pidns-specific /proc.
>
> I'll note this in the commit message.
>
> > Regardless, the code looks ok so with the helper renamed:
> >
> > Reviewed-by: Dave Chinner <dchinner@redhat.com>
>
> Thanks!
>
> --D
>
> > --
> > Dave Chinner
> > david@fromorbit.com
> >
>
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 15/34] check: run tests in a private pid/mount namespace
2025-02-05 18:00 ` Darrick J. Wong
2025-02-05 18:19 ` Darrick J. Wong
@ 2025-02-05 21:13 ` Dave Chinner
1 sibling, 0 replies; 68+ messages in thread
From: Dave Chinner @ 2025-02-05 21:13 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: zlang, fstests, linux-xfs
On Wed, Feb 05, 2025 at 10:00:48AM -0800, Darrick J. Wong wrote:
> On Wed, Feb 05, 2025 at 11:37:00AM +1100, Dave Chinner wrote:
> > On Tue, Feb 04, 2025 at 01:26:13PM -0800, Darrick J. Wong wrote:
> > > From: Darrick J. Wong <djwong@kernel.org>
> > >
> > > As mentioned in the previous patch, trying to isolate processes from
> > > separate test instances through the use of distinct Unix process
> > > sessions is annoying due to the many complications with signal handling.
> > >
> > > Instead, we could just use nsexec to run the test program with a private
> > > pid namespace so that each test instance can only see its own processes;
> > > and private mount namespace so that tests writing to /tmp cannot clobber
> > > other tests or the stuff running on the main system.
> > >
> > > However, it's not guaranteed that a particular kernel has pid and mount
> > > namespaces enabled. Mount (2.4.19) and pid (2.6.24) namespaces have
> > > been around for a long time, but there's no hard requirement for the
> > > latter to be enabled in the kernel. Therefore, this bugfix slips
> > > namespace support in alongside the session id thing.
> > >
> > > Declaring CONFIG_PID_NS=n a deprecated configuration and removing
> > > support should be a separate conversation, not something that I have to
> > > do in a bug fix to get mainline QA back up.
> > >
> > > Cc: <fstests@vger.kernel.org> # v2024.12.08
> > > Fixes: 8973af00ec212f ("fstests: cleanup fsstress process management")
> > > Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
> > > ---
> > > check | 34 +++++++++++++++++++++++-----------
> > > common/rc | 12 ++++++++++--
> > > src/nsexec.c | 18 +++++++++++++++---
> > > tests/generic/504 | 15 +++++++++++++--
> > > tools/run_seq_pidns | 28 ++++++++++++++++++++++++++++
> > > 5 files changed, 89 insertions(+), 18 deletions(-)
> > > create mode 100755 tools/run_seq_pidns
> >
> > Same question as for session ids - is this all really necessary (or
> > desired) if check-parallel executes check in it's own private PID
> > namespace?
> >
> > If so, then the code is fine apart from the same nit about
> > tools/run_seq_pidns - call it run_pidns because this helper will
> > also be used by check-parallel to run check in it's own private
> > mount and PID namespaces...
>
> I prefer to name it tools/run_privatens since it creates more than just
> a pid namespace.
I'm fine with that. It was only the "seq" part of the name that
triggered me.
> At some point we might even decide to privatize more
> namespaces (e.g. do we want a private network namespace for nfs?) and I
> don't want this to become lsfmmbpfbbq'd, as it were.
*nod*
-Dave.
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 15/34] check: run tests in a private pid/mount namespace
2025-02-05 18:19 ` Darrick J. Wong
@ 2025-02-05 21:15 ` Dave Chinner
2025-02-05 21:25 ` Darrick J. Wong
0 siblings, 1 reply; 68+ messages in thread
From: Dave Chinner @ 2025-02-05 21:15 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: zlang, fstests, linux-xfs
On Wed, Feb 05, 2025 at 10:19:59AM -0800, Darrick J. Wong wrote:
> On Wed, Feb 05, 2025 at 10:00:48AM -0800, Darrick J. Wong wrote:
> > On Wed, Feb 05, 2025 at 11:37:00AM +1100, Dave Chinner wrote:
> > > On Tue, Feb 04, 2025 at 01:26:13PM -0800, Darrick J. Wong wrote:
> > > > From: Darrick J. Wong <djwong@kernel.org>
> > > >
> > > > As mentioned in the previous patch, trying to isolate processes from
> > > > separate test instances through the use of distinct Unix process
> > > > sessions is annoying due to the many complications with signal handling.
> > > >
> > > > Instead, we could just use nsexec to run the test program with a private
> > > > pid namespace so that each test instance can only see its own processes;
> > > > and private mount namespace so that tests writing to /tmp cannot clobber
> > > > other tests or the stuff running on the main system.
> > > >
> > > > However, it's not guaranteed that a particular kernel has pid and mount
> > > > namespaces enabled. Mount (2.4.19) and pid (2.6.24) namespaces have
> > > > been around for a long time, but there's no hard requirement for the
> > > > latter to be enabled in the kernel. Therefore, this bugfix slips
> > > > namespace support in alongside the session id thing.
> > > >
> > > > Declaring CONFIG_PID_NS=n a deprecated configuration and removing
> > > > support should be a separate conversation, not something that I have to
> > > > do in a bug fix to get mainline QA back up.
> > > >
> > > > Cc: <fstests@vger.kernel.org> # v2024.12.08
> > > > Fixes: 8973af00ec212f ("fstests: cleanup fsstress process management")
> > > > Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
> > > > ---
> > > > check | 34 +++++++++++++++++++++++-----------
> > > > common/rc | 12 ++++++++++--
> > > > src/nsexec.c | 18 +++++++++++++++---
> > > > tests/generic/504 | 15 +++++++++++++--
> > > > tools/run_seq_pidns | 28 ++++++++++++++++++++++++++++
> > > > 5 files changed, 89 insertions(+), 18 deletions(-)
> > > > create mode 100755 tools/run_seq_pidns
> > >
> > > Same question as for session ids - is this all really necessary (or
> > > desired) if check-parallel executes check in it's own private PID
> > > namespace?
>
> Forgot to respond to this question --
>
> Because check-parallel runs (will run?) each child ./check instance in a
> private namepsace, each of those instances will be isolated from the
> others. So no, it's probably not absolutely necessary.
>
> However, there are a couple of reasons to let it happen: (a) the private
> ns that ./check creates in _run_seq() isolates the actual test code from
> its parent ./check process; and (b) the process started by nsexec is
> considered to be the "init" process for that namespace, so when it dies,
> the kernel will kill -9 all other processes in that namespace, so we
> won't have any stray fsstress processes that bleed into the next test.
Ok - that might be worth adding to the commit message so that anyone
looking at it in a few months time doesn't need to remember this
detail.
-Dave.
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 15/34] check: run tests in a private pid/mount namespace
2025-02-05 21:15 ` Dave Chinner
@ 2025-02-05 21:25 ` Darrick J. Wong
0 siblings, 0 replies; 68+ messages in thread
From: Darrick J. Wong @ 2025-02-05 21:25 UTC (permalink / raw)
To: Dave Chinner; +Cc: zlang, fstests, linux-xfs
On Thu, Feb 06, 2025 at 08:15:03AM +1100, Dave Chinner wrote:
> On Wed, Feb 05, 2025 at 10:19:59AM -0800, Darrick J. Wong wrote:
> > On Wed, Feb 05, 2025 at 10:00:48AM -0800, Darrick J. Wong wrote:
> > > On Wed, Feb 05, 2025 at 11:37:00AM +1100, Dave Chinner wrote:
> > > > On Tue, Feb 04, 2025 at 01:26:13PM -0800, Darrick J. Wong wrote:
> > > > > From: Darrick J. Wong <djwong@kernel.org>
> > > > >
> > > > > As mentioned in the previous patch, trying to isolate processes from
> > > > > separate test instances through the use of distinct Unix process
> > > > > sessions is annoying due to the many complications with signal handling.
> > > > >
> > > > > Instead, we could just use nsexec to run the test program with a private
> > > > > pid namespace so that each test instance can only see its own processes;
> > > > > and private mount namespace so that tests writing to /tmp cannot clobber
> > > > > other tests or the stuff running on the main system.
> > > > >
> > > > > However, it's not guaranteed that a particular kernel has pid and mount
> > > > > namespaces enabled. Mount (2.4.19) and pid (2.6.24) namespaces have
> > > > > been around for a long time, but there's no hard requirement for the
> > > > > latter to be enabled in the kernel. Therefore, this bugfix slips
> > > > > namespace support in alongside the session id thing.
> > > > >
> > > > > Declaring CONFIG_PID_NS=n a deprecated configuration and removing
> > > > > support should be a separate conversation, not something that I have to
> > > > > do in a bug fix to get mainline QA back up.
> > > > >
> > > > > Cc: <fstests@vger.kernel.org> # v2024.12.08
> > > > > Fixes: 8973af00ec212f ("fstests: cleanup fsstress process management")
> > > > > Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
> > > > > ---
> > > > > check | 34 +++++++++++++++++++++++-----------
> > > > > common/rc | 12 ++++++++++--
> > > > > src/nsexec.c | 18 +++++++++++++++---
> > > > > tests/generic/504 | 15 +++++++++++++--
> > > > > tools/run_seq_pidns | 28 ++++++++++++++++++++++++++++
> > > > > 5 files changed, 89 insertions(+), 18 deletions(-)
> > > > > create mode 100755 tools/run_seq_pidns
> > > >
> > > > Same question as for session ids - is this all really necessary (or
> > > > desired) if check-parallel executes check in it's own private PID
> > > > namespace?
> >
> > Forgot to respond to this question --
> >
> > Because check-parallel runs (will run?) each child ./check instance in a
> > private namepsace, each of those instances will be isolated from the
> > others. So no, it's probably not absolutely necessary.
> >
> > However, there are a couple of reasons to let it happen: (a) the private
> > ns that ./check creates in _run_seq() isolates the actual test code from
> > its parent ./check process; and (b) the process started by nsexec is
> > considered to be the "init" process for that namespace, so when it dies,
> > the kernel will kill -9 all other processes in that namespace, so we
> > won't have any stray fsstress processes that bleed into the next test.
>
> Ok - that might be worth adding to the commit message so that anyone
> looking at it in a few months time doesn't need to remember this
> detail.
It /does/ say that in _run_seq() but yeah I'll add it to the commit
message too:
"Instead, we could just use nsexec to run the test program with a
private pid namespace so that each test instance can only see its own
processes; and private mount namespace so that tests writing to /tmp
cannot clobber other tests or the stuff running on the main system.
Further, the process created by the clone(CLONE_NEWPID) call is
considered the init process of that pid namespace, so all processes will
be SIGKILL'd when the init process terminates, so we no longer need
systemd scopes for externally enforced cleanup."
--D
>
> -Dave.
> --
> Dave Chinner
> david@fromorbit.com
>
^ permalink raw reply [flat|nested] 68+ messages in thread
* [PATCH 30/34] fsx: fix leaked log file pointer
2025-02-12 3:30 [PATCHSET v3] " Darrick J. Wong
@ 2025-02-12 3:38 ` Darrick J. Wong
0 siblings, 0 replies; 68+ messages in thread
From: Darrick J. Wong @ 2025-02-12 3:38 UTC (permalink / raw)
To: djwong, zlang; +Cc: dchinner, fstests, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Fix a resource leaks in fsx, where we fail to close the fsx logfile,
because the C library could have some buffered contents that aren't
flushed when the program terminates. glibc seems to do this for us, but
I wouldn't be so sure about the others.
Fixes: 3f742550dfed84 ("fsx: add support for recording operations to a file")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
---
ltp/fsx.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/ltp/fsx.c b/ltp/fsx.c
index d1b0f245582b31..163b9453b5418b 100644
--- a/ltp/fsx.c
+++ b/ltp/fsx.c
@@ -3489,6 +3489,7 @@ main(int argc, char **argv)
if (recordops)
logdump();
+ fclose(fsxlogf);
exit(0);
return 0;
}
^ permalink raw reply related [flat|nested] 68+ messages in thread
end of thread, other threads:[~2025-02-12 3:38 UTC | newest]
Thread overview: 68+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-02-04 21:22 [PATCHSET v2] fstests: random fixes for v2025.02.02 Darrick J. Wong
2025-02-04 21:22 ` [PATCH 01/34] generic/476: fix fsstress process management Darrick J. Wong
2025-02-04 21:22 ` [PATCH 02/34] metadump: make non-local function variables more obvious Darrick J. Wong
2025-02-04 21:23 ` [PATCH 03/34] metadump: fix cleanup for v1 metadump testing Darrick J. Wong
2025-02-04 21:23 ` [PATCH 04/34] generic/019: don't fail if fio crashes while shutting down Darrick J. Wong
2025-02-04 21:23 ` [PATCH 05/34] fuzzy: do not set _FSSTRESS_PID when exercising fsx Darrick J. Wong
2025-02-04 21:23 ` [PATCH 06/34] common/rc: revert recursive unmount in _clear_mount_stack Darrick J. Wong
2025-02-05 0:07 ` Dave Chinner
2025-02-04 21:24 ` [PATCH 07/34] common/dump: don't replace pids arbitrarily Darrick J. Wong
2025-02-05 0:09 ` Dave Chinner
2025-02-04 21:24 ` [PATCH 08/34] common/populate: correct the parent pointer name creation formulae Darrick J. Wong
2025-02-04 21:24 ` [PATCH 09/34] generic/759,760: fix MADV_COLLAPSE detection and inclusion Darrick J. Wong
2025-02-04 21:24 ` [PATCH 10/34] generic/759,760: skip test if we can't set up a hugepage for IO Darrick J. Wong
2025-02-05 18:06 ` Darrick J. Wong
2025-02-04 21:25 ` [PATCH 11/34] common/rc: create a wrapper for the su command Darrick J. Wong
2025-02-05 0:14 ` Dave Chinner
2025-02-04 21:25 ` [PATCH 12/34] fuzzy: kill subprocesses with SIGPIPE, not SIGINT Darrick J. Wong
2025-02-05 0:16 ` Dave Chinner
2025-02-05 17:38 ` Darrick J. Wong
2025-02-04 21:25 ` [PATCH 13/34] common/rc: hoist pkill to a helper function Darrick J. Wong
2025-02-05 0:17 ` Dave Chinner
2025-02-04 21:25 ` [PATCH 14/34] common: fix pkill by running test program in a separate session Darrick J. Wong
2025-02-05 0:23 ` Dave Chinner
2025-02-05 17:43 ` Darrick J. Wong
2025-02-04 21:26 ` [PATCH 15/34] check: run tests in a private pid/mount namespace Darrick J. Wong
2025-02-05 0:37 ` Dave Chinner
2025-02-05 18:00 ` Darrick J. Wong
2025-02-05 18:19 ` Darrick J. Wong
2025-02-05 21:15 ` Dave Chinner
2025-02-05 21:25 ` Darrick J. Wong
2025-02-05 21:13 ` Dave Chinner
2025-02-04 21:26 ` [PATCH 16/34] check: deprecate using process sessions to isolate test instances Darrick J. Wong
2025-02-05 0:38 ` Dave Chinner
2025-02-04 21:26 ` [PATCH 17/34] common/rc: don't copy fsstress to $TEST_DIR Darrick J. Wong
2025-02-04 21:27 ` [PATCH 18/34] unmount: resume logging of stdout and stderr for filtering Darrick J. Wong
2025-02-04 21:27 ` [PATCH 19/34] mkfs: don't hardcode log size Darrick J. Wong
2025-02-05 0:40 ` Dave Chinner
2025-02-04 21:27 ` [PATCH 20/34] common/rc: return mount_ret in _try_scratch_mount Darrick J. Wong
2025-02-05 0:42 ` Dave Chinner
2025-02-05 18:03 ` Darrick J. Wong
2025-02-04 21:27 ` [PATCH 21/34] preamble: fix missing _kill_fsstress Darrick J. Wong
2025-02-04 21:28 ` [PATCH 22/34] generic/650: revert SOAK DURATION changes Darrick J. Wong
2025-02-05 0:43 ` Dave Chinner
2025-02-04 21:28 ` [PATCH 23/34] generic/032: fix pinned mount failure Darrick J. Wong
2025-02-04 21:28 ` [PATCH 24/34] fuzzy: stop __stress_scrub_fsx_loop if fsx fails Darrick J. Wong
2025-02-05 0:44 ` Dave Chinner
2025-02-04 21:28 ` [PATCH 25/34] fuzzy: don't use readarray for xfsfind output Darrick J. Wong
2025-02-04 21:29 ` [PATCH 26/34] fuzzy: always stop the scrub fsstress loop on error Darrick J. Wong
2025-02-05 0:45 ` Dave Chinner
2025-02-04 21:29 ` [PATCH 27/34] fuzzy: port fsx and fsstress loop to use --duration Darrick J. Wong
2025-02-05 0:50 ` Dave Chinner
2025-02-05 18:08 ` Darrick J. Wong
2025-02-04 21:29 ` [PATCH 28/34] fix _require_scratch_duperemove ordering Darrick J. Wong
2025-02-05 0:51 ` Dave Chinner
2025-02-04 21:29 ` [PATCH 29/34] fsstress: fix a memory leak Darrick J. Wong
2025-02-05 0:54 ` Dave Chinner
2025-02-04 21:30 ` [PATCH 30/34] fsx: fix leaked log file pointer Darrick J. Wong
2025-02-05 0:57 ` Dave Chinner
2025-02-04 21:30 ` [PATCH 31/34] misc: don't put nr_cpus into the fsstress -n argument Darrick J. Wong
2025-02-05 1:00 ` Dave Chinner
2025-02-04 21:30 ` [PATCH 32/34] common/config: add $here to FSSTRESS_PROG Darrick J. Wong
2025-02-05 1:00 ` Dave Chinner
2025-02-04 21:30 ` [PATCH 33/34] config: add FSX_PROG variable Darrick J. Wong
2025-02-05 1:04 ` Dave Chinner
2025-02-04 21:31 ` [PATCH 34/34] build: initialize stack variables to zero by default Darrick J. Wong
2025-02-05 1:05 ` Dave Chinner
2025-02-05 12:46 ` [PATCHSET v2] fstests: random fixes for v2025.02.02 Amir Goldstein
-- strict thread matches above, loose matches on Subject: below --
2025-02-12 3:30 [PATCHSET v3] " Darrick J. Wong
2025-02-12 3:38 ` [PATCH 30/34] fsx: fix leaked log file pointer Darrick J. Wong
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox