From: Amir Goldstein <amir73il@gmail.com>
To: Zorro Lang <zlang@redhat.com>
Cc: "Darrick J . Wong" <darrick.wong@oracle.com>,
Dave Chinner <david@fromorbit.com>,
fstests@vger.kernel.org
Subject: [PATCH v2 3/3] xfs/{422,517}: kill background jobs on test termination
Date: Tue, 21 Jun 2022 20:37:29 +0300 [thread overview]
Message-ID: <20220621173729.2135249-4-amir73il@gmail.com> (raw)
In-Reply-To: <20220621173729.2135249-1-amir73il@gmail.com>
Those tests failed to cleanup background jobs properly after test
is interrupted and even sometimes when it completed succefully.
xfs/517 would sometime fails randomally with this false positive error:
QA output created by 517
Format and populate
Concurrent fsmap and freeze
+Terminated
Test done
The tests have several background sub-shells that spawn short lived
programs in a loop. By killing the spawned programs using killall,
killall could find no process to kill and the sub-shell loop could still
spawn another process that is not going to be killed and in the worst
case, the freeze_loop() could spawn the xfs_io "freeze" command after
test has thawn the fs before exit, which leaves the fs frozen after the
test.
The "Terminated" output is emitted by the sub-shell when killing the
programs that it has spawned when the loop did not finish before test
timeout. By killing the sub-shell and not the spawned programs, we
avoid the false positive "Terminated" error.
Use a helper to perform this cleanup dance:
First kill and wait the freeze_loop so it won't try to freeze fs again
Then make sure fs is not frozen.
Then kill and wait for the rest of the sub-shells, because
if fs is frozen a killed writer process will never exit.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
tests/xfs/422 | 32 ++++++++++++++++++++++++++++++--
tests/xfs/517 | 26 +++++++++++++++++++++++---
2 files changed, 53 insertions(+), 5 deletions(-)
diff --git a/tests/xfs/422 b/tests/xfs/422
index a83a66df..fdbb8bf1 100755
--- a/tests/xfs/422
+++ b/tests/xfs/422
@@ -13,6 +13,32 @@ _begin_fstest dangerous_scrub dangerous_online_repair freeze
_register_cleanup "_cleanup" BUS
+# First kill and wait the freeze loop so it won't try to freeze fs again
+# Then make sure fs is not frozen
+# Then kill and wait for the rest of the workers
+# Because if fs is frozen a killed writer will never exit
+kill_loops() {
+ local sig=$1
+
+ [ -n "$freeze_pid" ] && kill $sig $freeze_pid
+ wait $freeze_pid
+ unset freeze_pid
+ $XFS_IO_PROG -x -c 'thaw' $SCRATCH_MNT
+ [ -n "$stress_pid" ] && kill $sig $stress_pid
+ [ -n "$repair_pid" ] && kill $sig $repair_pid
+ wait
+ unset stress_pid
+ unset repair_pid
+}
+
+# Override the default cleanup function.
+_cleanup()
+{
+ kill_loops -9 > /dev/null 2>&1
+ cd /
+ rm -rf $tmp.*
+}
+
# Import common functions.
. ./common/filter
. ./common/fuzzy
@@ -78,8 +104,11 @@ end=$((start + (30 * TIME_FACTOR) ))
echo "Loop started at $(date --date="@${start}"), ending at $(date --date="@${end}")" >> $seqres.full
stress_loop $end &
+stress_pid=$!
freeze_loop $end &
+freeze_pid=$!
repair_loop $end &
+repair_pid=$!
# Wait until 2 seconds after the loops should have finished...
while [ "$(date +%s)" -lt $((end + 2)) ]; do
@@ -87,8 +116,7 @@ while [ "$(date +%s)" -lt $((end + 2)) ]; do
done
# ...and clean up after the loops in case they didn't do it themselves.
-$KILLALL_PROG -TERM xfs_io fsstress >> $seqres.full 2>&1
-$XFS_IO_PROG -x -c 'thaw' $SCRATCH_MNT >> $seqres.full 2>&1
+kill_loops >> $seqres.full 2>&1
echo "Loop finished at $(date)" >> $seqres.full
echo "Test done"
diff --git a/tests/xfs/517 b/tests/xfs/517
index f7f9a8a2..6877af13 100755
--- a/tests/xfs/517
+++ b/tests/xfs/517
@@ -11,11 +11,29 @@ _begin_fstest auto quick fsmap freeze
_register_cleanup "_cleanup" BUS
+# First kill and wait the freeze loop so it won't try to freeze fs again
+# Then make sure fs is not frozen
+# Then kill and wait for the rest of the workers
+# Because if fs is frozen a killed writer will never exit
+kill_loops() {
+ local sig=$1
+
+ [ -n "$freeze_pid" ] && kill $sig $freeze_pid
+ wait $freeze_pid
+ unset freeze_pid
+ $XFS_IO_PROG -x -c 'thaw' $SCRATCH_MNT
+ [ -n "$stress_pid" ] && kill $sig $stress_pid
+ [ -n "$fsmap_pid" ] && kill $sig $fsmap_pid
+ wait
+ unset stress_pid
+ unset fsmap_pid
+}
+
# Override the default cleanup function.
_cleanup()
{
+ kill_loops -9 > /dev/null 2>&1
cd /
- $XFS_IO_PROG -x -c 'thaw' $SCRATCH_MNT > /dev/null 2>&1
rm -rf $tmp.*
}
@@ -83,8 +101,11 @@ end=$((start + (30 * TIME_FACTOR) ))
echo "Loop started at $(date --date="@${start}"), ending at $(date --date="@${end}")" >> $seqres.full
stress_loop $end &
+stress_pid=$!
freeze_loop $end &
+freeze_pid=$!
fsmap_loop $end &
+fsmap_pid=$!
# Wait until 2 seconds after the loops should have finished...
while [ "$(date +%s)" -lt $((end + 2)) ]; do
@@ -92,8 +113,7 @@ while [ "$(date +%s)" -lt $((end + 2)) ]; do
done
# ...and clean up after the loops in case they didn't do it themselves.
-$KILLALL_PROG -TERM xfs_io fsstress >> $seqres.full 2>&1
-$XFS_IO_PROG -x -c 'thaw' $SCRATCH_MNT >> $seqres.full 2>&1
+kill_loops >> $seqres.full 2>&1
echo "Loop finished at $(date)" >> $seqres.full
echo "Test done"
--
2.25.1
prev parent reply other threads:[~2022-06-21 17:37 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-06-21 17:37 [PATCH v2 0/3] aborted fstests may leave frozen fs behind Amir Goldstein
2022-06-21 17:37 ` [PATCH v2 1/3] fstests: add missing _require_freeze() to tests Amir Goldstein
2022-06-24 15:26 ` Zorro Lang
2022-06-21 17:37 ` [PATCH v2 2/3] fstests: unfreeze fs on cleanup routines Amir Goldstein
2022-06-24 16:06 ` Zorro Lang
2022-06-21 17:37 ` Amir Goldstein [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20220621173729.2135249-4-amir73il@gmail.com \
--to=amir73il@gmail.com \
--cc=darrick.wong@oracle.com \
--cc=david@fromorbit.com \
--cc=fstests@vger.kernel.org \
--cc=zlang@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox