Re: [PATCH 16/40] fstests: use udevadm wait in preference to settle

public inbox for fstests@vger.kernel.org
 help / color / mirror / Atom feed

From: "Darrick J. Wong" <djwong@kernel.org>
To: Dave Chinner <david@fromorbit.com>
Cc: fstests@vger.kernel.org
Subject: Re: [PATCH 16/40] fstests: use udevadm wait in preference to settle
Date: Fri, 29 Nov 2024 09:10:09 -0800	[thread overview]
Message-ID: <20241129171009.GF9425@frogsfrogsfrogs> (raw)
In-Reply-To: <20241127045403.3665299-17-david@fromorbit.com>

On Wed, Nov 27, 2024 at 03:51:46PM +1100, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> When running lots of tests in parallel, there are lots of
> filesystems and block devices changing state. This generates a lot
> of udev events when means the udev event queue is rarely empty.
> Unfortunately, an empty event queue is what udev settling waits
> upon. Hence calling UDEV_SETTLE_PROG can mean waiting for a lot of
> time for other tests to stop generating udev events.
> 
> For the majority of cases, what we care about is that udev has
> performed device node addition or removal, not that there are no
> udev events pending. Recent(-ish) systemd releases support 'udevadm
> wait' to wait for a specific file to be created or unlinked rather
> than waiting for the event that does that work to be completed.
> 
> Hence we don't have to wait for the udev event queue to empty,
> just for the udev event that does the device node manipulation to
> complete.
> 
> Introduce detection of 'udevadm wait' support and a _udev_wait()
> wrapper function to use it if it is available. If it isn't, the use
> the existing UDEV_SETTLE_PROG behaviour.
> 
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---
>  common/config     | 35 +++++++++++++++++++++++++----------
>  common/rc         | 25 ++++++++++++++++---------
>  tests/btrfs/291   |  5 +++--
>  tests/generic/081 |  6 +++---
>  tests/generic/108 |  7 +++----
>  tests/generic/459 |  6 +++---
>  6 files changed, 53 insertions(+), 31 deletions(-)
> 
> diff --git a/common/config b/common/config
> index fcff0660b..41b8f29d1 100644
> --- a/common/config
> +++ b/common/config
> @@ -165,7 +165,7 @@ export XFS_MDRESTORE_PROG="$(type -P xfs_mdrestore)"
>  export XFS_ADMIN_PROG="$(type -P xfs_admin)"
>  export XFS_GROWFS_PROG=$(type -P xfs_growfs)
>  export XFS_SPACEMAN_PROG="$(type -P xfs_spaceman)"
> -export XFS_SCRUB_PROG="$(type -P xfs_scrub)"
> +#export XFS_SCRUB_PROG="$(type -P xfs_scrub)"

If you have problems with online fsck, please report them to the mailing
list.

--D

>  export XFS_PARALLEL_REPAIR_PROG="$(type -P xfs_prepair)"
>  export XFS_PARALLEL_REPAIR64_PROG="$(type -P xfs_prepair64)"
>  export __XFSDUMP_PROG="$(type -P xfsdump)"
> @@ -236,18 +236,30 @@ export BTRFS_MAP_LOGICAL_PROG=$(type -P btrfs-map-logical)
>  export PARTED_PROG="$(type -P parted)"
>  export XFS_PROPERTY_PROG="$(type -P xfs_property)"
>  
> -# use 'udevadm settle' or 'udevsettle' to wait for lv to be settled.
> -# newer systems have udevadm command but older systems like RHEL5 don't.
> -# But if neither one is available, just set it to "sleep 1" to wait for lv to
> -# be settled
> -UDEV_SETTLE_PROG="$(type -P udevadm)"
> -if [ "$UDEV_SETTLE_PROG" == "" ]; then
> -	# try udevsettle command
> +# udev wait functions.
> +#
> +# This is how we wait for udev to create or remove device nodes after running a
> +# device create/remove command for logical volumes (e.g. lvm or dm).
> +#
> +# We can wait for the udev queue to empty via "settling". This, however, has
> +# major issues when running tests in parallel - the udev queue takes a long time
> +# to reach empty state. Hence if we have udev > 2.51 installed we use device
> +# waiting instead. This waits for the device node to appear/disappear rather
> +# than waiting for the udev queue to empty.
> +#
> +# If none of these methods are available, fall back to a simple delay (sleep 1)
> +# and hope this is sufficient.
> +UDEVADM_PROG="$(type -P udevadm)"
> +if [ -z "$UDEVADM_PROG" ]; then
>  	UDEV_SETTLE_PROG="$(type -P udevsettle)"
>  else
> -	# udevadm is available, add 'settle' as subcommand
> -	UDEV_SETTLE_PROG="$UDEV_SETTLE_PROG settle"
> +	UDEV_SETTLE_PROG="$UDEVADM_PROG settle"
> +	$UDEVADM_PROG help | grep -q "Wait for device or device symlink"
> +	if [ $? -eq 0 ]; then
> +		UDEV_WAIT_PROG="$UDEVADM_PROG wait"
> +	fi
>  fi
> +
>  # neither command is available, use sleep 1
>  #
>  # Udev events are sent via netlink to userspace through
> @@ -258,8 +270,11 @@ fi
>  # exist or always be 0. We check for /proc/net to see CONFIG_NET was enabled.
>  if [[ "$UDEV_SETTLE_PROG" == "" || ! -d /proc/net ]]; then
>  	UDEV_SETTLE_PROG="sleep 1"
> +	unset UDEV_WAIT_PROG
>  fi
>  export UDEV_SETTLE_PROG
> +export UDEVADM_PROG
> +export UDEV_WAIT_PROG
>  
>  # Set MODPROBE_PATIENT_RM_TIMEOUT_SECONDS to "forever" if you want the patient
>  # modprobe removal to run forever trying to remove a module.
> diff --git a/common/rc b/common/rc
> index 3f35da7fe..fdd18a386 100644
> --- a/common/rc
> +++ b/common/rc
> @@ -5191,22 +5191,29 @@ _require_label_get_max()
>  	dummy=$(_label_get_max)
>  }
>  
> +_udev_wait()
> +{
> +	local args="$*"
> +
> +	if [ -z "$UDEV_WAIT_PROG" ]; then
> +		$UDEV_SETTLE_PROG >/dev/null 2>&1
> +	else
> +		$UDEV_WAIT_PROG $args
> +	fi
> +}
> +
>  _dmsetup_remove()
>  {
> -	$UDEV_SETTLE_PROG >/dev/null 2>&1
> -	$DMSETUP_PROG remove --retry "$@" >>$seqres.full 2>&1
> -	$UDEV_SETTLE_PROG >/dev/null 2>&1
> +	[ $# -le 0 ] && return
> +
> +	$DMSETUP_PROG remove --deferred "$@" >>$seqres.full 2>&1
> +	_udev_wait --removed /dev/mapper/$1
>  }
>  
>  _dmsetup_create()
>  {
> -	# Wait for udev to settle so that the dm creation doesn't fail because
> -	# some udev subprogram opened one of the block devices mentioned in the
> -	# table string w/ O_EXCL.  Do it again at the end so that an immediate
> -	# device open won't also fail.
> -	$UDEV_SETTLE_PROG >/dev/null 2>&1
>  	$DMSETUP_PROG create "$@" >>$seqres.full 2>&1 || return 1
> -	$UDEV_SETTLE_PROG >/dev/null 2>&1
> +	_udev_wait /dev/mapper/$1
>  }
>  
>  _require_btime()
> diff --git a/tests/btrfs/291 b/tests/btrfs/291
> index c31de3a96..122aeaa5d 100755
> --- a/tests/btrfs/291
> +++ b/tests/btrfs/291
> @@ -21,6 +21,7 @@ _cleanup()
>  	cd /
>  	_log_writes_cleanup &> /dev/null
>  	$LVM_PROG vgremove -f -y $vgname >>$seqres.full 2>&1
> +	_udev_wait --removed /dev/mapper/$vgname-$lvname
>  	losetup -d $loop_dev >>$seqres.full 2>&1
>  	rm -f $img
>  	_restore_fsverity_signatures
> @@ -106,7 +107,7 @@ snap_dev=/dev/mapper/vg_replay-$snapname
>  $LVM_PROG vgcreate -f $vgname $loop_dev >>$seqres.full 2>&1 || _fail "failed to vgcreate $vgname"
>  $LVM_PROG lvcreate -L "$replay_bytes"B -n $lvname $vgname -y >>$seqres.full 2>&1 || \
>  	_fail "failed to lvcreate $lvname"
> -$UDEV_SETTLE_PROG >>$seqres.full 2>&1
> +_udev_wait /dev/mapper/$vgname-$lvname
>  
>  replay_log_prog=$here/src/log-writes/replay-log
>  num_entries=$($replay_log_prog --log $LOGWRITES_DEV --num-entries)
> @@ -125,7 +126,7 @@ do
>  
>  	$LVM_PROG lvcreate -s -L 4M -n $snapname $vgname/$lvname >>$seqres.full 2>&1 || \
>  		_fail "Failed to create snapshot"
> -	$UDEV_SETTLE_PROG >>$seqres.full 2>&1
> +	_udev_wait /dev/mapper/$vgname-$snapname
>  
>  	orphan=$(count_item $snap_dev ORPHAN)
>  	[ $state -eq 0 ] && [ $orphan -gt 0 ] && state=1
> diff --git a/tests/generic/081 b/tests/generic/081
> index df17ab6c1..37137d937 100755
> --- a/tests/generic/081
> +++ b/tests/generic/081
> @@ -38,7 +38,7 @@ _cleanup()
>  		$LVM_PROG vgremove -f $vgname >>$seqres.full 2>&1
>  		$LVM_PROG pvremove -f $SCRATCH_DEV >>$seqres.full 2>&1
>  		pv_ret=$?
> -		$UDEV_SETTLE_PROG
> +		_udev_wait --removed /dev/mapper/$vgname-$lvname
>  		test $pv_ret -eq 0 && break
>  		sleep 2
>  	done
> @@ -70,8 +70,8 @@ $LVM_PROG vgcreate -f $vgname $SCRATCH_DEV >>$seqres.full 2>&1
>  # We use yes pipe instead of 'lvcreate --yes' because old version of lvm
>  # (like 2.02.95 in RHEL6) don't support --yes option
>  yes | $LVM_PROG lvcreate -L ${lvsize}M -n $lvname $vgname >>$seqres.full 2>&1
> -# wait for lvcreation to fully complete
> -$UDEV_SETTLE_PROG >>$seqres.full 2>&1
> +_udev_wait /dev/mapper/$vgname-$lvname
> +
>  
>  # _mkfs_dev exits the test on failure, this can make sure lv is created in
>  # above vgcreate/lvcreate steps
> diff --git a/tests/generic/108 b/tests/generic/108
> index 2709472f6..f630450ec 100755
> --- a/tests/generic/108
> +++ b/tests/generic/108
> @@ -20,8 +20,8 @@ _cleanup()
>  	echo running > /sys/block/`_short_dev $SCSI_DEBUG_DEV`/device/state
>  	_unmount $SCRATCH_MNT >>$seqres.full 2>&1
>  	$LVM_PROG vgremove -f $vgname >>$seqres.full 2>&1
> -	$LVM_PROG pvremove -f $SCRATCH_DEV $SCSI_DEBUG_DEV >>$seqres.full 2>&1
> -	$UDEV_SETTLE_PROG
> +	 pvremove -f $SCRATCH_DEV $SCSI_DEBUG_DEV >>$seqres.full 2>&1
> +	_udev_wait --removed /dev/mapper/$vgname-$lvname
>  	_put_scsi_debug_dev
>  	rm -f $tmp.*
>  }
> @@ -57,8 +57,7 @@ $LVM_PROG vgcreate -f $vgname $SCSI_DEBUG_DEV $SCRATCH_DEV >>$seqres.full 2>&1
>  # (like 2.02.95 in RHEL6) don't support --yes option
>  yes | $LVM_PROG lvcreate -i 2 -I 4m -L ${lvsize}m -n $lvname $vgname \
>  	>>$seqres.full 2>&1
> -# wait for lv creation to fully complete
> -$UDEV_SETTLE_PROG >>$seqres.full 2>&1
> +_udev_wait /dev/mapper/$vgname-$lvname
>  
>  # _mkfs_dev exits the test on failure, this makes sure test lv is created by
>  # above vgcreate/lvcreate operations
> diff --git a/tests/generic/459 b/tests/generic/459
> index daccc80ce..1986c2e8f 100755
> --- a/tests/generic/459
> +++ b/tests/generic/459
> @@ -31,7 +31,7 @@ _cleanup()
>  	_unmount $SCRATCH_MNT >>$seqres.full 2>&1
>  	$LVM_PROG vgremove -ff $vgname >>$seqres.full 2>&1
>  	$LVM_PROG pvremove -ff $SCRATCH_DEV >>$seqres.full 2>&1
> -	$UDEV_SETTLE_PROG
> +	_udev_wait --removed /dev/mapper/$vgname-$lvname
>  }
>  
>  # Import common functions.
> @@ -88,8 +88,7 @@ $LVM_PROG lvcreate  --thinpool $poolname  --errorwhenfull y \
>  $LVM_PROG lvcreate  --virtualsize $virtsize \
>  		    -T $vgname/$poolname \
>  		    -n $lvname >>$seqres.full 2>&1
> -
> -$UDEV_SETTLE_PROG &>/dev/null
> +_udev_wait /dev/mapper/$vgname-$lvname
>  _mkfs_dev /dev/mapper/$vgname-$lvname >>$seqres.full 2>&1
>  
>  # Running the test over the original volume doesn't reproduce the problem
> @@ -97,6 +96,7 @@ _mkfs_dev /dev/mapper/$vgname-$lvname >>$seqres.full 2>&1
>  # reproducible, so, create a snapshot and run the test over it.
>  $LVM_PROG lvcreate  -k n -s $vgname/$lvname \
>  		    -n $snapname >>$seqres.full 2>&1
> +_udev_wait /dev/mapper/$vgname-$snapname
>  
>  # Catch mount failure so we don't blindly go an freeze the root filesystem
>  # instead of lvm volume.
> -- 
> 2.45.2
> 
>

next prev parent reply	other threads:[~2024-11-29 17:10 UTC|newest]

Thread overview: 65+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-11-27  4:51 [RFC PATCH 00/40] fstests: concurrent test execution Dave Chinner
2024-11-27  4:51 ` [PATCH 01/40] xfs/448: get rid of assert-on-failure Dave Chinner
2024-11-27  4:51 ` [PATCH 02/40] fstests: cleanup fsstress process management Dave Chinner
2024-11-29  4:03   ` Zorro Lang
2024-12-04 17:57   ` Zorro Lang
2024-12-05  4:42     ` Dave Chinner
2024-12-05  9:57       ` Zorro Lang
2024-12-04 18:04   ` Zorro Lang
2024-12-05  4:55     ` Dave Chinner
2024-12-05 10:05       ` Zorro Lang
2024-11-27  4:51 ` [PATCH 03/40] fuzzy: don't use killall Dave Chinner
2024-11-27  4:51 ` [PATCH 04/40] fstests: per-test dmflakey instances Dave Chinner
2024-11-27  4:51 ` [PATCH 05/40] fstests: per-test dmerror instances Dave Chinner
2024-11-27  4:51 ` [PATCH 06/40] fstests: per-test dmhuge instances Dave Chinner
2024-11-27  4:51 ` [PATCH 07/40] fstests: per-test dmthin instances Dave Chinner
2024-11-27  4:51 ` [PATCH 08/40] fstests: per-test dmdust instances Dave Chinner
2024-11-27  4:51 ` [PATCH 09/40] fstests: per-test dmdelay instances Dave Chinner
2024-11-27  4:51 ` [PATCH 10/40] fstests: fix DM device creation/removal vs udev races Dave Chinner
2024-11-27  4:51 ` [PATCH 11/40] fstests: use syncfs rather than sync Dave Chinner
2024-11-27  4:51 ` [PATCH 12/40] fstests: clean up mount and unmount operations Dave Chinner
2024-11-27  4:51 ` [PATCH 13/40] fstests: clean up loop device instantiation Dave Chinner
2024-12-01 12:31   ` Zorro Lang
2024-12-01 12:50   ` Zorro Lang
2024-12-07 12:44   ` Zorro Lang
2024-12-07 18:59     ` Zorro Lang
2024-12-07 19:51       ` Zorro Lang
2024-11-27  4:51 ` [PATCH 14/40] fstests: xfs/227 is really slow Dave Chinner
2024-11-27  4:51 ` [PATCH 15/40] fstests: mark tests that are unreliable when run in parallel Dave Chinner
2024-11-27  4:51 ` [PATCH 16/40] fstests: use udevadm wait in preference to settle Dave Chinner
2024-11-29 17:10   ` Darrick J. Wong [this message]
2024-11-29 22:33     ` Dave Chinner
2024-11-30  2:34       ` Zorro Lang
2024-11-27  4:51 ` [PATCH 17/40] xfs/442: rescale load so it's not exponential Dave Chinner
2024-11-27  4:51 ` [PATCH 18/40] xfs/176: fix broken setup code Dave Chinner
2024-11-27  4:51 ` [PATCH 19/40] xfs/177: remove unused slab object count location checks Dave Chinner
2024-11-27  4:51 ` [PATCH 20/40] fstests: remove uses of killall where possible Dave Chinner
2024-11-27  4:51 ` [PATCH 21/40] generic/127: reduce runtime Dave Chinner
2024-11-27  4:51 ` [PATCH 22/40] quota: system project quota files need to be shared Dave Chinner
2024-11-27  4:51 ` [PATCH 23/40] dmesg: reduce noise from other tests Dave Chinner
2024-11-27  4:51 ` [PATCH 24/40] fstests: stop using /tmp directly Dave Chinner
2024-11-27  4:51 ` [PATCH 25/40] fstests: scale some tests for high CPU count sanity Dave Chinner
2024-11-29  3:34   ` Zorro Lang
2024-11-27  4:51 ` [PATCH 26/40] generic/310: cleanup killing background processes Dave Chinner
2024-11-27  4:51 ` [PATCH 27/40] filter: handle mount errors from CONFIG_BLK_DEV_WRITE_MOUNTED=y Dave Chinner
2024-11-27  4:51 ` [PATCH 28/40] filters: add a filter that accepts EIO instead of other errors Dave Chinner
2024-11-27  4:51 ` [PATCH 29/40] generic/085: general cleanup for reliability and debugging Dave Chinner
2024-11-27  4:52 ` [PATCH 30/40] fstests: don't use directory stacks Dave Chinner
2024-12-01 12:10   ` Zorro Lang
2024-12-01 21:37     ` Dave Chinner
2024-11-27  4:52 ` [PATCH 31/40] fstests: clean up a couple of dm-flakey tests Dave Chinner
2024-11-27  4:52 ` [PATCH 32/40] fstests: clean up termination of various tests Dave Chinner
2024-11-27  4:52 ` [PATCH 33/40] vfstests: some tests require the testdir to be shared Dave Chinner
2024-11-27  4:52 ` [PATCH 34/40] xfs/629: single extent files should be within tolerance Dave Chinner
2024-11-27  4:52 ` [PATCH 35/40] xfs/076: fix broken mkfs filtering Dave Chinner
2024-11-27  4:52 ` [PATCH 36/40] fstests: capture some failures to seqres.full Dave Chinner
2024-11-27  4:52 ` [PATCH 37/40] fstests: always use fail-at-unmount semantics for XFS Dave Chinner
2024-11-27  4:52 ` [PATCH 38/40] generic/062: don't leave debug files in $here on failure Dave Chinner
2024-11-27  4:52 ` [PATCH 39/40] fstests: quota grace periods unreliable under load Dave Chinner
2024-11-27  4:52 ` [PATCH 40/40] fstests: check-parallel Dave Chinner
2024-11-29  4:22 ` [RFC PATCH 00/40] fstests: concurrent test execution Zorro Lang
2024-12-07  0:09   ` Darrick J. Wong
2024-12-07  9:38     ` Zorro Lang
2024-12-08  0:02     ` Dave Chinner
2024-12-08  6:15       ` Zorro Lang
2024-12-10  0:55         ` Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20241129171009.GF9425@frogsfrogsfrogs \
    --to=djwong@kernel.org \
    --cc=david@fromorbit.com \
    --cc=fstests@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox