public inbox for fstests@vger.kernel.org
 help / color / mirror / Atom feed
From: Eryu Guan <eguan@linux.alibaba.com>
To: Qu Wenruo <wqu@suse.com>
Cc: fstests@vger.kernel.org, linux-btrfs@vger.kernel.org
Subject: Re: [PATCH 2/2] fstests: btrfs/011: Handle finished scrub/replace operation gracefully
Date: Wed, 18 Sep 2019 16:47:34 +0800	[thread overview]
Message-ID: <20190918084734.GA52397@e18g06458.et15sqa> (raw)
In-Reply-To: <20190918065626.34902-2-wqu@suse.com>

On Wed, Sep 18, 2019 at 02:56:26PM +0800, Qu Wenruo wrote:
> [BUG]
> When btrfs/011 is executed on a fast enough system (fully memory backed
> VM, with test device has unsafe cache mode), the test can fail like
> this:
> 
>   btrfs/011 43s ... [failed, exit status 1]- output mismatch (see /home/adam/xfstests-dev/results//btrfs/011.out.bad)
>     --- tests/btrfs/011.out     2019-07-22 14:13:44.643333326 +0800
>     +++ /home/adam/xfstests-dev/results//btrfs/011.out.bad      2019-09-18 14:49:28.308798022 +0800
>     @@ -1,3 +1,4 @@
>      QA output created by 011
>      *** test btrfs replace
>     -*** done
>     +failed: '/usr/bin/btrfs replace cancel /mnt/scratch'
>     +(see /home/adam/xfstests-dev/results//btrfs/011.full for details)
>     ...
> 
> [CAUSE]
> Looking into the full output, it shows:
>   ...
>   Replace from /dev/mapper/test-scratch1 to /dev/mapper/test-scratch2
> 
>   # /usr/bin/btrfs replace start -f /dev/mapper/test-scratch1 /dev/mapper/test-scratch2 /mnt/scratch
>   # /usr/bin/btrfs replace cancel /mnt/scratch
>   INFO: ioctl(DEV_REPLACE_CANCEL)"/mnt/scratch": not started
>   failed: '/usr/bin/btrfs replace cancel /mnt/scratch'
> 
> So this means the replace is already finished before we cancel it.
> For fast system, it's very common.

Does generate heavier load & more data make replace operation last
longer? e.g. make more 'noise' by running fsstress instead of dumping
/dev/urandom before starting replace.

And does sleep shorter time (0.5s?) before cancel work?

Thanks,
Eryu

> 
> [FIX]
> Instead of using _run_btrfs_util_prog which requires 0 as return value,
> we just call "$BTRFS_UTIL_PROG replace cancel" and ignore all its
> stderr/stdout, and completely rely on "$BTRFS_UTIL_PROG replace status"
> output to verify the work.
> 
> Furthermore if we finished replac before cancelling it, we should
> replace again to switch the device back, or after the test case, btrfs
> check will fail as there is no valid btrfs on that replaced device.
> 
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
>  tests/btrfs/011 | 16 ++++++++++++++--
>  1 file changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/tests/btrfs/011 b/tests/btrfs/011
> index 89bb4d11..858b00e8 100755
> --- a/tests/btrfs/011
> +++ b/tests/btrfs/011
> @@ -148,13 +148,25 @@ btrfs_replace_test()
>  		# background the replace operation (no '-B' option given)
>  		_run_btrfs_util_prog replace start -f $replace_options $source_dev $target_dev $SCRATCH_MNT
>  		sleep 1
> -		_run_btrfs_util_prog replace cancel $SCRATCH_MNT
> +		# 1s is enough for fast system to finish replace, so here we
> +		# ignore all the output, completely rely on later status
> +		# output to determine
> +		$BTRFS_UTIL_PROG replace cancel $SCRATCH_MNT &> /dev/null
>  
>  		# 'replace status' waits for the replace operation to finish
>  		# before the status is printed
>  		$BTRFS_UTIL_PROG replace status $SCRATCH_MNT > $tmp.tmp 2>&1
>  		cat $tmp.tmp >> $seqres.full
> -		grep -q canceled $tmp.tmp || _fail "btrfs replace status (canceled) failed"
> +		grep -q -e canceled -e finished $tmp.tmp ||\
> +			_fail "btrfs replace status (canceled) failed"
> +
> +		# If replace finished before cancel, replace them back or
> +		# the final fsck after test case will fail as there is no btrfs
> +		# on the $source_dev anymore
> +		if grep -q -e finished $tmp.tmp ; then
> +			$BTRFS_UTIL_PROG replace start -Bf $replace_options \
> +				$target_dev $source_dev $SCRATCH_MNT
> +		fi
>  	else
>  		if [ "${quick}Q" = "thoroughQ" ]; then
>  			# On current hardware, the thorough test runs
> -- 
> 2.22.0

  parent reply	other threads:[~2019-09-18  8:47 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-18  6:56 [PATCH 1/2] fstests: btrfs/028: Don't pollute golden output for killing already finished process Qu Wenruo
2019-09-18  6:56 ` [PATCH 2/2] fstests: btrfs/011: Handle finished scrub/replace operation gracefully Qu Wenruo
2019-09-18  7:54   ` Anand Jain
2019-09-18  7:57     ` Qu Wenruo
2019-09-18  8:47   ` Eryu Guan [this message]
2019-09-18 10:12     ` Qu Wenruo
2019-09-18  7:33 ` [PATCH 1/2] fstests: btrfs/028: Don't pollute golden output for killing already finished process Anand Jain

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190918084734.GA52397@e18g06458.et15sqa \
    --to=eguan@linux.alibaba.com \
    --cc=fstests@vger.kernel.org \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=wqu@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox