linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH 0/3] Add testcase for fundamental scrub recovery
@ 2016-11-22  8:38 Qu Wenruo
  2016-11-22  8:38 ` [RFC PATCH 1/3] fstests: common: rename _require_btrfs to _require_btrfs_subcommand Qu Wenruo
                   ` (4 more replies)
  0 siblings, 5 replies; 9+ messages in thread
From: Qu Wenruo @ 2016-11-22  8:38 UTC (permalink / raw)
  To: linux-btrfs, fstests

Despite the scrub test cases in fstests, there is not even one test case
which really checked if scrub can recover data.

In fact, btrfs scrub for RAID56 will even corrupt correct data stripes.

So let's start from the needed facilities and check scrub starting from RAID1.

The main reason the patchset is RFC, is the method I take to corrupt/verify
btrfs data.

unlike other stable fs, like ext* or xfs, which has good tool to
manipulate the fs.

There used to be btrfs-corrupt-block, but it gets fewer and fewer
update, no man page nor installed by default.

Here I take the method I normally do in my scripts: use btrfs-dump-tree
(btrfs inspect-internal dump-tree) and use bash to corrupt btrfs
pinpointly.
It works quite fine for several different mount options which affects
how data lies on-disk.

But the problem is also obvious, bash script is super hard to maintain,
and there is no promise that btrfs-dump-tree output won't change.
Such change will be destructive.

But IMHO it's still worthy, or btrfs scrub may break at any time.

Qu Wenruo (3):
  fstests: common: rename _require_btrfs to _require_btrfs_subcommand
  fstests: common/ondisk.btrfs: Introduce function to get btrfs ondisk  
      info
  fstests: btrfs: Add new test case to check scrub recovery and report

 common/ondisk.btrfs | 113 ++++++++++++++++++++++++++
 common/rc           |   2 +-
 tests/btrfs/004     |   2 +-
 tests/btrfs/048     |   2 +-
 tests/btrfs/059     |   2 +-
 tests/btrfs/131     |   2 +-
 tests/btrfs/132     | 229 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/btrfs/132.out | 111 +++++++++++++++++++++++++
 tests/btrfs/group   |   2 +
 9 files changed, 460 insertions(+), 5 deletions(-)
 create mode 100644 common/ondisk.btrfs
 create mode 100755 tests/btrfs/132
 create mode 100644 tests/btrfs/132.out

-- 
2.7.4




^ permalink raw reply	[flat|nested] 9+ messages in thread

* [RFC PATCH 1/3] fstests: common: rename _require_btrfs to _require_btrfs_subcommand
  2016-11-22  8:38 [RFC PATCH 0/3] Add testcase for fundamental scrub recovery Qu Wenruo
@ 2016-11-22  8:38 ` Qu Wenruo
  2016-12-05  7:10   ` Eryu Guan
  2016-11-22  8:38 ` [RFC PATCH 2/3] fstests: common/ondisk.btrfs: Introduce function to get btrfs ondisk info Qu Wenruo
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 9+ messages in thread
From: Qu Wenruo @ 2016-11-22  8:38 UTC (permalink / raw)
  To: linux-btrfs, fstests

Rename _require_btrfs() to _require_btrfs_subcommand() to avoid
confusion, as all other _require_btrfs_* has a quite clear suffix, like
_require_btrfs_mkfs_feature() or _require_btrfs_fs_feature().

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
---
 common/rc       | 2 +-
 tests/btrfs/004 | 2 +-
 tests/btrfs/048 | 2 +-
 tests/btrfs/059 | 2 +-
 tests/btrfs/131 | 2 +-
 5 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/common/rc b/common/rc
index e3b54ec..6d333dd 100644
--- a/common/rc
+++ b/common/rc
@@ -3017,7 +3017,7 @@ _require_deletable_scratch_dev_pool()
 }
 
 # We check for btrfs and (optionally) features of the btrfs command
-_require_btrfs()
+_require_btrfs_subcommand()
 {
 	cmd=$1
 	_require_command "$BTRFS_UTIL_PROG" btrfs
diff --git a/tests/btrfs/004 b/tests/btrfs/004
index 905770a..2ce628e 100755
--- a/tests/btrfs/004
+++ b/tests/btrfs/004
@@ -51,7 +51,7 @@ _supported_fs btrfs
 _supported_os Linux
 _require_scratch
 _require_no_large_scratch_dev
-_require_btrfs inspect-internal
+_require_btrfs_subcommand inspect-internal
 _require_command "/usr/sbin/filefrag" filefrag
 
 rm -f $seqres.full
diff --git a/tests/btrfs/048 b/tests/btrfs/048
index 0b907b0..ac731d1 100755
--- a/tests/btrfs/048
+++ b/tests/btrfs/048
@@ -48,7 +48,7 @@ _supported_fs btrfs
 _supported_os Linux
 _require_test
 _require_scratch
-_require_btrfs "property"
+_require_btrfs_subcommand "property"
 
 send_files_dir=$TEST_DIR/btrfs-test-$seq
 
diff --git a/tests/btrfs/059 b/tests/btrfs/059
index 8f106d2..fd67ebb 100755
--- a/tests/btrfs/059
+++ b/tests/btrfs/059
@@ -51,7 +51,7 @@ _supported_fs btrfs
 _supported_os Linux
 _require_test
 _require_scratch
-_require_btrfs "property"
+_require_btrfs_subcommand "property"
 
 rm -f $seqres.full
 
diff --git a/tests/btrfs/131 b/tests/btrfs/131
index d1a11d2..ff1dbdd 100755
--- a/tests/btrfs/131
+++ b/tests/btrfs/131
@@ -48,7 +48,7 @@ rm -f $seqres.full
 _supported_fs btrfs
 _supported_os Linux
 _require_scratch
-_require_btrfs inspect-internal
+_require_btrfs_subcommand inspect-internal
 
 mkfs_v1()
 {
-- 
2.7.4




^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [RFC PATCH 2/3] fstests: common/ondisk.btrfs: Introduce function to get btrfs ondisk info
  2016-11-22  8:38 [RFC PATCH 0/3] Add testcase for fundamental scrub recovery Qu Wenruo
  2016-11-22  8:38 ` [RFC PATCH 1/3] fstests: common: rename _require_btrfs to _require_btrfs_subcommand Qu Wenruo
@ 2016-11-22  8:38 ` Qu Wenruo
  2016-11-22  8:38 ` [RFC PATCH 3/3] fstests: btrfs: Add new test case to check scrub recovery and report Qu Wenruo
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 9+ messages in thread
From: Qu Wenruo @ 2016-11-22  8:38 UTC (permalink / raw)
  To: linux-btrfs, fstests

Introduce a new common header, ondisk.btrfs, to allow we manipulate
btrfs on-disk data directly, to corrupt or verify data.

Since the designed tool, btrfs-corrupt-block is no longer default
installed, and furthermore it doesn't support to corrupt given mirror or
stripe, it's near useless for later scrub verification test cases.

Not to mention its parameter is not updated and there is no man page for
it.

So here what we can do is, either wait for years for a reliable
corrupter and see btrfs scrub never get verified for its basic function.
Or use existing tools like btrfs-debug-tree
(btrfs inspect-internal dump-tree) and bash to build a basic tool.

I choose the later one, which is not perfect, only support one data
chunk, hard to maintain(why we are using bash?!), but is good enough for
RAID1/DUP testing.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
---
 common/ondisk.btrfs | 113 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 113 insertions(+)
 create mode 100644 common/ondisk.btrfs

diff --git a/common/ondisk.btrfs b/common/ondisk.btrfs
new file mode 100644
index 0000000..df87a08
--- /dev/null
+++ b/common/ondisk.btrfs
@@ -0,0 +1,113 @@
+#!/bin/bash
+#-----------------------------------------------------------------------
+# Copyright (c) 2016 Fujitsu.  All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#-----------------------------------------------------------------------
+#
+# Helper functions to manipulate btrfs ondisk data, for scrub verification
+
+# Get the logical bytenr of the first block group matches the @profile
+# @1: device contains the btrfs, mandatory 
+# @2: type, must be one of "DATA", "SYSTEM" or "METADATA", ignore case, mandatory
+# @3: profile, must be one of the support profile, ignore case, optional,
+#     default to any profile
+# Output "logical length" into stdout for usage.
+# If failed, nothing is outputted, so caller should check the output
+_btrfs_get_first_bg_by_type()
+{
+	local dev=$1
+	local chunk_type=$(echo $2 | awk '{print toupper($0)}')
+	local profile=$(echo $3 | awk '{print toupper($0)}')
+
+	_require_btrfs_subcommand inspect-internal
+
+	echo "_btrfs_get_first_bg_by_type: dev=$dev type=$chunk_type profile=$profile" \
+		>> $seqres.full
+
+	if [ -z "$dev" -o -z "$chunk_type" ]; then
+		_fail "Missing device or chunk type argument for _btrfs_get_first_bg_by_type"
+	fi
+
+	if [ ! -z $profile ]; then
+		if [ $profile == "SINGLE" ]; then
+			full_profile="$chunk_type"
+		else
+			full_profile="$chunk_type|$profile"
+		fi
+	else
+		full_profile="$chunk_type"
+	fi
+
+	logical=$($BTRFS_UTIL_PROG inspect-internal dump-tree $dev -t chunk |\
+		grep "type $full_profile" -m1 -B2 |\
+		grep "CHUNK_ITEM [[:digit:]]*" -o | awk '{print $2}')
+	length=$($BTRFS_UTIL_PROG inspect-internal dump-tree $dev -t chunk |\
+		grep "type $full_profile" -m1 -B2 |\
+		egrep "chunk length [[:digit:]]*" -o | awk '{print $3}')
+	if [ -z "$logical" -o -z "$length" ]; then
+		return;
+	fi
+	echo "found bg logical=$logical length=$length" >> $seqres.full
+	echo "$logical $length"
+}
+
+# Get the stripe map for chunk specified by @logical parameter
+# @1: device contains the btrfs, mandatory
+# @2: logical bytenr of the chunk, mandatory
+# @3: stripe number, optional, default to 0(first stripe)
+# Output "devid dev_offset" into stdout for usage
+# If failed, nothing is outputed, so caller should check the output
+_btrfs_get_chunk_stripe()
+{
+	local dev=$1
+	local logical=$2
+	local stripenr=$3
+
+	_require_btrfs_subcommand inspect-internal
+
+	echo "_btrfs_get_chunk_stripe: dev=$dev logical=$logical stripenr=$stripenr" \
+		>> $seqres.full
+	if [ -z "$dev" -o -z "$logical" ]; then
+		_fail "Missing device or logical argument for _btrfs_get_chunk_stripe"
+	fi
+
+	if [ -z $stripenr ]; then
+		stripenr=0
+	fi
+
+	stripes=$($BTRFS_UTIL_PROG inspect-internal dump-tree $dev -t chunk |\
+		grep "CHUNK_ITEM $logical" -A 2 |\
+		grep -o "num_stripes [[:digit:]]*" |\
+		awk '{print $2}')
+	if [ -z $stripes ]; then
+		return
+	fi
+
+	# For CHUNK_ITEM it takes 3 lines(Hit line includes), follows with
+	# stripe info, each stripe takes 2 lines
+	nr_lines=$((2 + 2 * $stripes))
+	devid=$($BTRFS_UTIL_PROG inspect-internal dump-tree $dev -t chunk |\
+		grep "CHUNK_ITEM $logical" -A $nr_lines |\
+		grep "stripe $stripenr" | awk '{print $4}')
+	dev_offset=$($BTRFS_UTIL_PROG inspect-internal dump-tree $dev -t chunk |\
+		grep "CHUNK_ITEM $logical" -A $nr_lines |\
+		grep "stripe $stripenr" | awk '{print $6}')
+	if [ -z "$devid" -o -z "$dev_offset" ]; then
+		return;
+	fi
+	echo "found stripe=$stripenr devid=$devid dev_offset=$dev_offset" \
+		>> $seqres.full
+	echo "$devid $dev_offset"
+}
-- 
2.7.4




^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [RFC PATCH 3/3] fstests: btrfs: Add new test case to check scrub recovery and report
  2016-11-22  8:38 [RFC PATCH 0/3] Add testcase for fundamental scrub recovery Qu Wenruo
  2016-11-22  8:38 ` [RFC PATCH 1/3] fstests: common: rename _require_btrfs to _require_btrfs_subcommand Qu Wenruo
  2016-11-22  8:38 ` [RFC PATCH 2/3] fstests: common/ondisk.btrfs: Introduce function to get btrfs ondisk info Qu Wenruo
@ 2016-11-22  8:38 ` Qu Wenruo
  2016-11-28  2:50 ` [RFC PATCH 0/3] Add testcase for fundamental scrub recovery Qu Wenruo
  2016-12-05  7:07 ` Eryu Guan
  4 siblings, 0 replies; 9+ messages in thread
From: Qu Wenruo @ 2016-11-22  8:38 UTC (permalink / raw)
  To: linux-btrfs, fstests

In fact, desipte the existing btrfs scrub test cases, we didn't even
test if scrub can really recovery data.

Due to the recent exposed several critical RAID56 scrub problem, it's
really needed to test the fundamental function before things get worse
and worse.

This test case add verification for btrfs RAID1/DUP scrub recovery.
As it's the simplest profile, pure mirroring, not stripping nor parity.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
---
 tests/btrfs/132     | 229 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/btrfs/132.out | 111 +++++++++++++++++++++++++
 tests/btrfs/group   |   2 +
 3 files changed, 342 insertions(+)
 create mode 100755 tests/btrfs/132
 create mode 100644 tests/btrfs/132.out

diff --git a/tests/btrfs/132 b/tests/btrfs/132
new file mode 100755
index 0000000..3198125
--- /dev/null
+++ b/tests/btrfs/132
@@ -0,0 +1,229 @@
+#! /bin/bash
+# FS QA Test 132
+#
+# Check if pure mirror based btrfs raid profiles(RAID1/DUP) can recover
+# data correctly by scrubbing, and the correctness of the error report
+#
+# We don't have good enough off-line or on-line tool to corrupt on-disk
+# data which can handle extents on different chunks.
+# So here we only create a small file extents, this will restrict the coverage
+#
+#-----------------------------------------------------------------------
+# Copyright (c) 2016 Fujitsu.  All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#-----------------------------------------------------------------------
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1	# failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+	cd /
+	rm -f $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+. ./common/ondisk.btrfs
+
+# remove previous $seqres.full before test
+rm -f $seqres.full
+
+# Golden output has csum, so no LOAD_FACTOR here
+runtimes=4
+
+# real QA test starts here
+
+# Modify as appropriate.
+_supported_fs btrfs
+_supported_os Linux
+
+# RAID1 needs at least 2 devices
+_require_scratch_dev_pool 2
+
+stripelen=65536
+# Corrupt mirrors at given logical bytenr
+# Only works if the range is in the first data chunk
+corrupt_one_mirror()
+{
+	local dest_logical=$1
+	local dest_len=$2
+	local devs[]="( $SCRATCH_DEV_POOL )"
+
+	dev=$(echo $SCRATCH_DEV_POOL | awk '{print $1}')
+	output=$(_btrfs_get_first_bg_by_type $dev data)
+	if [ -z "$output" ]; then
+		_fail "testcase error: failed to get first data chunk info"
+	fi
+	bg_logical=$(echo $output | awk '{print $1}')
+	bg_len=$(echo $output | awk '{print $2}')
+
+	if [ $dest_logical -ge $(( $bg_logical + $bg_len)) -o \
+	     $(( $dest_logical + $dest_len)) -le $bg_logical ]; then
+	     _fail "testcase error: ondisk layout is out of expect"
+	fi
+
+
+	# Select random stripe (mirror) as corrupt destination
+	dest_stripe=$((RANDOM % 2))
+	output=$(_btrfs_get_chunk_stripe $dev $bg_logical $dest_stripe)
+	if [ -z "$output" ]; then
+		_fail "testcase error: failed to get first data chunk info"
+	fi
+	stripe_devid=$(echo $output | awk '{print $1}')
+	stripe_devid=$(($stripe_devid - 1))
+
+	stripe_offset=$(echo $output | awk '{print $2}')
+
+	real_offset=$(( $dest_logical - $bg_logical + $stripe_offset ))
+	dev=${devs[$(($stripe_devid))]}
+
+	# Corrupt the last stripe using random data
+	$XFS_IO_PROG -c "pwrite -S 0x0000 $real_offset $dest_len" $dev \
+		> /dev/null 2>&1
+}
+
+csum_mirrors()
+{
+	local dest_logical=$1
+	local dest_len=$2
+	local verify_corrupt=$3
+	local csums[]="( 0 0 )"
+
+	dev=$(echo $SCRATCH_DEV_POOL | awk '{print $1}')
+	output=$(_btrfs_get_first_bg_by_type $dev data)
+	if [ -z "$output" ]; then
+		_fail "testcase error: failed to get first data chunk info"
+	fi
+	bg_logical=$(echo $output | awk '{print $1}')
+	bg_len=$(echo $output | awk '{print $2}')
+
+	if [ $dest_logical -ge $(( $bg_logical + $bg_len)) -o \
+	     $(( $dest_logical + $dest_len)) -le $bg_logical ]; then
+	     _fail "testcase error: ondisk layout is out of expect"
+	fi
+
+	bg_offset=$(( $dest_logical - $bg_logical ))
+
+	for i in $(seq 0 1); do
+		local devs[]="( $SCRATCH_DEV_POOL )"
+		output=$(_btrfs_get_chunk_stripe $dev $bg_logical $i)
+		if [ -z "$output" ]; then
+			_fail "testcase error: failed to get first data chunk info"
+		fi
+		stripe_devid=$(echo $output | awk '{print $1}')
+		stripe_offset=$(echo $output | awk '{print $2}')
+
+		real_offset=$(( $bg_offset + $stripe_offset ))
+
+		index=$(($stripe_devid - 1))
+		dev=${devs[$index]}
+
+		# xfs_io don't support to read out data into file, must use dd here now
+		csums[$i]=$(dd if=$dev bs=1 count=$dest_len skip=$real_offset \
+			   status=none | md5sum | awk '{print $1}')
+		if [ $verify_corrupt -eq 0 ]; then
+			echo "csum for stripe $i: ${csums[$i]}"
+		fi
+	done
+	if [ $verify_corrupt -ne 0 ]; then
+		if [ ${csums[0]} == ${csums[1]} ]; then
+			_fail "corruption failed"
+		else
+			echo "one of the mirror corrupted"
+			echo
+		fi
+	fi
+}
+
+do_test()
+{
+	profile=$1
+
+	if [ -z "$profile" ]; then
+		_fail "testcase error: profile argument is not given for do_test()"
+	fi
+
+	if [ $profile == "dup" ]; then
+		ndevs=1
+	elif [ $profile == "raid1" ]; then
+		ndevs=2
+	else
+		_fail "testcase error: profile $profile is not supported for this test case"
+	fi
+
+	echo "===Test scrub recovery for profile: $profile==="
+	_scratch_dev_pool_get $ndevs
+	_scratch_pool_mkfs "-m $profile -d $profile" >> $seqres.full 2>&1
+	dev=$(echo $SCRATCH_DEV_POOL | $AWK_PROG '{print $1}') 
+	output=$(_btrfs_get_first_bg_by_type $dev data $profile)
+
+	if [ -z "$output" ]; then
+		_fail "testcase error: unable to get first block group logical"
+	fi
+
+
+	_scratch_mount >> $seqres.full 2>&1
+	_pwrite_byte 0xcdcd 0 $stripelen $SCRATCH_MNT/file > /dev/null 2>&1
+	_scratch_cycle_mount
+	file_logical=$($XFS_IO_PROG -c "fiemap" $SCRATCH_MNT/file | tail -n +2 |
+		awk '{print $3}' | cut -f1 -d\.)
+	file_logical=$(( $file_logical * 512 ))
+	_scratch_unmount
+
+	echo "before csum corruption"
+	csum_mirrors $file_logical $stripelen 0
+	echo ""
+	corrupt_one_mirror $file_logical $stripelen
+
+	# verify the corruption is good
+	csum_mirrors $file_logical $stripelen 1
+
+	_scratch_mount
+	# redirect the output, we will need it for several usage
+	$BTRFS_UTIL_PROG scrub start -B $SCRATCH_MNT &> $tmp.scrub_output
+	if [ $? -ne 0 ]; then
+		_fail "scrub found un-recoverable error"
+	fi
+	cat $tmp.scrub_output >> $seqres.full
+	csum_errors=$(grep -o "csum=[[:digit:]]*" $tmp.scrub_output | cut -f2 -d=)
+	pagesize=$(getconf PAGESIZE)
+	expected_errors=$(($stripelen / $pagesize))
+	if [ $csum_errors -ne $(($stripelen / $pagesize)) ]; then
+		_fail "incorrect csum error number reported, have=$csum_errors, expected=$expected_errors"
+	fi
+	_scratch_unmount
+
+	echo "after csum corruption and scrub"
+	csum_mirrors $file_logical $stripelen 0
+	echo ""
+	_scratch_dev_pool_put
+}
+
+for i in $(seq 0 $runtimes); do
+	do_test dup
+	do_test raid1
+done
+
+status=0
+exit
diff --git a/tests/btrfs/132.out b/tests/btrfs/132.out
new file mode 100644
index 0000000..362574f
--- /dev/null
+++ b/tests/btrfs/132.out
@@ -0,0 +1,111 @@
+QA output created by 132
+===Test scrub recovery for profile: dup===
+before csum corruption
+csum for stripe 0: 27c9068d1b51da575a53ad34c57ca5cc
+csum for stripe 1: 27c9068d1b51da575a53ad34c57ca5cc
+
+one of the mirror corrupted
+
+after csum corruption and scrub
+csum for stripe 0: 27c9068d1b51da575a53ad34c57ca5cc
+csum for stripe 1: 27c9068d1b51da575a53ad34c57ca5cc
+
+===Test scrub recovery for profile: raid1===
+before csum corruption
+csum for stripe 0: 27c9068d1b51da575a53ad34c57ca5cc
+csum for stripe 1: 27c9068d1b51da575a53ad34c57ca5cc
+
+one of the mirror corrupted
+
+after csum corruption and scrub
+csum for stripe 0: 27c9068d1b51da575a53ad34c57ca5cc
+csum for stripe 1: 27c9068d1b51da575a53ad34c57ca5cc
+
+===Test scrub recovery for profile: dup===
+before csum corruption
+csum for stripe 0: 27c9068d1b51da575a53ad34c57ca5cc
+csum for stripe 1: 27c9068d1b51da575a53ad34c57ca5cc
+
+one of the mirror corrupted
+
+after csum corruption and scrub
+csum for stripe 0: 27c9068d1b51da575a53ad34c57ca5cc
+csum for stripe 1: 27c9068d1b51da575a53ad34c57ca5cc
+
+===Test scrub recovery for profile: raid1===
+before csum corruption
+csum for stripe 0: 27c9068d1b51da575a53ad34c57ca5cc
+csum for stripe 1: 27c9068d1b51da575a53ad34c57ca5cc
+
+one of the mirror corrupted
+
+after csum corruption and scrub
+csum for stripe 0: 27c9068d1b51da575a53ad34c57ca5cc
+csum for stripe 1: 27c9068d1b51da575a53ad34c57ca5cc
+
+===Test scrub recovery for profile: dup===
+before csum corruption
+csum for stripe 0: 27c9068d1b51da575a53ad34c57ca5cc
+csum for stripe 1: 27c9068d1b51da575a53ad34c57ca5cc
+
+one of the mirror corrupted
+
+after csum corruption and scrub
+csum for stripe 0: 27c9068d1b51da575a53ad34c57ca5cc
+csum for stripe 1: 27c9068d1b51da575a53ad34c57ca5cc
+
+===Test scrub recovery for profile: raid1===
+before csum corruption
+csum for stripe 0: 27c9068d1b51da575a53ad34c57ca5cc
+csum for stripe 1: 27c9068d1b51da575a53ad34c57ca5cc
+
+one of the mirror corrupted
+
+after csum corruption and scrub
+csum for stripe 0: 27c9068d1b51da575a53ad34c57ca5cc
+csum for stripe 1: 27c9068d1b51da575a53ad34c57ca5cc
+
+===Test scrub recovery for profile: dup===
+before csum corruption
+csum for stripe 0: 27c9068d1b51da575a53ad34c57ca5cc
+csum for stripe 1: 27c9068d1b51da575a53ad34c57ca5cc
+
+one of the mirror corrupted
+
+after csum corruption and scrub
+csum for stripe 0: 27c9068d1b51da575a53ad34c57ca5cc
+csum for stripe 1: 27c9068d1b51da575a53ad34c57ca5cc
+
+===Test scrub recovery for profile: raid1===
+before csum corruption
+csum for stripe 0: 27c9068d1b51da575a53ad34c57ca5cc
+csum for stripe 1: 27c9068d1b51da575a53ad34c57ca5cc
+
+one of the mirror corrupted
+
+after csum corruption and scrub
+csum for stripe 0: 27c9068d1b51da575a53ad34c57ca5cc
+csum for stripe 1: 27c9068d1b51da575a53ad34c57ca5cc
+
+===Test scrub recovery for profile: dup===
+before csum corruption
+csum for stripe 0: 27c9068d1b51da575a53ad34c57ca5cc
+csum for stripe 1: 27c9068d1b51da575a53ad34c57ca5cc
+
+one of the mirror corrupted
+
+after csum corruption and scrub
+csum for stripe 0: 27c9068d1b51da575a53ad34c57ca5cc
+csum for stripe 1: 27c9068d1b51da575a53ad34c57ca5cc
+
+===Test scrub recovery for profile: raid1===
+before csum corruption
+csum for stripe 0: 27c9068d1b51da575a53ad34c57ca5cc
+csum for stripe 1: 27c9068d1b51da575a53ad34c57ca5cc
+
+one of the mirror corrupted
+
+after csum corruption and scrub
+csum for stripe 0: 27c9068d1b51da575a53ad34c57ca5cc
+csum for stripe 1: 27c9068d1b51da575a53ad34c57ca5cc
+
diff --git a/tests/btrfs/group b/tests/btrfs/group
index c090604..61f26ed 100644
--- a/tests/btrfs/group
+++ b/tests/btrfs/group
@@ -134,3 +134,5 @@
 129 auto quick send
 130 auto clone send
 131 auto quick
+132 auto scrub
+
-- 
2.7.4




^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH 0/3] Add testcase for fundamental scrub recovery
  2016-11-22  8:38 [RFC PATCH 0/3] Add testcase for fundamental scrub recovery Qu Wenruo
                   ` (2 preceding siblings ...)
  2016-11-22  8:38 ` [RFC PATCH 3/3] fstests: btrfs: Add new test case to check scrub recovery and report Qu Wenruo
@ 2016-11-28  2:50 ` Qu Wenruo
  2016-11-28  3:29   ` Eryu Guan
  2016-12-05  7:07 ` Eryu Guan
  4 siblings, 1 reply; 9+ messages in thread
From: Qu Wenruo @ 2016-11-28  2:50 UTC (permalink / raw)
  To: linux-btrfs, fstests

Any comment?

At 11/22/2016 04:38 PM, Qu Wenruo wrote:
> Despite the scrub test cases in fstests, there is not even one test case
> which really checked if scrub can recover data.
>
> In fact, btrfs scrub for RAID56 will even corrupt correct data stripes.
>
> So let's start from the needed facilities and check scrub starting from RAID1.
>
> The main reason the patchset is RFC, is the method I take to corrupt/verify
> btrfs data.
>
> unlike other stable fs, like ext* or xfs, which has good tool to
> manipulate the fs.
>
> There used to be btrfs-corrupt-block, but it gets fewer and fewer
> update, no man page nor installed by default.
>
> Here I take the method I normally do in my scripts: use btrfs-dump-tree
> (btrfs inspect-internal dump-tree) and use bash to corrupt btrfs
> pinpointly.
> It works quite fine for several different mount options which affects
> how data lies on-disk.
>
> But the problem is also obvious, bash script is super hard to maintain,
> and there is no promise that btrfs-dump-tree output won't change.
> Such change will be destructive.
>
> But IMHO it's still worthy, or btrfs scrub may break at any time.
>
> Qu Wenruo (3):
>   fstests: common: rename _require_btrfs to _require_btrfs_subcommand
>   fstests: common/ondisk.btrfs: Introduce function to get btrfs ondisk
>       info

Especially for fstests guys, is this OK to use complex bash scripts to 
extract needed data for btrfs?

Or should we just put things into btrfs-progs (and wait for some time, 
leaving btrfs raid56 corrupted)

Thanks,
Qu
>   fstests: btrfs: Add new test case to check scrub recovery and report
>
>  common/ondisk.btrfs | 113 ++++++++++++++++++++++++++
>  common/rc           |   2 +-
>  tests/btrfs/004     |   2 +-
>  tests/btrfs/048     |   2 +-
>  tests/btrfs/059     |   2 +-
>  tests/btrfs/131     |   2 +-
>  tests/btrfs/132     | 229 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  tests/btrfs/132.out | 111 +++++++++++++++++++++++++
>  tests/btrfs/group   |   2 +
>  9 files changed, 460 insertions(+), 5 deletions(-)
>  create mode 100644 common/ondisk.btrfs
>  create mode 100755 tests/btrfs/132
>  create mode 100644 tests/btrfs/132.out
>



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH 0/3] Add testcase for fundamental scrub recovery
  2016-11-28  2:50 ` [RFC PATCH 0/3] Add testcase for fundamental scrub recovery Qu Wenruo
@ 2016-11-28  3:29   ` Eryu Guan
  0 siblings, 0 replies; 9+ messages in thread
From: Eryu Guan @ 2016-11-28  3:29 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs, fstests

On Mon, Nov 28, 2016 at 10:50:30AM +0800, Qu Wenruo wrote:
> Any comment?

Sorry for the late review, I'm planning to look at them this week.

Thanks,
Eryu

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH 0/3] Add testcase for fundamental scrub recovery
  2016-11-22  8:38 [RFC PATCH 0/3] Add testcase for fundamental scrub recovery Qu Wenruo
                   ` (3 preceding siblings ...)
  2016-11-28  2:50 ` [RFC PATCH 0/3] Add testcase for fundamental scrub recovery Qu Wenruo
@ 2016-12-05  7:07 ` Eryu Guan
  2016-12-05  7:15   ` Qu Wenruo
  4 siblings, 1 reply; 9+ messages in thread
From: Eryu Guan @ 2016-12-05  7:07 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs, fstests

On Tue, Nov 22, 2016 at 04:38:08PM +0800, Qu Wenruo wrote:
> Despite the scrub test cases in fstests, there is not even one test case
> which really checked if scrub can recover data.
> 
> In fact, btrfs scrub for RAID56 will even corrupt correct data stripes.
> 
> So let's start from the needed facilities and check scrub starting from RAID1.
> 
> The main reason the patchset is RFC, is the method I take to corrupt/verify
> btrfs data.
> 
> unlike other stable fs, like ext* or xfs, which has good tool to
> manipulate the fs.
> 
> There used to be btrfs-corrupt-block, but it gets fewer and fewer
> update, no man page nor installed by default.

I really think that a tool manipulating btrfs should be developed first,
especially it's greatly demanded now and there's already a prototype
btrfs-corrupt-block available. Then this tool can act as a base &
benefit all future tests that corrupt fs metadata. e.g. Darrick
introduced blocktrash command to xfs_db and used it in many fuzz tests.

IMO, let's build and/or fix the tools first. To me, this is just the
right time to do this.

> 
> Here I take the method I normally do in my scripts: use btrfs-dump-tree
> (btrfs inspect-internal dump-tree) and use bash to corrupt btrfs
> pinpointly.
> It works quite fine for several different mount options which affects
> how data lies on-disk.
> 
> But the problem is also obvious, bash script is super hard to maintain,
> and there is no promise that btrfs-dump-tree output won't change.
> Such change will be destructive.

Agreed, it's very hard to maintain, and, as you pointed out, the output
format changing of btrfs-dump-tree makes the code even harder to read &
maintain. And it's hard to extend to work with other profiles too.

I'd like to hear what other btrfs developers and fstests users think.

Thanks,
Eryu

> 
> But IMHO it's still worthy, or btrfs scrub may break at any time.
> 
> Qu Wenruo (3):
>   fstests: common: rename _require_btrfs to _require_btrfs_subcommand
>   fstests: common/ondisk.btrfs: Introduce function to get btrfs ondisk  
>       info
>   fstests: btrfs: Add new test case to check scrub recovery and report
> 
>  common/ondisk.btrfs | 113 ++++++++++++++++++++++++++
>  common/rc           |   2 +-
>  tests/btrfs/004     |   2 +-
>  tests/btrfs/048     |   2 +-
>  tests/btrfs/059     |   2 +-
>  tests/btrfs/131     |   2 +-
>  tests/btrfs/132     | 229 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  tests/btrfs/132.out | 111 +++++++++++++++++++++++++
>  tests/btrfs/group   |   2 +
>  9 files changed, 460 insertions(+), 5 deletions(-)
>  create mode 100644 common/ondisk.btrfs
>  create mode 100755 tests/btrfs/132
>  create mode 100644 tests/btrfs/132.out
> 
> -- 
> 2.7.4
> 
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe fstests" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH 1/3] fstests: common: rename _require_btrfs to _require_btrfs_subcommand
  2016-11-22  8:38 ` [RFC PATCH 1/3] fstests: common: rename _require_btrfs to _require_btrfs_subcommand Qu Wenruo
@ 2016-12-05  7:10   ` Eryu Guan
  0 siblings, 0 replies; 9+ messages in thread
From: Eryu Guan @ 2016-12-05  7:10 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs, fstests

On Tue, Nov 22, 2016 at 04:38:09PM +0800, Qu Wenruo wrote:
> Rename _require_btrfs() to _require_btrfs_subcommand() to avoid
> confusion, as all other _require_btrfs_* has a quite clear suffix, like
> _require_btrfs_mkfs_feature() or _require_btrfs_fs_feature().
> 
> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
> ---
>  common/rc       | 2 +-
>  tests/btrfs/004 | 2 +-
>  tests/btrfs/048 | 2 +-
>  tests/btrfs/059 | 2 +-
>  tests/btrfs/131 | 2 +-
>  5 files changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/common/rc b/common/rc
> index e3b54ec..6d333dd 100644
> --- a/common/rc
> +++ b/common/rc
> @@ -3017,7 +3017,7 @@ _require_deletable_scratch_dev_pool()
>  }
>  
>  # We check for btrfs and (optionally) features of the btrfs command
> -_require_btrfs()
> +_require_btrfs_subcommand()

_require_btrfs_command seems good enough. And it should be improved to
handle subcommands like dump-tree (like what _require_xfs_io_command
does), otherwise tests fail if old version of btrfs-progs is installed
where inspect-internal doesn't have dump-tree support.

Thanks,
Eryu

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH 0/3] Add testcase for fundamental scrub recovery
  2016-12-05  7:07 ` Eryu Guan
@ 2016-12-05  7:15   ` Qu Wenruo
  0 siblings, 0 replies; 9+ messages in thread
From: Qu Wenruo @ 2016-12-05  7:15 UTC (permalink / raw)
  To: Eryu Guan, David Sterba; +Cc: linux-btrfs, fstests



At 12/05/2016 03:07 PM, Eryu Guan wrote:
> On Tue, Nov 22, 2016 at 04:38:08PM +0800, Qu Wenruo wrote:
>> Despite the scrub test cases in fstests, there is not even one test case
>> which really checked if scrub can recover data.
>>
>> In fact, btrfs scrub for RAID56 will even corrupt correct data stripes.
>>
>> So let's start from the needed facilities and check scrub starting from RAID1.
>>
>> The main reason the patchset is RFC, is the method I take to corrupt/verify
>> btrfs data.
>>
>> unlike other stable fs, like ext* or xfs, which has good tool to
>> manipulate the fs.
>>
>> There used to be btrfs-corrupt-block, but it gets fewer and fewer
>> update, no man page nor installed by default.
>
> I really think that a tool manipulating btrfs should be developed first,
> especially it's greatly demanded now and there's already a prototype
> btrfs-corrupt-block available. Then this tool can act as a base &
> benefit all future tests that corrupt fs metadata. e.g. Darrick
> introduced blocktrash command to xfs_db and used it in many fuzz tests.
>
> IMO, let's build and/or fix the tools first. To me, this is just the
> right time to do this.

Thanks for the comment.

This idea really helps.
>
>>
>> Here I take the method I normally do in my scripts: use btrfs-dump-tree
>> (btrfs inspect-internal dump-tree) and use bash to corrupt btrfs
>> pinpointly.
>> It works quite fine for several different mount options which affects
>> how data lies on-disk.
>>
>> But the problem is also obvious, bash script is super hard to maintain,
>> and there is no promise that btrfs-dump-tree output won't change.
>> Such change will be destructive.
>
> Agreed, it's very hard to maintain, and, as you pointed out, the output
> format changing of btrfs-dump-tree makes the code even harder to read &
> maintain. And it's hard to extend to work with other profiles too.

It's really a nightmare debugging the bash script, so I'm quite happy 
that I don't need to bother this code anymore.

Add Cc to David to see if he has any better idea on this.

Thanks,
Qu

>
> I'd like to hear what other btrfs developers and fstests users think.
>
> Thanks,
> Eryu
>
>>
>> But IMHO it's still worthy, or btrfs scrub may break at any time.
>>
>> Qu Wenruo (3):
>>   fstests: common: rename _require_btrfs to _require_btrfs_subcommand
>>   fstests: common/ondisk.btrfs: Introduce function to get btrfs ondisk
>>       info
>>   fstests: btrfs: Add new test case to check scrub recovery and report
>>
>>  common/ondisk.btrfs | 113 ++++++++++++++++++++++++++
>>  common/rc           |   2 +-
>>  tests/btrfs/004     |   2 +-
>>  tests/btrfs/048     |   2 +-
>>  tests/btrfs/059     |   2 +-
>>  tests/btrfs/131     |   2 +-
>>  tests/btrfs/132     | 229 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>>  tests/btrfs/132.out | 111 +++++++++++++++++++++++++
>>  tests/btrfs/group   |   2 +
>>  9 files changed, 460 insertions(+), 5 deletions(-)
>>  create mode 100644 common/ondisk.btrfs
>>  create mode 100755 tests/btrfs/132
>>  create mode 100644 tests/btrfs/132.out
>>
>> --
>> 2.7.4
>>
>>
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe fstests" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
>



^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2016-12-05  7:15 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-11-22  8:38 [RFC PATCH 0/3] Add testcase for fundamental scrub recovery Qu Wenruo
2016-11-22  8:38 ` [RFC PATCH 1/3] fstests: common: rename _require_btrfs to _require_btrfs_subcommand Qu Wenruo
2016-12-05  7:10   ` Eryu Guan
2016-11-22  8:38 ` [RFC PATCH 2/3] fstests: common/ondisk.btrfs: Introduce function to get btrfs ondisk info Qu Wenruo
2016-11-22  8:38 ` [RFC PATCH 3/3] fstests: btrfs: Add new test case to check scrub recovery and report Qu Wenruo
2016-11-28  2:50 ` [RFC PATCH 0/3] Add testcase for fundamental scrub recovery Qu Wenruo
2016-11-28  3:29   ` Eryu Guan
2016-12-05  7:07 ` Eryu Guan
2016-12-05  7:15   ` Qu Wenruo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).