public inbox for linux-btrfs@vger.kernel.org
From: wangyf <wangyf-fnst@cn.fujitsu.com>
To: Omar Sandoval <osandov@fb.com>, <linux-btrfs@vger.kernel.org>
Cc: Miao Xie <miaoxie@huawei.com>, Zhao Lei <zhaolei@cn.fujitsu.com>,
	Philip <bugzilla@philip-seeger.de>
Subject: Re: [PATCH v2 0/5] Btrfs: RAID 5/6 missing device scrub+replace
Date: Tue, 23 Jun 2015 11:07:00 +0800
Message-ID: <5588CD54.9050405@cn.fujitsu.com>
In-Reply-To: <cover.1434739053.git.osandov@fb.com>

Hi,
I tested your v2 patches, but ran into problems.

kernel: 4.1.0-rc7+ with your five patches
VirtualBox, ubuntu14.10-server + LVM

I built a new btrfs.ko with your patches, then removed the original
module with rmmod and loaded the new one with insmod.
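
For readers who want to repeat the module swap described above, here is a minimal sketch. It is not the exact procedure used for this report: KSRC is a hypothetical path to the patched kernel tree, and the tree is assumed to be configured for the running kernel. By default the script only prints the commands; set RUN=1 (as root) to execute them.

```shell
#!/bin/bash
# Sketch only: print (or, with RUN=1, execute) the steps for swapping in a
# rebuilt btrfs.ko. KSRC is a hypothetical path to the patched kernel tree.
KSRC="${KSRC:-$HOME/linux}"

# Echo a command, and run it only when RUN=1.
step () {
	echo "+ $*"
	if [ "${RUN:-0}" = 1 ]; then
		"$@"
	fi
}

reload_btrfs () {
	# Rebuild only the btrfs module inside the patched tree.
	step make -C "$KSRC" M=fs/btrfs modules &&
	# Unload the running module; this fails while any btrfs fs is mounted.
	step rmmod btrfs &&
	# Load the freshly built module.
	step insmod "$KSRC/fs/btrfs/btrfs.ko"
}

reload_btrfs
```

Note that insmod does not resolve module dependencies, so anything btrfs.ko depends on must already be loaded.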

With the RAID1/10 profiles, mkfs succeeds,
but when I mount the fs, dmesg shows:
     trans: 18446612133975020584 running 5
     btrfs transid mismatch buffer 29507584, found 18446612133975020584 running 5
     btrfs transid mismatch buffer 29507584, found 18446612133975020584 running 5
     btrfs transid mismatch buffer 29507584, found 18446612133975020584 running 5
... ...
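
The "found" value is easier to read in hex. The sketch below converts it; to_hex is a hypothetical helper name, not part of any btrfs tool. The result, ffff880062fdd428, lies in the address range that x86_64 kernels of this era used for the direct mapping of physical memory (starting at 0xffff880000000000). That may mean a pointer is being read where a transaction ID is expected, but this is only an interpretation and is not confirmed anywhere in this message.

```shell
#!/bin/bash
# Print a decimal value taken from dmesg as hexadecimal.
# to_hex is a hypothetical helper name, not part of any btrfs tool.
to_hex () {
	printf '%x\n' "$1"
}

# The "found" transid reported in the dmesg output above.
to_hex 18446612133975020584
```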

With RAID5/6, I ran mkfs and mount, and the system
stopped at the 'mount -t btrfs /dev/mapper/server-dev1 /mnt' command.

That's all.

On 2015-06-20 02:52, Omar Sandoval wrote:
> Hi,
>
> Here's version 2 of the missing device RAID 5/6 fixes. The original
> problem was reported by a user on Bugzilla: the kernel crashed when
> attempting to replace a missing device in a RAID 6 filesystem. This is
> detailed and fixed in patch 4. After the initial posting, Zhao Lei
> reported a similar issue when doing a scrub on a RAID 5 filesystem with
> a missing device. This is fixed in the added patch 5.
>
> My new-and-improved-and-overengineered reproducer as well as Zhao Lei's
> reproducer can be found below.
>
> Thanks!
>
> v1: http://article.gmane.org/gmane.comp.file-systems.btrfs/45045
> v1->v2:
> - Add missing scrub_wr_submit() in scrub_missing_raid56_worker()
> - Add clarifying comment in dev->missing case of scrub_stripe()
>    (Zhaolei)
> - Add fix for scrub with missing device (patch 5)
>
> Omar Sandoval (5):
>    Btrfs: remove misleading handling of missing device scrub
>    Btrfs: count devices correctly in readahead during RAID 5/6 replace
>    Btrfs: add RAID 5/6 BTRFS_RBIO_REBUILD_MISSING operation
>    Btrfs: fix device replace of a missing RAID 5/6 device
>    Btrfs: fix parity scrub of RAID 5/6 with missing device
>
>   fs/btrfs/raid56.c |  87 ++++++++++++++++++++---
>   fs/btrfs/raid56.h |  10 ++-
>   fs/btrfs/reada.c  |   4 +-
>   fs/btrfs/scrub.c  | 202 +++++++++++++++++++++++++++++++++++++++++++++---------
>   4 files changed, 259 insertions(+), 44 deletions(-)
>
> Reproducer 1:
>
> ----
> #!/bin/bash
>
> usage () {
> 	USAGE_STRING="Usage: $0 [OPTION]...
> Options:
>    -m    failure mode; MODE is 'eio', 'missing', or 'corrupt' (defaults to
>          'missing')
>    -n    number of files to write, each twice as big as the last, the first
>          being 1M in size (defaults to 4)
>    -o    operation to perform; OP is 'replace' or 'scrub' (defaults to
>          'replace')
>    -r    RAID profile; RAID is 'raid0', 'raid1', 'raid10', 'raid5', or 'raid6'
>          (defaults to 'raid5')
>
> Miscellaneous:
>    -h    display this help message and exit"
>
> 	case "$1" in
> 		out)
> 			echo "$USAGE_STRING"
> 			exit 0
> 			;;
> 		err)
> 			echo "$USAGE_STRING" >&2
> 			exit 1
> 			;;
> 	esac
> }
>
> MODE=missing
> RAID=raid5
> OP=replace
> NUM_FILES=4
>
> while getopts "m:n:o:r:h" OPT; do
> 	case "$OPT" in
> 		m)
> 			MODE="$OPTARG"
> 			;;
> 		r)
> 			RAID="$OPTARG"
> 			;;
> 		o)
> 			OP="$OPTARG"
> 			;;
> 		n)
> 			NUM_FILES="$OPTARG"
> 			if [[ ! "$NUM_FILES" =~ ^[0-9]+$ ]]; then
> 				usage "err"
> 			fi
> 			;;
> 		h)
> 			usage "out"
> 			;;
> 		*)
> 			usage "err"
> 			;;
> 	esac
> done
>
> case "$MODE" in
> 	eio|missing|corrupt)
> 		;;
> 	*)
> 		usage err
> 		;;
> esac
>
> case "$RAID" in
> 	raid[01])
> 		NUM_RAID_DISKS=2
> 		;;
> 	raid10)
> 		NUM_RAID_DISKS=4
> 		;;
> 	raid5)
> 		NUM_RAID_DISKS=3
> 		;;
> 	raid6)
> 		NUM_RAID_DISKS=4
> 		;;
> 	*)
> 		usage err
> 		;;
> esac
>
> case "$OP" in
> 	replace)
> 		NUM_DISKS=$((NUM_RAID_DISKS + 1))
> 		;;
> 	scrub)
> 		NUM_DISKS=$NUM_RAID_DISKS
> 		;;
> 	*)
> 		usage err
> 		;;
> esac
>
> echo "Running $OP on $RAID with $MODE"
>
> SRC_DISK=$((NUM_RAID_DISKS - 1))
> TARGET_DISK=$((NUM_DISKS - 1))
> NUM_SECTORS=$((1024 * 1024))
> LOOP_DEVICES=()
> DM_DEVICES=()
>
> cleanup () {
> 	echo "Done. Press enter to cleanup..."
> 	read
> 	if findmnt /mnt; then
> 		umount /mnt
> 	fi
> 	for DM in "${DM_DEVICES[@]}"; do
> 		dmsetup remove "$DM"
> 	done
> 	for LOOP in "${LOOP_DEVICES[@]}"; do
> 		losetup --detach "$LOOP"
> 	done
> 	for ((i = 0; i < NUM_DISKS; i++)); do
> 		rm -f disk${i}.img
> 	done
> }
> trap 'cleanup; exit 1' ERR
>
> echo "Creating disk images..."
> for ((i = 0; i < NUM_DISKS; i++)); do
> 	rm -f disk${i}.img
> 	dd if=/dev/zero of=disk${i}.img bs=512 seek=$NUM_SECTORS count=0
> 	LOOP_DEVICES+=("$(losetup --find --show disk${i}.img)")
> done
>
> echo "Creating device-mapper devices..."
> for LOOP in "${LOOP_DEVICES[@]}"; do
> 	DM="${LOOP/\/dev\/loop/dm}"
> 	dmsetup create "$DM" --table "0 $NUM_SECTORS linear $LOOP 0"
> 	DM_DEVICES+=("$DM")
> done
>
> echo "Creating filesystem..."
> FS_DEVICES=("${DM_DEVICES[@]:0:$NUM_RAID_DISKS}")
> FS_DEVICES=("${FS_DEVICES[@]/#//dev/mapper/}")
> echo "${FS_DEVICES[@]}"
> MOUNT_DEVICE="${FS_DEVICES[$(((SRC_DISK + 1) % NUM_RAID_DISKS))]}"
> mkfs.btrfs -d "$RAID" -m "$RAID" "${FS_DEVICES[@]}"
> mount "$MOUNT_DEVICE" /mnt
> for ((i = 0; i < NUM_FILES; i++)); do
> 	dd if=/dev/urandom of=/mnt/file$i bs=1M count=$((1 << $i))
> done
> sync
>
> case "$MODE" in
> 	eio)
> 		echo "Killing disk..."
> 		dmsetup suspend "${DM_DEVICES[$SRC_DISK]}"
> 		dmsetup reload "${DM_DEVICES[$SRC_DISK]}" --table "0 $NUM_SECTORS error"
> 		dmsetup resume "${DM_DEVICES[$SRC_DISK]}"
> 		;;
> 	missing)
> 		echo "Removing disk and remounting degraded..."
> 		umount /mnt
> 		dmsetup remove "${DM_DEVICES[$SRC_DISK]}"
> 		unset "DM_DEVICES[$SRC_DISK]"
> 		mount -o degraded "$MOUNT_DEVICE" /mnt
> 		;;
> 	corrupt)
> 		echo "Corrupting disk and remounting degraded..."
> 		umount /mnt
> 		dd if=/dev/zero of=/dev/mapper/"${DM_DEVICES[$SRC_DISK]}" bs=1M count=1
> 		mount -o degraded "$MOUNT_DEVICE" /mnt
> 		;;
> esac
>
> case "$OP" in
> 	replace)
> 		echo "Replacing disk..."
> 		btrfs replace start -B $((SRC_DISK + 1)) /dev/mapper/"${DM_DEVICES[$TARGET_DISK]}" /mnt
> 		;;
> 	scrub)
> 		echo "Scrubbing filesystem..."
> 		btrfs scrub start -B /mnt
> 		;;
> esac
>
> echo "Scrubbing to double-check..."
> btrfs scrub start -Br /mnt
>
> cleanup
> ----
>
> Reproducer 2:
>
> ----
> #!/bin/bash
>
> FS_DEVS=(/dev/vdb /dev/vdc /dev/vdd)
> PRUNE_DEV=/dev/vdc
> MNT=/mnt
>
> do_cmd()
> {
> 	echo "   $*"
> 	local output
> 	local ret
> 	output=$("$@" 2>&1)
> 	ret="$?"
> 	[[ "$ret" != 0 ]] && {
> 		echo "$output"
> 	}
> 	return "$ret"
> }
>
> mkdir -p "$MNT"
> for ((i = 0; i < 10; i++)); do
> 	umount "$MNT" &>/dev/null
> done
> dmesg -c >/dev/null
>
> echo "1: Creating filesystem"
> do_cmd mkfs.btrfs -f -d raid5 -m raid5 "${FS_DEVS[@]}" || exit 1
> do_cmd mount "${FS_DEVS[0]}" "$MNT" || exit 1
>
> echo "2: Write some data"
> DATA_CNT=4
> for ((i = 0; i < DATA_CNT; i++)); do
> 	size_m="$((1<<i))"
> 	do_cmd dd bs=1M if=/dev/urandom of="$MNT"/file_"$i" count="$size_m" || exit 1
> done
>
> echo "3: Prune a disk in fs"
> do_cmd umount "$MNT" || exit 1
> do_cmd dd bs=1M if=/dev/zero of="$PRUNE_DEV" count=1
> do_cmd mount -o "degraded" "${FS_DEVS[0]}" "$MNT" || exit 1
>
> echo "4: Do scrub"
> do_cmd btrfs scrub start -B "$MNT"
>
> echo "5: Checking result"
> dmesg --color
>
> exit 0
> ----
>



Thread overview: 13+ messages
2015-06-19 18:52 [PATCH v2 0/5] Btrfs: RAID 5/6 missing device scrub+replace Omar Sandoval
2015-06-19 18:52 ` [PATCH v2 1/5] Btrfs: remove misleading handling of missing device scrub Omar Sandoval
2015-06-19 18:52 ` [PATCH v2 2/5] Btrfs: count devices correctly in readahead during RAID 5/6 replace Omar Sandoval
2015-06-19 18:52 ` [PATCH v2 3/5] Btrfs: add RAID 5/6 BTRFS_RBIO_REBUILD_MISSING operation Omar Sandoval
2015-06-19 18:52 ` [PATCH v2 4/5] Btrfs: fix device replace of a missing RAID 5/6 device Omar Sandoval
2015-06-19 18:52 ` [PATCH v2 5/5] Btrfs: fix parity scrub of RAID 5/6 with missing device Omar Sandoval
2015-06-23  3:07 ` wangyf [this message]
2015-06-24  4:15   ` [PATCH v2 0/5] Btrfs: RAID 5/6 missing device scrub+replace Omar Sandoval
2015-06-24 12:00     ` Ed Tomlinson
2015-06-25  5:03       ` wangyf
2015-06-25 16:35         ` Omar Sandoval
2015-06-26  2:07           ` wangyf
2015-06-26  2:46           ` wangyf
