From: Wang Yugui <wangyugui@e16-tech.com>
To: Qu Wenruo <wqu@suse.com>
Cc: Forza <forza@tnonline.net>, Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: What mechanisms protect against split brain?
Date: Wed, 08 Jun 2022 18:15:02 +0800
Message-ID: <20220608181502.4AB1.409509F4@e16-tech.com>
In-Reply-To: <20220608104421.3759.409509F4@e16-tech.com>

Hi, Forza, Qu Wenruo

I wrote a script to test RAID1 split brain, based on Qu's RAID5 work (*1):
*1: https://lore.kernel.org/linux-btrfs/53f7bace2ac75d88ace42dd811d48b7912647301.1654672140.git.wqu@suse.com/T/#u

#!/bin/bash
set -uxe -o pipefail

mnt=/mnt/test
dev1=/dev/vdb1
dev2=/dev/vdb2

  dmesg -C
  mkdir -p $mnt

  mkfs.btrfs -f -m raid1 -d raid1 $dev1 $dev2
  mount $dev1 $mnt
  xfs_io -f -c "pwrite -S 0xee 0 1M" $mnt/file1
  sync
  umount $mnt

  btrfs dev scan -u $dev2
  mount -o degraded $dev1 $mnt
  #xfs_io -f -c "pwrite -S 0xff 0 128M" $mnt/file2
  mkdir -p $mnt/branch1; /bin/cp -R /usr/bin $mnt/branch1 # more complex workload than xfs_io
  umount $mnt

  btrfs dev scan
  btrfs dev scan -u $dev1
  mount -o degraded $dev2 $mnt
  #xfs_io -f -c "pwrite -S 0xff 0 128M" $mnt/file2
  mkdir -p $mnt/branch2; /bin/cp -R /usr/lib64 $mnt/branch2 # more complex workload than xfs_io
  umount $mnt

  btrfs dev scan
  mount $dev1 $mnt # *1
  ls $mnt

  btrfs balance start --full-balance $mnt # *2
  #btrfs scrub start -B $mnt  # *3
  #btrfs scrub start $mnt; sleep 2; btrfs scrub status $mnt; btrfs scrub start -B $mnt; # *4

  umount $mnt

Test result:
The script can fail at any of the steps marked # *1, # *2, # *3 or # *4, with different frequencies.

dmesg output:
1)
[ 1379.124079] BTRFS error (device vdb1): tree level mismatch detected, bytenr=31866880 level expected=1 has=0
[ 1379.127928] BTRFS error (device vdb1): tree level mismatch detected, bytenr=31866880 level expected=1 has=0
[ 1379.132109] BTRFS error (device vdb1: state C): failed to load root csum
[ 1379.137281] BTRFS error (device vdb1: state C): open_ctree failed

2)
[ 2950.467178] BTRFS error (device vdb1): tree first key mismatch detected, bytenr=32342016 parent_transid=9 key expected=(301555712,168,106496) has=(2552,96,5)
[ 2950.471283] BTRFS error (device vdb1): tree first key mismatch detected, bytenr=32342016 parent_transid=9 key expected=(301555712,168,106496) has=(2552,96,5)
[ 2950.479960] BTRFS info (device vdb1): balance: ended with status: -117
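
Since the failures show up at different steps with different frequencies, it can help to tally the kernel messages per run. The following is only a sketch and not part of the test script above: the function name classify_btrfs_errors and the four category labels are made up for illustration, and the patterns are taken from the dmesg lines quoted here. Status -117 corresponds to -EUCLEAN on Linux.

```shell
#!/bin/sh
# Hypothetical helper: count btrfs failure kinds in kernel log text read from stdin.
# The patterns match the message texts quoted in this mail; other kernels may word them differently.
classify_btrfs_errors() {
  awk '
    /tree level mismatch detected/     { n["level_mismatch"]++ }
    /tree first key mismatch detected/ { n["first_key_mismatch"]++ }
    /open_ctree failed/                { n["open_ctree_failed"]++ }
    /balance: ended with status: -/    { n["balance_failed"]++ }
    END {
      # Print the counters in a fixed order; categories never seen print 0.
      split("level_mismatch first_key_mismatch open_ctree_failed balance_failed", order, " ")
      for (i = 1; i <= 4; i++) printf "%s=%d\n", order[i], n[order[i]]
    }
  '
}
```

After a run, something like `dmesg | classify_btrfs_errors` prints one counter per line, which makes it easier to compare many iterations of the test.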

So the RAID1 split-brain case is not yet supported by btrfs.
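
As a manual precaution before mounting both devices together again, one could compare the superblock generation of each device. This is a rough sketch, not an existing btrfs safeguard: the helper names super_generation and compare_generations are hypothetical, and the parsing assumes that `btrfs inspect-internal dump-super` prints a line whose first field is "generation" followed by the number. Note that equal generations do not prove the two copies agree, because in the test above both devices were written independently.

```shell
#!/bin/sh
# Hypothetical helper: extract the superblock generation from dump-super output on stdin.
# Assumes a line of the form "generation <number>"; fields such as chunk_root_generation are ignored.
super_generation() {
  awk '$1 == "generation" { print $2; exit }'
}

# Hypothetical helper: describe how two generation numbers relate.
compare_generations() {
  gen1=$1
  gen2=$2
  if [ "$gen1" -eq "$gen2" ]; then
    echo "equal"
  elif [ "$gen1" -gt "$gen2" ]; then
    echo "first-newer"
  else
    echo "second-newer"
  fi
}

# Example use (needs root and real devices, so it is left commented out):
# g1=$(btrfs inspect-internal dump-super "$dev1" | super_generation)
# g2=$(btrfs inspect-internal dump-super "$dev2" | super_generation)
# compare_generations "$g1" "$g2"
```

If both devices were mounted degraded and written separately, any result here should be treated as a warning sign rather than as a verdict on which copy is good.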

Best Regards
Wang Yugui (wangyugui@e16-tech.com)
2022/06/08

> Hi,
> 
> I ran some tests for this case.
> 
> After the missing RAID1 device is re-introduced,
> 1, mount/read seem to work.
>    Checksum-based error detection helps.
>    The current pid-based I/O path selection policy may help too.
>        preferred_mirror = first + (current->pid % num_stripes);
> 
> 2, 'btrfs scrub' failed to finish.
>     Any advice on how to return to a clean state?
> 
> Best Regards
> Wang Yugui (wangyugui@e16-tech.com)
> 2022/06/08
> 
> > Hi,
> > 
> > Recently there have been some discussions, both here on the mailing list and on #btrfs IRC, about the consequences of mounting one RAID1 mirror as degraded and then later re-introducing the missing device, and also about having the degraded mount option in fstab and on the kernel command line.
> > 
> > So I wonder if Btrfs has some protective mechanisms against data loss/corruption if a drive is missing for a bit but later re-introduced. There is also the case of split brain where each mirror might be independently updated and then recombined.
> > 
> > Is there an official recommendation regarding degraded mounts from the kernel command line? I understand the use case, as it allows the system to boot even if a device goes missing or dies after a reboot.
> > 
> > Thanks,
> > Forza
> 



