From: Goffredo Baroncelli <kreijack@libero.it>
To: Lutz Vieweg <lvml@5t9.de>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: Does btrfs "raid1" actually provide any resilience?
Date: Thu, 14 Nov 2013 19:22:02 +0100
Message-ID: <528514CA.8080903@libero.it>
In-Reply-To: <l62akc$ujc$1@ger.gmane.org>
On 2013-11-14 12:02, Lutz Vieweg wrote:
> Hi,
>
> on a server that so far uses an MD RAID1 with XFS on it we wanted
> to try btrfs, instead.
>
> But even the most basic check for btrfs actually providing
> resilience against one of the physical storage devices failing
> yields a "does not work" result - so I wonder whether I misunderstood
> that btrfs is meant to not require block-device level RAID
> functionality underneath.

I don't think you have misunderstood btrfs. As far as I know, you are
right.

With kernel v3.11.6 I ran your test and got the following:

- 2 disks of 100M each and 1 file of 70M: I was *unable* to create the
file because I got "No space left on device". I was not surprised: BTRFS
behaves badly when free space is low. However, I was able to remove a
disk and remount the filesystem in "degraded" mode.
- 2 disks of 3G each and 1 file of 100M: I was *able* to create the file,
and to remount the filesystem in degraded mode after I removed a disk.

Note: in either case I needed to mount the filesystem in read-only mode.

I will also try with a 3.12 kernel.
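
For reference, the second test (2 x 3G, 100M file) can be sketched roughly
as below. This is only a sketch under assumptions: the image paths, the
mount point, the use of sparse files via truncate, the final md5sum read
check (taken from the quoted procedure) and the run() dry-run helper are
illustrative choices, not the exact commands that were used. By default it
only prints the commands; set DRY_RUN=0 and run it as root to execute them.

```shell
#!/bin/sh
# Sketch of the 2 x 3G degraded-mount test. Paths and the run() helper are
# illustrative assumptions; DRY_RUN=1 (the default) only prints the commands.
DRY_RUN="${DRY_RUN:-1}"
IMG1=/tmp/img1
IMG2=/tmp/img2
MNT=/mnt/tmp

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

degraded_test() {
    # Two sparse 3G backing files attached as loop devices.
    run truncate -s 3G "$IMG1" "$IMG2"
    run losetup /dev/loop1 "$IMG1"
    run losetup /dev/loop2 "$IMG2"
    run mkfs.btrfs -d raid1 -m raid1 /dev/loop1 /dev/loop2
    run mount -t btrfs /dev/loop1 "$MNT"
    run dd if=/dev/zero of="$MNT/testfile" bs=1024k count=100
    run umount "$MNT"
    # Simulate the loss of one device, then mount the survivor read-only.
    run losetup -d /dev/loop1
    run mount -t btrfs -o degraded,ro /dev/loop2 "$MNT"
    run md5sum "$MNT/testfile"
}

degraded_test
```
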
BR
G.Baroncelli
>
> Here is the test procedure:
>
> Testing was done using vanilla linux-3.12 (x86_64) plus btrfs-progs at
> commit c652e4efb8e2dd76ef1627d8cd649c6af5905902.
>
> Preparing two 100 MB image files:
>> # dd if=/dev/zero of=/tmp/img1 bs=1024k count=100
>> 100+0 records in
>> 100+0 records out
>> 104857600 bytes (105 MB) copied, 0.201003 s, 522 MB/s
>>
>> # dd if=/dev/zero of=/tmp/img2 bs=1024k count=100
>> 100+0 records in
>> 100+0 records out
>> 104857600 bytes (105 MB) copied, 0.185486 s, 565 MB/s
>
> Preparing two loop devices on those images to act as the underlying
> block devices for btrfs:
>> # losetup /dev/loop1 /tmp/img1
>> # losetup /dev/loop2 /tmp/img2
>
> Preparing the btrfs filesystem on the loop devices:
>> # mkfs.btrfs --data raid1 --metadata raid1 --label test /dev/loop1 /dev/loop2
>> SMALL VOLUME: forcing mixed metadata/data groups
>>
>> WARNING! - Btrfs v0.20-rc1-591-gc652e4e IS EXPERIMENTAL
>> WARNING! - see http://btrfs.wiki.kernel.org before using
>>
>> Performing full device TRIM (100.00MiB) ...
>> Turning ON incompat feature 'mixed-bg': mixed data and metadata block groups
>> Created a data/metadata chunk of size 8388608
>> Performing full device TRIM (100.00MiB) ...
>> adding device /dev/loop2 id 2
>> fs created label test on /dev/loop1
>> nodesize 4096 leafsize 4096 sectorsize 4096 size 200.00MiB
>> Btrfs v0.20-rc1-591-gc652e4e
>
> Mounting the btrfs filesystem:
>> # mount -t btrfs /dev/loop1 /mnt/tmp
>
> Copying just 70MB of zeroes into a test file:
>> # dd if=/dev/zero of=/mnt/tmp/testfile bs=1024k count=70
>> 70+0 records in
>> 70+0 records out
>> 73400320 bytes (73 MB) copied, 0.0657669 s, 1.1 GB/s
>
> Checking that the testfile can be read:
>> # md5sum /mnt/tmp/testfile
>> b89fdccdd61d57b371f9611eec7d3cef /mnt/tmp/testfile
>
> Unmounting before further testing:
>> # umount /mnt/tmp
>
>
> Now we assume that one of the two "storage devices" is broken,
> so we remove one of the two loop devices:
>> # losetup -d /dev/loop1
>
> Trying to mount the btrfs filesystem from the one storage device that is
> left:
>> # mount -t btrfs -o device=/dev/loop2,degraded /dev/loop2 /mnt/tmp
>> mount: wrong fs type, bad option, bad superblock on /dev/loop2,
>> missing codepage or helper program, or other error
>> In some cases useful info is found in syslog - try
>> dmesg | tail or so
> ... does not work.
>
> In /var/log/messages we find:
>> kernel: btrfs: failed to read chunk root on loop2
>> kernel: btrfs: open_ctree failed
>
> (The same happens when adding ",ro" to the mount options.)
>
> Ok, so if the first of two disks was broken, so is our filesystem.
> Isn't that what RAID1 should prevent?
>
> We tried a different scenario, now the first disk remains
> but the second is broken:
>
>> # losetup -d /dev/loop2
>> # losetup /dev/loop1 /tmp/img1
>>
>> # mount -t btrfs -o degraded /dev/loop1 /mnt/tmp
>> mount: wrong fs type, bad option, bad superblock on /dev/loop1,
>> missing codepage or helper program, or other error
>> In some cases useful info is found in syslog - try
>> dmesg | tail or so
>>
>> In /var/log/messages:
>> kernel: Btrfs: too many missing devices, writeable mount is not allowed
>
> The message is different, but still unsatisfactory: Not being
> able to write to a RAID1 because one out of two disks failed
> is not what one would expect - the machine should remain fully
> operable with a degraded RAID1.
>
> But let's see whether at least a read-only mount works:
>> # mount -t btrfs -o degraded,ro /dev/loop1 /mnt/tmp
> The mount command itself does work.
>
> But then:
>> # md5sum /mnt/tmp/testfile
>> md5sum: /mnt/tmp/testfile: Input/output error
>
> The testfile is not readable anymore. (At this point, no messages
> are to be found in dmesg/syslog - I would expect such on an
> input/output error.)
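
One way to narrow this down, offered as an assumption and not as a known
diagnosis: while the filesystem is mounted, "btrfs filesystem df" reports
the profile of each block group type, so a group that is not listed as
RAID1 would not be mirrored. The helper name non_raid1_groups below is
hypothetical, and the parsing assumes output lines of the form
"Type, PROFILE: total=..., used=...".

```shell
#!/bin/sh
# Hypothetical helper: read `btrfs filesystem df` output on stdin and print
# every block group line whose profile is not RAID1.
# Assumes lines shaped like "Data, RAID1: total=..., used=...".
# A GlobalReserve line, if present, is skipped because it is not a
# mirrored block group type.
non_raid1_groups() {
    awk -F': ' '
        /: total=/ && $1 !~ /^GlobalReserve/ {
            n = split($1, parts, ", ")   # "Data, RAID1" -> type, profile
            if (parts[n] != "RAID1") print
        }'
}

# Usage (needs a mounted filesystem, so it is left commented out):
#   btrfs filesystem df /mnt/tmp | non_raid1_groups
```

An empty result would mean every reported block group type uses the RAID1
profile; any printed line would be worth a closer look.
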
>
> So the bottom line is: All the double writing that comes with RAID1
> mode did not provide any useful resilience.
>
> I am kind of sure this is not as intended, or is it?
>
> Regards,
>
> Lutz Vieweg
--
gpg @keyserver.linux.it: Goffredo Baroncelli (kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5