On 2015-10-05 09:14, Hugo Mills wrote:
> On Mon, Oct 05, 2015 at 08:30:17AM -0400, Austin S Hemmelgarn wrote:
>> I've been having issues recently with a relatively simple setup
>> using a two device BTRFS raid1 on top of two two device md RAID0's,
>> and every time I've rebooted since starting trying to use this
>> particular filesystem, I've found it unable to mount and had to
>> recreate it from scratch. This is more of an inconvenience than
>> anything else (while I don't have backups of it, all the data is
>> trivial to recreate (in fact, so trivial that doing backups would be
>> more effort than just recreating the data by hand)), but it's still
>> something that I would like to try and fix.
>>
>> First off, general info:
>> Kernel version: 4.2.1-local+ (4.2.1 with minor modifications,
>> sources can be found here: https://github.com/ferroin/linux)
>> Btrfs-progs version: 4.2
>>
>> I would post output from btrfs fi show, but that's spouting
>> obviously wrong data (it's saying I'm using only 127MB with 2GB of
>> allocations on each 'disk', I had been storing approximately 4-6GB
>> of actual data on the filesystem).
>>
>> This particular filesystem is composed of BTRFS raid1 across two LVM
>> managed DM/MD RAID0 devices, each of which spans 2 physical hard
>> drives. I have a couple of other filesystems with the exact same
>> configuration that have not ever displayed this issue.
>>
>> When I run 'btrfs check' on the filesystem when it refuses to mount,
>> I get a number of lines like the following:
>> bad metadata [, ) crossing stripe boundary
>>
>> followed eventually by:
>> Errors found in extent allocation tree or chunk allocation
>
> I _think_ this is a bug in mkfs from 4.2.0, fixed in later
> releases of the btrfs-progs.

If so, that's good news (that is, that it's just a mkfs bug). I guess
it's time for me to quit waiting around for Gentoo to package the
newest version and build it myself.
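
For reference, the 'crossing stripe boundary' message reads as a complaint that a metadata block starts in one stripe and ends in the next. A minimal Python sketch of that kind of test follows. It assumes a 64KiB stripe length, and the function name and sample offsets are made up for illustration; this is not the btrfs-progs code.

```python
# Assumption: stripes are 64 KiB long. The name STRIPE_LEN and the
# helper below are hypothetical, chosen only for this illustration.
STRIPE_LEN = 64 * 1024


def crosses_stripe_boundary(start, length, stripe_len=STRIPE_LEN):
    """Return True if the byte range [start, start + length) touches
    more than one stripe, i.e. its first and last bytes fall in
    different stripe_len-sized stripes."""
    first_stripe = start // stripe_len
    last_stripe = (start + length - 1) // stripe_len
    return first_stripe != last_stripe


if __name__ == "__main__":
    # Illustrative 16 KiB metadata blocks at made-up offsets.
    for offset in (0, 48 * 1024, 60 * 1024, 64 * 1024):
        print(offset, crosses_stripe_boundary(offset, 16 * 1024))
```

Under these assumptions a 16KiB block at offset 60KiB straddles the 64KiB mark, while one at 48KiB ends exactly on it and does not cross.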
>
>> As is typical of a failed mount, dmesg shows a 'failed to read the
>> system array on ' 'open_ctree failed'.
>>
>> I doubt that this is a hardware issue because:
>> 1. Memory is brand new, and I ran a 48 hour burn-in test that showed
>> no errors.
>> 2. A failing storage controller, PSU, or CPU would be manifesting
>> with many more issues than just this.
>> 3. A disk failure would mean that two different disks, from
>> different manufacturing lots, are encountering errors on exactly the
>> same LBA's at exactly the same time, which while possible is
>> astronomically unlikely for disks bigger than a few hundred
>> gigabytes (the disks in question are 1TB each).
>>