From: Austin S Hemmelgarn <ahferroin7@gmail.com>
To: Chris Murphy <lists@colorremedies.com>,
Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: BTRFS raid6 unmountable after a couple of days of usage.
Date: Wed, 15 Jul 2015 07:07:47 -0400
Message-ID: <55A63F03.2080207@gmail.com>
In-Reply-To: <CAJCQCtSEk3cfwWTyzzQft1+eHcUDm_zFiBpYZM15rhZBzF3U2A@mail.gmail.com>
On 2015-07-14 19:20, Chris Murphy wrote:
> On Tue, Jul 14, 2015 at 7:25 AM, Austin S Hemmelgarn
> <ahferroin7@gmail.com> wrote:
>> On 2015-07-14 07:49, Austin S Hemmelgarn wrote:
>>>
>>> So, after experiencing this same issue multiple times (on almost a dozen
>>> different kernel versions since 4.0) and ruling out the possibility of it
>>> being caused by my hardware (or at least, the RAM, SATA controller and disk
>>> drives themselves), I've decided to report it here.
>>>
>>> The general symptom is that raid6 profile filesystems that I have are
>>> working fine for multiple weeks, until I either reboot or otherwise try to
>>> remount them, at which point the system refuses to mount them.
>>>
>>> I'm currently using btrfs-progs v4.1 with kernel 4.1.2, although I've been
>>> seeing this with versions of both since 4.0.
>>>
>>> Output of 'btrfs fi show' for the most recent fs that I had this issue
>>> with:
>>> Label: 'altroot'  uuid: 86eef6b9-febe-4350-a316-4cb00c40bbc5
>>>         Total devices 4 FS bytes used 9.70GiB
>>>         devid    1 size 24.00GiB used 6.03GiB path /dev/mapper/vg-altroot.0
>>>         devid    2 size 24.00GiB used 6.01GiB path /dev/mapper/vg-altroot.1
>>>         devid    3 size 24.00GiB used 6.01GiB path /dev/mapper/vg-altroot.2
>>>         devid    4 size 24.00GiB used 6.01GiB path /dev/mapper/vg-altroot.3
>>>
>>> btrfs-progs v4.1
>>>
>>> Each of the individual LVs in the FS is just a flat chunk of
>>> space on a separate disk from the others.
>>>
>>> The FS itself passes btrfs check just fine (no reported errors, exit value
>>> of 0), but the kernel refuses to mount it with the message 'open_ctree
>>> failed'.
>>>
>>> I've run 'btrfs rescue chunk-recover' and attached its output.
>>>
>>> Here's a link to an image from 'btrfs-image -c9 -w':
>>> https://www.dropbox.com/s/pl7gs305ej65u9q/altroot.btrfs.img?dl=0
>>> (That link will expire in 30 days, let me know if you need access to it
>>> beyond that).
>>>
>>> The filesystems in question all see relatively light but consistent usage
>>> as targets for receiving daily incremental snapshots for on-system backups
>>> (and because I know someone will mention it, yes, I do have other backups of
>>> the data, these are just my online backups).
>>>
>> Further updates, I just tried mounting the filesystem from the image above
>> again, this time passing device= options for each device in the FS, and it
>> seems to be working fine now. I've tried this with the other filesystems
>> however, and they still won't mount.
>>
>
> And it's the same message with the usual suspects: recovery,
> ro,recovery ? How about degraded even though it's not degraded? And
> what about 'btrfs rescue zero-log' ?
Yeah, same result for both, and zero-log didn't help (although that
doesn't really surprise me, since the filesystem was cleanly unmounted).
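For reference, the ladder of attempts above looks roughly like this
(device path and mountpoint are hypothetical; the commands are collected
and printed rather than executed, so the sketch doesn't touch real disks):

```shell
# Sketch of the escalating mount attempts discussed above. Nothing here
# runs against a real device; the command list is only assembled and printed.
DEV=/dev/mapper/vg-altroot.0
MNT=/mnt/altroot
ATTEMPTS=""
for opts in recovery ro,recovery degraded; do
    ATTEMPTS="$ATTEMPTS mount -o $opts $DEV $MNT;"
done
# zero-log only helps when a dirty log tree is blocking the mount, which
# shouldn't be the case after a clean unmount:
ATTEMPTS="$ATTEMPTS btrfs rescue zero-log $DEV"
echo "$ATTEMPTS"
```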
>
> Of course it's weird that btrfs check doesn't complain, but mount
> does. I don't understand that, so it's good you've got an image. If
> either recovery or zero-log fix the problem, my understanding is this
> suggests hardware did something Btrfs didn't expect.
I've run into cases like this in the past, although not recently (the
last time I remember it happening was around 3.14, I think).
Interestingly, running check --repair did fix things in those cases,
even though check hadn't complained about any issues there either.
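For anyone following along, that sequence amounts to the following
(device path hypothetical; the commands are printed rather than run,
since --repair rewrites metadata in place and you want an image of the
fs first):

```shell
# Hedged sketch of the offline check/repair sequence mentioned above.
# Printed only, never executed here, because check --repair modifies
# the filesystem and should be a last resort on an unmountable fs.
DEV=/dev/mapper/vg-altroot.0
STEPS="btrfs check $DEV
btrfs check --repair $DEV"
echo "$STEPS"
```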
I've managed to get the other filesystems I was having issues with
mounted again using the device= options and clear_cache, after running
btrfs dev scan a couple of times. From what I'm seeing, it looks like
some metadata isn't being synchronized properly between the disks. I've
heard from multiple sources of similar issues happening occasionally
with raid1 back around kernel 3.16-3.17, where passing a different
device to mount helped.
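In case it saves anyone else some typing, the workaround boils down to
spelling out every member device on the mount command line; a sketch
(mountpoint hypothetical, option string printed rather than used):

```shell
# Build an explicit device= option list for the 4-device fs above, plus
# clear_cache to force the free-space cache to be rebuilt on mount. The
# resulting command line is printed, not passed to a real mount.
OPTS=""
for dev in /dev/mapper/vg-altroot.0 /dev/mapper/vg-altroot.1 \
           /dev/mapper/vg-altroot.2 /dev/mapper/vg-altroot.3; do
    OPTS="${OPTS:+$OPTS,}device=$dev"
done
OPTS="$OPTS,clear_cache"
echo "btrfs device scan; mount -o $OPTS /dev/mapper/vg-altroot.0 /mnt/altroot"
```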