From: Austin S Hemmelgarn <ahferroin7@gmail.com>
To: "Miguel Negrão" <miguel.negrao-lists@friendlyvirus.org>,
linux-btrfs@vger.kernel.org
Subject: Re: Btrfs tragedy: lack of space for metadata leads to loss of fs.
Date: Tue, 25 Aug 2015 10:00:02 -0400
Message-ID: <55DC74E2.9000909@gmail.com>
In-Reply-To: <loom.20150825T151154-787@post.gmane.org>
On 2015-08-25 09:44, Miguel Negrão wrote:
> Hi list,
>
> This weekend I had my first btrfs horror story.
>
> system: 3.13.0-49-lowlatency, btrfs-progs v4.1.2
>
> A disclaimer: I know 3.13 is very out of date, but the requirement of
> keeping the kernel up to date clashes with my requirement of keeping a stable
> system. At the moment I can't disturb my system as I'm doing important work;
> upgrading the kernel requires upgrading Ubuntu, which will upgrade a lot of
> packages and might lead to problems which I don't have time to fix. One
> might argue that in the end I lost time anyway dealing with these btrfs
> issues. When I'm done with this current work I will update the whole system
> which will update the kernel in the process.
>
> Story:
>
> btrfs fi show / -> devid 1 size 92.27GiB used 92.27GiB.
>
> Suddenly a 100GB single-device fs goes into a state where it has no
> more free space and no new files can be written. 'fi usage' says a minimum of
> 20GB is free, but metadata is 4.97GiB allocated vs 4.45GiB used. I decide
> to do a 'balance -dusage=55' as lower values of usage don't balance
> anything. This starts a balance of '0 out of 0 chunks' which goes on for 24h
> (status always says 0 considered, -nan% left; dmesg had only 'relocating
> block group 32...... flags 36'). This is an OCZ Vertex 3, a quite fast SSD.
> 24h seemed excessive to me, so I assume that the balance has gone wrong
> somehow. I shut down the system to see if the balance will stop. On the first
> boot the fs still mounts; on a second boot the fs no longer mounts.
>
> I switch to a nixos usb pen running kernel 4.1.6 and progs up to date also
> (probably 4.1.x). Trying to mount the fs results in a kernel error (
> http://pastebin.com/CzryecsX ).
>
> Trying to mount with '-o recovery,ro' hangs the system; a reboot is needed.
>
> I proceed to get contents of disk via 'restore'. I then do btrfs-zero-log,
> still doesn't mount, and then do btrfs check --repair a couple of times (log
> before running with '--repair' http://pastebin.com/VPZLjcXR)
>
> I then try to mount with '-o recovery,ro' and it works !! (thank you btrfs
> check !!). I proceed to get the data out of the disk, this seems to go well,
> no errors in dmesg. I then try to mount without recovery,ro and again the
> kernel hangs. One time I had the dmesg window open and was able to see that
> it was something about ..._async_reclaim_metadata_space.
>
> I finally give up on the filesystem and format it. I have an image produced
> via btrfs-image if it is of interest.
>
> A couple of notes:
>
> 1) I know btrfs is experimental, anything that I might have lost is my fault
> (I didn't lose anything).
> 2) I think this free space issue is a known issue being worked on: having
> no space left to allocate when metadata space is needed causes a 'no free
> space' problem. It's in the FAQ. Still, it doesn't seem acceptable to me that the
> system can go into a state where in the end the fs is destroyed or the linux
> machine is unusable (can never turn it off because balance never stops),
> especially since there was still a lot of space on the disk. If the system
> somehow had rebalanced itself automatically this could have been avoided. It
> shouldn't let itself get into this corner. Perhaps a bit of space should
> always be kept for emergency rebalancing ?
> 3) I wonder if that balance would have ended or if it had just stalled. It
> seems that a reboot with an ongoing or stalled balance will cause fs
> destruction. Possibly an issue already fixed in later kernels. Also, why did
> it say 0 considered, -nan% left ? -nan% looks strange.
> 4) The system freezing when trying to mount a fs is possibly not supposed to
> happen ? (this was on the latest kernel)
>
> Just wanted to share the story in case it is of some use for developers.
> Again, possibly this happened due to issues already fixed in later kernels.
> Anyway, despite this issue, I'm still quite happy with btrfs overall.
>
One comment I would like to make about this: I have heard numerous
stories of OCZ brand SSDs having significant data corruption issues
(along the lines of writes reporting success when they really failed,
and blocks that are in use getting randomly erased) that can cause
severe data loss and filesystem problems. While I do think that btrfs
needs to cope better when faced with such things, I would not at all be
surprised if the SSD was the root cause of the issue.