From: Robert White <rwhite@pobox.com>
To: Tomasz Chmielewski <tch@virtall.com>
Cc: Josef Bacik <jbacik@fb.com>, linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: 3.18.0: kernel BUG at fs/btrfs/relocation.c:242!
Date: Sat, 13 Dec 2014 01:39:14 -0800
Message-ID: <548C0942.6020304@pobox.com>
In-Reply-To: <016ec9492d0340f8ac439ba4bcec03ad@admin.virtall.com>
On 12/13/2014 12:16 AM, Tomasz Chmielewski wrote:
> On 2014-12-12 23:58, Robert White wrote:
>
>> I don't have the history to answer this definitively, but I don't
>> think you have a choice. Nothing else is going to touch that error.
>>
>> I have not seen any "oh my god, btrfsck just ate my filesystem" errors
>> since I joined the list -- but I am a relative newcomer.
>>
>> I know that you, of course, as a conscientious and well-traveled system
>> administrator, already have a current backup since you are doing
>> storage maintenance... right? 8-)
>
> Who needs backups with btrfs, right? :)
>
> So apparently btrfsck --repair fixed some issues, the fs is still
> mountable and looks fine.
>
> Running balance again, but that will take many days there.

Might I ask why you are running balance? After a persistent error I'd
understand going straight to scrub, but balance is usually for
transformation or to redistribute things after atypical use.

An entire generation of folks have grown used to defragging Windows boxes
and all, but if you've already got an array that is going to take "many
days" to balance, what benefit do you actually expect to receive?

Defrag -- used for "I think I'm getting a lot of unnecessary head seek
in this application, these files need to be brought into closer order".

Scrub -- used for defensive checking a la checkdisk. "I suspect that
after that unexpected power outage something may be a little off", or
alternately "I think my disks are giving me bitrot, I better check".

Btrfsck -- used for "I suspect structural problems caused by real world
events like power hits or that one time when the cat knocked over my
tower case while I was vacuuming all my sql tables." (often reserved for
"hey, I'm getting weird messages from the kernel about things in my
filesystem".)

Balance -- primary -- used for "Well I used to use this filesystem for a
small number of large files, but now I am processing a large number of
small files and I'm running out of metadata even though I've got a lot
of space." (or vice versa)

Balance -- other -- used for "I just changed the geometry of my
filesystem by adding or removing a disk and I want to spread out."

Balance -- (conversion/restructuring) -- used for "single is okay, but
I'd rather raid-0 to spread out my load across these many disks" or
"gee, I'd like some redundancy now that I have the room."

Frequent balancing of a Copy On Write filesystem will tend to make
things somewhat anti-optimal. You are burping the natural working space
out of the natural layout.

Since COW implies mandatory movement of data, every time you burp out
all the slack and pack all the data together you are taking your
regularly modified files and moving them far, far away from the places
where frequently modified files are most happy (e.g. the
only-partly-full data region they were just living in).

Similarly two files that usually get modified at the same time (say a
database file and its rollback log) will tend to end up in the same
active data extent as time goes on, and if balance decides it can "clean
up" that extent it will likely give those two files a data-extent
divorce and force them to the opposite ends of dataland.

COW systems are inherently somewhat chaotic. If you fight that too
aggressively you will, at best, be wasting the maintenance time.

It may be a decrease in performance measured in very small quanta, but
so is the expected benefit of most maintenance.
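
As a rough summary of the tool-per-situation list earlier in this message, here is a small Python sketch that maps each situation to the command it points at. The command strings are the usual btrfs-progs invocations as far as I know them; the /mnt and /dev/sdX paths, the SUGGESTIONS table and the suggest() helper are placeholders made up for illustration, and nothing here actually runs a command.

```python
# Hypothetical lookup table summarizing the situations described above.
# Paths (/mnt, /dev/sdX) are placeholders; nothing is executed.
SUGGESTIONS = {
    "seeky files": "btrfs filesystem defragment -r /mnt/some/dir",
    "suspected bitrot": "btrfs scrub start /mnt",
    "structural damage": "btrfs check /dev/sdX",  # unmounted; --repair only with a backup
    "usage pattern changed": "btrfs balance start /mnt",
    "device added or removed": "btrfs balance start /mnt",
    "change raid profile": "btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt",
}


def suggest(situation: str) -> str:
    """Return the command for a known situation, else advise doing nothing."""
    return SUGGESTIONS.get(situation, "no maintenance needed; leave it alone")


if __name__ == "__main__":
    for situation in SUGGESTIONS:
        print(f"{situation:24} -> {suggest(situation)}")
```

The default branch is the point: if none of the situations apply, the sketch suggests leaving the filesystem alone rather than balancing out of habit.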

From the wiki:
https://btrfs.wiki.kernel.org/index.php/FAQ#What_does_.22balance.22_do.3F

btrfs filesystem balance is an operation which simply takes all of the
data and metadata on the filesystem, and re-writes it in a different
place on the disks, passing it through the allocator algorithm on the
way. It was originally designed for multi-device filesystems, to spread
data more evenly across the devices (i.e. to "balance" their usage).
This is particularly useful when adding new devices to a nearly-full
filesystem.

Due to the way that balance works, it also has some useful side-effects:

* If there is a lot of allocated but unused data or metadata chunks, a
  balance may reclaim some of that allocated space. This is the main
  reason for running a balance on a single-device filesystem.

* On a filesystem with damaged replication (e.g. a RAID-1 FS with a dead
  and removed disk), it will force the FS to rebuild the missing copy of
  the data on one of the currently active devices, restoring the RAID-1
  capability of the filesystem.
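
The "reclaim allocated but unused chunks" side-effect can be pictured with a toy model. This is only a sketch under loud assumptions: chunks are a fixed 1024 MiB, data and metadata are not distinguished, and the rewrite packs data perfectly. The real allocator is more involved, and chunks_after_balance() is a made-up name, not anything from btrfs.

```python
def chunks_after_balance(used_mib_per_chunk, chunk_size_mib=1024):
    """Toy model: rewriting everything through the allocator packs the
    data into as few fixed-size chunks as it needs (ceiling division)."""
    total_used = sum(used_mib_per_chunk)
    return -(-total_used // chunk_size_mib)


if __name__ == "__main__":
    # Five allocated chunks, most of them nearly empty (made-up numbers).
    usage = [900, 100, 200, 50, 250]
    print("allocated chunks before:", len(usage))
    print("allocated chunks after: ", chunks_after_balance(usage))
```

In this model 1500 MiB of data spread over five chunks fits in two after the rewrite, so three chunks go back to the unallocated pool. It says nothing about where the data lands afterwards, which is the cost argued above.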