From: Tomasz Chmielewski <tch@virtall.com>
To: Robert White <rwhite@pobox.com>
Cc: linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: 3.18.0: kernel BUG at fs/btrfs/relocation.c:242!
Date: Sat, 13 Dec 2014 14:53:18 +0100
Message-ID: <a54531e35f70cb4d15a606385d2b581e@admin.virtall.com>
In-Reply-To: <548C0942.6020304@pobox.com>
On 2014-12-13 10:39, Robert White wrote:
> Might I ask why you are running balance? After a persistent error I'd
> understand going straight to scrub, but balance is usually for
> transformation or to redistribute things after atypical use.
There were several reasons for running balance on this system:
1) I was getting "no space left", even though there were hundreds of GBs
left. I'm not sure whether this still applies to current kernels (3.18
and later), but it was certainly a problem in the past.
2) The system was freezing regularly; I'd say once a week was the norm.
Sometimes I was getting btrfs traces logged in syslog.
After a few freezes the fs was getting corrupted to varying degrees. At
some point, it was so bad that it could only be used read-only.
So I had to get the data off, reformat, copy back... It would start
crashing again after a few weeks of usage.
My use case is quite simple:
- skinny extents, extended inode refs
- mount compress-force=zlib
- rsync many remote data sources (-a -H --inplace --partial) + snapshot
- around 500 snapshots in total, from 20 or so subvolumes
In particular, rsync's --inplace option combined with many snapshots and
heavy fragmentation was deadly for btrfs - I was seeing system freezes
right when rsyncing a large, highly fragmented file.
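A rough sketch of one such backup cycle (rsync into a subvolume, then
snapshot it), for illustration only. The rsync flags are the ones listed
above; the host name, the paths, the date-based snapshot naming and the
helper name backup_cycle_commands are hypothetical. The code only builds
the command lines and does not run them.

```python
from datetime import date


def backup_cycle_commands(source, subvol, snap_dir, day):
    """Build (but do not run) the commands for one backup cycle.

    All arguments are illustrative assumptions, not the real layout.
    """
    # rsync flags as listed in the use case above
    rsync_cmd = ["rsync", "-a", "-H", "--inplace", "--partial",
                 source, subvol + "/"]
    # One snapshot per cycle; naming it after the date is an assumption
    snapshot = f"{snap_dir}/{day.isoformat()}"
    snap_cmd = ["btrfs", "subvolume", "snapshot", subvol, snapshot]
    return [rsync_cmd, snap_cmd]


for cmd in backup_cycle_commands("backup@host1:/srv/",
                                 "/mnt/backup/host1",
                                 "/mnt/backup/snapshots/host1",
                                 date(2014, 12, 13)):
    print(" ".join(cmd))
```

With --inplace, rsync rewrites changed blocks inside the existing file
instead of writing a new copy, so every snapshot taken between runs keeps
the old extents alive and the file ends up spread over more and more
extents.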
Then, running balance on the "corrupted" filesystem was more of an
exercise (if scrub passes fine, I would expect balance to pass as well).
Some of the BUGs it triggered were fixed in newer kernels, some were not
(btrfsck was not really usable a few months back).
3) I had mixed luck with recovering btrfs after a failed drive (in
RAID-1). Sometimes it worked as expected; sometimes the fs was getting
broken so badly that I had to rsync the data off it and format from
scratch (mdraid would kick the drive out after getting write errors;
btrfs does not, and weird things can happen).
Sometimes, running "btrfs device delete missing" (it's balance in
principle, I think) would take weeks, during which a second drive could
easily die.
Again, running balance was more of an exercise there, to see if the
newer kernel still crashes.
> An entire generation of folks have grown used to defraging windows
> boxes and all, but if you've already got an array that is going to
> take "many days" to balance what benefit do you actually expect to
> receive?
For me - it's a good test to see if btrfs is finally getting stable
(some cases explained above).
> Defrag -- used for "I think I'm getting a lot of unnecessary head seek
> in this application, these files need to be brought into closer
> order".
Fragmentation was an issue for btrfs, at least a few kernels back (as
explained above, with rsync's --inplace).
However, I'm not running autodefrag anywhere - not sure how it affects
snapshots.
> Scrub -- used for defensive checking a-la checkdisk. "I suspect that
> after that unexpected power outage something may be a little off", or
> alternately "I think my disks are giving me bitrot, I better check".
For me, scrub was passing fine, while balance was crashing the kernel.
Again, my main rationale for running balance is to see if btrfs is
stable. While I have systems with btrfs which have been running fine
for months, I also have ones which will crash after 1-2 weeks (once the
system grows in size / complexity).
So hopefully, btrfsck has fixed that fs - once it has been running stable
for a week or two, I might be brave enough to re-enable btrfs quotas
(they were another cause of system freezes, at least a few kernels back).
--
Tomasz Chmielewski
http://www.sslrack.com