* backpointer mismatch
From: Peter van Hoof @ 2014-01-10 3:59 UTC (permalink / raw)
To: linux-btrfs
Hi,
I am using btrfs for my backup RAID. This had been running well for
about a year. Recently I decided to upgrade the backup server to
openSUSE 13.1. I checked all filesystems before the upgrade and
everything was clean. I had several attempts at upgrading the system,
but all failed (the installation of some rpm would hang indefinitely).
So I aborted the installation and reverted the system back to openSUSE
12.3 (with a custom-installed 3.9.7 kernel). Unfortunately, after this
the backup RAID reported lots of errors.
When I run btrfsck on the filesystem, I get around 1.3M of these messages:
Extent back ref already exists for 1116254208 parent 11145490432 root 0
and around 1.2M of these:
ref mismatch on [90670907392 4096] extent item 11, found 12
Incorrect global backref count on 90670907392 found 11 wanted 12
backpointer mismatch on [90670907392 4096]
Filtering these out, this is the remaining output:
checking extents
Errors found in extent allocation tree or chunk allocation
checking free space cache
checking fs roots
checking csums
checking root refs
Checking filesystem on /dev/md2
UUID: 0b6a9d0d-e501-4a23-9d09-259b1f5b5652
found 2213988384746 bytes used err is 0
total csum bytes: 3185850148
total tree bytes: 42770862080
total fs tree bytes: 36787625984
total extent tree bytes: 1643925504
btree space waste bytes: 12475940633
file data blocks allocated: 5269432860672
referenced 5254870626304
Btrfs v3.12+20131125
(this version of btrfsck comes from openSUSE factory).
I also ran btrfs scrub on the file system. This uncovered 4 checksum
errors which I could repair manually. I do not know if that is related
to the problem above. At least it didn't solve it...
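For reference, the check and scrub invocations were along these lines (a sketch using this thread's device; the mount point /mnt/backup is a placeholder, and the scrub subcommands assume btrfs-progs 3.12 syntax):

```
# read-only consistency check, run with the filesystem unmounted
btrfsck /dev/md2

# online scrub of the mounted filesystem; runs in the background
btrfs scrub start /mnt/backup

# check progress / final error counts
btrfs scrub status /mnt/backup
```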
The btrfs file system is installed on top of an mdadm RAID5.
How worried should I be about the reported errors? What confuses me is
that in the end btrfsck reports an error count of 0.
Should I try to repair this? I have had bad experiences in the past with
"btrfsck --repair", but that was with a much older version...
I can of course recreate the backups, but this would take a long time
and I would lose my entire snapshot history, which I would rather avoid...
Cheers,
Peter.
--
Peter van Hoof
Royal Observatory of Belgium
Ringlaan 3
1180 Brussel
Belgium
http://homepage.oma.be/pvh
* Re: backpointer mismatch
From: Duncan @ 2014-01-10 14:26 UTC (permalink / raw)
To: linux-btrfs
Peter van Hoof posted on Fri, 10 Jan 2014 04:59:46 +0100 as excerpted:
> I am using btrfs for my backup RAID.
Oh, boy! You're doing several things wrong in this post, tho you've also
managed to get a couple things right that a lot of people get wrong, too,
which just might have saved your data! Here's what I see:
1) Btrfs is still in heavy development, and there are warnings all over
your kernel btrfs option (tho that one has been reduced in severity of
late, for 3.13 actually, I believe, but your kernels are earlier than
that), mkfs.btrfs, and the btrfs wiki at https://btrfs.wiki.kernel.org ,
about being sure you have tested backups you're ready and willing to use
before testing btrfs.
IOW, your backups shouldn't be btrfs, because btrfs itself is testing,
and any data stored on it is by definition testing-only data you don't
particularly care about, either because you have good tested-restorable
backups, or because the data really isn't that valuable to you in the
first place. There's no way to avoid it by saying you didn't know
either, as a careful admin researches (and may well early-deployment
test, too) the filesystems he's going to use before /actual/ deployment.
Failing to do that simply means the admin isn't "care-ful", that he
simply isn't full of care about the data he's trusting to a filesystem he
knows little or nothing about -- literally, he does not care, at least
/enough/.
> This had been running well for about a year.
> Recently I decided to upgrade the backup server to
> openSUSE 13.1. I checked all filesystems before the upgrade and
> everything was clean. I had several attempts at upgrading the system,
> but all failed (the installation of some rpm would hang indefinitely).
> So I aborted the installation and reverted the system back to openSUSE
> 12.3 (with a custom-installed 3.9.7 kernel). Unfortunately, after this
> the backup RAID reported lots of errors.
There's a couple things wrong here.
2) Btrfs testers are encouraged to always run a recent kernel, preferably
the latest release of the latest Linus kernel stable series (currently
3.12.x), if not the latest Linus development kernel (we're late in the
3.13 rc cycle), if not even the btrfs-next patches that are slated for
the /next/ development kernel. If you're more than a stable series
behind (thus currently you should be on the 3.11 series at absolute
minimum), you're both risking your data to bugs that are already known
and fixed, and as a btrfs tester, if/when things DO go wrong, your bug
reports aren't as useful because the code you're running is simply too
stale!
3.9.x? For an admin who has chosen to be a btrfs tester, that is, or
should be, ancient history!
(As implied by the mention of 3.13 toning down the btrfs kernel config
option warning a bit, btrfs is indeed now beginning to stabilize, and
these tester/user requirements should be a bit less strict going
forward. But you're not yet using 3.13, so they still apply in pretty
much full force. And even if they /are/ getting less strict now, four
whole kernel series outdated really /is/ outdated, for btrfs!)
3) At this point in btrfs development, the on-device format is still
getting slight tweaks. There's a policy that existing formats are new-
kernel-compatible, but once you mount a btrfs with a new kernel, updates
to the format may be made, and there's no such policy about the
compatibility of mounting btrfs filesystems with old kernels again, once
they've been mounted with a new kernel.
That's actually what you're seeing, I suspect. The filesystem may well
not be damaged. It's simply that it was mounted with a newer kernel, and
now the older kernel you're trying to use again, doesn't understand some
of the changes the new kernel made. That's not damage, it's just
attempting to use a stale kernel on a filesystem mounted with a more
recent version that updated the on-device format enough so your stale
kernel doesn't understand parts of it any more.
But the good news is that particularly since you're already running a
custom 3.9 kernel and thus must already know at least a /bit/ about
configuring and building your own kernel, you shouldn't have much trouble
getting a new kernel, the latest 3.12.x stable or even the latest 3.13.x
rc since it's already late in the cycle, on your system, even if it means
building it yourself to do so. Hopefully with that, you'll find the
currently reported errors gone, and it'll work fine. =:^)
> When I run btrfsck on the filesystem, I get [snipped but for this:]
> Btrfs v3.12+20131125
>
> (this version of btrfsck comes from openSUSE factory).
Well, at least you're running a reasonably current btrfs-tools. Btrfs-
tools 3.12 was released at about the same time as kernel 3.12, and was in
fact the first release to use the new, kernel-synced, version number
scheme.
4) But there's the hint, too. A 3.12 btrfs-tools works best with a 3.12
kernel, and you're attempting to use a very stale 3.9 kernel ATM. No
/wonder/ that combination triggers problems! It /may/ be that an older
btrfs-tools would match that kernel a bit better.
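A quick way to see the mismatch at a glance (a sketch; the 3.9.7 / v3.12 pairing shown in the comments is what this thread reports, and on some older btrfs-progs releases the version subcommand is spelled `btrfs --version`):

```
uname -r        # running kernel, here: 3.9.7
btrfs version   # userspace tools, here: Btrfs v3.12+20131125
```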
Of course before 3.12, btrfs-tools wasn't actually released all that
often, and the older 0.19/0.20-rc1 style versioning didn't lend itself
particularly well to kernel matching, which was a problem. But a quick
look at the changelog on the btrfs wiki suggests that kernel 3.9 was in
April, 2013, while btrfs-progs 0.20-rc1 was, from memory, late 2012. So
a dated btrfs-progs git-snapshot version of something like 0.20-rc1
+201304xx, if you can find such an animal, might actually work a bit
better with that kernel version, if you can't do the better thing and
properly upgrade the kernel to current.
> I also ran btrfs scrub on the file system. This uncovered 4 checksum
> errors which I could repair manually. I do not know if that is related
> to the problem above. At least it didn't solve it...
>
> The btrfs file system is installed on top of an mdadm RAID5.
Out of curiosity, how often do you run an mdadm scrub?
Unfortunately btrfs' native raid5/6 support, introduced in kernel 3.9,
remained unfinished both then and thru current 3.13 -- it writes the
parity data, but proper use of it in btrfs scrub and recovery remains not
fully implemented, so (even more than btrfs in general) it's DEFINITELY
not recommended except for run-mode testing, since the recovery mode that
people /run/ raid5/6 for remains broken, making it effectively a raid0 --
if you lose a device, consider the entire filesystem lost.
So btrfs-native raid5/6 is entirely out as a viable option.
Which leaves you with mdraid (or possibly lvm?) for raid5/6 if that's
what you need. Unfortunately, while mdraid writes the parity data and
(unlike btrfs raid5/6 mode) /does/ reliably use it for recovery, mdraid,
unlike btrfs, does no parity/checksum checking in normal operation.
Which means any corruption on it will remain entirely undetected unless
you run an mdadm scrub.
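An md-level scrub is triggered through sysfs (a sketch; md2 is this thread's array, and the check runs in the background while the array stays online):

```
# start a read-only consistency check of the whole array
echo check > /sys/block/md2/md/sync_action

# watch progress
cat /proc/mdstat

# after completion: count of sectors that did not match parity
cat /sys/block/md2/md/mismatch_cnt
```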
FWIW, while I ran mdraid6 in the past, once I realized it didn't do
normal runtime parity checking anyway, I switched out to the better
performing mdraid1 mode. Since I had four drives for the raid6, and was
lucky enough to have the space to re-allocate when I squeezed a bit, I
ended up with 4-way mdraid1, giving me loss of three device protection.
But of course I didn't have the operational data integrity that btrfs
raid1 mode provides, altho that's only across two mirrors. (At present,
btrfs raid1 mode is only two-way-mirrored regardless of the number of
devices in the raid. If there are more devices, it still only keeps two
copies, and simply expands the amount of available storage. Btrfs N-way
mirroring remains on the btrfs roadmap for implementation after raid5/6
is completed, but it's not there yet.) FWIW, while other btrfs features
are nice, I really /really/ want N-way mirroring, as checksummed three-
way-mirroring for loss-of-two-devices protection really does appear to be
my cost/risk balance sweet spot. But as it's not yet available...
So while waiting for the btrfs full-checksummed 3-way-mirroring I
/really/ want, I content myself with the 2-way-mirroring that's actually
implemented, and /reasonably/ stable, tho I still keep an off-btrfs
backup on reiserfs (which has been proven /extremely/ reliable, here, at
least since the introduction of ordered-journal mode by default now many
years ago, but unfortunately reiserfs isn't suited to use on ssds, the
reason I've upgraded to btrfs even if it is still testing, while still
keeping reiserfs backups on spinning rust).
And due to unclean shutdowns for reasons not entirely related to btrfs, I
do occasionally see btrfs scrub fixing problems here, just as I had to do
mdraid device re-adds on occasion.
But anyway, given the big hole in current functionality -- btrfs having
runtime checksumming support but nothing above 2-way-mirrored raid1 (or
raid10), and mdraid/lvm having raid5/6 and N-way-mirroring but no
routine runtime integrity checking, only on-demand scrub -- btrfs on top
of mdraid5 is a reasonable way to go.
5) But if you do start getting btrfs errors I would certainly recommend
an mdraid level scrub, just to be sure they're not coming from that level.
Tho in this case I really do suspect it's simply a matter of trying to
run a filesystem that has been mounted on a newer kernel, again on an
older kernel that it's now no longer fully compatible with, and really do
hope/expect that at least some of your problems will go away once you try
a current kernel. ...
> How worried should I be about the reported errors? What confuses me is
> that in the end btrfsck reports an error count of 0.
... Which if I'm correct, may explain this as well, particularly since
your kernel is old and likely reporting things it doesn't understand,
while your btrfs-tools are new, and thus may well not see a problem,
because there (hopefully) really isn't a problem, except that you're
trying to use too old a kernel for the on-device btrfs format.
> Should I try to repair this? I have had bad experiences in the past with
> "btrfsck --repair", but that was with a much older version...
6) *THIS* is the thing you actually did correctly, that MAY WELL HAVE
SAVED YOUR DATA! =:^)
Currently, btrfsck --repair (or btrfs check --repair in current btrfs-
tools, where it is now part of the main general purpose btrfs tool), is
only recommended as a last resort. It can and sometimes does actually
make the problem worse.
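The safe ordering, sketched with this thread's device (the metadata-image backup step and its target path are illustrative, not something the thread prescribes):

```
# 1) read-only check first -- btrfs check modifies nothing by default
btrfs check /dev/md2

# 2) if repair looks unavoidable, capture a compressed metadata image
#    first, so developers can reconstruct the tree state if it goes bad
#    (target path is hypothetical)
btrfs-image -c9 /dev/md2 /root/md2-meta.img

# 3) only then, and only as a last resort
btrfs check --repair /dev/md2
```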
Given that I suspect your problem may not actually be a filesystem issue
at all, but rather simply due to trying to use an old and stale kernel
with a filesystem updated to use elements the old kernel doesn't
understand, the chance of btrfs --repair actually doing more damage than
it fixes is even higher.
> I can of course recreate the backups, but this would take a long time
> and I would lose my entire snapshot history which I would rather
> avoid...
Well, given the situation, with any luck the resolution to the immediate
problem is as I said, simply run a current kernel as recommended, that
matches your current btrfs-tools version, and that isn't a regression to
a MUCH too stale kernel significantly older than the most current one
you've ever mounted the filesystem with.
But while with any luck that does solve the /immediate/ problem, it
doesn't do anything to solve the more general situation, that you're
using a still under heavy development btrfs as a backup, an entirely
inappropriate role for a filesystem that is recommended for testing only
with data that you don't care if it's lost, either because you keep
off-btrfs backups, or because the data isn't that important to you if your
btrfs testing should lose it all, in the first place.
So even if a current kernel does resolve the immediate situation, I'd
still recommend using something else rather more mature for your backup
solution.
Meanwhile, redoing your older btrfs filesystems with newly created
mkfs.btrfs filesystems is probably a good idea as well, because there's a
number of efficiency and robustness optimizations now enabled by default
on newly created filesystems, that simply aren't available on filesystems
created before their introduction. Newer kernels and tools can still
mount and run on the older filesystems as that compatibility is policy,
but that doesn't mean it's as efficient or robust an implementation as
if you'd created the filesystem with current tools and kernel, using all
the latest format variants.
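Recreating the filesystem with current defaults is a one-liner (a sketch; destructive, and the label is a placeholder -- on btrfs-progs releases that support it, the optional format features can be listed with `-O list-all`):

```
# WARNING: destroys all existing data on /dev/md2
mkfs.btrfs -f -L backup /dev/md2

# on newer btrfs-progs, see which optional format features are available
mkfs.btrfs -O list-all
```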
So even if using a current kernel solves your immediate issues, as I have
a reasonable expectation/hope that it will, I'd still recommend redoing
those backups, to something more appropriate for backups than btrfs at
this point.
And for anything you do use btrfs for (with appropriate backups), I'd
suggest a fresh mkfs.btrfs filesystem, to take advantage of the latest
format optimizations and robustness features.
And then for anything running btrfs, keep current, both kernel and
btrfs-tools. It really /can/ be the difference between safe data because
a bug that you might have triggered is fixed, and trashed data because
you triggered a long fixed bug while using ancient tools and kernel!
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
* Re: backpointer mismatch
From: Roman Mamedov @ 2014-01-10 15:16 UTC (permalink / raw)
To: Duncan; +Cc: linux-btrfs
On Fri, 10 Jan 2014 14:26:19 +0000 (UTC)
Duncan <1i5t5.duncan@cox.net> wrote:
> IOW, your backups shouldn't be btrfs, because btrfs itself is testing,
> and any data stored on it is by definition testing-only data you don't
> particularly care about, either because you have good tested-restorable
> backups, or because the data really isn't that valuable to you in the
> first place.
On the contrary, I think a backup storage area is an excellent place to start
rolling-out btrfs from, because:
1) the snapshot capability allows you to do your backups using simple
full-mirror tools such as rsync or mirrordir in incremental mode,
propagating only changes in the directory tree (and then making and
keeping a number of dated/timestamped snapshots with some automation of
your own);
2) it's *backups*, by definition it's non-unique replaceable data that also
exists elsewhere (and in this case on the primary storage, that's probably
much less experimental and more redundant as well).
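A minimal sketch of that scheme (paths and the snapshot naming are illustrative assumptions, not from the thread):

```
# mirror the source into a subvolume holding the "current" state
rsync -a --delete /data/ /mnt/backup/current/

# freeze that state as a read-only, dated snapshot
btrfs subvolume snapshot -r /mnt/backup/current \
    /mnt/backup/snapshots/$(date +%Y-%m-%d)
```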
My primary storage is currently Ext4 and backups are all on btrfs.
--
With respect,
Roman
* Re: backpointer mismatch
From: Duncan @ 2014-01-10 15:53 UTC (permalink / raw)
To: linux-btrfs
Roman Mamedov posted on Fri, 10 Jan 2014 21:16:59 +0600 as excerpted:
> On Fri, 10 Jan 2014 14:26:19 +0000 (UTC)
> Duncan <1i5t5.duncan@cox.net> wrote:
>
>> IOW, your backups shouldn't be btrfs, because btrfs itself is testing,
>> and any data stored on it is by definition testing-only data you don't
>> particularly care about, either because you have good tested-restorable
>> backups, or because the data really isn't that valuable to you in the
>> first place.
>
> On the contrary, I think a backup storage area is an excellent place to
> start rolling-out btrfs from, because:
>
> 1) the snapshot capability
Point agreed. =:^)
> 2) it's *backups*, by definition it's non-unique replaceable data that
> also exists elsewhere (and in this case on the primary storage, that's
> probably much less experimental and more redundant as well).
>
> My primary storage is currently Ext4 and backups are all on btrfs.
But what happens if you actually /need/ those backups, and in going to
use them, you find they're corrupted due to some as-yet-unfixed bug in
still-under-development btrfs?
To me, the /point/ of backups is reliability. I need to *KNOW* they're
reliable, and btrfs simply isn't intended or claimed to provide that
guaranteed stable reliability yet.
While admittedly a lot of people are now using btrfs without issue, and
I'm using it here myself as my primary/working copy as well as first-
level backup (with off-btrfs backups of my first-level btrfs backups), I
simply couldn't rest well if I were using it for all backup levels,
because it simply doesn't yet provide the proven-over-years level of
stability and reliability that, for me, is the whole /point/ of backups
(otherwise, why bother?).
Nevertheless, if you're comfortable with that level of additional risk
in your backups, it's your system and your data at risk, so more power to
you! =:^)
But IMO, /recommending/ btrfs for backups at this point (regardless of
what I was or was not doing myself, accepting the brown-bag should my
decision for my own data turn out to have been a bad one) is nothing
other than irresponsible, and as such I could never do it.
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman