* BTRFS RAID5 filesystem corruption during balance
@ 2015-05-21 21:43 Jan Voet
2015-05-22 4:43 ` Duncan
0 siblings, 1 reply; 7+ messages in thread
From: Jan Voet @ 2015-05-21 21:43 UTC (permalink / raw)
To: linux-btrfs
Hi,
I recently upgraded a quite old home NAS system (Celeron M based) to Ubuntu
14.04 with an upgraded linux kernel (3.19.8) and BTRFS tools v3.17. This
system has 5 brand new 6TB drives (HGST) with all drives directly handled by
BTRFS, both data and metadata in RAID5.
After loading the system with 12.5TB of data (which took some time :-) ), a
btrfs balance was run to see how it would behave. Three days in, with 48%
still to go, the system locked up and no longer responded to ssh or the USB
keyboard, and the VGA output stopped working as well. Only pings worked
(ICMP Echo Request/Reply), so the kernel IP stack was still active, but
nothing else responded and no disk activity was seen at all.
So I did a hard reset, hoping that on restart it would resume the balance.
It did appear to resume, but showed only a few extents remaining (11 or so,
instead of the 3000+ shown originally) and after a short time it seemed to
have completed the balance ???
The result seems to be a mess, however: the filesystem gets remounted
read-only after a few minutes, with lots of btrfs-related stack dumps in the
kernel log. Rebooting doesn't seem to help; it always ends up in the same
situation after some time.
The data is still visible, but I'm a bit at a loss as to how I should
continue. Any advice would be welcome.
Some data:
$ sudo btrfs fi show /dev/sdb
Label: none uuid: d278e7df-e26d-4a9b-99fb-71fbef819dd1
Total devices 5 FS bytes used 11.58TiB
devid 1 size 5.46TiB used 2.92TiB path /dev/sdb
devid 2 size 5.46TiB used 2.92TiB path /dev/sdc
devid 3 size 5.46TiB used 2.92TiB path /dev/sdd
devid 4 size 5.46TiB used 2.92TiB path /dev/sde
devid 5 size 5.46TiB used 2.92TiB path /dev/sdf
Btrfs v3.17
One of the stackdumps:
[ 328.224417] ------------[ cut here ]------------
[ 328.224446] WARNING: CPU: 0 PID: 1633 at /home/kernel/COD/linux/fs/btrfs/disk-io.c:513 csum_dirty_buffer+0x6f/0xa0 [btrfs]()
[ 328.224448] Modules linked in: ppdev i915 video net2280 udc_core
drm_kms_helper lpc_ich drm serio_raw shpchp i2c_algo_bit 8250_fintek
parport_pc mac_hid lp parport btrfs xor raid6_pq hid_generic usbhid sata_mv
e1000 pata_acpi floppy hid
[ 328.224473] CPU: 0 PID: 1633 Comm: kworker/u2:12 Tainted: G W 3.19.8-031908-generic #201505110938
[ 328.224476] Hardware name: /i854GML-LPC47M182, BIOS 6.00 PG 06/21/2007
[ 328.224508] Workqueue: btrfs-worker btrfs_worker_helper [btrfs]
[ 328.224510] 00000000 00000000 c0ae5e40 c16e4a4d 00000000 c0ae5e70
c106250e c1907948
[ 328.224518] 00000000 00000661 f89c3444 00000201 f893142f f893142f
d6f3a8f0 f72b1ac8
[ 328.224525] f6d5d800 c0ae5e80 c1062572 00000009 00000000 c0ae5e9c
f893142f 187ced34
[ 328.224532] Call Trace:
[ 328.224537] [<c16e4a4d>] dump_stack+0x41/0x52
[ 328.224541] [<c106250e>] warn_slowpath_common+0x8e/0xd0
[ 328.224570] [<f893142f>] ? csum_dirty_buffer+0x6f/0xa0 [btrfs]
[ 328.224598] [<f893142f>] ? csum_dirty_buffer+0x6f/0xa0 [btrfs]
[ 328.224603] [<c1062572>] warn_slowpath_null+0x22/0x30
[ 328.224631] [<f893142f>] csum_dirty_buffer+0x6f/0xa0 [btrfs]
[ 328.224660] [<f893149f>] btree_csum_one_bio.isra.121+0x3f/0x50 [btrfs]
[ 328.224688] [<f89314c3>] __btree_submit_bio_start+0x13/0x20 [btrfs]
[ 328.224715] [<f892f81d>] run_one_async_start+0x3d/0x60 [btrfs]
[ 328.224750] [<f896e2b2>] normal_work_helper+0x62/0x180 [btrfs]
[ 328.224778] [<f8930630>] ? __btree_submit_bio_done+0x50/0x50 [btrfs]
[ 328.224812] [<f896e3e0>] btrfs_worker_helper+0x10/0x20 [btrfs]
[ 328.224817] [<c1077cb1>] process_one_work+0x121/0x3a0
[ 328.224822] [<c16f057c>] ? apic_timer_interrupt+0x34/0x3c
[ 328.224826] [<c107854d>] worker_thread+0xed/0x390
[ 328.224831] [<c1099fbf>] ? __wake_up_locked+0x1f/0x30
[ 328.224835] [<c1078460>] ? create_worker+0x1b0/0x1b0
[ 328.224840] [<c107d09b>] kthread+0x9b/0xb0
[ 328.224845] [<c16efb81>] ret_from_kernel_thread+0x21/0x30
[ 328.224850] [<c107d000>] ? flush_kthread_worker+0x80/0x80
[ 328.224853] ---[ end trace e8386011b87476a4 ]---
There's plenty more of those as well as other messages such as:
[ 329.354420] BTRFS: error (device sdf) in btrfs_run_delayed_refs:2792: errno=-5 IO failure
[ 329.354522] BTRFS info (device sdf): forced readonly
[ 476.620532] perf interrupt took too long (2512 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
[ 549.412065] BTRFS (device sdf): bad tree block start 17003380002271197777 19274981785600
[ 549.425057] BTRFS (device sdf): bad tree block start 17003380002271197777 19274981785600
[ 549.425415] BTRFS (device sdf): bad tree block start 17003380002271197777 19274981785600
[ 549.425641] BTRFS (device sdf): bad tree block start 17003380002271197777 19274981785600
[ 549.425655] BTRFS info (device sdf): no csum found for inode 15963 start 0
[ 549.425943] BTRFS (device sdf): bad tree block start 17003380002271197777 19274981785600
[ 549.426154] BTRFS (device sdf): bad tree block start 17003380002271197777 19274981785600
[ 549.426165] BTRFS info (device sdf): no csum found for inode 15963 start 4096
[ 549.426443] BTRFS (device sdf): bad tree block start 17003380002271197777 19274981785600
[ 549.426653] BTRFS (device sdf): bad tree block start 17003380002271197777 19274981785600
[ 549.426663] BTRFS info (device sdf): no csum found for inode 15963 start 8192
[ 549.426944] BTRFS (device sdf): bad tree block start 17003380002271197777 19274981785600
[ 549.427153] BTRFS (device sdf): bad tree block start 17003380002271197777 19274981785600
[ 549.427163] BTRFS info (device sdf): no csum found for inode 15963 start 12288
[ 549.427655] BTRFS info (device sdf): no csum found for inode 15963 start 16384
[ 549.428447] BTRFS info (device sdf): no csum found for inode 15963 start 20480
[ 549.429175] BTRFS info (device sdf): no csum found for inode 15963 start 24576
.....
I can provide more info on request, and don't mind trying out different
things (the data was fully backed up before I started this experiment).
Kind regards,
Jan
* Re: BTRFS RAID5 filesystem corruption during balance
  2015-05-21 21:43 BTRFS RAID5 filesystem corruption during balance Jan Voet
@ 2015-05-22  4:43 ` Duncan
  2015-05-22 18:11   ` Jan Voet
  2015-05-22 19:15   ` Chris Murphy
  0 siblings, 2 replies; 7+ messages in thread
From: Duncan @ 2015-05-22 4:43 UTC (permalink / raw)
To: linux-btrfs

Jan Voet posted on Thu, 21 May 2015 21:43:36 +0000 as excerpted:

> I recently upgraded a quite old home NAS system (Celeron M based) to
> Ubuntu 14.04 with an upgraded linux kernel (3.19.8) and BTRFS tools
> v3.17.
> This system has 5 brand new 6TB drives (HGST) with all drives directly
> handled by BTRFS, both data and metadata in RAID5.

FWIW, btrfs raid5 (and raid6, together called raid56 mode) is still
extremely new, only normal runtime implemented as originally introduced,
with complete repair from a device failure only completely implemented in
kernel 3.19, and while in theory complete, that implementation is still
very immature and poorly tested, and *WILL* have bugs, one of which you
may very well have found.

For in-production use, therefore, btrfs raid56 mode, while now at least
in theory complete, is really too immature at this point to recommend.
I'd recommend either btrfs raid1 or raid10 modes as more stable within
btrfs at this point, tho by the end of this year or early next, I predict
raid56 mode to have stabilized to about that of the rest of btrfs, which
is to say, not entirely stable, but heading that way.

IOW, for btrfs in general, the sysadmin's backup rule applies even more
strongly than it does to more stable filesystems: if you don't have
backups, by definition you don't care about the data, regardless of claims
to the contrary, and untested would-be backups aren't backups until you've
verified they can be read and restored from. Keeping up with current
kernels also remains very important, since that way you avoid known and
already-fixed bugs.

Given those constraints, btrfs is /in/ /general/ usable. But not yet
raid56 mode, which I'd definitely consider to still be breakable at any
time. So certainly for the multi-TB of data you're dealing with, which you
say yourself takes some time to load (and is thus not something you can
afford to back up and restore trivially), I'd say stay off btrfs raid56
until around the end of the year or early next, at which point it should
have stabilized.

Until then, consider either btrfs raid1 mode (which I use), or for that
amount of data, more likely btrfs raid10 mode. Or if you must keep raid5
due to device and data size limitations, consider sticking with mdraid5 or
similar for now, potentially with btrfs on top, or perhaps with the more
stable xfs or ext3/4 (or my favorite, reiserfs, which I have found
/extremely/ reliable here, even with less than absolutely reliable
hardware; the old tales about it being unreliable date from
pre-data=ordered times, early in the kernel 2.4 era, and are thus rather
ancient history now, but as they say, YMMV...).

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman

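For reference, the alternative layouts suggested above can be set up roughly
as follows. This is only a minimal sketch: the device names match the ones in
the original report but are otherwise illustrative, none of these commands
come from the thread, and mkfs / mdadm --create are of course destructive to
whatever is already on the disks.

# native btrfs raid10 for both data and metadata across the five drives
$ sudo mkfs.btrfs -d raid10 -m raid10 /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf

# or keep raid5 semantics but let mdraid handle the parity, with btrfs on top
$ sudo mdadm --create /dev/md0 --level=5 --raid-devices=5 /dev/sd[b-f]
$ sudo mkfs.btrfs /dev/md0
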
* Re: BTRFS RAID5 filesystem corruption during balance
  2015-05-22  4:43 ` Duncan
@ 2015-05-22 18:11   ` Jan Voet
  2015-05-23 15:02     ` Jan Voet
  2015-05-22 19:15   ` Chris Murphy
  1 sibling, 1 reply; 7+ messages in thread
From: Jan Voet @ 2015-05-22 18:11 UTC (permalink / raw)
To: linux-btrfs

Duncan <1i5t5.duncan <at> cox.net> writes:

> FWIW, btrfs raid5 (and raid6, together called raid56 mode) is still
> extremely new, only normal runtime implemented as originally introduced,
> with complete repair from a device failure only completely implemented in
> kernel 3.19, and while in theory complete, that implementation is still
> very immature and poorly tested, and *WILL* have bugs, one of which you
> may very well have found.
>
> For in-production use, therefore, btrfs raid56 mode, while now at least
> in theory complete, is really too immature at this point to recommend.
> I'd recommend either btrfs raid1 or raid10 modes as more stable within
> btrfs at this point, tho by the end of this year or early next, I predict
> raid56 mode to have stabilized to about that of the rest of btrfs, which
> is to say, not entirely stable, but heading that way.
>

Hi Duncan,

Thanks for your reply. I was under the impression that RAID5/6 was
considered quite stable in the more recent kernels, hence my use of the
3.19 kernel and the upgraded btrfs tools. It's obvious that I was wrong in
that assumption, and maybe btrfs RAID5 should be labeled as experimental
code then.

A balance operation is supposed to be safe, as it makes a copy of each
file, rewrites it, distributing the data over all devices, and only then
deletes the original, right? This should never lead to kernel deadlocks ...
Having a corrupted filesystem after a reboot because of this is even more
worrisome, I think. And worst of all are the btrfs kworker crashes. Kernel
code should never crash IMHO, but maybe I'm slightly naive here ;-) .

Anyway, lots of lessons learned. I'll see if I can repair the filesystem as
described in https://btrfs.wiki.kernel.org/index.php/Btrfsck
If that doesn't work, I'll simply start over with an alternative filesystem.

Regards,
Jan

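The balance discussed above can also be run incrementally using usage
filters, so that a single run only relocates chunks below a given fill level
instead of touching the whole filesystem at once. A minimal sketch, assuming
the array is mounted at /mnt/nas (the mount point is illustrative, not taken
from the thread):

$ sudo btrfs balance start -dusage=25 -musage=25 /mnt/nas   # only chunks <25% full
$ sudo btrfs balance status /mnt/nas                        # check progress
$ sudo btrfs balance cancel /mnt/nas                        # abort cleanly if needed
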
* Re: BTRFS RAID5 filesystem corruption during balance
  2015-05-22 18:11 ` Jan Voet
@ 2015-05-23 15:02   ` Jan Voet
  2015-06-20  3:50     ` Russell Coker
  0 siblings, 1 reply; 7+ messages in thread
From: Jan Voet @ 2015-05-23 15:02 UTC (permalink / raw)
To: linux-btrfs

Jan Voet <jan.voet <at> gmail.com> writes:

> Duncan <1i5t5.duncan <at> cox.net> writes:
>
> > FWIW, btrfs raid5 (and raid6, together called raid56 mode) is still
> > extremely new, only normal runtime implemented as originally introduced,
> > with complete repair from a device failure only completely implemented in
> > kernel 3.19, and while in theory complete, that implementation is still
> > very immature and poorly tested, and *WILL* have bugs, one of which you
> > may very well have found.
> >
> > For in-production use, therefore, btrfs raid56 mode, while now at least
> > in theory complete, is really too immature at this point to recommend.
> > I'd recommend either btrfs raid1 or raid10 modes as more stable within
> > btrfs at this point, tho by the end of this year or early next, I predict
> > raid56 mode to have stabilized to about that of the rest of btrfs, which
> > is to say, not entirely stable, but heading that way.
> >

Looks like the btrfs raid5 filesystem is back in working order.

What actually happened was that on every reboot of the server, the
interrupted btrfs balance tried to resume, but couldn't because of an
incorrect/invalid state. The number of errors this spawned made it very
difficult to diagnose, as the kernel log got truncated very quickly.

Doing a 'btrfs balance cancel' immediately after the array was mounted
seems to have done the trick. A subsequent 'btrfs check' didn't show any
errors at all and all the data seems to be there. :-)

Kind regards,
Jan

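For anyone hitting the same situation, the recovery described above boils
down to something like the following sketch. The mount point is
illustrative, any member device of the array can be named in the mount
command, and note that 'btrfs check' (read-only by default) wants the
filesystem unmounted:

$ sudo mount /dev/sdb /mnt/nas          # mount the array
$ sudo btrfs balance cancel /mnt/nas    # stop the auto-resumed balance immediately
$ sudo umount /mnt/nas
$ sudo btrfs check /dev/sdb             # read-only check; no --repair was needed here
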
* Re: BTRFS RAID5 filesystem corruption during balance
  2015-05-23 15:02 ` Jan Voet
@ 2015-06-20  3:50   ` Russell Coker
  0 siblings, 0 replies; 7+ messages in thread
From: Russell Coker @ 2015-06-20 3:50 UTC (permalink / raw)
To: Jan Voet; +Cc: linux-btrfs

On Sun, 24 May 2015 01:02:21 AM Jan Voet wrote:
> Doing a 'btrfs balance cancel' immediately after the array was mounted
> seems to have done the trick. A subsequent 'btrfs check' didn't show any
> errors at all and all the data seems to be there. :-)

I add "rootflags=skip_balance" to the kernel command-line of all my Debian
systems to solve this. I've had problems in the past with a balance
resuming, with similar results. I've also never seen a situation where
resuming the balance did any good.

-- 
My Main Blog         http://etbe.coker.com.au/
My Documents Blog    http://doc.coker.com.au/

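A minimal sketch of what that looks like, assuming a Debian/Ubuntu-style
GRUB setup (the file paths below are the usual defaults, not anything
stated in the thread). Note that rootflags= only affects the root
filesystem; for a separately mounted data array the equivalent is the
skip_balance mount option in /etc/fstab:

# /etc/default/grub (then run: sudo update-grub)
GRUB_CMDLINE_LINUX_DEFAULT="quiet rootflags=skip_balance"

# /etc/fstab entry for a non-root btrfs array (mount point illustrative)
UUID=d278e7df-e26d-4a9b-99fb-71fbef819dd1  /mnt/nas  btrfs  defaults,skip_balance  0  0
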
* Re: BTRFS RAID5 filesystem corruption during balance
  2015-05-22  4:43 ` Duncan
@ 2015-05-22 19:15   ` Chris Murphy
  2015-05-23  2:56     ` Duncan
  1 sibling, 1 reply; 7+ messages in thread
From: Chris Murphy @ 2015-05-22 19:15 UTC (permalink / raw)
To: Btrfs BTRFS

On Thu, May 21, 2015 at 10:43 PM, Duncan <1i5t5.duncan@cox.net> wrote:
> For in-production use, therefore, btrfs raid56 mode, while now at least
> in theory complete, is really too immature at this point to recommend.

At some point perhaps a developer will have time to state the expected
stability level on stable hardware, and what things should be included in
a complete report. I see many reports including only the BUG/Warning with
the call trace, and too often problems were happening before that.

The XFS FAQ has an explicit "what to include in a report" entry that may
serve as a guide to adapt for Btrfs reports.

-- 
Chris Murphy

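Until such a guide exists for btrfs, a reasonable minimal set of things to
capture for a report might look like the sketch below; this is not an
official checklist, and the mount point is illustrative:

$ uname -a                    # kernel version
$ btrfs --version             # btrfs-progs version
$ sudo btrfs fi show          # devices and per-device usage
$ sudo btrfs fi df /mnt/nas   # data/metadata profiles and usage
$ dmesg > dmesg.txt           # full kernel log, not just the first trace
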
* Re: BTRFS RAID5 filesystem corruption during balance
  2015-05-22 19:15 ` Chris Murphy
@ 2015-05-23  2:56   ` Duncan
  0 siblings, 0 replies; 7+ messages in thread
From: Duncan @ 2015-05-23 2:56 UTC (permalink / raw)
To: linux-btrfs

Chris Murphy posted on Fri, 22 May 2015 13:15:09 -0600 as excerpted:

> On Thu, May 21, 2015 at 10:43 PM, Duncan <1i5t5.duncan@cox.net> wrote:
>> For in-production use, therefore, btrfs raid56 mode, while now at least
>> in theory complete, is really too immature at this point to recommend.
>
> At some point perhaps a developer will have time to state the expected
> stability level on stable hardware, and what things should be included
> in a complete report. I see many reports including only the BUG/Warning
> with the call trace, and too often problems were happening before that.
>
> The XFS FAQ has an explicit "what to include in a report" entry that may
> serve as a guide to adapt for Btrfs reports.

There's one spot on the wiki (bottom of the btrfs mailing lists page) that
lists the information to provide when filing a bug:

https://btrfs.wiki.kernel.org/index.php/Btrfs_mailing_list

But even being somewhat familiar with the wiki, and knowing it was, or had
been, somewhere on the wiki, I had trouble finding it. It's definitely not
in the first place I looked, the Problem FAQ, under "How do I report bugs
and issues?" (tho it does link to the list page):

https://btrfs.wiki.kernel.org/index.php/Problem_FAQ#How_do_I_report_bugs_and_issues.3F

If I had ever gotten around to getting a wiki login, I'd fix that. But for
some reason, while I seem to be fine posting to newsgroups and mailing
lists (as newsgroups, via gmane.org's list2news service), I mostly treat
the web, wikis included, as read-only, other than the occasional reply to
an article. I never got into web forums that much either. So if you have a
wiki login and time to fix it... =:^)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman

end of thread, other threads:[~2015-06-20  3:50 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-05-21 21:43 BTRFS RAID5 filesystem corruption during balance Jan Voet
2015-05-22  4:43 ` Duncan
2015-05-22 18:11   ` Jan Voet
2015-05-23 15:02     ` Jan Voet
2015-06-20  3:50       ` Russell Coker
2015-05-22 19:15   ` Chris Murphy
2015-05-23  2:56     ` Duncan