* Compression causes kernel crashes if there are I/O or checksum errors (was: RE: kernel BUG at fs/btrfs/volumes.c:5519 when hot-removing device in RAID-1)

From: James Johnston @ 2016-03-28 4:41 UTC
To: 'Chris Murphy'; +Cc: 'Btrfs BTRFS'

Hi,

After puzzling over the btrfs failure I reported here a week ago, I think there is a bad incompatibility between compression and RAID-1 (maybe other RAID levels too?). I think it is unsafe for users to use compression, at least with multiple devices, until this is fixed or investigated further. That seems like a drastic claim, but I know I will not be using it for now. Otherwise, checksum errors scattered across multiple devices that *should* be recoverable will render the file system unusable, even to read data from. (One alternative hypothesis might be that defragmentation causes the issue, since I used defragment to compress existing files.)

I finally was able to simplify this to a hopefully easy-to-reproduce test case, described in lengthier detail below. In summary, suppose we start with an uncompressed btrfs file system on only one disk containing the root file system, such as created by a clean install of a Linux distribution. I then: (1) enable compress=lzo in fstab, reboot, and then defragment the disk to compress all the existing files, (2) add a second drive to the array and balance for RAID-1, (3) reboot for good measure, (4) cause a high level of I/O errors, such as hot-removal of the second drive, OR simply a high level of bit rot (i.e. use dd to corrupt most of the disk, while either mounted or unmounted). This is guaranteed to cause the kernel to crash.

If the compression step is skipped so that the volume is uncompressed, you get lots of I/O errors logged - as expected. For hot-removal, as you point out, patches to auto-degrade the array aren't merged yet. For bit rot, the file system should log lots of checksum errors and corrections, but reads should still succeed. Most importantly, the kernel _does not fall over_ and bring the system down. I think that's acceptable behavior until the patches you mention are merged.

> There are a number of things missing from multiple device support,
> including any concept of a device becoming faulty (i.e. persistent
> failures rather than transient which Btrfs seems to handle OK for the
> most part), and then also getting it to go degraded automatically, and
> finally hot spare support. There are patches that could use testing.

I think in general, if the system can't handle a persistent failure, it can't reliably handle a transient failure either... you're just less likely to notice... The permanent failure just stress-tests the failure code - if you pay attention to the test case when hot-removing, you'll note that oftentimes dozens of I/O errors are mitigated successfully before one of them finally brings the system down.

What you've described above in the patch series are nice-to-have "fancy features" and I do hope they eventually get tested and merged, but I also hope the above patches let you disable them all, so that one can stress-test the code handling I/O failures without a drive getting auto-dropped from the array before the failure code has been exercised enough. The I/O errors in my dmesg I'm OK with, but if the file system crashes the kernel, that's bad news.

> I think when testing, it's simpler to not use any additional device
> mapper layers.
The test case eliminates all device mapper layers and just uses raw disks/partitions. Here it is - skip to step #5 for the meat of it:

1. Set up a new VirtualBox VM with:
   * System: Enable EFI
   * System: 8 GB RAM
   * System: 1 processor
   * Storage: Two SATA hard drives, 8 GB each, backed by dynamic VDI files
   * Storage: Default IDE CD-ROM is fine
   * Storage: The SATA hard drives must be hot-pluggable
   * Network: As you require
   * Serial port for debugging

2. Boot to http://releases.ubuntu.com/15.10/ubuntu-15.10-server-amd64.iso

3. Install Ubuntu 15.10 with default settings except as noted below:
   a. Network/user settings: make up settings/accounts as needed.
   b. Use Manual partitioning with these partitions on /dev/sda, in the following order:
      * 100 MB EFI System Partition
      * 500 MB btrfs, mount point at /boot
      * Remaining space: btrfs, mount point at /

4. Install and boot into the 4.6-rc1 mainline kernel:

   wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.6-rc1-wily/linux-image-4.6.0-040600rc1-generic_4.6.0-040600rc1.201603261930_amd64.deb
   dpkg -i linux-image-4.6.0-040600rc1-generic_4.6.0-040600rc1.201603261930_amd64.deb
   reboot

5. Set up compression and RAID-1 for the root partition onto /dev/sdb:

   # Add "compress=lzo" to all btrfs mounts:
   vim /etc/fstab
   reboot   # to take effect

   # Add second drive
   btrfs device add /dev/sdb /

   # Defragment to compress files
   btrfs filesystem defragment -v -c -r /home
   btrfs filesystem defragment -v -c -r /

   # Balance to RAID-1
   btrfs balance start -dconvert=raid1 -mconvert=raid1 -v /
   # btrfs fi usage says there was some single data until I did this, too:
   btrfs balance start -dconvert=raid1 -mconvert=raid1 -v /home

   # Make sure everything is RAID-1:
   btrfs filesystem usage /
   shutdown -P now

6. Take a snapshot of the VM... Then boot it again. After the system is done booting, log in. Then, using VirtualBox, remove the second hard drive from the system (that is, hot removal of /dev/sdb).

7. The ATA driver reports a problem with the device, shortly followed by some btrfs I/O errors, but that's OK (since the patches for marking failed devices as missing aren't merged yet). But the system will soon crash - hard. If the system doesn't crash soon after the btrfs I/O errors show up, this will kill it very, very quickly:

   cd /usr
   find | xargs cat > /dev/null

8. To demonstrate that the file system cannot even handle disk errors introduced offline, roll back to the snapshot you took in #6 before removing /dev/sdb. Then:
   a. Boot to the Ubuntu ISO DVD again, and go to a recovery prompt. Use "mount" to make sure that your btrfs isn't mounted, then wipe most of the data from the second drive, leaving the first few MB untouched so as to keep the file system headers intact:

      dd if=/dev/zero of=/dev/sdb bs=1M seek=10

      This simulates a massive amount of bit rot. Unrealistic? Maybe, but it's RAID-1 so it should survive; checksums will catch it.
   b. Remove the DVD and reboot back into your installed grub on /dev/sda. Try to boot your system. The kernel will crash with the same errors as previously seen when hot-removing the drive. (Note that if your volume was uncompressed, you'll get some checksum errors logged from btrfs but the system otherwise would boot fine.)

The snippet below was captured from my virtual machine while attempting to boot after zeroing most of /dev/sdb so as to cause lots of checksum errors. Again, this is from the Ubuntu 15.10 mainline Linux 4.6-rc1 kernel.
Scanning for Btrfs filesystems [ 10.567428] BTRFS: device fsid ea9f0a9a-24f7-4e3c-9024-ed9c445a838d devid 2 transid 149 /dev/sdb [ 10.632885] BTRFS: device fsid ea9f0a9a-24f7-4e3c-9024-ed9c445a838d devid 1 transid 149 /dev/sda3 [ 10.671767] BTRFS: device fsid 8a78480a-2c46-41be-a8aa-35bb5b626e07 devid 1 transid 25 /dev/sda2 done. Begin: Checking root file system ... fsck from util-linux 2.26.2 done.[ 10.760787] BTRFS info (device sda3): disk space caching is enabled [ 10.789275] BTRFS: has skinny extents done.[ 10.821080] BTRFS error (device sda3): bad tree block start 0 7879147520 [ 10.868372] BTRFS error (device sda3): bad tree block start 0 7774093312 [ 10.928190] BTRFS error (device sda3): bad tree block start 0 7755956224 [ 10.971386] BTRFS error (device sda3): bad tree block start 0 7756431360 [ 11.008786] BTRFS error (device sda3): bad tree block start 0 7756398592 Begin: Running /scripts/local-bo[ 11.060932] BTRFS error (device sda3): bad tree block start 0 7881998336 ttom ... done. Begin: Running /scripts/init-bottom ... done. [ 11.131834] BTRFS error (device sda3): bad tree block start 0 7880556544 [ 11.183365] BTRFS warning (device sda3): csum failed ino 1390 off 0 csum 2566472073 expected csum 3255664415 [ 11.210541] BTRFS warning (device sda3): csum failed ino 1390 off 4096 csum 2566472073 expected csum 4214559832 [ 11.220571] BTRFS warning (device sda3): csum failed ino 1390 off 8192 csum 2566472073 expected csum 480458051 [ 11.247389] BTRFS warning (device sda3): csum failed ino 1390 off 4096 csum 2566472073 expected csum 4214559832 [ 11.301305] BTRFS warning (device sda3): csum failed ino 1390 off 12288 csum 2566472073 expected csum 2350310827 [ 11.326303] BTRFS warning (device sda3): csum failed ino 1390 off 0 csum 2566472073 expected csum 3255664415 [ 11.337395] BTRFS warning (device sda3): csum failed ino 1390 off 12288 csum 2566472073 expected csum 2350310827 [ 11.368353] random: nonblocking pool is initialized [ 11.376003] BTRFS warning (device sda3): csum failed ino 1390 off 8192 csum 2566472073 expected csum 480458051 [ 11.405067] BTRFS error (device sda3): bad tree block start 0 7756693504 [ 11.439856] BTRFS error (device sda3): bad tree block start 0 7756709888 [ 11.474957] BTRFS warning (device sda3): csum failed ino 1547 off 0 csum 2566472073 expected csum 2456395887 [ 11.564869] BTRFS warning (device sda3): csum failed ino 1547 off 4096 csum 2566472073 expected csum 1646416170 [ 11.633791] BTRFS error (device sda3): bad tree block start 0 7756267520 [ 11.659413] BTRFS info (device sda3): csum failed ino 71484 extent 2603868160 csum 2566472073 wanted 1049865625 mirror 0 [ 11.673459] ------------[ cut here ]------------ [ 11.677450] kernel BUG at /home/kernel/COD/linux/fs/btrfs/volumes.c:5522! 
[ 11.711284] invalid opcode: 0000 [#1] SMP [ 11.713253] Modules linked in: btrfs xor raid6_pq hid_generic usbhid hid ahci psmouse libahci e1000 video pata_acpi fjes [ 11.765725] CPU: 0 PID: 6 Comm: kworker/u2:0 Not tainted 4.6.0-040600rc1-generic #201603261930 [ 11.824228] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 11.850704] Workqueue: btrfs-endio btrfs_endio_helper [btrfs] [ 11.865689] task: ffff8802161d4740 ti: ffff880216224000 task.ti: ffff880216224000 [ 11.905000] RIP: 0010:[<ffffffffc0166ed6>] [<ffffffffc0166ed6>] __btrfs_map_block+0xe36/0x11c0 [btrfs] [ 11.918735] RSP: 0000:ffff880216227a80 EFLAGS: 00010282 [ 11.933331] RAX: 0000000000001b23 RBX: 0000000000000002 RCX: 0000000000000002 [ 11.954117] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8800360f4e40 [ 11.980734] RBP: ffff880216227b68 R08: 000000021d8f0000 R09: 00000000dd801000 [ 11.986452] R10: 0000000000010000 R11: 000000001b240000 R12: 00000000dd800fff [ 12.017683] R13: 000000000000d000 R14: ffff880216227bb0 R15: 0000000000010000 [ 12.053965] FS: 0000000000000000(0000) GS:ffff88021fc00000(0000) knlGS:0000000000000000 [ 12.068187] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 12.070529] CR2: 00007fd993116000 CR3: 0000000215aca000 CR4: 00000000000406f0 [ 12.127336] Stack: [ 12.127720] 0000000000001000 00000000efaea33e ffff8800df5d6000 0000000000000001 [ 12.172375] ffff880216227ac8 ffffffffc015716d 0000000000000000 0000000000001b24 [ 12.199764] 0000000000001b23 ffff880200000000 0000000000000000 ffff8800360b8ee0 [ 12.209815] Call Trace: [ 12.218802] [<ffffffffc015716d>] ? release_extent_buffer+0x2d/0xc0 [btrfs] [ 12.263082] [<ffffffffc01677d8>] btrfs_map_bio+0x88/0x350 [btrfs] [ 12.291316] [<ffffffffc0185628>] btrfs_submit_compressed_read+0x468/0x4b0 [btrfs] [ 12.318768] [<ffffffffc013ad81>] btrfs_submit_bio_hook+0x1a1/0x1b0 [btrfs] [ 12.351884] [<ffffffffc015a2bc>] ? btrfs_create_repair_bio+0xdc/0x100 [btrfs] [ 12.357366] [<ffffffffc015a7a6>] end_bio_extent_readpage+0x4c6/0x5c0 [btrfs] [ 12.362801] [<ffffffffc015a2e0>] ? btrfs_create_repair_bio+0x100/0x100 [btrfs] [ 12.376990] [<ffffffff813b5617>] bio_endio+0x57/0x60 [ 12.388096] [<ffffffffc012ed3c>] end_workqueue_fn+0x3c/0x40 [btrfs] [ 12.409567] [<ffffffffc016c11a>] btrfs_scrubparity_helper+0xca/0x2e0 [btrfs] [ 12.411505] [<ffffffffc016c41e>] btrfs_endio_helper+0xe/0x10 [btrfs] [ 12.425833] [<ffffffff8109c845>] process_one_work+0x165/0x480 [ 12.435196] [<ffffffff8109cbab>] worker_thread+0x4b/0x500 [ 12.449821] [<ffffffff8109cb60>] ? process_one_work+0x480/0x480 [ 12.452473] [<ffffffff810a2df8>] kthread+0xd8/0xf0 [ 12.454110] [<ffffffff8183a122>] ret_from_fork+0x22/0x40 [ 12.455496] [<ffffffff810a2d20>] ? 
kthread_create_on_node+0x1a0/0x1a0 [ 12.463845] Code: 50 ff ff ff 48 2b 55 b8 48 0f af c2 48 63 d3 48 39 d0 48 0f 46 d0 48 89 55 88 89 d9 c7 85 60 ff ff ff 00 00 00 00 e9 de f3 ff ff <0f> 0b bb f4 ff ff ff e9 59 fb ff ff be 77 16 00 00 48 c7 c7 90 [ 12.557348] RIP [<ffffffffc0166ed6>] __btrfs_map_block+0xe36/0x11c0 [btrfs] [ 12.590478] RSP <ffff880216227a80> [ 12.599251] ---[ end trace 90172929edc1cb9b ]--- [ 12.615185] BUG: unable to handle kernel paging request at ffffffffffffffd8 [ 12.664311] IP: [<ffffffff810a34b0>] kthread_data+0x10/0x20 [ 12.668463] PGD 3e09067 PUD 3e0b067 PMD 0 [ 12.690490] Oops: 0000 [#2] SMP [ 12.700111] Modules linked in: btrfs xor raid6_pq hid_generic usbhid hid ahci psmouse libahci e1000 video pata_acpi fjes [ 12.757005] CPU: 0 PID: 6 Comm: kworker/u2:0 Tainted: G D 4.6.0-040600rc1-generic #201603261930 [ 12.786552] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 12.829870] task: ffff8802161d4740 ti: ffff880216224000 task.ti: ffff880216224000 [ 12.853873] RIP: 0010:[<ffffffff810a34b0>] [<ffffffff810a34b0>] kthread_data+0x10/0x20 [ 12.891065] RSP: 0000:ffff880216227768 EFLAGS: 00010002 [ 12.910124] RAX: 0000000000000000 RBX: ffff88021fc16c80 RCX: ffffffff8210b000 [ 12.911936] RDX: 0000000000000000 RSI: ffff8802161d47c0 RDI: ffff8802161d4740 [ 12.913719] RBP: ffff880216227768 R08: 00000000ffffffff R09: 0000000000000000 [ 12.949494] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000016c80 [ 12.965945] R13: 0000000000000000 R14: ffff88021fc16c80 R15: ffff8802161d4740 [ 12.967492] FS: 0000000000000000(0000) GS:ffff88021fc00000(0000) knlGS:0000000000000000 [ 12.969621] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 12.986957] CR2: 0000000000000028 CR3: 0000000215aca000 CR4: 00000000000406f0 [ 13.006393] Stack: [ 13.009209] ffff880216227778 ffffffff8109dc0e ffff8802162277c8 ffffffff81835baa [ 13.020081] ffff8800dd2e4078 ffff8802162277c0 ffff8802161d4740 ffff880216228000 [ 13.045368] 0000000000000000 ffff880216227338 0000000000000000 0000000000000000 [ 13.057369] Call Trace: [ 13.057883] [<ffffffff8109dc0e>] wq_worker_sleeping+0xe/0x90 [ 13.059069] [<ffffffff81835baa>] __schedule+0x52a/0x790 [ 13.072862] [<ffffffff81835e45>] schedule+0x35/0x80 [ 13.089262] [<ffffffff81086974>] do_exit+0x7b4/0xb50 [ 13.097663] [<ffffffff81031d93>] oops_end+0xa3/0xd0 [ 13.107790] [<ffffffff8103224b>] die+0x4b/0x70 [ 13.130895] [<ffffffff8102f1c3>] do_trap+0xb3/0x140 [ 13.134548] [<ffffffff8102f5b9>] do_error_trap+0x89/0x110 [ 13.136240] [<ffffffffc0166ed6>] ? __btrfs_map_block+0xe36/0x11c0 [btrfs] [ 13.161570] [<ffffffffc0132453>] ? btrfs_buffer_uptodate+0x53/0x70 [btrfs] [ 13.166816] [<ffffffffc010f2c1>] ? generic_bin_search.constprop.37+0x91/0x1a0 [btrfs] [ 13.175639] [<ffffffff8102fb60>] do_invalid_op+0x20/0x30 [ 13.177160] [<ffffffff8183b88e>] invalid_op+0x1e/0x30 [ 13.193004] [<ffffffffc0166ed6>] ? __btrfs_map_block+0xe36/0x11c0 [btrfs] [ 13.213146] [<ffffffffc015716d>] ? release_extent_buffer+0x2d/0xc0 [btrfs] [ 13.229069] [<ffffffffc01677d8>] btrfs_map_bio+0x88/0x350 [btrfs] [ 13.231534] [<ffffffffc0185628>] btrfs_submit_compressed_read+0x468/0x4b0 [btrfs] [ 13.237415] [<ffffffffc013ad81>] btrfs_submit_bio_hook+0x1a1/0x1b0 [btrfs] [ 13.268544] [<ffffffffc015a2bc>] ? btrfs_create_repair_bio+0xdc/0x100 [btrfs] [ 13.280976] [<ffffffffc015a7a6>] end_bio_extent_readpage+0x4c6/0x5c0 [btrfs] [ 13.291688] [<ffffffffc015a2e0>] ? 
btrfs_create_repair_bio+0x100/0x100 [btrfs] [ 13.305346] [<ffffffff813b5617>] bio_endio+0x57/0x60 [ 13.319591] [<ffffffffc012ed3c>] end_workqueue_fn+0x3c/0x40 [btrfs] [ 13.339111] [<ffffffffc016c11a>] btrfs_scrubparity_helper+0xca/0x2e0 [btrfs] [ 13.387961] [<ffffffffc016c41e>] btrfs_endio_helper+0xe/0x10 [btrfs] [ 13.406163] [<ffffffff8109c845>] process_one_work+0x165/0x480 [ 13.416087] [<ffffffff8109cbab>] worker_thread+0x4b/0x500 [ 13.478463] [<ffffffff8109cb60>] ? process_one_work+0x480/0x480 [ 13.481838] [<ffffffff810a2df8>] kthread+0xd8/0xf0 [ 13.556946] [<ffffffff8183a122>] ret_from_fork+0x22/0x40 [ 13.561131] [<ffffffff810a2d20>] ? kthread_create_on_node+0x1a0/0x1a0 [ 13.577065] Code: c4 c7 81 e8 e3 f3 fd ff e9 a2 fe ff ff 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 48 8b 87 50 05 00 00 55 48 89 e5 <48> 8b 40 d8 5d c3 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 [ 13.625921] RIP [<ffffffff810a34b0>] kthread_data+0x10/0x20 [ 13.649015] RSP <ffff880216227768> [ 13.653637] CR2: ffffffffffffffd8 [ 13.654328] ---[ end trace 90172929edc1cb9c ]--- [ 13.672040] Fixing recursive fault but reboot is needed! Best regards, James Johnston ^ permalink raw reply [flat|nested] 7+ messages in thread
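A note on capturing traces like the one above: the VM in step 1 was given a serial port, and routing the kernel console to it lets the whole oops land in a host-side file even when the guest locks up hard. A minimal sketch, assuming an Ubuntu-style grub configuration and that VirtualBox's COM1 has been pointed at a raw host file:

    # /etc/default/grub - send kernel messages to both the VGA text
    # console and the first (VirtualBox-emulated) serial port.
    GRUB_CMDLINE_LINUX_DEFAULT="console=tty0 console=ttyS0,115200n8"

    # Then regenerate the grub configuration and reboot:
    update-grub
    reboot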
* Re: Compression causes kernel crashes if there are I/O or checksum errors (was: RE: kernel BUG at fs/btrfs/volumes.c:5519 when hot-removing device in RAID-1) 2016-03-28 4:41 Compression causes kernel crashes if there are I/O or checksum errors (was: RE: kernel BUG at fs/btrfs/volumes.c:5519 when hot-removing device in RAID-1) James Johnston @ 2016-03-28 10:26 ` Duncan 2016-03-28 14:34 ` James Johnston 0 siblings, 1 reply; 7+ messages in thread From: Duncan @ 2016-03-28 10:26 UTC (permalink / raw) To: linux-btrfs James Johnston posted on Mon, 28 Mar 2016 04:41:24 +0000 as excerpted: > After puzzling over the btrfs failure I reported here a week ago, I > think there is a bad incompatibility between compression and RAID-1 > (maybe other RAID levels too?). I think it is unsafe for users to use > compression, at least with multiple devices until this is > fixed/investigated further. That seems like a drastic claim, but I know > I will not be using it for now. Otherwise, checksum errors scattered > across multiple devices that *should* be recoverable will render the > file system unusable, even to read data from. (One alternative > hypothesis might be that defragmentation causes the issue, since I used > defragment to compress existing files.) > > I finally was able to simplify this to a hopefully easy to reproduce > test case, described in lengthier detail below. In summary, suppose we > start with an uncompressed btrfs file system on only one disk containing > the root file system, > such as created by a clean install of a Linux distribution. I then: > (1) enable compress=lzo in fstab, reboot, and then defragment the disk > to compress all the existing files, (2) add a second drive to the array > and balance for RAID-1, (3) reboot for good measure, (4) cause a high > level of I/O errors, such as hot-removal of the second drive, OR simply > a high level of bit rot (i.e. use dd to corrupt most of the disk, while > either mounted or unmounted). This is guaranteed to cause the kernel to > crash. Described that way, my own experience confirms your tests, except that (1) I hadn't tested the no-compression case to know it was any different, and (2) in my case I was actually using btrfs raid1 mode and scrub to be able to continue to deal with a failing ssd out of a pair, for quite some while after I would have ordinarily had to replace it were I not using something like btrfs raid1 with checksummed file integrity and scrubbing errors with replacements from the good device. Here's how it worked for me and why I ultimately agree with your conclusions, at least regarding compressed raid1 mode crashes due to too many failed checksum failures (since I have no reference to agree or disagree with the uncompressed case). As I said above, I had one ssd failing, but was taking the opportunity while I had it to watch its behavior deeper into the failure than I normally would, and while I was at it, get familiar enough with btrfs scrub to repair errors that it became just another routine command for me (to the point that I even scripted up a custom scrub command complete with my normally used options, etc). On the relatively small (largest was 24 GiB per device, paired device btrfs raid1) multiple btrfs on partitions on the two devices scrub was normally under a minute to run even when doing quite a few repairs, so it wasn't as if it was taking me the hours to days it can take at TB scale on spinning rust. 
The failure mode of this particular ssd was premature failure of more and more sectors, about 3 MiB worth over several months based on the raw count of reallocated sectors in smartctl -A, but using scrub to rewrite them from the good device would normally work, forcing the firmware to remap that sector to one of the spares as scrub corrected the problem. One not immediately intuitive thing I found with scrub, BTW, was that if it finished with unverified errors, I needed to rerun scrub again to do further repairs. I've since confirmed with someone who can read code (I sort of do but more at the admin playing with patches level than the dev level) that my guess at the reason behind this behavior was correct. When a metadata node fails checksum verification and is repaired, the checksums that it in turn contained cannot be verified in that pass and show up as unverified errors. A repeated scrub once those errors are fixed can verify and fix if necessary those additional nodes, and occasionally up to three or four runs were necessary to fully verify and repair all blocks, eliminating all unverified errors, at which point further scrubs found no further errors. It occurred to me as I write this, that the problem I saw and you have confirmed with testing and now reported, may actually be related to some interaction between these unverified errors and compressed blocks. Anyway, as it happens, my / filesystem is normally mounted ro except during updates and by the end I was scrubbing after updates, and even after extended power-downs, so it generally had only a few errors. But /home (on an entirely separate filesystem, but a filesystem still on a pair of partitions, one on each of the same two ssds) would often have more, and because I have a particular program that I start with my X and KDE session that reads a bunch of files into cache as it starts up, I had a systemd service configured to start at boot and cat all the files in that particular directory to /dev/null, thus caching them so when I later started X and KDE (I don't run a *DM and thus login at the text CLI and startx, with a kde session, from the CLI) and thus this program, all the files it reads would already be in cache. And that's where the problem actually was and how I can actually confirm your report. If that service didn't run, that directory, including some new files with a relatively large chance of being written to bad parts of the ssd as they hadn't been repositioned via scrub to relatively sound areas yet, wouldn't have all its files read, and the relative number of checksum errors would normally remain below whatever point triggered the kernel crash. If that service was allowed to run, it would read in all those files and the resulting errors would often crash the kernel. So I quickly learned that if I powered up and the kernel crashed at that point, I could reboot with the emergency kernel parameter, which would tell systemd to give me a maintenance-mode root login prompt after doing its normal mounts but before starting the normal post-mount services, and I could run scrub from there. That would normally repair things without triggering the crash, and when I had run scrub repeatedly if necessary to correct any unverified errors in the first runs, I could then exit emergency mode and let systemd start the normal services, including the service that read all these files off the now freshly scrubbed filesystem, without further issues. 
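That repeat-until-clean scrub routine is simple enough to script. A minimal sketch, assuming the mount point is passed as an argument and that the "unverified errors" counter printed by btrfs scrub status is the one to watch (the output wording varies between btrfs-progs versions, so the parsing here is an assumption):

    #!/bin/sh
    # Rerun btrfs scrub until no unverified errors remain, or give up.
    MNT=${1:-/home}

    for pass in 1 2 3 4 5; do
        # -B: stay in the foreground; -d: print per-device statistics.
        btrfs scrub start -Bd "$MNT"

        # Sum every number that follows the word "unverified" in the
        # status output; the exact wording differs across versions.
        unverified=$(btrfs scrub status -d "$MNT" \
            | grep -io 'unverified[^0-9]*[0-9][0-9]*' \
            | grep -o '[0-9][0-9]*$' \
            | awk '{ s += $1 } END { print s + 0 }')

        echo "pass $pass: unverified errors = $unverified"
        [ "$unverified" -eq 0 ] && exit 0
    done

    echo "still seeing unverified errors after $pass passes" >&2
    exit 1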
Needless to say, after dealing with that a few times and figuring out what was actually triggering the crashes, I disabled that cache-ahead service and started doing scrubs before I loaded X and KDE, and thus the app that would read all those files I had been trying to pre-cache. And it wasn't /too/ long after that, that I decided I had observed the slow failure and remapped sectors thing for long enough, and I was tired of doing scrubs more and more often to try to keep up with things, and I did a final scrub and then a btrfs replace of that failing ssd. The new one (as well as the other one that never had a problem) were the same brand and model number, but have remained fine, both then and since.

But the point of all that is to confirm your experience. At least with compression, once the number of checksum failures goes too high, even if it's supposedly reading from the good copy and fixing things as it goes, eventually a kernel crash is triggered. A reboot and scrub before triggering so many checksum failures fixed things for me, so it was indeed simple checksum failures with good second copies, but something about the process would still crash the kernel if it saw too many of them in too short a period.

So it's definitely behavior I've confirmed with compression on. I can't confirm that it doesn't happen with compression off as I've never tried that, but that would explain why it hasn't been more commonly reported and thus likely fixed by now. And apparently the devs don't test the somewhat less common combination of both compression and high numbers of raid1 correctable checksum errors, or they would have probably detected and fixed the problem from that.

So thanks for the additional tests and narrowing it down to the compression on raid1 with many checksum errors case. Now that you've found out how the problem can be replicated, I'd guess we'll have a fix patch in relatively short order. =:^)

That said, based on my own experience, I don't consider the problem dire enough to switch off compression on my btrfs raid1s here. After all, I both figured out how to live with the problem on my failing ssd before I knew all this detail, and have eliminated the symptoms for the time being at least, as the devices I'm using now are currently reliable enough that I don't have to deal with this issue.

And in the event that I do encounter the problem again, in severe enough form that I can't even get a successful scrub in to fix it, possibly due to catastrophic failure of a device, I should still be able to simply remove that device and use degraded,ro mounts of the remaining device to get access to the data in order to copy it to a replacement filesystem. Which is already how I intended to deal with a catastrophic device failure, should it happen, so no real change of plans at all.

In my case I don't have to have live failover, so shutdown and cold replacement, if necessary with degraded,ro mounting and using the old filesystem as a backup to restore to a new filesystem on the new device, then device-adding the old device partitions to the new filesystem and reconverting to raid1, will be a bit of a hassle, but should otherwise be a reasonably straightforward recovery. And that's perfectly sufficient for my purposes if necessary, even if I'd prefer to avoid that degraded-readonly situation in the first place, when it's possible.
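For reference, that cold-replacement plan amounts to something like the following sketch; the device names, mount points, and the use of a single fresh disk are assumptions here, not a tested procedure:

    # /dev/sda3 is the surviving raid1 member, /dev/sdc the replacement disk.
    mkdir -p /mnt/old /mnt/new

    # Mount the survivor degraded and read-only, purely as a backup source.
    mount -o degraded,ro /dev/sda3 /mnt/old

    # Build the new filesystem and copy everything over.
    mkfs.btrfs /dev/sdc
    mount /dev/sdc /mnt/new
    cp -a /mnt/old/. /mnt/new/

    # Once the copy is verified, retire the old filesystem, hand its
    # partition to the new filesystem, and convert back to raid1.
    umount /mnt/old
    btrfs device add -f /dev/sda3 /mnt/new
    btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt/new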
But the fix for this one should go quite some way toward increasing btrfs raid1 robustness, and will definitely be a noticeable step on the journey toward production-ready. And now that you've nailed it down so nicely, a fix should be quickly forthcoming. =:^)

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
* RE: Compression causes kernel crashes if there are I/O or checksum errors (was: RE: kernel BUG at fs/btrfs/volumes.c:5519 when hot-removing device in RAID-1) 2016-03-28 10:26 ` Duncan @ 2016-03-28 14:34 ` James Johnston 2016-03-29 2:23 ` Duncan 2016-03-29 19:02 ` Mitch Fossen 0 siblings, 2 replies; 7+ messages in thread From: James Johnston @ 2016-03-28 14:34 UTC (permalink / raw) To: 'Duncan', linux-btrfs Hi, Thanks for the corroborating report - it does sound to me like you ran into the same problem I've found. (I don't suppose you ever captured any of the crashes? If they assert on the same thing as me then it's even stronger evidence.) > The failure mode of this particular ssd was premature failure of more and > more sectors, about 3 MiB worth over several months based on the raw > count of reallocated sectors in smartctl -A, but using scrub to rewrite > them from the good device would normally work, forcing the firmware to > remap that sector to one of the spares as scrub corrected the problem. I wonder what the risk of a CRC collision was in your situation? Certainly my test of "dd if=/dev/zero of=/dev/sdb" was very abusive, and I wonder if the result after scrubbing is trustworthy, or if there was some collisions. But I wasn't checking to see if data coming out the other end was OK - I was just trying to see if the kernel crashes or not (e.g. a USB stick holding a bad btrfs file system should not crash a system). > But /home (on an entirely separate filesystem, but a filesystem still on > a pair of partitions, one on each of the same two ssds) would often have > more, and because I have a particular program that I start with my X and > KDE session that reads a bunch of files into cache as it starts up, I had > a systemd service configured to start at boot and cat all the files in > that particular directory to /dev/null, thus caching them so when I later > started X and KDE (I don't run a *DM and thus login at the text CLI and > startx, with a kde session, from the CLI) and thus this program, all the > files it reads would already be in cache. > > <snip> If that service was allowed to run, it would read in all > those files and the resulting errors would often crash the kernel. This sounds oddly familiar to how I made it crash. :) > So I quickly learned that if I powered up and the kernel crashed at that > point, I could reboot with the emergency kernel parameter, which would > tell systemd to give me a maintenance-mode root login prompt after doing > its normal mounts but before starting the normal post-mount services, and > I could run scrub from there. That would normally repair things without > triggering the crash, and when I had run scrub repeatedly if necessary to > correct any unverified errors in the first runs, I could then exit > emergency mode and let systemd start the normal services, including the > service that read all these files off the now freshly scrubbed > filesystem, without further issues. That is one thing I did not test. I only ever scrubbed after first doing the "cat all files to null" test. So in the case of compression, I never got that far. Probably someone should test the scrubbing more thoroughly (i.e. with that abusive "dd" test I did) just to be sure that it is stable to confirm your observations, and that the problem is only limited to ordinary file I/O on the file system. 
> And apparently the devs don't test the > someone less common combination of both compression and high numbers of > raid1 correctable checksum errors, or they would have probably detected > and fixed the problem from that. Well, I've only tested with RAID-1. I don't know if: 1. The problem occurs with other RAID levels like RAID-10, RAID5/6. 2. The kernel crashes in non-duplicated levels. In these cases, data loss is inevitable since the data is missing, but these losses should be handled cleanly, and not by crashing the kernel. For example: a. Checksum errors in RAID-0. b. Checksum errors on a single hard drive (not multiple device array). I guess more testing is needed, but I don't have time to do this more exhaustive testing right now, especially for these other RAID levels I'm not planning to use (as I'm doing this in my limited free time). (For now, I can just turn off compression & move on.) Do any devs do regular regression testing for these sorts of edge cases once they come up? (i.e. this problem won't come back, will it?) > So thanks for the additional tests and narrowing it down to the > compression on raid1 with many checksum errors case. Now that you've > found out how the problem can be replicated, I'd guess we'll have a fix > patch in relatively short order. =:^) Hopefully! Like I said, it might not be limited to RAID-1 though. I only tested RAID-1. > That said, based on my own experience, I don't consider the problem dire > enough to switch off compression on my btrfs raid1s here. After all, I > both figured out how to live with the problem on my failing ssd before I > knew all this detail, and have eliminated the symptoms for the time being > at least, as the devices I'm using now are currently reliable enough that > I don't have to deal with this issue. > > And in the even that I do encounter the problem again, in severe enough > form that I can't even get a successful scrub in to fix it, possibly due > to catastrophic failure of a device, I should still be able to simply > remove that device and use degraded,ro mounts of the remaining device to > get access to the data in ordered to copy it to a replacement filesystem. That sounds like it would work. Assuming this bug doesn't eat data in the process. I have not tried scrubbing after encountering this bug. The remaining "good" device in the array ought to still be ok. But I have not tested. You might want to test that. The most severe form might be if the drive drops off the SATA bus, which from what I read is not an uncommon failure mode. In that case, you're probably guaranteed to encounter this in short order and the system is going to go down. I did at one point awhile back test that I could boot the system degraded after it went down from hot-removing a drive. This was ultimately successful (after manually tweaking the boot process in grub/initramfs: unrelated issues), but I don't recall scrubbing it afterwards. Best regards, James Johnston ^ permalink raw reply [flat|nested] 7+ messages in thread
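Exercising the other profiles listed above doesn't necessarily require spare hardware. An untested sketch of a loop-device harness follows; the sizes, paths, and profile list are assumptions, and it should only ever be run in a throwaway VM, since a "positive" result is a kernel crash:

    #!/bin/sh
    # For each profile: build a 4-device btrfs with compression, fill it,
    # corrupt one member, then read everything back to see whether the
    # kernel merely logs csum/I-O errors or falls over.
    set -e
    DIR=/var/tmp/btrfs-corruption-test
    MNT=/mnt/test
    mkdir -p "$DIR" "$MNT"

    for prof in single raid0 raid1 raid10; do
        devs=""
        for n in 1 2 3 4; do
            truncate -s 2G "$DIR/disk$n.img"
            devs="$devs $(losetup --find --show "$DIR/disk$n.img")"
        done
        set -- $devs

        mkfs.btrfs -f -d "$prof" -m "$prof" $devs
        mount -o compress=lzo "$1" "$MNT"
        cp -a /usr/share "$MNT/"          # reasonably compressible data
        umount "$MNT"

        # Corrupt one member, skipping the first few MiB of superblocks.
        dd if=/dev/zero of="$2" bs=1M seek=10 count=500 conv=notrunc

        mount -o compress=lzo "$1" "$MNT"
        find "$MNT" -type f -print0 | xargs -0 cat > /dev/null || true
        umount "$MNT"

        losetup -d $devs
        rm -f "$DIR"/disk*.img
    done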
* Re: Compression causes kernel crashes if there are I/O or checksum errors (was: RE: kernel BUG at fs/btrfs/volumes.c:5519 when hot-removing device in RAID-1) 2016-03-28 14:34 ` James Johnston @ 2016-03-29 2:23 ` Duncan 2016-03-29 19:02 ` Mitch Fossen 1 sibling, 0 replies; 7+ messages in thread From: Duncan @ 2016-03-29 2:23 UTC (permalink / raw) To: linux-btrfs James Johnston posted on Mon, 28 Mar 2016 14:34:14 +0000 as excerpted: > Thanks for the corroborating report - it does sound to me like you ran > into the same problem I've found. (I don't suppose you ever captured > any of the crashes? If they assert on the same thing as me then it's > even stronger evidence.) No... In fact, as I have compress=lzo on all my btrfs, until you found out that it didn't happen in the uncompressed case, I simply considered that part and parcel of btrfs not being fully stabilized and mature yet. I didn't even consider it a specific bug on its own, and thus didn't report it or trace it in any way, and simply worked around it, even tho I certainly found it frustrating. >> The failure mode of this particular ssd was premature failure of more >> and more sectors, about 3 MiB worth over several months based on the >> raw count of reallocated sectors in smartctl -A, but using scrub to >> rewrite them from the good device would normally work, forcing the >> firmware to remap that sector to one of the spares as scrub corrected >> the problem. > > I wonder what the risk of a CRC collision was in your situation? > > Certainly my test of "dd if=/dev/zero of=/dev/sdb" was very abusive, and > I wonder if the result after scrubbing is trustworthy, or if there was > some collisions. But I wasn't checking to see if data coming out the > other end was OK - I was just trying to see if the kernel crashes or not > (e.g. a USB stick holding a bad btrfs file system should not crash a > system). I had absolutely no trouble with the scrubbed data, or at least none I attributed to that, tho I didn't have the data cross-hashed and cross- check the post-scrub result against earlier hashes or anything, so a few CRC collisions could have certainly snuck thru. But even were some to have done so, or even if they didn't in practice, if they could have in theory, just the standard crc checks are so far beyond what's built into a normal filesystem like the reiserfs that's still my second (and non-btrfs) level backup. So it's not like I'm majorly concerned. If I was paranoid, as I mentioned I could certainly be doing cross-checks against multiple hashes, but I survived without any sort of routine data integrity checking for years, and even a practical worst-case-scenario crc-collision is already an infinite percentage better than that (just as 1 is an infinite percentage of 0), so it's nothing I'm going to worry about unless I actually start seeing real cases of it. >> So I quickly learned that if I powered up and the kernel crashed at >> that point, I could reboot with the emergency kernel parameter, which >> would tell systemd to give me a maintenance-mode root login prompt >> after doing its normal mounts but before starting the normal post-mount >> services, and I could run scrub from there. 
That would normally repair >> things without triggering the crash, and when I had run scrub >> repeatedly if necessary to correct any unverified errors in the first >> runs, I could then exit emergency mode and let systemd start the normal >> services, including the service that read all these files off the now >> freshly scrubbed filesystem, without further issues. > > That is one thing I did not test. I only ever scrubbed after first > doing the "cat all files to null" test. So in the case of compression, > I never got that far. Probably someone should test the scrubbing more > thoroughly (i.e. with that abusive "dd" test I did) just to be sure that > it is stable to confirm your observations, and that the problem is only > limited to ordinary file I/O on the file system. I suspect that when the devs duplicate the bug and ultimately trace it down, we'll know from the code-path whether scrub could have hit it or not, without actually testing the scrub case on its own. And along with the fix it's a fair bet will be an fstests patch that will verify no regressions there once fixed, as well. Once the fstests patch is in, it should be just a small tweak to test whether scrub's subject to the problem if it uses a different code-path, or not, and in fact once they find and verify with a fix the problem here, even if scrub doesn't use that code-path, I expect they'll be verifying scrub's own code-paths as well. >> And apparently the devs don't test the someone less common combination >> of both compression and high numbers of raid1 correctable checksum >> errors, or they would have probably detected and fixed the problem from >> that. > > Well, I've only tested with RAID-1. I don't know if: > > 1. The problem occurs with other RAID levels like RAID-10, RAID5/6. > > 2. The kernel crashes in non-duplicated levels. In these cases, data > loss is inevitable since the data is missing, but these losses should be > handled cleanly, and not by crashing the kernel. Good points. Again, I expect the extent of the bug based on its code- path and what actually uses it, should be readily apparent once the bug is traced, and that tests will be developed to cover the other code paths, once the scope of this specific affected code path is known, if necessary. Tho it's quite possible that testing these sorts of things will help further narrow the problem space, and thus could help in actually tracing down the bug, in the first place. > Do any devs do regular regression testing for these sorts of edge cases > once they come up? (i.e. this problem won't come back, will it?) As should (now) be obvious from the above, yes, definitely. Most patches to specific bugs also include fstests (originally xfstests, I don't actually know if the name has actually changed or if the fstests references I see are simply using an informal generic name, now that xfstest covers all sorts of other filesystems including btrfs, as well) patches to ensure the same bugs or anything close enough to them to trigger the same test-fails, don't get reintroduced. -- Duncan - List replies preferred. No HTML msgs. "Every nonfree program has a lord, a master -- and if you use the program, he is your master." Richard Stallman ^ permalink raw reply [flat|nested] 7+ messages in thread
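For the curious, a regression test for this report would probably end up as a small shell script under fstests' tests/btrfs/ directory. The sketch below only gestures at the shape of such a test; the _scratch_* helpers and variables are recalled from fstests' common code and should be treated as assumptions rather than verified fstests API:

    #! /bin/bash
    # Pseudo-fstests test: compressed raid1 with one corrupted member must
    # produce csum errors and repairs, not a kernel crash.
    . ./common/rc

    _require_scratch_dev_pool 2
    _scratch_dev_pool_get 2

    _scratch_pool_mkfs "-d raid1 -m raid1" >> $seqres.full 2>&1
    _scratch_mount "-o compress=lzo"

    # Write easily compressible data so compressed extents are created.
    dd if=/dev/zero of=$SCRATCH_MNT/data bs=1M count=256 conv=fsync
    _scratch_unmount

    # Corrupt the second pool device beyond the primary superblock.
    second_dev=$(echo $SCRATCH_DEV_POOL | awk '{ print $2 }')
    dd if=/dev/zero of=$second_dev bs=1M seek=10 count=100 conv=notrunc

    # Reading the file back should repair from the good copy; if the
    # kernel BUGs here, the regression has returned.
    _scratch_mount "-o compress=lzo"
    cat $SCRATCH_MNT/data > /dev/null

    echo "Silence is golden"
    _scratch_dev_pool_put
    status=0
    exit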
* Re: Compression causes kernel crashes if there are I/O or checksum errors (was: RE: kernel BUG at fs/btrfs/volumes.c:5519 when hot-removing device in RAID-1) 2016-03-28 14:34 ` James Johnston 2016-03-29 2:23 ` Duncan @ 2016-03-29 19:02 ` Mitch Fossen 2016-04-01 18:53 ` mitch 1 sibling, 1 reply; 7+ messages in thread From: Mitch Fossen @ 2016-03-29 19:02 UTC (permalink / raw) To: James Johnston, Duncan, linux-btrfs Hello, Your experience looks similar to an issue that I've been running into recently. I have a btrfs array in RAID0 with compression=lzo set. The machine runs fine for awhile, then crashes at (seemingly) random with an error message in the journal about a stuck CPU and an issue with the kworker process. There are also a bunch of files on it that have been corrupted and throw csum errors when trying to access them. Combine that with some scheduled jobs that run every night that transfer files, and it's making more sense that this issue could be the same as you encountered. This happened on Scientific Linux 7.2 with kernel-ml (which I think is on version 4.5 now) installed from elrepo and the latest btrfs-progs. I also booted from an Ubuntu 15.10 USB drive and mounted the damaged array and ran "find /home -type f -exec cat {} /dev/null \;" from it and it looks like that has failed as well. I'll try to get the journal output posted and see if that could help narrow down the cause of the problem. Let me know if there's anything else you want me to take a look at or test on my machine that could help. Thanks, Mitch Fossen On Mon, Mar 28, 2016 at 9:36 AM James Johnston <johnstonj.public@codenest.com> wrote: > > Hi, > > Thanks for the corroborating report - it does sound to me like you ran into the > same problem I've found. (I don't suppose you ever captured any of the > crashes? If they assert on the same thing as me then it's even stronger > evidence.) > > > The failure mode of this particular ssd was premature failure of more and > > more sectors, about 3 MiB worth over several months based on the raw > > count of reallocated sectors in smartctl -A, but using scrub to rewrite > > them from the good device would normally work, forcing the firmware to > > remap that sector to one of the spares as scrub corrected the problem. > > I wonder what the risk of a CRC collision was in your situation? > > Certainly my test of "dd if=/dev/zero of=/dev/sdb" was very abusive, and I > wonder if the result after scrubbing is trustworthy, or if there was some > collisions. But I wasn't checking to see if data coming out the other end was > OK - I was just trying to see if the kernel crashes or not (e.g. a USB stick > holding a bad btrfs file system should not crash a system). > > > But /home (on an entirely separate filesystem, but a filesystem still on > > a pair of partitions, one on each of the same two ssds) would often have > > more, and because I have a particular program that I start with my X and > > KDE session that reads a bunch of files into cache as it starts up, I had > > a systemd service configured to start at boot and cat all the files in > > that particular directory to /dev/null, thus caching them so when I later > > started X and KDE (I don't run a *DM and thus login at the text CLI and > > startx, with a kde session, from the CLI) and thus this program, all the > > files it reads would already be in cache. > > > > <snip> If that service was allowed to run, it would read in all > > those files and the resulting errors would often crash the kernel. 
> > This sounds oddly familiar to how I made it crash. :) > > > So I quickly learned that if I powered up and the kernel crashed at that > > point, I could reboot with the emergency kernel parameter, which would > > tell systemd to give me a maintenance-mode root login prompt after doing > > its normal mounts but before starting the normal post-mount services, and > > I could run scrub from there. That would normally repair things without > > triggering the crash, and when I had run scrub repeatedly if necessary to > > correct any unverified errors in the first runs, I could then exit > > emergency mode and let systemd start the normal services, including the > > service that read all these files off the now freshly scrubbed > > filesystem, without further issues. > > That is one thing I did not test. I only ever scrubbed after first doing the > "cat all files to null" test. So in the case of compression, I never got that > far. Probably someone should test the scrubbing more thoroughly (i.e. with > that abusive "dd" test I did) just to be sure that it is stable to confirm your > observations, and that the problem is only limited to ordinary file I/O on the > file system. > > > And apparently the devs don't test the > > someone less common combination of both compression and high numbers of > > raid1 correctable checksum errors, or they would have probably detected > > and fixed the problem from that. > > Well, I've only tested with RAID-1. I don't know if: > > 1. The problem occurs with other RAID levels like RAID-10, RAID5/6. > > 2. The kernel crashes in non-duplicated levels. In these cases, data loss is > inevitable since the data is missing, but these losses should be handled > cleanly, and not by crashing the kernel. For example: > > a. Checksum errors in RAID-0. > b. Checksum errors on a single hard drive (not multiple device array). > > I guess more testing is needed, but I don't have time to do this more > exhaustive testing right now, especially for these other RAID levels I'm not > planning to use (as I'm doing this in my limited free time). (For now, I can > just turn off compression & move on.) > > Do any devs do regular regression testing for these sorts of edge cases once > they come up? (i.e. this problem won't come back, will it?) > > > So thanks for the additional tests and narrowing it down to the > > compression on raid1 with many checksum errors case. Now that you've > > found out how the problem can be replicated, I'd guess we'll have a fix > > patch in relatively short order. =:^) > > Hopefully! Like I said, it might not be limited to RAID-1 though. I only > tested RAID-1. > > > That said, based on my own experience, I don't consider the problem dire > > enough to switch off compression on my btrfs raid1s here. After all, I > > both figured out how to live with the problem on my failing ssd before I > > knew all this detail, and have eliminated the symptoms for the time being > > at least, as the devices I'm using now are currently reliable enough that > > I don't have to deal with this issue. > > > > And in the even that I do encounter the problem again, in severe enough > > form that I can't even get a successful scrub in to fix it, possibly due > > to catastrophic failure of a device, I should still be able to simply > > remove that device and use degraded,ro mounts of the remaining device to > > get access to the data in ordered to copy it to a replacement filesystem. > > That sounds like it would work. Assuming this bug doesn't eat data in the > process. 
I have not tried scrubbing after encountering this bug. The remaining > "good" device in the array ought to still be ok. But I have not tested. You > might want to test that. > > The most severe form might be if the drive drops off the SATA bus, which from > what I read is not an uncommon failure mode. In that case, you're probably > guaranteed to encounter this in short order and the system is going to go down. > > I did at one point awhile back test that I could boot the system degraded after > it went down from hot-removing a drive. This was ultimately successful (after > manually tweaking the boot process in grub/initramfs: unrelated issues), but I > don't recall scrubbing it afterwards. > > Best regards, > > James Johnston > > > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 7+ messages in thread
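One thing that may help narrow Mitch's case down is mapping the csum complaints in the journal back to file names before the nightly jobs touch them. A hedged sketch, assuming the log lines still carry the "ino NNN" numbers shown earlier in this thread and that the affected filesystem is mounted at /home:

    #!/bin/sh
    # List the paths behind the inode numbers that btrfs csum warnings
    # mention, so the damaged files can be checked or excluded by hand.
    MNT=/home

    dmesg | grep 'csum failed' \
        | grep -o 'ino [0-9]*' \
        | awk '{ print $2 }' \
        | sort -un \
        | while read ino; do
            # inode-resolve turns an inode number into path(s) under $MNT.
            btrfs inspect-internal inode-resolve "$ino" "$MNT" || true
        done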
* Re: Compression causes kernel crashes if there are I/O or checksum errors (was: RE: kernel BUG at fs/btrfs/volumes.c:5519 when hot-removing device in RAID-1) 2016-03-29 19:02 ` Mitch Fossen @ 2016-04-01 18:53 ` mitch 2016-04-01 20:54 ` James Johnston 0 siblings, 1 reply; 7+ messages in thread From: mitch @ 2016-04-01 18:53 UTC (permalink / raw) To: James Johnston, Duncan, linux-btrfs I grabbed this part from the log after the machine crashed again following trying to transfer a bunch of files that included ones with csum errors, let me know if this looks like the same issue you were having: Mar 31 00:49:42 sl-server kernel: NMI watchdog: BUG: soft lockup - CPU#21 stuck for 22s! [kworker/u67:5:80994] Mar 31 00:49:42 sl-server kernel: Modules linked in: fuse xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw iptable_filter dm_mirror dm_region_hash dm_log dm_mod kvm_amd kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel xfs aesni_intel lrw gf128mul glue_helper libcrc32c ablk_helper cryptd joydev input_leds edac_mce_amd k10temp edac_core fam15h_power sp5100_tco sg i2c_piix4 8250_fintek acpi_cpufreq shpchp nfsd auth_rpcgss nfs_acl Mar 31 00:49:42 sl-server kernel: lockd grace sunrpc ip_tables btrfs xor ata_generic pata_acpi raid6_pq sd_mod mgag200 crc32c_intel drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci serio_raw pata_atiixp libahci igb drm ptp pps_core mpt3sas dca raid_class libata i2c_algo_bit scsi_transport_sas fjes uas usb_storage Mar 31 00:49:42 sl-server kernel: CPU: 21 PID: 80994 Comm: kworker/u67:5 Not tainted 4.5.0-1.el7.elrepo.x86_64 #1 Mar 31 00:49:42 sl-server kernel: Hardware name: Supermicro H8DG6/H8DGi/H8DG6/H8DGi, BIOS 3.5 11/25/2013 Mar 31 00:49:42 sl-server kernel: Workqueue: btrfs-endio btrfs_endio_helper [btrfs] Mar 31 00:49:42 sl-server kernel: task: ffff8817f6fa8000 ti: ffff8800b7310000 task.ti: ffff8800b7310000 Mar 31 00:49:42 sl-server kernel: RIP: 0010:[<ffffffffa0347b13>] [<ffffffffa0347b13>] btrfs_decompress_buf2page+0x123/0x200 [btrfs] Mar 31 00:49:42 sl-server kernel: RSP: 0018:ffff8800b7313be0 EFLAGS: 00000246 Mar 31 00:49:42 sl-server kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 Mar 31 00:49:42 sl-server kernel: RDX: 0000000000000000 RSI: ffffc9000e3d8000 RDI: ffff88144c7cc000 Mar 31 00:49:42 sl-server kernel: RBP: ffff8800b7313c48 R08: ffff8810f0295000 R09: 0000000000000020 Mar 31 00:49:42 sl-server kernel: R10: ffff8810d2ba7869 R11: 0000000000010008 R12: ffff8817f6fa8000 Mar 31 00:49:42 sl-server kernel: R13: ffff8800b7313ce0 R14: 0000000000000008 R15: 0000000000001000 Mar 31 00:49:42 sl-server kernel: FS: 00007efce58fb740(0000) GS:ffff881807d40000(0000) knlGS:0000000000000000 Mar 31 00:49:42 sl-server kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b Mar 31 00:49:42 sl-server kernel: CR2: 00007f00caf249e8 CR3: 0000001062121000 CR4: 00000000000406e0 Mar 31 00:49:42 sl-server kernel: Stack: Mar 31 00:49:42 sl-server kernel: 0000000000000020 000000000000f000 ffff8810f0295000 0000000087440000 Mar 31 00:49:42 sl-server kernel: 0000000000010008 ffffc9000e3d7000 ffffea005131f300 0000000000010000 Mar 31 00:49:42 
sl-server kernel: 0000000000000797 0000000000002869 0000000000000869 ffff8810d2ba7000 Mar 31 00:49:42 sl-server kernel: Call Trace: Mar 31 00:49:42 sl-server kernel: [<ffffffffa0345dd2>] lzo_decompress_biovec+0x202/0x300 [btrfs] Mar 31 00:49:42 sl-server kernel: [<ffffffffa0346bb6>] end_compressed_bio_read+0x1f6/0x2f0 [btrfs] Mar 31 00:49:42 sl-server kernel: [<ffffffff81307b20>] bio_endio+0x40/0x60 Mar 31 00:49:42 sl-server kernel: [<ffffffffa02f1e5c>] end_workqueue_fn+0x3c/0x40 [btrfs] Mar 31 00:49:42 sl-server kernel: [<ffffffffa032e5f0>] normal_work_helper+0xc0/0x2c0 [btrfs] Mar 31 00:49:42 sl-server kernel: [<ffffffffa032e8c2>] btrfs_endio_helper+0x12/0x20 [btrfs] Mar 31 00:49:42 sl-server kernel: [<ffffffff8109736f>] process_one_work+0x14f/0x400 Mar 31 00:49:42 sl-server kernel: [<ffffffff81097c55>] worker_thread+0x125/0x4b0 Mar 31 00:49:42 sl-server kernel: [<ffffffff81097b30>] ? rescuer_thread+0x370/0x370 Mar 31 00:49:42 sl-server kernel: [<ffffffff8109d7c8>] kthread+0xd8/0xf0 Mar 31 00:49:42 sl-server kernel: [<ffffffff8109d6f0>] ? kthread_park+0x60/0x60 Mar 31 00:49:42 sl-server kernel: [<ffffffff81703f4f>] ret_from_fork+0x3f/0x70 Mar 31 00:49:42 sl-server kernel: [<ffffffff8109d6f0>] ? kthread_park+0x60/0x60 Mar 31 00:49:42 sl-server kernel: Code: c7 48 8b 45 c0 49 03 7d 00 4a 8d 34 38 e8 06 18 00 e1 41 83 ac 24 28 12 00 00 01 41 8b 84 24 28 12 00 00 85 c0 0f 88 bf 00 00 00 <48> 89 d8 49 03 45 00 49 01 df 49 29 de 48 01 5d d0 48 3d 00 10 Mar 31 00:49:43 sl-server sh[1297]: abrt-dump-oops: Found oopses: 1 Mar 31 00:49:43 sl-server sh[1297]: abrt-dump-oops: Creating problem directories Mar 31 00:49:43 sl-server sh[1297]: abrt-dump-oops: Not going to make dump directories world readable because PrivateReports is on ^ permalink raw reply [flat|nested] 7+ messages in thread
* RE: Compression causes kernel crashes if there are I/O or checksum errors (was: RE: kernel BUG at fs/btrfs/volumes.c:5519 when hot-removing device in RAID-1)

From: James Johnston @ 2016-04-01 20:54 UTC
To: 'mitch', 'Duncan', linux-btrfs

> I grabbed this part from the log after the machine crashed again
> following trying to transfer a bunch of files that included ones with
> csum errors, let me know if this looks like the same issue you were
> having:

Hard to say. You hit a soft lockup; mine got a "kernel BUG at..." Your stack trace diverges from mine after bio_endio.

James