* File system corruption, btrfsck abort
@ 2017-04-25 17:50 Christophe de Dinechin
2017-04-27 14:58 ` Christophe de Dinechin
` (2 more replies)
0 siblings, 3 replies; 17+ messages in thread
From: Christophe de Dinechin @ 2017-04-25 17:50 UTC (permalink / raw)
To: linux-btrfs
Hi,
I've been trying to run btrfs as my primary work filesystem for about 3-4 months now on Fedora 25 systems. I have run into filesystem corruption a few times. At least one instance I attributed to a damaged disk, but the latest one is on a brand new 3 TB disk that reports no SMART errors. Worse yet, in at least three cases, the filesystem corruption caused btrfsck to crash.
The last filesystem corruption is documented here: https://bugzilla.redhat.com/show_bug.cgi?id=1444821. The dmesg log is in there.
The btrfsck crash is here: https://bugzilla.redhat.com/show_bug.cgi?id=1435567. I have two crash modes: either an abort or a SIGSEGV. I checked that both still happen on master as of today.
The cause of the abort is that we call set_extent_dirty from check_extent_refs with rec->max_size == 0. I've added instrumentation to see where we set this to 0 (see https://github.com/c3d/btrfs-progs/tree/rhbz1435567), and indeed, we do sometimes see max_size set to 0 in a few locations. My instrumentation shows this:
78655 [1.792241:0x451fe0] MAX_SIZE_ZERO: Add extent rec 0x139eb80 max_size 16384 tmpl 0x7fffffffd120
78657 [1.792242:0x451cb8] MAX_SIZE_ZERO: Set max size 0 for rec 0x139ec50 from tmpl 0x7fffffffcf80
78660 [1.792244:0x451fe0] MAX_SIZE_ZERO: Add extent rec 0x139ed50 max_size 16384 tmpl 0x7fffffffd120
I don’t really know what to make of it.
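As a minimal, self-contained illustration of the kind of guard being discussed, the sketch below falls back to a non-zero size instead of passing 0 on. The struct is a stand-in for btrfs-progs' extent_record with only the fields mentioned here, and the 16384 fallback simply mirrors the node size seen in the trace; none of this is the actual btrfs-progs code:

#include <stdio.h>
#include <stdint.h>

/* Stand-in for btrfs-progs' struct extent_record: only the fields
 * discussed in this thread are kept; the real structure has many more. */
struct extent_record {
    uint64_t start;     /* logical start of the extent */
    uint64_t nr;        /* recorded length */
    uint64_t max_size;  /* the field that ends up as 0 */
};

/* Never hand a zero-length range to set_extent_dirty(): fall back to
 * nr, or to a nodesize-like default, when max_size was never filled in. */
static uint64_t dirty_range_size(const struct extent_record *rec,
                                 uint64_t fallback)
{
    if (rec->max_size)
        return rec->max_size;
    if (rec->nr)
        return rec->nr;
    return fallback;
}

int main(void)
{
    struct extent_record rec = { .start = 52428800, .nr = 0, .max_size = 0 };
    uint64_t size = dirty_range_size(&rec, 16384);

    printf("would mark [%llu, %llu) dirty instead of aborting\n",
           (unsigned long long)rec.start,
           (unsigned long long)(rec.start + size));
    return 0;
}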
The cause of the SIGSEGV is that we try to free a list entry that has its next set to NULL.
#0 list_del (entry=0x555555db0420) at /usr/src/debug/btrfs-progs-v4.10.1/kernel-lib/list.h:125
#1 free_all_extent_backrefs (rec=0x555555db0350) at cmds-check.c:5386
#2 maybe_free_extent_rec (extent_cache=0x7fffffffd990, rec=0x555555db0350) at cmds-check.c:5417
#3 0x00005555555b308f in check_block (flags=<optimized out>, buf=0x55557b87cdf0, extent_cache=0x7fffffffd990, root=0x55555587d570) at cmds-check.c:5851
#4 run_next_block (root=root@entry=0x55555587d570, bits=bits@entry=0x5555558841
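As an illustration only, the sketch below shows a checked variant of list_del() that would turn this crash into a recoverable error. The list_head layout matches the kernel-style list.h shipped in btrfs-progs, but list_del_checked() itself is hypothetical:

#include <stdio.h>
#include <stddef.h>

/* Same layout as the kernel-style list_head in btrfs-progs' kernel-lib. */
struct list_head {
    struct list_head *next, *prev;
};

/* Hypothetical checked variant of list_del(): refuse to unlink an entry
 * whose links were never initialised (next/prev == NULL), which is the
 * state that makes free_all_extent_backrefs() crash in the backtrace above. */
static int list_del_checked(struct list_head *entry)
{
    if (!entry->next || !entry->prev) {
        fprintf(stderr, "list_del on uninitialised entry %p, skipping\n",
                (void *)entry);
        return -1;
    }
    entry->next->prev = entry->prev;
    entry->prev->next = entry->next;
    entry->next = entry->prev = NULL;
    return 0;
}

int main(void)
{
    struct list_head orphan = { NULL, NULL };

    /* Expect the skip path rather than a SIGSEGV. */
    return list_del_checked(&orphan) == -1 ? 0 : 1;
}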
I don't know if the two problems are related, but they seem to be pretty consistent on this specific disk, so I think we have a good opportunity to make btrfsck more robust against this specific form of corruption. But I don't want to haphazardly modify code I don't really understand, so I'd welcome any suggestion on what the right strategy should be when max_size == 0, or on how to avoid it in the first place.
I don't know if this is relevant at all, but all the machines that failed this way were used to run VMs with KVM/QEMU. Disk activity tends to be somewhat intense on occasion, since the VMs running there are part of a personal Jenkins ring that automatically builds various projects. Nominally, there are between three and five guests running (Windows XP, Windows 10, macOS, Fedora 25, Ubuntu 16.04).
Thanks
Christophe de Dinechin
^ permalink raw reply [flat|nested] 17+ messages in thread* Re: File system corruption, btrfsck abort 2017-04-25 17:50 File system corruption, btrfsck abort Christophe de Dinechin @ 2017-04-27 14:58 ` Christophe de Dinechin 2017-04-27 15:12 ` Christophe de Dinechin 2017-04-28 0:45 ` Qu Wenruo 2017-04-28 3:58 ` Chris Murphy 2 siblings, 1 reply; 17+ messages in thread From: Christophe de Dinechin @ 2017-04-27 14:58 UTC (permalink / raw) To: linux-btrfs > On 25 Apr 2017, at 19:50, Christophe de Dinechin <dinechin@redhat.com> wrote: > The cause of the abort is that we call set_extent_dirty from check_extent_refs with rec->max_size == 0. I’ve instrumented to try to see where we set this to 0 (see https://github.com/c3d/btrfs-progs/tree/rhbz1435567), and indeed, we do sometimes see max_size set to 0 in a few locations. My instrumentation shows this: > > 78655 [1.792241:0x451fe0] MAX_SIZE_ZERO: Add extent rec 0x139eb80 max_size 16384 tmpl 0x7fffffffd120 > 78657 [1.792242:0x451cb8] MAX_SIZE_ZERO: Set max size 0 for rec 0x139ec50 from tmpl 0x7fffffffcf80 > 78660 [1.792244:0x451fe0] MAX_SIZE_ZERO: Add extent rec 0x139ed50 max_size 16384 tmpl 0x7fffffffd120 > > I don’t really know what to make of it. I dig a bit deeper. We set rec->max_size = 0 in add_extent_rec_nolookup called from add_tree_backref, where we cleared the extent_record tmpl with a memset, so indeed, max_size is 0. However, we immediately after that do a lookup_cache_extent with a size of 1. So I wonder if at that stage, we should not set max_size to 1 for the newly created extent record. Opinions? Christophe ^ permalink raw reply [flat|nested] 17+ messages in thread
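A minimal, self-contained sketch of the suggestion above: floor max_size at 1 right after the memset so that the subsequent lookup_cache_extent() and set_extent_dirty() never see 0. The template struct and names are stand-ins for illustration; the real add_tree_backref()/add_extent_rec_nolookup() code differs:

#include <string.h>
#include <stdio.h>
#include <stdint.h>

/* Stand-in for the extent_record template being memset in
 * add_tree_backref(); only the fields under discussion are kept. */
struct extent_record_tmpl {
    uint64_t start;
    uint64_t nr;
    uint64_t max_size;
};

int main(void)
{
    struct extent_record_tmpl tmpl;
    uint64_t bytenr = 52428800;   /* example value taken from the trace */

    memset(&tmpl, 0, sizeof(tmpl));
    tmpl.start = bytenr;
    tmpl.nr = 1;
    tmpl.max_size = 1;   /* floor at 1 instead of leaving the 0 from memset */

    printf("template: start=%llu nr=%llu max_size=%llu\n",
           (unsigned long long)tmpl.start,
           (unsigned long long)tmpl.nr,
           (unsigned long long)tmpl.max_size);
    return 0;
}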
* Re: File system corruption, btrfsck abort 2017-04-27 14:58 ` Christophe de Dinechin @ 2017-04-27 15:12 ` Christophe de Dinechin 0 siblings, 0 replies; 17+ messages in thread From: Christophe de Dinechin @ 2017-04-27 15:12 UTC (permalink / raw) To: linux-btrfs > On 27 Apr 2017, at 16:58, Christophe de Dinechin <dinechin@redhat.com> wrote: > >> On 25 Apr 2017, at 19:50, Christophe de Dinechin <dinechin@redhat.com> wrote: > >> The cause of the abort is that we call set_extent_dirty from check_extent_refs with rec->max_size == 0. I’ve instrumented to try to see where we set this to 0 (see https://github.com/c3d/btrfs-progs/tree/rhbz1435567), and indeed, we do sometimes see max_size set to 0 in a few locations. My instrumentation shows this: >> >> 78655 [1.792241:0x451fe0] MAX_SIZE_ZERO: Add extent rec 0x139eb80 max_size 16384 tmpl 0x7fffffffd120 >> 78657 [1.792242:0x451cb8] MAX_SIZE_ZERO: Set max size 0 for rec 0x139ec50 from tmpl 0x7fffffffcf80 >> 78660 [1.792244:0x451fe0] MAX_SIZE_ZERO: Add extent rec 0x139ed50 max_size 16384 tmpl 0x7fffffffd120 >> >> I don’t really know what to make of it. > > I dig a bit deeper. We set rec->max_size = 0 in add_extent_rec_nolookup called from add_tree_backref, where we cleared the extent_record tmpl with a memset, so indeed, max_size is 0. However, we immediately after that do a lookup_cache_extent with a size of 1. So I wonder if at that stage, we should not set max_size to 1 for the newly created extent record. Well, for what it’s worth, it does not seem to help much: *** Error in `btrfs check': double free or corruption (!prev): 0x0000000007d9c430 *** ======= Backtrace: ========= /lib64/libc.so.6(+0x7925b)[0x7ffff6feb25b] /lib64/libc.so.6(+0x828ea)[0x7ffff6ff48ea] /lib64/libc.so.6(cfree+0x4c)[0x7ffff6ff831c] btrfs check[0x44d784] btrfs check[0x4531ac] btrfs check[0x45b24b] btrfs check[0x45b743] btrfs check[0x45c2b1] btrfs check(cmd_check+0xcad)[0x4602d4] btrfs check(main+0x8b)[0x40b7fb] /lib64/libc.so.6(__libc_start_main+0xf1)[0x7ffff6f92401] btrfs check(_start+0x2a)[0x40b4ba] ======= Memory map: ======== 00400000-004a4000 r-xp 00000000 08:35 25167142 /home/ddd/Work/btrfs-progs/btrfs 006a4000-006a8000 r--p 000a4000 08:35 25167142 /home/ddd/Work/btrfs-progs/btrfs 006a8000-006fb000 rw-p 000a8000 08:35 25167142 /home/ddd/Work/btrfs-progs/btrfs 006fb000-316a6000 rw-p 00000000 00:00 0 [heap] 7ffff0000000-7ffff0021000 rw-p 00000000 00:00 0 7ffff0021000-7ffff4000000 ---p 00000000 00:00 0 7ffff6d5b000-7ffff6d71000 r-xp 00000000 08:33 3156890 /usr/lib64/libgcc_s-6.3.1-20161221.so.1 7ffff6d71000-7ffff6f70000 ---p 00016000 08:33 3156890 /usr/lib64/libgcc_s-6.3.1-20161221.so.1 7ffff6f70000-7ffff6f71000 r--p 00015000 08:33 3156890 /usr/lib64/libgcc_s-6.3.1-20161221.so.1 7ffff6f71000-7ffff6f72000 rw-p 00016000 08:33 3156890 /usr/lib64/libgcc_s-6.3.1-20161221.so.1 7ffff6f72000-7ffff712f000 r-xp 00000000 08:33 3154711 /usr/lib64/libc-2.24.so 7ffff712f000-7ffff732e000 ---p 001bd000 08:33 3154711 /usr/lib64/libc-2.24.so 7ffff732e000-7ffff7332000 r--p 001bc000 08:33 3154711 /usr/lib64/libc-2.24.so 7ffff7332000-7ffff7334000 rw-p 001c0000 08:33 3154711 /usr/lib64/libc-2.24.so 7ffff7334000-7ffff7338000 rw-p 00000000 00:00 0 7ffff7338000-7ffff7350000 r-xp 00000000 08:33 3155302 /usr/lib64/libpthread-2.24.so 7ffff7350000-7ffff7550000 ---p 00018000 08:33 3155302 /usr/lib64/libpthread-2.24.so 7ffff7550000-7ffff7551000 r--p 00018000 08:33 3155302 /usr/lib64/libpthread-2.24.so 7ffff7551000-7ffff7552000 rw-p 00019000 08:33 3155302 /usr/lib64/libpthread-2.24.so 
7ffff7552000-7ffff7556000 rw-p 00000000 00:00 0 7ffff7556000-7ffff7578000 r-xp 00000000 08:33 3155132 /usr/lib64/liblzo2.so.2.0.0 7ffff7578000-7ffff7777000 ---p 00022000 08:33 3155132 /usr/lib64/liblzo2.so.2.0.0 7ffff7777000-7ffff7778000 r--p 00021000 08:33 3155132 /usr/lib64/liblzo2.so.2.0.0 7ffff7778000-7ffff7779000 rw-p 00000000 00:00 0 7ffff7779000-7ffff778e000 r-xp 00000000 08:33 3155608 /usr/lib64/libz.so.1.2.8 7ffff778e000-7ffff798d000 ---p 00015000 08:33 3155608 /usr/lib64/libz.so.1.2.8 7ffff798d000-7ffff798e000 r--p 00014000 08:33 3155608 /usr/lib64/libz.so.1.2.8 7ffff798e000-7ffff798f000 rw-p 00015000 08:33 3155608 /usr/lib64/libz.so.1.2.8 7ffff798f000-7ffff79cc000 r-xp 00000000 08:33 3153511 /usr/lib64/libblkid.so.1.1.0 7ffff79cc000-7ffff7bcc000 ---p 0003d000 08:33 3153511 /usr/lib64/libblkid.so.1.1.0 7ffff7bcc000-7ffff7bd0000 r--p 0003d000 08:33 3153511 /usr/lib64/libblkid.so.1.1.0 7ffff7bd0000-7ffff7bd1000 rw-p 00041000 08:33 3153511 /usr/lib64/libblkid.so.1.1.0 7ffff7bd1000-7ffff7bd2000 rw-p 00000000 00:00 0 7ffff7bd2000-7ffff7bd6000 r-xp 00000000 08:33 3154270 /usr/lib64/libuuid.so.1.3.0 7ffff7bd6000-7ffff7dd5000 ---p 00004000 08:33 3154270 /usr/lib64/libuuid.so.1.3.0 7ffff7dd5000-7ffff7dd6000 r--p 00003000 08:33 3154270 /usr/lib64/libuuid.so.1.3.0 7ffff7dd6000-7ffff7dd7000 rw-p 00000000 00:00 0 7ffff7dd7000-7ffff7dfc000 r-xp 00000000 08:33 3154536 /usr/lib64/ld-2.24.so 7ffff7fdb000-7ffff7fe0000 rw-p 00000000 00:00 0 7ffff7ff5000-7ffff7ff8000 rw-p 00000000 00:00 0 7ffff7ff8000-7ffff7ffa000 r--p 00000000 00:00 0 [vvar] 7ffff7ffa000-7ffff7ffc000 r-xp 00000000 00:00 0 [vdso] 7ffff7ffc000-7ffff7ffd000 r--p 00025000 08:33 3154536 /usr/lib64/ld-2.24.so 7ffff7ffd000-7ffff7ffe000 rw-p 00026000 08:33 3154536 /usr/lib64/ld-2.24.so 7ffff7ffe000-7ffff7fff000 rw-p 00000000 00:00 0 7ffffffde000-7ffffffff000 rw-p 00000000 00:00 0 [stack] ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall] This seems to match the other scenario I was referring to, with an inconsistent list. > > Opinions? > > Christophe > > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: File system corruption, btrfsck abort 2017-04-25 17:50 File system corruption, btrfsck abort Christophe de Dinechin 2017-04-27 14:58 ` Christophe de Dinechin @ 2017-04-28 0:45 ` Qu Wenruo 2017-04-28 8:47 ` Christophe de Dinechin 2017-04-28 3:58 ` Chris Murphy 2 siblings, 1 reply; 17+ messages in thread From: Qu Wenruo @ 2017-04-28 0:45 UTC (permalink / raw) To: Christophe de Dinechin, linux-btrfs At 04/26/2017 01:50 AM, Christophe de Dinechin wrote: > Hi, > > > I”ve been trying to run btrfs as my primary work filesystem for about 3-4 months now on Fedora 25 systems. I ran a few times into filesystem corruptions. At least one I attributed to a damaged disk, but the last one is with a brand new 3T disk that reports no SMART errors. Worse yet, in at least three cases, the filesystem corruption caused btrfsck to crash. > > The last filesystem corruption is documented here: https://bugzilla.redhat.com/show_bug.cgi?id=1444821. The dmesg log is in there. According to the bugzilla, the btrfs-progs seems to be too old in btrfs standard. What about using the latest btrfs-progs v4.10.2? Furthermore for v4.10.2, btrfs check provides a new mode called lowmem. You could try "btrfs check --mode=lowmem" to see if such problem can be avoided. For the kernel bug, it seems to be related to wrongly inserted delayed ref, but I can totally be wrong. Thanks, Qu > > The btrfsck crash is here: https://bugzilla.redhat.com/show_bug.cgi?id=1435567. I have two crash modes: either an abort or a SIGSEGV. I checked that both still happens on master as of today. > > The cause of the abort is that we call set_extent_dirty from check_extent_refs with rec->max_size == 0. I’ve instrumented to try to see where we set this to 0 (see https://github.com/c3d/btrfs-progs/tree/rhbz1435567), and indeed, we do sometimes see max_size set to 0 in a few locations. My instrumentation shows this: > > 78655 [1.792241:0x451fe0] MAX_SIZE_ZERO: Add extent rec 0x139eb80 max_size 16384 tmpl 0x7fffffffd120 > 78657 [1.792242:0x451cb8] MAX_SIZE_ZERO: Set max size 0 for rec 0x139ec50 from tmpl 0x7fffffffcf80 > 78660 [1.792244:0x451fe0] MAX_SIZE_ZERO: Add extent rec 0x139ed50 max_size 16384 tmpl 0x7fffffffd120 > > I don’t really know what to make of it. > > The cause of the SIGSEGV is that we try to free a list entry that has its next set to NULL. > > #0 list_del (entry=0x555555db0420) at /usr/src/debug/btrfs-progs-v4.10.1/kernel-lib/list.h:125 > #1 free_all_extent_backrefs (rec=0x555555db0350) at cmds-check.c:5386 > #2 maybe_free_extent_rec (extent_cache=0x7fffffffd990, rec=0x555555db0350) at cmds-check.c:5417 > #3 0x00005555555b308f in check_block (flags=<optimized out>, buf=0x55557b87cdf0, extent_cache=0x7fffffffd990, root=0x55555587d570) at cmds-check.c:5851 > #4 run_next_block (root=root@entry=0x55555587d570, bits=bits@entry=0x5555558841 > > I don’t know if the two problems are related, but they seem to be pretty consistent on this specific disk, so I think that we have a good opportunity to improve btrfsck to make it more robust to this specific form of corruption. But I don’t want to hapazardly modify a code I don’t really understand. So if anybody could make a suggestion on what the right strategy should be when we have max_size == 0, or how to avoid it in the first place. > > I don’t know if this is relevant at all, but all the machines that failed that way were used to run VMs with KVM/QEMU. 
DIsk activity tends to be somewhat intense on occasions, since the VMs running there are part of a personal Jenkins ring that automatically builds various projects. Nominally, there are between three and five guests running (Windows XP, WIndows 10, macOS, Fedora25, Ubuntu 16.04). > > > Thanks > Christophe de Dinechin > > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: File system corruption, btrfsck abort 2017-04-28 0:45 ` Qu Wenruo @ 2017-04-28 8:47 ` Christophe de Dinechin 2017-05-02 0:17 ` Qu Wenruo 0 siblings, 1 reply; 17+ messages in thread From: Christophe de Dinechin @ 2017-04-28 8:47 UTC (permalink / raw) To: Qu Wenruo; +Cc: linux-btrfs > On 28 Apr 2017, at 02:45, Qu Wenruo <quwenruo@cn.fujitsu.com> wrote: > > > > At 04/26/2017 01:50 AM, Christophe de Dinechin wrote: >> Hi, >> I”ve been trying to run btrfs as my primary work filesystem for about 3-4 months now on Fedora 25 systems. I ran a few times into filesystem corruptions. At least one I attributed to a damaged disk, but the last one is with a brand new 3T disk that reports no SMART errors. Worse yet, in at least three cases, the filesystem corruption caused btrfsck to crash. >> The last filesystem corruption is documented here: https://bugzilla.redhat.com/show_bug.cgi?id=1444821. The dmesg log is in there. > > According to the bugzilla, the btrfs-progs seems to be too old in btrfs standard. > What about using the latest btrfs-progs v4.10.2? I tried 4.10.1-1 https://bugzilla.redhat.com/show_bug.cgi?id=1435567#c4. I am currently debugging with a build from the master branch as of Tuesday (commit bd0ab27afbf14370f9f0da1f5f5ecbb0adc654c1), which is 4.10.2 There was no change in behavior. Runs are split about evenly between list crash and abort. I added instrumentation and tried a fix, which brings me a tiny bit further, until I hit a message from delete_duplicate_records: Ok we have overlapping extents that aren't completely covered by each other, this is going to require more careful thought. The extents are [52428800-16384] and [52432896-16384] > Furthermore for v4.10.2, btrfs check provides a new mode called lowmem. > You could try "btrfs check --mode=lowmem" to see if such problem can be avoided. I will try that, but what makes you think this is a memory-related condition? The machine has 16G of RAM, isn’t that enough for an fsck? > > For the kernel bug, it seems to be related to wrongly inserted delayed ref, but I can totally be wrong. For now, I’m focusing on the “repair” part as much as I can, because I assume the kernel bug is there anyway, so someone else is bound to hit this problem. Thanks Christophe > > Thanks, > Qu >> The btrfsck crash is here: https://bugzilla.redhat.com/show_bug.cgi?id=1435567. I have two crash modes: either an abort or a SIGSEGV. I checked that both still happens on master as of today. >> The cause of the abort is that we call set_extent_dirty from check_extent_refs with rec->max_size == 0. I’ve instrumented to try to see where we set this to 0 (see https://github.com/c3d/btrfs-progs/tree/rhbz1435567), and indeed, we do sometimes see max_size set to 0 in a few locations. My instrumentation shows this: >> 78655 [1.792241:0x451fe0] MAX_SIZE_ZERO: Add extent rec 0x139eb80 max_size 16384 tmpl 0x7fffffffd120 >> 78657 [1.792242:0x451cb8] MAX_SIZE_ZERO: Set max size 0 for rec 0x139ec50 from tmpl 0x7fffffffcf80 >> 78660 [1.792244:0x451fe0] MAX_SIZE_ZERO: Add extent rec 0x139ed50 max_size 16384 tmpl 0x7fffffffd120 >> I don’t really know what to make of it. >> The cause of the SIGSEGV is that we try to free a list entry that has its next set to NULL. 
>> #0 list_del (entry=0x555555db0420) at /usr/src/debug/btrfs-progs-v4.10.1/kernel-lib/list.h:125 >> #1 free_all_extent_backrefs (rec=0x555555db0350) at cmds-check.c:5386 >> #2 maybe_free_extent_rec (extent_cache=0x7fffffffd990, rec=0x555555db0350) at cmds-check.c:5417 >> #3 0x00005555555b308f in check_block (flags=<optimized out>, buf=0x55557b87cdf0, extent_cache=0x7fffffffd990, root=0x55555587d570) at cmds-check.c:5851 >> #4 run_next_block (root=root@entry=0x55555587d570, bits=bits@entry=0x5555558841 >> I don’t know if the two problems are related, but they seem to be pretty consistent on this specific disk, so I think that we have a good opportunity to improve btrfsck to make it more robust to this specific form of corruption. But I don’t want to hapazardly modify a code I don’t really understand. So if anybody could make a suggestion on what the right strategy should be when we have max_size == 0, or how to avoid it in the first place. >> I don’t know if this is relevant at all, but all the machines that failed that way were used to run VMs with KVM/QEMU. DIsk activity tends to be somewhat intense on occasions, since the VMs running there are part of a personal Jenkins ring that automatically builds various projects. Nominally, there are between three and five guests running (Windows XP, WIndows 10, macOS, Fedora25, Ubuntu 16.04). >> Thanks >> Christophe de Dinechin >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html > > > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: File system corruption, btrfsck abort 2017-04-28 8:47 ` Christophe de Dinechin @ 2017-05-02 0:17 ` Qu Wenruo 2017-05-03 14:21 ` Christophe de Dinechin 0 siblings, 1 reply; 17+ messages in thread From: Qu Wenruo @ 2017-05-02 0:17 UTC (permalink / raw) To: Christophe de Dinechin; +Cc: linux-btrfs At 04/28/2017 04:47 PM, Christophe de Dinechin wrote: > >> On 28 Apr 2017, at 02:45, Qu Wenruo <quwenruo@cn.fujitsu.com> wrote: >> >> >> >> At 04/26/2017 01:50 AM, Christophe de Dinechin wrote: >>> Hi, >>> I”ve been trying to run btrfs as my primary work filesystem for about 3-4 months now on Fedora 25 systems. I ran a few times into filesystem corruptions. At least one I attributed to a damaged disk, but the last one is with a brand new 3T disk that reports no SMART errors. Worse yet, in at least three cases, the filesystem corruption caused btrfsck to crash. >>> The last filesystem corruption is documented here: https://bugzilla.redhat.com/show_bug.cgi?id=1444821. The dmesg log is in there. >> >> According to the bugzilla, the btrfs-progs seems to be too old in btrfs standard. > >> What about using the latest btrfs-progs v4.10.2? > > I tried 4.10.1-1 https://bugzilla.redhat.com/show_bug.cgi?id=1435567#c4. > > I am currently debugging with a build from the master branch as of Tuesday (commit bd0ab27afbf14370f9f0da1f5f5ecbb0adc654c1), which is 4.10.2 > > There was no change in behavior. Runs are split about evenly between list crash and abort. > > I added instrumentation and tried a fix, which brings me a tiny bit further, until I hit a message from delete_duplicate_records: > > Ok we have overlapping extents that aren't completely covered by each > other, this is going to require more careful thought. The extents are > [52428800-16384] and [52432896-16384] Then I think lowmem mode may have better chance to handle it without crash. > >> Furthermore for v4.10.2, btrfs check provides a new mode called lowmem. >> You could try "btrfs check --mode=lowmem" to see if such problem can be avoided. > > I will try that, but what makes you think this is a memory-related condition? The machine has 16G of RAM, isn’t that enough for an fsck? Not for memory usage, but in fact lowmem mode is a completely rework, so I just want to see how good or bad the new lowmem mode handles it. Thanks, Qu > >> >> For the kernel bug, it seems to be related to wrongly inserted delayed ref, but I can totally be wrong. > > For now, I’m focusing on the “repair” part as much as I can, because I assume the kernel bug is there anyway, so someone else is bound to hit this problem. > > > Thanks > Christophe > >> >> Thanks, >> Qu >>> The btrfsck crash is here: https://bugzilla.redhat.com/show_bug.cgi?id=1435567. I have two crash modes: either an abort or a SIGSEGV. I checked that both still happens on master as of today. >>> The cause of the abort is that we call set_extent_dirty from check_extent_refs with rec->max_size == 0. I’ve instrumented to try to see where we set this to 0 (see https://github.com/c3d/btrfs-progs/tree/rhbz1435567), and indeed, we do sometimes see max_size set to 0 in a few locations. My instrumentation shows this: >>> 78655 [1.792241:0x451fe0] MAX_SIZE_ZERO: Add extent rec 0x139eb80 max_size 16384 tmpl 0x7fffffffd120 >>> 78657 [1.792242:0x451cb8] MAX_SIZE_ZERO: Set max size 0 for rec 0x139ec50 from tmpl 0x7fffffffcf80 >>> 78660 [1.792244:0x451fe0] MAX_SIZE_ZERO: Add extent rec 0x139ed50 max_size 16384 tmpl 0x7fffffffd120 >>> I don’t really know what to make of it. 
>>> The cause of the SIGSEGV is that we try to free a list entry that has its next set to NULL. >>> #0 list_del (entry=0x555555db0420) at /usr/src/debug/btrfs-progs-v4.10.1/kernel-lib/list.h:125 >>> #1 free_all_extent_backrefs (rec=0x555555db0350) at cmds-check.c:5386 >>> #2 maybe_free_extent_rec (extent_cache=0x7fffffffd990, rec=0x555555db0350) at cmds-check.c:5417 >>> #3 0x00005555555b308f in check_block (flags=<optimized out>, buf=0x55557b87cdf0, extent_cache=0x7fffffffd990, root=0x55555587d570) at cmds-check.c:5851 >>> #4 run_next_block (root=root@entry=0x55555587d570, bits=bits@entry=0x5555558841 >>> I don’t know if the two problems are related, but they seem to be pretty consistent on this specific disk, so I think that we have a good opportunity to improve btrfsck to make it more robust to this specific form of corruption. But I don’t want to hapazardly modify a code I don’t really understand. So if anybody could make a suggestion on what the right strategy should be when we have max_size == 0, or how to avoid it in the first place. >>> I don’t know if this is relevant at all, but all the machines that failed that way were used to run VMs with KVM/QEMU. DIsk activity tends to be somewhat intense on occasions, since the VMs running there are part of a personal Jenkins ring that automatically builds various projects. Nominally, there are between three and five guests running (Windows XP, WIndows 10, macOS, Fedora25, Ubuntu 16.04). >>> Thanks >>> Christophe de Dinechin >>> -- >>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in >>> the body of a message to majordomo@vger.kernel.org >>> More majordomo info at http://vger.kernel.org/majordomo-info.html >> >> >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html > > > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: File system corruption, btrfsck abort 2017-05-02 0:17 ` Qu Wenruo @ 2017-05-03 14:21 ` Christophe de Dinechin 2017-05-04 12:33 ` Christophe de Dinechin 2017-05-05 0:18 ` Qu Wenruo 0 siblings, 2 replies; 17+ messages in thread From: Christophe de Dinechin @ 2017-05-03 14:21 UTC (permalink / raw) To: Qu Wenruo; +Cc: linux-btrfs > On 2 May 2017, at 02:17, Qu Wenruo <quwenruo@cn.fujitsu.com> wrote: > > > > At 04/28/2017 04:47 PM, Christophe de Dinechin wrote: >>> On 28 Apr 2017, at 02:45, Qu Wenruo <quwenruo@cn.fujitsu.com> wrote: >>> >>> >>> >>> At 04/26/2017 01:50 AM, Christophe de Dinechin wrote: >>>> Hi, >>>> I”ve been trying to run btrfs as my primary work filesystem for about 3-4 months now on Fedora 25 systems. I ran a few times into filesystem corruptions. At least one I attributed to a damaged disk, but the last one is with a brand new 3T disk that reports no SMART errors. Worse yet, in at least three cases, the filesystem corruption caused btrfsck to crash. >>>> The last filesystem corruption is documented here: https://bugzilla.redhat.com/show_bug.cgi?id=1444821. The dmesg log is in there. >>> >>> According to the bugzilla, the btrfs-progs seems to be too old in btrfs standard. >>> What about using the latest btrfs-progs v4.10.2? >> I tried 4.10.1-1 https://bugzilla.redhat.com/show_bug.cgi?id=1435567#c4. >> I am currently debugging with a build from the master branch as of Tuesday (commit bd0ab27afbf14370f9f0da1f5f5ecbb0adc654c1), which is 4.10.2 >> There was no change in behavior. Runs are split about evenly between list crash and abort. >> I added instrumentation and tried a fix, which brings me a tiny bit further, until I hit a message from delete_duplicate_records: >> Ok we have overlapping extents that aren't completely covered by each >> other, this is going to require more careful thought. The extents are >> [52428800-16384] and [52432896-16384] > > Then I think lowmem mode may have better chance to handle it without crash. I tried it and got: [root@rescue ~]# /usr/local/bin/btrfsck --mode=lowmem --repair /dev/sda4 enabling repair mode ERROR: low memory mode doesn't support repair yet The problem only occurred in —repair mode anyway. > >>> Furthermore for v4.10.2, btrfs check provides a new mode called lowmem. >>> You could try "btrfs check --mode=lowmem" to see if such problem can be avoided. >> I will try that, but what makes you think this is a memory-related condition? The machine has 16G of RAM, isn’t that enough for an fsck? > > Not for memory usage, but in fact lowmem mode is a completely rework, so I just want to see how good or bad the new lowmem mode handles it. Is there a prototype with lowmem and repair? Thanks Christophe > > Thanks, > Qu > >>> >>> For the kernel bug, it seems to be related to wrongly inserted delayed ref, but I can totally be wrong. >> For now, I’m focusing on the “repair” part as much as I can, because I assume the kernel bug is there anyway, so someone else is bound to hit this problem. >> Thanks >> Christophe >>> >>> Thanks, >>> Qu >>>> The btrfsck crash is here: https://bugzilla.redhat.com/show_bug.cgi?id=1435567. I have two crash modes: either an abort or a SIGSEGV. I checked that both still happens on master as of today. >>>> The cause of the abort is that we call set_extent_dirty from check_extent_refs with rec->max_size == 0. I’ve instrumented to try to see where we set this to 0 (see https://github.com/c3d/btrfs-progs/tree/rhbz1435567), and indeed, we do sometimes see max_size set to 0 in a few locations. 
My instrumentation shows this: >>>> 78655 [1.792241:0x451fe0] MAX_SIZE_ZERO: Add extent rec 0x139eb80 max_size 16384 tmpl 0x7fffffffd120 >>>> 78657 [1.792242:0x451cb8] MAX_SIZE_ZERO: Set max size 0 for rec 0x139ec50 from tmpl 0x7fffffffcf80 >>>> 78660 [1.792244:0x451fe0] MAX_SIZE_ZERO: Add extent rec 0x139ed50 max_size 16384 tmpl 0x7fffffffd120 >>>> I don’t really know what to make of it. >>>> The cause of the SIGSEGV is that we try to free a list entry that has its next set to NULL. >>>> #0 list_del (entry=0x555555db0420) at /usr/src/debug/btrfs-progs-v4.10.1/kernel-lib/list.h:125 >>>> #1 free_all_extent_backrefs (rec=0x555555db0350) at cmds-check.c:5386 >>>> #2 maybe_free_extent_rec (extent_cache=0x7fffffffd990, rec=0x555555db0350) at cmds-check.c:5417 >>>> #3 0x00005555555b308f in check_block (flags=<optimized out>, buf=0x55557b87cdf0, extent_cache=0x7fffffffd990, root=0x55555587d570) at cmds-check.c:5851 >>>> #4 run_next_block (root=root@entry=0x55555587d570, bits=bits@entry=0x5555558841 >>>> I don’t know if the two problems are related, but they seem to be pretty consistent on this specific disk, so I think that we have a good opportunity to improve btrfsck to make it more robust to this specific form of corruption. But I don’t want to hapazardly modify a code I don’t really understand. So if anybody could make a suggestion on what the right strategy should be when we have max_size == 0, or how to avoid it in the first place. >>>> I don’t know if this is relevant at all, but all the machines that failed that way were used to run VMs with KVM/QEMU. DIsk activity tends to be somewhat intense on occasions, since the VMs running there are part of a personal Jenkins ring that automatically builds various projects. Nominally, there are between three and five guests running (Windows XP, WIndows 10, macOS, Fedora25, Ubuntu 16.04). >>>> Thanks >>>> Christophe de Dinechin >>>> -- >>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in >>>> the body of a message to majordomo@vger.kernel.org >>>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>> >>> >>> -- >>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in >>> the body of a message to majordomo@vger.kernel.org >>> More majordomo info at http://vger.kernel.org/majordomo-info.html > > > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: File system corruption, btrfsck abort 2017-05-03 14:21 ` Christophe de Dinechin @ 2017-05-04 12:33 ` Christophe de Dinechin 2017-05-05 0:18 ` Qu Wenruo 1 sibling, 0 replies; 17+ messages in thread From: Christophe de Dinechin @ 2017-05-04 12:33 UTC (permalink / raw) To: Qu Wenruo; +Cc: linux-btrfs > On 3 May 2017, at 16:21, Christophe de Dinechin <dinechin@redhat.com> wrote: > >> >> On 2 May 2017, at 02:17, Qu Wenruo <quwenruo@cn.fujitsu.com> wrote: >> >> >> >> At 04/28/2017 04:47 PM, Christophe de Dinechin wrote: >>>> On 28 Apr 2017, at 02:45, Qu Wenruo <quwenruo@cn.fujitsu.com> wrote: >>>> >>>> >>>> >>>> At 04/26/2017 01:50 AM, Christophe de Dinechin wrote: >>>>> Hi, >>>>> I”ve been trying to run btrfs as my primary work filesystem for about 3-4 months now on Fedora 25 systems. I ran a few times into filesystem corruptions. At least one I attributed to a damaged disk, but the last one is with a brand new 3T disk that reports no SMART errors. Worse yet, in at least three cases, the filesystem corruption caused btrfsck to crash. >>>>> The last filesystem corruption is documented here: https://bugzilla.redhat.com/show_bug.cgi?id=1444821. The dmesg log is in there. >>>> >>>> According to the bugzilla, the btrfs-progs seems to be too old in btrfs standard. >>>> What about using the latest btrfs-progs v4.10.2? >>> I tried 4.10.1-1 https://bugzilla.redhat.com/show_bug.cgi?id=1435567#c4. >>> I am currently debugging with a build from the master branch as of Tuesday (commit bd0ab27afbf14370f9f0da1f5f5ecbb0adc654c1), which is 4.10.2 >>> There was no change in behavior. Runs are split about evenly between list crash and abort. >>> I added instrumentation and tried a fix, which brings me a tiny bit further, until I hit a message from delete_duplicate_records: >>> Ok we have overlapping extents that aren't completely covered by each >>> other, this is going to require more careful thought. The extents are >>> [52428800-16384] and [52432896-16384] >> >> Then I think lowmem mode may have better chance to handle it without crash. > > I tried it and got: > > [root@rescue ~]# /usr/local/bin/btrfsck --mode=lowmem --repair /dev/sda4 > enabling repair mode > ERROR: low memory mode doesn't support repair yet > > The problem only occurred in —repair mode anyway. For what it’s worth, without the --repair option, it gets stuck. I stopped it after 24 hours, it had printed: [root@rescue ~]# /usr/local/bin/btrfsck --mode=lowmem /dev/sda4 Checking filesystem on /dev/sda4 UUID: 26a0c84c-d2ac-4da8-b880-684f2ea48a22 checking extents checksum verify failed on 52428800 found E3ADA767 wanted 7C506C03 checksum verify failed on 52428800 found E3ADA767 wanted 7C506C03 checksum verify failed on 52428800 found E3ADA767 wanted 7C506C03 checksum verify failed on 52428800 found E3ADA767 wanted 7C506C03 Csum didn't match ERROR: extent [52428800 16384] lost referencer (owner: 7, level: 0) checksum verify failed on 52445184 found 8D1BE62F wanted 00000000 checksum verify failed on 52445184 found 8D1BE62F wanted 00000000 checksum verify failed on 52445184 found 8D1BE62F wanted 00000000 checksum verify failed on 52445184 found 8D1BE62F wanted 00000000 bytenr mismatch, want=52445184, have=2199023255552 ERROR: extent [52445184 16384] lost referencer (owner: 2, level: 0) ERROR: extent[52432896 16384] backref lost (owner: 2, level: 0) ERROR: check leaf failed root 2 bytenr 52432896 level 0, force continue check Any tips for further debugging this? 
Christophe > > >> >>>> Furthermore for v4.10.2, btrfs check provides a new mode called lowmem. >>>> You could try "btrfs check --mode=lowmem" to see if such problem can be avoided. >>> I will try that, but what makes you think this is a memory-related condition? The machine has 16G of RAM, isn’t that enough for an fsck? >> >> Not for memory usage, but in fact lowmem mode is a completely rework, so I just want to see how good or bad the new lowmem mode handles it. > > Is there a prototype with lowmem and repair? > > > Thanks > Christophe > >> >> Thanks, >> Qu >> >>>> >>>> For the kernel bug, it seems to be related to wrongly inserted delayed ref, but I can totally be wrong. >>> For now, I’m focusing on the “repair” part as much as I can, because I assume the kernel bug is there anyway, so someone else is bound to hit this problem. >>> Thanks >>> Christophe >>>> >>>> Thanks, >>>> Qu >>>>> The btrfsck crash is here: https://bugzilla.redhat.com/show_bug.cgi?id=1435567. I have two crash modes: either an abort or a SIGSEGV. I checked that both still happens on master as of today. >>>>> The cause of the abort is that we call set_extent_dirty from check_extent_refs with rec->max_size == 0. I’ve instrumented to try to see where we set this to 0 (see https://github.com/c3d/btrfs-progs/tree/rhbz1435567), and indeed, we do sometimes see max_size set to 0 in a few locations. My instrumentation shows this: >>>>> 78655 [1.792241:0x451fe0] MAX_SIZE_ZERO: Add extent rec 0x139eb80 max_size 16384 tmpl 0x7fffffffd120 >>>>> 78657 [1.792242:0x451cb8] MAX_SIZE_ZERO: Set max size 0 for rec 0x139ec50 from tmpl 0x7fffffffcf80 >>>>> 78660 [1.792244:0x451fe0] MAX_SIZE_ZERO: Add extent rec 0x139ed50 max_size 16384 tmpl 0x7fffffffd120 >>>>> I don’t really know what to make of it. >>>>> The cause of the SIGSEGV is that we try to free a list entry that has its next set to NULL. >>>>> #0 list_del (entry=0x555555db0420) at /usr/src/debug/btrfs-progs-v4.10.1/kernel-lib/list.h:125 >>>>> #1 free_all_extent_backrefs (rec=0x555555db0350) at cmds-check.c:5386 >>>>> #2 maybe_free_extent_rec (extent_cache=0x7fffffffd990, rec=0x555555db0350) at cmds-check.c:5417 >>>>> #3 0x00005555555b308f in check_block (flags=<optimized out>, buf=0x55557b87cdf0, extent_cache=0x7fffffffd990, root=0x55555587d570) at cmds-check.c:5851 >>>>> #4 run_next_block (root=root@entry=0x55555587d570, bits=bits@entry=0x5555558841 >>>>> I don’t know if the two problems are related, but they seem to be pretty consistent on this specific disk, so I think that we have a good opportunity to improve btrfsck to make it more robust to this specific form of corruption. But I don’t want to hapazardly modify a code I don’t really understand. So if anybody could make a suggestion on what the right strategy should be when we have max_size == 0, or how to avoid it in the first place. >>>>> I don’t know if this is relevant at all, but all the machines that failed that way were used to run VMs with KVM/QEMU. DIsk activity tends to be somewhat intense on occasions, since the VMs running there are part of a personal Jenkins ring that automatically builds various projects. Nominally, there are between three and five guests running (Windows XP, WIndows 10, macOS, Fedora25, Ubuntu 16.04). 
>>>>> Thanks >>>>> Christophe de Dinechin >>>>> -- >>>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in >>>>> the body of a message to majordomo@vger.kernel.org >>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>>> >>>> >>>> -- >>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in >>>> the body of a message to majordomo@vger.kernel.org >>>> More majordomo info at http://vger.kernel.org/majordomo-info.html >> >> >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html > > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: File system corruption, btrfsck abort 2017-05-03 14:21 ` Christophe de Dinechin 2017-05-04 12:33 ` Christophe de Dinechin @ 2017-05-05 0:18 ` Qu Wenruo 1 sibling, 0 replies; 17+ messages in thread From: Qu Wenruo @ 2017-05-05 0:18 UTC (permalink / raw) To: Christophe de Dinechin; +Cc: linux-btrfs At 05/03/2017 10:21 PM, Christophe de Dinechin wrote: > >> On 2 May 2017, at 02:17, Qu Wenruo <quwenruo@cn.fujitsu.com> wrote: >> >> >> >> At 04/28/2017 04:47 PM, Christophe de Dinechin wrote: >>>> On 28 Apr 2017, at 02:45, Qu Wenruo <quwenruo@cn.fujitsu.com> wrote: >>>> >>>> >>>> >>>> At 04/26/2017 01:50 AM, Christophe de Dinechin wrote: >>>>> Hi, >>>>> I”ve been trying to run btrfs as my primary work filesystem for about 3-4 months now on Fedora 25 systems. I ran a few times into filesystem corruptions. At least one I attributed to a damaged disk, but the last one is with a brand new 3T disk that reports no SMART errors. Worse yet, in at least three cases, the filesystem corruption caused btrfsck to crash. >>>>> The last filesystem corruption is documented here: https://bugzilla.redhat.com/show_bug.cgi?id=1444821. The dmesg log is in there. >>>> >>>> According to the bugzilla, the btrfs-progs seems to be too old in btrfs standard. >>>> What about using the latest btrfs-progs v4.10.2? >>> I tried 4.10.1-1 https://bugzilla.redhat.com/show_bug.cgi?id=1435567#c4. >>> I am currently debugging with a build from the master branch as of Tuesday (commit bd0ab27afbf14370f9f0da1f5f5ecbb0adc654c1), which is 4.10.2 >>> There was no change in behavior. Runs are split about evenly between list crash and abort. >>> I added instrumentation and tried a fix, which brings me a tiny bit further, until I hit a message from delete_duplicate_records: >>> Ok we have overlapping extents that aren't completely covered by each >>> other, this is going to require more careful thought. The extents are >>> [52428800-16384] and [52432896-16384] >> >> Then I think lowmem mode may have better chance to handle it without crash. > > I tried it and got: > > [root@rescue ~]# /usr/local/bin/btrfsck --mode=lowmem --repair /dev/sda4 > enabling repair mode > ERROR: low memory mode doesn't support repair yet > > The problem only occurred in —repair mode anyway. > > >> >>>> Furthermore for v4.10.2, btrfs check provides a new mode called lowmem. >>>> You could try "btrfs check --mode=lowmem" to see if such problem can be avoided. >>> I will try that, but what makes you think this is a memory-related condition? The machine has 16G of RAM, isn’t that enough for an fsck? >> >> Not for memory usage, but in fact lowmem mode is a completely rework, so I just want to see how good or bad the new lowmem mode handles it. > > Is there a prototype with lowmem and repair? Yes, Su Yue submitted a patchset for it, but still repair is only supported for fs tree contents. https://www.spinics.net/lists/linux-btrfs/msg63316.html Repairing other trees, especially extent tree, is not supported yet. Thanks, Qu > > > Thanks > Christophe > >> >> Thanks, >> Qu >> >>>> >>>> For the kernel bug, it seems to be related to wrongly inserted delayed ref, but I can totally be wrong. >>> For now, I’m focusing on the “repair” part as much as I can, because I assume the kernel bug is there anyway, so someone else is bound to hit this problem. >>> Thanks >>> Christophe >>>> >>>> Thanks, >>>> Qu >>>>> The btrfsck crash is here: https://bugzilla.redhat.com/show_bug.cgi?id=1435567. I have two crash modes: either an abort or a SIGSEGV. 
I checked that both still happens on master as of today. >>>>> The cause of the abort is that we call set_extent_dirty from check_extent_refs with rec->max_size == 0. I’ve instrumented to try to see where we set this to 0 (see https://github.com/c3d/btrfs-progs/tree/rhbz1435567), and indeed, we do sometimes see max_size set to 0 in a few locations. My instrumentation shows this: >>>>> 78655 [1.792241:0x451fe0] MAX_SIZE_ZERO: Add extent rec 0x139eb80 max_size 16384 tmpl 0x7fffffffd120 >>>>> 78657 [1.792242:0x451cb8] MAX_SIZE_ZERO: Set max size 0 for rec 0x139ec50 from tmpl 0x7fffffffcf80 >>>>> 78660 [1.792244:0x451fe0] MAX_SIZE_ZERO: Add extent rec 0x139ed50 max_size 16384 tmpl 0x7fffffffd120 >>>>> I don’t really know what to make of it. >>>>> The cause of the SIGSEGV is that we try to free a list entry that has its next set to NULL. >>>>> #0 list_del (entry=0x555555db0420) at /usr/src/debug/btrfs-progs-v4.10.1/kernel-lib/list.h:125 >>>>> #1 free_all_extent_backrefs (rec=0x555555db0350) at cmds-check.c:5386 >>>>> #2 maybe_free_extent_rec (extent_cache=0x7fffffffd990, rec=0x555555db0350) at cmds-check.c:5417 >>>>> #3 0x00005555555b308f in check_block (flags=<optimized out>, buf=0x55557b87cdf0, extent_cache=0x7fffffffd990, root=0x55555587d570) at cmds-check.c:5851 >>>>> #4 run_next_block (root=root@entry=0x55555587d570, bits=bits@entry=0x5555558841 >>>>> I don’t know if the two problems are related, but they seem to be pretty consistent on this specific disk, so I think that we have a good opportunity to improve btrfsck to make it more robust to this specific form of corruption. But I don’t want to hapazardly modify a code I don’t really understand. So if anybody could make a suggestion on what the right strategy should be when we have max_size == 0, or how to avoid it in the first place. >>>>> I don’t know if this is relevant at all, but all the machines that failed that way were used to run VMs with KVM/QEMU. DIsk activity tends to be somewhat intense on occasions, since the VMs running there are part of a personal Jenkins ring that automatically builds various projects. Nominally, there are between three and five guests running (Windows XP, WIndows 10, macOS, Fedora25, Ubuntu 16.04). >>>>> Thanks >>>>> Christophe de Dinechin >>>>> -- >>>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in >>>>> the body of a message to majordomo@vger.kernel.org >>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>>> >>>> >>>> -- >>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in >>>> the body of a message to majordomo@vger.kernel.org >>>> More majordomo info at http://vger.kernel.org/majordomo-info.html >> >> >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html > > > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: File system corruption, btrfsck abort 2017-04-25 17:50 File system corruption, btrfsck abort Christophe de Dinechin 2017-04-27 14:58 ` Christophe de Dinechin 2017-04-28 0:45 ` Qu Wenruo @ 2017-04-28 3:58 ` Chris Murphy [not found] ` <2CE52079-1B96-4FB3-8CEF-05FC6D3CB183@redhat.com> 2 siblings, 1 reply; 17+ messages in thread From: Chris Murphy @ 2017-04-28 3:58 UTC (permalink / raw) To: Christophe de Dinechin; +Cc: Btrfs BTRFS On Tue, Apr 25, 2017 at 11:50 AM, Christophe de Dinechin <dinechin@redhat.com> wrote: > > The last filesystem corruption is documented here: https://bugzilla.redhat.com/show_bug.cgi?id=1444821. The dmesg log is in there. And also from the bug: >How reproducible: Seen at least 4 times on 3 different disks and 2 different systems. I've been using Fedora and Btrfs on 1/2 dozen different kinds of hardware since around Fedora 13. The oldest file systems are about 2 years old. I've not seen file system corruption. So I'd say there's some kind of workload that's helping to trigger it or it's hardware related; that it's happening on multiple systems makes me wonder if it's power related. > > The btrfsck crash is here: https://bugzilla.redhat.com/show_bug.cgi?id=1435567. I have two crash modes: either an abort or a SIGSEGV. I checked that both still happens on master as of today. The btrfs check crash is another matter. I've seen it crash many times, but the more recent versions are more reliable and haven't seen a crash lately. > I don’t know if this is relevant at all, but all the machines that failed that way were used to run VMs with KVM/QEMU. DIsk activity tends to be somewhat intense on occasions, since the VMs running there are part of a personal Jenkins ring that automatically builds various projects. Nominally, there are between three and five guests running (Windows XP, WIndows 10, macOS, Fedora25, Ubuntu 16.04). I do run VM's quite often with all of my setups but rarely two concurrently and never three or more. So, hmmm. And are the VM's backed by a qemu image on Btrfs? Or LVM? -- Chris Murphy ^ permalink raw reply [flat|nested] 17+ messages in thread
[parent not found: <2CE52079-1B96-4FB3-8CEF-05FC6D3CB183@redhat.com>]
* Re: File system corruption, btrfsck abort [not found] ` <2CE52079-1B96-4FB3-8CEF-05FC6D3CB183@redhat.com> @ 2017-04-28 20:09 ` Chris Murphy 2017-04-29 8:46 ` Christophe de Dinechin 0 siblings, 1 reply; 17+ messages in thread From: Chris Murphy @ 2017-04-28 20:09 UTC (permalink / raw) To: Christophe de Dinechin; +Cc: Btrfs BTRFS On Fri, Apr 28, 2017 at 3:10 AM, Christophe de Dinechin <dinechin@redhat.com> wrote: > > QEMU qcow2. Host is BTRFS. Guests are BTRFS, LVM, Ext4, NTFS (winXP and > win10) and HFS+ (macOS Sierra). I think I had 7 VMs installed, planned to > restore another 8 from backups before my previous disk crash. I usually have > at least 2 running, often as many as 5 (fedora, ubuntu, winXP, win10, macOS) > to cover my software testing needs. That is quite a torture test for any file system but more so Btrfs. How are the qcow2 files being created? What's the qemu-img create command? In particular i'm wondering if these qcow2 files are cow or nocow; if they're compressed by Btrfs; and how many fragments they have with filefrag. When I was using qcow2 for backing I used qemu-img create -f qcow2 -o preallocation=falloc,nocow=on,lazy_refcounts=on But then later I started using fallocated raw files with chattr +C applied. And these days I'm just using LVM thin volumes. The journaled file systems in a guest cause a ton of backing file fragmentation unless nocow is used on Btrfs. I've seen hundreds of thousands of extents for a single backing file for a Windows guest. -- Chris Murphy ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: File system corruption, btrfsck abort 2017-04-28 20:09 ` Chris Murphy @ 2017-04-29 8:46 ` Christophe de Dinechin 2017-04-29 19:13 ` Chris Murphy 2017-04-29 19:18 ` Chris Murphy 0 siblings, 2 replies; 17+ messages in thread From: Christophe de Dinechin @ 2017-04-29 8:46 UTC (permalink / raw) To: Chris Murphy; +Cc: Btrfs BTRFS > On 28 Apr 2017, at 22:09, Chris Murphy <lists@colorremedies.com> wrote: > > On Fri, Apr 28, 2017 at 3:10 AM, Christophe de Dinechin > <dinechin@redhat.com> wrote: > >> >> QEMU qcow2. Host is BTRFS. Guests are BTRFS, LVM, Ext4, NTFS (winXP and >> win10) and HFS+ (macOS Sierra). I think I had 7 VMs installed, planned to >> restore another 8 from backups before my previous disk crash. I usually have >> at least 2 running, often as many as 5 (fedora, ubuntu, winXP, win10, macOS) >> to cover my software testing needs. > > That is quite a torture test for any file system but more so Btrfs. Sorry, but could you elaborate why it’s worse for btrfs? > How are the qcow2 files being created? In most cases, default qcow2 configuration as given by virt-manager. > What's the qemu-img create > command? In particular i'm wondering if these qcow2 files are cow or > nocow; if they're compressed by Btrfs; and how many fragments they > have with filefrag. I suspect they are cow. I’ll check (on the other machine with a similar setup) when I’m back home. > > When I was using qcow2 for backing I used > > qemu-img create -f qcow2 -o preallocation=falloc,nocow=on,lazy_refcounts=on > > But then later I started using fallocated raw files with chattr +C > applied. And these days I'm just using LVM thin volumes. The journaled > file systems in a guest cause a ton of backing file fragmentation > unless nocow is used on Btrfs. I've seen hundreds of thousands of > extents for a single backing file for a Windows guest. Are there btrfs commands I could run on a read-only filesystem that would give me this information? Thanks Christophe > > -- > Chris Murphy > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: File system corruption, btrfsck abort 2017-04-29 8:46 ` Christophe de Dinechin @ 2017-04-29 19:13 ` Chris Murphy 2017-05-03 14:17 ` Christophe de Dinechin 2017-04-29 19:18 ` Chris Murphy 1 sibling, 1 reply; 17+ messages in thread From: Chris Murphy @ 2017-04-29 19:13 UTC (permalink / raw) To: Christophe de Dinechin; +Cc: Btrfs BTRFS On Sat, Apr 29, 2017 at 2:46 AM, Christophe de Dinechin <dinechin@redhat.com> wrote: > >> On 28 Apr 2017, at 22:09, Chris Murphy <lists@colorremedies.com> wrote: >> >> On Fri, Apr 28, 2017 at 3:10 AM, Christophe de Dinechin >> <dinechin@redhat.com> wrote: >> >>> >>> QEMU qcow2. Host is BTRFS. Guests are BTRFS, LVM, Ext4, NTFS (winXP and >>> win10) and HFS+ (macOS Sierra). I think I had 7 VMs installed, planned to >>> restore another 8 from backups before my previous disk crash. I usually have >>> at least 2 running, often as many as 5 (fedora, ubuntu, winXP, win10, macOS) >>> to cover my software testing needs. >> >> That is quite a torture test for any file system but more so Btrfs. > > Sorry, but could you elaborate why it’s worse for btrfs? Copy on write. Four of your five guests use non-cow filesystems, so any overwrite, think journal writes, are new extent writes in Btrfs. Nothing is overwritten in Btrfs. Only after the write completes are the stale extents released. So you get a lot of fragmentation, and all of these tasks you're doing become very metadata heavy workloads. However, what you're doing should work. The consequence should only be one of performance, not file system integrity. So your configuration is useful for testing and making Btrfs better. > >> How are the qcow2 files being created? > > In most cases, default qcow2 configuration as given by virt-manager. > >> What's the qemu-img create >> command? In particular i'm wondering if these qcow2 files are cow or >> nocow; if they're compressed by Btrfs; and how many fragments they >> have with filefrag. > > I suspect they are cow. I’ll check (on the other machine with a similar setup) when I’m back home. Check the qcow2 files with filefrag and see how many extents they have. I'll bet they're massively fragmented. >> When I was using qcow2 for backing I used >> >> qemu-img create -f qcow2 -o preallocation=falloc,nocow=on,lazy_refcounts=on >> >> But then later I started using fallocated raw files with chattr +C >> applied. And these days I'm just using LVM thin volumes. The journaled >> file systems in a guest cause a ton of backing file fragmentation >> unless nocow is used on Btrfs. I've seen hundreds of thousands of >> extents for a single backing file for a Windows guest. > > Are there btrfs commands I could run on a read-only filesystem that would give me this information? lsattr -- Chris Murphy ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: File system corruption, btrfsck abort 2017-04-29 19:13 ` Chris Murphy @ 2017-05-03 14:17 ` Christophe de Dinechin 2017-05-03 14:49 ` Austin S. Hemmelgarn 2017-05-03 17:43 ` Chris Murphy 0 siblings, 2 replies; 17+ messages in thread From: Christophe de Dinechin @ 2017-05-03 14:17 UTC (permalink / raw) To: Chris Murphy; +Cc: Btrfs BTRFS > On 29 Apr 2017, at 21:13, Chris Murphy <lists@colorremedies.com> wrote: > > On Sat, Apr 29, 2017 at 2:46 AM, Christophe de Dinechin > <dinechin@redhat.com> wrote: >> >>> On 28 Apr 2017, at 22:09, Chris Murphy <lists@colorremedies.com> wrote: >>> >>> On Fri, Apr 28, 2017 at 3:10 AM, Christophe de Dinechin >>> <dinechin@redhat.com> wrote: >>> >>>> >>>> QEMU qcow2. Host is BTRFS. Guests are BTRFS, LVM, Ext4, NTFS (winXP and >>>> win10) and HFS+ (macOS Sierra). I think I had 7 VMs installed, planned to >>>> restore another 8 from backups before my previous disk crash. I usually have >>>> at least 2 running, often as many as 5 (fedora, ubuntu, winXP, win10, macOS) >>>> to cover my software testing needs. >>> >>> That is quite a torture test for any file system but more so Btrfs. >> >> Sorry, but could you elaborate why it’s worse for btrfs? > > > Copy on write. Four of your five guests use non-cow filesystems, so > any overwrite, think journal writes, are new extent writes in Btrfs. > Nothing is overwritten in Btrfs. Only after the write completes are > the stale extents released. So you get a lot of fragmentation, and all > of these tasks you're doing become very metadata heavy workloads. Makes sense. Thanks for explaining. > However, what you're doing should work. The consequence should only be > one of performance, not file system integrity. So your configuration > is useful for testing and making Btrfs better. Yes. I just received a new machine, which is intended to become my primary host. That one I installed with ext4, so that I can keep pushing btrfs on my other two Linux hosts. Since I don’t care much about performance of the VMs either (they are build bots for a Jenkins setup), I can leave them in the current sub-optimal configuration. > >>> How are the qcow2 files being created? >> >> In most cases, default qcow2 configuration as given by virt-manager. >> >>> What's the qemu-img create >>> command? In particular i'm wondering if these qcow2 files are cow or >>> nocow; if they're compressed by Btrfs; and how many fragments they >>> have with filefrag. >> >> I suspect they are cow. I’ll check (on the other machine with a similar setup) when I’m back home. > > Check the qcow2 files with filefrag and see how many extents they > have. I'll bet they're massively fragmented. Indeed: fedora25.qcow2: 28358 extents found mac_hdd.qcow2: 79493 extents found ubuntu14.04-64.qcow2: 35069 extents found ubuntu14.04.qcow2: 240 extents found ubuntu16.04-32.qcow2: 81 extents found ubuntu16.04-64.qcow2: 15060 extents found ubuntu16.10-64.qcow2: 228 extents found win10.qcow2: 3438997 extents found winxp.qcow2: 66657 extents found I have no idea why my Win10 guest is so much worse than the others. It’s currently one of the least used, at least it’s not yet operating regularly in my build ring… But I had noticed that the installation of Visual Studio had taken quite a bit of time. > >>> When I was using qcow2 for backing I used >>> >>> qemu-img create -f qcow2 -o preallocation=falloc,nocow=on,lazy_refcounts=on >>> >>> But then later I started using fallocated raw files with chattr +C >>> applied. And these days I'm just using LVM thin volumes. 
The journaled >>> file systems in a guest cause a ton of backing file fragmentation >>> unless nocow is used on Btrfs. I've seen hundreds of thousands of >>> extents for a single backing file for a Windows guest. >> >> Are there btrfs commands I could run on a read-only filesystem that would give me this information? > > lsattr Hmmm. Does that even work on BTRFS? I get this, even after doing a chattr +C on one of the files. ------------------- fedora25.qcow2 ------------------- mac_hdd.qcow2 ------------------- ubuntu14.04-64.qcow2 ------------------- ubuntu14.04.qcow2 ------------------- ubuntu16.04-32.qcow2 ------------------- ubuntu16.04-64.qcow2 ------------------- ubuntu16.10-64.qcow2 ------------------- win10.qcow2 ------------------- winxp.qcow2 Thanks Christophe > > > -- > Chris Murphy ^ permalink raw reply [flat|nested] 17+ messages in thread
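For reference, the C attribute does work on Btrfs, but only if it is set while the file is still empty; a minimal check (the file name is just an example, and the exact width of the lsattr flag field depends on the e2fsprogs version):

$ touch empty.img               ## zero-length file
$ chattr +C empty.img           ## nocow has to be set before any data is written
$ lsattr empty.img
----------------C-- empty.img   ## the C flag shows up here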
* Re: File system corruption, btrfsck abort 2017-05-03 14:17 ` Christophe de Dinechin @ 2017-05-03 14:49 ` Austin S. Hemmelgarn 2017-05-03 17:43 ` Chris Murphy 1 sibling, 0 replies; 17+ messages in thread From: Austin S. Hemmelgarn @ 2017-05-03 14:49 UTC (permalink / raw) To: Christophe de Dinechin, Chris Murphy; +Cc: Btrfs BTRFS On 2017-05-03 10:17, Christophe de Dinechin wrote: > >> On 29 Apr 2017, at 21:13, Chris Murphy <lists@colorremedies.com> wrote: >> >> On Sat, Apr 29, 2017 at 2:46 AM, Christophe de Dinechin >> <dinechin@redhat.com> wrote: >>> >>>> On 28 Apr 2017, at 22:09, Chris Murphy <lists@colorremedies.com> wrote: >>>> >>>> On Fri, Apr 28, 2017 at 3:10 AM, Christophe de Dinechin >>>> <dinechin@redhat.com> wrote: >>>> >>>>> >>>>> QEMU qcow2. Host is BTRFS. Guests are BTRFS, LVM, Ext4, NTFS (winXP and >>>>> win10) and HFS+ (macOS Sierra). I think I had 7 VMs installed, planned to >>>>> restore another 8 from backups before my previous disk crash. I usually have >>>>> at least 2 running, often as many as 5 (fedora, ubuntu, winXP, win10, macOS) >>>>> to cover my software testing needs. >>>> >>>> That is quite a torture test for any file system but more so Btrfs. >>> >>> Sorry, but could you elaborate why it’s worse for btrfs? >> >> >> Copy on write. Four of your five guests use non-cow filesystems, so >> any overwrite, think journal writes, are new extent writes in Btrfs. >> Nothing is overwritten in Btrfs. Only after the write completes are >> the stale extents released. So you get a lot of fragmentation, and all >> of these tasks you're doing become very metadata heavy workloads. > > Makes sense. Thanks for explaining. > > >> However, what you're doing should work. The consequence should only be >> one of performance, not file system integrity. So your configuration >> is useful for testing and making Btrfs better. > > Yes. I just received a new machine, which is intended to become my primary host. That one I installed with ext4, so that I can keep pushing btrfs on my other two Linux hosts. Since I don’t care much about performance of the VMs either (they are build bots for a Jenkins setup), I can leave them in the current sub-optimal configuration. On the note of performance, you can make things slightly better by defragmenting on a regular (weekly is what I would suggest) basis. Make sure to defrag inside the guest first, then defrag the disk image file itself on the host if you do this though, as that will help ensure an optimal layout. FWIW, tools like Ansible or Puppet are great for coordinating this. > >> >>>> How are the qcow2 files being created? >>> >>> In most cases, default qcow2 configuration as given by virt-manager. >>> >>>> What's the qemu-img create >>>> command? In particular i'm wondering if these qcow2 files are cow or >>>> nocow; if they're compressed by Btrfs; and how many fragments they >>>> have with filefrag. >>> >>> I suspect they are cow. I’ll check (on the other machine with a similar setup) when I’m back home. >> >> Check the qcow2 files with filefrag and see how many extents they >> have. I'll bet they're massively fragmented. 
> > Indeed: > > fedora25.qcow2: 28358 extents found > mac_hdd.qcow2: 79493 extents found > ubuntu14.04-64.qcow2: 35069 extents found > ubuntu14.04.qcow2: 240 extents found > ubuntu16.04-32.qcow2: 81 extents found > ubuntu16.04-64.qcow2: 15060 extents found > ubuntu16.10-64.qcow2: 228 extents found > win10.qcow2: 3438997 extents found > winxp.qcow2: 66657 extents found > > I have no idea why my Win10 guest is so much worse than the others. It’s currently one of the least used, at least it’s not yet operating regularly in my build ring… But I had noticed that the installation of Visual Studio had taken quite a bit of time. Windows 10 does a lot more background processing than XP, and a lot of it hits the disk (although most of what you are seeing is probably side effects from the automatically scheduled defrag job that Windows 10 seems to have). It also appears to have a different allocator in the NTFS driver which prefers to spread data under certain circumstances, and VM's appear to be one such situation. > >> >>>> When I was using qcow2 for backing I used >>>> >>>> qemu-img create -f qcow2 -o preallocation=falloc,nocow=on,lazy_refcounts=on >>>> >>>> But then later I started using fallocated raw files with chattr +C >>>> applied. And these days I'm just using LVM thin volumes. The journaled >>>> file systems in a guest cause a ton of backing file fragmentation >>>> unless nocow is used on Btrfs. I've seen hundreds of thousands of >>>> extents for a single backing file for a Windows guest. >>> >>> Are there btrfs commands I could run on a read-only filesystem that would give me this information? >> >> lsattr > > Hmmm. Does that even work on BTRFS? I get this, even after doing a chattr +C on one of the files. > > ------------------- fedora25.qcow2 > ------------------- mac_hdd.qcow2 > ------------------- ubuntu14.04-64.qcow2 > ------------------- ubuntu14.04.qcow2 > ------------------- ubuntu16.04-32.qcow2 > ------------------- ubuntu16.04-64.qcow2 > ------------------- ubuntu16.10-64.qcow2 > ------------------- win10.qcow2 > ------------------- winxp.qcow2 These files wouldn't have been created with the NOCOW attribute by default, as QEMU doesn't know about it. To convert them, you would have to create a new empty file, set that attribute, then use something like cp or dd to copy the data into the new file, then rename it over-top of the old one. Setting these NOCOW may not help as much as it does for pre-allocated raw image files though. ^ permalink raw reply [flat|nested] 17+ messages in thread
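Spelled out, the conversion described above might look roughly like this, with the guest shut down first; file names are examples only:

$ cd /var/lib/libvirt/images
$ touch win10.nocow.qcow2                    ## create the new file while it is still empty
$ chattr +C win10.nocow.qcow2                ## mark it nocow before any data is written to it
$ cat win10.qcow2 > win10.nocow.qcow2        ## copy the image data into the nocow file
$ mv win10.nocow.qcow2 win10.qcow2           ## rename it over the old image
$ btrfs filesystem defragment win10.qcow2    ## optional host-side defrag afterwards

Keep in mind that nocow files on Btrfs are also not checksummed, so this trades data checksumming for less fragmentation.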
* Re: File system corruption, btrfsck abort 2017-05-03 14:17 ` Christophe de Dinechin 2017-05-03 14:49 ` Austin S. Hemmelgarn @ 2017-05-03 17:43 ` Chris Murphy 1 sibling, 0 replies; 17+ messages in thread From: Chris Murphy @ 2017-05-03 17:43 UTC (permalink / raw) To: Christophe de Dinechin; +Cc: Chris Murphy, Btrfs BTRFS On Wed, May 3, 2017 at 8:17 AM, Christophe de Dinechin <dinechin@redhat.com> wrote: >> Check the qcow2 files with filefrag and see how many extents they >> have. I'll bet they're massively fragmented. > > Indeed: > > fedora25.qcow2: 28358 extents found > mac_hdd.qcow2: 79493 extents found > ubuntu14.04-64.qcow2: 35069 extents found > ubuntu14.04.qcow2: 240 extents found > ubuntu16.04-32.qcow2: 81 extents found > ubuntu16.04-64.qcow2: 15060 extents found > ubuntu16.10-64.qcow2: 228 extents found > win10.qcow2: 3438997 extents found > winxp.qcow2: 66657 extents found > > I have no idea why my Win10 guest is so much worse than the others. It’s currently one of the least used, at least it’s not yet operating regularly in my build ring… But I had noticed that the installation of Visual Studio had taken quite a bit of time. I see the same pathological behavior. I don't know if it's Windows, NTFS, or some combination of how Windows+NTFS flushes, and how that flush gets treated as it passes from guest to host. And I don't know if qcow2 itself exacerbates it. But it seems fairly clear most every Windows guest write becomes an extent on Btrfs. If we consider 3438997 extents, and each extent has an extent data item, with ~200 items per 16KiB leaf, that's 17194 leaves, and about 270MiB of metadata. For one file; and that's just the extent data metadata. It doesn't include csums. But anyway there's a lot of churn not just in the extent data getting written out but how much metadata is affected by each write and the obsoleting of extents. Pretty much everything on Btrfs is a write. Even a delete is first a write and only later is space released. > Hmmm. Does that even work on BTRFS? I get this, even after doing a chattr +C on one of the files. > > ------------------- fedora25.qcow2 > ------------------- mac_hdd.qcow2 > ------------------- ubuntu14.04-64.qcow2 > ------------------- ubuntu14.04.qcow2 > ------------------- ubuntu16.04-32.qcow2 > ------------------- ubuntu16.04-64.qcow2 > ------------------- ubuntu16.10-64.qcow2 > ------------------- win10.qcow2 > ------------------- winxp.qcow2 It only works on zero length files. It has to be set at the time the file is created, which is what -o nocow=on does with qemu-img. If you wanted to do this with raw files and make it behave on Btrfs pretty much like it does on any other file system: touch windows.raw chattr +C windows.raw fallocate -l 50g windows.raw It's not possible to retroactively make a cow file nocow, or a nocow file cow. You can copy it to a new location such that it inherits +C (like a new directory). And you can also create a new nocow file, and then cat the old one into the new one. I haven't tried it but presumably you can use either 'qemu-img convert' or 'qemu-img dd' to migrate the data inside a cow qcow2 into a nocow qcow2. I don't know if you'd do the touch > chattr > qemu-image; or if you'd have qemu-img create a new one with -o nocow=on and then use the dd command. -- Chris Murphy ^ permalink raw reply [flat|nested] 17+ messages in thread
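An untested sketch of that 'qemu-img convert' route, letting qemu-img create the destination with the nocow option itself (file names are examples, and it is worth checking that the C flag really ends up on the new file):

$ qemu-img convert -p -f qcow2 -O qcow2 -o nocow=on,lazy_refcounts=on win10.qcow2 win10.nocow.qcow2
$ lsattr win10.nocow.qcow2      ## verify the C flag is set on the converted image
$ filefrag win10.nocow.qcow2    ## a fresh sequential copy should also report far fewer extents
$ mv win10.nocow.qcow2 win10.qcow2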
* Re: File system corruption, btrfsck abort 2017-04-29 8:46 ` Christophe de Dinechin 2017-04-29 19:13 ` Chris Murphy @ 2017-04-29 19:18 ` Chris Murphy 1 sibling, 0 replies; 17+ messages in thread From: Chris Murphy @ 2017-04-29 19:18 UTC (permalink / raw) To: Christophe de Dinechin; +Cc: Chris Murphy, Btrfs BTRFS On Sat, Apr 29, 2017 at 2:46 AM, Christophe de Dinechin <dinechin@redhat.com> wrote: > Are there btrfs commands I could run on a read-only filesystem that would give me this information? qemu-img info <file> will give you the status of lazy refcounts. lsattr will show a capital C in the 3rd to last position if it's nocow filefrag -v will show many extents with the "unwritten" flag if the file is fallocated. $ lsattr ------------------- ./Desktop ------------------- ./Downloads ------------------- ./Templates ------------------- ./Public ------------------- ./Documents ------------------- ./Music ------------------- ./Pictures ------------------- ./Videos --------c---------- ./tmp ##enable compression ------------------- ./Applications ----------------C-- ./hello.qcow2 ##this is nocow -- Chris Murphy ^ permalink raw reply [flat|nested] 17+ messages in thread
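Putting those three checks together on a single image (the file name is only an example):

$ qemu-img info hello.qcow2           ## format, virtual size, lazy refcounts status
$ lsattr hello.qcow2                  ## capital C near the end of the flag field means nocow
$ filefrag -v hello.qcow2 | head      ## fallocated, never-written extents carry the "unwritten" flag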
Thread overview: 17+ messages
2017-04-25 17:50 File system corruption, btrfsck abort Christophe de Dinechin
2017-04-27 14:58 ` Christophe de Dinechin
2017-04-27 15:12 ` Christophe de Dinechin
2017-04-28 0:45 ` Qu Wenruo
2017-04-28 8:47 ` Christophe de Dinechin
2017-05-02 0:17 ` Qu Wenruo
2017-05-03 14:21 ` Christophe de Dinechin
2017-05-04 12:33 ` Christophe de Dinechin
2017-05-05 0:18 ` Qu Wenruo
2017-04-28 3:58 ` Chris Murphy
[not found] ` <2CE52079-1B96-4FB3-8CEF-05FC6D3CB183@redhat.com>
2017-04-28 20:09 ` Chris Murphy
2017-04-29 8:46 ` Christophe de Dinechin
2017-04-29 19:13 ` Chris Murphy
2017-05-03 14:17 ` Christophe de Dinechin
2017-05-03 14:49 ` Austin S. Hemmelgarn
2017-05-03 17:43 ` Chris Murphy
2017-04-29 19:18 ` Chris Murphy