* Unocorrectable errors with RAID1 @ 2017-01-16 11:10 Christoph Groth 2017-01-16 13:24 ` Austin S. Hemmelgarn 2017-01-16 22:45 ` Goldwyn Rodrigues 0 siblings, 2 replies; 20+ messages in thread From: Christoph Groth @ 2017-01-16 11:10 UTC (permalink / raw) To: linux-btrfs [-- Attachment #1: Type: text/plain, Size: 7876 bytes --] Hi, I’ve been using a btrfs RAID1 of two hard disks since early 2012 on my home server. The machine has been working well overall, but recently some problems with the file system surfaced. Since I do have backups, I do not worry about the data, but I post here to better understand what happened. Also I cannot exclude that my case is useful in some way to btrfs development. First some information about the system: root@mim:~# uname -a Linux mim 4.6.0-1-amd64 #1 SMP Debian 4.6.3-1 (2016-07-04) x86_64 GNU/Linux root@mim:~# btrfs --version btrfs-progs v4.7.3 root@mim:~# btrfs fi show Label: none uuid: 2da00153-f9ea-4d6c-a6cc-10c913d22686 Total devices 2 FS bytes used 345.97GiB devid 1 size 465.29GiB used 420.06GiB path /dev/sda2 devid 2 size 465.29GiB used 420.04GiB path /dev/sdb2 root@mim:~# btrfs fi df / Data, RAID1: total=417.00GiB, used=344.62GiB Data, single: total=8.00MiB, used=0.00B System, RAID1: total=40.00MiB, used=68.00KiB System, single: total=4.00MiB, used=0.00B Metadata, RAID1: total=3.00GiB, used=1.35GiB Metadata, single: total=8.00MiB, used=0.00B GlobalReserve, single: total=464.00MiB, used=0.00B root@mim:~# dmesg | grep -i btrfs [ 4.165859] Btrfs loaded [ 4.481712] BTRFS: device fsid 2da00153-f9ea-4d6c-a6cc-10c913d22686 devid 1 transid 2075354 /dev/sda2 [ 4.482025] BTRFS: device fsid 2da00153-f9ea-4d6c-a6cc-10c913d22686 devid 2 transid 2075354 /dev/sdb2 [ 4.521090] BTRFS info (device sdb2): disk space caching is enabled [ 4.628506] BTRFS info (device sdb2): bdev /dev/sdb2 errs: wr 0, rd 0, flush 0, corrupt 3, gen 0 [ 4.628521] BTRFS info (device sdb2): bdev /dev/sda2 errs: wr 0, rd 0, flush 0, corrupt 3, gen 0 [ 18.315694] BTRFS info (device sdb2): disk space caching is enabled The disks themselves have been turning for almost 5 years by now, but their SMART health is still fully satisfactory. I noticed that something was wrong because printing stopped working. So I did a scrub that detected 0 "correctable errors" and 6 "uncorrectable" errors.
The relevant bits from kern.log are: Jan 11 11:05:56 mim kernel: [159873.938579] BTRFS warning (device sdb2): checksum error at logical 180829634560 on dev /dev/sdb2, sector 353143968, root 5, inode 10014144, offset 221184, length 4096, links 1 (path: usr/lib/x86_64-linux-gnu/libcups.so.2) Jan 11 11:05:57 mim kernel: [159874.857132] BTRFS warning (device sdb2): checksum error at logical 180829634560 on dev /dev/sda2, sector 353182880, root 5, inode 10014144, offset 221184, length 4096, links 1 (path: usr/lib/x86_64-linux-gnu/libcups.so.2) Jan 11 11:28:42 mim kernel: [161240.083721] BTRFS warning (device sdb2): checksum error at logical 260254629888 on dev /dev/sda2, sector 508309824, root 5, inode 9990924, offset 6676480, length 4096, links 1 (path: var/lib/apt/lists/ftp.fr.debian.org_debian_dists_unstable_main_binary-amd64_Packages) Jan 11 11:28:42 mim kernel: [161240.235837] BTRFS warning (device sdb2): checksum error at logical 260254638080 on dev /dev/sda2, sector 508309840, root 5, inode 9990924, offset 6684672, length 4096, links 1 (path: var/lib/apt/lists/ftp.fr.debian.org_debian_dists_unstable_main_binary-amd64_Packages) Jan 11 11:37:21 mim kernel: [161759.725120] BTRFS warning (device sdb2): checksum error at logical 260254629888 on dev /dev/sdb2, sector 508270912, root 5, inode 9990924, offset 6676480, length 4096, links 1 (path: var/lib/apt/lists/ftp.fr.debian.org_debian_dists_unstable_main_binary-amd64_Packages) Jan 11 11:37:21 mim kernel: [161759.750251] BTRFS warning (device sdb2): checksum error at logical 260254638080 on dev /dev/sdb2, sector 508270928, root 5, inode 9990924, offset 6684672, length 4096, links 1 (path: var/lib/apt/lists/ftp.fr.debian.org_debian_dists_unstable_main_binary-amd64_Packages) As you can see each disk has the same three errors, and there are no other errors. Random bad blocks cannot explain this situation. I asked on #btrfs and someone suggested that these errors are likely due to RAM problems. This may indeed be the case, since the machine has no ECC. I managed to fix these errors by replacing the broken files with good copies. Scrubbing shows no errors now: root@mim:~# btrfs scrub status / scrub status for 2da00153-f9ea-4d6c-a6cc-10c913d22686 scrub started at Sat Jan 14 12:52:03 2017 and finished after 01:49:10 total bytes scrubbed: 699.17GiB with 0 errors However, there are further problems. When trying to archive the full filesystem I noticed that some files/directories cannot be read. (The problem is localized to some ".git" directory that I don’t need.) 
Any attempt to read the broken files (or to delete them) does not work: $ du -sh .git du: cannot access '.git/objects/28/ea2aae3fe57ab4328adaa8b79f3c1cf005dd8d': No such file or directory du: cannot access '.git/objects/28/fd95a5e9d08b6684819ce6e3d39d99e2ecccd5': Stale file handle du: cannot access '.git/objects/28/52e887ed436ed2c549b20d4f389589b7b58e09': Stale file handle du: cannot access '.git/objects/info': Stale file handle du: cannot access '.git/objects/pack': Stale file handle During the above command the following lines were added to kern.log: Jan 16 09:41:34 mim kernel: [132206.957566] BTRFS critical (device sda2): corrupt leaf, slot offset bad: block=192561152,root=1, slot=15 Jan 16 09:41:34 mim kernel: [132206.957924] BTRFS critical (device sda2): corrupt leaf, slot offset bad: block=192561152,root=1, slot=15 Jan 16 09:41:34 mim kernel: [132206.958505] BTRFS critical (device sda2): corrupt leaf, slot offset bad: block=192561152,root=1, slot=15 Jan 16 09:41:34 mim kernel: [132206.958971] BTRFS critical (device sda2): corrupt leaf, slot offset bad: block=192561152,root=1, slot=15 Jan 16 09:41:34 mim kernel: [132206.959534] BTRFS critical (device sda2): corrupt leaf, slot offset bad: block=192561152,root=1, slot=15 Jan 16 09:41:34 mim kernel: [132206.959874] BTRFS critical (device sda2): corrupt leaf, slot offset bad: block=192561152,root=1, slot=15 Jan 16 09:41:34 mim kernel: [132206.960523] BTRFS critical (device sda2): corrupt leaf, slot offset bad: block=192561152,root=1, slot=15 Jan 16 09:41:34 mim kernel: [132206.960943] BTRFS critical (device sda2): corrupt leaf, slot offset bad: block=192561152,root=1, slot=15 So I tried to repair the file system by running "btrfs check --repair", but this doesn’t work: (initramfs) btrfs --version btrfs-progs v4.7.3 (initramfs) btrfs check --repair /dev/sda2 UUID: ... checking extents incorrect offsets 2527 2543 items overlap, can't fix cmds-check.c:4297: fix_item_offset: Assertion `ret` failed. btrfs[0x41a8b4] btrfs[0x41a8db] btrfs[0x42428b] btrfs[0x424f83] btrfs[0x4259cd] btrfs(cmd_check+0x1111)[0x427d6d] btrfs(main+0x12f)[0x40a341] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7fd98859d2b1] btrfs(_start+0x2a)[0x40a37a] I now have the following questions: * So scrubbing is not enough to check the health of a btrfs file system? It’s also necessary to read all the files? * Any ideas what could have caused the "stale file handle" errors? Is there any way to fix them? Of course RAM errors can in principle have _any_ consequences, but I would have hoped that even without ECC RAM it’s practically impossible to end up with an unrepairable file system. Perhaps I simply had very bad luck. * I believe that btrfs RAID1 is considered reasonably safe for production use by now. I want to replace that home server with a new machine (still without ECC). Is it a good idea to use btrfs for the main file system? I would certainly hope so! :-) Thanks for your time, Christoph [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 832 bytes --] ^ permalink raw reply [flat|nested] 20+ messages in thread
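For reference, the error counters discussed in this report can be re-checked at any time with standard btrfs-progs commands; the mount point below is only an illustration of the root filesystem described above:

  # run a scrub in the foreground and print separate statistics per device
  btrfs scrub start -Bd /
  # show the cumulative per-device error counters kept by the filesystem
  btrfs device stats /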
* Re: Unocorrectable errors with RAID1 2017-01-16 11:10 Unocorrectable errors with RAID1 Christoph Groth @ 2017-01-16 13:24 ` Austin S. Hemmelgarn 2017-01-16 15:42 ` Christoph Groth 2017-01-16 22:45 ` Goldwyn Rodrigues 1 sibling, 1 reply; 20+ messages in thread From: Austin S. Hemmelgarn @ 2017-01-16 13:24 UTC (permalink / raw) To: Christoph Groth, linux-btrfs On 2017-01-16 06:10, Christoph Groth wrote: > Hi, > > I’ve been using a btrfs RAID1 of two hard disks since early 2012 on my > home server. The machine has been working well overall, but recently > some problems with the file system surfaced. Since I do have backups, I > do not worry about the data, but I post here to better understand what > happened. Also I cannot exclude that my case is useful in some way to > btrfs development. > > First some information about the system: > > root@mim:~# uname -a > Linux mim 4.6.0-1-amd64 #1 SMP Debian 4.6.3-1 (2016-07-04) x86_64 GNU/Linux > root@mim:~# btrfs --version > btrfs-progs v4.7.3 You get bonus points for being up-to-date both with the kernel and the userspace tools. > root@mim:~# btrfs fi show > Label: none uuid: 2da00153-f9ea-4d6c-a6cc-10c913d22686 > Total devices 2 FS bytes used 345.97GiB > devid 1 size 465.29GiB used 420.06GiB path /dev/sda2 > devid 2 size 465.29GiB used 420.04GiB path /dev/sdb2 > > root@mim:~# btrfs fi df / > Data, RAID1: total=417.00GiB, used=344.62GiB > Data, single: total=8.00MiB, used=0.00B > System, RAID1: total=40.00MiB, used=68.00KiB > System, single: total=4.00MiB, used=0.00B > Metadata, RAID1: total=3.00GiB, used=1.35GiB > Metadata, single: total=8.00MiB, used=0.00B > GlobalReserve, single: total=464.00MiB, used=0.00B Just a general comment on this, you might want to consider running a full balance on this filesystem, you've got a huge amount of slack space in the data chunks (over 70GiB), and significant space in the Metadata chunks that isn't accounted for by the GlobalReserve, as well as a handful of empty single profile chunks which are artifacts from some old versions of mkfs. This isn't of course essential, but keeping ahead of such things does help sometimes when you have issues. > root@mim:~# dmesg | grep -i btrfs > [ 4.165859] Btrfs loaded > [ 4.481712] BTRFS: device fsid 2da00153-f9ea-4d6c-a6cc-10c913d22686 > devid 1 transid 2075354 /dev/sda2 > [ 4.482025] BTRFS: device fsid 2da00153-f9ea-4d6c-a6cc-10c913d22686 > devid 2 transid 2075354 /dev/sdb2 > [ 4.521090] BTRFS info (device sdb2): disk space caching is enabled > [ 4.628506] BTRFS info (device sdb2): bdev /dev/sdb2 errs: wr 0, rd > 0, flush 0, corrupt 3, gen 0 > [ 4.628521] BTRFS info (device sdb2): bdev /dev/sda2 errs: wr 0, rd > 0, flush 0, corrupt 3, gen 0 > [ 18.315694] BTRFS info (device sdb2): disk space caching is enabled > > The disks themselves have been turning for almost 5 years by now, but > their SMART health is still fully satisfactory. > > I noticed that something was wrong because printing stopped to work. So > I did a scrub that detected 0 "correctable errors" and 6 "uncorrectable" > errors. 
The relevant bits from kern.log are: > > Jan 11 11:05:56 mim kernel: [159873.938579] BTRFS warning (device sdb2): > checksum error at logical 180829634560 on dev /dev/sdb2, sector > 353143968, root 5, inode 10014144, offset 221184, length 4096, links 1 > (path: usr/lib/x86_64-linux-gnu/libcups.so.2) > Jan 11 11:05:57 mim kernel: [159874.857132] BTRFS warning (device sdb2): > checksum error at logical 180829634560 on dev /dev/sda2, sector > 353182880, root 5, inode 10014144, offset 221184, length 4096, links 1 > (path: usr/lib/x86_64-linux-gnu/libcups.so.2) > Jan 11 11:28:42 mim kernel: [161240.083721] BTRFS warning (device sdb2): > checksum error at logical 260254629888 on dev /dev/sda2, sector > 508309824, root 5, inode 9990924, offset 6676480, length 4096, links 1 > (path: > var/lib/apt/lists/ftp.fr.debian.org_debian_dists_unstable_main_binary-amd64_Packages) > > Jan 11 11:28:42 mim kernel: [161240.235837] BTRFS warning (device sdb2): > checksum error at logical 260254638080 on dev /dev/sda2, sector > 508309840, root 5, inode 9990924, offset 6684672, length 4096, links 1 > (path: > var/lib/apt/lists/ftp.fr.debian.org_debian_dists_unstable_main_binary-amd64_Packages) > > Jan 11 11:37:21 mim kernel: [161759.725120] BTRFS warning (device sdb2): > checksum error at logical 260254629888 on dev /dev/sdb2, sector > 508270912, root 5, inode 9990924, offset 6676480, length 4096, links 1 > (path: > var/lib/apt/lists/ftp.fr.debian.org_debian_dists_unstable_main_binary-amd64_Packages) > > Jan 11 11:37:21 mim kernel: [161759.750251] BTRFS warning (device sdb2): > checksum error at logical 260254638080 on dev /dev/sdb2, sector > 508270928, root 5, inode 9990924, offset 6684672, length 4096, links 1 > (path: > var/lib/apt/lists/ftp.fr.debian.org_debian_dists_unstable_main_binary-amd64_Packages) > > > As you can see each disk has the same three errors, and there are no > other errors. Random bad blocks cannot explain this situation. I asked > on #btrfs and someone suggested that these errors are likely due to RAM > problems. This may indeed be the case, since the machine has no ECC. I > managed to fix these errors by replacing the broken files with good > copies. Scrubbing shows no errors now: > > root@mim:~# btrfs scrub status / > scrub status for 2da00153-f9ea-4d6c-a6cc-10c913d22686 > scrub started at Sat Jan 14 12:52:03 2017 and finished after > 01:49:10 > total bytes scrubbed: 699.17GiB with 0 errors > > However, there are further problems. When trying to archive the full > filesystem I noticed that some files/directories cannot be read. (The > problem is localized to some ".git" directory that I don’t need.) 
Any > attempt to read the broken files (or to delete them) does not work: > > $ du -sh .git > du: cannot access > '.git/objects/28/ea2aae3fe57ab4328adaa8b79f3c1cf005dd8d': No such file > or directory > du: cannot access > '.git/objects/28/fd95a5e9d08b6684819ce6e3d39d99e2ecccd5': Stale file handle > du: cannot access > '.git/objects/28/52e887ed436ed2c549b20d4f389589b7b58e09': Stale file handle > du: cannot access '.git/objects/info': Stale file handle > du: cannot access '.git/objects/pack': Stale file handle > > During the above command the following lines were added to kern.log: > > Jan 16 09:41:34 mim kernel: [132206.957566] BTRFS critical (device > sda2): corrupt leaf, slot offset bad: block=192561152,root=1, slot=15 > Jan 16 09:41:34 mim kernel: [132206.957924] BTRFS critical (device > sda2): corrupt leaf, slot offset bad: block=192561152,root=1, slot=15 > Jan 16 09:41:34 mim kernel: [132206.958505] BTRFS critical (device > sda2): corrupt leaf, slot offset bad: block=192561152,root=1, slot=15 > Jan 16 09:41:34 mim kernel: [132206.958971] BTRFS critical (device > sda2): corrupt leaf, slot offset bad: block=192561152,root=1, slot=15 > Jan 16 09:41:34 mim kernel: [132206.959534] BTRFS critical (device > sda2): corrupt leaf, slot offset bad: block=192561152,root=1, slot=15 > Jan 16 09:41:34 mim kernel: [132206.959874] BTRFS critical (device > sda2): corrupt leaf, slot offset bad: block=192561152,root=1, slot=15 > Jan 16 09:41:34 mim kernel: [132206.960523] BTRFS critical (device > sda2): corrupt leaf, slot offset bad: block=192561152,root=1, slot=15 > Jan 16 09:41:34 mim kernel: [132206.960943] BTRFS critical (device > sda2): corrupt leaf, slot offset bad: block=192561152,root=1, slot=15 > > So I tried to repair the file system by running "btrfs check --repair", > but this doesn’t work: > > (initramfs) btrfs --version > btrfs-progs v4.7.3 > (initramfs) btrfs check --repair /dev/sda2 > UUID: ... > checking extents > incorrect offsets 2527 2543 > items overlap, can't fix > cmds-check.c:4297: fix_item_offset: Assertion `ret` failed. > btrfs[0x41a8b4] > btrfs[0x41a8db] > btrfs[0x42428b] > btrfs[0x424f83] > btrfs[0x4259cd] > btrfs(cmd_check+0x1111)[0x427d6d] > btrfs(main+0x12f)[0x40a341] > /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7fd98859d2b1] > btrfs(_start+0x2a)[0x40a37a] > > I now have the following questions: > > * So scrubbing is not enough to check the health of a btrfs file > system? It’s also necessary to read all the files? Scrubbing checks data integrity, but not the state of the data. IOW, you're checking that the data and metadata match with the checksums, but not necessarily that the filesystem itself is valid. > > * Any ideas what coud have caused the "stale file handle" errors? Is > there any way to fix them? Of course RAM errors can in principle have > _any_ consequences, but I would have hoped that even without ECC RAM > it’s practically inpossible to end up with an unrepairable file > system. Perhaps I simply had very bad luck. -ESTALE is _supposed_ to be a networked filesystem only thing. BTRFS returns it somewhere, and I've been meaning to track down where (because there is almost certainly a more correct error code to return there), I just haven't had time to do so. As far as RAM, it absolutely is possible for bad RAM or even just transient memory errors to cause filesystem corruption. The disk itself stores exactly what it was told to (in theory), so if it was told to store bad data, it stores bad data. 
I've lost at least 3 filesystems over the past 5 years just due to bad memory, although I've been particularly unlucky in that respect. There are a few things you can do to mitigate the risk of not using ECC RAM though: * Reboot regularly, at least weekly, and possibly more frequently. * Keep the system cool, warmer components are more likely to have transient errors. * Prefer fewer memory modules when possible. Fewer modules means less total area that could be hit by cosmic rays or other high-energy radiation (the main cause of most transient errors). > > * I believe that btrfs RAID1 is considered reasonably safe for > production use by now. I want to replace that home server with a new > machine (still without ECC). Is it a good idea to use btrfs for the > main file system? I would certainly hope so! :-) FWIW, this wasn't exactly an issue with BTRFS; any other filesystem would have failed similarly, although others likely would have done more damage (instead of failing to load libcups due to -EIO, you would have seen seemingly random segfaults from apps using it when they tried to use the corrupted data). In fact, if it weren't for the fact that you're using BTRFS, it likely would have taken longer for you to figure out what had happened. If you were using ext4 (or XFS, or almost any other filesystem except for ZFS), you likely would have had no indication that anything was wrong other than printing not working until you re-installed whatever package included libcups. As for raid1 mode in particular, I consider it stable, and quite a few other people do, but even the most stable software has issues from time to time. That said, I have not lost a single filesystem using raid1 mode to a filesystem bug since at least kernel 3.16. I have lost a few to hardware issues, but if I hadn't been using BTRFS I wouldn't have figured out nearly as quickly that I had said hardware issues. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Unocorrectable errors with RAID1 2017-01-16 13:24 ` Austin S. Hemmelgarn @ 2017-01-16 15:42 ` Christoph Groth 2017-01-16 16:29 ` Austin S. Hemmelgarn 0 siblings, 1 reply; 20+ messages in thread From: Christoph Groth @ 2017-01-16 15:42 UTC (permalink / raw) To: Austin S. Hemmelgarn; +Cc: linux-btrfs Austin S. Hemmelgarn wrote: > On 2017-01-16 06:10, Christoph Groth wrote: >> root@mim:~# btrfs fi df / >> Data, RAID1: total=417.00GiB, used=344.62GiB >> Data, single: total=8.00MiB, used=0.00B >> System, RAID1: total=40.00MiB, used=68.00KiB >> System, single: total=4.00MiB, used=0.00B >> Metadata, RAID1: total=3.00GiB, used=1.35GiB >> Metadata, single: total=8.00MiB, used=0.00B >> GlobalReserve, single: total=464.00MiB, used=0.00B > Just a general comment on this, you might want to consider > running a full balance on this filesystem, you've got a huge > amount of slack space in the data chunks (over 70GiB), and > significant space in the Metadata chunks that isn't accounted > for by the GlobalReserve, as well as a handful of empty single > profile chunks which are artifacts from some old versions of > mkfs. This isn't of course essential, but keeping ahead of such > things does help sometimes when you have issues. Thanks! So slack is the difference between "total" and "used"? I saw that the manpage of "btrfs balance" explains this a bit in its "examples" section. Are you aware of any more in-depth documentation? Or one has to look at the source at this level? I ran btrfs balance start -dconvert=raid1,soft -mconvert=raid1,soft / btrfs balance start -dusage=25 -musage=25 / This resulted in root@mim:~# btrfs fi df / Data, RAID1: total=365.00GiB, used=344.61GiB System, RAID1: total=32.00MiB, used=64.00KiB Metadata, RAID1: total=2.00GiB, used=1.35GiB GlobalReserve, single: total=460.00MiB, used=0.00B I hope that one day there will be a daemon that silently performs all the necessary btrfs maintenance in the background when system load is low! >> * So scrubbing is not enough to check the health of a btrfs >> file system? It’s also necessary to read all the files? > Scrubbing checks data integrity, but not the state of the data. > IOW, you're checking that the data and metadata match with the > checksums, but not necessarily that the filesystem itself is > valid. I see, but what should one then do to detect problems such as mine as soon as possible? Periodically calculate hashes for all files? I’ve never seen a recommendation to do that for btrfs. > There are a few things you can do to mitigate the risk of not > using ECC RAM though: > * Reboot regularly, at least weekly, and possibly more > frequently. > * Keep the system cool, warmer components are more likely to > have transient errors. > * Prefer fewer numbers of memory modules when possible. Fewer > modules means less total area that could be hit by cosmic rays > or other high-energy radiation (the main cause of most transient > errors). Thanks for the advice, I think I buy the regular reboots. As a consequence of my problem I think I’ll stop using RAID1 on the file server, since this only protects against dead disks, which evidently is only part of the problem. Instead, I’ll make sure that the laptop that syncs with the server has a SSD that is big enough to hold all the data that is on the server as well (1 TB SSDs are affordable now). This way, instead of disk-level redundancy, I’ll have machine-level redundancy. 
When something like the current problem hits one of the two machines, I should still have a usable second machine with all the data on it. ^ permalink raw reply [flat|nested] 20+ messages in thread
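One common way to implement that kind of machine-level redundancy is a periodic one-way sync; the tool and paths below are purely illustrative (the message does not say which tool is actually used), with rsync as just one possible choice:

  # mirror the server's data onto the laptop's SSD, preserving hard links, ACLs and xattrs
  rsync -aHAX --delete server:/srv/data/ /srv/data-mirror/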
* Re: Unocorrectable errors with RAID1 2017-01-16 15:42 ` Christoph Groth @ 2017-01-16 16:29 ` Austin S. Hemmelgarn 2017-01-17 4:50 ` Janos Toth F. 2017-01-17 9:18 ` Christoph Groth 0 siblings, 2 replies; 20+ messages in thread From: Austin S. Hemmelgarn @ 2017-01-16 16:29 UTC (permalink / raw) To: Christoph Groth; +Cc: linux-btrfs On 2017-01-16 10:42, Christoph Groth wrote: > Austin S. Hemmelgarn wrote: >> On 2017-01-16 06:10, Christoph Groth wrote: > >>> root@mim:~# btrfs fi df / >>> Data, RAID1: total=417.00GiB, used=344.62GiB >>> Data, single: total=8.00MiB, used=0.00B >>> System, RAID1: total=40.00MiB, used=68.00KiB >>> System, single: total=4.00MiB, used=0.00B >>> Metadata, RAID1: total=3.00GiB, used=1.35GiB >>> Metadata, single: total=8.00MiB, used=0.00B >>> GlobalReserve, single: total=464.00MiB, used=0.00B > >> Just a general comment on this, you might want to consider running a >> full balance on this filesystem, you've got a huge amount of slack >> space in the data chunks (over 70GiB), and significant space in the >> Metadata chunks that isn't accounted for by the GlobalReserve, as well >> as a handful of empty single profile chunks which are artifacts from >> some old versions of mkfs. This isn't of course essential, but >> keeping ahead of such things does help sometimes when you have issues. > > Thanks! So slack is the difference between "total" and "used"? I saw > that the manpage of "btrfs balance" explains this a bit in its > "examples" section. Are you aware of any more in-depth documentation? > Or one has to look at the source at this level? There's not really much in the way of great documentation that I know of. I can however cover the basics here: BTRFS uses a 2 level allocation system. At the higher level, you have chunks. These are just big blocks of space on the disk that get used for only one type of lower level allocation (Data, Metadata, or System). Data chunks are normally 1GB, Metadata 256MB, and System depends on the size of the FS when it was created. Within these chunks, BTRFS then allocates individual blocks just like any other filesystem. When there is no free space in any existing chunks for a new block that needs allocated, a new chunk is allocated. Newly allocated chunks may be larger (if the filesystem is really big) or smaller (if the FS doesn't have much free space left at the chunk level) than the default. In the event that BTRFS can't allocate a new chunk because there's no room, a couple of different things could happen. If the chunk to be allocated was a data chunk, you get -ENOSPC (usually, sometimes you might get other odd results) in the userspace application that triggered the allocation. However, if BTRFS needs room for metadata, then it will try to use the GlobalReserve instead. This is a special area within the metadata chunks that's reserved for internal operations and trying to get out of free space exhaustion situations. If that fails, then the filesystem is functionally dead, reads will still work, and you might be able to write very small amounts of data at a time, but it's not possible from a practical perspective to recover a filesystem in such a situation. The 'total' value in fi df output is the total space allocated to chunks of that type, while the 'used' value is how much is actually being used. 
It's worth noting that since GlobalReserve is a part of the Metadata chunks, the total there is part of the total for Metadata, but not the used value (so in an ideal situation with no slack space at the block level, you would still see a difference between metadata total and used equal to the global reserve total). What balancing does is send everything back through the allocator, which in turn back-fills chunks that are only partially full, and removes ones that are now empty. In normal usage, it's not absolutely needed. From a practical perspective though, it's generally a good idea to keep the slack space (the difference between total and used) within chunks to a minimum to try and avoid getting the filesystem stuck with no free space at the chunk level. > > I ran > > btrfs balance start -dconvert=raid1,soft -mconvert=raid1,soft / > btrfs balance start -dusage=25 -musage=25 / > > This resulted in > > root@mim:~# btrfs fi df / > Data, RAID1: total=365.00GiB, used=344.61GiB > System, RAID1: total=32.00MiB, used=64.00KiB > Metadata, RAID1: total=2.00GiB, used=1.35GiB > GlobalReserve, single: total=460.00MiB, used=0.00B This is a much saner looking FS, you've only got about 20GB of slack in Data chunks, and less than 1GB in metadata, which is reasonable given the size of the FS and how much data you have on it. Ideal values for both are actually hard to determine, as having no slack in the chunks actually hurts performance a bit, and the ideal values depend on how much your workloads hit each type of chunk. > > I hope that one day there will be a daemon that silently performs all > the necessary btrfs maintenance in the background when system load is low! FWIW, while there isn't a daemon yet that does this, it's a perfect thing for a cronjob. The general maintenance regimen that I use for most of my filesystems is: * Run 'btrfs balance start -dusage=20 -musage=20' daily. This will complete really fast on most filesystems, and keeps the slack-space relatively under-control (and has the nice bonus that it helps defragment free space. * Run a full scrub on all filesystems weekly. This catches silent corruption of the data, and will fix it if possible. * Run a full defrag on all filesystems monthly. This should be run before the balance (reasons are complicated and require more explanation than you probably care for). I would run this at least weekly though on HDD's, as they tend to be more negatively impacted by fragmentation. There are a couple of other things I also do (fstrim and punching holes in large files to make them sparse), but they're not really BTRFS specific. Overall, with a decent SSD (I usually use Crucial MX series SSD's in my personal systems), these have near zero impact most of the time, and with decent HDD's, you should have limited issues as long as you run on only one FS at a time. > >>> * So scrubbing is not enough to check the health of a btrfs file >>> system? It’s also necessary to read all the files? > >> Scrubbing checks data integrity, but not the state of the data. IOW, >> you're checking that the data and metadata match with the checksums, >> but not necessarily that the filesystem itself is valid. > > I see, but what should one then do to detect problems such as mine as > soon as possible? Periodically calculate hashes for all files? I’ve > never seen a recommendation to do that for btrfs. Scrub will verify that the data is the same as when the kernel calculated the block checksum. That's really the best that can be done. 
In your case, it couldn't correct the errors because both copies of the corrupted blocks were bad (this points at an issue with either RAM or the storage controller BTW, not the disks themselves). Had one of the copies been valid, it would have intelligently detected which one was bad and fixed things. It's worth noting that the combination of checksumming and scrub actually provides more stringent data integrity guarantees than any other widely used filesystem except ZFS. As far as general monitoring, in addition to scrubbing (and obviously watching SMART status) you want to check the output of 'btrfs device stats' for non-zero error counters (these are cumulative counters that are only reset when the user says to do so, so right now they'll show aggregate data for the life of the FS), and if you're paranoid, watch that the mount options on the FS don't change (some monitoring software such as Monit makes this insanely easy to do), as the FS will go read-only if a severe error is detected (stuff like a failed read at the device level, not just checksum errors). > >> There are a few things you can do to mitigate the risk of not using >> ECC RAM though: >> * Reboot regularly, at least weekly, and possibly more frequently. >> * Keep the system cool, warmer components are more likely to have >> transient errors. >> * Prefer fewer numbers of memory modules when possible. Fewer modules >> means less total area that could be hit by cosmic rays or other >> high-energy radiation (the main cause of most transient errors). > > Thanks for the advice, I think I buy the regular reboots. > > As a consequence of my problem I think I’ll stop using RAID1 on the file > server, since this only protects against dead disks, which evidently is > only part of the problem. Instead, I’ll make sure that the laptop that > syncs with the server has a SSD that is big enough to hold all the data > that is on the server as well (1 TB SSDs are affordable now). This way, > instead of disk-level redundancy, I’ll have machine-level redundancy. > When something like the current problem hits one of the two machines, I > should still have a usable second machine with all the data on it. I actually have a similar situation, I've got a laptop that I back-up to a personal server system. In my case though, I've take a much higher-level approach, the backup storage is in fact GlusterFS (a clustered filesystem) running on top of BTRFS on 3 different systems (the server, plus a pair of Intel NUC's that are just dedicated SAN systems). If I didn't have the hardware to do this or cared about performance more (I'm lucky if I get 20MB/s write speed, but most of the issue is that I went cheap on the NUC's), I would probably still be using BTRFS in raid1 mode on the server despite keeping a copy on the laptop, simply because that provides an extra layer of protection on the server side. ^ permalink raw reply [flat|nested] 20+ messages in thread
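For those who want to script the regimen and monitoring described above, a rough /etc/crontab-style sketch follows; the mount point, schedule and usage thresholds are assumptions to adapt, and cron's habit of mailing any output is what turns the last line into a simple alert:

  # m h dom mon dow user command
  15 3 *   *   *   root btrfs balance start -dusage=20 -musage=20 /   # daily: compact mostly-empty chunks
  30 3 *   *   0   root btrfs scrub start -B /                        # weekly: catch and, where possible, repair silent corruption
  45 2 1   *   *   root btrfs filesystem defragment -r /              # monthly: defrag, before the next balance
  0  4 *   *   *   root btrfs device stats / | grep -v ' 0$'          # daily: any non-zero error counter gets mailed by cron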
* Re: Unocorrectable errors with RAID1 2017-01-16 16:29 ` Austin S. Hemmelgarn @ 2017-01-17 4:50 ` Janos Toth F. 2017-01-17 12:25 ` Austin S. Hemmelgarn 2017-01-17 9:18 ` Christoph Groth 1 sibling, 1 reply; 20+ messages in thread From: Janos Toth F. @ 2017-01-17 4:50 UTC (permalink / raw) To: Btrfs BTRFS > BTRFS uses a 2 level allocation system. At the higher level, you have > chunks. These are just big blocks of space on the disk that get used for > only one type of lower level allocation (Data, Metadata, or System). Data > chunks are normally 1GB, Metadata 256MB, and System depends on the size of > the FS when it was created. Within these chunks, BTRFS then allocates > individual blocks just like any other filesystem. This always seems to confuse me when I try to get an abstract idea about de-/fragmentation of Btrfs. Can meta-/data be fragmented on both levels? And if so, can defrag and/or balance "cure" both levels of fragmentation (if any)? But how? Maybe several defrag and balance runs, repeated until the returns diminish (or at least you consider them meaningless and/or unnecessary)? > What balancing does is send everything back through the allocator, which in > turn back-fills chunks that are only partially full, and removes ones that > are now empty. Doesn't this have a potential chance of introducing (additional) extent-level fragmentation? > FWIW, while there isn't a daemon yet that does this, it's a perfect thing > for a cronjob. The general maintenance regimen that I use for most of my > filesystems is: > * Run 'btrfs balance start -dusage=20 -musage=20' daily. This will complete > really fast on most filesystems, and keeps the slack-space relatively > under-control (and has the nice bonus that it helps defragment free space. > * Run a full scrub on all filesystems weekly. This catches silent > corruption of the data, and will fix it if possible. > * Run a full defrag on all filesystems monthly. This should be run before > the balance (reasons are complicated and require more explanation than you > probably care for). I would run this at least weekly though on HDD's, as > they tend to be more negatively impacted by fragmentation. I wonder if one should always run a full balance instead of a full scrub, since balance should also read (and thus theoretically verify) the meta-/data (does it though? I would expect it to check the checksums, but who knows...? maybe it's "optimized" to skip that step?) and also perform the "consolidation" of the chunk level. I wish there was some more "integrated" solution for this: a balance-like operation which consolidates the chunks and also de-fragments the file extents at the same time while passively uncovering (and fixing, if necessary and possible) any checksum mismatches / data errors, so that balance and defrag can't work against each other and the overall work is minimized (compared to several full runs or many different commands). ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Unocorrectable errors with RAID1 2017-01-17 4:50 ` Janos Toth F. @ 2017-01-17 12:25 ` Austin S. Hemmelgarn 0 siblings, 0 replies; 20+ messages in thread From: Austin S. Hemmelgarn @ 2017-01-17 12:25 UTC (permalink / raw) To: Janos Toth F., Btrfs BTRFS On 2017-01-16 23:50, Janos Toth F. wrote: >> BTRFS uses a 2 level allocation system. At the higher level, you have >> chunks. These are just big blocks of space on the disk that get used for >> only one type of lower level allocation (Data, Metadata, or System). Data >> chunks are normally 1GB, Metadata 256MB, and System depends on the size of >> the FS when it was created. Within these chunks, BTRFS then allocates >> individual blocks just like any other filesystem. > > This always seems to confuse me when I try to get an abstract idea > about de-/fragmentation of Btrfs. > Can meta-/data be fragmented on both levels? And if so, can defrag > and/or balance "cure" both levels of fragmentation (if any)? > But how? May be several defrag and balance runs, repeated until > returns diminish (or at least you consider them meaningless and/or > unnecessary)? Defrag operates only at the block level. It won't allocate chunks unless it has to, and it won't remove chunks unless they become empty from it moving things around (although that's not likely to happen most of the time). Balance functionally operates at both levels, but it doesn't really do any defragmentation. Balance _may_ merge extents sometimes, but I'm not sure of this. It will compact allocations and therefore functionally defragment free space within chunks (though not necessarily at the chunk-level itself). Defrag run with the same options _should_ have no net effect after the first run, the two exceptions being if the filesystem is close to full or if the data set is being modified live while the defrag is happening. Balance run with the same options will eventually hit a point where it doesn't do anything (or only touches one chunk of each type but doesn't actually give any benefit). If you're just using the usage filters or doing a full balance, this point is the second run. If you're using other filters, it's functionally not possible to determine when that point will be without low-level knowledge of the chunk layout. For an idle filesystem, if you run defrag then a full balance, that will get you a near optimal layout. Running them in the reverse order will get you a different layout that may be less optimal than running defrag first because defrag may move data in such a way that new chunks get allocated. Repeated runs of defrag and balance will in more than 95% of cases provide no extra benefit. > > >> What balancing does is send everything back through the allocator, which in >> turn back-fills chunks that are only partially full, and removes ones that >> are now empty. > > Does't this have a potential chance of introducing (additional) > extent-level fragmentation? In theory, yes. IIRC, extents can't cross a chunk boundary. Beyond that packing constraint, balance shouldn't fragment things further. > >> FWIW, while there isn't a daemon yet that does this, it's a perfect thing >> for a cronjob. The general maintenance regimen that I use for most of my >> filesystems is: >> * Run 'btrfs balance start -dusage=20 -musage=20' daily. This will complete >> really fast on most filesystems, and keeps the slack-space relatively >> under-control (and has the nice bonus that it helps defragment free space. >> * Run a full scrub on all filesystems weekly. 
This catches silent >> corruption of the data, and will fix it if possible. >> * Run a full defrag on all filesystems monthly. This should be run before >> the balance (reasons are complicated and require more explanation than you >> probably care for). I would run this at least weekly though on HDD's, as >> they tend to be more negatively impacted by fragmentation. > > I wonder if one should always run a full balance instead of a full > scrub, since balance should also read (and thus theoretically verify) > the meta-/data (does it though? I would expect it to check the > chekcsums, but who knows...? may be it's "optimized" to skip that > step?) and also perform the "consolidation" of the chunk level. Scrub uses fewer resources than balance. Balance has to read _and_ re-write all data in the FS regardless of the state of the data. Scrub only needs to read the data if it's good, and if it's bad it only (for raid1) has to re-write the replica that's bad, not both of them. In fact, the only practical reason to run balance on a regular basis at all is to compact allocations and defragment free space. This is why I only have it balance chunks that are less than 1/5 full. > > I wish there was some more "integrated" solution for this: a > balance-like operation which consolidates the chunks and also > de-fragments the file extents at the same time while passively > uncovers (and fixes if necessary and possible) any checksum mismatches > / data errors, so that balance and defrag can't work against > each-other and the overall work is minimized (compared to several full > runs or many different commands). More than 90% of the time, the performance difference between the absolute optimal layout and the one generated by just running defrag then balancing is so small that it's insignificant. The closer to the optimal layout you get, the lower the returns for optimizing further (and this applies to any filesystem in fact). In essence, it's a bit like the traveling salesman problem, any arbitrary solution probably isn't optimal, but it's generally close enough to not matter. As far as scrub fitting into all of this, I'd personally rather have a daemon that slowly (less than 1% bandwidth usage) scrubs the FS over time in the background and logs and fixes errors it encounters (similar to how filesystem scrubbing works in many clustered filesystems) instead of always having to manually invoke it and jump through hoops to keep the bandwidth usage reasonable. ^ permalink raw reply [flat|nested] 20+ messages in thread
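Until such a background-scrubbing daemon exists, a crude approximation is to start the scrub with idle I/O priority; how well this actually throttles it depends on the kernel's I/O scheduler, so treat the following as a sketch:

  # ioprio class 3 = idle: the scrub only gets I/O time nothing else wants
  btrfs scrub start -c 3 /
  # check progress later
  btrfs scrub status /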
* Re: Unocorrectable errors with RAID1 2017-01-16 16:29 ` Austin S. Hemmelgarn 2017-01-17 4:50 ` Janos Toth F. @ 2017-01-17 9:18 ` Christoph Groth 2017-01-17 12:32 ` Austin S. Hemmelgarn 1 sibling, 1 reply; 20+ messages in thread From: Christoph Groth @ 2017-01-17 9:18 UTC (permalink / raw) To: Austin S. Hemmelgarn; +Cc: linux-btrfs [-- Attachment #1: Type: text/plain, Size: 3217 bytes --] Austin S. Hemmelgarn wrote: > There's not really much in the way of great documentation that I > know of. I can however cover the basics here: > > (...) Thanks for this explanation. I'm sure it will be also useful to others. > If the chunk to be allocated was a data chunk, you get -ENOSPC > (usually, sometimes you might get other odd results) in the > userspace application that triggered the allocation. It seems that the available space reported by the system df command corresponds roughly to the size of the block device minus all the "used" space as reported by "btrfs fi df". If I understand what you wrote correctly this means that when writing a huge file it may happen that the system df will report enough free space, but btrfs will raise ENOSPC. However, it should be possible to keep writing small files even at this point (assuming that there's enough space for the metadata). Or will btrfs split the huge file into small pieces to fit it into the fragmented free space in the chunks? Such a situation should be avoided of course. I'm asking out of curiosity. >>>> * So scrubbing is not enough to check the health of a btrfs >>>> file system? It’s also necessary to read all the files? >> >>> Scrubbing checks data integrity, but not the state of the >>> data. IOW, you're checking that the data and metadata match >>> with the checksums, but not necessarily that the filesystem >>> itself is valid. >> >> I see, but what should one then do to detect problems such as >> mine as soon as possible? Periodically calculate hashes for >> all files? I’ve never seen a recommendation to do that for >> btrfs. > Scrub will verify that the data is the same as when the kernel > calculated the block checksum. That's really the best that can > be done. In your case, it couldn't correct the errors because > both copies of the corrupted blocks were bad (this points at an > issue with either RAM or the storage controller BTW, not the > disks themselves). Had one of the copies been valid, it would > have intelligently detected which one was bad and fixed things. I think I understand the problem with the three corrupted blocks that I was able to fix by replacing the files. But there is also the strange "Stale file handle" error with some other files that was not found by scrubbing, and also does not seem to appear in the output of "btrfs dev stats", which is BTW [/dev/sda2].write_io_errs 0 [/dev/sda2].read_io_errs 0 [/dev/sda2].flush_io_errs 0 [/dev/sda2].corruption_errs 3 [/dev/sda2].generation_errs 0 [/dev/sdb2].write_io_errs 0 [/dev/sdb2].read_io_errs 0 [/dev/sdb2].flush_io_errs 0 [/dev/sdb2].corruption_errs 3 [/dev/sdb2].generation_errs 0 (The 2 times 3 corruption errors seem to be the uncorrectable errors that I could fix by replacing the files.) To get the "stale file handle" error I need to try to read the affected file. That's why I was wondering whether reading all the files periodically is indeed a useful maintenance procedure with btrfs. "btrfs check" does find the problem, but it can be only run on an unmounted file system. 
[-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 832 bytes --] ^ permalink raw reply [flat|nested] 20+ messages in thread
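For completeness, "reading all the files periodically" as wondered above can be approximated with standard tools; the log path is an example, -xdev keeps it to one mounted filesystem, and on a large filesystem this is an expensive operation:

  # stream every regular file to /dev/null, keeping read errors for inspection
  find / -xdev -type f -print0 | xargs -0 cat > /dev/null 2> /var/log/full-read-errors.log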
* Re: Unocorrectable errors with RAID1 2017-01-17 9:18 ` Christoph Groth @ 2017-01-17 12:32 ` Austin S. Hemmelgarn 0 siblings, 0 replies; 20+ messages in thread From: Austin S. Hemmelgarn @ 2017-01-17 12:32 UTC (permalink / raw) To: Christoph Groth; +Cc: linux-btrfs On 2017-01-17 04:18, Christoph Groth wrote: > Austin S. Hemmelgarn wrote: > >> There's not really much in the way of great documentation that I know >> of. I can however cover the basics here: >> >> (...) > > Thanks for this explanation. I'm sure it will be also useful to others. Glad I could help. > >> If the chunk to be allocated was a data chunk, you get -ENOSPC >> (usually, sometimes you might get other odd results) in the userspace >> application that triggered the allocation. > > It seems that the available space reported by the system df command > corresponds roughly to the size of the block device minus all the "used" > space as reported by "btrfs fi df". That's correct. > > If I understand what you wrote correctly this means that when writing a > huge file it may happen that the system df will report enough free > space, but btrfs will raise ENOSPC. However, it should be possible to > keep writing small files even at this point (assuming that there's > enough space for the metadata). Or will btrfs split the huge file into > small pieces to fit it into the fragmented free space in the chunks? OK, so the first bit to understanding this is that an extent in a file can't be larger than a chunk. This means that if you have space for 3 1GB data chunks located in 3 different places on the storage device, you can still write a 3GB file to the filesystem, it will just end up with 3 1GB extents. The issues with ENOSPC come in when almost all of your space is allocated to chunks and one type gets full. In such a situation, if you have metadata space, you can keep writing to the FS, but big writes may fail, and you'll eventually end up in a situation where you need to delete things to free up space. > > Such a situation should be avoided of course. I'm asking out of curiosity. > >>>>> * So scrubbing is not enough to check the health of a btrfs file >>>>> system? It’s also necessary to read all the files? >>> >>>> Scrubbing checks data integrity, but not the state of the data. IOW, >>>> you're checking that the data and metadata match with the checksums, >>>> but not necessarily that the filesystem itself is valid. >>> >>> I see, but what should one then do to detect problems such as mine as >>> soon as possible? Periodically calculate hashes for all files? I’ve >>> never seen a recommendation to do that for btrfs. > >> Scrub will verify that the data is the same as when the kernel >> calculated the block checksum. That's really the best that can be >> done. In your case, it couldn't correct the errors because both copies >> of the corrupted blocks were bad (this points at an issue with either >> RAM or the storage controller BTW, not the disks themselves). Had one >> of the copies been valid, it would have intelligently detected which >> one was bad and fixed things. > > I think I understand the problem with the three corrupted blocks that I > was able to fix by replacing the files. 
> > But there is also the strange "Stale file handle" error with some other > files that was not found by scrubbing, and also does not seem to appear > in the output of "btrfs dev stats", which is BTW > > [/dev/sda2].write_io_errs 0 > [/dev/sda2].read_io_errs 0 > [/dev/sda2].flush_io_errs 0 > [/dev/sda2].corruption_errs 3 > [/dev/sda2].generation_errs 0 > [/dev/sdb2].write_io_errs 0 > [/dev/sdb2].read_io_errs 0 > [/dev/sdb2].flush_io_errs 0 > [/dev/sdb2].corruption_errs 3 > [/dev/sdb2].generation_errs 0 > > (The 2 times 3 corruption errors seem to be the uncorrectable errors > that I could fix by replacing the files.) Yep, those correspond directly to the uncorrectable errors you mentioned in your original post. > > To get the "stale file handle" error I need to try to read the affected > file. That's why I was wondering whether reading all the files > periodically is indeed a useful maintenance procedure with btrfs. In the cases I've seen, no it isn't all that useful. As far as the whole ESTALE thing, that's almost certainly a bug and you either shouldn't be getting an error there, or you shouldn't be getting that error code there. > > "btrfs check" does find the problem, but it can be only run on an > unmounted file system. ^ permalink raw reply [flat|nested] 20+ messages in thread
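The chunk-level picture described above, which plain df hides, can also be inspected directly; the exact field names vary a little between btrfs-progs versions, so this is only a sketch:

  # allocated vs. unallocated space, per profile and per device
  btrfs filesystem usage /
  # the coarser per-profile view already used earlier in the thread
  btrfs filesystem df /
  # once unallocated space reaches zero, new chunk allocations (and hence
  # large writes) can fail even though df still reports free space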
* Re: Unocorrectable errors with RAID1 2017-01-16 11:10 Unocorrectable errors with RAID1 Christoph Groth 2017-01-16 13:24 ` Austin S. Hemmelgarn @ 2017-01-16 22:45 ` Goldwyn Rodrigues 2017-01-17 8:44 ` Christoph Groth 1 sibling, 1 reply; 20+ messages in thread From: Goldwyn Rodrigues @ 2017-01-16 22:45 UTC (permalink / raw) To: Christoph Groth, linux-btrfs On 01/16/2017 05:10 AM, Christoph Groth wrote: > Hi, > > I’ve been using a btrfs RAID1 of two hard disks since early 2012 on my > home server. The machine has been working well overall, but recently > some problems with the file system surfaced. Since I do have backups, I > do not worry about the data, but I post here to better understand what > happened. Also I cannot exclude that my case is useful in some way to > btrfs development. > > First some information about the system: > > root@mim:~# uname -a > Linux mim 4.6.0-1-amd64 #1 SMP Debian 4.6.3-1 (2016-07-04) x86_64 GNU/Linux > root@mim:~# btrfs --version > btrfs-progs v4.7.3 > root@mim:~# btrfs fi show > Label: none uuid: 2da00153-f9ea-4d6c-a6cc-10c913d22686 > Total devices 2 FS bytes used 345.97GiB > devid 1 size 465.29GiB used 420.06GiB path /dev/sda2 > devid 2 size 465.29GiB used 420.04GiB path /dev/sdb2 > > root@mim:~# btrfs fi df / > Data, RAID1: total=417.00GiB, used=344.62GiB > Data, single: total=8.00MiB, used=0.00B > System, RAID1: total=40.00MiB, used=68.00KiB > System, single: total=4.00MiB, used=0.00B > Metadata, RAID1: total=3.00GiB, used=1.35GiB > Metadata, single: total=8.00MiB, used=0.00B > GlobalReserve, single: total=464.00MiB, used=0.00B > root@mim:~# dmesg | grep -i btrfs > [ 4.165859] Btrfs loaded > [ 4.481712] BTRFS: device fsid 2da00153-f9ea-4d6c-a6cc-10c913d22686 > devid 1 transid 2075354 /dev/sda2 > [ 4.482025] BTRFS: device fsid 2da00153-f9ea-4d6c-a6cc-10c913d22686 > devid 2 transid 2075354 /dev/sdb2 > [ 4.521090] BTRFS info (device sdb2): disk space caching is enabled > [ 4.628506] BTRFS info (device sdb2): bdev /dev/sdb2 errs: wr 0, rd > 0, flush 0, corrupt 3, gen 0 > [ 4.628521] BTRFS info (device sdb2): bdev /dev/sda2 errs: wr 0, rd > 0, flush 0, corrupt 3, gen 0 > [ 18.315694] BTRFS info (device sdb2): disk space caching is enabled > > The disks themselves have been turning for almost 5 years by now, but > their SMART health is still fully satisfactory. > > I noticed that something was wrong because printing stopped to work. So > I did a scrub that detected 0 "correctable errors" and 6 "uncorrectable" > errors. 
The relevant bits from kern.log are: > > Jan 11 11:05:56 mim kernel: [159873.938579] BTRFS warning (device sdb2): > checksum error at logical 180829634560 on dev /dev/sdb2, sector > 353143968, root 5, inode 10014144, offset 221184, length 4096, links 1 > (path: usr/lib/x86_64-linux-gnu/libcups.so.2) > Jan 11 11:05:57 mim kernel: [159874.857132] BTRFS warning (device sdb2): > checksum error at logical 180829634560 on dev /dev/sda2, sector > 353182880, root 5, inode 10014144, offset 221184, length 4096, links 1 > (path: usr/lib/x86_64-linux-gnu/libcups.so.2) > Jan 11 11:28:42 mim kernel: [161240.083721] BTRFS warning (device sdb2): > checksum error at logical 260254629888 on dev /dev/sda2, sector > 508309824, root 5, inode 9990924, offset 6676480, length 4096, links 1 > (path: > var/lib/apt/lists/ftp.fr.debian.org_debian_dists_unstable_main_binary-amd64_Packages) > > Jan 11 11:28:42 mim kernel: [161240.235837] BTRFS warning (device sdb2): > checksum error at logical 260254638080 on dev /dev/sda2, sector > 508309840, root 5, inode 9990924, offset 6684672, length 4096, links 1 > (path: > var/lib/apt/lists/ftp.fr.debian.org_debian_dists_unstable_main_binary-amd64_Packages) > > Jan 11 11:37:21 mim kernel: [161759.725120] BTRFS warning (device sdb2): > checksum error at logical 260254629888 on dev /dev/sdb2, sector > 508270912, root 5, inode 9990924, offset 6676480, length 4096, links 1 > (path: > var/lib/apt/lists/ftp.fr.debian.org_debian_dists_unstable_main_binary-amd64_Packages) > > Jan 11 11:37:21 mim kernel: [161759.750251] BTRFS warning (device sdb2): > checksum error at logical 260254638080 on dev /dev/sdb2, sector > 508270928, root 5, inode 9990924, offset 6684672, length 4096, links 1 > (path: > var/lib/apt/lists/ftp.fr.debian.org_debian_dists_unstable_main_binary-amd64_Packages) > > > As you can see each disk has the same three errors, and there are no > other errors. Random bad blocks cannot explain this situation. I asked > on #btrfs and someone suggested that these errors are likely due to RAM > problems. This may indeed be the case, since the machine has no ECC. I > managed to fix these errors by replacing the broken files with good > copies. Scrubbing shows no errors now: > > root@mim:~# btrfs scrub status / > scrub status for 2da00153-f9ea-4d6c-a6cc-10c913d22686 > scrub started at Sat Jan 14 12:52:03 2017 and finished after > 01:49:10 > total bytes scrubbed: 699.17GiB with 0 errors > > However, there are further problems. When trying to archive the full > filesystem I noticed that some files/directories cannot be read. (The > problem is localized to some ".git" directory that I don’t need.) 
Any > attempt to read the broken files (or to delete them) does not work: > > $ du -sh .git > du: cannot access > '.git/objects/28/ea2aae3fe57ab4328adaa8b79f3c1cf005dd8d': No such file > or directory > du: cannot access > '.git/objects/28/fd95a5e9d08b6684819ce6e3d39d99e2ecccd5': Stale file handle > du: cannot access > '.git/objects/28/52e887ed436ed2c549b20d4f389589b7b58e09': Stale file handle > du: cannot access '.git/objects/info': Stale file handle > du: cannot access '.git/objects/pack': Stale file handle > > During the above command the following lines were added to kern.log: > > Jan 16 09:41:34 mim kernel: [132206.957566] BTRFS critical (device > sda2): corrupt leaf, slot offset bad: block=192561152,root=1, slot=15 > Jan 16 09:41:34 mim kernel: [132206.957924] BTRFS critical (device > sda2): corrupt leaf, slot offset bad: block=192561152,root=1, slot=15 > Jan 16 09:41:34 mim kernel: [132206.958505] BTRFS critical (device > sda2): corrupt leaf, slot offset bad: block=192561152,root=1, slot=15 > Jan 16 09:41:34 mim kernel: [132206.958971] BTRFS critical (device > sda2): corrupt leaf, slot offset bad: block=192561152,root=1, slot=15 > Jan 16 09:41:34 mim kernel: [132206.959534] BTRFS critical (device > sda2): corrupt leaf, slot offset bad: block=192561152,root=1, slot=15 > Jan 16 09:41:34 mim kernel: [132206.959874] BTRFS critical (device > sda2): corrupt leaf, slot offset bad: block=192561152,root=1, slot=15 > Jan 16 09:41:34 mim kernel: [132206.960523] BTRFS critical (device > sda2): corrupt leaf, slot offset bad: block=192561152,root=1, slot=15 > Jan 16 09:41:34 mim kernel: [132206.960943] BTRFS critical (device > sda2): corrupt leaf, slot offset bad: block=192561152,root=1, slot=15 > > So I tried to repair the file system by running "btrfs check --repair", > but this doesn’t work: > > (initramfs) btrfs --version > btrfs-progs v4.7.3 > (initramfs) btrfs check --repair /dev/sda2 > UUID: ... > checking extents > incorrect offsets 2527 2543 > items overlap, can't fix > cmds-check.c:4297: fix_item_offset: Assertion `ret` failed. > btrfs[0x41a8b4] > btrfs[0x41a8db] > btrfs[0x42428b] > btrfs[0x424f83] > btrfs[0x4259cd] > btrfs(cmd_check+0x1111)[0x427d6d] > btrfs(main+0x12f)[0x40a341] > /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7fd98859d2b1] > btrfs(_start+0x2a)[0x40a37a] > Would you be able to upload a btrfs-image for me to examine. This is a core ctree error where most probably item size is incorrectly registered. Thanks, -- -- Goldwyn ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Unocorrectable errors with RAID1 2017-01-16 22:45 ` Goldwyn Rodrigues @ 2017-01-17 8:44 ` Christoph Groth 2017-01-17 11:32 ` Goldwyn Rodrigues 0 siblings, 1 reply; 20+ messages in thread From: Christoph Groth @ 2017-01-17 8:44 UTC (permalink / raw) To: Goldwyn Rodrigues; +Cc: linux-btrfs [-- Attachment #1: Type: text/plain, Size: 339 bytes --] Goldwyn Rodrigues wrote: > Would you be able to upload a btrfs-image for me to > examine. This is a core ctree error where most probably item > size is incorrectly registered. Sure, I can do that. I'd like to use the -s option, will this be fine? Is there some preferred place for the upload? If not, I can use personal webspace. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 832 bytes --] ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Unocorrectable errors with RAID1 2017-01-17 8:44 ` Christoph Groth @ 2017-01-17 11:32 ` Goldwyn Rodrigues 2017-01-17 20:25 ` Christoph Groth 0 siblings, 1 reply; 20+ messages in thread From: Goldwyn Rodrigues @ 2017-01-17 11:32 UTC (permalink / raw) To: Christoph Groth; +Cc: linux-btrfs [-- Attachment #1.1: Type: text/plain, Size: 541 bytes --] On 01/17/2017 02:44 AM, Christoph Groth wrote: > Goldwyn Rodrigues wrote: > >> Would you be able to upload a btrfs-image for me to examine. This is a >> core ctree error where most probably item size is incorrectly registered. > > Sure, I can do that. I'd like to use the -s option, will this be fine? Yes, I think that should be fine. > Is there some preferred place for the upload? If not, I can use > personal webspace. No, there is no preferred place. As far as I can download it, it is fine. -- Goldwyn [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 473 bytes --] ^ permalink raw reply [flat|nested] 20+ messages in thread
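For context, the -s option asks btrfs-image to sanitize (garble) file names while dumping metadata, so the resulting image contains tree structure, keys and checksums but no readable names and no file contents. A typical round trip looks like the sketch below; the device, the paths and the -c/-t values are placeholders, with the flags as documented in btrfs-image(8) of that era.

# Create a compressed, metadata-only image with sanitized file names.
# -c3 selects the compression level, -s sanitizes names, -t4 uses 4 threads.
btrfs-image -c3 -s -t4 /dev/sda2 /mnt/usb/mim.img

# On the receiving side the metadata can be materialized onto a scratch
# device (for example a loop device backed by a sparse file) for inspection:
btrfs-image -r /mnt/usb/mim.img /dev/loop0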
* Re: Unocorrectable errors with RAID1 2017-01-17 11:32 ` Goldwyn Rodrigues @ 2017-01-17 20:25 ` Christoph Groth 2017-01-17 21:52 ` Chris Murphy 2017-01-17 22:57 ` Unocorrectable errors with RAID1 Goldwyn Rodrigues 0 siblings, 2 replies; 20+ messages in thread From: Christoph Groth @ 2017-01-17 20:25 UTC (permalink / raw) To: Goldwyn Rodrigues; +Cc: linux-btrfs [-- Attachment #1: Type: text/plain, Size: 902 bytes --] Goldwyn Rodrigues wrote: > On 01/17/2017 02:44 AM, Christoph Groth wrote: >> Goldwyn Rodrigues wrote: >> >>> Would you be able to upload a btrfs-image for me to >>> examine. This is a >>> core ctree error where most probably item size is incorrectly >>> registered. >> >> Sure, I can do that. I'd like to use the -s option, will this >> be fine? > > Yes, I think that should be fine. Unfortunately, giving -s causes btrfs-image to segfault. I tried both btrfs-progs 4.7.3 and 4.4. I also tried different compression levels. Without -s it works, but since this file system contains the complete digital life of our family, I would rather not share even the file names. Any ideas on what could be done? If you need help to debug the problem with btrfs-image, please tell me what I should do. I can keep the broken file system around until an image can be created at some later time. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 832 bytes --] ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Unocorrectable errors with RAID1 2017-01-17 20:25 ` Christoph Groth @ 2017-01-17 21:52 ` Chris Murphy 2017-01-17 23:10 ` Christoph Groth 2017-01-17 22:57 ` Unocorrectable errors with RAID1 Goldwyn Rodrigues 1 sibling, 1 reply; 20+ messages in thread From: Chris Murphy @ 2017-01-17 21:52 UTC (permalink / raw) To: Christoph Groth; +Cc: Goldwyn Rodrigues, Btrfs BTRFS On Tue, Jan 17, 2017 at 1:25 PM, Christoph Groth <christoph@grothesque.org> wrote: > Goldwyn Rodrigues wrote: >> >> On 01/17/2017 02:44 AM, Christoph Groth wrote: >>> >>> Goldwyn Rodrigues wrote: >>> >>>> Would you be able to upload a btrfs-image for me to examine. This is a >>>> core ctree error where most probably item size is incorrectly >>>> registered. >>> >>> >>> Sure, I can do that. I'd like to use the -s option, will this be fine? >> >> >> Yes, I think that should be fine. > > > Unfortunately, giving -s causes btrfs-image to segfault. I tried both > btrfs-progs 4.7.3 and 4.4. I also tried different compression levels. > > Without -s it works, but since this file system contains the complete > digital life of our family, I would rather not share even the file names. > > Any ideas on what could be done? If you need help to debug the problem with > btrfs-image, please tell me what I should do. I can keep the broken file > system around until an image can be created at some later time. Try 4.9, or even 4.8.5, tons of bugs have been fixed since 4.7.3 although I don't know off hand if this particular bug is fixed. I did recently do a btrfs-image with btrfs-progs v4.9 with -s and did not get a segfault. -- Chris Murphy ^ permalink raw reply [flat|nested] 20+ messages in thread
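A minimal sketch of building a newer btrfs-image from the btrfs-progs git tree as a static binary that can be copied into an initramfs or a live system, which is what happens in the next message. The Debian package list is an assumption, and the per-tool *.static make target is inferred from the binary name used later in the thread, so treat the exact target name as an assumption as well.

# Build dependencies (Debian-flavoured guess; adjust for your distribution).
apt-get install build-essential autoconf automake pkg-config \
    libblkid-dev uuid-dev zlib1g-dev liblzo2-dev e2fslibs-dev

git clone git://git.kernel.org/pub/scm/linux/kernel/git/kdave/btrfs-progs.git
cd btrfs-progs
git checkout v4.9
./autogen.sh && ./configure --disable-documentation
make btrfs-image.static        # static build of just this one tool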
* Re: Unocorrectable errors with RAID1 2017-01-17 21:52 ` Chris Murphy @ 2017-01-17 23:10 ` Christoph Groth 2017-01-18 7:13 ` gdb log of crashed "btrfs-image -s" Christoph Groth 0 siblings, 1 reply; 20+ messages in thread From: Christoph Groth @ 2017-01-17 23:10 UTC (permalink / raw) To: Chris Murphy; +Cc: Goldwyn Rodrigues, Btrfs BTRFS [-- Attachment #1: Type: text/plain, Size: 1416 bytes --] Chris Murphy wrote: > On Tue, Jan 17, 2017 at 1:25 PM, Christoph Groth > <christoph@grothesque.org> wrote: >> Any ideas on what could be done? If you need help to debug the >> problem with >> btrfs-image, please tell me what I should do. I can keep the >> broken file >> system around until an image can be created at some later time. > > Try 4.9, or even 4.8.5, tons of bugs have been fixed since 4.7.3 > although I don't know off hand if this particular bug is > fixed. I did > recently do a btrfs-image with btrfs-progs v4.9 with -s and did > not > get a segfault. I compiled btrfs-image.static from btrfs-tools 4.9 (from git) and started it from Debian testing's initramfs. The exact command that I use is: /mnt/btrfs-image.static -c3 -s /dev/sda2 /mnt/mim-s.bim It runs for a couple of seconds (enough to write 20263936 bytes of output) and then quits with *** Error in `/mnt/btrfs-image.static`: double free or corruption (!prev): 0x00000000009f0940 *** ====== Backtrace: ====== [0x45fb97] [0x465442] [0x465c1e] [0x402694] [0x402dcb] [0x4031fe] [0x4050ff] [0x405783] [0x44cb73] [0x44cdfe] [0x400b2a] (I had to type the above off the other screen, but I double checked that there are no errors.) The executable that I used can be downloaded from http://groth.fr/btrfs-image.static Its md5sum is 48abbc82ac6d3c0cb88cba1e5edb85fd. I hope that this can help someone to see what's going on. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 832 bytes --] ^ permalink raw reply [flat|nested] 20+ messages in thread
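As an aside, the bare hexadecimal frames that glibc prints above can be resolved against the very same static binary with addr2line from binutils, provided the binary was built with debug information and has not been stripped. The addresses below are simply two frames copied from the backtrace, used as examples.

addr2line -f -C -e btrfs-image.static 0x402694 0x402dcb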
* gdb log of crashed "btrfs-image -s" 2017-01-17 23:10 ` Christoph Groth @ 2017-01-18 7:13 ` Christoph Groth 2017-01-18 11:49 ` Goldwyn Rodrigues 0 siblings, 1 reply; 20+ messages in thread From: Christoph Groth @ 2017-01-18 7:13 UTC (permalink / raw) To: Chris Murphy; +Cc: Goldwyn Rodrigues, Btrfs BTRFS [-- Attachment #1.1: Type: text/plain, Size: 1601 bytes --] Christoph Groth wrote: > Chris Murphy wrote: >> On Tue, Jan 17, 2017 at 1:25 PM, Christoph Groth >> <christoph@grothesque.org> wrote: >>> Any ideas on what could be done? If you need help to debug >>> the problem with >>> btrfs-image, please tell me what I should do. I can keep the >>> broken file >>> system around until an image can be created at some later >>> time. >> >> Try 4.9, or even 4.8.5, tons of bugs have been fixed since >> 4.7.3 >> although I don't know off hand if this particular bug is >> fixed. I did >> recently do a btrfs-image with btrfs-progs v4.9 with -s and did >> not >> get a segfault. > > I compiled btrfs-image.static from btrfs-tools 4.9 (from git) > and started it from Debian testing's initramfs. The exact > command that I use is: > > /mnt/btrfs-image.static -c3 -s /dev/sda2 /mnt/mim-s.bim > > It runs for a couple of seconds (enough to write 20263936 bytes > of output) and then quits with > > *** Error in `/mnt/btrfs-image.static`: double free or > corruption (!prev): 0x00000000009f0940 *** > ====== Backtrace: ====== > [0x45fb97] > [0x465442] > [0x465c1e] > [0x402694] > [0x402dcb] > [0x4031fe] > [0x4050ff] > [0x405783] > [0x44cb73] > [0x44cdfe] > [0x400b2a] > > (I had to type the above off the other screen, but I double > checked that there are no errors.) > > The executable that I used can be downloaded from > http://groth.fr/btrfs-image.static > Its md5sum is 48abbc82ac6d3c0cb88cba1e5edb85fd. > > I hope that this can help someone to see what's going on. I ran the same executable under gdb from a live system. The log is attached. [-- Attachment #1.2: btrfs-image.log --] [-- Type: application/octet-stream, Size: 4353 bytes --] root@xubuntu:/media/xubuntu/wd1t# gdb btrfs-image.static GNU gdb (Ubuntu 7.11-0ubuntu1) 7.11 Copyright (C) 2016 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-linux-gnu". Type "show configuration" for configuration details. For bug reporting instructions, please see: <http://www.gnu.org/software/gdb/bugs/>. Find the GDB manual and other documentation resources online at: <http://www.gnu.org/software/gdb/documentation/>. For help, type "help". Type "apropos word" to search for commands related to "word"... Reading symbols from btrfs-image.static...done. 
(gdb) run -c3 -s /dev/sda2 /media/xubuntu/wd1t/mim-s.bim Starting program: /media/xubuntu/wd1t/btrfs-image.static -c3 -s /dev/sda2 /media/xubuntu/wd1t/mim-s.bim [New LWP 2334] [New LWP 2335] [New LWP 2336] [New LWP 2337] *** Error in `/media/xubuntu/wd1t/btrfs-image.static': free(): invalid next size (normal): 0x0000000000762570 *** ======= Backtrace: ========= [0x45fb97] [0x465442] [0x465c1e] [0x402dea] [0x4031fe] [0x4050ff] [0x405783] [0x44cb73] [0x44cdfe] [0x400b2a] ======= Memory map: ======== 00400000-00521000 r-xp 00000000 08:31 689 /media/xubuntu/wd1t/btrfs-image.static 00721000-00728000 rw-p 00121000 08:31 689 /media/xubuntu/wd1t/btrfs-image.static 00728000-0085b000 rw-p 00000000 00:00 0 [heap] 7fffe0000000-7fffe01aa000 rw-p 00000000 00:00 0 7fffe01aa000-7fffe4000000 ---p 00000000 00:00 0 7fffe4000000-7fffe4025000 rw-p 00000000 00:00 0 7fffe4025000-7fffe8000000 ---p 00000000 00:00 0 7fffe8000000-7fffe8186000 rw-p 00000000 00:00 0 7fffe8186000-7fffec000000 ---p 00000000 00:00 0 7fffec000000-7fffec195000 rw-p 00000000 00:00 0 7fffec195000-7ffff0000000 ---p 00000000 00:00 0 7ffff0000000-7ffff01b0000 rw-p 00000000 00:00 0 7ffff01b0000-7ffff4000000 ---p 00000000 00:00 0 7ffff5ff6000-7ffff5ff7000 rw-p 00000000 00:00 0 7ffff5ff7000-7ffff5ff8000 ---p 00000000 00:00 0 7ffff5ff8000-7ffff67f8000 rw-p 00000000 00:00 0 7ffff67f8000-7ffff67f9000 ---p 00000000 00:00 0 7ffff67f9000-7ffff6ff9000 rw-p 00000000 00:00 0 7ffff6ff9000-7ffff6ffa000 ---p 00000000 00:00 0 7ffff6ffa000-7ffff77fa000 rw-p 00000000 00:00 0 7ffff77fa000-7ffff77fb000 ---p 00000000 00:00 0 7ffff77fb000-7ffff7ffb000 rw-p 00000000 00:00 0 7ffff7ffb000-7ffff7ffd000 r--p 00000000 00:00 0 [vvar] 7ffff7ffd000-7ffff7fff000 r-xp 00000000 00:00 0 [vdso] 7ffffffde000-7ffffffff000 rw-p 00000000 00:00 0 [stack] ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall] Thread 1 "btrfs-image.sta" received signal SIGABRT, Aborted. 0x00000000004521de in raise () (gdb) bt #0 0x00000000004521de in raise () #1 0x00000000004523aa in abort () #2 0x000000000045fb9c in __libc_message () #3 0x0000000000465442 in malloc_printerr () #4 0x0000000000465c1e in _int_free () #5 0x0000000000402dea in sanitize_name (slot=<optimized out>, key=<synthetic pointer>, src=<optimized out>, dst=0x76c690 "4\246", <incomplete sequence \367\261>, md=<optimized out>) at image/main.c:574 #6 zero_items (src=0x760450, dst=0x76c690 "4\246", <incomplete sequence \367\261>, md=<optimized out>) at image/main.c:602 #7 copy_buffer (src=0x760450, dst=0x76c690 "4\246", <incomplete sequence \367\261>, md=<optimized out>) at image/main.c:645 #8 flush_pending (md=md@entry=0x7fffffffddc0, done=done@entry=0) at image/main.c:983 #9 0x00000000004031fe in add_extent (start=start@entry=192593920, size=size@entry=4096, md=md@entry=0x7fffffffddc0, data=data@entry=0) at image/main.c:1025 #10 0x00000000004050ff in copy_from_extent_tree (path=0x7fffffffe390, metadump=0x7fffffffddc0) at image/main.c:1280 #11 create_metadump (input=input@entry=0x7fffffffe851 "/dev/sda2", out=out@entry=0x731be0, num_threads=num_threads@entry=4, compress_level=compress_level@entry=3, sanitize=sanitize@entry=1, walk_trees=walk_trees@entry=0) at image/main.c:1370 #12 0x0000000000405783 in main (argc=<optimized out>, argv=0x7fffffffe5d8) at image/main.c:2855 [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 832 bytes --] ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: gdb log of crashed "btrfs-image -s" 2017-01-18 7:13 ` gdb log of crashed "btrfs-image -s" Christoph Groth @ 2017-01-18 11:49 ` Goldwyn Rodrigues 2017-01-18 20:11 ` Christoph Groth 0 siblings, 1 reply; 20+ messages in thread From: Goldwyn Rodrigues @ 2017-01-18 11:49 UTC (permalink / raw) To: Christoph Groth, Chris Murphy; +Cc: Btrfs BTRFS [-- Attachment #1.1: Type: text/plain, Size: 2335 bytes --] On 01/18/2017 01:13 AM, Christoph Groth wrote: > Christoph Groth wrote: >> Chris Murphy wrote: >>> On Tue, Jan 17, 2017 at 1:25 PM, Christoph Groth >>> <christoph@grothesque.org> wrote: >>>> Any ideas on what could be done? If you need help to debug the >>>> problem with >>>> btrfs-image, please tell me what I should do. I can keep the broken >>>> file >>>> system around until an image can be created at some later time. >>> >>> Try 4.9, or even 4.8.5, tons of bugs have been fixed since 4.7.3 >>> although I don't know off hand if this particular bug is fixed. I did >>> recently do a btrfs-image with btrfs-progs v4.9 with -s and did not >>> get a segfault. >> >> I compiled btrfs-image.static from btrfs-tools 4.9 (from git) and >> started it from Debian testing's initramfs. The exact command that I >> use is: >> >> /mnt/btrfs-image.static -c3 -s /dev/sda2 /mnt/mim-s.bim >> >> It runs for a couple of seconds (enough to write 20263936 bytes of >> output) and then quits with >> >> *** Error in `/mnt/btrfs-image.static`: double free or corruption >> (!prev): 0x00000000009f0940 *** >> ====== Backtrace: ====== >> [0x45fb97] >> [0x465442] >> [0x465c1e] >> [0x402694] >> [0x402dcb] >> [0x4031fe] >> [0x4050ff] >> [0x405783] >> [0x44cb73] >> [0x44cdfe] >> [0x400b2a] >> >> (I had to type the above off the other screen, but I double checked >> that there are no errors.) >> >> The executable that I used can be downloaded from >> http://groth.fr/btrfs-image.static >> Its md5sum is 48abbc82ac6d3c0cb88cba1e5edb85fd. >> >> I hope that this can help someone to see what's going on. > > I ran the same executable under gdb from a live system. The log is > attached. > Thanks Christoph for the backtrace. I am unable to reproduce it, but looking at your backtrace, I found a bug. Would you be able to give it a try and check if it fixes the problem? diff --git a/image/main.c b/image/main.c index 58dcecb..0158844 100644 --- a/image/main.c +++ b/image/main.c @@ -550,7 +550,7 @@ static void sanitize_name(struct metadump_struct *md, u8 *dst, return; } - memcpy(eb->data, dst, eb->len); + memcpy(eb->data, src->data, src->len); switch (key->type) { case BTRFS_DIR_ITEM_KEY: -- Goldwyn [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 473 bytes --] ^ permalink raw reply related [flat|nested] 20+ messages in thread
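The one-line patch replaces a copy that used the destination buffer and the extent buffer's own length with one that copies the source's data using the source's length. As a toy illustration of why this class of mistake surfaces as an abort in free() rather than an immediate crash, the standalone program below (not btrfs-progs code, just the same pattern reduced to a few lines) writes past an allocation with a length taken from the wrong object; glibc only notices when it later walks the damaged chunk headers and aborts with messages such as "double free or corruption" or "free(): invalid next size", which is exactly the family of errors seen in the logs above.

/* Toy demonstration of a wrong-length memcpy corrupting the heap.
 * Standalone example, not code from btrfs-progs. */
#include <stdlib.h>
#include <string.h>

struct buf {
        size_t len;
        char *data;
};

int main(void)
{
        struct buf small = { 64, malloc(64) };    /* destination */
        struct buf big = { 256, malloc(256) };    /* source */
        memset(big.data, 'x', big.len);

        /* Bug: the copy length comes from the source object although the
         * destination only holds 64 bytes, so the excess 192 bytes run past
         * the allocation and overwrite the neighbouring chunk's bookkeeping.
         * Nothing fails at this point. */
        memcpy(small.data, big.data, big.len);

        /* The corruption is detected only when the allocator next inspects
         * the damaged chunk headers, typically aborting inside free(). */
        free(big.data);
        free(small.data);
        return 0;
}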
* Re: gdb log of crashed "btrfs-image -s" 2017-01-18 11:49 ` Goldwyn Rodrigues @ 2017-01-18 20:11 ` Christoph Groth 2017-01-23 12:09 ` Goldwyn Rodrigues 0 siblings, 1 reply; 20+ messages in thread From: Christoph Groth @ 2017-01-18 20:11 UTC (permalink / raw) To: Goldwyn Rodrigues; +Cc: Chris Murphy, Btrfs BTRFS [-- Attachment #1.1: Type: text/plain, Size: 452 bytes --] Goldwyn Rodrigues wrote: > Thanks Christoph for the backtrace. I am unable to reproduce it, > but looking at your backtrace, I found a bug. Would you be able > to give it a try and check if it fixes the problem? I applied your patch to v4.9, and compiled the static binaries. Unfortunately, it still segfaults. (Perhaps your fix is correct, and there's a second problem?) I attach a new backtrace. Do let me know if I can help in another way. [-- Attachment #1.2: btrfs-image2.log --] [-- Type: application/octet-stream, Size: 4392 bytes --] root@xubuntu:~# gdb /mnt/btrfs-image.static GNU gdb (Ubuntu 7.11-0ubuntu1) 7.11 Copyright (C) 2016 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-linux-gnu". Type "show configuration" for configuration details. For bug reporting instructions, please see: <http://www.gnu.org/software/gdb/bugs/>. Find the GDB manual and other documentation resources online at: <http://www.gnu.org/software/gdb/documentation/>. For help, type "help". Type "apropos word" to search for commands related to "word"... Reading symbols from /mnt/btrfs-image.static...done. (gdb) run -s -c3 /dev/sda2 /mnt/mim.bim Starting program: /mnt/btrfs-image.static -s -c3 /dev/sda2 /mnt/mim.bim [New LWP 2334] [New LWP 2335] [New LWP 2336] [New LWP 2337] *** Error in `/mnt/btrfs-image.static': double free or corruption (out): 0x0000000000772f70 *** ======= Backtrace: ========= [0x45fba7] [0x465452] [0x465c2e] [0x402694] [0x402dce] [0x403201] [0x405102] [0x405786] [0x44cb83] [0x44ce0e] [0x400b2a] ======= Memory map: ======== 00400000-00521000 r-xp 00000000 08:21 689 /mnt/btrfs-image.static 00721000-00728000 rw-p 00121000 08:21 689 /mnt/btrfs-image.static 00728000-007e4000 rw-p 00000000 00:00 0 [heap] 7fffe0000000-7fffe017e000 rw-p 00000000 00:00 0 7fffe017e000-7fffe4000000 ---p 00000000 00:00 0 7fffe4000000-7fffe4025000 rw-p 00000000 00:00 0 7fffe4025000-7fffe8000000 ---p 00000000 00:00 0 7fffe8000000-7fffe81a6000 rw-p 00000000 00:00 0 7fffe81a6000-7fffec000000 ---p 00000000 00:00 0 7fffec000000-7fffec17c000 rw-p 00000000 00:00 0 7fffec17c000-7ffff0000000 ---p 00000000 00:00 0 7ffff0000000-7ffff019a000 rw-p 00000000 00:00 0 7ffff019a000-7ffff4000000 ---p 00000000 00:00 0 7ffff5ff6000-7ffff5ff7000 rw-p 00000000 00:00 0 7ffff5ff7000-7ffff5ff8000 ---p 00000000 00:00 0 7ffff5ff8000-7ffff67f8000 rw-p 00000000 00:00 0 7ffff67f8000-7ffff67f9000 ---p 00000000 00:00 0 7ffff67f9000-7ffff6ff9000 rw-p 00000000 00:00 0 7ffff6ff9000-7ffff6ffa000 ---p 00000000 00:00 0 7ffff6ffa000-7ffff77fa000 rw-p 00000000 00:00 0 7ffff77fa000-7ffff77fb000 ---p 00000000 00:00 0 7ffff77fb000-7ffff7ffb000 rw-p 00000000 00:00 0 7ffff7ffb000-7ffff7ffd000 r--p 00000000 00:00 0 [vvar] 7ffff7ffd000-7ffff7fff000 r-xp 00000000 00:00 0 [vdso] 7ffffffde000-7ffffffff000 rw-p 00000000 00:00 0 [stack] ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall] Thread 1 "btrfs-image.sta" received signal 
SIGABRT, Aborted. 0x00000000004521ee in raise () (gdb) bt #0 0x00000000004521ee in raise () #1 0x00000000004523ba in abort () #2 0x000000000045fbac in __libc_message () #3 0x0000000000465452 in malloc_printerr () #4 0x0000000000465c2e in _int_free () #5 0x0000000000402694 in sanitize_inode_ref (md=md@entry=0x7fffffffde00, eb=eb@entry=0x771ee0, slot=slot@entry=16, ext=ext@entry=0) at image/main.c:522 #6 0x0000000000402dce in sanitize_name (slot=16, key=<synthetic pointer>, src=0x764cf0, dst=0x76bed0 "4\246", <incomplete sequence \367\261>, md=0x7fffffffde00) at image/main.c:561 #7 zero_items (src=0x764cf0, dst=0x76bed0 "4\246", <incomplete sequence \367\261>, md=<optimized out>) at image/main.c:602 #8 copy_buffer (src=0x764cf0, dst=0x76bed0 "4\246", <incomplete sequence \367\261>, md=<optimized out>) at image/main.c:645 #9 flush_pending (md=md@entry=0x7fffffffde00, done=done@entry=0) at image/main.c:983 #10 0x0000000000403201 in add_extent (start=start@entry=192589824, size=size@entry=4096, md=md@entry=0x7fffffffde00, data=data@entry=0) at image/main.c:1025 #11 0x0000000000405102 in copy_from_extent_tree (path=0x7fffffffe3d0, metadump=0x7fffffffde00) at image/main.c:1280 #12 create_metadump (input=input@entry=0x7fffffffe87f "/dev/sda2", out=out@entry=0x731be0, num_threads=num_threads@entry=4, compress_level=compress_level@entry=3, sanitize=sanitize@entry=1, walk_trees=walk_trees@entry=0) at image/main.c:1370 #13 0x0000000000405786 in main (argc=<optimized out>, argv=0x7fffffffe618) at image/main.c:2855 [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 832 bytes --] ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: gdb log of crashed "btrfs-image -s" 2017-01-18 20:11 ` Christoph Groth @ 2017-01-23 12:09 ` Goldwyn Rodrigues 0 siblings, 0 replies; 20+ messages in thread From: Goldwyn Rodrigues @ 2017-01-23 12:09 UTC (permalink / raw) To: Christoph Groth; +Cc: Chris Murphy, Btrfs BTRFS [-- Attachment #1.1: Type: text/plain, Size: 881 bytes --] On 01/18/2017 02:11 PM, Christoph Groth wrote: > Goldwyn Rodrigues wrote: >> Thanks Christoph for the backtrace. I am unable to reproduce it, but >> looking at your backtrace, I found a bug. Would you be able to give it >> a try and check if it fixes the problem? > > I applied your patch to v4.9, and compiled the static binaries. > Unfortunately, it still segfaults. (Perhaps your fix is correct, and > there's a second problem?) I attach a new backtrace. Do let me know if > I can help in another way. I looked hard but could not find the reason for the failure here. The backtrace of the new one is a little different from the previous one, but I am not sure why it crashes. Until I have a reproduction scenario, I may not be able to fix this. How about a core? However, a core will contain the values which you are trying to mask with sanitize. -- Goldwyn [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 473 bytes --] ^ permalink raw reply [flat|nested] 20+ messages in thread
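If a core were to be provided despite the privacy caveat above, one low-effort way to capture it is to reuse the existing gdb session and dump the process image right at the abort. The commands below are standard shell and gdb usage; the output path is a placeholder.

# Only needed when running outside gdb, to let the kernel write core files:
ulimit -c unlimited

# Inside the gdb session, once the SIGABRT has been reported:
(gdb) generate-core-file /mnt/btrfs-image.core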
* Re: Unocorrectable errors with RAID1 2017-01-17 20:25 ` Christoph Groth 2017-01-17 21:52 ` Chris Murphy @ 2017-01-17 22:57 ` Goldwyn Rodrigues 2017-01-17 23:22 ` Christoph Groth 1 sibling, 1 reply; 20+ messages in thread From: Goldwyn Rodrigues @ 2017-01-17 22:57 UTC (permalink / raw) To: Christoph Groth; +Cc: linux-btrfs [-- Attachment #1.1: Type: text/plain, Size: 1119 bytes --] On 01/17/2017 02:25 PM, Christoph Groth wrote: > Goldwyn Rodrigues wrote: >> On 01/17/2017 02:44 AM, Christoph Groth wrote: >>> Goldwyn Rodrigues wrote: >>> >>>> Would you be able to upload a btrfs-image for me to examine. This is a >>>> core ctree error where most probably item size is incorrectly >>>> registered. >>> >>> Sure, I can do that. I'd like to use the -s option, will this be fine? >> >> Yes, I think that should be fine. > > Unfortunately, giving -s causes btrfs-image to segfault. I tried both > btrfs-progs 4.7.3 and 4.4. I also tried different compression levels. > > Without -s it works, but since this file system contains the complete > digital life of our family, I would rather not share even the file names. > > Any ideas on what could be done? If you need help to debug the problem > with btrfs-image, please tell me what I should do. I can keep the > broken file system around until an image can be created at some later time. As Chris mentioned, try a later version. If you are familiar with git, you could even try the devel version. -- Goldwyn [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 473 bytes --] ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Unocorrectable errors with RAID1 2017-01-17 22:57 ` Unocorrectable errors with RAID1 Goldwyn Rodrigues @ 2017-01-17 23:22 ` Christoph Groth 0 siblings, 0 replies; 20+ messages in thread From: Christoph Groth @ 2017-01-17 23:22 UTC (permalink / raw) To: Goldwyn Rodrigues; +Cc: linux-btrfs [-- Attachment #1: Type: text/plain, Size: 302 bytes --] Goldwyn Rodrigues wrote: > As Chris mentioned, try a later version. If you are familiar > with git, you could even try the devel version. Looking at the commits in current devel (2f4a73f9a612876116) since v4.9, there doesn't seem to be anything relevant, but I can retry if you think it is worthwhile. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 832 bytes --] ^ permalink raw reply [flat|nested] 20+ messages in thread
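For completeness, a quick way to double-check that nothing relevant landed between v4.9 and the quoted devel head is to restrict git log to the file named in the backtraces; image/main.c and the commit id are taken from earlier in the thread.

# Commits between v4.9 and the quoted devel head that touch the crashing
# file, plus any commit whose message mentions "sanitize":
git log --oneline v4.9..2f4a73f9a612876116 -- image/main.c
git log --oneline v4.9..2f4a73f9a612876116 --grep=sanitize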