From: Davíð Steinn Geirsson
To: linux-btrfs@vger.kernel.org
Date: Fri, 21 Aug 2020 09:51:38 +0000
Subject: btrfs filesystem errors

Hi all,

One of our btrfs filesystems is giving us problems.
This filesystem is used to store backups, so we will most likely recreate
the filesystem and take new backups, but we are very interested in
understanding what went wrong. For now I've left the filesystem as-is.

The machine is running fully upgraded Debian buster:

root@ht-backup01:/home/ansible# uname -a
Linux ht-backup01.isnic.is 4.19.0-9-amd64 #1 SMP Debian 4.19.118-2+deb10u1 (2020-06-07) x86_64 GNU/Linux
root@ht-backup01:/home/ansible# btrfs --version
btrfs-progs v4.20.1

The layout of the filesystem is RAID10 with 6 component devices:

root@ht-backup01:/var/log/apt# btrfs fi df /var/urbackup
Data, RAID10: total=8.37TiB, used=8.24TiB
System, RAID10: total=7.88MiB, used=848.00KiB
Metadata, RAID10: total=212.34GiB, used=210.97GiB
GlobalReserve, single: total=512.00MiB, used=0.00B
root@ht-backup01:/var/log/apt# btrfs filesystem show /var/urbackup
Label: none  uuid: be13f041-ba50-4d37-860b-7acfcfe96248
	Total devices 6 FS bytes used 8.45TiB
	devid    1 size 9.10TiB used 2.86TiB path /dev/sda
	devid    2 size 9.10TiB used 2.86TiB path /dev/sdb
	devid    3 size 9.10TiB used 2.86TiB path /dev/sdc
	devid    4 size 9.10TiB used 2.86TiB path /dev/sdd
	devid    5 size 9.10TiB used 2.86TiB path /dev/sde
	devid    6 size 9.10TiB used 2.86TiB path /dev/sdf

Here's an overview of events and actions taken so far.

A couple of days ago, while running a heavy read workload, a process
crashed due to an IO error. Reading the same subvolume, or any file
under it, resulted in IO errors:

root@ht-backup01:/var/log/apt# ls /var/urbackup/consus.isnic.is/200712-1900
ls: cannot access '/var/urbackup/consus.isnic.is/200712-1900': Input/output error

A look at the kernel log revealed a bunch of checksum errors. The
earliest ones look like this; there are more, but they look the same
with different inode numbers and checksums. I can send the full kernel
log (quite large) if needed.
[2146111.297911] BTRFS info (device sda): no csum found for inode 124415604 start 12288
[2146111.297913] BTRFS error (device sda): parent transid verify failed on 16717660405760 wanted 168871 found 168866
[2146111.298084] BTRFS error (device sda): parent transid verify failed on 16717660405760 wanted 168871 found 168866
[2146111.298255] BTRFS info (device sda): no csum found for inode 124415604 start 16384
[2146111.298258] BTRFS info (device sda): no csum found for inode 124415604 start 20480
[2146111.298261] BTRFS info (device sda): no csum found for inode 124415604 start 24576
[2146111.298278] BTRFS info (device sda): no csum found for inode 124415604 start 28672
[2146111.298280] BTRFS info (device sda): no csum found for inode 124415604 start 32768
[2146111.298283] BTRFS info (device sda): no csum found for inode 124415604 start 36864
[2146111.309761] btrfs_print_data_csum_error: 61 callbacks suppressed
[2146111.309768] BTRFS warning (device sda): csum failed root 5 ino 124415604 off 0 csum 0x6576cb90 expected csum 0x00000000 mirror 2
[2146111.309822] BTRFS warning (device sda): csum failed root 5 ino 124415604 off 4096 csum 0x74736b53 expected csum 0x00000000 mirror 2
[2146111.309865] BTRFS warning (device sda): csum failed root 5 ino 124415604 off 8192 csum 0xb98e6445 expected csum 0x00000000 mirror 2
[2146111.309893] BTRFS warning (device sda): csum failed root 5 ino 124415604 off 12288 csum 0xe04aa9a0 expected csum 0x00000000 mirror 2
[2146111.309918] BTRFS warning (device sda): csum failed root 5 ino 124415604 off 16384 csum 0xc996dda7 expected csum 0x00000000 mirror 2
[2146111.309943] BTRFS warning (device sda): csum failed root 5 ino 124415604 off 20480 csum 0xe414f94c expected csum 0x00000000 mirror 2
[2146111.309969] BTRFS warning (device sda): csum failed root 5 ino 124415604 off 24576 csum 0xb80afd23 expected csum 0x00000000 mirror 2
[2146111.310966] BTRFS warning (device sda): csum failed root 5 ino 124415604 off 0 csum 0x6576cb90 expected csum 0x00000000 mirror 2
[2146111.310976] BTRFS warning (device sda): csum failed root 5 ino 124415604 off 4096 csum 0x74736b53 expected csum 0x00000000 mirror 2
[2146111.311022] BTRFS warning (device sda): csum failed root 5 ino 124415604 off 8192 csum 0xb98e6445 expected csum 0x00000000 mirror 2
[2146162.968521] verify_parent_transid: 178 callbacks suppressed
[2146162.968526] BTRFS error (device sda): parent transid verify failed on 16717271908352 wanted 168871 found 168868
[2146162.982356] BTRFS error (device sda): parent transid verify failed on 16717271908352 wanted 168871 found 168868
[2146162.983157] BTRFS error (device sda): parent transid verify failed on 16717271908352 wanted 168871 found 168868
[2146162.984230] BTRFS error (device sda): parent transid verify failed on 16717271908352 wanted 168871 found 168868
[2146163.285382] BTRFS error (device sda): parent transid verify failed on 16717271908352 wanted 168871 found 168868
[2146163.285945] BTRFS error (device sda): parent transid verify failed on 16717271908352 wanted 168871 found 168868

The first action was to try a scrub, but the scrub was immediately
marked canceled without operator action:

root@ht-backup01:/var/log/apt# btrfs scrub start /var/urbackup/
scrub started on /var/urbackup/, fsid be13f041-ba50-4d37-860b-7acfcfe96248 (pid=2475)
root@ht-backup01:/var/log/apt# btrfs scrub status /var/urbackup/
scrub status for be13f041-ba50-4d37-860b-7acfcfe96248
	scrub started at Tue Aug 18 11:32:15 2020 and was aborted after 00:00:00
	total bytes scrubbed: 0.00B with 0 errors

Despite the checksum errors in the kernel log, device stats show no IO
errors:

root@ht-backup01:/home/ansible# btrfs device stats /var/urbackup/
[/dev/sda].write_io_errs    0
[/dev/sda].read_io_errs     0
[/dev/sda].flush_io_errs    0
[/dev/sda].corruption_errs  0
[/dev/sda].generation_errs  0
[/dev/sdb].write_io_errs    0
[/dev/sdb].read_io_errs     0
[/dev/sdb].flush_io_errs    0
[/dev/sdb].corruption_errs  0
[/dev/sdb].generation_errs  0
[/dev/sdc].write_io_errs    0
[/dev/sdc].read_io_errs     0
[/dev/sdc].flush_io_errs    0
[/dev/sdc].corruption_errs  0
[/dev/sdc].generation_errs  0
[/dev/sdd].write_io_errs    0
[/dev/sdd].read_io_errs     0
[/dev/sdd].flush_io_errs    0
[/dev/sdd].corruption_errs  0
[/dev/sdd].generation_errs  0
[/dev/sde].write_io_errs    0
[/dev/sde].read_io_errs     0
[/dev/sde].flush_io_errs    0
[/dev/sde].corruption_errs  0
[/dev/sde].generation_errs  0
[/dev/sdf].write_io_errs    0
[/dev/sdf].read_io_errs     0
[/dev/sdf].flush_io_errs    0
[/dev/sdf].corruption_errs  0
[/dev/sdf].generation_errs  0

SMART data also looks fine, and there are no IO errors from the
underlying block devices in the kernel log.

Following advice from the #btrfs IRC channel on freenode, I unmounted
the filesystem and ran "btrfs check --readonly /dev/sda". The
stdout/stderr piped to a file looks like this (the "Opening filesystem"
line appears last because of stdout/stderr interleaving):

[1/7] checking root items
[2/7] checking extents
[3/7] checking free space cache
block group 12376079204352 has wrong amount of free space, free space cache has 5619712 block group has 5718016
failed to load free space cache for block group 12376079204352
block group 12377186500608 has wrong amount of free space, free space cache has 5767168 block group has 5832704
failed to load free space cache for block group 12377186500608
[4/7] checking fs roots
[5/7] checking only csums items (without verifying data)
[6/7] checking root refs
[7/7] checking quota groups skipped (not enabled on this FS)
Opening filesystem to check...
Checking filesystem on /dev/sda
UUID: be13f041-ba50-4d37-860b-7acfcfe96248
found 9280877776896 bytes used, no error found
total csum bytes: 8846572604
total tree bytes: 226491777024
total fs tree bytes: 212365148160
total extent tree bytes: 4458266624
btree space waste bytes: 41516609810
file data blocks allocated: 1279139722285056
 referenced 38321097904128

I remounted the filesystem, at which point the following lines were
added to the kernel log:

[2238865.763060] BTRFS info (device sda): disk space caching is enabled
[2238865.763064] BTRFS info (device sda): has skinny extents
[2238889.870492] BTRFS info (device sda): start tree-log replay
[2238901.447456] BTRFS warning (device sda): block group 12376079204352 has wrong amount of free space
[2238901.447460] BTRFS warning (device sda): failed to load free space cache for block group 12376079204352, rebuilding it now
[2238901.717617] BTRFS warning (device sda): block group 12377186500608 has wrong amount of free space
[2238901.717621] BTRFS warning (device sda): failed to load free space cache for block group 12377186500608, rebuilding it now

At this point the subvolume and files that were previously unreadable
seemed to be working again; I tried reading a couple of them and
succeeded.

I then ran a scrub on the filesystem, which ran normally and finished
with no errors:

root@ht-backup01:/home/ansible# btrfs scrub status /var/urbackup/
scrub status for be13f041-ba50-4d37-860b-7acfcfe96248
	scrub started at Wed Aug 19 13:13:16 2020 and finished after 04:16:41
	total bytes scrubbed: 16.89TiB with 0 errors

I started our backup server again, but overnight our backup jobs failed
with "no space available" errors.
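As an aside, the scrub total itself looks right for this profile. RAID10
keeps two copies of every extent, so a complete scrub should read
roughly twice the 8.45TiB of "FS bytes used". A back-of-the-envelope
check (my own arithmetic, not output from any btrfs tool):

```python
# Sanity check: a btrfs RAID10 profile stores 2 copies of all data and
# metadata, so scrubbed bytes should be about 2x the used bytes.
TIB = 1024 ** 4
fs_bytes_used = 8.45 * TIB    # "FS bytes used" from `btrfs filesystem show`
bytes_scrubbed = 16.89 * TIB  # "total bytes scrubbed" from `btrfs scrub status`
copies = bytes_scrubbed / fs_bytes_used
print(f"scrub read {copies:.2f}x the used bytes")  # ~2.00, as expected
```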
In the kernel log the following messages had appeared:

[2350368.220399] ------------[ cut here ]------------
[2350368.220401] BTRFS: Transaction aborted (error -28)
[2350368.220462] WARNING: CPU: 4 PID: 14252 at fs/btrfs/extent-tree.c:6805 __btrfs_free_extent.isra.75+0x26c/0x8c0 [btrfs]
[2350368.220463] Modules linked in: fuse ufs qnx4 hfsplus hfs minix vfat msdos fat jfs xfs ipmi_ssif nft_counter intel_rapl skx_edac nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm ast irqbypass crct10dif_pclmul crc32_pclmul ttm ghash_clmulni_intel intel_cstate drm_kms_helper intel_uncore drm intel_rapl_perf pcspkr iTCO_wdt mei_me ipmi_si i2c_algo_bit iTCO_vendor_support pcc_cpufreq joydev sg mei ioatdma wmi ipmi_devintf ipmi_msghandler evdev acpi_power_meter acpi_pad nft_ct nf_conntrack button nf_defrag_ipv6 nf_defrag_ipv4 nf_tables_set nf_tables nfnetlink ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 fscrypto ecb btrfs zstd_decompress zstd_compress xxhash dm_mod raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic
[2350368.220489]  raid1 raid0 multipath linear md_mod hid_generic usbhid hid sd_mod crc32c_intel ahci xhci_pci libahci xhci_hcd libata ixgbe i40e aesni_intel nvme usbcore aes_x86_64 scsi_mod crypto_simd cryptd glue_helper nvme_core dca mdio lpc_ich i2c_i801 mfd_core usb_common
[2350368.220504] CPU: 4 PID: 14252 Comm: kworker/4:2 Tainted: G        W         4.19.0-9-amd64 #1 Debian 4.19.118-2+deb10u1
[2350368.220505] Hardware name: Supermicro SYS-5029P-WTR/X11SPW-TF, BIOS 2.1a 09/17/2018
[2350368.220528] Workqueue: events do_async_commit [btrfs]
[2350368.220550] RIP: 0010:__btrfs_free_extent.isra.75+0x26c/0x8c0 [btrfs]
[2350368.220551] Code: 24 48 8b 40 50 f0 48 0f ba a8 00 17 00 00 02 72 1b 41 83 fd fb 0f 84 6b 25 09 00 44 89 ee 48 c7 c7 e0 81 7c c0 e8 4e 1d b6 ec <0f> 0b 48 8b 3c 24 44 89 e9 ba 95 1a 00 00 48 c7 c6 a0 da 7b c0 e8
[2350368.220553] RSP: 0018:ffffba5647217b18 EFLAGS: 00010286
[2350368.220554] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000006
[2350368.220555] RDX: 0000000000000007 RSI: 0000000000000096 RDI: ffff96dc10d166b0
[2350368.220556] RBP: 00000a4ed7910000 R08: 000000000006f2ba R09: 0000000000000004
[2350368.220557] R10: 0000000000000000 R11: 0000000000000001 R12: ffff96d8e7b50540
[2350368.220558] R13: 00000000ffffffe4 R14: 0000000000000000 R15: 0000000000000002
[2350368.220559] FS:  0000000000000000(0000) GS:ffff96dc10d00000(0000) knlGS:0000000000000000
[2350368.220560] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[2350368.220561] CR2: 00007fb20412a9c0 CR3: 0000000353a0a002 CR4: 00000000007606e0
[2350368.220562] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[2350368.220563] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[2350368.220564] PKRU: 55555554
[2350368.220564] Call Trace:
[2350368.220594]  ? btrfs_merge_delayed_refs+0x21e/0x310 [btrfs]
[2350368.220613]  __btrfs_run_delayed_refs+0x70a/0xfe0 [btrfs]
[2350368.220633]  btrfs_run_delayed_refs+0x6a/0x180 [btrfs]
[2350368.220652]  btrfs_write_dirty_block_groups+0x165/0x3a0 [btrfs]
[2350368.220670]  ? btrfs_run_delayed_refs+0xac/0x180 [btrfs]
[2350368.220691]  commit_cowonly_roots+0x242/0x2f0 [btrfs]
[2350368.220712]  btrfs_commit_transaction+0x314/0x890 [btrfs]
[2350368.220716]  ? __switch_to_asm+0x41/0x70
[2350368.220718]  ? __switch_to+0x8c/0x440
[2350368.220720]  ? __switch_to_asm+0x35/0x70
[2350368.220741]  do_async_commit+0x22/0x30 [btrfs]
[2350368.220745]  process_one_work+0x1a7/0x3a0
[2350368.220748]  worker_thread+0x30/0x390
[2350368.220751]  ? create_worker+0x1a0/0x1a0
[2350368.220753]  kthread+0x112/0x130
[2350368.220755]  ? kthread_bind+0x30/0x30
[2350368.220757]  ret_from_fork+0x35/0x40
[2350368.220759] ---[ end trace 29a36f403bc46833 ]---
[2350368.220761] BTRFS: error (device sda) in __btrfs_free_extent:6805: errno=-28 No space left
[2350368.220800] BTRFS info (device sda): forced readonly
[2350368.220802] BTRFS: error (device sda) in btrfs_run_delayed_refs:2935: errno=-28 No space left
[2350368.238585] BTRFS warning (device sda): Skipping commit of aborted transaction.
[2350368.238587] BTRFS: error (device sda) in cleanup_transaction:1860: errno=-28 No space left

df reports plenty of free space:

root@ht-backup01:/home/ansible# df -h /var/urbackup/
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda         28T  8.6T   13T  41% /var/urbackup

At this point I believe this filesystem cannot be trusted, even if some
tooling managed to repair it to a seemingly working state. However, I am
interested in knowing what happened so we can prevent the same situation
in the future.

The filesystem was created on this machine only a couple of months ago.
The machine has not been reinstalled during that period (it's been on
Debian buster the entire time). This is a physical Supermicro server
hosted in our primary datacenter with redundant power. To my knowledge
it has never had an unclean shutdown.

Any ideas?

-Davíð
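P.S. In case it saves anyone a lookup: the errno=-28 in the transaction
abort above is plain ENOSPC, which is why the backup jobs saw "no space
available" even though df disagrees. Standard-library Python confirms
the mapping (nothing btrfs-specific here):

```python
import errno
import os

# errno 28, as reported (negated) by the kernel in the traces above.
print(errno.errorcode[28])  # ENOSPC
print(os.strerror(28))      # No space left on device
```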