Date: Tue, 27 Oct 2020 19:51:46 +0000
From: Russell King - ARM Linux admin
To: linux-arm-kernel@lists.infradead.org
Subject: Re: aarch64: ext4 metadata integrity regression in kernels >= 5.5 ?
Message-ID: <20201027195145.GE1605@shell.armlinux.org.uk>
References: <20200712092231.GQ1551@shell.armlinux.org.uk> <20200712100739.GR1551@shell.armlinux.org.uk>
In-Reply-To: <20200712100739.GR1551@shell.armlinux.org.uk>

On Sun, Jul 12, 2020 at 11:07:39AM +0100, Russell King - ARM Linux admin wrote:
> On Sun, Jul 12, 2020 at 10:22:31AM +0100, Russell King - ARM Linux admin wrote:
> > Some will know that during the last six months, I've been seeing
> > problems on the LX2160A rev 1 with corrupted checksums on an EXT4
> > FS on an NVMe. I'm not certain exactly which kernels are
> > affected, but I know that 5.1 seems to be fine, and 5.5, possibly
> > 5.4 onwards seem affected, maybe earlier.
> >
> > The symptom is that the kernel will run for some random amount of
> > time (between a few days and a few months) and then EXT4 will
> > complain with "iget: checksum invalid" on the root filesystem either
> > during a logrotate or a mandb rebuild.
> >
> > Upon investigation with debugfs and hexdump, it appeared that a single
> > EXT4 inode in one sector contained an invalid 32-bit checksum.
> > EXT4 splits the 32-bit checksum into two 16-bit halves and stores them
> > in separate locations in the inode; consequently, any read or update
> > of the checksum requires two separate reads or writes.
> >
> > The problem initially seemed to correlate with powering the platform
> > down as the trigger, and it was suggested that the NVMe was at fault.
> > However, a recent case disproved that theory when the problem appeared
> > to correct itself after using "hdparm -f" on the drive - e2fsck found
> > no errors on the filesystem, and I could remount the filesystem in
> > read/write mode. "hdparm -f" syncs the device and flushes the kernel
> > cache, which it also does when you use "hdparm -t" to measure disk
> > performance.
> >
> > My next question was whether it was being caused by PCIe ordering
> > issues. I've since upgraded the machine to an LX2160A rev 2, which has
> > yet to show any symptoms of this.
> >
> > However, the reason for this email is a troubling development with this
> > problem:
> >
> > [7478798.720368] EXT4-fs error (device mmcblk0p1): ext4_lookup:1707: inode #157096: comm mandb: iget: checksum invalid
> > [7478798.729925] Aborting journal on device mmcblk0p1-8.
> > [7478798.734070] EXT4-fs (mmcblk0p1): Remounting filesystem read-only
> > [7478798.734589] EXT4-fs error (device mmcblk0p1): ext4_journal_check_start:84: Detected aborted journal
> >
> > Running "e2fsck -n" on the system without having done anything gives:
> >
> > Inode 13755 passes checks, but checksum does not match inode. Fix? no
> > Inode 157096 passes checks, but checksum does not match inode. Fix? no
> >
> > amongst other errors, which are expected for a filesystem that is
> > normally "in-use". Using "hdparm -f" does not make these errors go
> > away.
> >
> > The offending inodes found by e2fsck correspond to:
> >   /usr/share/man/nl/man1/apt-transport-mirror.1.gz
> >   /lib/firmware/rtl_bt/rtl8723a_fw.bin
> >
> > However, just like all the other instances, these would not have changed
> > recently except for atime updates.
> >
> > There are a couple of important differences here:
> > - It is an Armada 8040 system - Clearfog GT-8K running a 5.6 kernel,
> >   rather than the LX2160A.
> > - Its rootfs is on eMMC, not NVMe.
> >
> > That seems to rule out the NVMe being a cause of the problem, and any
> > PCIe issues of the LX2160A rev 1.
> >
> > Another data point is that I'm also running an Armada 8040 system as a
> > VM host, which has over a year uptime, so is on an older kernel (5.1).
> > This uses EXT4 for its rootfs as well, but is on SATA SSD, and has not
> > shown any issues. The VMs it runs are a later kernel (5.6) also with
> > EXT4, and have yet to display any symptoms.
> >
> > The similarities are - the kernel is the same or similar binary on the
> > failing systems (I've been running the same kernel config on both.)
> > Both are Cortex-A72, but slightly different revisions.
> >
> > So, it's starting to feel like an aarch64 problem, potentially a
> > locking or ordering issue. Due to how rare this issue is,
> > investigating it is likely very difficult. However, it seems to be
> > very real, as the symptoms have now been observed on two rather
> > different aarch64 platforms.
> >
> > Due to the amount of time required to test, it is very difficult to do
> > any kind of bisection, or test alternative kernels - it would take
> > months of runtime for a single test.
> >
> > I'm chucking this out there so that if anyone else is seeing this
> > behaviour, they can shout and maybe confirm what I'm seeing.
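
An aside on the split checksum storage described in the quoted message
above. The following is only a minimal Python sketch, not kernel code.
The two byte offsets are my understanding of the ext4 on-disk inode
layout (l_i_checksum_lo at 0x7C, i_checksum_hi at 0x82) and should be
treated as assumptions to check against fs/ext4/ext4.h; the helper names
are made up for illustration. The checksum values are the stored and
recomputed ones reported for inode 157096 in this thread. It shows how a
32-bit value is kept as two 16-bit halves, and what a reader would see
if only one half had been rewritten.

```python
import struct

# ASSUMPTION: offsets taken from my reading of the ext4 on-disk layout
# documentation; verify against fs/ext4/ext4.h before relying on them.
CSUM_LO_OFF = 0x7C  # osd2.linux2.l_i_checksum_lo, little-endian 16 bits
CSUM_HI_OFF = 0x82  # i_checksum_hi, little-endian 16 bits


def store_checksum(raw, csum):
    """Write a 32-bit checksum as two separate 16-bit stores."""
    struct.pack_into("<H", raw, CSUM_LO_OFF, csum & 0xFFFF)
    struct.pack_into("<H", raw, CSUM_HI_OFF, (csum >> 16) & 0xFFFF)


def load_checksum(raw):
    """Read the two 16-bit halves back and join them."""
    (lo,) = struct.unpack_from("<H", raw, CSUM_LO_OFF)
    (hi,) = struct.unpack_from("<H", raw, CSUM_HI_OFF)
    return (hi << 16) | lo


def shared_halves(a, b):
    """Report which 16-bit halves two checksums have in common."""
    return {"lo": (a & 0xFFFF) == (b & 0xFFFF), "hi": (a >> 16) == (b >> 16)}


STORED = 0x13FD5C3C      # "bad" value shown by debugfs for inode 157096
RECOMPUTED = 0x600EBA80  # "fixed" value shown by debugfs for inode 157096

raw = bytearray(256)     # hypothetical blank 256-byte inode buffer
store_checksum(raw, STORED)
assert load_checksum(raw) == STORED

# Hypothetical half-finished update: only the low half gets rewritten.
struct.pack_into("<H", raw, CSUM_LO_OFF, RECOMPUTED & 0xFFFF)
print(hex(load_checksum(raw)))
print(shared_halves(STORED, RECOMPUTED))
```

In this hypothetical half-updated case the joined value is 0x13fdba80,
which matches neither checksum. For both inodes in this thread the
stored and recomputed checksums differ in both halves, so the sketch
illustrates the storage scheme only and is not a claim about what
actually happened on these systems.
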
>
> A bit more information:
>
> Inode 157096 is /usr/share/man/nl/man1/apt-transport-mirror.1.gz:
>
> --- bad
> +++ fixed
>  debugfs: stat <157096>
>  Inode: 157096 Type: regular Mode: 0644 Flags: 0x80000
>  Generation: 3717235945 Version: 0x00000000:00000001
>  User: 0 Group: 0 Project: 0 Size: 3811
>  File ACL: 0
>  Links: 1 Blockcount: 8
>  Fragment: Address: 0 Number: 0 Size: 0
>  ctime: 0x5ebcd62f:ba34bf1c -- Thu May 14 06:25:03 2020
>  atime: 0x5ebcd63b:a2906fa0 -- Thu May 14 06:25:15 2020
>  mtime: 0x5eba730a:00000000 -- Tue May 12 10:57:30 2020
>  crtime: 0x5ebcd62f:a25cccf4 -- Thu May 14 06:25:03 2020
>  Size of extra inode fields: 32
> -Inode checksum: 0x13fd5c3c
> +Inode checksum: 0x600eba80
>  EXTENTS:
>  (0):1173965
>
> Note that mandb is set to run daily, so one must assume that the
> inode checksum was fine the previous day. Note that the file itself
> is fine - it passes gzip's integrity checks, and the contents are
> correct:
>
> # zcat /usr/share/man/nl/man1/apt-transport-mirror.1.gz >/dev/null
>
> For the other inode, 13755, /lib/firmware/rtl_bt/rtl8723a_fw.bin:
>
> --- bad
> +++ fixed
>  debugfs: stat <13755>
>  Inode: 13755 Type: regular Mode: 0644 Flags: 0x80000
>  Generation: 2326028864 Version: 0x00000000:00000001
>  User: 0 Group: 0 Project: 0 Size: 24548
>  File ACL: 0
>  Links: 1 Blockcount: 48
>  Fragment: Address: 0 Number: 0 Size: 0
>  ctime: 0x5e88ffc5:b9a541e4 -- Sat Apr 4 22:44:37 2020
>  atime: 0x5e88ffc4:00000000 -- Sat Apr 4 22:44:36 2020
>  mtime: 0x5d5f3bb0:00000000 -- Fri Aug 23 02:04:48 2019
>  crtime: 0x5e88ffc5:51b03564 -- Sat Apr 4 22:44:37 2020
>  Size of extra inode fields: 32
> -Inode checksum: 0x4d9c9f81
> +Inode checksum: 0x487c2bf3
>  EXTENTS:
>  (0-5):835670-835675
>
> In both cases, the times suggest that there has been no change made to
> these inodes recently.
>
> It would have been great to know the state of these inodes prior to the
> checksum not matching, but alas, time travel has yet to be invented!
> Maybe if/when it happens again on the Armada 8040, I'll have an ext4fs
> image to compare against - and hopefully identify exactly what has
> changed.

The problems persisted until I added some additional debug to the ext4
code, and so far the Armada 8040 system has been up for 58 days without
incident. This suggests that it is a subtle timing bug, which is going
to be nigh on impossible to debug.

Unfortunately, it means that I just can't trust recent aarch64 kernels
not to corrupt my filesystems, and I certainly can't trust them to run
any of my critical systems.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!