From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail-pf1-f179.google.com (mail-pf1-f179.google.com [209.85.210.179]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EFB2F40C03 for ; Sun, 2 Jun 2024 23:37:18 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.179 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717371440; cv=none; b=IFEr+BEn0hzVlQa/EMekR91uo0Nuoc6gerUEFSLdpGTWb2huRFD2G64z5FrO9rfUCh74nBL2SpAb+MVX6vwnjA8ugVvNnnZIZU78CQO24/WbdmYu9clqCgiLqgOr1FbCr5XF6RV290fj9UAs0T7hoFMEHZWfwSwBfkQbNAVKVkw= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717371440; c=relaxed/simple; bh=adM5UqbLZsUsSMDE6CRVvcIOUpORHYPuRZBuNyEkHMk=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=Lw5JN5Xb78dhFZYkddX3oDIU9IMJDofAXSaVxNOSnDmoRvEKZi+mRVvy2nqJPYAGpyGYvOO/qUjXNXE0wCT1nVmQj+6xXVrs1rK0xfSYFSM/1p+Z21GhcGsAAVAJi4V2zd0e40SfDuCes7b9jaiGgCDvqNDXncoYIRLtGauPkow= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=fromorbit.com; spf=pass smtp.mailfrom=fromorbit.com; dkim=pass (2048-bit key) header.d=fromorbit-com.20230601.gappssmtp.com header.i=@fromorbit-com.20230601.gappssmtp.com header.b=mrQ3e4hE; arc=none smtp.client-ip=209.85.210.179 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=fromorbit.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=fromorbit.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=fromorbit-com.20230601.gappssmtp.com header.i=@fromorbit-com.20230601.gappssmtp.com header.b="mrQ3e4hE" Received: by mail-pf1-f179.google.com with SMTP id d2e1a72fcca58-7025e7a42dcso667401b3a.1 for ; Sun, 02 Jun 2024 16:37:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fromorbit-com.20230601.gappssmtp.com; s=20230601; t=1717371438; x=1717976238; darn=vger.kernel.org; h=in-reply-to:content-transfer-encoding:content-disposition :mime-version:references:message-id:subject:cc:to:from:date:from:to :cc:subject:date:message-id:reply-to; bh=yIyy6V/HgjngZpZdcQcFHRoQXU8swvm21+8SP0nEENo=; b=mrQ3e4hEw4K83fKAxT8xGWI4vG0T2VuORBDisXJVQ2cD9Mig8tJnCmPLvTWeGvS826 n3M9X7C3dNqNi4dBSHXvSEs6DENIiHLtZKkCNbcGBv7cXG1GEfnhTr4zxLxivf0yQNcx fZi8VseiSQUafDu1ZNZCCd0KdkAOwArH3eKrVFmsbEUAex7bOOrFVp32xRYa+4clQoOT a+fADBXwN3FuSvndESMt8CTYsKCbRX06p/CklDeF74pXNfjeWUVMGoY8rM4PXxXwluKZ 4sY/FQeDyMOUoTk6ui8y+yaofVhh6xVGVz+96BpQxWGrQp3uqe24oQqcgxJzq4/JTdLa 8UnQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1717371438; x=1717976238; h=in-reply-to:content-transfer-encoding:content-disposition :mime-version:references:message-id:subject:cc:to:from:date :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=yIyy6V/HgjngZpZdcQcFHRoQXU8swvm21+8SP0nEENo=; b=ak1Q+xnWkyQXtT5P1EMv7R2t9Y50S/VyQZ6G//ZNcfSCe48oYPj4czf9rsH82mLuKr ROJTByhKeaaA2tfGo0nwWkE/c53BuXVaLxenPtFCm55tRBEGB/V0dxwefhIxZiNscQ3A Da0uic0o0kOQaB4q461tYr3k5tia9SgR2grYWmcD4FefIomVSIFlDihGGAaZjXZrkhs1 BT+nst+udEuBKTVHVzMYVuGLAQzymkeVcQZsq+ATpJKPLkqTXDqiudhOI0pv9LftrSTo /YGbX3sa0Eo2aubh5+tjNmE7P1/nGapyjQm/geBHeN0dYcRhtIUu+1vz4i1NgwVt5mml Jy5A== X-Gm-Message-State: AOJu0YzwxTAwvZsJYu0ALxjzeK/40W8v4delChfo5buVj87wBPDtfvhX 5aEkncAmpSTmSc1KP7RQ+WALiR2JULLYu51LNB52V8iSsspY+ekV6R2f8tqo9cIQ2iYDlkIKREo 7 X-Google-Smtp-Source: AGHT+IF2uElQpbK/FduWWGNKOmwVu50ouO12T2hnh7OvwVFDDTHqOMn2aZ3IgIHA1LmeWs5eXpfepw== X-Received: by 2002:a05:6a00:2343:b0:702:2f19:659c with SMTP id d2e1a72fcca58-702478c7cc2mr7795453b3a.31.1717371437955; Sun, 02 Jun 2024 16:37:17 -0700 (PDT) Received: from dread.disaster.area (pa49-179-32-121.pa.nsw.optusnet.com.au. [49.179.32.121]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-702423cb2e4sm4548137b3a.19.2024.06.02.16.37.17 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 02 Jun 2024 16:37:17 -0700 (PDT) Received: from dave by dread.disaster.area with local (Exim 4.96) (envelope-from ) id 1sDulG-002E4j-1x; Mon, 03 Jun 2024 09:37:14 +1000 Date: Mon, 3 Jun 2024 09:37:14 +1000 From: Dave Chinner To: Wengang Wang Cc: "linux-xfs@vger.kernel.org" Subject: Re: [PATCH] xfs: make sure sb_fdblocks is non-negative Message-ID: References: <20240511003426.13858-1-wen.gang.wang@oracle.com> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: On Fri, May 31, 2024 at 03:44:56PM +0000, Wengang Wang wrote: > Hi Dave, > > Do you have further comments and/or suggestions? Or give a RB pls :D Sorry, LSFMM intervened and I didn't notice your comment until now. > > On May 13, 2024, at 10:06 AM, Wengang Wang wrote: > > > > Things is that we have a metadump, looking at the fdblocks from super block 0, it is good. > > > > $ xfs_db -c "sb 0" -c "p" cust.img |egrep "dblocks|ifree|icount" > > dblocks = 26214400 > > icount = 512 > > ifree = 337 > > fdblocks = 25997100 > > > > And when looking at the log, we have the following: > > > > $ egrep -a "fdblocks|icount|ifree" cust.log |tail > > sb_fdblocks 37 > > sb_icount 1056 > > sb_ifree 87 > > sb_fdblocks 37 > > sb_icount 1056 > > sb_ifree 87 > > sb_fdblocks 37 > > sb_icount 1056 > > sb_ifree 87 > > sb_fdblocks 18446744073709551604 > > > > # cust.log is output of my script which tries to parse the log buffer. > > > > 18446744073709551604ULL == 0xfffffffffffffff4 or -12LL > > > > With upstream kernel (6.7.0-rc3), when I tried to mount (log recover) the metadump, > > I got the following in dmesg: > > > > [ 52.927796] XFS (loop0): SB summary counter sanity check failed > > [ 52.928889] XFS (loop0): Metadata corruption detected at xfs_sb_write_verify+0x60/0x110 [xfs], xfs_sb block 0x0 > > [ 52.930890] XFS (loop0): Unmount and run xfs_repair > > [ 52.931797] XFS (loop0): First 128 bytes of corrupted metadata buffer: > > [ 52.932954] 00000000: 58 46 53 42 00 00 10 00 00 00 00 00 01 90 00 00 XFSB............ > > [ 52.934333] 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ > > [ 52.935733] 00000020: c9 c1 ed ae 84 ed 46 b9 a1 f0 09 57 4a a9 98 42 ......F....WJ..B > > [ 52.937120] 00000030: 00 00 00 00 01 00 00 06 00 00 00 00 00 00 00 80 ................ > > [ 52.938515] 00000040: 00 00 00 00 00 00 00 81 00 00 00 00 00 00 00 82 ................ > > [ 52.939919] 00000050: 00 00 00 01 00 64 00 00 00 00 00 04 00 00 00 00 .....d.......... > > [ 52.941293] 00000060: 00 00 64 00 b4 a5 02 00 02 00 00 08 00 00 00 00 ..d............. > > [ 52.942661] 00000070: 00 00 00 00 00 00 00 00 0c 09 09 03 17 00 00 19 ................ > > [ 52.944046] XFS (loop0): Corruption of in-memory data (0x8) detected at _xfs_buf_ioapply+0x38b/0x3a0 [xfs] (fs/xfs/xfs_buf.c:1559). Shutting down filesystem. > > [ 52.946710] XFS (loop0): Please unmount the filesystem and rectify the problem(s) > > [ 52.948099] XFS (loop0): log mount/recovery failed: error -117 > > [ 52.949810] XFS (loop0): log mount failed And that's what should be in the commit message, as it explains exactly how the problem occurred, the symptom that was seen, and why the change is necessary. It also means that anyone else who sees a similar problem and is grepping the commit history will see this and recognise it, thereby knowing that this is the fix they need... > > Looking at corresponding code: > > 231 xfs_validate_sb_write( > > 232 struct xfs_mount *mp, > > 233 struct xfs_buf *bp, > > 234 struct xfs_sb *sbp) > > 235 { > > 236 /* > > 237 * Carry out additional sb summary counter sanity checks when we write > > 238 * the superblock. We skip this in the read validator because there > > 239 * could be newer superblocks in the log and if the values are garbage > > 240 * even after replay we'll recalculate them at the end of log mount. > > 241 * > > 242 * mkfs has traditionally written zeroed counters to inprogress and > > 243 * secondary superblocks, so allow this usage to continue because > > 244 * we never read counters from such superblocks. > > 245 */ > > 246 if (xfs_buf_daddr(bp) == XFS_SB_DADDR && !sbp->sb_inprogress && > > 247 (sbp->sb_fdblocks > sbp->sb_dblocks || > > 248 !xfs_verify_icount(mp, sbp->sb_icount) || > > 249 sbp->sb_ifree > sbp->sb_icount)) { > > 250 xfs_warn(mp, "SB summary counter sanity check failed"); > > 251 return -EFSCORRUPTED; > > 252 } > > > > From dmesg and code, we know the check failure was due to bad sb_ifree vs sb_icount or bad sb_fdblocks vs sb_dblocks. > > > > Looking at the super block dump and log dump, > > We know ifree and icount are good, what’s bad is sb_fdblocks. And that sb_fdblocks is from log. > > # I verified that sb_fdblocks is 0xfffffffffffffff4 with a UEK debug kernel (though not 6.7.0-rc3) > > > > So the sb_fdblocks is updated from log to incore at xfs_log_sb() -> xfs_validate_sb_write() path though > > Should be may re-calculated from AGs. > > > > The fix aims to make xfs_validate_sb_write() happy. What about the sb_icount and sb_ifree counters? They are also percpu counters, and they can return transient negative numbers, too, right? If they end up in the log, the same as this transient negative sb_fdblocks count, won't that also cause exactly the same issue? i.e. if we need to fix the sb_fdblocks sum to always be positive, then we need to do the same thing with the other lazy superblock per-cpu counters so they don't trip the over the same transient underflow issue... -Dave. -- Dave Chinner david@fromorbit.com