public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 5.15 CANDIDATE 00/10] more xfs fixes for 5.15
@ 2023-02-08 17:52 Leah Rumancik
  2023-02-08 17:52 ` [PATCH 5.15 CANDIDATE 01/10] xfs: zero inode fork buffer at allocation Leah Rumancik
                   ` (10 more replies)
  0 siblings, 11 replies; 19+ messages in thread
From: Leah Rumancik @ 2023-02-08 17:52 UTC (permalink / raw)
  To: linux-xfs; +Cc: amir73il, chandan.babu, Leah Rumancik

Hello again,

Here is the next batch of backports for 5.15.y. Testing included
25 runs of auto group on 12 xfs configs. No regressions were seen.
I checked xfs/538 was run without issue as this test was mentioned
in 56486f307100. Also, from 86d40f1e49e9, I ran ran xfs/117 with
XFS compiled as a module and TEST_FS_MODULE_REOLOAD set, but I was
unable to reproduce the issue.

Below I've outlined which series the backports came from:

series "xfs: intent whiteouts" (1):
[01/10] cb512c921639613ce03f87e62c5e93ed9fe8c84d
    xfs: zero inode fork buffer at allocation
[02/10] c230a4a85bcdbfc1a7415deec6caf04e8fca1301
    xfs: fix potential log item leak

series "xfs: fix random format verification issues" (2):
[1/4] dc04db2aa7c9307e740d6d0e173085301c173b1a
    xfs: detect self referencing btree sibling pointers
[2/4] 1eb70f54c445fcbb25817841e774adb3d912f3e8 -> already in 5.15.y
    xfs: validate inode fork size against fork format
[3/4] dd0d2f9755191690541b09e6385d0f8cd8bc9d8f
    xfs: set XFS_FEAT_NLINK correctly
[4/4] f0f5f658065a5af09126ec892e4c383540a1c77f
    xfs: validate v5 feature fields

series "xfs: small fixes for 5.19 cycle" (3):
[1/3] 5672225e8f2a872a22b0cecedba7a6644af1fb84
    xfs: avoid unnecessary runtime sibling pointer endian conversions
[2/3] 5b55cbc2d72632e874e50d2e36bce608e55aaaea
    fs: don't assert fail on perag references on teardown
[2/3] 56486f307100e8fc66efa2ebd8a71941fa10bf6f
    xfs: assert in xfs_btree_del_cursor should take into account error

series "xfs: random fixes for 5.19" (4):
[1/2] 86d40f1e49e9a909d25c35ba01bea80dbcd758cb
    xfs: purge dquots after inode walk fails during quotacheck
[2/2] a54f78def73d847cb060b18c4e4a3d1d26c9ca6d
    xfs: don't leak btree cursor when insrec fails after a split

(1) https://lore.kernel.org/all/20220503221728.185449-1-david@fromorbit.com/
(2) https://lore.kernel.org/all/20220502082018.1076561-1-david@fromorbit.com/
(3) https://lore.kernel.org/all/20220524022158.1849458-1-david@fromorbit.com/
(4) https://lore.kernel.org/all/165337056527.993079.1232300816023906959.stgit@magnolia/

Darrick J. Wong (2):
  xfs: purge dquots after inode walk fails during quotacheck
  xfs: don't leak btree cursor when insrec fails after a split

Dave Chinner (8):
  xfs: zero inode fork buffer at allocation
  xfs: fix potential log item leak
  xfs: detect self referencing btree sibling pointers
  xfs: set XFS_FEAT_NLINK correctly
  xfs: validate v5 feature fields
  xfs: avoid unnecessary runtime sibling pointer endian conversions
  xfs: don't assert fail on perag references on teardown
  xfs: assert in xfs_btree_del_cursor should take into account error

 fs/xfs/libxfs/xfs_ag.c         |   3 +-
 fs/xfs/libxfs/xfs_btree.c      | 175 +++++++++++++++++++++++++--------
 fs/xfs/libxfs/xfs_inode_fork.c |  12 ++-
 fs/xfs/libxfs/xfs_sb.c         |  70 +++++++++++--
 fs/xfs/xfs_bmap_item.c         |   2 +
 fs/xfs/xfs_icreate_item.c      |   1 +
 fs/xfs/xfs_qm.c                |   9 +-
 fs/xfs/xfs_refcount_item.c     |   2 +
 fs/xfs/xfs_rmap_item.c         |   2 +
 9 files changed, 221 insertions(+), 55 deletions(-)

-- 
2.39.1.519.gcb327c4b5f-goog


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 5.15 CANDIDATE 01/10] xfs: zero inode fork buffer at allocation
  2023-02-08 17:52 [PATCH 5.15 CANDIDATE 00/10] more xfs fixes for 5.15 Leah Rumancik
@ 2023-02-08 17:52 ` Leah Rumancik
  2023-02-08 17:52 ` [PATCH 5.15 CANDIDATE 02/10] xfs: fix potential log item leak Leah Rumancik
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 19+ messages in thread
From: Leah Rumancik @ 2023-02-08 17:52 UTC (permalink / raw)
  To: linux-xfs
  Cc: amir73il, chandan.babu, Dave Chinner, Darrick J . Wong,
	Allison Henderson, Dave Chinner, Leah Rumancik

From: Dave Chinner <dchinner@redhat.com>

[ Upstream commit cb512c921639613ce03f87e62c5e93ed9fe8c84d ]

When we first allocate or resize an inline inode fork, we round up
the allocation to 4 byte alingment to make journal alignment
constraints. We don't clear the unused bytes, so we can copy up to
three uninitialised bytes into the journal. Zero those bytes so we
only ever copy zeros into the journal.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
---
 fs/xfs/libxfs/xfs_inode_fork.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_inode_fork.c b/fs/xfs/libxfs/xfs_inode_fork.c
index 1d174909f9bd..20095233d7bc 100644
--- a/fs/xfs/libxfs/xfs_inode_fork.c
+++ b/fs/xfs/libxfs/xfs_inode_fork.c
@@ -50,8 +50,13 @@ xfs_init_local_fork(
 		mem_size++;
 
 	if (size) {
+		/*
+		 * As we round up the allocation here, we need to ensure the
+		 * bytes we don't copy data into are zeroed because the log
+		 * vectors still copy them into the journal.
+		 */
 		real_size = roundup(mem_size, 4);
-		ifp->if_u1.if_data = kmem_alloc(real_size, KM_NOFS);
+		ifp->if_u1.if_data = kmem_zalloc(real_size, KM_NOFS);
 		memcpy(ifp->if_u1.if_data, data, size);
 		if (zero_terminate)
 			ifp->if_u1.if_data[size] = '\0';
@@ -500,10 +505,11 @@ xfs_idata_realloc(
 	/*
 	 * For inline data, the underlying buffer must be a multiple of 4 bytes
 	 * in size so that it can be logged and stay on word boundaries.
-	 * We enforce that here.
+	 * We enforce that here, and use __GFP_ZERO to ensure that size
+	 * extensions always zero the unused roundup area.
 	 */
 	ifp->if_u1.if_data = krealloc(ifp->if_u1.if_data, roundup(new_size, 4),
-				      GFP_NOFS | __GFP_NOFAIL);
+				      GFP_NOFS | __GFP_NOFAIL | __GFP_ZERO);
 	ifp->if_bytes = new_size;
 }
 
-- 
2.39.1.519.gcb327c4b5f-goog


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 5.15 CANDIDATE 02/10] xfs: fix potential log item leak
  2023-02-08 17:52 [PATCH 5.15 CANDIDATE 00/10] more xfs fixes for 5.15 Leah Rumancik
  2023-02-08 17:52 ` [PATCH 5.15 CANDIDATE 01/10] xfs: zero inode fork buffer at allocation Leah Rumancik
@ 2023-02-08 17:52 ` Leah Rumancik
  2023-02-08 17:52 ` [PATCH 5.15 CANDIDATE 03/10] xfs: detect self referencing btree sibling pointers Leah Rumancik
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 19+ messages in thread
From: Leah Rumancik @ 2023-02-08 17:52 UTC (permalink / raw)
  To: linux-xfs
  Cc: amir73il, chandan.babu, Dave Chinner, Darrick J . Wong,
	Allison Henderson, Dave Chinner, Leah Rumancik

From: Dave Chinner <dchinner@redhat.com>

[ Upstream commit c230a4a85bcdbfc1a7415deec6caf04e8fca1301 ]

Ever since we added shadown format buffers to the log items, log
items need to handle the item being released with shadow buffers
attached. Due to the fact this requirement was added at the same
time we added new rmap/reflink intents, we missed the cleanup of
those items.

In theory, this means shadow buffers can be leaked in a very small
window when a shutdown is initiated. Testing with KASAN shows this
leak does not happen in practice - we haven't identified a single
leak in several years of shutdown testing since ~v4.8 kernels.

However, the intent whiteout cleanup mechanism results in every
cancelled intent in exactly the same state as this tiny race window
creates and so if intents down clean up shadow buffers on final
release we will leak the shadow buffer for just about every intent
we create.

Hence we start with this patch to close this condition off and
ensure that when whiteouts start to be used we don't leak lots of
memory.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
---
 fs/xfs/xfs_bmap_item.c     | 2 ++
 fs/xfs/xfs_icreate_item.c  | 1 +
 fs/xfs/xfs_refcount_item.c | 2 ++
 fs/xfs/xfs_rmap_item.c     | 2 ++
 4 files changed, 7 insertions(+)

diff --git a/fs/xfs/xfs_bmap_item.c b/fs/xfs/xfs_bmap_item.c
index 03159970133f..51ffdec5e4fa 100644
--- a/fs/xfs/xfs_bmap_item.c
+++ b/fs/xfs/xfs_bmap_item.c
@@ -39,6 +39,7 @@ STATIC void
 xfs_bui_item_free(
 	struct xfs_bui_log_item	*buip)
 {
+	kmem_free(buip->bui_item.li_lv_shadow);
 	kmem_cache_free(xfs_bui_zone, buip);
 }
 
@@ -198,6 +199,7 @@ xfs_bud_item_release(
 	struct xfs_bud_log_item	*budp = BUD_ITEM(lip);
 
 	xfs_bui_release(budp->bud_buip);
+	kmem_free(budp->bud_item.li_lv_shadow);
 	kmem_cache_free(xfs_bud_zone, budp);
 }
 
diff --git a/fs/xfs/xfs_icreate_item.c b/fs/xfs/xfs_icreate_item.c
index 017904a34c02..c265ae20946d 100644
--- a/fs/xfs/xfs_icreate_item.c
+++ b/fs/xfs/xfs_icreate_item.c
@@ -63,6 +63,7 @@ STATIC void
 xfs_icreate_item_release(
 	struct xfs_log_item	*lip)
 {
+	kmem_free(ICR_ITEM(lip)->ic_item.li_lv_shadow);
 	kmem_cache_free(xfs_icreate_zone, ICR_ITEM(lip));
 }
 
diff --git a/fs/xfs/xfs_refcount_item.c b/fs/xfs/xfs_refcount_item.c
index 46904b793bd4..8ef842d17916 100644
--- a/fs/xfs/xfs_refcount_item.c
+++ b/fs/xfs/xfs_refcount_item.c
@@ -35,6 +35,7 @@ STATIC void
 xfs_cui_item_free(
 	struct xfs_cui_log_item	*cuip)
 {
+	kmem_free(cuip->cui_item.li_lv_shadow);
 	if (cuip->cui_format.cui_nextents > XFS_CUI_MAX_FAST_EXTENTS)
 		kmem_free(cuip);
 	else
@@ -204,6 +205,7 @@ xfs_cud_item_release(
 	struct xfs_cud_log_item	*cudp = CUD_ITEM(lip);
 
 	xfs_cui_release(cudp->cud_cuip);
+	kmem_free(cudp->cud_item.li_lv_shadow);
 	kmem_cache_free(xfs_cud_zone, cudp);
 }
 
diff --git a/fs/xfs/xfs_rmap_item.c b/fs/xfs/xfs_rmap_item.c
index 5f0695980467..15e7b01740a7 100644
--- a/fs/xfs/xfs_rmap_item.c
+++ b/fs/xfs/xfs_rmap_item.c
@@ -35,6 +35,7 @@ STATIC void
 xfs_rui_item_free(
 	struct xfs_rui_log_item	*ruip)
 {
+	kmem_free(ruip->rui_item.li_lv_shadow);
 	if (ruip->rui_format.rui_nextents > XFS_RUI_MAX_FAST_EXTENTS)
 		kmem_free(ruip);
 	else
@@ -227,6 +228,7 @@ xfs_rud_item_release(
 	struct xfs_rud_log_item	*rudp = RUD_ITEM(lip);
 
 	xfs_rui_release(rudp->rud_ruip);
+	kmem_free(rudp->rud_item.li_lv_shadow);
 	kmem_cache_free(xfs_rud_zone, rudp);
 }
 
-- 
2.39.1.519.gcb327c4b5f-goog


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 5.15 CANDIDATE 03/10] xfs: detect self referencing btree sibling pointers
  2023-02-08 17:52 [PATCH 5.15 CANDIDATE 00/10] more xfs fixes for 5.15 Leah Rumancik
  2023-02-08 17:52 ` [PATCH 5.15 CANDIDATE 01/10] xfs: zero inode fork buffer at allocation Leah Rumancik
  2023-02-08 17:52 ` [PATCH 5.15 CANDIDATE 02/10] xfs: fix potential log item leak Leah Rumancik
@ 2023-02-08 17:52 ` Leah Rumancik
  2023-02-08 17:52 ` [PATCH 5.15 CANDIDATE 04/10] xfs: set XFS_FEAT_NLINK correctly Leah Rumancik
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 19+ messages in thread
From: Leah Rumancik @ 2023-02-08 17:52 UTC (permalink / raw)
  To: linux-xfs
  Cc: amir73il, chandan.babu, Dave Chinner, Christoph Hellwig,
	Darrick J . Wong, Dave Chinner, Leah Rumancik

From: Dave Chinner <dchinner@redhat.com>

[ Upstream commit dc04db2aa7c9307e740d6d0e173085301c173b1a ]

To catch the obvious graph cycle problem and hence potential endless
looping.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
---
 fs/xfs/libxfs/xfs_btree.c | 140 ++++++++++++++++++++++++++++----------
 1 file changed, 105 insertions(+), 35 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c
index 298395481713..5bec048343b0 100644
--- a/fs/xfs/libxfs/xfs_btree.c
+++ b/fs/xfs/libxfs/xfs_btree.c
@@ -51,6 +51,52 @@ xfs_btree_magic(
 	return magic;
 }
 
+static xfs_failaddr_t
+xfs_btree_check_lblock_siblings(
+	struct xfs_mount	*mp,
+	struct xfs_btree_cur	*cur,
+	int			level,
+	xfs_fsblock_t		fsb,
+	xfs_fsblock_t		sibling)
+{
+	if (sibling == NULLFSBLOCK)
+		return NULL;
+	if (sibling == fsb)
+		return __this_address;
+	if (level >= 0) {
+		if (!xfs_btree_check_lptr(cur, sibling, level + 1))
+			return __this_address;
+	} else {
+		if (!xfs_verify_fsbno(mp, sibling))
+			return __this_address;
+	}
+
+	return NULL;
+}
+
+static xfs_failaddr_t
+xfs_btree_check_sblock_siblings(
+	struct xfs_mount	*mp,
+	struct xfs_btree_cur	*cur,
+	int			level,
+	xfs_agnumber_t		agno,
+	xfs_agblock_t		agbno,
+	xfs_agblock_t		sibling)
+{
+	if (sibling == NULLAGBLOCK)
+		return NULL;
+	if (sibling == agbno)
+		return __this_address;
+	if (level >= 0) {
+		if (!xfs_btree_check_sptr(cur, sibling, level + 1))
+			return __this_address;
+	} else {
+		if (!xfs_verify_agbno(mp, agno, sibling))
+			return __this_address;
+	}
+	return NULL;
+}
+
 /*
  * Check a long btree block header.  Return the address of the failing check,
  * or NULL if everything is ok.
@@ -65,6 +111,8 @@ __xfs_btree_check_lblock(
 	struct xfs_mount	*mp = cur->bc_mp;
 	xfs_btnum_t		btnum = cur->bc_btnum;
 	int			crc = xfs_has_crc(mp);
+	xfs_failaddr_t		fa;
+	xfs_fsblock_t		fsb = NULLFSBLOCK;
 
 	if (crc) {
 		if (!uuid_equal(&block->bb_u.l.bb_uuid, &mp->m_sb.sb_meta_uuid))
@@ -83,16 +131,16 @@ __xfs_btree_check_lblock(
 	if (be16_to_cpu(block->bb_numrecs) >
 	    cur->bc_ops->get_maxrecs(cur, level))
 		return __this_address;
-	if (block->bb_u.l.bb_leftsib != cpu_to_be64(NULLFSBLOCK) &&
-	    !xfs_btree_check_lptr(cur, be64_to_cpu(block->bb_u.l.bb_leftsib),
-			level + 1))
-		return __this_address;
-	if (block->bb_u.l.bb_rightsib != cpu_to_be64(NULLFSBLOCK) &&
-	    !xfs_btree_check_lptr(cur, be64_to_cpu(block->bb_u.l.bb_rightsib),
-			level + 1))
-		return __this_address;
 
-	return NULL;
+	if (bp)
+		fsb = XFS_DADDR_TO_FSB(mp, xfs_buf_daddr(bp));
+
+	fa = xfs_btree_check_lblock_siblings(mp, cur, level, fsb,
+			be64_to_cpu(block->bb_u.l.bb_leftsib));
+	if (!fa)
+		fa = xfs_btree_check_lblock_siblings(mp, cur, level, fsb,
+				be64_to_cpu(block->bb_u.l.bb_rightsib));
+	return fa;
 }
 
 /* Check a long btree block header. */
@@ -130,6 +178,9 @@ __xfs_btree_check_sblock(
 	struct xfs_mount	*mp = cur->bc_mp;
 	xfs_btnum_t		btnum = cur->bc_btnum;
 	int			crc = xfs_has_crc(mp);
+	xfs_failaddr_t		fa;
+	xfs_agblock_t		agbno = NULLAGBLOCK;
+	xfs_agnumber_t		agno = NULLAGNUMBER;
 
 	if (crc) {
 		if (!uuid_equal(&block->bb_u.s.bb_uuid, &mp->m_sb.sb_meta_uuid))
@@ -146,16 +197,18 @@ __xfs_btree_check_sblock(
 	if (be16_to_cpu(block->bb_numrecs) >
 	    cur->bc_ops->get_maxrecs(cur, level))
 		return __this_address;
-	if (block->bb_u.s.bb_leftsib != cpu_to_be32(NULLAGBLOCK) &&
-	    !xfs_btree_check_sptr(cur, be32_to_cpu(block->bb_u.s.bb_leftsib),
-			level + 1))
-		return __this_address;
-	if (block->bb_u.s.bb_rightsib != cpu_to_be32(NULLAGBLOCK) &&
-	    !xfs_btree_check_sptr(cur, be32_to_cpu(block->bb_u.s.bb_rightsib),
-			level + 1))
-		return __this_address;
 
-	return NULL;
+	if (bp) {
+		agbno = xfs_daddr_to_agbno(mp, xfs_buf_daddr(bp));
+		agno = xfs_daddr_to_agno(mp, xfs_buf_daddr(bp));
+	}
+
+	fa = xfs_btree_check_sblock_siblings(mp, cur, level, agno, agbno,
+			be32_to_cpu(block->bb_u.s.bb_leftsib));
+	if (!fa)
+		fa = xfs_btree_check_sblock_siblings(mp, cur, level, agno,
+				 agbno, be32_to_cpu(block->bb_u.s.bb_rightsib));
+	return fa;
 }
 
 /* Check a short btree block header. */
@@ -4265,6 +4318,21 @@ xfs_btree_visit_block(
 	if (xfs_btree_ptr_is_null(cur, &rptr))
 		return -ENOENT;
 
+	/*
+	 * We only visit blocks once in this walk, so we have to avoid the
+	 * internal xfs_btree_lookup_get_block() optimisation where it will
+	 * return the same block without checking if the right sibling points
+	 * back to us and creates a cyclic reference in the btree.
+	 */
+	if (cur->bc_flags & XFS_BTREE_LONG_PTRS) {
+		if (be64_to_cpu(rptr.l) == XFS_DADDR_TO_FSB(cur->bc_mp,
+							xfs_buf_daddr(bp)))
+			return -EFSCORRUPTED;
+	} else {
+		if (be32_to_cpu(rptr.s) == xfs_daddr_to_agbno(cur->bc_mp,
+							xfs_buf_daddr(bp)))
+			return -EFSCORRUPTED;
+	}
 	return xfs_btree_lookup_get_block(cur, level, &rptr, &block);
 }
 
@@ -4439,20 +4507,21 @@ xfs_btree_lblock_verify(
 {
 	struct xfs_mount	*mp = bp->b_mount;
 	struct xfs_btree_block	*block = XFS_BUF_TO_BLOCK(bp);
+	xfs_fsblock_t		fsb;
+	xfs_failaddr_t		fa;
 
 	/* numrecs verification */
 	if (be16_to_cpu(block->bb_numrecs) > max_recs)
 		return __this_address;
 
 	/* sibling pointer verification */
-	if (block->bb_u.l.bb_leftsib != cpu_to_be64(NULLFSBLOCK) &&
-	    !xfs_verify_fsbno(mp, be64_to_cpu(block->bb_u.l.bb_leftsib)))
-		return __this_address;
-	if (block->bb_u.l.bb_rightsib != cpu_to_be64(NULLFSBLOCK) &&
-	    !xfs_verify_fsbno(mp, be64_to_cpu(block->bb_u.l.bb_rightsib)))
-		return __this_address;
-
-	return NULL;
+	fsb = XFS_DADDR_TO_FSB(mp, xfs_buf_daddr(bp));
+	fa = xfs_btree_check_lblock_siblings(mp, NULL, -1, fsb,
+			be64_to_cpu(block->bb_u.l.bb_leftsib));
+	if (!fa)
+		fa = xfs_btree_check_lblock_siblings(mp, NULL, -1, fsb,
+				be64_to_cpu(block->bb_u.l.bb_rightsib));
+	return fa;
 }
 
 /**
@@ -4493,7 +4562,9 @@ xfs_btree_sblock_verify(
 {
 	struct xfs_mount	*mp = bp->b_mount;
 	struct xfs_btree_block	*block = XFS_BUF_TO_BLOCK(bp);
-	xfs_agblock_t		agno;
+	xfs_agnumber_t		agno;
+	xfs_agblock_t		agbno;
+	xfs_failaddr_t		fa;
 
 	/* numrecs verification */
 	if (be16_to_cpu(block->bb_numrecs) > max_recs)
@@ -4501,14 +4572,13 @@ xfs_btree_sblock_verify(
 
 	/* sibling pointer verification */
 	agno = xfs_daddr_to_agno(mp, xfs_buf_daddr(bp));
-	if (block->bb_u.s.bb_leftsib != cpu_to_be32(NULLAGBLOCK) &&
-	    !xfs_verify_agbno(mp, agno, be32_to_cpu(block->bb_u.s.bb_leftsib)))
-		return __this_address;
-	if (block->bb_u.s.bb_rightsib != cpu_to_be32(NULLAGBLOCK) &&
-	    !xfs_verify_agbno(mp, agno, be32_to_cpu(block->bb_u.s.bb_rightsib)))
-		return __this_address;
-
-	return NULL;
+	agbno = xfs_daddr_to_agbno(mp, xfs_buf_daddr(bp));
+	fa = xfs_btree_check_sblock_siblings(mp, NULL, -1, agno, agbno,
+			be32_to_cpu(block->bb_u.s.bb_leftsib));
+	if (!fa)
+		fa = xfs_btree_check_sblock_siblings(mp, NULL, -1, agno, agbno,
+				be32_to_cpu(block->bb_u.s.bb_rightsib));
+	return fa;
 }
 
 /*
-- 
2.39.1.519.gcb327c4b5f-goog


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 5.15 CANDIDATE 04/10] xfs: set XFS_FEAT_NLINK correctly
  2023-02-08 17:52 [PATCH 5.15 CANDIDATE 00/10] more xfs fixes for 5.15 Leah Rumancik
                   ` (2 preceding siblings ...)
  2023-02-08 17:52 ` [PATCH 5.15 CANDIDATE 03/10] xfs: detect self referencing btree sibling pointers Leah Rumancik
@ 2023-02-08 17:52 ` Leah Rumancik
  2023-02-08 17:52 ` [PATCH 5.15 CANDIDATE 05/10] xfs: validate v5 feature fields Leah Rumancik
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 19+ messages in thread
From: Leah Rumancik @ 2023-02-08 17:52 UTC (permalink / raw)
  To: linux-xfs
  Cc: amir73il, chandan.babu, Dave Chinner, Christoph Hellwig,
	Darrick J . Wong, Dave Chinner, Leah Rumancik

From: Dave Chinner <dchinner@redhat.com>

[ Upstream commit dd0d2f9755191690541b09e6385d0f8cd8bc9d8f ]

While xfs_has_nlink() is not used in kernel, it is used in userspace
(e.g. by xfs_db) so we need to set the XFS_FEAT_NLINK flag correctly
in xfs_sb_version_to_features().

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
---
 fs/xfs/libxfs/xfs_sb.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c
index e58349be78bd..72c05485c870 100644
--- a/fs/xfs/libxfs/xfs_sb.c
+++ b/fs/xfs/libxfs/xfs_sb.c
@@ -70,6 +70,8 @@ xfs_sb_version_to_features(
 	/* optional V4 features */
 	if (sbp->sb_rblocks > 0)
 		features |= XFS_FEAT_REALTIME;
+	if (sbp->sb_versionnum & XFS_SB_VERSION_NLINKBIT)
+		features |= XFS_FEAT_NLINK;
 	if (sbp->sb_versionnum & XFS_SB_VERSION_ATTRBIT)
 		features |= XFS_FEAT_ATTR;
 	if (sbp->sb_versionnum & XFS_SB_VERSION_QUOTABIT)
-- 
2.39.1.519.gcb327c4b5f-goog


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 5.15 CANDIDATE 05/10] xfs: validate v5 feature fields
  2023-02-08 17:52 [PATCH 5.15 CANDIDATE 00/10] more xfs fixes for 5.15 Leah Rumancik
                   ` (3 preceding siblings ...)
  2023-02-08 17:52 ` [PATCH 5.15 CANDIDATE 04/10] xfs: set XFS_FEAT_NLINK correctly Leah Rumancik
@ 2023-02-08 17:52 ` Leah Rumancik
  2023-02-08 17:52 ` [PATCH 5.15 CANDIDATE 06/10] xfs: avoid unnecessary runtime sibling pointer endian conversions Leah Rumancik
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 19+ messages in thread
From: Leah Rumancik @ 2023-02-08 17:52 UTC (permalink / raw)
  To: linux-xfs
  Cc: amir73il, chandan.babu, Dave Chinner, Christoph Hellwig,
	Darrick J . Wong, Dave Chinner, Leah Rumancik

From: Dave Chinner <dchinner@redhat.com>

[ Upstream commit f0f5f658065a5af09126ec892e4c383540a1c77f ]

We don't check that the v4 feature flags taht v5 requires to be set
are actually set anywhere. Do this check when we see that the
filesystem is a v5 filesystem.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
---
 fs/xfs/libxfs/xfs_sb.c | 68 +++++++++++++++++++++++++++++++++++-------
 1 file changed, 58 insertions(+), 10 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c
index 72c05485c870..04e2a57313fa 100644
--- a/fs/xfs/libxfs/xfs_sb.c
+++ b/fs/xfs/libxfs/xfs_sb.c
@@ -30,6 +30,47 @@
  * Physical superblock buffer manipulations. Shared with libxfs in userspace.
  */
 
+/*
+ * Check that all the V4 feature bits that the V5 filesystem format requires are
+ * correctly set.
+ */
+static bool
+xfs_sb_validate_v5_features(
+	struct xfs_sb	*sbp)
+{
+	/* We must not have any unknown V4 feature bits set */
+	if (sbp->sb_versionnum & ~XFS_SB_VERSION_OKBITS)
+		return false;
+
+	/*
+	 * The CRC bit is considered an invalid V4 flag, so we have to add it
+	 * manually to the OKBITS mask.
+	 */
+	if (sbp->sb_features2 & ~(XFS_SB_VERSION2_OKBITS |
+				  XFS_SB_VERSION2_CRCBIT))
+		return false;
+
+	/* Now check all the required V4 feature flags are set. */
+
+#define V5_VERS_FLAGS	(XFS_SB_VERSION_NLINKBIT	| \
+			XFS_SB_VERSION_ALIGNBIT		| \
+			XFS_SB_VERSION_LOGV2BIT		| \
+			XFS_SB_VERSION_EXTFLGBIT	| \
+			XFS_SB_VERSION_DIRV2BIT		| \
+			XFS_SB_VERSION_MOREBITSBIT)
+
+#define V5_FEAT_FLAGS	(XFS_SB_VERSION2_LAZYSBCOUNTBIT	| \
+			XFS_SB_VERSION2_ATTR2BIT	| \
+			XFS_SB_VERSION2_PROJID32BIT	| \
+			XFS_SB_VERSION2_CRCBIT)
+
+	if ((sbp->sb_versionnum & V5_VERS_FLAGS) != V5_VERS_FLAGS)
+		return false;
+	if ((sbp->sb_features2 & V5_FEAT_FLAGS) != V5_FEAT_FLAGS)
+		return false;
+	return true;
+}
+
 /*
  * We support all XFS versions newer than a v4 superblock with V2 directories.
  */
@@ -37,9 +78,19 @@ bool
 xfs_sb_good_version(
 	struct xfs_sb	*sbp)
 {
-	/* all v5 filesystems are supported */
+	/*
+	 * All v5 filesystems are supported, but we must check that all the
+	 * required v4 feature flags are enabled correctly as the code checks
+	 * those flags and not for v5 support.
+	 */
 	if (xfs_sb_is_v5(sbp))
-		return true;
+		return xfs_sb_validate_v5_features(sbp);
+
+	/* We must not have any unknown v4 feature bits set */
+	if ((sbp->sb_versionnum & ~XFS_SB_VERSION_OKBITS) ||
+	    ((sbp->sb_versionnum & XFS_SB_VERSION_MOREBITSBIT) &&
+	     (sbp->sb_features2 & ~XFS_SB_VERSION2_OKBITS)))
+		return false;
 
 	/* versions prior to v4 are not supported */
 	if (XFS_SB_VERSION_NUM(sbp) < XFS_SB_VERSION_4)
@@ -51,12 +102,6 @@ xfs_sb_good_version(
 	if (!(sbp->sb_versionnum & XFS_SB_VERSION_EXTFLGBIT))
 		return false;
 
-	/* And must not have any unknown v4 feature bits set */
-	if ((sbp->sb_versionnum & ~XFS_SB_VERSION_OKBITS) ||
-	    ((sbp->sb_versionnum & XFS_SB_VERSION_MOREBITSBIT) &&
-	     (sbp->sb_features2 & ~XFS_SB_VERSION2_OKBITS)))
-		return false;
-
 	/* It's a supported v4 filesystem */
 	return true;
 }
@@ -264,12 +309,15 @@ xfs_validate_sb_common(
 	bool			has_dalign;
 
 	if (!xfs_verify_magic(bp, dsb->sb_magicnum)) {
-		xfs_warn(mp, "bad magic number");
+		xfs_warn(mp,
+"Superblock has bad magic number 0x%x. Not an XFS filesystem?",
+			be32_to_cpu(dsb->sb_magicnum));
 		return -EWRONGFS;
 	}
 
 	if (!xfs_sb_good_version(sbp)) {
-		xfs_warn(mp, "bad version");
+		xfs_warn(mp,
+"Superblock has unknown features enabled or corrupted feature masks.");
 		return -EWRONGFS;
 	}
 
-- 
2.39.1.519.gcb327c4b5f-goog


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 5.15 CANDIDATE 06/10] xfs: avoid unnecessary runtime sibling pointer endian conversions
  2023-02-08 17:52 [PATCH 5.15 CANDIDATE 00/10] more xfs fixes for 5.15 Leah Rumancik
                   ` (4 preceding siblings ...)
  2023-02-08 17:52 ` [PATCH 5.15 CANDIDATE 05/10] xfs: validate v5 feature fields Leah Rumancik
@ 2023-02-08 17:52 ` Leah Rumancik
  2023-02-08 17:52 ` [PATCH 5.15 CANDIDATE 07/10] xfs: don't assert fail on perag references on teardown Leah Rumancik
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 19+ messages in thread
From: Leah Rumancik @ 2023-02-08 17:52 UTC (permalink / raw)
  To: linux-xfs
  Cc: amir73il, chandan.babu, Dave Chinner, kernel test robot,
	Darrick J . Wong, Christoph Hellwig, Dave Chinner, Leah Rumancik

From: Dave Chinner <dchinner@redhat.com>

[ Upstream commit 5672225e8f2a872a22b0cecedba7a6644af1fb84 ]

Commit dc04db2aa7c9 has caused a small aim7 regression, showing a
small increase in CPU usage in __xfs_btree_check_sblock() as a
result of the extra checking.

This is likely due to the endian conversion of the sibling poitners
being unconditional instead of relying on the compiler to endian
convert the NULL pointer at compile time and avoiding the runtime
conversion for this common case.

Rework the checks so that endian conversion of the sibling pointers
is only done if they are not null as the original code did.

.... and these need to be "inline" because the compiler completely
fails to inline them automatically like it should be doing.

$ size fs/xfs/libxfs/xfs_btree.o*
   text	   data	    bss	    dec	    hex	filename
  51874	    240	      0	  52114	   cb92 fs/xfs/libxfs/xfs_btree.o.orig
  51562	    240	      0	  51802	   ca5a fs/xfs/libxfs/xfs_btree.o.inline

Just when you think the tools have advanced sufficiently we don't
have to care about stuff like this anymore, along comes a reminder
that *our tools still suck*.

Fixes: dc04db2aa7c9 ("xfs: detect self referencing btree sibling pointers")
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
---
 fs/xfs/libxfs/xfs_btree.c | 47 +++++++++++++++++++++++++++------------
 1 file changed, 33 insertions(+), 14 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c
index 5bec048343b0..b4b5bf4bfed7 100644
--- a/fs/xfs/libxfs/xfs_btree.c
+++ b/fs/xfs/libxfs/xfs_btree.c
@@ -51,16 +51,31 @@ xfs_btree_magic(
 	return magic;
 }
 
-static xfs_failaddr_t
+/*
+ * These sibling pointer checks are optimised for null sibling pointers. This
+ * happens a lot, and we don't need to byte swap at runtime if the sibling
+ * pointer is NULL.
+ *
+ * These are explicitly marked at inline because the cost of calling them as
+ * functions instead of inlining them is about 36 bytes extra code per call site
+ * on x86-64. Yes, gcc-11 fails to inline them, and explicit inlining of these
+ * two sibling check functions reduces the compiled code size by over 300
+ * bytes.
+ */
+static inline xfs_failaddr_t
 xfs_btree_check_lblock_siblings(
 	struct xfs_mount	*mp,
 	struct xfs_btree_cur	*cur,
 	int			level,
 	xfs_fsblock_t		fsb,
-	xfs_fsblock_t		sibling)
+	__be64			dsibling)
 {
-	if (sibling == NULLFSBLOCK)
+	xfs_fsblock_t		sibling;
+
+	if (dsibling == cpu_to_be64(NULLFSBLOCK))
 		return NULL;
+
+	sibling = be64_to_cpu(dsibling);
 	if (sibling == fsb)
 		return __this_address;
 	if (level >= 0) {
@@ -74,17 +89,21 @@ xfs_btree_check_lblock_siblings(
 	return NULL;
 }
 
-static xfs_failaddr_t
+static inline xfs_failaddr_t
 xfs_btree_check_sblock_siblings(
 	struct xfs_mount	*mp,
 	struct xfs_btree_cur	*cur,
 	int			level,
 	xfs_agnumber_t		agno,
 	xfs_agblock_t		agbno,
-	xfs_agblock_t		sibling)
+	__be32			dsibling)
 {
-	if (sibling == NULLAGBLOCK)
+	xfs_agblock_t		sibling;
+
+	if (dsibling == cpu_to_be32(NULLAGBLOCK))
 		return NULL;
+
+	sibling = be32_to_cpu(dsibling);
 	if (sibling == agbno)
 		return __this_address;
 	if (level >= 0) {
@@ -136,10 +155,10 @@ __xfs_btree_check_lblock(
 		fsb = XFS_DADDR_TO_FSB(mp, xfs_buf_daddr(bp));
 
 	fa = xfs_btree_check_lblock_siblings(mp, cur, level, fsb,
-			be64_to_cpu(block->bb_u.l.bb_leftsib));
+			block->bb_u.l.bb_leftsib);
 	if (!fa)
 		fa = xfs_btree_check_lblock_siblings(mp, cur, level, fsb,
-				be64_to_cpu(block->bb_u.l.bb_rightsib));
+				block->bb_u.l.bb_rightsib);
 	return fa;
 }
 
@@ -204,10 +223,10 @@ __xfs_btree_check_sblock(
 	}
 
 	fa = xfs_btree_check_sblock_siblings(mp, cur, level, agno, agbno,
-			be32_to_cpu(block->bb_u.s.bb_leftsib));
+			block->bb_u.s.bb_leftsib);
 	if (!fa)
 		fa = xfs_btree_check_sblock_siblings(mp, cur, level, agno,
-				 agbno, be32_to_cpu(block->bb_u.s.bb_rightsib));
+				 agbno, block->bb_u.s.bb_rightsib);
 	return fa;
 }
 
@@ -4517,10 +4536,10 @@ xfs_btree_lblock_verify(
 	/* sibling pointer verification */
 	fsb = XFS_DADDR_TO_FSB(mp, xfs_buf_daddr(bp));
 	fa = xfs_btree_check_lblock_siblings(mp, NULL, -1, fsb,
-			be64_to_cpu(block->bb_u.l.bb_leftsib));
+			block->bb_u.l.bb_leftsib);
 	if (!fa)
 		fa = xfs_btree_check_lblock_siblings(mp, NULL, -1, fsb,
-				be64_to_cpu(block->bb_u.l.bb_rightsib));
+				block->bb_u.l.bb_rightsib);
 	return fa;
 }
 
@@ -4574,10 +4593,10 @@ xfs_btree_sblock_verify(
 	agno = xfs_daddr_to_agno(mp, xfs_buf_daddr(bp));
 	agbno = xfs_daddr_to_agbno(mp, xfs_buf_daddr(bp));
 	fa = xfs_btree_check_sblock_siblings(mp, NULL, -1, agno, agbno,
-			be32_to_cpu(block->bb_u.s.bb_leftsib));
+			block->bb_u.s.bb_leftsib);
 	if (!fa)
 		fa = xfs_btree_check_sblock_siblings(mp, NULL, -1, agno, agbno,
-				be32_to_cpu(block->bb_u.s.bb_rightsib));
+				block->bb_u.s.bb_rightsib);
 	return fa;
 }
 
-- 
2.39.1.519.gcb327c4b5f-goog


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 5.15 CANDIDATE 07/10] xfs: don't assert fail on perag references on teardown
  2023-02-08 17:52 [PATCH 5.15 CANDIDATE 00/10] more xfs fixes for 5.15 Leah Rumancik
                   ` (5 preceding siblings ...)
  2023-02-08 17:52 ` [PATCH 5.15 CANDIDATE 06/10] xfs: avoid unnecessary runtime sibling pointer endian conversions Leah Rumancik
@ 2023-02-08 17:52 ` Leah Rumancik
  2023-02-08 17:52 ` [PATCH 5.15 CANDIDATE 08/10] xfs: assert in xfs_btree_del_cursor should take into account error Leah Rumancik
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 19+ messages in thread
From: Leah Rumancik @ 2023-02-08 17:52 UTC (permalink / raw)
  To: linux-xfs
  Cc: amir73il, chandan.babu, Dave Chinner, Darrick J . Wong,
	Christoph Hellwig, Dave Chinner, Leah Rumancik

From: Dave Chinner <dchinner@redhat.com>

[ Upstream commit 5b55cbc2d72632e874e50d2e36bce608e55aaaea ]

Not fatal, the assert is there to catch developer attention. I'm
seeing this occasionally during recoveryloop testing after a
shutdown, and I don't want this to stop an overnight recoveryloop
run as it is currently doing.

Convert the ASSERT to a XFS_IS_CORRUPT() check so it will dump a
corruption report into the log and cause a test failure that way,
but it won't stop the machine dead.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
---
 fs/xfs/libxfs/xfs_ag.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_ag.c b/fs/xfs/libxfs/xfs_ag.c
index 005abfd9fd34..aff6fb5281f6 100644
--- a/fs/xfs/libxfs/xfs_ag.c
+++ b/fs/xfs/libxfs/xfs_ag.c
@@ -173,7 +173,6 @@ __xfs_free_perag(
 	struct xfs_perag *pag = container_of(head, struct xfs_perag, rcu_head);
 
 	ASSERT(!delayed_work_pending(&pag->pag_blockgc_work));
-	ASSERT(atomic_read(&pag->pag_ref) == 0);
 	kmem_free(pag);
 }
 
@@ -192,7 +191,7 @@ xfs_free_perag(
 		pag = radix_tree_delete(&mp->m_perag_tree, agno);
 		spin_unlock(&mp->m_perag_lock);
 		ASSERT(pag);
-		ASSERT(atomic_read(&pag->pag_ref) == 0);
+		XFS_IS_CORRUPT(pag->pag_mount, atomic_read(&pag->pag_ref) != 0);
 
 		cancel_delayed_work_sync(&pag->pag_blockgc_work);
 		xfs_iunlink_destroy(pag);
-- 
2.39.1.519.gcb327c4b5f-goog


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 5.15 CANDIDATE 08/10] xfs: assert in xfs_btree_del_cursor should take into account error
  2023-02-08 17:52 [PATCH 5.15 CANDIDATE 00/10] more xfs fixes for 5.15 Leah Rumancik
                   ` (6 preceding siblings ...)
  2023-02-08 17:52 ` [PATCH 5.15 CANDIDATE 07/10] xfs: don't assert fail on perag references on teardown Leah Rumancik
@ 2023-02-08 17:52 ` Leah Rumancik
  2023-02-08 17:52 ` [PATCH 5.15 CANDIDATE 09/10] xfs: purge dquots after inode walk fails during quotacheck Leah Rumancik
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 19+ messages in thread
From: Leah Rumancik @ 2023-02-08 17:52 UTC (permalink / raw)
  To: linux-xfs
  Cc: amir73il, chandan.babu, Dave Chinner, Darrick J . Wong,
	Christoph Hellwig, Dave Chinner, Leah Rumancik

From: Dave Chinner <dchinner@redhat.com>

[ Upstream commit 56486f307100e8fc66efa2ebd8a71941fa10bf6f ]

xfs/538 on a 1kB block filesystem failed with this assert:

XFS: Assertion failed: cur->bc_btnum != XFS_BTNUM_BMAP || cur->bc_ino.allocated == 0 || xfs_is_shutdown(cur->bc_mp), file: fs/xfs/libxfs/xfs_btree.c, line: 448

The problem was that an allocation failed unexpectedly in
xfs_bmbt_alloc_block() after roughly 150,000 minlen allocation error
injections, resulting in an EFSCORRUPTED error being returned to
xfs_bmapi_write(). The error occurred on extent-to-btree format
conversion allocating the new root block:

 RIP: 0010:xfs_bmbt_alloc_block+0x177/0x210
 Call Trace:
  <TASK>
  xfs_btree_new_iroot+0xdf/0x520
  xfs_btree_make_block_unfull+0x10d/0x1c0
  xfs_btree_insrec+0x364/0x790
  xfs_btree_insert+0xaa/0x210
  xfs_bmap_add_extent_hole_real+0x1fe/0x9a0
  xfs_bmapi_allocate+0x34c/0x420
  xfs_bmapi_write+0x53c/0x9c0
  xfs_alloc_file_space+0xee/0x320
  xfs_file_fallocate+0x36b/0x450
  vfs_fallocate+0x148/0x340
  __x64_sys_fallocate+0x3c/0x70
  do_syscall_64+0x35/0x80
  entry_SYSCALL_64_after_hwframe+0x44/0xa

Why the allocation failed at this point is unknown, but is likely
that we ran the transaction out of reserved space and filesystem out
of space with bmbt blocks because of all the minlen allocations
being done causing worst case fragmentation of a large allocation.

Regardless of the cause, we've then called xfs_bmapi_finish() which
calls xfs_btree_del_cursor(cur, error) to tear down the cursor.

So we have a failed operation, error != 0, cur->bc_ino.allocated > 0
and the filesystem is still up. The assert fails to take into
account that allocation can fail with an error and the transaction
teardown will shut the filesystem down if necessary. i.e. the
assert needs to check "|| error != 0" as well, because at this point
shutdown is pending because the current transaction is dirty....

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
---
 fs/xfs/libxfs/xfs_btree.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c
index b4b5bf4bfed7..482a4ccc6568 100644
--- a/fs/xfs/libxfs/xfs_btree.c
+++ b/fs/xfs/libxfs/xfs_btree.c
@@ -445,8 +445,14 @@ xfs_btree_del_cursor(
 			break;
 	}
 
+	/*
+	 * If we are doing a BMBT update, the number of unaccounted blocks
+	 * allocated during this cursor life time should be zero. If it's not
+	 * zero, then we should be shut down or on our way to shutdown due to
+	 * cancelling a dirty transaction on error.
+	 */
 	ASSERT(cur->bc_btnum != XFS_BTNUM_BMAP || cur->bc_ino.allocated == 0 ||
-	       xfs_is_shutdown(cur->bc_mp));
+	       xfs_is_shutdown(cur->bc_mp) || error != 0);
 	if (unlikely(cur->bc_flags & XFS_BTREE_STAGING))
 		kmem_free(cur->bc_ops);
 	if (!(cur->bc_flags & XFS_BTREE_LONG_PTRS) && cur->bc_ag.pag)
-- 
2.39.1.519.gcb327c4b5f-goog


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 5.15 CANDIDATE 09/10] xfs: purge dquots after inode walk fails during quotacheck
  2023-02-08 17:52 [PATCH 5.15 CANDIDATE 00/10] more xfs fixes for 5.15 Leah Rumancik
                   ` (7 preceding siblings ...)
  2023-02-08 17:52 ` [PATCH 5.15 CANDIDATE 08/10] xfs: assert in xfs_btree_del_cursor should take into account error Leah Rumancik
@ 2023-02-08 17:52 ` Leah Rumancik
  2023-02-08 17:52 ` [PATCH 5.15 CANDIDATE 10/10] xfs: don't leak btree cursor when insrec fails after a split Leah Rumancik
  2023-02-08 19:02 ` [PATCH 5.15 CANDIDATE 00/10] more xfs fixes for 5.15 Amir Goldstein
  10 siblings, 0 replies; 19+ messages in thread
From: Leah Rumancik @ 2023-02-08 17:52 UTC (permalink / raw)
  To: linux-xfs
  Cc: amir73il, chandan.babu, Darrick J. Wong, Christoph Hellwig,
	Dave Chinner, Dave Chinner, Leah Rumancik

From: "Darrick J. Wong" <djwong@kernel.org>

[ Upstream commit 86d40f1e49e9a909d25c35ba01bea80dbcd758cb ]

xfs/434 and xfs/436 have been reporting occasional memory leaks of
xfs_dquot objects.  These tests themselves were the messenger, not the
culprit, since they unload the xfs module, which trips the slub
debugging code while tearing down all the xfs slab caches:

=============================================================================
BUG xfs_dquot (Tainted: G        W        ): Objects remaining in xfs_dquot on __kmem_cache_shutdown()
-----------------------------------------------------------------------------

Slab 0xffffea000606de00 objects=30 used=5 fp=0xffff888181b78a78 flags=0x17ff80000010200(slab|head|node=0|zone=2|lastcpupid=0xfff)
CPU: 0 PID: 3953166 Comm: modprobe Tainted: G        W         5.18.0-rc6-djwx #rc6 d5824be9e46a2393677bda868f9b154d917ca6a7
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20171121_152543-x86-ol7-builder-01.us.oracle.com-4.el7.1 04/01/2014

Since we don't generally rmmod the xfs module between fstests, this
means that xfs/434 is really just the canary in the coal mine --
something leaked a dquot, but we don't know who.  After days of pounding
on fstests with kmemleak enabled, I finally got it to spit this out:

unreferenced object 0xffff8880465654c0 (size 536):
  comm "u10:4", pid 88, jiffies 4294935810 (age 29.512s)
  hex dump (first 32 bytes):
    60 4a 56 46 80 88 ff ff 58 ea e4 5c 80 88 ff ff  `JVF....X..\....
    00 e0 52 49 80 88 ff ff 01 00 01 00 00 00 00 00  ..RI............
  backtrace:
    [<ffffffffa0740f6c>] xfs_dquot_alloc+0x2c/0x530 [xfs]
    [<ffffffffa07443df>] xfs_qm_dqread+0x6f/0x330 [xfs]
    [<ffffffffa07462a2>] xfs_qm_dqget+0x132/0x4e0 [xfs]
    [<ffffffffa0756bb0>] xfs_qm_quotacheck_dqadjust+0xa0/0x3e0 [xfs]
    [<ffffffffa075724d>] xfs_qm_dqusage_adjust+0x35d/0x4f0 [xfs]
    [<ffffffffa06c9068>] xfs_iwalk_ag_recs+0x348/0x5d0 [xfs]
    [<ffffffffa06c95d3>] xfs_iwalk_run_callbacks+0x273/0x540 [xfs]
    [<ffffffffa06c9e8d>] xfs_iwalk_ag+0x5ed/0x890 [xfs]
    [<ffffffffa06ca22f>] xfs_iwalk_ag_work+0xff/0x170 [xfs]
    [<ffffffffa06d22c9>] xfs_pwork_work+0x79/0x130 [xfs]
    [<ffffffff81170bb2>] process_one_work+0x672/0x1040
    [<ffffffff81171b1b>] worker_thread+0x59b/0xec0
    [<ffffffff8118711e>] kthread+0x29e/0x340
    [<ffffffff810032bf>] ret_from_fork+0x1f/0x30

Now we know that quotacheck is at fault, but even this report was
canaryish -- it was triggered by xfs/494, which doesn't actually mount
any filesystems.  (kmemleak can be a little slow to notice leaks, even
with fstests repeatedly whacking it to look for them.)  Looking at the
*previous* fstest, however, showed that the test run before xfs/494 was
xfs/117.  The tipoff to the problem is in this excerpt from dmesg:

XFS (sda4): Quotacheck needed: Please wait.
XFS (sda4): Metadata corruption detected at xfs_dinode_verify.part.0+0xdb/0x7b0 [xfs], inode 0x119 dinode
XFS (sda4): Unmount and run xfs_repair
XFS (sda4): First 128 bytes of corrupted metadata buffer:
00000000: 49 4e 81 a4 03 02 00 00 00 00 00 00 00 00 00 00  IN..............
00000010: 00 00 00 01 00 00 00 00 00 90 57 54 54 1a 4c 68  ..........WTT.Lh
00000020: 81 f9 7d e1 6d ee 16 00 34 bd 7d e1 6d ee 16 00  ..}.m...4.}.m...
00000030: 34 bd 7d e1 6d ee 16 00 00 00 00 00 00 00 00 00  4.}.m...........
00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000050: 00 00 00 02 00 00 00 00 00 00 00 00 96 80 f3 ab  ................
00000060: ff ff ff ff da 57 7b 11 00 00 00 00 00 00 00 03  .....W{.........
00000070: 00 00 00 01 00 00 00 10 00 00 00 00 00 00 00 08  ................
XFS (sda4): Quotacheck: Unsuccessful (Error -117): Disabling quotas.

The dinode verifier decided that the inode was corrupt, which causes
iget to return with EFSCORRUPTED.  Since this happened during
quotacheck, it is obvious that the kernel aborted the inode walk on
account of the corruption error and disabled quotas.  Unfortunately, we
neglect to purge the dquot cache before doing that, which is how the
dquots leaked.

The problems started 10 years ago in commit b84a3a, when the dquot lists
were converted to a radix tree, but the error handling behavior was not
correctly preserved -- in that commit, if the bulkstat failed and
usrquota was enabled, the bulkstat failure code would be overwritten by
the result of flushing all the dquots to disk.  As long as that
succeeds, we'd continue the quota mount as if everything were ok, but
instead we're now operating with a corrupt inode and incorrect quota
usage counts.  I didn't notice this bug in 2019 when I wrote commit
ebd126a, which changed quotacheck to skip the dqflush when the scan
doesn't complete due to inode walk failures.

Introduced-by: b84a3a96751f ("xfs: remove the per-filesystem list of dquots")
Fixes: ebd126a651f8 ("xfs: convert quotacheck to use the new iwalk functions")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
---
 fs/xfs/xfs_qm.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
index 5608066d6e53..623244650a2f 100644
--- a/fs/xfs/xfs_qm.c
+++ b/fs/xfs/xfs_qm.c
@@ -1317,8 +1317,15 @@ xfs_qm_quotacheck(
 
 	error = xfs_iwalk_threaded(mp, 0, 0, xfs_qm_dqusage_adjust, 0, true,
 			NULL);
-	if (error)
+	if (error) {
+		/*
+		 * The inode walk may have partially populated the dquot
+		 * caches.  We must purge them before disabling quota and
+		 * tearing down the quotainfo, or else the dquots will leak.
+		 */
+		xfs_qm_dqpurge_all(mp);
 		goto error_return;
+	}
 
 	/*
 	 * We've made all the changes that we need to make incore.  Flush them
-- 
2.39.1.519.gcb327c4b5f-goog


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 5.15 CANDIDATE 10/10] xfs: don't leak btree cursor when insrec fails after a split
  2023-02-08 17:52 [PATCH 5.15 CANDIDATE 00/10] more xfs fixes for 5.15 Leah Rumancik
                   ` (8 preceding siblings ...)
  2023-02-08 17:52 ` [PATCH 5.15 CANDIDATE 09/10] xfs: purge dquots after inode walk fails during quotacheck Leah Rumancik
@ 2023-02-08 17:52 ` Leah Rumancik
  2023-02-08 19:02 ` [PATCH 5.15 CANDIDATE 00/10] more xfs fixes for 5.15 Amir Goldstein
  10 siblings, 0 replies; 19+ messages in thread
From: Leah Rumancik @ 2023-02-08 17:52 UTC (permalink / raw)
  To: linux-xfs
  Cc: amir73il, chandan.babu, Darrick J. Wong, Christoph Hellwig,
	Dave Chinner, Dave Chinner, Leah Rumancik

From: "Darrick J. Wong" <djwong@kernel.org>

[ Upstream commit a54f78def73d847cb060b18c4e4a3d1d26c9ca6d ]

The recent patch to improve btree cycle checking caused a regression
when I rebased the in-memory btree branch atop the 5.19 for-next branch,
because in-memory short-pointer btrees do not have AG numbers.  This
produced the following complaint from kmemleak:

unreferenced object 0xffff88803d47dde8 (size 264):
  comm "xfs_io", pid 4889, jiffies 4294906764 (age 24.072s)
  hex dump (first 32 bytes):
    90 4d 0b 0f 80 88 ff ff 00 a0 bd 05 80 88 ff ff  .M..............
    e0 44 3a a0 ff ff ff ff 00 df 08 06 80 88 ff ff  .D:.............
  backtrace:
    [<ffffffffa0388059>] xfbtree_dup_cursor+0x49/0xc0 [xfs]
    [<ffffffffa029887b>] xfs_btree_dup_cursor+0x3b/0x200 [xfs]
    [<ffffffffa029af5d>] __xfs_btree_split+0x6ad/0x820 [xfs]
    [<ffffffffa029b130>] xfs_btree_split+0x60/0x110 [xfs]
    [<ffffffffa029f6da>] xfs_btree_make_block_unfull+0x19a/0x1f0 [xfs]
    [<ffffffffa029fada>] xfs_btree_insrec+0x3aa/0x810 [xfs]
    [<ffffffffa029fff3>] xfs_btree_insert+0xb3/0x240 [xfs]
    [<ffffffffa02cb729>] xfs_rmap_insert+0x99/0x200 [xfs]
    [<ffffffffa02cf142>] xfs_rmap_map_shared+0x192/0x5f0 [xfs]
    [<ffffffffa02cf60b>] xfs_rmap_map_raw+0x6b/0x90 [xfs]
    [<ffffffffa0384a85>] xrep_rmap_stash+0xd5/0x1d0 [xfs]
    [<ffffffffa0384dc0>] xrep_rmap_visit_bmbt+0xa0/0xf0 [xfs]
    [<ffffffffa0384fb6>] xrep_rmap_scan_iext+0x56/0xa0 [xfs]
    [<ffffffffa03850d8>] xrep_rmap_scan_ifork+0xd8/0x160 [xfs]
    [<ffffffffa0385195>] xrep_rmap_scan_inode+0x35/0x80 [xfs]
    [<ffffffffa03852ee>] xrep_rmap_find_rmaps+0x10e/0x270 [xfs]

I noticed that xfs_btree_insrec has a bunch of debug code that return
out of the function immediately, without freeing the "new" btree cursor
that can be returned when _make_block_unfull calls xfs_btree_split.  Fix
the error return in this function to free the btree cursor.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
---
 fs/xfs/libxfs/xfs_btree.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c
index 482a4ccc6568..dffe4ca58493 100644
--- a/fs/xfs/libxfs/xfs_btree.c
+++ b/fs/xfs/libxfs/xfs_btree.c
@@ -3266,7 +3266,7 @@ xfs_btree_insrec(
 	struct xfs_btree_block	*block;	/* btree block */
 	struct xfs_buf		*bp;	/* buffer for block */
 	union xfs_btree_ptr	nptr;	/* new block ptr */
-	struct xfs_btree_cur	*ncur;	/* new btree cursor */
+	struct xfs_btree_cur	*ncur = NULL;	/* new btree cursor */
 	union xfs_btree_key	nkey;	/* new block key */
 	union xfs_btree_key	*lkey;
 	int			optr;	/* old key/record index */
@@ -3346,7 +3346,7 @@ xfs_btree_insrec(
 #ifdef DEBUG
 	error = xfs_btree_check_block(cur, block, level, bp);
 	if (error)
-		return error;
+		goto error0;
 #endif
 
 	/*
@@ -3366,7 +3366,7 @@ xfs_btree_insrec(
 		for (i = numrecs - ptr; i >= 0; i--) {
 			error = xfs_btree_debug_check_ptr(cur, pp, i, level);
 			if (error)
-				return error;
+				goto error0;
 		}
 
 		xfs_btree_shift_keys(cur, kp, 1, numrecs - ptr + 1);
@@ -3451,6 +3451,8 @@ xfs_btree_insrec(
 	return 0;
 
 error0:
+	if (ncur)
+		xfs_btree_del_cursor(ncur, error);
 	return error;
 }
 
-- 
2.39.1.519.gcb327c4b5f-goog


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH 5.15 CANDIDATE 00/10] more xfs fixes for 5.15
  2023-02-08 17:52 [PATCH 5.15 CANDIDATE 00/10] more xfs fixes for 5.15 Leah Rumancik
                   ` (9 preceding siblings ...)
  2023-02-08 17:52 ` [PATCH 5.15 CANDIDATE 10/10] xfs: don't leak btree cursor when insrec fails after a split Leah Rumancik
@ 2023-02-08 19:02 ` Amir Goldstein
  2023-02-08 19:40   ` Darrick J. Wong
  2023-02-10 13:11   ` Chandan Babu R
  10 siblings, 2 replies; 19+ messages in thread
From: Amir Goldstein @ 2023-02-08 19:02 UTC (permalink / raw)
  To: Leah Rumancik; +Cc: linux-xfs, chandan.babu, Christian Brauner

On Wed, Feb 8, 2023 at 7:52 PM Leah Rumancik <leah.rumancik@gmail.com> wrote:
>
> Hello again,
>
> Here is the next batch of backports for 5.15.y. Testing included
> 25 runs of auto group on 12 xfs configs. No regressions were seen.
> I checked xfs/538 was run without issue as this test was mentioned
> in 56486f307100. Also, from 86d40f1e49e9, I ran ran xfs/117 with
> XFS compiled as a module and TEST_FS_MODULE_REOLOAD set, but I was
> unable to reproduce the issue.

Did you find any tests that started to pass or whose failure rate reduced?

There are very few Fixes: annotations in these commits so it is hard for
me to assess if any of them are relevant/important/worth the effort/risk
to backport to 5.10.

Unless I know of reproducible bugs in 5.10, I don't think I will invest
in backporting this series to 5.10.
Chandan, if you find any of these fixes relevant and important for 5.4
let me know and I will make the effort to consider them for 5.10.

Leah, please consider working on the SGID bug fixes for the next 5.15
update, because my 5.10 SGID fixes series [1] has been blocked for
months and because there are several reproducible test cases in xfstest.

I did not push on that until now because SGID test expectations were
a moving target, but since xfstests commit 81e6f628 ("generic: update
setgid tests") in this week's xfstests release, I think that tests should be
stable and we can finally start backporting all relevant SGID fixes to
align the SGID behavior of LTS kernels with that of upstream.

Thanks,
Amir.

[1] https://github.com/amir73il/linux/commits/xfs-5.10.y-sgid-fixes

>
> Below I've outlined which series the backports came from:
>
> series "xfs: intent whiteouts" (1):
> [01/10] cb512c921639613ce03f87e62c5e93ed9fe8c84d
>     xfs: zero inode fork buffer at allocation
> [02/10] c230a4a85bcdbfc1a7415deec6caf04e8fca1301
>     xfs: fix potential log item leak
>
> series "xfs: fix random format verification issues" (2):
> [1/4] dc04db2aa7c9307e740d6d0e173085301c173b1a
>     xfs: detect self referencing btree sibling pointers
> [2/4] 1eb70f54c445fcbb25817841e774adb3d912f3e8 -> already in 5.15.y
>     xfs: validate inode fork size against fork format
> [3/4] dd0d2f9755191690541b09e6385d0f8cd8bc9d8f
>     xfs: set XFS_FEAT_NLINK correctly
> [4/4] f0f5f658065a5af09126ec892e4c383540a1c77f
>     xfs: validate v5 feature fields
>
> series "xfs: small fixes for 5.19 cycle" (3):
> [1/3] 5672225e8f2a872a22b0cecedba7a6644af1fb84
>     xfs: avoid unnecessary runtime sibling pointer endian conversions
> [2/3] 5b55cbc2d72632e874e50d2e36bce608e55aaaea
>     fs: don't assert fail on perag references on teardown
> [2/3] 56486f307100e8fc66efa2ebd8a71941fa10bf6f
>     xfs: assert in xfs_btree_del_cursor should take into account error
>
> series "xfs: random fixes for 5.19" (4):
> [1/2] 86d40f1e49e9a909d25c35ba01bea80dbcd758cb
>     xfs: purge dquots after inode walk fails during quotacheck
> [2/2] a54f78def73d847cb060b18c4e4a3d1d26c9ca6d
>     xfs: don't leak btree cursor when insrec fails after a split
>
> (1) https://lore.kernel.org/all/20220503221728.185449-1-david@fromorbit.com/
> (2) https://lore.kernel.org/all/20220502082018.1076561-1-david@fromorbit.com/
> (3) https://lore.kernel.org/all/20220524022158.1849458-1-david@fromorbit.com/
> (4) https://lore.kernel.org/all/165337056527.993079.1232300816023906959.stgit@magnolia/
>
> Darrick J. Wong (2):
>   xfs: purge dquots after inode walk fails during quotacheck
>   xfs: don't leak btree cursor when insrec fails after a split
>
> Dave Chinner (8):
>   xfs: zero inode fork buffer at allocation
>   xfs: fix potential log item leak
>   xfs: detect self referencing btree sibling pointers
>   xfs: set XFS_FEAT_NLINK correctly
>   xfs: validate v5 feature fields
>   xfs: avoid unnecessary runtime sibling pointer endian conversions
>   xfs: don't assert fail on perag references on teardown
>   xfs: assert in xfs_btree_del_cursor should take into account error
>
>  fs/xfs/libxfs/xfs_ag.c         |   3 +-
>  fs/xfs/libxfs/xfs_btree.c      | 175 +++++++++++++++++++++++++--------
>  fs/xfs/libxfs/xfs_inode_fork.c |  12 ++-
>  fs/xfs/libxfs/xfs_sb.c         |  70 +++++++++++--
>  fs/xfs/xfs_bmap_item.c         |   2 +
>  fs/xfs/xfs_icreate_item.c      |   1 +
>  fs/xfs/xfs_qm.c                |   9 +-
>  fs/xfs/xfs_refcount_item.c     |   2 +
>  fs/xfs/xfs_rmap_item.c         |   2 +
>  9 files changed, 221 insertions(+), 55 deletions(-)
>
> --
> 2.39.1.519.gcb327c4b5f-goog
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 5.15 CANDIDATE 00/10] more xfs fixes for 5.15
  2023-02-08 19:02 ` [PATCH 5.15 CANDIDATE 00/10] more xfs fixes for 5.15 Amir Goldstein
@ 2023-02-08 19:40   ` Darrick J. Wong
  2023-02-08 22:21     ` Dave Chinner
  2023-02-10 19:51     ` Leah Rumancik
  2023-02-10 13:11   ` Chandan Babu R
  1 sibling, 2 replies; 19+ messages in thread
From: Darrick J. Wong @ 2023-02-08 19:40 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Leah Rumancik, linux-xfs, chandan.babu, Christian Brauner

On Wed, Feb 08, 2023 at 09:02:58PM +0200, Amir Goldstein wrote:
> On Wed, Feb 8, 2023 at 7:52 PM Leah Rumancik <leah.rumancik@gmail.com> wrote:
> >
> > Hello again,
> >
> > Here is the next batch of backports for 5.15.y. Testing included
> > 25 runs of auto group on 12 xfs configs. No regressions were seen.
> > I checked xfs/538 was run without issue as this test was mentioned
> > in 56486f307100. Also, from 86d40f1e49e9, I ran ran xfs/117 with
> > XFS compiled as a module and TEST_FS_MODULE_REOLOAD set, but I was
> > unable to reproduce the issue.
> 
> Did you find any tests that started to pass or whose failure rate reduced?

I wish Leah had, but there basically aren't any tests for the problems
fixed in this set for her to find. :(

The first two patches I think were from when Dave was working on log
intent whiteouts, turned on KASAN to diagnose some other problem he had,
and began pulling on the ball of string (as it were) as he noticed other
things in the codebase.  We don't usually bother with regression tests
for kernel memory leaks, since they're not so easy to reproduce.

Patches 3-6 are fixes for a rash of fuzzer reports that someone in China
posted last May:
https://bugzilla.kernel.org/show_bug.cgi?id=215927

(There are more than just that one)

As usual, the submitter didn't bother to help triage and just dumped a
ton of work in our laps.  They didn't follow up with any regression
tests, because few fuzz kiddiez ever do.  At the time, I was too burned
out to deal with it, so Dave posted fixes.

Patches 7-8 would manifest themselves as test VMs halting on ASSERTs if
you configure your kernel to panic.  Not strictly needed since most LTS
kernels probably don't even have XFS_DEBUG=y, but it makes the lives of
recoveryloop runners easier if they do.

Patch 9 trips xfs/434 and xfs/436, but they only run if you have slab
debugging enabled and build XFS as a module.  I don't know why very few
people do this.

Patch 10 is a memory leak if you have XFS_DEBUG=y.  No need for a
separate test for this one, since kmemleak catches it.  If you turn it
on.

(IOWs, LGTM for the whole set.)

> There are very few Fixes: annotations in these commits so it is hard for
> me to assess if any of them are relevant/important/worth the effort/risk
> to backport to 5.10.

<nod> Dave's fixpatches rarely have Fixes tags attached.  It's difficult
to get him to do that because he has so much bad history with AUTOSEL.
I've tried to get him to add them in the past, but if I'm already
stressed out and Dave doesn't reply then I tend to merge the fix and
move on.

> Unless I know of reproducible bugs in 5.10, I don't think I will invest
> in backporting this series to 5.10.
> Chandan, if you find any of these fixes relevant and important for 5.4
> let me know and I will make the effort to consider them for 5.10.
> 
> Leah, please consider working on the SGID bug fixes for the next 5.15
> update, because my 5.10 SGID fixes series [1] has been blocked for
> months and because there are several reproducible test cases in xfstest.

Whenever y'all get to 6.0, beware of commit 2ed5b09b3e8f ("xfs: make
inode attribute forks a permanent part of struct xfs_inode"), which is a
KASAN UAF that we fixed by eliminating the 'F'.  Do not pull on the
devilstring ("...the file capabilities code calls getxattr with and
without i_rwsem held...") if you can avoid it.

> I did not push on that until now because SGID test expectations were
> a moving target, but since xfstests commit 81e6f628 ("generic: update
> setgid tests") in this week's xfstests release, I think that tests should be
> stable and we can finally start backporting all relevant SGID fixes to
> align the SGID behavior of LTS kernels with that of upstream.

Oh good, I've been (gently) waiting on that one too. :)

--D

> Thanks,
> Amir.
> 
> [1] https://github.com/amir73il/linux/commits/xfs-5.10.y-sgid-fixes
> 
> >
> > Below I've outlined which series the backports came from:
> >
> > series "xfs: intent whiteouts" (1):
> > [01/10] cb512c921639613ce03f87e62c5e93ed9fe8c84d
> >     xfs: zero inode fork buffer at allocation
> > [02/10] c230a4a85bcdbfc1a7415deec6caf04e8fca1301
> >     xfs: fix potential log item leak
> >
> > series "xfs: fix random format verification issues" (2):
> > [1/4] dc04db2aa7c9307e740d6d0e173085301c173b1a
> >     xfs: detect self referencing btree sibling pointers
> > [2/4] 1eb70f54c445fcbb25817841e774adb3d912f3e8 -> already in 5.15.y
> >     xfs: validate inode fork size against fork format
> > [3/4] dd0d2f9755191690541b09e6385d0f8cd8bc9d8f
> >     xfs: set XFS_FEAT_NLINK correctly
> > [4/4] f0f5f658065a5af09126ec892e4c383540a1c77f
> >     xfs: validate v5 feature fields
> >
> > series "xfs: small fixes for 5.19 cycle" (3):
> > [1/3] 5672225e8f2a872a22b0cecedba7a6644af1fb84
> >     xfs: avoid unnecessary runtime sibling pointer endian conversions
> > [2/3] 5b55cbc2d72632e874e50d2e36bce608e55aaaea
> >     fs: don't assert fail on perag references on teardown
> > [2/3] 56486f307100e8fc66efa2ebd8a71941fa10bf6f
> >     xfs: assert in xfs_btree_del_cursor should take into account error
> >
> > series "xfs: random fixes for 5.19" (4):
> > [1/2] 86d40f1e49e9a909d25c35ba01bea80dbcd758cb
> >     xfs: purge dquots after inode walk fails during quotacheck
> > [2/2] a54f78def73d847cb060b18c4e4a3d1d26c9ca6d
> >     xfs: don't leak btree cursor when insrec fails after a split
> >
> > (1) https://lore.kernel.org/all/20220503221728.185449-1-david@fromorbit.com/
> > (2) https://lore.kernel.org/all/20220502082018.1076561-1-david@fromorbit.com/
> > (3) https://lore.kernel.org/all/20220524022158.1849458-1-david@fromorbit.com/
> > (4) https://lore.kernel.org/all/165337056527.993079.1232300816023906959.stgit@magnolia/
> >
> > Darrick J. Wong (2):
> >   xfs: purge dquots after inode walk fails during quotacheck
> >   xfs: don't leak btree cursor when insrec fails after a split
> >
> > Dave Chinner (8):
> >   xfs: zero inode fork buffer at allocation
> >   xfs: fix potential log item leak
> >   xfs: detect self referencing btree sibling pointers
> >   xfs: set XFS_FEAT_NLINK correctly
> >   xfs: validate v5 feature fields
> >   xfs: avoid unnecessary runtime sibling pointer endian conversions
> >   xfs: don't assert fail on perag references on teardown
> >   xfs: assert in xfs_btree_del_cursor should take into account error
> >
> >  fs/xfs/libxfs/xfs_ag.c         |   3 +-
> >  fs/xfs/libxfs/xfs_btree.c      | 175 +++++++++++++++++++++++++--------
> >  fs/xfs/libxfs/xfs_inode_fork.c |  12 ++-
> >  fs/xfs/libxfs/xfs_sb.c         |  70 +++++++++++--
> >  fs/xfs/xfs_bmap_item.c         |   2 +
> >  fs/xfs/xfs_icreate_item.c      |   1 +
> >  fs/xfs/xfs_qm.c                |   9 +-
> >  fs/xfs/xfs_refcount_item.c     |   2 +
> >  fs/xfs/xfs_rmap_item.c         |   2 +
> >  9 files changed, 221 insertions(+), 55 deletions(-)
> >
> > --
> > 2.39.1.519.gcb327c4b5f-goog
> >

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 5.15 CANDIDATE 00/10] more xfs fixes for 5.15
  2023-02-08 19:40   ` Darrick J. Wong
@ 2023-02-08 22:21     ` Dave Chinner
  2023-02-09  8:08       ` Amir Goldstein
  2023-02-10 19:51     ` Leah Rumancik
  1 sibling, 1 reply; 19+ messages in thread
From: Dave Chinner @ 2023-02-08 22:21 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Amir Goldstein, Leah Rumancik, linux-xfs, chandan.babu,
	Christian Brauner

On Wed, Feb 08, 2023 at 11:40:59AM -0800, Darrick J. Wong wrote:
> On Wed, Feb 08, 2023 at 09:02:58PM +0200, Amir Goldstein wrote:
> > On Wed, Feb 8, 2023 at 7:52 PM Leah Rumancik <leah.rumancik@gmail.com> wrote:
> > >
> > > Hello again,
> > >
> > > Here is the next batch of backports for 5.15.y. Testing included
> > > 25 runs of auto group on 12 xfs configs. No regressions were seen.
> > > I checked xfs/538 was run without issue as this test was mentioned
> > > in 56486f307100. Also, from 86d40f1e49e9, I ran ran xfs/117 with
> > > XFS compiled as a module and TEST_FS_MODULE_REOLOAD set, but I was
> > > unable to reproduce the issue.
> > 
> > Did you find any tests that started to pass or whose failure rate reduced?
> 
> I wish Leah had, but there basically aren't any tests for the problems
> fixed in this set for her to find. :(
.....
> > There are very few Fixes: annotations in these commits so it is hard for
> > me to assess if any of them are relevant/important/worth the effort/risk
> > to backport to 5.10.
> 
> <nod> Dave's fixpatches rarely have Fixes tags attached.  It's difficult
> to get him to do that because he has so much bad history with AUTOSEL.
> I've tried to get him to add them in the past, but if I'm already
> stressed out and Dave doesn't reply then I tend to merge the fix and
> move on.

In my defense, all the "fixes" from me in this series (except for
the one with a fixes tag on it) date back so long ago it was
difficult to identify what commit actually introduced the issue.
Once we're talking about "it's been there for at least a decade" -
espcially for fuzzer issues - identifying the exact commit is time
consuming and often not possible, nor really useful for anything.

I'm also not going to tag a patch with "fixes commit xyz" when
commit xyz isn't actually the cause of the problem just so that
someone can blindly use that as a "it's got a fixes tag on it, we
should back port it" trigger.

That's the whole problem with AUTOSEL - blindly applying anything
with a fixes tag on it that merges cleanly into an older kernel -
and the whole point of having a human actually manage the stable
kernel backports.

The stable XFS kernel maintainer is supposed to be actively looking
at the commits that go into the upstream kernel to determine if they
are relevant or not to the given stable kernel, regardless of
whether they address fstests failures, have fixes/stable tags on
them, etc. If all we needed stable maintainers to do is turn a crank
handle, then we'd be perfectly OK with AUTOSEL and the upstream
stable kernel process....

-Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 5.15 CANDIDATE 00/10] more xfs fixes for 5.15
  2023-02-08 22:21     ` Dave Chinner
@ 2023-02-09  8:08       ` Amir Goldstein
  0 siblings, 0 replies; 19+ messages in thread
From: Amir Goldstein @ 2023-02-09  8:08 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Darrick J. Wong, Leah Rumancik, linux-xfs, chandan.babu,
	Christian Brauner

On Thu, Feb 9, 2023 at 12:21 AM Dave Chinner <david@fromorbit.com> wrote:
>
> On Wed, Feb 08, 2023 at 11:40:59AM -0800, Darrick J. Wong wrote:
> > On Wed, Feb 08, 2023 at 09:02:58PM +0200, Amir Goldstein wrote:
> > > On Wed, Feb 8, 2023 at 7:52 PM Leah Rumancik <leah.rumancik@gmail.com> wrote:
> > > >
> > > > Hello again,
> > > >
> > > > Here is the next batch of backports for 5.15.y. Testing included
> > > > 25 runs of auto group on 12 xfs configs. No regressions were seen.
> > > > I checked xfs/538 was run without issue as this test was mentioned
> > > > in 56486f307100. Also, from 86d40f1e49e9, I ran ran xfs/117 with
> > > > XFS compiled as a module and TEST_FS_MODULE_REOLOAD set, but I was
> > > > unable to reproduce the issue.
> > >
> > > Did you find any tests that started to pass or whose failure rate reduced?
> >
> > I wish Leah had, but there basically aren't any tests for the problems
> > fixed in this set for her to find. :(
> .....
> > > There are very few Fixes: annotations in these commits so it is hard for
> > > me to assess if any of them are relevant/important/worth the effort/risk
> > > to backport to 5.10.
> >
> > <nod> Dave's fixpatches rarely have Fixes tags attached.  It's difficult
> > to get him to do that because he has so much bad history with AUTOSEL.
> > I've tried to get him to add them in the past, but if I'm already
> > stressed out and Dave doesn't reply then I tend to merge the fix and
> > move on.
>
> In my defense, all the "fixes" from me in this series (except for
> the one with a fixes tag on it) date back so long ago it was
> difficult to identify what commit actually introduced the issue.
> Once we're talking about "it's been there for at least a decade" -
> espcially for fuzzer issues - identifying the exact commit is time
> consuming and often not possible, nor really useful for anything.
>

I wouldn't want anyone to spend more time on fuzzer issues and
I myself have given those very little attention.
My expectation is to always have a Fixes tag when a developer
knows they are fixing a regression and best effort when it is easy
for the developer to determine when the buggy code was merged.

It doesn't even have to be in the form of a Fixes tag.
If the cover letter says "fixes to the $FEATURE from v5.18"
that is good enough for me.

As a stable maintainer, if a commit has no Fixes tag and no hint
in the cover letter about relevant kernel versions, I will assume
it is relevant to the LTS kernel and spend time on the manual triage.
If a developer adds a Fixes tag where relevant, that can save my
time by eliminating commits irrelevant for e.g. 5.10.y.
This wasn't the case in this series I guess.

> I'm also not going to tag a patch with "fixes commit xyz" when
> commit xyz isn't actually the cause of the problem just so that
> someone can blindly use that as a "it's got a fixes tag on it, we
> should back port it" trigger.
>
> That's the whole problem with AUTOSEL - blindly applying anything
> with a fixes tag on it that merges cleanly into an older kernel -
> and the whole point of having a human actually manage the stable
> kernel backports.
>
> The stable XFS kernel maintainer is supposed to be actively looking
> at the commits that go into the upstream kernel to determine if they
> are relevant or not to the given stable kernel, regardless of
> whether they address fstests failures, have fixes/stable tags on
> them, etc. If all we needed stable maintainers to do is turn a crank
> handle, then we'd be perfectly OK with AUTOSEL and the upstream
> stable kernel process....
>

Yes. The reason I did not pick any of the commits in this series for 5.10
is not because lack of Fixes/tests, but because I read the cover letters
(thanks for the links Leah!) and commit messages and conclude that
the risk/effort does not make the cut.

But this conclusion was based very much on my own intuition and my
own interpretation of the cover letters. This is why I asked Chandan to
take another look.

I encourage any developer who writes a cover letter to share their
own opinion about the relevancy and risks associated with backporting
their patch set to LTS kernels - make it as vague statement as you
wish, anything that can help a human LTS maintainer do their job better.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 5.15 CANDIDATE 00/10] more xfs fixes for 5.15
  2023-02-08 19:02 ` [PATCH 5.15 CANDIDATE 00/10] more xfs fixes for 5.15 Amir Goldstein
  2023-02-08 19:40   ` Darrick J. Wong
@ 2023-02-10 13:11   ` Chandan Babu R
  1 sibling, 0 replies; 19+ messages in thread
From: Chandan Babu R @ 2023-02-10 13:11 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Leah Rumancik, linux-xfs, Christian Brauner

On Wed, Feb 08, 2023 at 09:02:58 PM +0200, Amir Goldstein wrote:
> On Wed, Feb 8, 2023 at 7:52 PM Leah Rumancik <leah.rumancik@gmail.com> wrote:
>>
>> Hello again,
>>
>> Here is the next batch of backports for 5.15.y. Testing included
>> 25 runs of auto group on 12 xfs configs. No regressions were seen.
>> I checked xfs/538 was run without issue as this test was mentioned
>> in 56486f307100. Also, from 86d40f1e49e9, I ran ran xfs/117 with
>> XFS compiled as a module and TEST_FS_MODULE_REOLOAD set, but I was
>> unable to reproduce the issue.
>
> Did you find any tests that started to pass or whose failure rate reduced?
>
> There are very few Fixes: annotations in these commits so it is hard for
> me to assess if any of them are relevant/important/worth the effort/risk
> to backport to 5.10.
>
> Unless I know of reproducible bugs in 5.10, I don't think I will invest
> in backporting this series to 5.10.
> Chandan, if you find any of these fixes relevant and important for 5.4
> let me know and I will make the effort to consider them for 5.10.

Amir, Please find my observations listed below,

Patch 1: Intent whiteouts was designed to overcome performance problems with
delayed attributes feature. Hence this patch will not be needed for v5.4-LTS.

Patch 2: This patch won't be needed since this memory leak is prevalent mostly
when intent whiteouts are being used.

Patch 3: IMHO this fix does not need to be back-ported since ondisk btree
cycles are close to impossible to occur during regular operation of an XFS
filesystem.

Patch 4: This patch won't be needed since xfs_mount->m_features isn't present
in v5.4 xfsprogs.

Patch 5: I am not sure about this one. May be mkfs.xfs has always created V5
filesystems with the required V4 feature flags set and probably fuzzing is the
only way we could modify a V5 filesystem to have some of the required V4
feature flags disabled.
I don't think this patch needs to be backported if my assumptions are true.

Patch 6: This patch does not need to be backported since this fixes a
performance regression introduced by the third patch.

Patch 7: I will want to backport this patch since it will prevent
recovery-loop test execution from stalling.

Patch 8: I think backporting XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT error tag to
5.4-LTS will be helpful since xfs/538 has revealed some important bugs. Patch
8 needs to backported after backporting patches relevant to
XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT.

Patch 9: I think this patch needs to be backported to prevent kmemleak
warnings.

Patch 10: I would backport this patch since this can prevent kmemleak warnings
from showing up (even though the bug was detected after the third patch was
applied).

But maybe you could wait until I come across these patches when working on
backporting commits from v5.11 and newer versions of the kernel. I will share
the list of candidate patches with you. Once we agree upon the list of
patches, you could then backport them to v5.10-LTS.

-- 
chandan

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 5.15 CANDIDATE 00/10] more xfs fixes for 5.15
  2023-02-08 19:40   ` Darrick J. Wong
  2023-02-08 22:21     ` Dave Chinner
@ 2023-02-10 19:51     ` Leah Rumancik
  2023-02-11  4:05       ` Darrick J. Wong
  2023-02-11  8:33       ` Amir Goldstein
  1 sibling, 2 replies; 19+ messages in thread
From: Leah Rumancik @ 2023-02-10 19:51 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Amir Goldstein, linux-xfs, chandan.babu, Christian Brauner

On Wed, Feb 08, 2023 at 11:40:59AM -0800, Darrick J. Wong wrote:
> On Wed, Feb 08, 2023 at 09:02:58PM +0200, Amir Goldstein wrote:
> > On Wed, Feb 8, 2023 at 7:52 PM Leah Rumancik <leah.rumancik@gmail.com> wrote:
> > >
> > > Hello again,
> > >
> > > Here is the next batch of backports for 5.15.y. Testing included
> > > 25 runs of auto group on 12 xfs configs. No regressions were seen.
> > > I checked xfs/538 was run without issue as this test was mentioned
> > > in 56486f307100. Also, from 86d40f1e49e9, I ran ran xfs/117 with
> > > XFS compiled as a module and TEST_FS_MODULE_REOLOAD set, but I was
> > > unable to reproduce the issue.
> > 
> > Did you find any tests that started to pass or whose failure rate reduced?
> 
> I wish Leah had, but there basically aren't any tests for the problems
> fixed in this set for her to find. :(
> 
> The first two patches I think were from when Dave was working on log
> intent whiteouts, turned on KASAN to diagnose some other problem he had,
> and began pulling on the ball of string (as it were) as he noticed other
> things in the codebase.  We don't usually bother with regression tests
> for kernel memory leaks, since they're not so easy to reproduce.
> 
> Patches 3-6 are fixes for a rash of fuzzer reports that someone in China
> posted last May:
> https://bugzilla.kernel.org/show_bug.cgi?id=215927
> 
> (There are more than just that one)
> 
> As usual, the submitter didn't bother to help triage and just dumped a
> ton of work in our laps.  They didn't follow up with any regression
> tests, because few fuzz kiddiez ever do.  At the time, I was too burned
> out to deal with it, so Dave posted fixes.
> 
> Patches 7-8 would manifest themselves as test VMs halting on ASSERTs if
> you configure your kernel to panic.  Not strictly needed since most LTS
> kernels probably don't even have XFS_DEBUG=y, but it makes the lives of
> recoveryloop runners easier if they do.
> 
> Patch 9 trips xfs/434 and xfs/436, but they only run if you have slab
> debugging enabled and build XFS as a module.  I don't know why very few
> people do this.
> 
> Patch 10 is a memory leak if you have XFS_DEBUG=y.  No need for a
> separate test for this one, since kmemleak catches it.  If you turn it
> on.
> 
> (IOWs, LGTM for the whole set.)

Good to add Ack tag?

> 
> > 
> > Leah, please consider working on the SGID bug fixes for the next 5.15
> > update, because my 5.10 SGID fixes series [1] has been blocked for
> > months and because there are several reproducible test cases in xfstest.
> 
> Whenever y'all get to 6.0, beware of commit 2ed5b09b3e8f ("xfs: make
> inode attribute forks a permanent part of struct xfs_inode"), which is a
> KASAN UAF that we fixed by eliminating the 'F'.  Do not pull on the
> devilstring ("...the file capabilities code calls getxattr with and
> without i_rwsem held...") if you can avoid it.
> 
> > I did not push on that until now because SGID test expectations were
> > a moving target, but since xfstests commit 81e6f628 ("generic: update
> > setgid tests") in this week's xfstests release, I think that tests should be
> > stable and we can finally start backporting all relevant SGID fixes to
> > align the SGID behavior of LTS kernels with that of upstream.

Ooo goody, ok, will do this next.

The following patches are on my radar to look into for this set. I have
yet to look into dependencies, so the set may grow. If the sgid tests
still fail after these ptaches, I will continue hunting for more fixes
to include in this set.

  e014f37db1a2 xfs: use setattr_copy to set vfs inode attributes
  472c6e46f589 xfs: remove XFS_PREALLOC_SYNC
  fbe7e5200365 xfs: fallocate() should call file_modified()
  0b02c8c0d75a xfs: set prealloc flag in xfs_alloc_file_space()
  2b3416ceff5e fs: add mode_strip_sgid() helper
  1639a49ccdce fs: move S_ISGID stripping into the vfs_*() helpers
  ed5a7047d201 attr: use consistent sgid stripping checks
  8d84e39d76bd fs: use consistent setgid checks in is_sxid()

In addition to the normal regression testing, I will specifically look
at the following tests for the sgid changes:

  generic/673
  generic/68[3-7]
  generic/69[6-7]

I will also do some extra runs on the entire perms group.

Let me know if you think something should be dropped or added.

- Leah

> 
> Oh good, I've been (gently) waiting on that one too. :)
> 
> --D
> 
> > Thanks,
> > Amir.
> > 
> > [1] https://github.com/amir73il/linux/commits/xfs-5.10.y-sgid-fixes
> > 
> > >
> > > Below I've outlined which series the backports came from:
> > >
> > > series "xfs: intent whiteouts" (1):
> > > [01/10] cb512c921639613ce03f87e62c5e93ed9fe8c84d
> > >     xfs: zero inode fork buffer at allocation
> > > [02/10] c230a4a85bcdbfc1a7415deec6caf04e8fca1301
> > >     xfs: fix potential log item leak
> > >
> > > series "xfs: fix random format verification issues" (2):
> > > [1/4] dc04db2aa7c9307e740d6d0e173085301c173b1a
> > >     xfs: detect self referencing btree sibling pointers
> > > [2/4] 1eb70f54c445fcbb25817841e774adb3d912f3e8 -> already in 5.15.y
> > >     xfs: validate inode fork size against fork format
> > > [3/4] dd0d2f9755191690541b09e6385d0f8cd8bc9d8f
> > >     xfs: set XFS_FEAT_NLINK correctly
> > > [4/4] f0f5f658065a5af09126ec892e4c383540a1c77f
> > >     xfs: validate v5 feature fields
> > >
> > > series "xfs: small fixes for 5.19 cycle" (3):
> > > [1/3] 5672225e8f2a872a22b0cecedba7a6644af1fb84
> > >     xfs: avoid unnecessary runtime sibling pointer endian conversions
> > > [2/3] 5b55cbc2d72632e874e50d2e36bce608e55aaaea
> > >     fs: don't assert fail on perag references on teardown
> > > [2/3] 56486f307100e8fc66efa2ebd8a71941fa10bf6f
> > >     xfs: assert in xfs_btree_del_cursor should take into account error
> > >
> > > series "xfs: random fixes for 5.19" (4):
> > > [1/2] 86d40f1e49e9a909d25c35ba01bea80dbcd758cb
> > >     xfs: purge dquots after inode walk fails during quotacheck
> > > [2/2] a54f78def73d847cb060b18c4e4a3d1d26c9ca6d
> > >     xfs: don't leak btree cursor when insrec fails after a split
> > >
> > > (1) https://lore.kernel.org/all/20220503221728.185449-1-david@fromorbit.com/
> > > (2) https://lore.kernel.org/all/20220502082018.1076561-1-david@fromorbit.com/
> > > (3) https://lore.kernel.org/all/20220524022158.1849458-1-david@fromorbit.com/
> > > (4) https://lore.kernel.org/all/165337056527.993079.1232300816023906959.stgit@magnolia/
> > >
> > > Darrick J. Wong (2):
> > >   xfs: purge dquots after inode walk fails during quotacheck
> > >   xfs: don't leak btree cursor when insrec fails after a split
> > >
> > > Dave Chinner (8):
> > >   xfs: zero inode fork buffer at allocation
> > >   xfs: fix potential log item leak
> > >   xfs: detect self referencing btree sibling pointers
> > >   xfs: set XFS_FEAT_NLINK correctly
> > >   xfs: validate v5 feature fields
> > >   xfs: avoid unnecessary runtime sibling pointer endian conversions
> > >   xfs: don't assert fail on perag references on teardown
> > >   xfs: assert in xfs_btree_del_cursor should take into account error
> > >
> > >  fs/xfs/libxfs/xfs_ag.c         |   3 +-
> > >  fs/xfs/libxfs/xfs_btree.c      | 175 +++++++++++++++++++++++++--------
> > >  fs/xfs/libxfs/xfs_inode_fork.c |  12 ++-
> > >  fs/xfs/libxfs/xfs_sb.c         |  70 +++++++++++--
> > >  fs/xfs/xfs_bmap_item.c         |   2 +
> > >  fs/xfs/xfs_icreate_item.c      |   1 +
> > >  fs/xfs/xfs_qm.c                |   9 +-
> > >  fs/xfs/xfs_refcount_item.c     |   2 +
> > >  fs/xfs/xfs_rmap_item.c         |   2 +
> > >  9 files changed, 221 insertions(+), 55 deletions(-)
> > >
> > > --
> > > 2.39.1.519.gcb327c4b5f-goog
> > >

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 5.15 CANDIDATE 00/10] more xfs fixes for 5.15
  2023-02-10 19:51     ` Leah Rumancik
@ 2023-02-11  4:05       ` Darrick J. Wong
  2023-02-11  8:33       ` Amir Goldstein
  1 sibling, 0 replies; 19+ messages in thread
From: Darrick J. Wong @ 2023-02-11  4:05 UTC (permalink / raw)
  To: Leah Rumancik; +Cc: Amir Goldstein, linux-xfs, chandan.babu, Christian Brauner

On Fri, Feb 10, 2023 at 11:51:03AM -0800, Leah Rumancik wrote:
> On Wed, Feb 08, 2023 at 11:40:59AM -0800, Darrick J. Wong wrote:
> > On Wed, Feb 08, 2023 at 09:02:58PM +0200, Amir Goldstein wrote:
> > > On Wed, Feb 8, 2023 at 7:52 PM Leah Rumancik <leah.rumancik@gmail.com> wrote:
> > > >
> > > > Hello again,
> > > >
> > > > Here is the next batch of backports for 5.15.y. Testing included
> > > > 25 runs of auto group on 12 xfs configs. No regressions were seen.
> > > > I checked xfs/538 was run without issue as this test was mentioned
> > > > in 56486f307100. Also, from 86d40f1e49e9, I ran ran xfs/117 with
> > > > XFS compiled as a module and TEST_FS_MODULE_REOLOAD set, but I was
> > > > unable to reproduce the issue.
> > > 
> > > Did you find any tests that started to pass or whose failure rate reduced?
> > 
> > I wish Leah had, but there basically aren't any tests for the problems
> > fixed in this set for her to find. :(
> > 
> > The first two patches I think were from when Dave was working on log
> > intent whiteouts, turned on KASAN to diagnose some other problem he had,
> > and began pulling on the ball of string (as it were) as he noticed other
> > things in the codebase.  We don't usually bother with regression tests
> > for kernel memory leaks, since they're not so easy to reproduce.
> > 
> > Patches 3-6 are fixes for a rash of fuzzer reports that someone in China
> > posted last May:
> > https://bugzilla.kernel.org/show_bug.cgi?id=215927
> > 
> > (There are more than just that one)
> > 
> > As usual, the submitter didn't bother to help triage and just dumped a
> > ton of work in our laps.  They didn't follow up with any regression
> > tests, because few fuzz kiddiez ever do.  At the time, I was too burned
> > out to deal with it, so Dave posted fixes.
> > 
> > Patches 7-8 would manifest themselves as test VMs halting on ASSERTs if
> > you configure your kernel to panic.  Not strictly needed since most LTS
> > kernels probably don't even have XFS_DEBUG=y, but it makes the lives of
> > recoveryloop runners easier if they do.
> > 
> > Patch 9 trips xfs/434 and xfs/436, but they only run if you have slab
> > debugging enabled and build XFS as a module.  I don't know why very few
> > people do this.
> > 
> > Patch 10 is a memory leak if you have XFS_DEBUG=y.  No need for a
> > separate test for this one, since kmemleak catches it.  If you turn it
> > on.
> > 
> > (IOWs, LGTM for the whole set.)
> 
> Good to add Ack tag?

Oh, yes, sorry, I forgot about the formal tagging bit:

Acked-by: Darrick J. Wong <djwong@kernel.org>

--D


> > 
> > > 
> > > Leah, please consider working on the SGID bug fixes for the next 5.15
> > > update, because my 5.10 SGID fixes series [1] has been blocked for
> > > months and because there are several reproducible test cases in xfstest.
> > 
> > Whenever y'all get to 6.0, beware of commit 2ed5b09b3e8f ("xfs: make
> > inode attribute forks a permanent part of struct xfs_inode"), which is a
> > KASAN UAF that we fixed by eliminating the 'F'.  Do not pull on the
> > devilstring ("...the file capabilities code calls getxattr with and
> > without i_rwsem held...") if you can avoid it.
> > 
> > > I did not push on that until now because SGID test expectations were
> > > a moving target, but since xfstests commit 81e6f628 ("generic: update
> > > setgid tests") in this week's xfstests release, I think that tests should be
> > > stable and we can finally start backporting all relevant SGID fixes to
> > > align the SGID behavior of LTS kernels with that of upstream.
> 
> Ooo goody, ok, will do this next.
> 
> The following patches are on my radar to look into for this set. I have
> yet to look into dependencies, so the set may grow. If the sgid tests
> still fail after these ptaches, I will continue hunting for more fixes
> to include in this set.
> 
>   e014f37db1a2 xfs: use setattr_copy to set vfs inode attributes
>   472c6e46f589 xfs: remove XFS_PREALLOC_SYNC
>   fbe7e5200365 xfs: fallocate() should call file_modified()
>   0b02c8c0d75a xfs: set prealloc flag in xfs_alloc_file_space()
>   2b3416ceff5e fs: add mode_strip_sgid() helper
>   1639a49ccdce fs: move S_ISGID stripping into the vfs_*() helpers
>   ed5a7047d201 attr: use consistent sgid stripping checks
>   8d84e39d76bd fs: use consistent setgid checks in is_sxid()
> 
> In addition to the normal regression testing, I will specifically look
> at the following tests for the sgid changes:
> 
>   generic/673
>   generic/68[3-7]
>   generic/69[6-7]
> 
> I will also do some extra runs on the entire perms group.
> 
> Let me know if you think something should be dropped or added.
> 
> - Leah
> 
> > 
> > Oh good, I've been (gently) waiting on that one too. :)
> > 
> > --D
> > 
> > > Thanks,
> > > Amir.
> > > 
> > > [1] https://github.com/amir73il/linux/commits/xfs-5.10.y-sgid-fixes
> > > 
> > > >
> > > > Below I've outlined which series the backports came from:
> > > >
> > > > series "xfs: intent whiteouts" (1):
> > > > [01/10] cb512c921639613ce03f87e62c5e93ed9fe8c84d
> > > >     xfs: zero inode fork buffer at allocation
> > > > [02/10] c230a4a85bcdbfc1a7415deec6caf04e8fca1301
> > > >     xfs: fix potential log item leak
> > > >
> > > > series "xfs: fix random format verification issues" (2):
> > > > [1/4] dc04db2aa7c9307e740d6d0e173085301c173b1a
> > > >     xfs: detect self referencing btree sibling pointers
> > > > [2/4] 1eb70f54c445fcbb25817841e774adb3d912f3e8 -> already in 5.15.y
> > > >     xfs: validate inode fork size against fork format
> > > > [3/4] dd0d2f9755191690541b09e6385d0f8cd8bc9d8f
> > > >     xfs: set XFS_FEAT_NLINK correctly
> > > > [4/4] f0f5f658065a5af09126ec892e4c383540a1c77f
> > > >     xfs: validate v5 feature fields
> > > >
> > > > series "xfs: small fixes for 5.19 cycle" (3):
> > > > [1/3] 5672225e8f2a872a22b0cecedba7a6644af1fb84
> > > >     xfs: avoid unnecessary runtime sibling pointer endian conversions
> > > > [2/3] 5b55cbc2d72632e874e50d2e36bce608e55aaaea
> > > >     fs: don't assert fail on perag references on teardown
> > > > [2/3] 56486f307100e8fc66efa2ebd8a71941fa10bf6f
> > > >     xfs: assert in xfs_btree_del_cursor should take into account error
> > > >
> > > > series "xfs: random fixes for 5.19" (4):
> > > > [1/2] 86d40f1e49e9a909d25c35ba01bea80dbcd758cb
> > > >     xfs: purge dquots after inode walk fails during quotacheck
> > > > [2/2] a54f78def73d847cb060b18c4e4a3d1d26c9ca6d
> > > >     xfs: don't leak btree cursor when insrec fails after a split
> > > >
> > > > (1) https://lore.kernel.org/all/20220503221728.185449-1-david@fromorbit.com/
> > > > (2) https://lore.kernel.org/all/20220502082018.1076561-1-david@fromorbit.com/
> > > > (3) https://lore.kernel.org/all/20220524022158.1849458-1-david@fromorbit.com/
> > > > (4) https://lore.kernel.org/all/165337056527.993079.1232300816023906959.stgit@magnolia/
> > > >
> > > > Darrick J. Wong (2):
> > > >   xfs: purge dquots after inode walk fails during quotacheck
> > > >   xfs: don't leak btree cursor when insrec fails after a split
> > > >
> > > > Dave Chinner (8):
> > > >   xfs: zero inode fork buffer at allocation
> > > >   xfs: fix potential log item leak
> > > >   xfs: detect self referencing btree sibling pointers
> > > >   xfs: set XFS_FEAT_NLINK correctly
> > > >   xfs: validate v5 feature fields
> > > >   xfs: avoid unnecessary runtime sibling pointer endian conversions
> > > >   xfs: don't assert fail on perag references on teardown
> > > >   xfs: assert in xfs_btree_del_cursor should take into account error
> > > >
> > > >  fs/xfs/libxfs/xfs_ag.c         |   3 +-
> > > >  fs/xfs/libxfs/xfs_btree.c      | 175 +++++++++++++++++++++++++--------
> > > >  fs/xfs/libxfs/xfs_inode_fork.c |  12 ++-
> > > >  fs/xfs/libxfs/xfs_sb.c         |  70 +++++++++++--
> > > >  fs/xfs/xfs_bmap_item.c         |   2 +
> > > >  fs/xfs/xfs_icreate_item.c      |   1 +
> > > >  fs/xfs/xfs_qm.c                |   9 +-
> > > >  fs/xfs/xfs_refcount_item.c     |   2 +
> > > >  fs/xfs/xfs_rmap_item.c         |   2 +
> > > >  9 files changed, 221 insertions(+), 55 deletions(-)
> > > >
> > > > --
> > > > 2.39.1.519.gcb327c4b5f-goog
> > > >

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 5.15 CANDIDATE 00/10] more xfs fixes for 5.15
  2023-02-10 19:51     ` Leah Rumancik
  2023-02-11  4:05       ` Darrick J. Wong
@ 2023-02-11  8:33       ` Amir Goldstein
  1 sibling, 0 replies; 19+ messages in thread
From: Amir Goldstein @ 2023-02-11  8:33 UTC (permalink / raw)
  To: Leah Rumancik; +Cc: Darrick J. Wong, linux-xfs, chandan.babu, Christian Brauner

On Fri, Feb 10, 2023 at 9:51 PM Leah Rumancik <leah.rumancik@gmail.com> wrote:
>
[...]
> > > I did not push on that until now because SGID test expectations were
> > > a moving target, but since xfstests commit 81e6f628 ("generic: update
> > > setgid tests") in this week's xfstests release, I think that tests should be
> > > stable and we can finally start backporting all relevant SGID fixes to
> > > align the SGID behavior of LTS kernels with that of upstream.
>
> Ooo goody, ok, will do this next.
>
> The following patches are on my radar to look into for this set. I have
> yet to look into dependencies, so the set may grow. If the sgid tests
> still fail after these ptaches, I will continue hunting for more fixes
> to include in this set.
>
>   e014f37db1a2 xfs: use setattr_copy to set vfs inode attributes
>   472c6e46f589 xfs: remove XFS_PREALLOC_SYNC
>   fbe7e5200365 xfs: fallocate() should call file_modified()
>   0b02c8c0d75a xfs: set prealloc flag in xfs_alloc_file_space()
>   2b3416ceff5e fs: add mode_strip_sgid() helper
>   1639a49ccdce fs: move S_ISGID stripping into the vfs_*() helpers
>   ed5a7047d201 attr: use consistent sgid stripping checks
>   8d84e39d76bd fs: use consistent setgid checks in is_sxid()
>
> In addition to the normal regression testing, I will specifically look
> at the following tests for the sgid changes:
>
>   generic/673
>   generic/68[3-7]
>   generic/69[6-7]
>
> I will also do some extra runs on the entire perms group.
>
> Let me know if you think something should be dropped or added.
>

I reckon you will need those dependency prep commits from
Christian's PR [1]:

11c2a8700cdc attr: add in_group_or_capable()
e243e3f94c80 fs: move should_remove_suid()
72ae017c5451 attr: add setattr_should_drop_sgid()

FYI, the ovl commits from this PR are independent fixes that were
already applied to 5.15.y.

Thanks,
Amir.

[1] https://lore.kernel.org/linux-fsdevel/20221212112053.99208-1-brauner@kernel.org/

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2023-02-11  8:33 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-02-08 17:52 [PATCH 5.15 CANDIDATE 00/10] more xfs fixes for 5.15 Leah Rumancik
2023-02-08 17:52 ` [PATCH 5.15 CANDIDATE 01/10] xfs: zero inode fork buffer at allocation Leah Rumancik
2023-02-08 17:52 ` [PATCH 5.15 CANDIDATE 02/10] xfs: fix potential log item leak Leah Rumancik
2023-02-08 17:52 ` [PATCH 5.15 CANDIDATE 03/10] xfs: detect self referencing btree sibling pointers Leah Rumancik
2023-02-08 17:52 ` [PATCH 5.15 CANDIDATE 04/10] xfs: set XFS_FEAT_NLINK correctly Leah Rumancik
2023-02-08 17:52 ` [PATCH 5.15 CANDIDATE 05/10] xfs: validate v5 feature fields Leah Rumancik
2023-02-08 17:52 ` [PATCH 5.15 CANDIDATE 06/10] xfs: avoid unnecessary runtime sibling pointer endian conversions Leah Rumancik
2023-02-08 17:52 ` [PATCH 5.15 CANDIDATE 07/10] xfs: don't assert fail on perag references on teardown Leah Rumancik
2023-02-08 17:52 ` [PATCH 5.15 CANDIDATE 08/10] xfs: assert in xfs_btree_del_cursor should take into account error Leah Rumancik
2023-02-08 17:52 ` [PATCH 5.15 CANDIDATE 09/10] xfs: purge dquots after inode walk fails during quotacheck Leah Rumancik
2023-02-08 17:52 ` [PATCH 5.15 CANDIDATE 10/10] xfs: don't leak btree cursor when insrec fails after a split Leah Rumancik
2023-02-08 19:02 ` [PATCH 5.15 CANDIDATE 00/10] more xfs fixes for 5.15 Amir Goldstein
2023-02-08 19:40   ` Darrick J. Wong
2023-02-08 22:21     ` Dave Chinner
2023-02-09  8:08       ` Amir Goldstein
2023-02-10 19:51     ` Leah Rumancik
2023-02-11  4:05       ` Darrick J. Wong
2023-02-11  8:33       ` Amir Goldstein
2023-02-10 13:11   ` Chandan Babu R

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox