All of lore.kernel.org
 help / color / mirror / Atom feed
From: Chandan Babu R <chandanrlinux@gmail.com>
To: linux-xfs@vger.kernel.org
Cc: Chandan Babu R <chandanrlinux@gmail.com>,
	david@fromorbit.com, darrick.wong@oracle.com, bfoster@redhat.com,
	hch@infradead.org
Subject: [PATCH 3/7] xfs: Compute maximum height of directory BMBT separately
Date: Sat,  6 Jun 2020 13:57:41 +0530	[thread overview]
Message-ID: <20200606082745.15174-4-chandanrlinux@gmail.com> (raw)
In-Reply-To: <20200606082745.15174-1-chandanrlinux@gmail.com>

xfs/306 causes the following call trace when using a data fork with a
maximum extent count of 2^47,

 XFS (loop0): Mounting V5 Filesystem
 XFS (loop0): Log size 8906 blocks too small, minimum size is 9075 blocks
 XFS (loop0): AAIEEE! Log failed size checks. Abort!
 XFS: Assertion failed: 0, file: fs/xfs/xfs_log.c, line: 711
 ------------[ cut here ]------------
 WARNING: CPU: 0 PID: 12821 at fs/xfs/xfs_message.c:112 assfail+0x25/0x28
 Modules linked in:
 CPU: 0 PID: 12821 Comm: mount Tainted: G        W         5.6.0-rc6-next-20200320-chandan-00003-g071c2af3f4de #1
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
 RIP: 0010:assfail+0x25/0x28
 Code: ff ff 0f 0b c3 0f 1f 44 00 00 41 89 c8 48 89 d1 48 89 f2 48 c7 c6 40 b7 4b b3 e8 82 f9 ff ff 80 3d 83 d6 64 01 00 74 02 0f $
 RSP: 0018:ffffb05b414cbd78 EFLAGS: 00010246
 RAX: 0000000000000000 RBX: ffff9d9d501d5000 RCX: 0000000000000000
 RDX: 00000000ffffffc0 RSI: 000000000000000a RDI: ffffffffb346dc65
 RBP: ffff9da444b49a80 R08: 0000000000000000 R09: 0000000000000000
 R10: 000000000000000a R11: f000000000000000 R12: 00000000ffffffea
 R13: 000000000000000e R14: 0000000000004594 R15: ffff9d9d501d5628
 FS:  00007fd6c5d17c80(0000) GS:ffff9da44d800000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000002 CR3: 00000008a48c0000 CR4: 00000000000006f0
 Call Trace:
  xfs_log_mount+0xf8/0x300
  xfs_mountfs+0x46e/0x950
  xfs_fc_fill_super+0x318/0x510
  ? xfs_mount_free+0x30/0x30
  get_tree_bdev+0x15c/0x250
  vfs_get_tree+0x25/0xb0
  do_mount+0x740/0x9b0
  ? memdup_user+0x41/0x80
  __x64_sys_mount+0x8e/0xd0
  do_syscall_64+0x48/0x110
  entry_SYSCALL_64_after_hwframe+0x44/0xa9
 RIP: 0033:0x7fd6c5f2ccda
 Code: 48 8b 0d b9 e1 0b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f $
 RSP: 002b:00007ffe00dfb9f8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
 RAX: ffffffffffffffda RBX: 0000560c1aaa92c0 RCX: 00007fd6c5f2ccda
 RDX: 0000560c1aaae110 RSI: 0000560c1aaad040 RDI: 0000560c1aaa94d0
 RBP: 00007fd6c607d204 R08: 0000000000000000 R09: 0000560c1aaadde0
 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
 R13: 0000000000000000 R14: 0000560c1aaa94d0 R15: 0000560c1aaae110
 ---[ end trace 6436391b468bc652 ]---
 XFS (loop0): log mount failed

The corresponding filesystem was created using mkfs options
"-m rmapbt=1,reflink=1 -b size=1k -d size=20m -n size=64k".

i.e. We have a filesystem of size 20MiB, data block size of 1KiB and
directory block size of 64KiB. Filesystems of size < 1GiB can have less
than 10MiB on-disk log (Please refer to calculate_log_size() in
xfsprogs).

The largest reservation space was contributed by the rename
operation. The corresponding calculation is done inside
xfs_calc_rename_reservation(). In this case, the value returned by this
function is,

xfs_calc_inode_res(mp, 4)
+ xfs_calc_buf_res(2 * XFS_DIROP_LOG_COUNT(mp), XFS_FSB_TO_B(mp, 1))

xfs_calc_inode_res(mp, 4) returns a constant value of 3040 bytes
regardless of the maximum data fork extent count.

The largest contribution to the rename operation was by "2 *
XFS_DIROP_LOG_COUNT(mp)" and it is a function of maximum height of a
directory's BMBT tree.

XFS_DIROP_LOG_COUNT() is a sum of,

1. The maximum number of dabtree blocks that needs to be logged
   i.e. XFS_DAENTER_BLOCKS() = XFS_DAENTER_1B(mp,w) *
   XFS_DAENTER_DBS(mp,w).  For directories, this evaluates
   to (64 * (XFS_DA_NODE_MAXDEPTH + 2)) = (64 * (5 + 2)) = 448.

2. The corresponding maximum number of BMBT blocks that needs to be
   logged i.e. XFS_DAENTER_BMAPS() = XFS_DAENTER_DBS(mp,w) *
   XFS_DAENTER_BMAP1B(mp,w)

   XFS_DAENTER_DBS(mp,w) = XFS_DA_NODE_MAXDEPTH + 2 = 7

   XFS_DAENTER_BMAP1B(mp,w)
   = XFS_NEXTENTADD_SPACE_RES(mp, XFS_DAENTER_1B(mp, w), w)
   = XFS_NEXTENTADD_SPACE_RES(mp, 64, w)
   = ((64 + XFS_MAX_CONTIG_EXTENTS_PER_BLOCK(mp) - 1) /
   XFS_MAX_CONTIG_EXTENTS_PER_BLOCK(mp)) * XFS_EXTENTADD_SPACE_RES(mp, w)

   XFS_MAX_CONTIG_EXTENTS_PER_BLOCK() =
   mp->m_alloc_mxr[0] - mp->m_alloc_mnr[0] = 121 - 60 = 61

   XFS_DAENTER_BMAP1B(mp,w) =
   ((64 + XFS_MAX_CONTIG_EXTENTS_PER_BLOCK(mp) - 1) /
   XFS_MAX_CONTIG_EXTENTS_PER_BLOCK(mp)) * XFS_EXTENTADD_SPACE_RES(mp, w)
   = ((64 + 61 - 1) / 61) * XFS_EXTENTADD_SPACE_RES(mp, w)
   = 2 * XFS_EXTENTADD_SPACE_RES(mp, w)
   = 2 * (XFS_BM_MAXLEVELS(mp,w) - 1)
   = 2 * (8 - 1)
   = 14

   With 2^32 as the maximum extent count the maximum height of the bmap btree
   was 7. Now with 2^47 maximum extent count, the height has increased to 8.

   Therefore, XFS_DAENTER_BMAPS() = 7 * 14 = 98.

XFS_DIROP_LOG_COUNT() = 448 + 98 = 546.
2 * XFS_DIROP_LOG_COUNT() = 2 * 546 = 1092.

With 2^32 max extent count, XFS_DIROP_LOG_COUNT() evaluates to
533. Hence 2 * XFS_DIROP_LOG_COUNT() = 2 * 533 = 1066.

This small difference of 1092 - 1066 = 26 fs blocks is sufficient to
trip us over the minimum log size check.

A future commit in this series will use 2^27 as the maximum directory
extent count. This will result in a shorter directory BMBT tree.  Log
reservation calculations that are applicable only to
directories (e.g. XFS_DIROP_LOG_COUNT()) can then choose this instead of
non-dir data fork BMBT height.

This commit introduces a new member in 'struct xfs_mount' to hold the
maximum BMBT height of a directory. At present, the maximum height of a
directory BMBT is the same as a the maximum height of a non-directory
BMBT. A future commit will change the parameters used as input for
computing the maximum height of a directory BMBT.

Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
---
 fs/xfs/libxfs/xfs_bmap.c | 17 ++++++++++++++---
 fs/xfs/libxfs/xfs_bmap.h |  3 ++-
 fs/xfs/xfs_mount.c       |  5 +++--
 fs/xfs/xfs_mount.h       |  1 +
 4 files changed, 20 insertions(+), 6 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 798fca5c52af..01e2b543b139 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -50,7 +50,8 @@ kmem_zone_t		*xfs_bmap_free_item_zone;
 void
 xfs_bmap_compute_maxlevels(
 	xfs_mount_t	*mp,		/* file system mount structure */
-	int		whichfork)	/* data or attr fork */
+	int		whichfork,	/* data or attr fork */
+	int		dir_bmbt)	/* Dir or non-dir data fork */
 {
 	int		level;		/* btree level */
 	uint		maxblocks;	/* max blocks at this level */
@@ -60,6 +61,9 @@ xfs_bmap_compute_maxlevels(
 	int		minnoderecs;	/* min records in node block */
 	int		sz;		/* root block size */
 
+	if (whichfork == XFS_ATTR_FORK)
+		ASSERT(dir_bmbt == 0);
+
 	/*
 	 * The maximum number of extents in a file, hence the maximum number of
 	 * leaf entries, is controlled by the size of the on-disk extent count,
@@ -75,8 +79,11 @@ xfs_bmap_compute_maxlevels(
 	 * of a minimum size available.
 	 */
 	if (whichfork == XFS_DATA_FORK) {
-		maxleafents = MAXEXTNUM;
 		sz = XFS_BMDR_SPACE_CALC(MINDBTPTRS);
+		if (dir_bmbt)
+			maxleafents = MAXEXTNUM;
+		else
+			maxleafents = MAXEXTNUM;
 	} else {
 		maxleafents = MAXAEXTNUM;
 		sz = XFS_BMDR_SPACE_CALC(MINABTPTRS);
@@ -91,7 +98,11 @@ xfs_bmap_compute_maxlevels(
 		else
 			maxblocks = (maxblocks + minnoderecs - 1) / minnoderecs;
 	}
-	mp->m_bm_maxlevels[whichfork] = level;
+
+	if (whichfork == XFS_DATA_FORK && dir_bmbt)
+		mp->m_bm_dir_maxlevel = level;
+	else
+		mp->m_bm_maxlevels[whichfork] = level;
 }
 
 STATIC int				/* error */
diff --git a/fs/xfs/libxfs/xfs_bmap.h b/fs/xfs/libxfs/xfs_bmap.h
index 6028a3c825ba..4250c9ab4b75 100644
--- a/fs/xfs/libxfs/xfs_bmap.h
+++ b/fs/xfs/libxfs/xfs_bmap.h
@@ -187,7 +187,8 @@ void	xfs_bmap_local_to_extents_empty(struct xfs_trans *tp,
 void	__xfs_bmap_add_free(struct xfs_trans *tp, xfs_fsblock_t bno,
 		xfs_filblks_t len, const struct xfs_owner_info *oinfo,
 		bool skip_discard);
-void	xfs_bmap_compute_maxlevels(struct xfs_mount *mp, int whichfork);
+void	xfs_bmap_compute_maxlevels(struct xfs_mount *mp, int whichfork,
+		int dir_bmbt);
 int	xfs_bmap_first_unused(struct xfs_trans *tp, struct xfs_inode *ip,
 		xfs_extlen_t len, xfs_fileoff_t *unused, int whichfork);
 int	xfs_bmap_last_before(struct xfs_trans *tp, struct xfs_inode *ip,
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index bb91f04266b9..d8ebfc67bb63 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -711,8 +711,9 @@ xfs_mountfs(
 		goto out;
 
 	xfs_alloc_compute_maxlevels(mp);
-	xfs_bmap_compute_maxlevels(mp, XFS_DATA_FORK);
-	xfs_bmap_compute_maxlevels(mp, XFS_ATTR_FORK);
+	xfs_bmap_compute_maxlevels(mp, XFS_DATA_FORK, 0);
+	xfs_bmap_compute_maxlevels(mp, XFS_DATA_FORK, 1);
+	xfs_bmap_compute_maxlevels(mp, XFS_ATTR_FORK, 0);
 	xfs_ialloc_setup_geometry(mp);
 	xfs_rmapbt_compute_maxlevels(mp);
 	xfs_refcountbt_compute_maxlevels(mp);
diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
index aba5a1579279..9dbf036ddace 100644
--- a/fs/xfs/xfs_mount.h
+++ b/fs/xfs/xfs_mount.h
@@ -133,6 +133,7 @@ typedef struct xfs_mount {
 	uint			m_refc_mnr[2];	/* min refc btree records */
 	uint			m_ag_maxlevels;	/* XFS_AG_MAXLEVELS */
 	uint			m_bm_maxlevels[2]; /* XFS_BM_MAXLEVELS */
+	uint			m_bm_dir_maxlevel;
 	uint			m_rmap_maxlevels; /* max rmap btree levels */
 	uint			m_refc_maxlevels; /* max refcount btree level */
 	xfs_extlen_t		m_ag_prealloc_blocks; /* reserved ag blocks */
-- 
2.20.1


  parent reply	other threads:[~2020-06-06  8:28 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-06  8:27 [PATCH 0/7] xfs: Extend per-inode extent counters Chandan Babu R
2020-06-06  8:27 ` [PATCH 1/7] xfs: Fix log reservation calculation for xattr insert operation Chandan Babu R
2020-06-19 14:33   ` Christoph Hellwig
2020-06-20 12:53     ` Chandan Babu R
2020-06-06  8:27 ` [PATCH 2/7] xfs: Check for per-inode extent count overflow Chandan Babu R
2020-06-08 16:24   ` Darrick J. Wong
2020-06-08 16:32     ` Darrick J. Wong
2020-06-09 14:22       ` Chandan Babu R
2020-06-09 17:07         ` Darrick J. Wong
2020-06-10  6:24           ` Chandan Babu R
2020-06-09 14:22     ` Chandan Babu R
2020-06-09 17:10       ` Darrick J. Wong
2020-06-19 14:36         ` Christoph Hellwig
2020-06-19 21:31           ` Darrick J. Wong
2020-06-20 12:53           ` Chandan Babu R
2020-06-06  8:27 ` Chandan Babu R [this message]
2020-06-08 20:59   ` [PATCH 3/7] xfs: Compute maximum height of directory BMBT separately Darrick J. Wong
2020-06-09 14:23     ` Chandan Babu R
2020-06-09 18:40       ` Darrick J. Wong
2020-06-10  6:23         ` Chandan Babu R
2020-06-11  6:38           ` Chandan Babu R
2020-06-06  8:27 ` [PATCH 4/7] xfs: Add "Use Dir BMBT height" argument to XFS_BM_MAXLEVELS() Chandan Babu R
2020-06-08 17:50   ` Darrick J. Wong
2020-06-09 14:23     ` Chandan Babu R
2020-06-06  8:27 ` [PATCH 5/7] xfs: Use 2^27 as the maximum number of directory extents Chandan Babu R
2020-06-08 16:52   ` Darrick J. Wong
2020-06-09 14:23     ` Chandan Babu R
2020-06-06  8:27 ` [PATCH 6/7] xfs: Extend data extent counter to 47 bits Chandan Babu R
2020-06-08 17:14   ` Darrick J. Wong
2020-06-09 14:23     ` Chandan Babu R
2020-08-31 21:05       ` Darrick J. Wong
2020-06-19 14:38   ` Christoph Hellwig
2020-06-20 12:52     ` Chandan Babu R
2020-06-06  8:27 ` [PATCH 7/7] xfs: Extend attr extent counter to 32 bits Chandan Babu R
2020-06-08 17:21   ` Darrick J. Wong
2020-06-09 14:22     ` Chandan Babu R
2020-06-19 14:39   ` Christoph Hellwig
2020-06-20 12:53     ` Chandan Babu R
2020-06-08 17:31 ` [PATCH 0/7] xfs: Extend per-inode extent counters Darrick J. Wong
2020-06-09 14:22   ` Chandan Babu R

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200606082745.15174-4-chandanrlinux@gmail.com \
    --to=chandanrlinux@gmail.com \
    --cc=bfoster@redhat.com \
    --cc=darrick.wong@oracle.com \
    --cc=david@fromorbit.com \
    --cc=hch@infradead.org \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.