From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Omar Sandoval <osandov@osandov.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: Transaction log reservation overrun when fallocating realtime file
Date: Tue, 26 Nov 2019 16:34:26 -0800
Message-ID: <20191127003426.GP6219@magnolia>
In-Reply-To: <20191126202714.GA667580@vader>

On Tue, Nov 26, 2019 at 12:27:14PM -0800, Omar Sandoval wrote:
> Hello,
> 
> The following reproducer results in a transaction log overrun warning
> for me:
> 
>   mkfs.xfs -f -r rtdev=/dev/vdc -d rtinherit=1 -m reflink=0 /dev/vdb
>   mount -o rtdev=/dev/vdc /dev/vdb /mnt
>   fallocate -l 4G /mnt/foo
> 
> I've attached the full dmesg output. My guess at the problem is that the
> tr_write reservation used by xfs_alloc_file_space is not taking the realtime
> bitmap and realtime summary inodes into account (inode numbers 129 and 130 on
> this filesystem, which I do see in some of the log items). However, I'm not
> familiar enough with the XFS transaction guts to confidently fix this. Can
> someone please help me out?

Hmm...

/*
 * In a write transaction we can allocate a maximum of 2
 * extents.  This gives:
 *    the inode getting the new extents: inode size
 *    the inode's bmap btree: max depth * block size
 *    the agfs of the ags from which the extents are allocated: 2 * sector
 *    the superblock free block counter: sector size
 *    the allocation btrees: 2 exts * 2 trees * (2 * max depth - 1) * block size
 * And the bmap_finish transaction can free bmap blocks in a join:
 *    the agfs of the ags containing the blocks: 2 * sector size
 *    the agfls of the ags containing the blocks: 2 * sector size
 *    the super block free block counter: sector size
 *    the allocation btrees: 2 exts * 2 trees * (2 * max depth - 1) * block size
 */
STATIC uint
xfs_calc_write_reservation(...);

So this means that the rt allocator can burn through at most ...
1 ext * 2 trees * (2 * maxdepth - 1) * blocksize
... worth of log reservation as part of setting bits in the rtbitmap and
fiddling with the rtsummary information.

Instead, 4GB of 4k rt extents == 1 million rtexts to mark in use, which
is 131072 bytes of rtbitmap to log, and *kaboom* there goes the 109K log
reservation.

So I think you're right, and the fix is probably to cap ralen further
in xfs_bmap_rtalloc().  Does the following patch fix it?

--D

From: Darrick J. Wong <darrick.wong@oracle.com>

xfs: cap realtime allocation length to something we can log

Omar Sandoval reported that a 4G fallocate on the realtime device causes
filesystem shutdowns due to a log reservation overflow that happens when
we log the rtbitmap updates.

The tr_write transaction reserves enough log space to handle a full
split of both free space btrees, so cap the rt allocation at that
number of bits.

"The following reproducer results in a transaction log overrun warning
for me:

    mkfs.xfs -f -r rtdev=/dev/vdc -d rtinherit=1 -m reflink=0 /dev/vdb
    mount -o rtdev=/dev/vdc /dev/vdb /mnt
    fallocate -l 4G /mnt/foo"

Reported-by: Omar Sandoval <osandov@osandov.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/xfs_bmap_util.c |   23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index 49d7b530c8f7..15c4e2790de3 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -69,6 +69,26 @@ xfs_zero_extent(
 }
 
 #ifdef CONFIG_XFS_RT
+/*
+ * tr_write allows for one full split in the bnobt and cntbt to record the
+ * allocation, and that's how many bits of rtbitmap we can log to the
+ * transaction.  We leave one full block's worth of log space to handle the
+ * rtsummary update, though that's probably overkill.
+ */
+static inline uint64_t
+xfs_bmap_rtalloc_max(
+	struct xfs_mount	*mp)
+{
+	uint64_t		max_rtbitmap;
+
+	max_rtbitmap = xfs_allocfree_log_count(mp, 1) - 1;
+	max_rtbitmap *= XFS_FSB_TO_B(mp, 1);
+	max_rtbitmap *= NBBY;
+	max_rtbitmap *= mp->m_sb.sb_rextsize;
+
+	return max_rtbitmap;
+}
+
 int
 xfs_bmap_rtalloc(
 	struct xfs_bmalloca	*ap)	/* bmap alloc argument struct */
@@ -113,6 +133,9 @@ xfs_bmap_rtalloc(
 	if (ralen * mp->m_sb.sb_rextsize >= MAXEXTLEN)
 		ralen = MAXEXTLEN / mp->m_sb.sb_rextsize;
 
+	/* Don't allocate so much that we blow out the log reservation. */
+	ralen = min_t(uint64_t, ralen, xfs_bmap_rtalloc_max(mp));
+
 	/*
 	 * Lock out modifications to both the RT bitmap and summary inodes
 	 */
