public inbox for linux-xfs@vger.kernel.org
From: Brian Foster <bfoster@redhat.com>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: linux-xfs@vger.kernel.org, david@fromorbit.com
Subject: Re: [PATCH 3/6] xfs: don't include bnobt blocks when reserving free block pool
Date: Fri, 18 Mar 2022 08:18:37 -0400
Message-ID: <YjR4nWL9RXOq1mDi@bfoster>
In-Reply-To: <164755207216.4194202.19795257360716142.stgit@magnolia>

On Thu, Mar 17, 2022 at 02:21:12PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> xfs_reserve_blocks controls the size of the user-visible free space
> reserve pool.  Given the difference between the current and requested
> pool sizes, it will try to reserve free space from fdblocks.  However,
> the amount requested from fdblocks is also constrained by the amount of
> space that we think xfs_mod_fdblocks will give us.  We'll keep trying to
> reserve space so long as xfs_mod_fdblocks returns ENOSPC.
> 
> In commit fd43cf600cf6, we decided that xfs_mod_fdblocks should not hand
> out the "free space" used by the free space btrees, because some portion
> of the free space btree blocks is held in reserve for future btree
> expansion.  Unfortunately, xfs_reserve_blocks' estimation of the number
> of blocks that it could request from xfs_mod_fdblocks was not updated to
> account for m_allocbt_blks, so if space is extremely low, the caller hangs.
> 
> Fix this by creating a function to estimate the number of blocks that
> can be reserved from fdblocks, which needs to exclude the set-aside and
> m_allocbt_blks.
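
For anyone following along, the hang described above can be reproduced
in a rough userspace model. This is only a sketch, not kernel code: the
struct and every model_* name are made up for illustration, a plain
integer stands in for the percpu counter, locking is omitted, only
growing the pool is modelled, and the loop is capped so that the old
behaviour shows up as a -1 return instead of an actual hang.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the handful of xfs_mount fields involved. */
struct model_mount {
	int64_t fdblocks;	/* models the in-core free block counter */
	int64_t set_aside;	/* models m_alloc_set_aside */
	int64_t allocbt_blks;	/* models m_allocbt_blks */
	int64_t resblks;	/* models the current reserve pool size */
};

/* Refuse any change that would dip below set_aside + allocbt_blks. */
static int model_mod_fdblocks(struct model_mount *mp, int64_t delta)
{
	int64_t floor = mp->set_aside + mp->allocbt_blks;

	if (mp->fdblocks + delta < floor)
		return -ENOSPC;
	mp->fdblocks += delta;
	return 0;
}

/* The estimate, without (old) or with (new) the allocbt blocks excluded. */
static int64_t model_available(const struct model_mount *mp,
			       bool exclude_allocbt)
{
	int64_t free = mp->fdblocks - mp->set_aside;

	if (exclude_allocbt)
		free -= mp->allocbt_blks;
	return free;
}

/*
 * Model of the retry loop when growing the pool to "request" blocks.
 * Returns the number of passes through the loop, or -1 if max_iters
 * passes went by without the loop terminating.
 */
static int model_reserve(struct model_mount *mp, int64_t request,
			 bool exclude_allocbt, int max_iters)
{
	int error = -ENOSPC;
	int iters = 0;

	do {
		if (iters == max_iters)
			return -1;
		iters++;

		int64_t free = model_available(mp, exclude_allocbt);
		if (free <= 0)
			break;

		/* Take what was asked for, or whatever we think is free. */
		int64_t delta = request - mp->resblks;
		int64_t take = free < delta ? free : delta;

		error = model_mod_fdblocks(mp, -take);
		if (!error)
			mp->resblks += take;
	} while (error == -ENOSPC);

	return iters;
}
```

With fdblocks = 10, set_aside = 8 and allocbt_blks = 4, the old estimate
sees 2 free blocks, the mod function refuses to hand them out, and the
loop spins until the cap. With the allocbt blocks excluded, the estimate
is negative and the loop breaks on the first pass.
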
> 
> Found by running xfs/306 (which formats a single-AG 20MB filesystem)
> with an fstests configuration that specifies a 1k blocksize and a
> specially crafted log size that will consume 7/8 of the space (17920
> blocks, specifically) in that AG.
> 
> Cc: Brian Foster <bfoster@redhat.com>
> Fixes: fd43cf600cf6 ("xfs: set aside allocation btree blocks from block reservation")
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> ---
>  fs/xfs/xfs_fsops.c |    7 +++++--
>  fs/xfs/xfs_mount.h |   29 +++++++++++++++++++++++++++++
>  2 files changed, 34 insertions(+), 2 deletions(-)
> 
> 
> diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
> index 33e26690a8c4..b71799a3acd3 100644
> --- a/fs/xfs/xfs_fsops.c
> +++ b/fs/xfs/xfs_fsops.c
> @@ -433,8 +433,11 @@ xfs_reserve_blocks(
>  	 */
>  	error = -ENOSPC;
>  	do {
> -		free = percpu_counter_sum(&mp->m_fdblocks) -
> -						mp->m_alloc_set_aside;
> +		/*
> +		 * The reservation pool cannot take space that xfs_mod_fdblocks
> +		 * will not give us.
> +		 */

This comment seems unnecessary. I'm not sure what it tells us that the
code doesn't already..?

> +		free = xfs_fdblocks_available(mp);
>  		if (free <= 0)
>  			break;
>  
> diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
> index 00720a02e761..998b54c3c454 100644
> --- a/fs/xfs/xfs_mount.h
> +++ b/fs/xfs/xfs_mount.h
> @@ -479,6 +479,35 @@ extern void	xfs_unmountfs(xfs_mount_t *);
>   */
>  #define XFS_FDBLOCKS_BATCH	1024
>  
> +/*
> + * Estimate the amount of space that xfs_mod_fdblocks might give us without
> + * drawing from the reservation pool.  In other words, estimate the free space
> + * that is available to userspace.
> + *
> + * This quantity is the amount of free space tracked in the on-disk metadata
> + * minus:
> + *
> + * - Delayed allocation reservations
> + * - Per-AG space reservations to guarantee metadata expansion
> + * - Userspace-controlled free space reserve pool
> + *
> + * - Space reserved to ensure that we can always split a bmap btree
> + * - Free space btree blocks that are not available for allocation due to
> + *   per-AG metadata reservations
> + *
> + * The first three are captured in the incore fdblocks counter.
> + */

Hm. Sometimes I wonder if we overdocument things to our own detriment
(reading back my own comments at times suggests I'm terrible at this).
So do we really need to document which other internal reservations are
or are not taken out of ->m_fdblocks here..? I suspect we already have
plenty of documentation for things like perag res colocated with the
actual code, such that this kind of thing just creates an external
reference that will probably bitrot as the years go by. Can we reduce
this down to just explaining how/why this helper has to calculate a
block availability value for blocks that otherwise haven't been
explicitly allocated out of the in-core free block counters?

> +static inline int64_t
> +xfs_fdblocks_available(
> +	struct xfs_mount	*mp)
> +{
> +	int64_t			free = percpu_counter_sum(&mp->m_fdblocks);
> +
> +	free -= mp->m_alloc_set_aside;
> +	free -= atomic64_read(&mp->m_allocbt_blks);
> +	return free;
> +}
> +

FWIW the helper seems fine in context, but will this help us avoid the
duplicate calculation in xfs_mod_fdblocks(), for instance?
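
To make the question concrete, here is a userspace sketch of the kind of
sharing I mean. It is not a proposal for the actual code: the struct and
all sketch_* names are hypothetical, and a plain integer stands in for
the percpu counter. The point is only that a single helper computes the
unavailable amount, so the estimate and the limit check cannot drift
apart.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant xfs_mount fields. */
struct sketch_mount {
	int64_t fdblocks;	/* models the in-core free block counter */
	int64_t set_aside;	/* models m_alloc_set_aside */
	int64_t allocbt_blks;	/* models m_allocbt_blks */
};

/* Single source of truth for blocks that must never be handed out. */
static int64_t sketch_unavailable(const struct sketch_mount *mp)
{
	return mp->set_aside + mp->allocbt_blks;
}

/* The estimate used by the reserve pool code. */
static int64_t sketch_available(const struct sketch_mount *mp)
{
	return mp->fdblocks - sketch_unavailable(mp);
}

/* The limit check reuses the same helper instead of recomputing it. */
static int sketch_mod_fdblocks(struct sketch_mount *mp, int64_t delta)
{
	if (mp->fdblocks + delta < sketch_unavailable(mp))
		return -ENOSPC;
	mp->fdblocks += delta;
	return 0;
}
```

In this model, a request for exactly sketch_available() blocks succeeds
and a request for one more block fails with -ENOSPC.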

Brian

>  extern int	xfs_mod_fdblocks(struct xfs_mount *mp, int64_t delta,
>  				 bool reserved);
>  extern int	xfs_mod_frextents(struct xfs_mount *mp, int64_t delta);
> 

