From: Brian Foster <bfoster@redhat.com>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 21/23] xfs: clean up and simplify xfs_dialloc()
Date: Thu, 27 May 2021 07:20:20 -0400
Message-ID: <YK+AdHtinZckZP3I@bfoster>
In-Reply-To: <20210519012102.450926-22-david@fromorbit.com>

On Wed, May 19, 2021 at 11:21:00AM +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
>
> Because it's a mess.
>
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---
> fs/xfs/libxfs/xfs_ialloc.c | 268 +++++++++++++++++++++----------------
> 1 file changed, 153 insertions(+), 115 deletions(-)
>
> diff --git a/fs/xfs/libxfs/xfs_ialloc.c b/fs/xfs/libxfs/xfs_ialloc.c
> index 88c15d8bd6e4..63a1b6d422cc 100644
> --- a/fs/xfs/libxfs/xfs_ialloc.c
> +++ b/fs/xfs/libxfs/xfs_ialloc.c
...
> @@ -1590,16 +1594,136 @@ xfs_ialloc_next_ag(
>  	return agno;
>  }
>
> +static bool
> +xfs_dialloc_good_ag(
> +	struct xfs_trans	*tp,
> +	struct xfs_perag	*pag,
> +	umode_t			mode,
> +	int			flags,
> +	bool			ok_alloc)
> +{
> +	struct xfs_mount	*mp = tp->t_mountp;
> +	xfs_extlen_t		ineed;
> +	xfs_extlen_t		longest = 0;
> +	int			needspace;
> +	int			error;
> +
> +	if (!pag->pagi_inodeok)
> +		return false;
> +
> +	if (!pag->pagi_init) {
> +		error = xfs_ialloc_pagi_init(mp, tp, pag->pag_agno);
> +		if (error)
> +			return false;
> +	}
> +
> +	if (pag->pagi_freecount)
> +		return true;
> +	if (!ok_alloc)
> +		return false;

It looks like this fixes the problem in the earlier patch (though the
earlier patch should still be fixed up):

Reviewed-by: Brian Foster <bfoster@redhat.com>

> +
> +	if (!pag->pagf_init) {
> +		error = xfs_alloc_pagf_init(mp, tp, pag->pag_agno, flags);
> +		if (error)
> +			return false;
> +	}
> +
> +	/*
> +	 * Check that there is enough free space for the file plus a chunk of
> +	 * inodes if we need to allocate some. If this is the first pass across
> +	 * the AGs, take into account the potential space needed for alignment
> +	 * of inode chunks when checking the longest contiguous free space in
> +	 * the AG - this prevents us from getting ENOSPC because we have free
> +	 * space larger than ialloc_blks but alignment constraints prevent us
> +	 * from using it.
> +	 *
> +	 * If we can't find an AG with space for full alignment slack to be
> +	 * taken into account, we must be near ENOSPC in all AGs. Hence we
> +	 * don't include alignment for the second pass and so if we fail
> +	 * allocation due to alignment issues then it is most likely a real
> +	 * ENOSPC condition.
> +	 *
> +	 * XXX(dgc): this calculation is now bogus thanks to the per-ag
> +	 * reservations that xfs_alloc_fix_freelist() now does via
> +	 * xfs_alloc_space_available(). When the AG fills up, pagf_freeblks will
> +	 * be more than large enough for the check below to succeed, but
> +	 * xfs_alloc_space_available() will fail because of the non-zero
> +	 * metadata reservation and hence we won't actually be able to allocate
> +	 * more inodes in this AG. We do soooo much unnecessary work near ENOSPC
> +	 * because of this.
> +	 */
> +	ineed = M_IGEO(mp)->ialloc_min_blks;
> +	if (flags && ineed > 1)
> +		ineed += M_IGEO(mp)->cluster_align;
> +	longest = pag->pagf_longest;
> +	if (!longest)
> +		longest = pag->pagf_flcount > 0;
> +	needspace = S_ISDIR(mode) || S_ISREG(mode) || S_ISLNK(mode);
> +
> +	if (pag->pagf_freeblks < needspace + ineed || longest < ineed)
> +		return false;
> +	return true;
> +}
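
For anyone following along, the free space check at the end of
xfs_dialloc_good_ag() can be modelled in userspace roughly as below. This
is only a sketch and not kernel code: struct ag_space, ag_has_space() and
the first_pass flag are hypothetical stand-ins for the perag counters,
the quoted check and the non-zero trylock flags respectively.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-AG counters the quoted check reads. */
struct ag_space {
	unsigned int	freeblks;	/* models pagf_freeblks */
	unsigned int	longest;	/* models pagf_longest */
	unsigned int	flcount;	/* models pagf_flcount */
};

/*
 * Userspace model of the arithmetic in the quoted hunk. min_blks and
 * cluster_align model ialloc_min_blks and cluster_align; first_pass
 * models "flags != 0" (the trylock pass).
 */
static bool
ag_has_space(
	const struct ag_space	*ag,
	unsigned int		min_blks,
	unsigned int		cluster_align,
	bool			first_pass,
	bool			needspace)
{
	unsigned int		ineed = min_blks;
	unsigned int		longest;

	assert(ag != NULL);

	/* Only the first pass demands room for alignment slack. */
	if (first_pass && ineed > 1)
		ineed += cluster_align;

	/* With no free extent, count a non-empty free list as one block. */
	longest = ag->longest;
	if (!longest)
		longest = ag->flcount > 0;

	if (ag->freeblks < (unsigned int)needspace + ineed || longest < ineed)
		return false;
	return true;
}
```

With freeblks = 20, longest = 10, min_blks = 8 and cluster_align = 4, the
first pass wants 12 contiguous blocks and rejects the AG, while the second
pass only wants 8 and accepts it.
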
> +
> +static int
> +xfs_dialloc_try_ag(
> +	struct xfs_trans	**tpp,
> +	struct xfs_perag	*pag,
> +	xfs_ino_t		parent,
> +	xfs_ino_t		*new_ino,
> +	bool			ok_alloc)
> +{
> +	struct xfs_buf		*agbp;
> +	xfs_ino_t		ino;
> +	int			error;
> +
> +	/*
> +	 * Then read in the AGI buffer and recheck with the AGI buffer
> +	 * lock held.
> +	 */
> +	error = xfs_ialloc_read_agi(pag->pag_mount, *tpp, pag->pag_agno, &agbp);
> +	if (error)
> +		return error;
> +
> +	if (!pag->pagi_freecount) {
> +		if (!ok_alloc) {
> +			error = -EAGAIN;
> +			goto out_release;
> +		}
> +
> +		error = xfs_ialloc_ag_alloc(*tpp, agbp, pag);
> +		if (error < 0)
> +			goto out_release;
> +
> +		/*
> +		 * We successfully allocated space for an inode cluster in this
> +		 * AG. Roll the transaction so that we can allocate one of the
> +		 * new inodes.
> +		 */
> +		ASSERT(pag->pagi_freecount > 0);
> +		error = xfs_dialloc_roll(tpp, agbp);
> +		if (error)
> +			goto out_release;
> +	}
> +
> +	/* Allocate an inode in the found AG */
> +	error = xfs_dialloc_ag(*tpp, agbp, pag, parent, &ino);
> +	if (!error)
> +		*new_ino = ino;
> +	return error;
> +
> +out_release:
> +	xfs_trans_brelse(*tpp, agbp);
> +	return error;
> +}
> +
>  /*
> - * Select and prepare an AG for inode allocation.
> + * Allocate an on-disk inode.
>   *
>   * Mode is used to tell whether the new inode is a directory and hence where to
> - * locate it.
> - *
> - * This function will ensure that the selected AG has free inodes available to
> - * allocate from. The selected AGI will be returned locked to the caller, and it
> - * will allocate more free inodes if required. If no free inodes are found or
> - * can be allocated, -ENOSPC be returned.
> + * locate it. The on-disk inode that is allocated will be returned in @new_ino
> + * on success, otherwise an error will be set to indicate the failure (e.g.
> + * -ENOSPC).
>   */
>  int
>  xfs_dialloc(
> @@ -1609,14 +1733,12 @@ xfs_dialloc(
>  	xfs_ino_t		*new_ino)
>  {
>  	struct xfs_mount	*mp = (*tpp)->t_mountp;
> -	struct xfs_buf		*agbp;
>  	xfs_agnumber_t		agno;
>  	int			error = 0;
>  	xfs_agnumber_t		start_agno;
>  	struct xfs_perag	*pag;
>  	struct xfs_ino_geometry	*igeo = M_IGEO(mp);
> -	bool			okalloc = true;
> -	int			needspace;
> +	bool			ok_alloc = true;
>  	int			flags;
>  	xfs_ino_t		ino;
>
> @@ -1625,7 +1747,6 @@ xfs_dialloc(
>  	 * one block, so factor that potential expansion when we examine whether
>  	 * an AG has enough space for file creation.
>  	 */
> -	needspace = S_ISDIR(mode) || S_ISREG(mode) || S_ISLNK(mode);
>  	if (S_ISDIR(mode))
>  		start_agno = xfs_ialloc_next_ag(mp);
>  	else {
> @@ -1636,7 +1757,7 @@ xfs_dialloc(
>
>  	/*
>  	 * If we have already hit the ceiling of inode blocks then clear
> -	 * okalloc so we scan all available agi structures for a free
> +	 * ok_alloc so we scan all available agi structures for a free
>  	 * inode.
>  	 *
>  	 * Read rough value of mp->m_icount by percpu_counter_read_positive,
> @@ -1645,7 +1766,7 @@ xfs_dialloc(
>  	if (igeo->maxicount &&
>  	    percpu_counter_read_positive(&mp->m_icount) + igeo->ialloc_inos
>  							> igeo->maxicount) {
> -		okalloc = false;
> +		ok_alloc = false;
>  	}
>
>  	/*
> @@ -1656,93 +1777,14 @@ xfs_dialloc(
>  	agno = start_agno;
>  	flags = XFS_ALLOC_FLAG_TRYLOCK;
>  	for (;;) {
> -		xfs_extlen_t	ineed;
> -		xfs_extlen_t	longest = 0;
> -
>  		pag = xfs_perag_get(mp, agno);
> -		if (!pag->pagi_inodeok)
> -			goto nextag;
> -
> -		if (!pag->pagi_init) {
> -			error = xfs_ialloc_pagi_init(mp, *tpp, agno);
> -			if (error)
> -				break;
> -		}
> -
> -		if (!pag->pagi_freecount && !okalloc)
> -			goto nextag;
> -
> -		if (!pag->pagf_init) {
> -			error = xfs_alloc_pagf_init(mp, *tpp, agno, flags);
> -			if (error)
> -				goto nextag;
> -		}
> -
> -		/*
> -		 * Check that there is enough free space for the file plus a
> -		 * chunk of inodes if we need to allocate some. If this is the
> -		 * first pass across the AGs, take into account the potential
> -		 * space needed for alignment of inode chunks when checking the
> -		 * longest contiguous free space in the AG - this prevents us
> -		 * from getting ENOSPC because we have free space larger than
> -		 * ialloc_blks but alignment constraints prevent us from using
> -		 * it.
> -		 *
> -		 * If we can't find an AG with space for full alignment slack to
> -		 * be taken into account, we must be near ENOSPC in all AGs.
> -		 * Hence we don't include alignment for the second pass and so
> -		 * if we fail allocation due to alignment issues then it is most
> -		 * likely a real ENOSPC condition.
> -		 */
> -		ineed = M_IGEO(mp)->ialloc_min_blks;
> -		if (flags && ineed > 1)
> -			ineed += M_IGEO(mp)->cluster_align;
> -		longest = pag->pagf_longest;
> -		if (!longest)
> -			longest = pag->pagf_flcount > 0;
> -
> -		if (pag->pagf_freeblks < needspace + ineed || longest < ineed)
> -			goto nextag;
> -
> -		/*
> -		 * Then read in the AGI buffer and recheck with the AGI buffer
> -		 * lock held.
> -		 */
> -		error = xfs_ialloc_read_agi(mp, *tpp, agno, &agbp);
> -		if (error)
> -			break;
> -
> -		if (pag->pagi_freecount)
> -			goto found_ag;
> -
> -		if (!okalloc)
> -			goto nextag_relse_buffer;
> -
> -		error = xfs_ialloc_ag_alloc(*tpp, agbp, pag);
> -		if (error < 0) {
> -			xfs_trans_brelse(*tpp, agbp);
> -			break;
> -		}
> -
> -		if (error == 0) {
> -			/*
> -			 * We successfully allocated space for an inode cluster
> -			 * in this AG. Roll the transaction so that we can
> -			 * allocate one of the new inodes.
> -			 */
> -			ASSERT(pag->pagi_freecount > 0);
> -
> -			error = xfs_dialloc_roll(tpp, agbp);
> -			if (error) {
> -				xfs_buf_relse(agbp);
> +		if (xfs_dialloc_good_ag(*tpp, pag, mode, flags, ok_alloc)) {
> +			error = xfs_dialloc_try_ag(tpp, pag, parent,
> +					&ino, ok_alloc);
> +			if (error != -EAGAIN)
>  				break;
> -			}
> -			goto found_ag;
>  		}
>
> -nextag_relse_buffer:
> -		xfs_trans_brelse(*tpp, agbp);
> -nextag:
>  		if (XFS_FORCED_SHUTDOWN(mp)) {
>  			error = -EFSCORRUPTED;
>  			break;
> @@ -1750,23 +1792,19 @@ xfs_dialloc(
>  		if (++agno == mp->m_maxagi)
>  			agno = 0;
>  		if (agno == start_agno) {
> -			if (!flags)
> +			if (!flags) {
> +				error = -ENOSPC;
>  				break;
> +			}
>  			flags = 0;
>  		}
>  		xfs_perag_put(pag);
>  	}
>
> +	if (!error)
> +		*new_ino = ino;
>  	xfs_perag_put(pag);
> -	return error ? error : -ENOSPC;
> -found_ag:
> -	/* Allocate an inode in the found AG */
> -	error = xfs_dialloc_ag(*tpp, agbp, pag, parent, &ino);
> -	xfs_perag_put(pag);
> -	if (error)
> -		return error;
> -	*new_ino = ino;
> -	return 0;
> +	return error;
>  }
>
>  /*
> --
> 2.31.1
>
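
As a reading aid, the control flow that the reworked loop ends up with can
be modelled in userspace roughly as below. This is a sketch under stated
assumptions, not the kernel code: struct model_ag and model_dialloc() are
hypothetical names, the two "good" fields stand in for the result of
xfs_dialloc_good_ag() on the trylock and blocking passes, try_result
stands in for the return value of xfs_dialloc_try_ag(), and the shutdown
check and perag reference counting are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical per-AG model: canned answers for the two helper calls. */
struct model_ag {
	bool	good_trylock;	/* pre-check result on the first pass */
	bool	good_blocking;	/* pre-check result on the second pass */
	int	try_result;	/* 0, -EAGAIN, or another negative errno */
};

/*
 * Walk the AGs starting at start_agno. The first full pass models the
 * trylock scan, the second the blocking scan. -EAGAIN from the "try"
 * step means "move on to the next AG"; any other result ends the scan.
 * Two fruitless passes end in -ENOSPC.
 */
static int
model_dialloc(
	const struct model_ag	*ags,
	unsigned int		nr_ags,
	unsigned int		start_agno,
	unsigned int		*found_agno)
{
	unsigned int		agno = start_agno;
	bool			trylock = true;
	int			error = 0;

	assert(nr_ags > 0 && start_agno < nr_ags);

	for (;;) {
		const struct model_ag	*ag = &ags[agno];
		bool			good;

		good = trylock ? ag->good_trylock : ag->good_blocking;
		if (good) {
			error = ag->try_result;
			if (error != -EAGAIN)
				break;
		}

		if (++agno == nr_ags)
			agno = 0;
		if (agno == start_agno) {
			if (!trylock) {
				error = -ENOSPC;
				break;
			}
			trylock = false;
		}
	}

	if (!error)
		*found_agno = agno;
	return error;
}
```

In this model, an AG that returns -EAGAIN is simply skipped, and the only
place -ENOSPC is generated is the wrap-around check after the second pass,
which matches how the patch drops the old "error ? error : -ENOSPC" return.
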