From: Brian Foster <bfoster@redhat.com>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 24/30] xfs: rework stale inodes in xfs_ifree_cluster
Date: Fri, 5 Jun 2020 14:27:22 -0400
Message-ID: <20200605182722.GH23747@bfoster>
In-Reply-To: <20200604074606.266213-25-david@fromorbit.com>

On Thu, Jun 04, 2020 at 05:46:00PM +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
>
> Once we have inodes pinning the cluster buffer and attached whenever
> they are dirty, we no longer have a guarantee that the items are
> flush locked when we lock the cluster buffer. Hence we cannot just
> walk the buffer log item list and modify the attached inodes.
>
> If the inode is not flush locked, we have to ILOCK it first and then
> flush lock it to do all the prerequisite checks needed to avoid
> races with other code. This is already handled by
> xfs_ifree_get_one_inode(), so rework the inode iteration loop and
> function to update all inodes in cache whether they are attached to
> the buffer or not.
>
> Note: we also remove the copying of the log item lsn to the
> ili_flush_lsn as xfs_iflush_done() now uses the XFS_ISTALE flag to
> trigger aborts and so flush lsn matching is not needed in IO
> completion for processing freed inodes.
>
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
> fs/xfs/xfs_inode.c | 158 ++++++++++++++++++---------------------------
> 1 file changed, 62 insertions(+), 96 deletions(-)
>
> diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> index 272b54cf97000..fb4c614c64fda 100644
> --- a/fs/xfs/xfs_inode.c
> +++ b/fs/xfs/xfs_inode.c
...
> @@ -2559,43 +2563,53 @@ xfs_ifree_get_one_inode(
> */
> if (ip != free_ip) {
> if (!xfs_ilock_nowait(ip, XFS_ILOCK_EXCL)) {
> + spin_unlock(&ip->i_flags_lock);
> rcu_read_unlock();
> delay(1);
> goto retry;
> }
> -
> - /*
> - * Check the inode number again in case we're racing with
> - * freeing in xfs_reclaim_inode(). See the comments in that
> - * function for more information as to why the initial check is
> - * not sufficient.
> - */
> - if (ip->i_ino != inum) {
> - xfs_iunlock(ip, XFS_ILOCK_EXCL);
> - goto out_rcu_unlock;
> - }

Why is the recheck under ILOCK_EXCL no longer necessary? It looks like
reclaim decides whether or not to proceed under the ilock and doesn't
acquire the i_flags_lock spinlock until it has decided to reclaim. Hm?

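For reference, the hunk removed above is an instance of the usual check,
lock, re-check pattern. Below is a minimal user-space sketch of that
pattern, not kernel code. It assumes a pthread mutex standing in for the
ILOCK, and the names struct obj and lock_and_validate() are hypothetical
stand-ins for the inode and the lookup path (neither exists in XFS):

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an in-memory inode; not struct xfs_inode. */
struct obj {
	pthread_mutex_t	lock;	/* plays the role of the ILOCK */
	uint64_t	id;	/* plays the role of i_ino; 0 means "being freed" */
};

/*
 * Try to lock the object and confirm it still has the identity we looked
 * up. Returns true with the lock held, or false with the lock not held.
 */
static bool lock_and_validate(struct obj *o, uint64_t expected_id)
{
	/*
	 * Unlocked check: cheap, but it can race with a concurrent free.
	 * A real implementation would need an atomic or otherwise
	 * serialised load here.
	 */
	if (o->id != expected_id)
		return false;

	/* Contended: the caller is expected to back off and retry. */
	if (pthread_mutex_trylock(&o->lock) != 0)
		return false;

	/* Re-check under the lock: the identity may have changed meanwhile. */
	if (o->id != expected_id) {
		pthread_mutex_unlock(&o->lock);
		return false;
	}
	return true;
}
```

The second comparison is the part whose removal is being asked about: it
only becomes redundant if something else already keeps the identity stable
between the first check and the lock acquisition.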
> }
> + ip->i_flags |= XFS_ISTALE;
> + spin_unlock(&ip->i_flags_lock);
> rcu_read_unlock();
>
> - xfs_iflock(ip);
> - xfs_iflags_set(ip, XFS_ISTALE);
> + /*
> + * If we can't get the flush lock, the inode is already attached. All
> + * we needed to do here is mark the inode stale so buffer IO completion
> + * will remove it from the AIL.
> + */

To make sure I'm following this correctly, we can assume the inode is
attached based on an xfs_iflock_nowait() failure because we hold the
ilock, right? IOW, any other task doing a similar iflock check would have
to do so under the ilock and release the flush lock before dropping the
ilock if the inode didn't end up flushed, for whatever reason.

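The assumption I'm describing can be modelled in user space roughly as
follows. This is only a sketch under stated assumptions: struct node and
all function names are hypothetical, a pthread mutex stands in for the
ilock, and a C11 atomic_flag stands in for the flush lock because the
flush lock is released by I/O completion rather than by the task that
took it. The model also assumes I/O completion cannot run while the check
is in progress:

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model: "outer" plays the ilock, "flush" the flush lock. */
struct node {
	pthread_mutex_t	outer;
	atomic_flag	flush;
	bool		attached;	/* set once a flush has queued the node for I/O */
};

static void node_init(struct node *n)
{
	pthread_mutex_init(&n->outer, NULL);
	atomic_flag_clear(&n->flush);
	n->attached = false;
}

/*
 * Assumed convention: the flush lock is only taken with the outer lock
 * held, and a taker that does not attach the node drops the flush lock
 * again before it drops the outer lock.
 */
static void flush_or_back_off(struct node *n, bool dirty)
{
	pthread_mutex_lock(&n->outer);
	if (!atomic_flag_test_and_set(&n->flush)) {	/* trylock succeeded */
		if (dirty)
			n->attached = true;	/* stays flush locked until I/O completes */
		else
			atomic_flag_clear(&n->flush);
	}
	pthread_mutex_unlock(&n->outer);
}

/* Models I/O completion: detach, then release the flush lock. */
static void io_complete(struct node *n)
{
	n->attached = false;
	atomic_flag_clear(&n->flush);
}

/*
 * With the outer lock held, a failed flush trylock should imply that the
 * node is attached. Returns true if that invariant holds right now.
 */
static bool trylock_failure_implies_attached(struct node *n)
{
	bool ok = true;

	pthread_mutex_lock(&n->outer);
	if (atomic_flag_test_and_set(&n->flush))
		ok = n->attached;		/* trylock failed */
	else
		atomic_flag_clear(&n->flush);	/* trylock succeeded; undo it */
	pthread_mutex_unlock(&n->outer);
	return ok;
}
```

If any path took the flush lock without the outer lock, or dropped the
outer lock while still holding the flush lock on a node it did not attach,
the check in the last function could fail, which is the case I want to
rule out.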
> + iip = ip->i_itemp;
> + if (!xfs_iflock_nowait(ip)) {
> + ASSERT(!list_empty(&iip->ili_item.li_bio_list));
> + ASSERT(iip->ili_last_fields);
> + goto out_iunlock;
> + }
> + ASSERT(!iip || list_empty(&iip->ili_item.li_bio_list));
>
> /*
> - * We don't need to attach clean inodes or those only with unlogged
> - * changes (which we throw away, anyway).
> + * Clean inodes can be released immediately. Everything else has to go
> + * through xfs_iflush_abort() on journal commit as the flock
> + * synchronises removal of the inode from the cluster buffer against
> + * inode reclaim.
> */
> - if (!ip->i_itemp || xfs_inode_clean(ip)) {
> - ASSERT(ip != free_ip);
> + if (xfs_inode_clean(ip)) {
> xfs_ifunlock(ip);
> - xfs_iunlock(ip, XFS_ILOCK_EXCL);
> - goto out_no_inode;
> + goto out_iunlock;
> }
> - return ip;
>
> -out_rcu_unlock:
> - rcu_read_unlock();
> -out_no_inode:
> - return NULL;
> + /* we have a dirty inode in memory that has not yet been flushed. */
> + ASSERT(iip->ili_fields);
> + spin_lock(&iip->ili_lock);
> + iip->ili_last_fields = iip->ili_fields;
> + iip->ili_fields = 0;
> + iip->ili_fsync_fields = 0;
> + spin_unlock(&iip->ili_lock);
> + list_add_tail(&iip->ili_item.li_bio_list, &bp->b_li_list);
> + ASSERT(iip->ili_last_fields);

We already asserted ->ili_fields and assigned ->ili_fields to
->ili_last_fields, so this assert seems spurious.

Brian

> +
> +out_iunlock:
> + if (ip != free_ip)
> + xfs_iunlock(ip, XFS_ILOCK_EXCL);
> }
>
> /*
> @@ -2605,26 +2619,20 @@ xfs_ifree_get_one_inode(
> */
> STATIC int
> xfs_ifree_cluster(
> - xfs_inode_t *free_ip,
> - xfs_trans_t *tp,
> + struct xfs_inode *free_ip,
> + struct xfs_trans *tp,
> struct xfs_icluster *xic)
> {
> - xfs_mount_t *mp = free_ip->i_mount;
> + struct xfs_mount *mp = free_ip->i_mount;
> + struct xfs_ino_geometry *igeo = M_IGEO(mp);
> + struct xfs_buf *bp;
> + xfs_daddr_t blkno;
> + xfs_ino_t inum = xic->first_ino;
> int nbufs;
> int i, j;
> int ioffset;
> - xfs_daddr_t blkno;
> - xfs_buf_t *bp;
> - xfs_inode_t *ip;
> - struct xfs_inode_log_item *iip;
> - struct xfs_log_item *lip;
> - struct xfs_perag *pag;
> - struct xfs_ino_geometry *igeo = M_IGEO(mp);
> - xfs_ino_t inum;
> int error;
>
> - inum = xic->first_ino;
> - pag = xfs_perag_get(mp, XFS_INO_TO_AGNO(mp, inum));
> nbufs = igeo->ialloc_blks / igeo->blocks_per_cluster;
>
> for (j = 0; j < nbufs; j++, inum += igeo->inodes_per_cluster) {
> @@ -2668,59 +2676,16 @@ xfs_ifree_cluster(
> bp->b_ops = &xfs_inode_buf_ops;
>
> /*
> - * Walk the inodes already attached to the buffer and mark them
> - * stale. These will all have the flush locks held, so an
> - * in-memory inode walk can't lock them. By marking them all
> - * stale first, we will not attempt to lock them in the loop
> - * below as the XFS_ISTALE flag will be set.
> - */
> - list_for_each_entry(lip, &bp->b_li_list, li_bio_list) {
> - if (lip->li_type == XFS_LI_INODE) {
> - iip = (struct xfs_inode_log_item *)lip;
> - xfs_trans_ail_copy_lsn(mp->m_ail,
> - &iip->ili_flush_lsn,
> - &iip->ili_item.li_lsn);
> - xfs_iflags_set(iip->ili_inode, XFS_ISTALE);
> - }
> - }
> -
> -
> - /*
> - * For each inode in memory attempt to add it to the inode
> - * buffer and set it up for being staled on buffer IO
> - * completion. This is safe as we've locked out tail pushing
> - * and flushing by locking the buffer.
> - *
> - * We have already marked every inode that was part of a
> - * transaction stale above, which means there is no point in
> - * even trying to lock them.
> + * Now we need to set all the cached clean inodes as XFS_ISTALE,
> + * too. This requires lookups, and will skip inodes that we've
> + * already marked XFS_ISTALE.
> */
> - for (i = 0; i < igeo->inodes_per_cluster; i++) {
> - ip = xfs_ifree_get_one_inode(pag, free_ip, inum + i);
> - if (!ip)
> - continue;
> -
> - iip = ip->i_itemp;
> - spin_lock(&iip->ili_lock);
> - iip->ili_last_fields = iip->ili_fields;
> - iip->ili_fields = 0;
> - iip->ili_fsync_fields = 0;
> - spin_unlock(&iip->ili_lock);
> - xfs_trans_ail_copy_lsn(mp->m_ail, &iip->ili_flush_lsn,
> - &iip->ili_item.li_lsn);
> -
> - list_add_tail(&iip->ili_item.li_bio_list,
> - &bp->b_li_list);
> -
> - if (ip != free_ip)
> - xfs_iunlock(ip, XFS_ILOCK_EXCL);
> - }
> + for (i = 0; i < igeo->inodes_per_cluster; i++)
> + xfs_ifree_mark_inode_stale(bp, free_ip, inum + i);
>
> xfs_trans_stale_inode_buf(tp, bp);
> xfs_trans_binval(tp, bp);
> }
> -
> - xfs_perag_put(pag);
> return 0;
> }
>
> @@ -3845,6 +3810,7 @@ xfs_iflush_int(
> iip->ili_fields = 0;
> iip->ili_fsync_fields = 0;
> spin_unlock(&iip->ili_lock);
> + ASSERT(iip->ili_last_fields);
>
> /*
> * Store the current LSN of the inode so that we can tell whether the
> --
> 2.26.2.761.g0e0b3e54be
>