All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: linux-xfs@vger.kernel.org, bfoster@redhat.com, hch@lst.de
Subject: Re: [PATCH 08/13] xfs: refactor xfs_qm_dqtobp and xfs_qm_dqalloc
Date: Wed, 2 May 2018 17:10:52 -0700	[thread overview]
Message-ID: <20180503001052.GC4127@magnolia> (raw)
In-Reply-To: <152506704453.21553.1328093513666063391.stgit@magnolia>

On Sun, Apr 29, 2018 at 10:44:04PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Separate the disk dquot read and allocation functionality into
> two helper functions, then refactor dqread to call them directly.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  fs/xfs/xfs_dquot.c |  261 ++++++++++++++++++++--------------------------------
>  1 file changed, 98 insertions(+), 163 deletions(-)
> 
> 
> diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c
> index f1dc62ca54a7..d8730f6110b8 100644
> --- a/fs/xfs/xfs_dquot.c
> +++ b/fs/xfs/xfs_dquot.c
> @@ -288,49 +288,43 @@ xfs_dquot_set_prealloc_limits(struct xfs_dquot *dqp)
>  }
>  
>  /*
> - * Allocate a block and fill it with dquots.
> - * This is called when the bmapi finds a hole.
> + * Ensure that the given in-core dquot has a buffer on disk backing it, and
> + * return the buffer. This is called when the bmapi finds a hole.
>   */
>  STATIC int
> -xfs_qm_dqalloc(
> -	xfs_trans_t	**tpp,
> -	xfs_mount_t	*mp,
> -	xfs_dquot_t	*dqp,
> -	xfs_inode_t	*quotip,
> -	xfs_fileoff_t	offset_fsb,
> -	xfs_buf_t	**O_bpp)
> +xfs_dquot_disk_alloc(
> +	struct xfs_trans	**tpp,
> +	struct xfs_dquot	*dqp,
> +	struct xfs_buf		**bpp)
>  {
> -	xfs_fsblock_t	firstblock;
> -	struct xfs_defer_ops dfops;
> -	xfs_bmbt_irec_t map;
> -	int		nmaps, error;
> -	xfs_buf_t	*bp;
> -	xfs_trans_t	*tp = *tpp;
> -
> -	ASSERT(tp != NULL);
> +	struct xfs_bmbt_irec	map;
> +	struct xfs_defer_ops	dfops;
> +	struct xfs_mount	*mp = (*tpp)->t_mountp;
> +	struct xfs_buf		*bp;
> +	struct xfs_inode	*quotip = xfs_quota_inode(mp, dqp->dq_flags);
> +	xfs_fsblock_t		firstblock;
> +	int			nmaps = 1;
> +	int			error;
>  
>  	trace_xfs_dqalloc(dqp);
>  
> -	/*
> -	 * Initialize the bmap freelist prior to calling bmapi code.
> -	 */
>  	xfs_defer_init(&dfops, &firstblock);
>  	xfs_ilock(quotip, XFS_ILOCK_EXCL);
> -	/*
> -	 * Return if this type of quotas is turned off while we didn't
> -	 * have an inode lock
> -	 */
>  	if (!xfs_this_quota_on(dqp->q_mount, dqp->dq_flags)) {
> +		/*
> +		 * Return if this type of quotas is turned off while we didn't
> +		 * have an inode lock
> +		 */
>  		xfs_iunlock(quotip, XFS_ILOCK_EXCL);
>  		return -ESRCH;
>  	}
>  
> -	xfs_trans_ijoin(tp, quotip, XFS_ILOCK_EXCL);
> -	nmaps = 1;
> -	error = xfs_bmapi_write(tp, quotip, offset_fsb,
> -				XFS_DQUOT_CLUSTER_SIZE_FSB, XFS_BMAPI_METADATA,
> -				&firstblock, XFS_QM_DQALLOC_SPACE_RES(mp),
> -				&map, &nmaps, &dfops);
> +	/* Create the block mapping. */
> +	xfs_trans_ijoin(*tpp, quotip, XFS_ILOCK_EXCL);
> +	error = xfs_bmapi_write(*tpp, quotip, dqp->q_fileoffset,
> +			XFS_DQUOT_CLUSTER_SIZE_FSB, XFS_BMAPI_METADATA,
> +			&firstblock, XFS_QM_DQALLOC_SPACE_RES(mp),
> +			&map, &nmaps, &dfops);
>  	if (error)
>  		goto error0;
>  	ASSERT(map.br_blockcount == XFS_DQUOT_CLUSTER_SIZE_FSB);
> @@ -344,10 +338,8 @@ xfs_qm_dqalloc(
>  	dqp->q_blkno = XFS_FSB_TO_DADDR(mp, map.br_startblock);
>  
>  	/* now we can just get the buffer (there's nothing to read yet) */
> -	bp = xfs_trans_get_buf(tp, mp->m_ddev_targp,
> -			       dqp->q_blkno,
> -			       mp->m_quotainfo->qi_dqchunklen,
> -			       0);
> +	bp = xfs_trans_get_buf(*tpp, mp->m_ddev_targp, dqp->q_blkno,
> +			mp->m_quotainfo->qi_dqchunklen, 0);
>  	if (!bp) {
>  		error = -ENOMEM;
>  		goto error1;
> @@ -358,37 +350,22 @@ xfs_qm_dqalloc(
>  	 * Make a chunk of dquots out of this buffer and log
>  	 * the entire thing.
>  	 */
> -	xfs_qm_init_dquot_blk(tp, mp, be32_to_cpu(dqp->q_core.d_id),
> +	xfs_qm_init_dquot_blk(*tpp, mp, be32_to_cpu(dqp->q_core.d_id),
>  			      dqp->dq_flags & XFS_DQ_ALLTYPES, bp);
> +	xfs_buf_set_ref(bp, XFS_DQUOT_REF);
>  
>  	/*
> -	 * xfs_defer_finish() may commit the current transaction and
> -	 * start a second transaction if the freelist is not empty.
> -	 *
> -	 * Since we still want to modify this buffer, we need to
> -	 * ensure that the buffer is not released on commit of
> -	 * the first transaction and ensure the buffer is added to the
> -	 * second transaction.
> -	 *
> -	 * If there is only one transaction then don't stop the buffer
> -	 * from being released when it commits later on.
> +	 * Hold the buffer and join it to the dfops so that we'll still own
> +	 * the buffer when we return to the caller.
>  	 */
> -
> -	xfs_trans_bhold(tp, bp);
> -
> +	xfs_trans_bhold(*tpp, bp);
> +	error = xfs_defer_bjoin(&dfops, bp);
> +	if (error)
> +		goto error1;
>  	error = xfs_defer_finish(tpp, &dfops);
>  	if (error)
>  		goto error1;

Sigh... this is wrong.  We allocated a dquot block and get_buf'd it to
initialize the dquot buffer.  If the defer_finish fails we error out,
leaving the locked buffer dangling.  This seems to have been introduced
in commit efa092f3d4c6 "[XFS] Fixes a bug in the quota code when
allocating a new dquot record" back in 2005.  The leaked locked buffer
would seem to be the cause of the only-when-quota generic/388 hangs that
I was complaining about at LSF last week.

The _defer_bjoin compounds this mistake further by re-bhold'ing the
buffer after each transaction roll -- if the defer_finish succeeds but
the subsequent trans_commit fails, we again leak the locked buffer.

Will repost this patch with all this fixed.

--D

> -
> -	/* Transaction was committed? */
> -	if (*tpp != tp) {
> -		tp = *tpp;
> -		xfs_trans_bjoin(tp, bp);
> -	} else {
> -		xfs_trans_bhold_release(tp, bp);
> -	}
> -
> -	*O_bpp = bp;
> +	*bpp = bp;
>  	return 0;
>  
>  error1:
> @@ -398,32 +375,24 @@ xfs_qm_dqalloc(
>  }
>  
>  /*
> - * Maps a dquot to the buffer containing its on-disk version.
> - * This returns a ptr to the buffer containing the on-disk dquot
> - * in the bpp param, and a ptr to the on-disk dquot within that buffer
> + * Read in the in-core dquot's on-disk metadata and return the buffer.
> + * Returns ENOENT to signal a hole.
>   */
>  STATIC int
> -xfs_qm_dqtobp(
> -	xfs_trans_t		**tpp,
> -	xfs_dquot_t		*dqp,
> -	xfs_disk_dquot_t	**O_ddpp,
> -	xfs_buf_t		**O_bpp,
> -	uint			flags)
> +xfs_dquot_disk_read(
> +	struct xfs_mount	*mp,
> +	struct xfs_dquot	*dqp,
> +	struct xfs_buf		**bpp)
>  {
>  	struct xfs_bmbt_irec	map;
> -	int			nmaps = 1, error;
>  	struct xfs_buf		*bp;
> -	struct xfs_inode	*quotip;
> -	struct xfs_mount	*mp = dqp->q_mount;
> -	xfs_dqid_t		id = be32_to_cpu(dqp->q_core.d_id);
> -	struct xfs_trans	*tp = (tpp ? *tpp : NULL);
> +	struct xfs_inode	*quotip = xfs_quota_inode(mp, dqp->dq_flags);
>  	uint			lock_mode;
> -
> -	quotip = xfs_quota_inode(dqp->q_mount, dqp->dq_flags);
> -	dqp->q_fileoffset = (xfs_fileoff_t)id / mp->m_quotainfo->qi_dqperchunk;
> +	int			nmaps = 1;
> +	int			error;
>  
>  	lock_mode = xfs_ilock_data_map_shared(quotip);
> -	if (!xfs_this_quota_on(dqp->q_mount, dqp->dq_flags)) {
> +	if (!xfs_this_quota_on(mp, dqp->dq_flags)) {
>  		/*
>  		 * Return if this type of quotas is turned off while we
>  		 * didn't have the quota inode lock.
> @@ -436,57 +405,36 @@ xfs_qm_dqtobp(
>  	 * Find the block map; no allocations yet
>  	 */
>  	error = xfs_bmapi_read(quotip, dqp->q_fileoffset,
> -			       XFS_DQUOT_CLUSTER_SIZE_FSB, &map, &nmaps, 0);
> -
> +			XFS_DQUOT_CLUSTER_SIZE_FSB, &map, &nmaps, 0);
>  	xfs_iunlock(quotip, lock_mode);
>  	if (error)
>  		return error;
>  
>  	ASSERT(nmaps == 1);
> -	ASSERT(map.br_blockcount == 1);
> +	ASSERT(map.br_blockcount >= 1);
> +	ASSERT(map.br_startblock != DELAYSTARTBLOCK);
> +	if (map.br_startblock == HOLESTARTBLOCK)
> +		return -ENOENT;
> +
> +	trace_xfs_dqtobp_read(dqp);
>  
>  	/*
> -	 * Offset of dquot in the (fixed sized) dquot chunk.
> +	 * store the blkno etc so that we don't have to do the
> +	 * mapping all the time
>  	 */
> -	dqp->q_bufoffset = (id % mp->m_quotainfo->qi_dqperchunk) *
> -		sizeof(xfs_dqblk_t);
> -
> -	ASSERT(map.br_startblock != DELAYSTARTBLOCK);
> -	if (map.br_startblock == HOLESTARTBLOCK) {
> -		/*
> -		 * We don't allocate unless we're asked to
> -		 */
> -		if (!(flags & XFS_QMOPT_DQALLOC))
> -			return -ENOENT;
> -
> -		ASSERT(tp);
> -		error = xfs_qm_dqalloc(tpp, mp, dqp, quotip,
> -					dqp->q_fileoffset, &bp);
> -		if (error)
> -			return error;
> -		tp = *tpp;
> -	} else {
> -		trace_xfs_dqtobp_read(dqp);
> +	dqp->q_blkno = XFS_FSB_TO_DADDR(mp, map.br_startblock);
>  
> -		/*
> -		 * store the blkno etc so that we don't have to do the
> -		 * mapping all the time
> -		 */
> -		dqp->q_blkno = XFS_FSB_TO_DADDR(mp, map.br_startblock);
> -
> -		error = xfs_trans_read_buf(mp, tp, mp->m_ddev_targp,
> -					   dqp->q_blkno,
> -					   mp->m_quotainfo->qi_dqchunklen,
> -					   0, &bp, &xfs_dquot_buf_ops);
> -		if (error) {
> -			ASSERT(bp == NULL);
> -			return error;
> -		}
> +	error = xfs_trans_read_buf(mp, NULL, mp->m_ddev_targp, dqp->q_blkno,
> +			mp->m_quotainfo->qi_dqchunklen, 0, &bp,
> +			&xfs_dquot_buf_ops);
> +	if (error) {
> +		ASSERT(bp == NULL);
> +		return error;
>  	}
>  
>  	ASSERT(xfs_buf_islocked(bp));
> -	*O_bpp = bp;
> -	*O_ddpp = bp->b_addr + dqp->q_bufoffset;
> +	xfs_buf_set_ref(bp, XFS_DQUOT_REF);
> +	*bpp = bp;
>  
>  	return 0;
>  }
> @@ -508,6 +456,12 @@ xfs_dquot_alloc(
>  	INIT_LIST_HEAD(&dqp->q_lru);
>  	mutex_init(&dqp->q_qlock);
>  	init_waitqueue_head(&dqp->q_pinwait);
> +	dqp->q_fileoffset = (xfs_fileoff_t)id / mp->m_quotainfo->qi_dqperchunk;
> +	/*
> +	 * Offset of dquot in the (fixed sized) dquot chunk.
> +	 */
> +	dqp->q_bufoffset = (id % mp->m_quotainfo->qi_dqperchunk) *
> +			sizeof(xfs_dqblk_t);
>  
>  	/*
>  	 * Because we want to use a counting completion, complete
> @@ -546,8 +500,10 @@ xfs_dquot_alloc(
>  STATIC void
>  xfs_dquot_from_disk(
>  	struct xfs_dquot	*dqp,
> -	struct xfs_disk_dquot	*ddqp)
> +	struct xfs_buf		*bp)
>  {
> +	struct xfs_disk_dquot	*ddqp = bp->b_addr + dqp->q_bufoffset;
> +
>  	/* copy everything from disk dquot to the incore dquot */
>  	memcpy(&dqp->q_core, ddqp, sizeof(xfs_disk_dquot_t));
>  
> @@ -575,74 +531,53 @@ xfs_qm_dqread(
>  	xfs_dqid_t		id,
>  	uint			type,
>  	uint			flags,
> -	struct xfs_dquot	**O_dqpp)
> +	struct xfs_dquot	**dqpp)
>  {
>  	struct xfs_dquot	*dqp;
> -	struct xfs_disk_dquot	*ddqp;
>  	struct xfs_buf		*bp;
> -	struct xfs_trans	*tp = NULL;
> +	struct xfs_trans	*tp;
>  	int			error;
>  
>  	dqp = xfs_dquot_alloc(mp, id, type);
>  	trace_xfs_dqread(dqp);
>  
> -	if (flags & XFS_QMOPT_DQALLOC) {
> +	/* Try to read the buffer... */
> +	error = xfs_dquot_disk_read(mp, dqp, &bp);
> +	if (error == -ENOENT && (flags & XFS_QMOPT_DQALLOC)) {
> +		/* ...or allocate a new block and buffer. */
>  		error = xfs_trans_alloc(mp, &M_RES(mp)->tr_qm_dqalloc,
>  				XFS_QM_DQALLOC_SPACE_RES(mp), 0, 0, &tp);
>  		if (error)
> -			goto error0;
> -	}
> +			goto err;
>  
> -	/*
> -	 * get a pointer to the on-disk dquot and the buffer containing it
> -	 * dqp already knows its own type (GROUP/USER).
> -	 */
> -	error = xfs_qm_dqtobp(&tp, dqp, &ddqp, &bp, flags);
> -	if (error) {
> -		/*
> -		 * This can happen if quotas got turned off (ESRCH),
> -		 * or if the dquot didn't exist on disk and we ask to
> -		 * allocate (ENOENT).
> -		 */
> -		trace_xfs_dqread_fail(dqp);
> -		goto error1;
> -	}
> -
> -	xfs_dquot_from_disk(dqp, ddqp);
> +		error = xfs_dquot_disk_alloc(&tp, dqp, &bp);
> +		if (error)
> +			goto err_cancel;
>  
> -	/* Mark the buf so that this will stay incore a little longer */
> -	xfs_buf_set_ref(bp, XFS_DQUOT_REF);
> +		error = xfs_trans_commit(tp);
> +	}
> +	if (error)
> +		goto err;
>  
>  	/*
> -	 * We got the buffer with a xfs_trans_read_buf() (in dqtobp())
> -	 * So we need to release with xfs_trans_brelse().
> -	 * The strategy here is identical to that of inodes; we lock
> -	 * the dquot in xfs_qm_dqget() before making it accessible to
> -	 * others. This is because dquots, like inodes, need a good level of
> -	 * concurrency, and we don't want to take locks on the entire buffers
> -	 * for dquot accesses.
> -	 * Note also that the dquot buffer may even be dirty at this point, if
> -	 * this particular dquot was repaired. We still aren't afraid to
> -	 * brelse it because we have the changes incore.
> +	 * At this point we should have a clean locked buffer.  Copy the data
> +	 * to the incore dquot and release the buffer since the incore dquot
> +	 * has its own locking protocol so we needn't tie up the buffer any
> +	 * further.
>  	 */
>  	ASSERT(xfs_buf_islocked(bp));
> -	xfs_trans_brelse(tp, bp);
> +	xfs_dquot_from_disk(dqp, bp);
>  
> -	if (tp) {
> -		error = xfs_trans_commit(tp);
> -		if (error)
> -			goto error0;
> -	}
> -
> -	*O_dqpp = dqp;
> +	xfs_buf_relse(bp);
> +	*dqpp = dqp;
>  	return error;
>  
> -error1:
> -	if (tp)
> -		xfs_trans_cancel(tp);
> -error0:
> +err_cancel:
> +	xfs_trans_cancel(tp);
> +err:
> +	trace_xfs_dqread_fail(dqp);
>  	xfs_qm_dqdestroy(dqp);
> -	*O_dqpp = NULL;
> +	*dqpp = NULL;
>  	return error;
>  }
>  
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

  parent reply	other threads:[~2018-05-03  0:11 UTC|newest]

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-04-30  5:43 [PATCH v2 00/13] xfs-4.18: quota refactor Darrick J. Wong
2018-04-30  5:43 ` [PATCH 01/13] xfs: refactor XFS_QMOPT_DQNEXT out of existence Darrick J. Wong
2018-04-30  5:43 ` [PATCH 02/13] xfs: refactor dquot cache handling Darrick J. Wong
2018-04-30  5:43 ` [PATCH 03/13] xfs: delegate dqget input checks to helper function Darrick J. Wong
2018-04-30  5:43 ` [PATCH 04/13] xfs: remove unnecessary xfs_qm_dqattach parameter Darrick J. Wong
2018-04-30  5:43 ` [PATCH 05/13] xfs: split out dqget for inodes from regular dqget Darrick J. Wong
2018-04-30  5:43 ` [PATCH 06/13] xfs: fetch dquots directly during quotacheck Darrick J. Wong
2018-04-30  5:43 ` [PATCH 07/13] xfs: refactor incore dquot initialization functions Darrick J. Wong
2018-04-30  5:44 ` [PATCH 08/13] xfs: refactor xfs_qm_dqtobp and xfs_qm_dqalloc Darrick J. Wong
2018-05-01 13:44   ` Brian Foster
2018-05-02 16:32   ` Christoph Hellwig
2018-05-03  0:10   ` Darrick J. Wong [this message]
2018-04-30  5:44 ` [PATCH 09/13] xfs: remove xfs_qm_dqread flags argument Darrick J. Wong
2018-05-01 13:44   ` Brian Foster
2018-05-02 16:34   ` Christoph Hellwig
2018-05-02 16:58     ` Darrick J. Wong
2018-05-07 14:41       ` Christoph Hellwig
2018-05-08  0:04         ` Darrick J. Wong
2018-05-08  0:05   ` [PATCH 09/13] xfs: remove direct calls to _qm_dqread Darrick J. Wong
2018-05-09 16:40     ` Brian Foster
2018-05-10  8:26     ` Christoph Hellwig
2018-05-10 15:20       ` Darrick J. Wong
2018-04-30  5:44 ` [PATCH 10/13] xfs: replace XFS_QMOPT_DQALLOC with XFS_DQGET_{ALLOC, EXISTS} Darrick J. Wong
2018-04-30  5:47   ` [PATCH v2 10/13] xfs: replace XFS_QMOPT_DQALLOC with boolean Darrick J. Wong
2018-05-01 13:45     ` Brian Foster
2018-05-01 15:52       ` Darrick J. Wong
2018-05-02 16:35     ` Christoph Hellwig
2018-04-30  5:44 ` [PATCH 11/13] xfs: report failing address when dquot verifier fails Darrick J. Wong
2018-04-30  5:44 ` [PATCH 12/13] xfs: rename on-disk dquot counter zap functions Darrick J. Wong
2018-05-01 13:45   ` Brian Foster
2018-05-02 16:35   ` Christoph Hellwig
2018-04-30  5:44 ` [PATCH 13/13] xfs: refactor dquot iteration Darrick J. Wong
2018-05-01 13:45   ` Brian Foster
2018-05-01 15:53     ` Darrick J. Wong
2018-05-02 16:37       ` Christoph Hellwig
2018-05-02 16:43   ` [PATCH v2 " Darrick J. Wong
2018-05-03 17:53 ` [PATCH 0.1/13] xfs: release new dquot buffer on defer_finish error Darrick J. Wong
2018-05-04 11:31   ` Brian Foster
2018-05-04 15:12     ` Darrick J. Wong
2018-05-04 15:41       ` Brian Foster
2018-05-04 15:52         ` Darrick J. Wong
2018-05-04 16:03           ` Brian Foster
2018-05-04 20:05             ` Darrick J. Wong
2018-05-04 21:19   ` [PATCH v2 " Darrick J. Wong
2018-05-07 11:03     ` Brian Foster
2018-05-03 17:54 ` [PATCH 0.2/13] xfs: don't spray logs when dquot flush/purge fail Darrick J. Wong
2018-05-04 11:32   ` Brian Foster
  -- strict thread matches above, loose matches on Subject: below --
2018-04-22 15:05 [PATCH 00/13] xfs-4.18: quota refactor Darrick J. Wong
2018-04-22 15:06 ` [PATCH 08/13] xfs: refactor xfs_qm_dqtobp and xfs_qm_dqalloc Darrick J. Wong
2018-04-23 17:31   ` Christoph Hellwig
2018-04-24 13:07   ` Brian Foster
2018-04-24 14:08     ` Darrick J. Wong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180503001052.GC4127@magnolia \
    --to=darrick.wong@oracle.com \
    --cc=bfoster@redhat.com \
    --cc=hch@lst.de \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.