From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Christoph Hellwig <hch@lst.de>
Cc: linux-xfs@vger.kernel.org, Dave Chinner <dchinner@redhat.com>
Subject: Re: [PATCH 01/10] xfs: fix transaction leak in xfs_reflink_allocate_cow()
Date: Mon, 17 Sep 2018 16:51:10 -0700
Message-ID: <20180917235110.GA20086@magnolia>
In-Reply-To: <20180917205354.15401-2-hch@lst.de>

On Mon, Sep 17, 2018 at 10:53:45PM +0200, Christoph Hellwig wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> When xfs_reflink_allocate_cow() allocates a transaction, it drops
> the ILOCK to perform the operation. This introduces a race condition
> where another thread modifying the file can perform the COW
> allocation operation underneath us. This results in the retry loop
> finding an allocated block and jumping straight to the conversion
> code. It does not, however, cancel the transaction it holds and so
> this gets leaked. This results in a lockdep warning:
> 
> ================================================
> WARNING: lock held when returning to user space!
> 4.18.5 #1 Not tainted
> ------------------------------------------------
> worker/6123 is leaving the kernel with locks still held!
> 1 lock held by worker/6123:
>  #0: 000000009eab4f1b (sb_internal#2){.+.+}, at: xfs_trans_alloc+0x17c/0x220
> 
> And eventually the filesystem deadlocks because it runs out of log
> space that is reserved by the leaked transaction and never gets
> released.
> 
> The logic flow in xfs_reflink_allocate_cow() is a convoluted mess of
> gotos - it's no surprise that it has a bug where the flow through
> several goto jumps then fails to clean up context from a non-obvious
> logic path. Clean up the logic flow and make sure every path does
> the right thing.
> 
> Reported-by: Alexander Y. Fomichev <git.user@gmail.com>
> Tested-by: Alexander Y. Fomichev <git.user@gmail.com>
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=200981
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> [hch: slight refactor]
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Looks ok,
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

--D
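
For anyone following along, the leak described in the commit message can
be modelled in userspace. This is only a rough sketch, not the kernel
code: struct model_inode and every model_* function are hypothetical
stand-ins, and the racing writer is simulated with a flag rather than a
real thread. It shows why the recheck after retaking the lock has to
release the transaction before heading to the conversion step.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for an inode; not an XFS structure. */
struct model_inode {
	bool	cow_extent_allocated;	/* a real COW extent covers the range */
	bool	racer_allocates;	/* simulate another writer winning the race */
	int	live_transactions;	/* allocated, not yet committed or cancelled */
};

/*
 * Models transaction allocation: the lock is dropped while waiting, so a
 * racing writer may allocate the COW extent before we get the lock back.
 */
static void model_trans_alloc(struct model_inode *ip)
{
	ip->live_transactions++;
	if (ip->racer_allocates)
		ip->cow_extent_allocated = true;
}

/* Models both commit and cancel: either one releases the transaction. */
static void model_trans_release(struct model_inode *ip)
{
	ip->live_transactions--;
}

/* Old flow: the recheck jumps to conversion and forgets the transaction. */
static int model_allocate_cow_leaky(struct model_inode *ip)
{
	if (ip->cow_extent_allocated)
		return 0;			/* nothing to allocate */
	model_trans_alloc(ip);
	if (ip->cow_extent_allocated)
		return 0;			/* BUG: transaction is still live */
	ip->cow_extent_allocated = true;	/* perform the allocation */
	model_trans_release(ip);		/* commit */
	return 0;
}

/* Fixed flow: cancel the transaction before taking the early exit. */
static int model_allocate_cow_fixed(struct model_inode *ip)
{
	if (ip->cow_extent_allocated)
		return 0;			/* nothing to allocate */
	model_trans_alloc(ip);
	if (ip->cow_extent_allocated) {
		model_trans_release(ip);	/* cancel: someone beat us to it */
		return 0;
	}
	ip->cow_extent_allocated = true;	/* perform the allocation */
	model_trans_release(ip);		/* commit */
	return 0;
}
```

With racer_allocates set, the leaky variant returns with
live_transactions still at 1, which is the userspace analogue of the
lockdep warning quoted above; the fixed variant returns with 0.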

> ---
>  fs/xfs/xfs_reflink.c | 127 ++++++++++++++++++++++++++-----------------
>  1 file changed, 77 insertions(+), 50 deletions(-)
> 
> diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
> index 38f405415b88..d60d0eeed7b9 100644
> --- a/fs/xfs/xfs_reflink.c
> +++ b/fs/xfs/xfs_reflink.c
> @@ -352,6 +352,47 @@ xfs_reflink_convert_cow(
>  	return error;
>  }
>  
> +/*
> + * Find the extent that maps the given range in the COW fork. Even if the extent
> + * is not shared we might have a preallocation for it in the COW fork. If so we
> + * use it rather than trigger a new allocation.
> + */
> +static int
> +xfs_find_trim_cow_extent(
> +	struct xfs_inode	*ip,
> +	struct xfs_bmbt_irec	*imap,
> +	bool			*shared,
> +	bool			*found)
> +{
> +	xfs_fileoff_t		offset_fsb = imap->br_startoff;
> +	xfs_filblks_t		count_fsb = imap->br_blockcount;
> +	struct xfs_iext_cursor	icur;
> +	struct xfs_bmbt_irec	got;
> +	bool			trimmed;
> +
> +	*found = false;
> +
> +	/*
> +	 * If we don't find an overlapping extent, trim the range we need to
> +	 * allocate to fit the hole we found.
> +	 */
> +	if (!xfs_iext_lookup_extent(ip, ip->i_cowfp, offset_fsb, &icur, &got) ||
> +	    got.br_startoff > offset_fsb)
> +		return xfs_reflink_trim_around_shared(ip, imap, shared, &trimmed);
> +
> +	*shared = true;
> +	if (isnullstartblock(got.br_startblock)) {
> +		xfs_trim_extent(imap, got.br_startoff, got.br_blockcount);
> +		return 0;
> +	}
> +
> +	/* real extent found - no need to allocate */
> +	xfs_trim_extent(&got, offset_fsb, count_fsb);
> +	*imap = got;
> +	*found = true;
> +	return 0;
> +}
> +
>  /* Allocate all CoW reservations covering a range of blocks in a file. */
>  int
>  xfs_reflink_allocate_cow(
> @@ -363,78 +404,64 @@ xfs_reflink_allocate_cow(
>  	struct xfs_mount	*mp = ip->i_mount;
>  	xfs_fileoff_t		offset_fsb = imap->br_startoff;
>  	xfs_filblks_t		count_fsb = imap->br_blockcount;
> -	struct xfs_bmbt_irec	got;
> -	struct xfs_trans	*tp = NULL;
> +	struct xfs_trans	*tp;
>  	int			nimaps, error = 0;
> -	bool			trimmed;
> +	bool			found;
>  	xfs_filblks_t		resaligned;
>  	xfs_extlen_t		resblks = 0;
> -	struct xfs_iext_cursor	icur;
>  
> -retry:
> -	ASSERT(xfs_is_reflink_inode(ip));
>  	ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL));
> +	ASSERT(xfs_is_reflink_inode(ip));
>  
> -	/*
> -	 * Even if the extent is not shared we might have a preallocation for
> -	 * it in the COW fork.  If so use it.
> -	 */
> -	if (xfs_iext_lookup_extent(ip, ip->i_cowfp, offset_fsb, &icur, &got) &&
> -	    got.br_startoff <= offset_fsb) {
> -		*shared = true;
> -
> -		/* If we have a real allocation in the COW fork we're done. */
> -		if (!isnullstartblock(got.br_startblock)) {
> -			xfs_trim_extent(&got, offset_fsb, count_fsb);
> -			*imap = got;
> -			goto convert;
> -		}
> +	error = xfs_find_trim_cow_extent(ip, imap, shared, &found);
> +	if (error || !*shared)
> +		return error;
> +	if (found)
> +		goto convert;
>  
> -		xfs_trim_extent(imap, got.br_startoff, got.br_blockcount);
> -	} else {
> -		error = xfs_reflink_trim_around_shared(ip, imap, shared, &trimmed);
> -		if (error || !*shared)
> -			goto out;
> -	}
> +	resaligned = xfs_aligned_fsb_count(imap->br_startoff,
> +		imap->br_blockcount, xfs_get_cowextsz_hint(ip));
> +	resblks = XFS_DIOSTRAT_SPACE_RES(mp, resaligned);
>  
> -	if (!tp) {
> -		resaligned = xfs_aligned_fsb_count(imap->br_startoff,
> -			imap->br_blockcount, xfs_get_cowextsz_hint(ip));
> -		resblks = XFS_DIOSTRAT_SPACE_RES(mp, resaligned);
> +	xfs_iunlock(ip, *lockmode);
> +	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_write, resblks, 0, 0, &tp);
> +	*lockmode = XFS_ILOCK_EXCL;
> +	xfs_ilock(ip, *lockmode);
>  
> -		xfs_iunlock(ip, *lockmode);
> -		error = xfs_trans_alloc(mp, &M_RES(mp)->tr_write, resblks, 0, 0, &tp);
> -		*lockmode = XFS_ILOCK_EXCL;
> -		xfs_ilock(ip, *lockmode);
> +	if (error)
> +		return error;
>  
> -		if (error)
> -			return error;
> +	error = xfs_qm_dqattach_locked(ip, false);
> +	if (error)
> +		goto out_trans_cancel;
>  
> -		error = xfs_qm_dqattach_locked(ip, false);
> -		if (error)
> -			goto out;
> -		goto retry;
> +	/*
> +	 * Check for an overlapping extent again now that we dropped the ilock.
> +	 */
> +	error = xfs_find_trim_cow_extent(ip, imap, shared, &found);
> +	if (error || !*shared)
> +		goto out_trans_cancel;
> +	if (found) {
> +		xfs_trans_cancel(tp);
> +		goto convert;
>  	}
>  
>  	error = xfs_trans_reserve_quota_nblks(tp, ip, resblks, 0,
>  			XFS_QMOPT_RES_REGBLKS);
>  	if (error)
> -		goto out;
> +		goto out_trans_cancel;
>  
>  	xfs_trans_ijoin(tp, ip, 0);
>  
> -	nimaps = 1;
> -
>  	/* Allocate the entire reservation as unwritten blocks. */
> +	nimaps = 1;
>  	error = xfs_bmapi_write(tp, ip, imap->br_startoff, imap->br_blockcount,
>  			XFS_BMAPI_COWFORK | XFS_BMAPI_PREALLOC,
>  			resblks, imap, &nimaps);
>  	if (error)
> -		goto out_trans_cancel;
> +		goto out_unreserve;
>  
>  	xfs_inode_set_cowblocks_tag(ip);
> -
> -	/* Finish up. */
>  	error = xfs_trans_commit(tp);
>  	if (error)
>  		return error;
> @@ -447,12 +474,12 @@ xfs_reflink_allocate_cow(
>  		return -ENOSPC;
>  convert:
>  	return xfs_reflink_convert_cow_extent(ip, imap, offset_fsb, count_fsb);
> -out_trans_cancel:
> +
> +out_unreserve:
>  	xfs_trans_unreserve_quota_nblks(tp, ip, (long)resblks, 0,
>  			XFS_QMOPT_RES_REGBLKS);
> -out:
> -	if (tp)
> -		xfs_trans_cancel(tp);
> +out_trans_cancel:
> +	xfs_trans_cancel(tp);
>  	return error;
>  }
>  
> -- 
> 2.18.0
> 
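
The reworked error labels in the patch unwind in reverse order of
acquisition: out_unreserve falls through into out_trans_cancel, and the
cancel no longer needs an "if (tp)" guard because every jump to it
happens after the transaction exists. A small userspace sketch of that
ladder follows. The counters, the enum and model_allocate() are
hypothetical, and the errno values are picked only for illustration.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical counters standing in for quota and transaction state. */
struct model_state {
	int	quota_reserved;		/* outstanding quota reservations */
	int	live_transactions;	/* transactions not yet released */
};

enum model_fail { FAIL_NONE, FAIL_QUOTA, FAIL_ALLOC };

static int model_allocate(struct model_state *st, enum model_fail fail)
{
	int error = 0;

	st->live_transactions++;		/* transaction allocated */

	if (fail == FAIL_QUOTA) {
		error = -EDQUOT;		/* reservation failed */
		goto out_trans_cancel;		/* nothing reserved yet */
	}
	st->quota_reserved++;			/* quota reserved */

	if (fail == FAIL_ALLOC) {
		error = -ENOSPC;		/* block allocation failed */
		goto out_unreserve;		/* undo quota, then cancel */
	}

	st->quota_reserved--;			/* reservation consumed by the allocation */
	st->live_transactions--;		/* commit */
	return 0;

out_unreserve:
	st->quota_reserved--;
	/* fall through: the transaction still has to be cancelled */
out_trans_cancel:
	st->live_transactions--;
	return error;
}
```

Whichever step fails, both counters end at zero, so each label only has
to undo what was set up before the jump to it.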

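The new xfs_find_trim_cow_extent() helper in the patch has three
outcomes: no overlapping COW extent, an overlapping delalloc
reservation, or an overlapping real extent. The sketch below models
only the trimming arithmetic on plain offset/length pairs. All names
are hypothetical, the caller is assumed to pass the first COW extent
ending after the request start (or NULL), and the hole case, which the
real helper hands to xfs_reflink_trim_around_shared(), is modelled as
leaving the request untouched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical extent: a file offset and a length, both in blocks. */
struct model_extent {
	unsigned long long	off;
	unsigned long long	len;
};

/* Clamp *e to the range [off, off + len), in the spirit of xfs_trim_extent(). */
static void model_trim(struct model_extent *e, unsigned long long off,
		       unsigned long long len)
{
	unsigned long long end = e->off + e->len;
	unsigned long long want_end = off + len;

	if (e->off < off)
		e->off = off;
	if (end > want_end)
		end = want_end;
	e->len = end > e->off ? end - e->off : 0;
}

/*
 * Returns true in *found only when a real (already allocated) COW extent
 * covers the start of the request; *req is then replaced by that extent
 * trimmed to the original request.
 */
static void model_find_trim(struct model_extent *req,
			    const struct model_extent *cow,
			    bool cow_is_delalloc, bool *found)
{
	struct model_extent got;

	*found = false;

	/* Hole: no COW extent covers the start of the request. */
	if (cow == NULL || cow->off > req->off)
		return;

	if (cow_is_delalloc) {
		/* Shrink the request to the reservation we already hold. */
		model_trim(req, cow->off, cow->len);
		return;
	}

	/* Real extent found: no need to allocate. */
	got = *cow;
	model_trim(&got, req->off, req->len);
	*req = got;
	*found = true;
}
```

For example, a request for blocks 10..29 against a delalloc reservation
covering blocks 5..14 is trimmed to blocks 10..14 and still needs an
allocation, while the same request against a real extent covering
blocks 0..99 comes back unchanged with found set.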