All of lore.kernel.org
 help / color / mirror / Atom feed
From: Lachlan McIlroy <lachlan@sgi.com>
To: David Chinner <dgc@sgi.com>
Cc: xfs@oss.sgi.com, xfs-dev@sgi.com
Subject: Re: [PATCH] fix transaction overrun during writeback
Date: Thu, 01 Nov 2007 12:47:54 +1100	[thread overview]
Message-ID: <4729304A.2010202@sgi.com> (raw)
In-Reply-To: <20071029234010.GU995458@sgi.com>

Looks good Dave.  Since this is a writeback path is there some way
we can tell xfs_bmapi() that it should not convert anything but
delayed allocs and have it assert/error out if it tries to - not
that it will now with this change but just as defensive measure?

David Chinner wrote:
> Prevent transaction overrun in xfs_iomap_write_allocate() if we
> rce with a truncate that overlaps the delalloc range we were
> planning to allocate.
> 
> If we race, we may allocate into a hole and that requires block
> allocation. At this point in time we don't have a reservation for
> block allocation (apart from metadata blocks) and so allocating
> into a hole rather than a delalloc region results in overflowing
> the transaction block reservation.
> 
> Fix it by only allowing a single extent to be allocated at a
> time.
> 
> Signed-Off-By: Dave Chinner <dgc@sgi.com>
> ---
>  fs/xfs/xfs_iomap.c |   75 +++++++++++++++++++++++++++++++++++------------------
>  1 file changed, 50 insertions(+), 25 deletions(-)
> 
> Index: 2.6.x-xfs-new/fs/xfs/xfs_iomap.c
> ===================================================================
> --- 2.6.x-xfs-new.orig/fs/xfs/xfs_iomap.c	2007-10-30 10:18:58.777772241 +1100
> +++ 2.6.x-xfs-new/fs/xfs/xfs_iomap.c	2007-10-30 10:19:30.365685668 +1100
> @@ -702,6 +702,9 @@ retry:
>   * the originating callers request.
>   *
>   * Called without a lock on the inode.
> + *
> + * We no longer bother to look at the incoming map - all we have to
> + * guarantee is that whatever we allocate fills the required range.
>   */
>  int
>  xfs_iomap_write_allocate(
> @@ -717,9 +720,9 @@ xfs_iomap_write_allocate(
>  	xfs_fsblock_t	first_block;
>  	xfs_bmap_free_t	free_list;
>  	xfs_filblks_t	count_fsb;
> -	xfs_bmbt_irec_t	imap[XFS_STRAT_WRITE_IMAPS];
> +	xfs_bmbt_irec_t	imap;
>  	xfs_trans_t	*tp;
> -	int		i, nimaps, committed;
> +	int		nimaps, committed;
>  	int		error = 0;
>  	int		nres;
>  
> @@ -766,13 +769,38 @@ xfs_iomap_write_allocate(
>  
>  			XFS_BMAP_INIT(&free_list, &first_block);
>  
> -			nimaps = XFS_STRAT_WRITE_IMAPS;
>  			/*
> -			 * Ensure we don't go beyond eof - it is possible
> -			 * the extents changed since we did the read call,
> -			 * we dropped the ilock in the interim.
> +			 * it is possible that the extents have changed since
> +			 * we did the read call as we dropped the ilock for a
> +			 * while. We have to be careful about truncates or hole
> +			 * punchs here - we are not allowed to allocate
> +			 * non-delalloc blocks here.
> +			 *
> +			 * The only protection against truncation is the pages
> +			 * for the range we are being asked to convert are
> +			 * locked and hence a truncate will block on them
> +			 * first.
> +			 *
> +			 * As a result, if we go beyond the range we really
> +			 * need and hit an delalloc extent boundary followed by
> +			 * a hole while we have excess blocks in the map, we
> +			 * will fill the hole incorrectly and overrun the
> +			 * transaction reservation.
> +			 *
> +			 * Using a single map prevents this as we are forced to
> +			 * check each map we look for overlap with the desired
> +			 * range and abort as soon as we find it. Also, given
> +			 * that we only return a single map, having one beyond
> +			 * what we can return is probably a bit silly.
> +			 *
> +			 * We also need to check that we don't go beyond EOF;
> +			 * this is a truncate optimisation as a truncate sets
> +			 * the new file size before block on the pages we
> +			 * currently have locked under writeback. Because they
> +			 * are about to be tossed, we don't need to write them
> +			 * back....
>  			 */
> -
> +			nimaps = 1;
>  			end_fsb = XFS_B_TO_FSB(mp, ip->i_size);
>  			xfs_bmap_last_offset(NULL, ip, &last_block,
>  				XFS_DATA_FORK);
> @@ -788,7 +816,7 @@ xfs_iomap_write_allocate(
>  			/* Go get the actual blocks */
>  			error = xfs_bmapi(tp, ip, map_start_fsb, count_fsb,
>  					XFS_BMAPI_WRITE, &first_block, 1,
> -					imap, &nimaps, &free_list, NULL);
> +					&imap, &nimaps, &free_list, NULL);
>  			if (error)
>  				goto trans_cancel;
>  
> @@ -807,27 +835,24 @@ xfs_iomap_write_allocate(
>  		 * See if we were able to allocate an extent that
>  		 * covers at least part of the callers request
>  		 */
> -		for (i = 0; i < nimaps; i++) {
> -			if (unlikely(!imap[i].br_startblock &&
> -				     !(ip->i_d.di_flags & XFS_DIFLAG_REALTIME)))
> -				return xfs_cmn_err_fsblock_zero(ip, &imap[i]);
> -			if ((offset_fsb >= imap[i].br_startoff) &&
> -			    (offset_fsb < (imap[i].br_startoff +
> -					   imap[i].br_blockcount))) {
> -				*map = imap[i];
> -				*retmap = 1;
> -				XFS_STATS_INC(xs_xstrat_quick);
> -				return 0;
> -			}
> -			count_fsb -= imap[i].br_blockcount;
> +		if (unlikely(!imap.br_startblock &&
> +			     XFS_IS_REALTIME_INODE(ip)))
> +			return xfs_cmn_err_fsblock_zero(ip, &imap);
> +		if ((offset_fsb >= imap.br_startoff) &&
> +		    (offset_fsb < (imap.br_startoff +
> +				   imap.br_blockcount))) {
> +			*map = imap;
> +			*retmap = 1;
> +			XFS_STATS_INC(xs_xstrat_quick);
> +			return 0;
>  		}
>  
> -		/* So far we have not mapped the requested part of the
> +		/*
> +		 * So far we have not mapped the requested part of the
>  		 * file, just surrounding data, try again.
>  		 */
> -		nimaps--;
> -		map_start_fsb = imap[nimaps].br_startoff +
> -				imap[nimaps].br_blockcount;
> +		count_fsb -= imap.br_blockcount;
> +		map_start_fsb = imap.br_startoff + imap.br_blockcount;
>  	}
>  
>  trans_cancel:
> 

  reply	other threads:[~2007-11-01  1:47 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-10-29 23:40 [PATCH] fix transaction overrun during writeback David Chinner
2007-11-01  1:47 ` Lachlan McIlroy [this message]
2007-11-01 22:54   ` David Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4729304A.2010202@sgi.com \
    --to=lachlan@sgi.com \
    --cc=dgc@sgi.com \
    --cc=xfs-dev@sgi.com \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.