linux-xfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Brian Foster <bfoster@redhat.com>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 4/3] xfs: EOF blocks are not busy extents
Date: Wed, 20 Feb 2019 10:12:07 -0500	[thread overview]
Message-ID: <20190220151207.GA4930@localhost.localdomain> (raw)
In-Reply-To: <20190218022619.GD14116@dastard>

On Mon, Feb 18, 2019 at 01:26:19PM +1100, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> Userdata extents are considered as "busy extents" when they are
> freed to ensure that they ar enot reallocated and written to before
> the transaction that frees the user data extent has been committed
> to disk.
> 
> However, in the case of post EOF blocks, these block have
> never been exposed to user applications and so don't contain valid
> data. Hence they don't need to be considered "busy" when they've
> been freed because there is no data in them that can be destroyed
> if they are reallocated and have data written to them before the
> free transaction is committed.
> 
> We already have XFS_BMAPI_NODISCARD to extent freeing that the data

"to extent freeing" ?

> extent has never been used so it doesn't need discards issued on it.
> This new functionality is just an extension of that concept - the
> extent is actually unused, so doesn't even need to be marked busy.
> 
> Hence fix this by adding XFS_BMAPI_UNUSED and update the EOF block
> data extent truncate with XFS_BMAPI_UNUSED and propagate that all
> the way through the various structures an use it to avoid inserting
> the extent into the busy list.
> 
> [ Note: this seems like a bit of a hack, but I just don't see the
> point of inserting it into the busy list, then adding a new busy
> list unused flag, then allowing every allocation type to use it in
> the _trim code, then have every caller have to call _reuse to remove
> the range from the busy tree. It just seems like complexity that
> doesn't need to exist because anyone can reallocate an
> unused extent for immediate use. ]
> 

Yeah, I'm not a fan of adding that kind of noop code to achieve behavior
that's equivalent to bypassing subsystems or removing unnecessary code.

> This avoids the problem of free space fragmentation when multiple
> files are written sequentially  via synchronous writes or post-write
> fsync calls before the next file is written. This results in the
> post-eof blocks being marked busy and can't be immediately
> reallocated resulting in the files packing poorly and unnecessarily
> leaving free space between them.
> 
> Freespace fragmentation from sequential multi-file synchronous write
> workload:
> 
> Before:
>   from      to extents  blocks    pct
>       1       1       7       7   0.00
>       2       3      34      80   0.00
>       4       7      65     345   0.01
>       8      15     208    2417   0.05
>      16      31     147    2982   0.06
>      32      63       1      49   0.00
> 1048576 1310720       4 5185064  99.89
> 
> After:
>    from      to extents  blocks    pct
>       1       1       3       3   0.00
>       2       3       1       3   0.00
> 1048576 1310720       4 5190871 100.00
> 
> Much better.
> 
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---

This seems reasonable to me in principle. Some comments...

>  fs/xfs/libxfs/xfs_alloc.c  | 7 +++----
>  fs/xfs/libxfs/xfs_bmap.c   | 6 +++---
>  fs/xfs/libxfs/xfs_bmap.h   | 8 ++++++--
>  fs/xfs/xfs_bmap_util.c     | 2 +-
>  fs/xfs/xfs_trans_extfree.c | 6 +++---
>  5 files changed, 16 insertions(+), 13 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
> index 659bb9133955..729136ef0ed1 100644
> --- a/fs/xfs/libxfs/xfs_alloc.c
> +++ b/fs/xfs/libxfs/xfs_alloc.c
> @@ -3014,7 +3014,7 @@ __xfs_free_extent(
>  	xfs_extlen_t			len,
>  	const struct xfs_owner_info	*oinfo,
>  	enum xfs_ag_resv_type		type,
> -	bool				skip_discard)
> +	bool				unused)
>  {
>  	struct xfs_mount		*mp = tp->t_mountp;
>  	struct xfs_buf			*agbp;
> @@ -3045,9 +3045,8 @@ __xfs_free_extent(
>  	if (error)
>  		goto err;
>  
> -	if (skip_discard)
> -		busy_flags |= XFS_EXTENT_BUSY_SKIP_DISCARD;
> -	xfs_extent_busy_insert(tp, agno, agbno, len, busy_flags);
> +	if (!unused)
> +		xfs_extent_busy_insert(tp, agno, agbno, len, busy_flags);

busy_flags now serves no purpose in this function (it's hardcoded to
zero).

>  	return 0;
>  
>  err:
> diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
> index 332eefa2700b..ba0fd80eede4 100644
> --- a/fs/xfs/libxfs/xfs_bmap.c
> +++ b/fs/xfs/libxfs/xfs_bmap.c
> @@ -537,7 +537,7 @@ __xfs_bmap_add_free(
>  	xfs_fsblock_t			bno,
>  	xfs_filblks_t			len,
>  	const struct xfs_owner_info	*oinfo,
> -	bool				skip_discard)
> +	bool				unused)
>  {
>  	struct xfs_extent_free_item	*new;		/* new element */
>  #ifdef DEBUG
> @@ -565,7 +565,7 @@ __xfs_bmap_add_free(
>  		new->xefi_oinfo = *oinfo;
>  	else
>  		new->xefi_oinfo = XFS_RMAP_OINFO_SKIP_UPDATE;
> -	new->xefi_skip_discard = skip_discard;
> +	new->xefi_unused = unused;
>  	trace_xfs_bmap_free_defer(tp->t_mountp,
>  			XFS_FSB_TO_AGNO(tp->t_mountp, bno), 0,
>  			XFS_FSB_TO_AGBNO(tp->t_mountp, bno), len);
> @@ -5069,7 +5069,7 @@ xfs_bmap_del_extent_real(
>  		} else {
>  			__xfs_bmap_add_free(tp, del->br_startblock,
>  					del->br_blockcount, NULL,
> -					(bflags & XFS_BMAPI_NODISCARD) ||
> +					(bflags & XFS_BMAPI_UNUSED) ||
>  					del->br_state == XFS_EXT_UNWRITTEN);

Note that this does technically change unrelated behavior as we now
consider unwritten extents as unused (along with post-eof blocks). I
think that's still fine for similar reasons as for eofblocks, but this
should at least be noted in the commit log.

>  		}
>  	}
> diff --git a/fs/xfs/libxfs/xfs_bmap.h b/fs/xfs/libxfs/xfs_bmap.h
> index 09d3ea97cc15..33fb95f84ea0 100644
> --- a/fs/xfs/libxfs/xfs_bmap.h
> +++ b/fs/xfs/libxfs/xfs_bmap.h
> @@ -54,7 +54,7 @@ struct xfs_extent_free_item
>  	xfs_extlen_t		xefi_blockcount;/* number of blocks in extent */
>  	struct list_head	xefi_list;
>  	struct xfs_owner_info	xefi_oinfo;	/* extent owner */
> -	bool			xefi_skip_discard;
> +	bool			xefi_unused;
>  };
>  
>  #define	XFS_BMAP_MAX_NMAP	4
> @@ -107,6 +107,9 @@ struct xfs_extent_free_item
>  /* Do not update the rmap btree.  Used for reconstructing bmbt from rmapbt. */
>  #define XFS_BMAPI_NORMAP	0x2000
>  
> +/* Unused freed data extent, no need to mark busy */
> +#define XFS_BMAPI_UNUSED	0x4000
> +
>  #define XFS_BMAPI_FLAGS \
>  	{ XFS_BMAPI_ENTIRE,	"ENTIRE" }, \
>  	{ XFS_BMAPI_METADATA,	"METADATA" }, \
> @@ -120,7 +123,8 @@ struct xfs_extent_free_item
>  	{ XFS_BMAPI_DELALLOC,	"DELALLOC" }, \
>  	{ XFS_BMAPI_CONVERT_ONLY, "CONVERT_ONLY" }, \
>  	{ XFS_BMAPI_NODISCARD,	"NODISCARD" }, \
> -	{ XFS_BMAPI_NORMAP,	"NORMAP" }
> +	{ XFS_BMAPI_NORMAP,	"NORMAP" }, \
> +	{ XFS_BMAPI_UNUSED,	"UNUSED" }

We can kill XFS_BMAPI_NODISCARD since _UNUSED replaces the only usage of
it. I suppose we might as well just rename it as you've done for the
related plumbing.

Brian

>  
>  
>  static inline int xfs_bmapi_aflag(int w)
> diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
> index af2e30d33794..f5a8a4385512 100644
> --- a/fs/xfs/xfs_bmap_util.c
> +++ b/fs/xfs/xfs_bmap_util.c
> @@ -841,7 +841,7 @@ xfs_free_eofblocks(
>  		 * may be full of holes (ie NULL files bug).
>  		 */
>  		error = xfs_itruncate_extents_flags(&tp, ip, XFS_DATA_FORK,
> -					XFS_ISIZE(ip), XFS_BMAPI_NODISCARD);
> +					XFS_ISIZE(ip), XFS_BMAPI_UNUSED);
>  		if (error) {
>  			/*
>  			 * If we get an error at this point we simply don't
> diff --git a/fs/xfs/xfs_trans_extfree.c b/fs/xfs/xfs_trans_extfree.c
> index 0710434eb240..d06fb2cd6ffb 100644
> --- a/fs/xfs/xfs_trans_extfree.c
> +++ b/fs/xfs/xfs_trans_extfree.c
> @@ -58,7 +58,7 @@ xfs_trans_free_extent(
>  	xfs_fsblock_t			start_block,
>  	xfs_extlen_t			ext_len,
>  	const struct xfs_owner_info	*oinfo,
> -	bool				skip_discard)
> +	bool				unused)
>  {
>  	struct xfs_mount		*mp = tp->t_mountp;
>  	struct xfs_extent		*extp;
> @@ -71,7 +71,7 @@ xfs_trans_free_extent(
>  	trace_xfs_bmap_free_deferred(tp->t_mountp, agno, 0, agbno, ext_len);
>  
>  	error = __xfs_free_extent(tp, start_block, ext_len,
> -				  oinfo, XFS_AG_RESV_NONE, skip_discard);
> +				  oinfo, XFS_AG_RESV_NONE, unused);
>  	/*
>  	 * Mark the transaction dirty, even on error. This ensures the
>  	 * transaction is aborted, which:
> @@ -184,7 +184,7 @@ xfs_extent_free_finish_item(
>  	error = xfs_trans_free_extent(tp, done_item,
>  			free->xefi_startblock,
>  			free->xefi_blockcount,
> -			&free->xefi_oinfo, free->xefi_skip_discard);
> +			&free->xefi_oinfo, free->xefi_unused);
>  	kmem_free(free);
>  	return error;
>  }

      reply	other threads:[~2019-02-20 15:12 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-07  5:08 [RFC PATCH 0/3]: Extreme fragmentation ahoy! Dave Chinner
2019-02-07  5:08 ` [PATCH 1/3] xfs: Don't free EOF blocks on sync write close Dave Chinner
2019-02-07  5:08 ` [PATCH 2/3] xfs: Don't free EOF blocks on close when extent size hints are set Dave Chinner
2019-02-07 15:51   ` Brian Foster
2019-02-07  5:08 ` [PATCH 3/3] xfs: Don't free EOF blocks on sync write close Dave Chinner
2019-02-07  5:19   ` Dave Chinner
2019-02-07  5:21 ` [RFC PATCH 0/3]: Extreme fragmentation ahoy! Darrick J. Wong
2019-02-07  5:39   ` Dave Chinner
2019-02-07 15:52     ` Brian Foster
2019-02-08  2:47       ` Dave Chinner
2019-02-08 12:34         ` Brian Foster
2019-02-12  1:13           ` Darrick J. Wong
2019-02-12 11:46             ` Brian Foster
2019-02-12 20:21               ` Dave Chinner
2019-02-13 13:50                 ` Brian Foster
2019-02-13 22:27                   ` Dave Chinner
2019-02-14 13:00                     ` Brian Foster
2019-02-14 21:51                       ` Dave Chinner
2019-02-15  2:35                         ` Brian Foster
2019-02-15  7:23                           ` Dave Chinner
2019-02-15 20:33                             ` Brian Foster
2019-02-08 16:29         ` Darrick J. Wong
2019-02-18  2:26 ` [PATCH 4/3] xfs: EOF blocks are not busy extents Dave Chinner
2019-02-20 15:12   ` Brian Foster [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190220151207.GA4930@localhost.localdomain \
    --to=bfoster@redhat.com \
    --cc=david@fromorbit.com \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).