All of lore.kernel.org
 help / color / mirror / Atom feed
From: Alex Elder <aelder@sgi.com>
To: Dave Chinner <david@fromorbit.com>
Cc: xfs@oss.sgi.com
Subject: Re: [PATCH 1/2] xfs: dynamic speculative EOF preallocation
Date: Thu, 14 Oct 2010 12:22:45 -0500	[thread overview]
Message-ID: <1287076965.2362.520.camel@doink> (raw)
In-Reply-To: <1286187236-16682-2-git-send-email-david@fromorbit.com>

On Mon, 2010-10-04 at 21:13 +1100, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> Currently the size of the speculative preallocation during delayed
> allocation is fixed by either the allocsize mount option of a
> default size. We are seeing a lot of cases where we need to
> recommend using the allocsize mount option to prevent fragmentation
> when buffered writes land in the same AG.
> 
> Rather than using a fixed preallocation size by default (up to 64k),
> make it dynamic by exponentially increasing it on each subsequent
> preallocation. This will result in the preallocation size increasing
> as the file increases, so for streaming writes we are much more
> likely to get large preallocations exactly when we need it to reduce
> fragementation. It should also prevent the need for using the
> allocsize mount option for most workloads involving concurrent
> streaming writes.

I have some comments, below.

> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---

. . .

> diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
> index 2057614..b2e4782 100644
> --- a/fs/xfs/xfs_iomap.c
> +++ b/fs/xfs/xfs_iomap.c

. . .

> @@ -427,11 +431,25 @@ xfs_iomap_eof_want_preallocate(
>  			if ((imap[n].br_startblock != HOLESTARTBLOCK) &&
>  			    (imap[n].br_startblock != DELAYSTARTBLOCK))
>  				return 0;
> +
>  			start_fsb += imap[n].br_blockcount;
>  			count_fsb -= imap[n].br_blockcount;
> +
> +			/* count delalloc blocks beyond EOF */
> +			if (imap[n].br_startblock == DELAYSTARTBLOCK)
> +				found_delalloc += imap[n].br_blockcount;
>  		}
>  	}
> -	*prealloc = 1;

At this point, count_fsb will be 0 (a necessary condition
for loop termination, since count_fsb is unsigned).  Since
found_delalloc is initially 0 (and is also unsigned), we
can safely assume that found_delalloc >= count_fsb.  The
only case in which found_delalloc <= count_fsb is if
found_delalloc is also 0 (a case you cover separately,
first, below).

Furthermore, *prealloc was assigned the value 0 at the top.
So I think this bottom section can be simplified to:
	if (!found_delalloc)
		*prealloc = 1;

But if that's the case, then maybe the loop can simply
return as soon as it finds a delayed allocation entry.

In other words, the net effect of this is that you
only tell the caller we want preallocation if *no*
preallocated blocks beyond EOF exist.  That may be
fine, but it doesn't seem to match your understanding
based on your code, so I just wanted to call your
attention to it.

> +	if (!found_delalloc) {
> +		/* haven't got any prealloc, so need some */
> +		*prealloc = 1;
> +	} else if (found_delalloc <= count_fsb) {
> +		/* almost run out of prealloc */
> +		*prealloc = 1;
> +	} else {
> +		/* still lots of prealloc left */
> +		*prealloc = 0;
> +	}
>  	return 0;
>  }
>  
> @@ -469,6 +487,7 @@ xfs_iomap_write_delay(
>  	extsz = xfs_get_extsz_hint(ip);
>  	offset_fsb = XFS_B_TO_FSBT(mp, offset);
>  
> +

This is hunk should be killed.  It just adds an unwanted
blank line.

>  	error = xfs_iomap_eof_want_preallocate(mp, ip, offset, count,
>  				ioflag, imap, XFS_WRITE_IMAPS, &prealloc);
>  	if (error)
> @@ -476,9 +495,25 @@ xfs_iomap_write_delay(
>  
>  retry:
>  	if (prealloc) {
> +		xfs_fileoff_t	alloc_blocks = 0;
> +		/*
> +		 * If we don't have a user specified preallocation size, dynamically
> +		 * increase the preallocation size as we do more preallocation.
> +		 * Cap the maximum size at a single extent.
> +		 */
> +		if (!(mp->m_flags & XFS_MOUNT_DFLT_IOSIZE)) {

I note that this is circumventing the special code in
xfs_set_rw_sizes() that tries to set up a different (smaller)
size in the event the "sync" (generic) mount option was supplied
(indicated by XFS_MOUNT_SYNC).  If that is a good thing, then I
suggest the code in xfs_set_rw_sizes() go away.  But it would be
good to have the case for making that change stated.

> +			alloc_blocks = XFS_FILEOFF_MIN(MAXEXTLEN,
> +						(ip->i_last_prealloc * 4));
> +		}

So this is the spot that begs the question of whether the
the default I/O size mount option is needed any more.  The
net effect of your change (assuming no "allocsize" mount
option is in effect) is:
- Initially, ip->i_last_prealloc will be 0.  Therefore the
  first time through, the preallocated blocks beyond the
  end will be based on m_writeio_blocks (either 16KB or
  64KB, dependent on whether XFS_MOUNT_WSYNC was specified).
- Thereafter, whenever more preallocated-at-EOF blocks
  are needed, the number allocated will be 4 times more
  than the last time (growing exponentially), limited by
  the maximum extent size.

I guess the reason one might want the "allocsize" mount
option now becomes the opposite of why one might have
wanted it before.  I.e., it would be used to *reduce*
the size of the preallocated range beyond EOF, which I
could envision might be reasonable in some circumstances.

> +		if (alloc_blocks == 0)
> +			alloc_blocks = mp->m_writeio_blocks;
> +		ip->i_last_prealloc = alloc_blocks;
> +
>  		aligned_offset = XFS_WRITEIO_ALIGN(mp, (offset + count - 1));
>  		ioalign = XFS_B_TO_FSBT(mp, aligned_offset);
> -		last_fsb = ioalign + mp->m_writeio_blocks;
> +		last_fsb = ioalign + alloc_blocks;
> +		printk("ino %lld, ioalign 0x%llx, alloc_blocks 0x%llx\n",
> +				ip->i_ino, ioalign, alloc_blocks);

Kill the debug printk() call.

>  	} else {
>  		last_fsb = XFS_B_TO_FSB(mp, ((xfs_ufsize_t)(offset + count)));
>  	}



_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

  reply	other threads:[~2010-10-14 17:22 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-10-04 10:13 [RFC, PATCH 0/2] xfs: dynamic speculative preallocation for delalloc Dave Chinner
2010-10-04 10:13 ` [PATCH 1/2] xfs: dynamic speculative EOF preallocation Dave Chinner
2010-10-14 17:22   ` Alex Elder [this message]
2010-10-14 21:33     ` Dave Chinner
2010-10-15  6:51       ` allocsize mount option, was: " Michael Monnerie
2010-10-15 11:59         ` Dave Chinner
2010-10-04 10:13 ` [PATCH 2/2] xfs: don't truncate prealloc from frequently accessed inodes Dave Chinner
2010-10-14 17:22   ` Alex Elder
2010-10-14 21:28     ` Dave Chinner
2010-10-14 17:22 ` [RFC, PATCH 0/2] xfs: dynamic speculative preallocation for delalloc Alex Elder
2010-10-14 21:16   ` Dave Chinner
2010-10-14 21:50     ` Ivan.Novick
2010-10-15  7:14       ` Michael Monnerie
2010-10-15 11:45         ` Dave Chinner
2010-10-17 14:31           ` Michael Monnerie
2010-10-17 23:49             ` Dave Chinner
2010-10-18  6:39               ` Michael Monnerie
  -- strict thread matches above, loose matches on Subject: below --
2010-11-29  0:43 [PATCH 0/2] xfs: dynamic speculative allocation beyond EOF V3 Dave Chinner
2010-11-29  0:43 ` [PATCH 1/2] xfs: dynamic speculative EOF preallocation Dave Chinner
2010-12-07 10:17   ` Christoph Hellwig
2010-12-07 10:49     ` Dave Chinner
2010-12-13  1:25 [PATCH 0/2] xfs: dynamic speculative allocation beyond EOF V4 Dave Chinner
2010-12-13  1:25 ` [PATCH 1/2] xfs: dynamic speculative EOF preallocation Dave Chinner
2010-12-15 18:57   ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1287076965.2362.520.camel@doink \
    --to=aelder@sgi.com \
    --cc=david@fromorbit.com \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.