public inbox for linux-xfs@vger.kernel.org
From: Dave Chinner <david@fromorbit.com>
To: jbr <jbr@squareup.com>
Cc: xfs@oss.sgi.com
Subject: Re: understanding speculative preallocation
Date: Fri, 26 Jul 2013 21:50:21 +1000
Message-ID: <20130726115021.GO13468@dastard>
In-Reply-To: <1374823420041-35002.post@n7.nabble.com>

On Fri, Jul 26, 2013 at 12:23:40AM -0700, jbr wrote:
> Hello,
> 
> I'm looking for general documentation/help with the speculative
> preallocation feature in xfs.  So far, I haven't really been able to find
> any definitive, up to date documentation on it.

Read the code - it's documented in the comments. ;)

Or ask questions here, because the code changes and the only up to
date reference is the code and/or the developers that work on it...

> I'm wondering how I can find out definitively which version of xfs I am
> using, and what the preallocation scheme in use is.

Look at the kernel version, then look at the corresponding source
code.
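As a minimal sketch of the first step, the running kernel release can be
read from a script. The helper name kernel_release is made up for this
illustration; platform.release() is the standard Python call and reports
the same string as "uname -r":

```python
import platform


def kernel_release():
    """Return the running kernel's release string (same as `uname -r`)."""
    return platform.release()


if __name__ == "__main__":
    # Use this string to pick the matching kernel source tree to read.
    print(kernel_release())
```

The release string only identifies which source tree to read; it says
nothing by itself about which preallocation behaviour that tree has.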

> We are running apache kafka on our servers, and kafka uses sequential io to
> write data log files.  Kafka uses, by default, a maximum log file size of
> 1Gb.  However, most of the log files end up being 2Gb, and thus the disk
> fills up twice as fast as it should.
> 
> We are using xfs on CentOS 2.6.32-358.  Is there a way I can know which
> version of xfs is built into this version of the kernel?

The XFS code is part of the kernel, so look at the kernel source
code that CentOS ships.

> We are using xfs (mounted with no allocsize specified).  I've seen varying
> info suggesting this means it either defaults to an allocsize of 64K (which
> doesn't seem to match my observations), or that it will use dynamic
> preallocation.
> 
> I've also seen hints (but no actual canonical documentation) suggesting that
> the dynamic preallocation works by progressively doubling the current file
> size (which does match my observations).

Well, it started off that way, but it has been refined since to
handle many different cases where this behaviour is sub-optimal.

> What I'm not clear on, is the scheduling for the preallocation. At what
> point does it decide to preallocate the next doubling of space.

Depends on the type of IO being done.

> Is it when
> the current preallocated space is used up,

Usually.

> or does it happen when the
> current space is used up within some threshold.

No.

> What I'd like to do, is
> keep the doubling behavior in tact, but have it capped so it never increases
> the file beyond 1Gb.  Is there a way to do that?

No.

> Can I trick the
> preallocation to not do a final doubling, if I cap my kafka log files at
> say, 900Mb (or some percentage under 1Gb)?

No.

> There are numerous references to an allocation schedule like this:
> 
> freespace       max prealloc size
>   >5%             full extent (8GB)
>   4-5%             2GB (8GB >> 2)
>   3-4%             1GB (8GB >> 3)
>   2-3%           512MB (8GB >> 4)
>   1-2%           256MB (8GB >> 5)
>   <1%            128MB (8GB >> 6)
> 
> I'm just not sure I understand what this is telling me.  It seems to tell me
> what the max prealloc size is, with being reduced if the disk is nearly
> full.

Yes, that's correct. Mainline also does this for quota exhaustion.
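To make the quoted table concrete, here is a rough sketch of it as a
lookup. It only encodes the table as quoted above and is not the kernel
code; the function name, the 8GB default and the handling of values that
fall exactly on a boundary (5%, 4%, ...) are assumptions made for the
illustration:

```python
GIB = 1 << 30

# (free-space threshold in percent, right shift applied to the maximum),
# copied from the quoted table. Boundary handling is an assumption.
SCHEDULE = [(5, 0), (4, 2), (3, 3), (2, 4), (1, 5)]
LOWEST_SHIFT = 6  # below the last threshold: 8GB >> 6 = 128MB


def max_prealloc_bytes(free_pct, max_extent=8 * GIB):
    """Return the preallocation cap for a given free-space percentage."""
    for threshold, shift in SCHEDULE:
        if free_pct > threshold:
            return max_extent >> shift
    return max_extent >> LOWEST_SHIFT


if __name__ == "__main__":
    for pct in (50, 4.5, 3.5, 2.5, 1.5, 0.5):
        print(f"{pct:>5}% free -> cap {max_prealloc_bytes(pct) >> 20} MiB")
```

Note this is only the cap. It does not model how the preallocation size
grows towards that cap.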

> But it doesn't tell me about the progressive doubling in
> preallocation (I assume up to a max of 8Gb).  Is any of this configurable? 

No.

> Can we specify a max prealloc size somewhere?

Use the allocsize mount option. It turns off dynamic behaviour and
fixes the pre-allocation size.

> The other issue seems to be that after the files are closed (from within the
> java jvm), they still don't seem to have their pre-allocated space
> reclaimed.  Are there known issues with closing the files in java not
> properly causing a flush of the preallocated space?

Possibly. There's a heuristic that turns off truncation at close - if
your application keeps doing "open-write-close" it will not truncate
preallocation. Log files typically see this IO pattern from
applications, and hence triggering that "no truncate" heuristic is
exactly what you want to have happen to avoid severe fragmentation
of the log files.
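One way to check whether a closed file is still holding space beyond its
end of file is to compare the allocated size with the apparent size. A
minimal sketch, assuming Linux, where st_blocks is counted in 512-byte
units; the helper name is made up for this illustration:

```python
import os
import sys


def allocated_vs_apparent(path):
    """Return (allocated_bytes, apparent_bytes) for the file at path."""
    st = os.stat(path)
    allocated = st.st_blocks * 512  # Linux reports st_blocks in 512-byte units
    return allocated, st.st_size


if __name__ == "__main__":
    for name in sys.argv[1:]:
        allocated, apparent = allocated_vs_apparent(name)
        print(f"{name}: allocated={allocated} apparent={apparent}")
```

An allocated size well above the apparent size on a file that is no
longer being written suggests the preallocation has not been removed.
Sparse files show the opposite, so read the numbers with that in mind.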

> Any help pointing me to any documentation/user guides which accurately
> describes this would be appreciated!

The mechanism is not documented outside the code as it changes from
kernel release to kernel release and is supposed to be transparent to
userspace. It's being refined and optimised as issues are reported.
Indeed, I suspect that all your problems would disappear on mainline
due to the background removal of preallocation that is no longer
needed, and CentOS doesn't have that...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
