public inbox for linux-xfs@vger.kernel.org
From: Jan Kara <jack@suse.cz>
To: xfs@oss.sgi.com
Subject: Pathological allocation pattern with direct IO
Date: Wed, 6 Mar 2013 21:22:10 +0100
Message-ID: <20130306202210.GA1318@quack.suse.cz>

  Hello,

  one of our customers has an application that writes large (tens of GB)
files using direct IO in 16 MB chunks. They keep the fs around 80% full,
deleting the oldest files when they need to store new ones. Usually a file
can be stored in under 10 extents, but from time to time a pathological case
is triggered and the file ends up with a few thousand extents (which
naturally hurts performance). The customer actually uses a 2.6.32-based
kernel, but I reproduced the issue with a 3.8.2 kernel as well.

I analyzed why this happens. The filefrag output for the file looks like:
Filesystem type is: 58465342
File size of /raw_data/ex.20130302T121135/ov.s1a1.wb is 186294206464
(45481984 blocks, blocksize 4096)
 ext  logical  physical  expected   length flags
   0        0        13             4550656
   1  4550656 188136807   4550668 12562432
   2 17113088 200699240 200699238   622592
   3 17735680 182046055 201321831     4096
   4 17739776 182041959 182050150     4096
   5 17743872 182037863 182046054     4096
   6 17747968 182033767 182041958     4096
   7 17752064 182029671 182037862     4096
...
6757 45400064 154381644 154389835     4096
6758 45404160 154377548 154385739     4096
6759 45408256 252951571 154381643    73728 eof
/raw_data/ex.20130302T121135/ov.s1a1.wb: 6760 extents found
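
As a quick sanity check of the listing above, the following sketch takes
extents 3 to 7 from the filefrag output and computes how far each extent's
physical start moves relative to the previous one. The helper name
physical_steps is made up for this illustration, and the block size is the
4096 bytes reported by filefrag.

```python
# Extents 3..7 copied from the filefrag output above.
# Each tuple is (logical, physical, length), all in 4096-byte blocks.
BLOCK_SIZE = 4096

extents = [
    (17735680, 182046055, 4096),
    (17739776, 182041959, 4096),
    (17743872, 182037863, 4096),
    (17747968, 182033767, 4096),
    (17752064, 182029671, 4096),
]


def physical_steps(rows):
    """Difference in physical start between each extent and its predecessor."""
    return [cur[1] - prev[1] for prev, cur in zip(rows, rows[1:])]


steps = physical_steps(extents)
chunk_bytes = extents[0][2] * BLOCK_SIZE

print(steps)                         # [-4096, -4096, -4096, -4096]
print(chunk_bytes // (1024 * 1024))  # 16
```

Each extent is 4096 blocks of 4096 bytes, i.e. one 16 MB direct IO chunk,
and each one is placed exactly one chunk below the previous one on disk
while the logical offset moves forward.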

So we see that at some point the allocator starts handing out 16 MB chunks
in descending physical order. This seems to be caused by
XFS_ALLOCTYPE_NEAR_BNO allocation. For two cases I was able to track down
the logic:

1) We start allocating blocks for the file. We want to allocate in the same
AG as the inode. First we try an exact allocation, which fails, so we try an
XFS_ALLOCTYPE_NEAR_BNO allocation, which finds a large enough free extent
before the inode. So we start allocating 16 MB chunks from the end of that
free extent. From this moment on we are basically bound to continue
allocating backwards using XFS_ALLOCTYPE_NEAR_BNO allocation until we
exhaust the whole free extent.

2) A similar situation happens when we cannot grow the current extent any
further but there is a large free space somewhere before this extent in the
AG.
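
The mechanism in case 1 can be illustrated with a toy model. This is only a
sketch under strong assumptions, not the XFS allocator: it assumes a single
free extent lying entirely before the target block, and a "near" policy that
simply picks the fitting range closest to the target. The names alloc_near
and simulate and all block numbers are hypothetical.

```python
def alloc_near(free_start, free_len, target, want):
    """Toy 'near' allocation: place `want` blocks inside the free extent
    [free_start, free_start + free_len) as close to `target` as possible.
    Returns the start block, or None if the free extent is too small."""
    if want > free_len:
        return None
    free_end = free_start + free_len
    # Clamp the start so the whole range stays inside the free extent.
    return min(max(target, free_start), free_end - want)


def simulate(free_start, free_len, target, want, count):
    """Issue up to `count` chunk allocations. Assumes the free extent lies
    entirely before `target`, so every allocation comes from its tail."""
    starts = []
    for _ in range(count):
        start = alloc_near(free_start, free_len, target, want)
        if start is None:
            break  # free extent exhausted
        starts.append(start)
        # The tail was consumed, so the free extent now ends at `start`.
        free_len = start - free_start
        # The next write wants the block right after the chunk just
        # written, which is already in use, so "exact" cannot succeed.
        target = start + want
    return starts


print(simulate(free_start=1000, free_len=20000, target=50000,
               want=4096, count=4))
# [16904, 12808, 8712, 4616]
```

In this model every allocation lands one chunk below the previous one, and
the pattern only stops once the free extent can no longer hold a full chunk,
which matches the descending physical offsets seen in the filefrag output.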

So I was wondering: is this known? Is XFS_ALLOCTYPE_NEAR_BNO so beneficial
that it outweighs pathological cases like the above? Or should it perhaps be
disabled for larger files or for direct IO?

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

Thread overview: 9+ messages
2013-03-06 20:22 Jan Kara [this message]
2013-03-06 22:01 ` Pathological allocation pattern with direct IO Ben Myers
2013-03-06 22:31 ` Peter Grandi
2013-03-07  5:03 ` Dave Chinner
2013-03-07 10:24   ` Jan Kara
2013-03-07 13:58     ` Mark Tinguely
2013-03-08  1:35     ` Dave Chinner
2013-03-12 11:01       ` Jan Kara
2013-03-14 23:36         ` Dave Chinner
