From: Dave Chinner <david@fromorbit.com>
To: linux-xfs@vger.kernel.org
Subject: [PATCH] xfs: speed up directory bestfree block scanning
Date: Wed, 3 Jan 2018 17:27:48 +1100
Message-ID: <20180103062748.16400-1-david@fromorbit.com>
From: Dave Chinner <dchinner@redhat.com>
When running a "create millions of inodes in a directory" test
recently, I noticed we were spending a huge amount of time
converting freespace block headers from disk format to in-memory
format:
31.47% [kernel] [k] xfs_dir2_node_addname
17.86% [kernel] [k] xfs_dir3_free_hdr_from_disk
3.55% [kernel] [k] xfs_dir3_free_bests_p
We shouldn't be hitting the best free block scanning code so hard
when doing sequential directory creates, and it turns out there's
a highly suboptimal loop searching the best free array in
the freespace block - it decodes the block header before checking
each entry inside a loop, instead of decoding the header once before
running the entry search loop.
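
To illustrate the pattern (a minimal userspace sketch, not the xfs code:
struct free_hdr, hdr_from_disk() and scan_bests() are hypothetical
stand-ins, and the bests values are treated as host-endian rather than
the on-disk big-endian format), decoding the header once and then
scanning the entries looks roughly like this:

```c
#include <stdint.h>

#define NULLDATAOFF 0xffffU	/* marker: no free space recorded for this slot */

/* Hypothetical, simplified stand-in for the decoded freespace header. */
struct free_hdr {
	uint32_t firstdb;	/* data block number of the first entry */
	uint32_t nvalid;	/* number of valid entries in bests[] */
};

/* Counts header decodes so the cost of the scan can be observed. */
static int decode_calls;

/* Stand-in for the disk-to-memory header conversion. */
static void hdr_from_disk(struct free_hdr *to, const struct free_hdr *from)
{
	decode_calls++;
	*to = *from;
}

/*
 * Return the data block number of the first entry with at least
 * `length` bytes free, or -1 if no entry in this block is big enough.
 */
static long scan_bests(const struct free_hdr *disk_hdr,
		       const uint16_t *bests, uint16_t length)
{
	struct free_hdr hdr;
	uint32_t findex;

	/* Decode once, before the loop, not once per entry checked. */
	hdr_from_disk(&hdr, disk_hdr);

	for (findex = 0; findex < hdr.nvalid; findex++) {
		if (bests[findex] != NULLDATAOFF && bests[findex] >= length)
			return (long)(hdr.firstdb + findex);
	}
	return -1;
}
```

With the decode hoisted out of the loop, a scan costs one header
conversion per freespace block instead of one per entry examined.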
This makes a massive difference to create rates. Profile now looks
like this:
13.15% [kernel] [k] xfs_dir2_node_addname
3.52% [kernel] [k] xfs_dir3_leaf_check_int
3.11% [kernel] [k] xfs_log_commit_cil
And the wall time/average file create rate differences are
just as stark:
                create time(sec) / rate (files/s)
File count         vanilla              patched
   10k            0.54 / 18.5k         0.53 / 18.9k
   20k            1.10 / 18.1k         1.05 / 19.0k
  100k            4.21 / 23.8k         3.91 / 25.6k
  200k            9.66 / 20.7k         7.37 / 27.1k
    1M           86.61 / 11.5k        48.26 / 20.7k
    2M          206.13 /  9.7k       129.71 / 15.4k
The larger the directory, the bigger the performance improvement.
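
For anyone who wants to compare the rows directly, the wall times in
the table above can be reduced to a speedup ratio and seconds saved.
This is only a small sketch over the numbers already quoted; the
struct and helper names are made up for illustration:

```c
/* One row of the create-time table: wall time in seconds. */
struct create_result {
	const char *files;
	double vanilla_sec;
	double patched_sec;
};

/* Wall times copied from the table in this message. */
static const struct create_result results[] = {
	{ "10k",    0.54,   0.53 },
	{ "20k",    1.10,   1.05 },
	{ "100k",   4.21,   3.91 },
	{ "200k",   9.66,   7.37 },
	{ "1M",    86.61,  48.26 },
	{ "2M",   206.13, 129.71 },
};

/* Ratio of vanilla to patched wall time; > 1.0 means patched is faster. */
static double speedup(const struct create_result *r)
{
	return r->vanilla_sec / r->patched_sec;
}

/* Absolute wall time saved by the patch, in seconds. */
static double seconds_saved(const struct create_result *r)
{
	return r->vanilla_sec - r->patched_sec;
}
```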
Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
fs/xfs/libxfs/xfs_dir2_node.c | 30 +++++++++++++++---------------
1 file changed, 15 insertions(+), 15 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_dir2_node.c b/fs/xfs/libxfs/xfs_dir2_node.c
index 682e2bf370c7..bcf0d43cd6a8 100644
--- a/fs/xfs/libxfs/xfs_dir2_node.c
+++ b/fs/xfs/libxfs/xfs_dir2_node.c
@@ -1829,24 +1829,24 @@ xfs_dir2_node_addname_int(
*/
bests = dp->d_ops->free_bests_p(free);
dp->d_ops->free_hdr_from_disk(&freehdr, free);
- if (be16_to_cpu(bests[findex]) != NULLDATAOFF &&
- be16_to_cpu(bests[findex]) >= length)
- dbno = freehdr.firstdb + findex;
- else {
- /*
- * Are we done with the freeblock?
- */
- if (++findex == freehdr.nvalid) {
- /*
- * Drop the block.
- */
- xfs_trans_brelse(tp, fbp);
- fbp = NULL;
- if (fblk && fblk->bp)
- fblk->bp = NULL;
+ do {
+
+ if (be16_to_cpu(bests[findex]) != NULLDATAOFF &&
+ be16_to_cpu(bests[findex]) >= length) {
+ dbno = freehdr.firstdb + findex;
+ break;
}
+ } while (++findex < freehdr.nvalid);
+
+ /* Drop the block if we are done with the freeblock */
+ if (findex == freehdr.nvalid) {
+ xfs_trans_brelse(tp, fbp);
+ fbp = NULL;
+ if (fblk)
+ fblk->bp = NULL;
}
}
+
/*
* If we don't have a data block, we need to allocate one and make
* the freespace entries refer to it.
--
2.15.0