From: Brian Foster <bfoster@redhat.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@infradead.org>, xfs@oss.sgi.com
Subject: Re: [PATCH v2 00/11] xfs: introduce the free inode btree
Date: Tue, 19 Nov 2013 16:29:55 -0500
Message-ID: <528BD853.8090900@redhat.com>
In-Reply-To: <20131113211017.GI6188@dastard>
On 11/13/2013 04:10 PM, Dave Chinner wrote:
...
>
> The problem can be demonstrated with a single CPU and a single
> spindle. Create a single AG filesystem of 100GB, and populate it
> with 10 million inodes.
>
> Time how long it takes to create another 10000 inodes in a new
> directory. Measure CPU usage.
>
> Randomly delete 10,000 inodes from the original population to
> sparsely populate the inobt with 10000 free inodes.
>
> Time how long it takes to create another 10000 inodes in a new
> directory. Measure CPU usage.
>
> The difference in time and CPU will be directly related to the
> additional time spent searching the inobt for free inodes...
>
Thanks for the suggestion, Dave. I've run some fs_mark tests along the
lines of what is described here. I create 10m files, randomly remove
~10k from that dataset and measure the process of allocating 10k new
inodes in both finobt and non-finobt scenarios (after a clean remount).
The tests run from a 4-CPU VM with 4GB RAM against an isolated SATA
drive I had lying around (mapped directly via virtio). The drive holds
a single VG/LV, formatted with xfs as follows:
meta-data=/dev/mapper/testvg-testlv isize=512    agcount=1, agsize=26214400 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=0
data     =                       bsize=4096   blocks=26214400, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal               bsize=4096   blocks=12800, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
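For anyone wanting to reproduce the preparation step (randomly removing ~10k files from the 10m-file dataset), here is a minimal sketch in Python. It is not the script used for the numbers in this mail; the function name `remove_random_files` and its parameters are hypothetical. It uses reservoir sampling so that it does not have to hold all 10m paths in memory at once.

```python
import os
import random


def remove_random_files(root, count, seed=None):
    """Unlink `count` uniformly chosen regular files found under `root`.

    Hypothetical helper: picks the victims with reservoir sampling
    (Algorithm R), so only `count` paths are kept in memory.
    Returns the list of removed paths.
    """
    rng = random.Random(seed)
    reservoir = []
    seen = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            seen += 1
            path = os.path.join(dirpath, name)
            if len(reservoir) < count:
                # Fill the reservoir with the first `count` paths.
                reservoir.append(path)
            else:
                # Replace an existing entry with probability count/seen.
                slot = rng.randrange(seen)
                if slot < count:
                    reservoir[slot] = path
    if len(reservoir) < count:
        raise ValueError("fewer files under root than requested")
    # Nothing is deleted until the whole tree has been walked.
    for path in reservoir:
        os.unlink(path)
    return reservoir


# Example (assumed mount point, adjust to the local setup):
# remove_random_files("/mnt/populated", 10000)
```

The resulting free inode count depends on what else lives in the filesystem, so it is worth checking agi freecount afterwards rather than assuming it matches the number of removed files exactly.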
Once the fs has been prepared with a random set of free inodes, the
following command is used to measure performance:
fs_mark -k -S 0 -D 4 -L 10 -n 1000 -s 0 -d /mnt/testdir
I've also collected some perf record data of these commands to compare
CPU usage. I can make the full/raw data available if desirable. Snippets
of the results are included below.
--- non-finobt, agi freecount = 9961 after random removal
- fs_mark
FSUse%        Count         Size    Files/sec     App Overhead
     5         1000            0       1020.1            10811
     5         2000            0        361.4            19498
     5         3000            0        230.1            12154
     5         4000            0        166.7            12816
     5         5000            0        129.7            27409
     5         6000            0        105.7            13946
     5         7000            0         87.6            31792
     5         8000            0         77.8            14921
     5         9000            0         67.3            15597
     5        10000            0         62.4            15835
- time
real 1m26.579s
user 0m0.120s
sys 1m26.113s
- perf report
6.21% :1994 [kernel.kallsyms] [k] memcmp
5.66% :1993 [kernel.kallsyms] [k] memcmp
4.84% :1992 [kernel.kallsyms] [k] memcmp
4.76% :1994 [xfs] [k] xfs_btree_check_sblock
4.46% :1993 [xfs] [k] xfs_btree_check_sblock
4.39% :1991 [kernel.kallsyms] [k] memcmp
3.88% :1992 [xfs] [k] xfs_btree_check_sblock
3.54% :1990 [kernel.kallsyms] [k] memcmp
3.38% :1991 [xfs] [k] xfs_btree_check_sblock
2.91% :1989 [kernel.kallsyms] [k] memcmp
2.89% :1990 [xfs] [k] xfs_btree_check_sblock
2.44% :1988 [kernel.kallsyms] [k] memcmp
2.31% :1989 [xfs] [k] xfs_btree_check_sblock
1.84% :1988 [xfs] [k] xfs_btree_check_sblock
1.65% :1987 [kernel.kallsyms] [k] memcmp
1.28% :1987 [xfs] [k] xfs_btree_check_sblock
1.12% :1994 [xfs] [k] xfs_btree_increment
1.08% :1994 [xfs] [k] xfs_btree_get_rec
1.04% :1993 [xfs] [k] xfs_btree_increment
1.00% :1993 [xfs] [k] xfs_btree_get_rec
0.99% :1986 [kernel.kallsyms] [k] memcmp
0.89% :1992 [xfs] [k] xfs_btree_increment
0.85% :1994 [xfs] [k] xfs_inobt_get_rec
0.84% :1992 [xfs] [k] xfs_btree_get_rec
0.77% :1991 [xfs] [k] xfs_btree_increment
0.77% :1986 [xfs] [k] xfs_btree_check_sblock
0.77% :1993 [xfs] [k] xfs_inobt_get_rec
0.75% :1991 [xfs] [k] xfs_btree_get_rec
0.69% :1992 [xfs] [k] xfs_inobt_get_rec
0.64% :1990 [xfs] [k] xfs_btree_increment
0.62% :1994 [xfs] [k] xfs_inobt_get_maxrecs
0.61% :1990 [xfs] [k] xfs_btree_get_rec
0.58% :1991 [xfs] [k] xfs_inobt_get_rec
...
--- finobt, agi freecount = 10137 after random removal
- fs_mark
FSUse%        Count         Size    Files/sec     App Overhead
     5         1000            0       9210.0             8587
     5         2000            0       5592.1            14933
     5         3000            0       7095.4            11355
     5         4000            0       5371.1            13613
     5         5000            0       4919.3            14534
     5         6000            0       4375.7            15813
     5         7000            0       5011.3            15095
     5         8000            0       4629.8            17902
     5         9000            0       5622.9            12975
     5        10000            0       5761.4            12203
- time
real 0m1.831s
user 0m0.104s
sys 0m1.384s
- perf report
1.82% :2520 [kernel.kallsyms] [k] lock_acquire
1.65% :2519 [kernel.kallsyms] [k] lock_acquire
1.65% :2525 [kernel.kallsyms] [k] lock_acquire
1.45% :2523 [kernel.kallsyms] [k] lock_acquire
1.44% :2524 [kernel.kallsyms] [k] lock_acquire
1.34% :2521 [kernel.kallsyms] [k] lock_acquire
1.27% :2522 [kernel.kallsyms] [k] lock_acquire
1.18% :2526 [kernel.kallsyms] [k] lock_acquire
1.15% :2527 [kernel.kallsyms] [k] lock_acquire
1.09% :2525 [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore
1.03% :2524 [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore
0.88% :2520 [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore
0.83% :2523 [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore
0.81% :2521 [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore
0.79% :2519 [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore
0.79% :2522 [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore
0.76% :2519 [kernel.kallsyms] [k] kmem_cache_free
0.76% :2520 [kernel.kallsyms] [k] kmem_cache_free
0.73% :2526 [kernel.kallsyms] [k] kmem_cache_free
...
0.30% :2525 [xfs] [k] xfs_dir3_leaf_check_int
0.28% :2525 [kernel.kallsyms] [k] memcpy
0.27% :2527 [kernel.kallsyms] [k] security_compute_sid.part.14
0.26% :2520 [kernel.kallsyms] [k] memcpy
0.26% :2523 [xfs] [k] _xfs_buf_find
0.26% :2526 [xfs] [k] _xfs_buf_find
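In summary, the results show a substantial improvement for inode
allocation into a set of inode chunks with randomly distributed free
inodes. The 10k inode allocation drops from ~87s to ~2s of wall time,
and the btree search functions (xfs_btree_check_sblock,
xfs_btree_increment, xfs_btree_get_rec) that dominate the non-finobt
perf profile no longer show up near the top with finobt enabled.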
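To put a number on the difference, the two `time` results can be converted to seconds and compared. The snippet below is only a sketch of that arithmetic using the "real" values reported above; `parse_time` is a hypothetical helper, and the file count assumes the fs_mark invocation shown earlier (-L 10 iterations of -n 1000 files).

```python
import re


def parse_time(value):
    """Convert a shell `time` value such as '1m26.579s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", value.strip())
    if match is None:
        raise ValueError(f"unrecognized time value: {value!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


FILES_CREATED = 10 * 1000  # -L 10 iterations of -n 1000 files each

# "real" values copied from the two runs above.
non_finobt_real = parse_time("1m26.579s")
finobt_real = parse_time("0m1.831s")

speedup = non_finobt_real / finobt_real

print(f"non-finobt: {non_finobt_real:.3f}s "
      f"({FILES_CREATED / non_finobt_real:.0f} files/s overall)")
print(f"finobt:     {finobt_real:.3f}s "
      f"({FILES_CREATED / finobt_real:.0f} files/s overall)")
print(f"speedup:    {speedup:.1f}x")
```

That works out to a speedup of roughly 47x in wall time for this particular workload. It is a single run on one machine, so treat it as indicative rather than as a general figure.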
I haven't extensively tested the following, but a quick 1 million inode
allocation test on a fresh, single AG fs shows a slight degradation
(roughly 7% in wall time) with the finobt enabled:
fs_mark -k -S 0 -D 4 -L 10 -n 100000 -s 0 -d /mnt/bigdir
- non-finobt
real 1m35.349s
user 0m4.555s
sys 1m29.749s
- finobt
real 1m42.396s
user 0m4.326s
sys 1m37.152s
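For reference, the relative cost of the finobt in this fresh-fs case can be worked out from the timings above. This is just a sketch of the arithmetic on a single pair of runs; `relative_change` and `to_seconds` are hypothetical helpers.

```python
def to_seconds(minutes, seconds):
    """Convert a minutes/seconds pair from `time` output to seconds."""
    return minutes * 60 + seconds


def relative_change(baseline, candidate):
    """Fractional change of candidate relative to baseline."""
    return (candidate - baseline) / baseline


# (non-finobt, finobt) timings copied from the runs above.
timings = {
    "real": (to_seconds(1, 35.349), to_seconds(1, 42.396)),
    "sys": (to_seconds(1, 29.749), to_seconds(1, 37.152)),
}

for label, (non_finobt, finobt) in timings.items():
    print(f"{label}: {relative_change(non_finobt, finobt):+.1%}")
```

This comes to about 7% more wall time and about 8% more system time with finobt enabled, though with only one run per configuration the run-to-run noise is unknown.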
Brian