From: Hans van Kranenburg <hans.van.kranenburg@mendix.com>
To: "Ellis H. Wilson III" <ellisw@panasas.com>,
Qu Wenruo <quwenruo.btrfs@gmx.com>,
Nikolay Borisov <nborisov@suse.com>,
linux-btrfs@vger.kernel.org
Subject: Re: Status of FST and mount times
Date: Fri, 16 Feb 2018 15:20:02 +0100
Message-ID: <82cda32d-8069-4a27-7f78-cf3242eeeb36@mendix.com>
In-Reply-To: <db1e0ee1-c1ae-cc5c-842c-1caa714ef62b@panasas.com>
On 02/16/2018 03:12 PM, Ellis H. Wilson III wrote:
> On 02/15/2018 08:55 PM, Qu Wenruo wrote:
>> On 2018年02月16日 00:30, Ellis H. Wilson III wrote:
>>> Very helpful information. Thank you Qu and Hans!
>>>
>>> I have about 1.7TB of newly rsync'd homedir data on a single
>>> enterprise 7200rpm HDD and the following output for btrfs-debug:
>>>
>>> extent tree key (EXTENT_TREE ROOT_ITEM 0) 543384862720 level 2
>>> total bytes 6001175126016
>>> bytes used 1832557875200
>>>
>>> Hans' (very cool) tool reports:
>>> ROOT_TREE         624.00KiB 0(    38) 1(    1)
>>> EXTENT_TREE       327.31MiB 0( 20881) 1(   66) 2(    1)
>>
>> Extent tree is not so large, a little unexpected to see such slow mount.
>>
>> BTW, how many chunks do you have?
>>
>> It could be checked by:
>>
>> # btrfs-debug-tree -t chunk <device> | grep CHUNK_ITEM | wc -l
>
> Since yesterday I've doubled the size by copying the homedir dataset in
> again. Here are new stats:
>
> extent tree key (EXTENT_TREE ROOT_ITEM 0) 385990656 level 2
> total bytes 6001175126016
> bytes used 3663525969920
>
> $ sudo btrfs-debug-tree -t chunk /dev/sdb | grep CHUNK_ITEM | wc -l
> 3454
>
> $ sudo ./show_metadata_tree_sizes.py /mnt/btrfs/
> ROOT_TREE           1.14MiB 0(    72) 1(    1)
> EXTENT_TREE       644.27MiB 0( 41101) 1(  131) 2(    1)
> CHUNK_TREE        384.00KiB 0(    23) 1(    1)
> DEV_TREE          272.00KiB 0(    16) 1(    1)
> FS_TREE            11.55GiB 0(754442) 1( 2179) 2(    5) 3(    2)
> CSUM_TREE           3.50GiB 0(228593) 1(  791) 2(    2) 3(    1)
> QUOTA_TREE            0.00B
> UUID_TREE          16.00KiB 0(     1)
> FREE_SPACE_TREE       0.00B
> DATA_RELOC_TREE    16.00KiB 0(     1)
>
> The old mean mount time was 4.319s. It now takes 11.537s for the
> doubled dataset. Again please realize this is on an old version of
> BTRFS (4.5.5), so perhaps newer ones will perform better, but I'd still
> like to understand this delay more. Should I expect this to scale in
> this way all the way up to my proposed 60-80TB filesystem so long as the
> file size distribution stays roughly similar? That would definitely be
> in terms of multiple minutes at that point.
Well, imagine you have a big tree (an actual real life tree outside) and
you need to pick things (e.g. apples) which are hanging everywhere.

So, what you need to do is climb the tree, climb on a branch all the way
to the end where the first apple is... climb back, climb up a bit, go
onto the next branch to the end for the next apple... etc etc....

The bigger the tree is, the longer it keeps you busy, because the apples
will be semi-evenly distributed around the full tree, and they're always
hanging at the end of the branch. The speed with which you can climb
around (random read disk access IO speed for btrfs, because your disk
cache is empty when first mounting) determines how quickly you're done.

So, yes.
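
To put a rough number on "multiple minutes", the two measurements quoted
above can be extrapolated with a straight line. This is only a sketch
under loud assumptions: mount time is taken to grow linearly with
"bytes used", "TB" is read as 10**12 bytes, and two data points are far
too few to validate either. The fitted line has a negative intercept,
which already shows it is a crude model. The function names are made up
for this illustration and are not part of any btrfs tool.

```python
# Measurements quoted earlier in this message:
# (bytes used, mean mount time in seconds) on the 4.5.5 kernel.
SAMPLES = [
    (1832557875200, 4.319),
    (3663525969920, 11.537),
]


def fit_line(samples):
    """Return (slope, intercept) of the line through two (x, y) points."""
    (x0, y0), (x1, y1) = samples
    slope = (y1 - y0) / (x1 - x0)
    intercept = y0 - slope * x0
    return slope, intercept


def estimate_mount_seconds(bytes_used, samples=SAMPLES):
    """Linearly extrapolate mount time; an assumption, not a measurement."""
    slope, intercept = fit_line(samples)
    return slope * bytes_used + intercept


if __name__ == "__main__":
    # Assumed reading of the proposed 60-80TB filesystem: decimal terabytes.
    for tb in (60, 80):
        seconds = estimate_mount_seconds(tb * 10**12)
        print(f"{tb} TB used: roughly {seconds / 60:.1f} minutes")
```

Under these assumptions the estimate lands in the range of a few
minutes, in line with the guess in the quoted question. Real behaviour
depends on how the metadata is laid out on disk and on the random read
speed of the device.
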
>>> Taking 100 snapshots (no changes between snapshots however) of the above
>>> subvolume doesn't appear to impact mount/umount time.
>>
>> 100 unmodified snapshots won't affect mount time.
>>
>> It needs new extents, which can be created by overwriting extents in
>> snapshots.
>> So it won't really cause much difference if all these snapshots are all
>> unmodified.
>
> Good to know, thanks!
>
>>> Snapshot creation
>>> and deletion both operate at between 0.25s to 0.5s.
>>
>> IIRC snapshot deletion is delayed, so the real work doesn't happen when
>> "btrfs sub del" returns.
>
> I was using btrfs sub del -C for the deletions, so I believe (if that
> command truly waits for the subvolume to be utterly gone) it captures
> the entirety of the snapshot.
>
> Best,
>
> ellis
--
Hans van Kranenburg