From: Chandan Babu R <chandanrlinux@gmail.com>
To: linux-xfs@vger.kernel.org
Cc: david@fromorbit.com, darrick.wong@oracle.com
Subject: Maximum height of rmapbt when reflink feature is enabled
Date: Mon, 30 Nov 2020 14:35:21 +0530
Message-ID: <3275346.ciGmp8L3Sz@garuda>

The comment in xfs_rmapbt_compute_maxlevels() mentions that with
reflink enabled, XFS will run out of AG blocks before reaching the
maximum height of XFS_BTREE_MAXLEVELS (i.e. 9). This is easy to prove
for the 4k block size case:

Considering theoretical limits, the maximum height of the rmapbt can be,
  max btree height = Log_(min_recs)(total recs)
  max_rmapbt_height = Log_45(2^64) = 12.

Detailed calculation:
nr-levels = 1; nr-leaf-blks = 2^64 / 84 = 2e17;
nr-levels = 2; nr-blks = 2e17 / 45 = 5e15;
nr-levels = 3; nr-blks = 5e15 / 45 = 1e14;
nr-levels = 4; nr-blks = 1e14 / 45 = 2e12;
nr-levels = 5; nr-blks = 2e12 / 45 = 5e10;
nr-levels = 6; nr-blks = 5e10 / 45 = 1e9;
nr-levels = 7; nr-blks = 1e9 / 45 = 3e7;
nr-levels = 8; nr-blks = 3e7 / 45 = 6e5;
nr-levels = 9; nr-blks = 6e5 / 45 = 1e4;
nr-levels = 10; nr-blks = 1e4 / 45 = 3e2;
nr-levels = 11; nr-blks = 3e2 / 45 = 6;
nr-levels = 12; nr-blks = 1;
Total number of blocks = 2e17

Here, 84 is the minimum number of leaf records and 45 is the minimum
number of node records in the rmapbt when using 4k block size. 2^64 is
the maximum possible number of rmapbt records
(i.e. max_rmap_entries_per_disk_block (2^32) * max_nr_agblocks
(2^32)).

i.e. theoretically, the rmapbt height can go up to 12.
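
The level-by-level arithmetic above can be cross-checked with a short script. This is only a sketch: btree_level_blocks() is a hypothetical helper written for this mail, not kernel code, and it assumes every block holds exactly the minimum number of records (84 per leaf, 45 per node for 4k blocks).

```python
def btree_level_blocks(total_recs, leaf_min_recs, node_min_recs):
    """Return the block count at each level, leaves first.

    Assumes the worst case where every block holds only the minimum
    number of records, so the tree is as tall as it can get.
    """
    def ceil_div(a, b):
        return -(-a // b)

    blocks = ceil_div(total_recs, leaf_min_recs)
    levels = [blocks]
    # Keep adding parent levels until a single root block remains.
    while blocks > 1:
        blocks = ceil_div(blocks, node_min_recs)
        levels.append(blocks)
    return levels


# 4k block size rmapbt: 2^64 records, 84 recs per leaf, 45 per node.
levels_4k = btree_level_blocks(2**64, leaf_min_recs=84, node_min_recs=45)
for height, nr_blks in enumerate(levels_4k, start=1):
    print(f"nr-levels = {height}; nr-blks = {nr_blks:.0e}")
print("height:", len(levels_4k))
```

The same helper can be reused for any other btree by passing that btree's minimum leaf and node record counts.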
But as the comment in xfs_rmapbt_compute_maxlevels() suggests, we will
run out of per-ag blocks trying to build an rmapbt of height
XFS_BTREE_MAXLEVELS (i.e. 9).
Since the number of nodes grows as a geometric series,
nr_nodes (roughly) = (45^9 - 1) / (45 - 1) = 2e13
i.e. 2e13 blocks > max ag blocks (2^32 == 4e9)
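
The geometric series can also be evaluated exactly rather than estimated. A minimal sketch, assuming a 9-level tree in which every node has exactly 45 children (the constant names are made up for illustration and are not kernel symbols):

```python
NODE_MIN_RECS_4K = 45   # minimum records per rmapbt node, 4k blocks
MAX_LEVELS = 9          # XFS_BTREE_MAXLEVELS as quoted in this mail
MAX_AG_BLOCKS = 2**32   # upper bound on blocks in one AG

# 1 root + 45 children + 45^2 grandchildren + ... for 9 levels.
nr_nodes = sum(NODE_MIN_RECS_4K**level for level in range(MAX_LEVELS))

# Closed form of the same geometric series.
closed_form = (NODE_MIN_RECS_4K**MAX_LEVELS - 1) // (NODE_MIN_RECS_4K - 1)
assert nr_nodes == closed_form

print(f"nr_nodes = {nr_nodes:.1e}; max AG blocks = {MAX_AG_BLOCKS:.1e}")
print("exceeds AG size:", nr_nodes > MAX_AG_BLOCKS)
```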

However, with 1k block size we are not close to consuming all of the
2^32 AG blocks, as shown by the calculations below:
- rmapbt with maximum of 9 levels will have roughly
  (11^9 - 1) / (11 - 1) = 2e8 blocks.
  - 11 is the minimum number of recs in a non-leaf node with 1k block size.
  - Also, Total number of records (roughly) = (nr_leaves * 11) = 11^8 * 11
    = 2e9 (this value will be used later).

- refcountbt
  - Maximum number of records theoretically = maximum number of blocks
    in an AG = 2^32
  - Total (leaves and non-leaf nodes) blocks required to hold 2^32 records
    Leaf min recs = 20; Node min recs = 60 (with 1k as the block size).
    - Detailed calculation:
      nr-levels = 1; nr-leaf-blks = 2^32 / 20 = 2e8;
      nr-levels = 2; nr-blks = 2e8 / 60 = 4e6
      nr-levels = 3; nr-blks = 4e6 / 60 = 6e4
      nr-levels = 4; nr-blks = 6e4 / 60 = 1.0e3
      nr-levels = 5; nr-blks = 1.0e3 / 60 = 2e1
      nr-levels = 6; nr-blks = 1
    - Total block count = 2e8

- Bmbt (assuming all the rmapbt records have the same inode as owner)
  - Total (leaves and non-leaf nodes) blocks required to hold 2e9 records
    Leaf min recs = 29; Node min recs = 29 (with 1k as the block size).
    (2e9 is the maximum rmapbt records with rmapbt height 9 and 1k block size).
      nr-levels = 1; nr-leaf-blks = 2e9 / 29 = 7e7
      nr-levels = 2; nr-blks = 7e7 / 29 = 2e6
      nr-levels = 3; nr-blks = 2e6 / 29 = 8e4
      nr-levels = 4; nr-blks = 8e4 / 29 = 3e3
      nr-levels = 5; nr-blks = 3e3 / 29 = 1e2
      nr-levels = 6; nr-blks = 1e2 / 29 = 3
      nr-levels = 7; nr-blks = 1
    - Total block count = 7e7

Total blocks used across rmapbt, refcountbt and bmbt = 2e8 + 2e8 + 7e7 = 5e8.
Since 5e8 < 4e9 (i.e. 2^32), we have not run out of blocks trying to
build an rmapbt with XFS_BTREE_MAXLEVELS (i.e. 9) levels.
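
For anyone who wants to re-run the 1k block size totals, here is a rough sketch of the same estimate. The helper and variable names are hypothetical, the minimum record counts are the ones quoted in this mail, and the script uses the exact value 11^9 for the rmapbt record count instead of the rounded 2e9, so the per-btree figures differ slightly from the rounded ones above.

```python
def ceil_div(a, b):
    return -(-a // b)


def total_blocks(total_recs, leaf_min_recs, node_min_recs):
    """Blocks (leaves plus nodes) needed to index total_recs records,
    assuming every block holds only its minimum record count."""
    blocks = ceil_div(total_recs, leaf_min_recs)
    total = blocks
    while blocks > 1:
        blocks = ceil_div(blocks, node_min_recs)
        total += blocks
    return total


MAX_AG_BLOCKS = 2**32
RMAPBT_NODE_MIN_RECS_1K = 11
MAX_LEVELS = 9

# rmapbt: geometric series for a 9-level tree with fan-out 11.
rmapbt_blocks = (RMAPBT_NODE_MIN_RECS_1K**MAX_LEVELS - 1) // (
    RMAPBT_NODE_MIN_RECS_1K - 1)
# Same simplification as in the mail: nr_leaves * 11 = 11^8 * 11 records.
rmapbt_recs = RMAPBT_NODE_MIN_RECS_1K**(MAX_LEVELS - 1) * RMAPBT_NODE_MIN_RECS_1K

# refcountbt: at most one record per AG block.
refcountbt_blocks = total_blocks(MAX_AG_BLOCKS, leaf_min_recs=20,
                                 node_min_recs=60)
# bmbt: one mapping per rmapbt record, all owned by a single inode.
bmbt_blocks = total_blocks(rmapbt_recs, leaf_min_recs=29, node_min_recs=29)

total = rmapbt_blocks + refcountbt_blocks + bmbt_blocks
print(f"rmapbt = {rmapbt_blocks:.1e}, refcountbt = {refcountbt_blocks:.1e}, "
      f"bmbt = {bmbt_blocks:.1e}")
print(f"total = {total:.1e}; fits in AG: {total < MAX_AG_BLOCKS}")
```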
Please let me know if my understanding is incorrect.

I have come across a "log reservation" calculation issue when
increasing XFS_BTREE_MAXLEVELS to 10, which is in turn required for
extending the data fork extent count to 48 bits. To proceed further, I
need to have a correct understanding of the problem I have described
w.r.t. the 1k filesystem block size.

--
chandan