public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Chandan Babu R <chandanrlinux@gmail.com>
To: linux-xfs@vger.kernel.org
Cc: david@fromorbit.com, darrick.wong@oracle.com
Subject: Maximum height of rmapbt when reflink feature is enabled
Date: Mon, 30 Nov 2020 14:35:21 +0530	[thread overview]
Message-ID: <3275346.ciGmp8L3Sz@garuda> (raw)

The comment in xfs_rmapbt_compute_maxlevels() mentions that with
reflink enabled, XFS will run out of AG blocks before reaching maximum
levels of XFS_BTREE_MAXLEVELS (i.e. 9).  This is easy to prove for 4k
block size case:

Considering theoretical limits, maximum height of rmapbt can be,
max btree height = Log_(min_recs)(total recs)
max_rmapbt_height = Log_45(2^64) = 12.

Detailed calculation:
nr-levels = 1; nr-leaf-blks = 2^64 / 84 = 2e17;
nr-levels = 2; nr-blks = 2e17 / 45 = 5e15;
nr-levels = 3; nr-blks = 5e15 / 45 = 1e14;
nr-levels = 4; nr-blks = 1e14 / 45 = 2e12;
nr-levels = 5; nr-blks = 2e12 / 45 = 5e10;
nr-levels = 6; nr-blks = 5e10 / 45 = 1e9;
nr-levels = 7; nr-blks = 1e9 / 45 = 3e7;
nr-levels = 8; nr-blks = 3e7 / 45 = 6e5;
nr-levels = 9; nr-blks = 6e5 / 45 = 1e4;
nr-levels = 10; nr-blks = 1e4 / 45 = 3e2;
nr-levels = 11; nr-blks = 3e2 / 45 = 6;
nr-levels = 12; nr-blks = 1;

Total number of blocks = 2e17

Here, 84 is the minimum number of leaf records and 45 is the minimum
number of node records in the rmapbt when using 4k block size. 2^64 is
the maximum possible rmapbt records
(i.e. max_rmap_entries_per_disk_block (2^32) * max_nr_agblocks
(2^32)).

i.e. theoretically rmapbt height can go upto 12.

But as the comment in xfs_rmapbt_compute_maxlevels() suggests, we will
run out of per-ag blocks trying to build an rmapbt of height
XFS_BTREE_MAXLEVELS (i.e. 9).

Since number of nodes grows as a geometric series,
nr_nodes (roughly) = (45^9 - 1) / (45 - 1) = 10e12

i.e. 10e12 blocks > max ag blocks (2^32 == 4e9)


However, with 1k block size we are not close to consuming all of 2^32
AG blocks as shown by the below calculations,

 - rmapbt with maximum of 9 levels will have roughly (11^9 - 1) / (11 -
   1) = 2e8 blocks.
   - 11 is the minimum number of recs in a non-leaf node with 1k block size.
   - Also, Total number of records (roughly) = (nr_leaves * 11) = 11^8 * 11
     = 2e9 (this value will be used later).
 
 - refcountbt
   - Maximum number of records theoretically = maximum number of blocks
     in an AG = 2^32
   - Total (leaves and non-leaf nodes) blocks required to hold 2^32 records
     Leaf min recs = 20;  Node min recs = 60 (with 1k as the block size).
     - Detailed calculation:
 	    nr-levels = 1; nr-leaf-blks = 2^32 / 20 = 2e8;
 	    nr-levels = 2; nr-blks = 2e8 / 60 = 4e6
 	    nr-levels = 3; nr-blks = 4e6 / 60 = 6e4
 	    nr-levels = 4; nr-blks = 6e4 / 60 = 1.0e3
 	    nr-levels = 5; nr-blks = 1.0e3 / 60 = 2e1
 	    nr-levels = 6; nr-blks = 1
 
     - Total block count = 2e8
 
 - Bmbt (assuming all the rmapbt records have the same inode as owner)
   - Total (leaves and non-leaf nodes) blocks required to hold 2e9 records
     Leaf min recs = 29;  Node min recs = 29 (with 1k as the block size).
     (2e9 is the maximum rmapbt records with rmapbt height 9 and 1k block size).
       nr-levels = 1; nr-leaf-blks = 2e9 / 29 = 7e7
       nr-levels = 2; nr-blks = 7e7 / 29 = 2e6
       nr-levels = 3; nr-blks = 2e6 / 29 = 8e4
       nr-levels = 4; nr-blks = 8e4 / 29 = 3e3
       nr-levels = 5; nr-blks = 3e3 / 29 = 1e2
       nr-levels = 6; nr-blks = 1e2 / 29 = 3
       nr-levels = 7; nr-blks = 1
 
   - Total block count = 7e7
 
 Total blocks used across rmapbt, refcountbt and bmbt = 2e8 + 2e8 + 7e7 = 5e8.
 
Since 5e8 < 4e9(i.e. 2^32), we have not run out of blocks trying to
build a rmapbt with XFS_BTREE_MAXLEVELS (i.e 9) levels.

Please let me know if my understanding is incorrect.

I have come across a "log reservation" calculation issue when
increasing XFS_BTREE_MAXLEVELS to 10 which is in turn required for
extending data fork extent count to 48 bits. To proceed further, I
need to have a correct understanding of problem I have described w.r.t
1k filesystem block size.

-- 
chandan




             reply	other threads:[~2020-11-30  9:06 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-30  9:05 Chandan Babu R [this message]
2020-11-30 19:26 ` Maximum height of rmapbt when reflink feature is enabled Darrick J. Wong
2020-11-30 22:03   ` Dave Chinner
2020-12-01 13:12     ` Chandan Babu R
2020-12-01 16:22       ` Darrick J. Wong
2020-11-30 21:51 ` Dave Chinner
2020-12-01 13:12   ` Chandan Babu R
2020-12-03 21:55     ` Darrick J. Wong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3275346.ciGmp8L3Sz@garuda \
    --to=chandanrlinux@gmail.com \
    --cc=darrick.wong@oracle.com \
    --cc=david@fromorbit.com \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox