public inbox for linux-xfs@vger.kernel.org
From: Chandan Babu R <chandanrlinux@gmail.com>
To: Dave Chinner <david@fromorbit.com>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>, linux-xfs@vger.kernel.org
Subject: Re: Maximum height of rmapbt when reflink feature is enabled
Date: Tue, 01 Dec 2020 18:42:51 +0530
Message-ID: <2114686.IuJF2Ahm34@garuda>
In-Reply-To: <20201130220347.GI2842436@dread.disaster.area>

On Tue, 01 Dec 2020 09:03:47 +1100, Dave Chinner wrote:
> On Mon, Nov 30, 2020 at 11:26:05AM -0800, Darrick J. Wong wrote:
> > On Mon, Nov 30, 2020 at 02:35:21PM +0530, Chandan Babu R wrote:
> > > I have come across a "log reservation" calculation issue when
> > > increasing XFS_BTREE_MAXLEVELS to 10 which is in turn required for
> > 
> > Hmm.  That will increase the size of the btree cursor structure even
> > farther.  It's already gotten pretty bad with the realtime rmap and
> > reflink patchsets since the realtime volume can have 2^63 blocks, which
> > implies a theoretical maximum rtrmapbt height of 21 levels and a maximum
> > rtrefcountbt height of 13 levels.
> 
> The cursor is dynamically allocated, yes? So what we need to do is
> drop the idea that the btree is a fixed size and base its size on
> the actual number of levels we calculated for the btree it is
> being allocated for, right?
> 
> > (These heights are absurd, since they imply a data device of 2^63
> > blocks...)
> > 
> > I suspect that we need to split MAXLEVELS into two values -- one for
> > per-AG btrees, and one for per-file btrees,
> 
> We already do that. XFS_BTREE_MAXLEVELS is supposed to only be for
> per-AG btrees.  It is not used for BMBTs at all, they use
> mp->m_bm_maxlevels[] which have max height calculations done at
> mount time.
> 
> The problem is the cursor, because right now max mp->m_bm_maxlevels
> fits within the XFS_BTREE_MAXLEVELS limit for the per-AG trees as
> well, because everything is limited to less than 2^32 records...
> 
> > and then refactor the btree
> > cursor so that the level data are a single VLA at the end.  I started a
> > patchset to do all that[1], but it's incomplete.
> > 
> > [1] https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/commit/?h=btree-dynamic-depth&id=692f761838dd821cd8cc5b3d1c66d6b1ac8ec05b
>
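
For readers following along, here is a minimal sketch of the "single VLA
at the end" idea quoted above. It is an illustration only and is not taken
from Darrick's branch: struct btree_cur, struct level_state, cur_sizeof()
and cur_alloc() are hypothetical names, and the real cursor carries far
more state. The point is that the allocation size follows the height
computed for the specific btree rather than a global maximum.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-level cursor state (assumed layout, not the kernel's). */
struct level_state {
	void	*bp;	/* buffer holding the block at this level */
	int	ptr;	/* index of the current record/key in that block */
};

/* Hypothetical cursor whose per-level data is a flexible array member. */
struct btree_cur {
	unsigned int		nlevels;	/* current height of the tree */
	unsigned int		maxlevels;	/* capacity of levels[] */
	struct level_state	levels[];	/* sized at allocation time */
};

/* Bytes needed for a cursor that can walk a tree of up to maxlevels levels. */
static size_t cur_sizeof(unsigned int maxlevels)
{
	return sizeof(struct btree_cur) +
	       (size_t)maxlevels * sizeof(struct level_state);
}

/* Allocate a zeroed cursor sized for the given maximum height. */
static struct btree_cur *cur_alloc(unsigned int maxlevels)
{
	struct btree_cur *cur;

	assert(maxlevels > 0);
	cur = calloc(1, cur_sizeof(maxlevels));
	if (!cur)
		return NULL;
	cur->maxlevels = maxlevels;
	return cur;
}
```

With this shape, a per-AG btree and a taller per-file btree each pay only
for their own height, e.g. cur_alloc(9) versus cur_alloc(21).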

Darrick, I will rebase my "Extend data fork extent count field" patches on top
of your patch, with the required fixes applied. Please let me know if you have
any objections.

> Yeah, this, along with dynamic sizing of the rmapbt based
> on the physical AG size when refcount is enabled...
> 
> And then we just don't have to care about the 1kB block size case at
> all....
> 
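
As a rough illustration of sizing a btree from the number of records it can
ever hold (for example, a bound derived from the physical AG size), the
worst-case height can be computed from the minimum records per leaf block
and the minimum keys per node block. This is a sketch under assumptions:
btree_max_height() is a hypothetical helper, and the minimum fill values
passed to it depend on block size and record format, which are not given
in this thread.

```c
#include <assert.h>

/*
 * Worst-case height of a btree holding nrecords records, assuming every
 * leaf holds only leaf_minrecs records and every interior node holds only
 * node_minrecs pointers (the sparsest legal tree is the tallest one).
 */
static unsigned int btree_max_height(unsigned long long nrecords,
				     unsigned int leaf_minrecs,
				     unsigned int node_minrecs)
{
	unsigned long long blocks;
	unsigned int height;

	assert(leaf_minrecs > 0);
	assert(node_minrecs > 1);	/* otherwise the loop cannot converge */

	/* Number of leaf blocks, rounded up without risking overflow. */
	blocks = nrecords / leaf_minrecs + (nrecords % leaf_minrecs != 0);

	/* Each extra level divides the block count by the node fan-out. */
	for (height = 1; blocks > 1; height++)
		blocks = blocks / node_minrecs + (blocks % node_minrecs != 0);

	return height;
}
```

For instance, btree_max_height(1000, 10, 10) needs 100 leaves, then 10
nodes, then a single root, giving a height of 3. Smaller blocks mean
smaller minimum fill values and therefore taller trees, which is why the
1kB block size case is the awkward one.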

-- 
chandan




