Re: [PATCH V8 15/19] xfs: Directory's data fork extent counter can never overflow

public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed

From: Chandan Babu R <chandan.babu@oracle.com>
To: Dave Chinner <david@fromorbit.com>
Cc: "Darrick J. Wong" <djwong@kernel.org>, linux-xfs@vger.kernel.org
Subject: Re: [PATCH V8 15/19] xfs: Directory's data fork extent counter can never overflow
Date: Fri, 01 Apr 2022 13:16:49 +0530	[thread overview]
Message-ID: <87fsmx6yae.fsf@debian-BULLSEYE-live-builder-AMD64> (raw)
In-Reply-To: <20220401012712.GJ1544202@dread.disaster.area>

On 01 Apr 2022 at 06:57, Dave Chinner wrote:
> On Tue, Mar 29, 2022 at 08:43:33PM -0700, Darrick J. Wong wrote:
>> But then the second question is: what's the maximum height of a dabtree
>> that indexes an xattr structure?  I don't think there's any maximum
>> limit within XFS on the number of attrs you can set on a file, is there?
>
> Nope. But the attr btree is a different beast to the dirv2
> structure - it's a hashed index btree with the xattr records in the
> leaf, it's not an index of a fixed max size external data structure.
>
>> At least until you hit the iext_max_count check.  I think the VFS
>> institutes its own limit of 64k for the llistxattr buffer, but that's
>> about all I can think of.
>
>
>> I suppose right now the xattr structure can't grow larger than 2^(16+21)
>> blocks in size, which is 2^49 bytes, but that's a mix of attr leaves and
>> dabtree blocks, unlike directories, right?
>
> Yes. ALso remote xattr blocks, which aren't stored in the dabtree
> at all, but are indexed by the attr fork BMBT address space.
>
> So I think for XFS_DA_NODE_MAXDEPTH = 5, the xattr tree with a 4kB
> block size (minimum allowed, IIRC), we can index approximately
> (500^4) individual xattrs in the btree index (assuming 4 levels of
> index and 1 level of leaf nodes containing xattrs).
>
> My math calculates that to be about 62.5 billion xattrs before we
> run out of index space in a 5 level tree with a 4kB block size.
> Hence I suspect we'll run out of attr fork extents even on a 32 bit
> extent count before we run out of xattr index space.
>
> Also, 62.5 billion xattrs is more links than we can have to a single
> inode (4billion), so we're not going to exhaust the xattr index
> space just with parent pointers...
>

Attr dabtree blocks can be of 1k block size. My analysis of dabtree w.r.t
parent pointers had led me to this,

An inode could have 2^32 hard links with each link having 255 byte sized name.

- Size of one xattr
  Name length + Value length = 16 + 255 = 271 bytes
  16 comes from the size of the following structure,
      struct xfs_parent_name_rec {
      	__be64  p_ino;
      	__be32  p_gen;
      	__be32  p_diroffset;
      };

- sizeof(xfs_attr_leaf_hdr_t) = 32
- sizeof(xfs_attr_leaf_entry_t) = 8
- Number of entries in a 1k leaf block
  (1024 - sizeof(xfs_attr_leaf_hdr_t)) / (8 + 271)
  = (1024 - 32) / 279 = 992 / 279
  = floor(3.55)
  = 3

- Nr leaves = (2^32 / 3) * 3 (magicpct) = ~4.3 billion
- Nr entries per node
  = (1024 - sizeof(struct xfs_da3_node_hdr)) / sizeof(struct xfs_da_node_entry)
  = (1024 - 64) / 8
  = 120 entries

- Nr entries at level (n - 1) = 4.3 billion / 120 = 36 million
- Nr entries at level (n - 2) = 36 million / 120 = 300k
- Nr entries at level (n - 3) = 300k / 120 = 2.5k
- Nr entries at level (n - 4) = 2.5k / 120 = 20
- Nr entries at level (n - 5) = 20 / 120 = 1

Hence with 1024 block size, we could require 1 leaf level and 5 non-leaf
levels. This breaches the maximum height (i.e. XFS_DA_NODE_MAXDEPTH) allowed
for a dabtree.

>> > immediately how many blocks can be in the XFS_DIR2_LEAF_SPACE
>> > segement....
>> > 
>> > We also know the maximum number of individual directory blocks in
>> > the 32GB segment (fixed at 32GB / dir block size), so the free space
>> > array is also a fixed size at (32GB / dir block size / free space
>> > entries per block).
>> > 
>> > It's easy to just use (96GB / block size) and that will catch most
>> > corruptions with no risk of a false positive detection, but we could
>> > quite easily refine this to something like:
>> > 
>> > data	(32GB +				
>> > leaf	 btree blocks(XFS_DA_NODE_MAXDEPTH) +
>> > freesp	 (32GB / free space records per block))
>> > frags					/ filesystem block size
>> 
>> I think we ought to do a more careful study of XFS_DA_NODE_MAXDEPTH,
>
> *nod*
>
> ISTR that Chandan had already done some of this when he
> characterised the attr fork btree behaviour w.r.t. suitabililty for
> large numbers of parent pointers before starting this extent count
> work.
>

Sorry, I had forgotten about that. I searched through my notes and found that
we had reached the same conclusion w.r.t 4k block size. But with 1k block size,
we could breach the limit set by XFS_DA_NODE_MAXDEPTH.

-- 
chandan

next prev parent reply	other threads:[~2022-04-01  7:47 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-21  5:17 [PATCH V8 00/19] xfs: Extend per-inode extent counters Chandan Babu R
2022-03-21  5:17 ` [PATCH V8 01/19] xfs: Move extent count limits to xfs_format.h Chandan Babu R
2022-03-21  5:17 ` [PATCH V8 02/19] xfs: Define max extent length based on on-disk format definition Chandan Babu R
2022-03-21  5:17 ` [PATCH V8 03/19] xfs: Introduce xfs_iext_max_nextents() helper Chandan Babu R
2022-03-21  5:17 ` [PATCH V8 04/19] xfs: Use xfs_extnum_t instead of basic data types Chandan Babu R
2022-03-21  5:17 ` [PATCH V8 05/19] xfs: Introduce xfs_dfork_nextents() helper Chandan Babu R
2022-03-24 21:31   ` Dave Chinner
2022-03-21  5:17 ` [PATCH V8 06/19] xfs: Use basic types to define xfs_log_dinode's di_nextents and di_anextents Chandan Babu R
2022-03-21  5:17 ` [PATCH V8 07/19] xfs: Promote xfs_extnum_t and xfs_aextnum_t to 64 and 32-bits respectively Chandan Babu R
2022-03-24 21:33   ` Dave Chinner
2022-03-21  5:17 ` [PATCH V8 08/19] xfs: Introduce XFS_SB_FEAT_INCOMPAT_NREXT64 and associated per-fs feature bit Chandan Babu R
2022-03-24 21:37   ` Dave Chinner
2022-03-24 21:40     ` Darrick J. Wong
2022-03-25  6:01       ` Chandan Babu R
2022-03-21  5:17 ` [PATCH V8 09/19] xfs: Introduce XFS_FSOP_GEOM_FLAGS_NREXT64 Chandan Babu R
2022-03-21  5:17 ` [PATCH V8 10/19] xfs: Introduce XFS_DIFLAG2_NREXT64 and associated helpers Chandan Babu R
2022-03-24 21:38   ` Dave Chinner
2022-03-21  5:17 ` [PATCH V8 11/19] xfs: Use uint64_t to count maximum blocks that can be used by BMBT Chandan Babu R
2022-03-24 21:42   ` Dave Chinner
2022-03-25  6:01     ` Chandan Babu R
2022-03-21  5:17 ` [PATCH V8 12/19] xfs: Introduce macros to represent new maximum extent counts for data/attr forks Chandan Babu R
2022-03-21  5:17 ` [PATCH V8 13/19] xfs: Replace numbered inode recovery error messages with descriptive ones Chandan Babu R
2022-03-24 21:47   ` Dave Chinner
2022-03-21  5:17 ` [PATCH V8 14/19] xfs: Introduce per-inode 64-bit extent counters Chandan Babu R
2022-03-24 21:53   ` Dave Chinner
2022-03-25  6:02     ` Chandan Babu R
2022-03-21  5:17 ` [PATCH V8 15/19] xfs: Directory's data fork extent counter can never overflow Chandan Babu R
2022-03-24 22:14   ` Dave Chinner
2022-03-25  6:02     ` Chandan Babu R
2022-03-25  6:37       ` Chandan Babu R
2022-03-29  5:22     ` Chandan Babu R
2022-03-29  6:23       ` Dave Chinner
2022-03-30  3:43         ` Darrick J. Wong
2022-03-30 15:39           ` Chandan Babu R
2022-03-30 15:52             ` Darrick J. Wong
2022-04-01  1:27           ` Dave Chinner
2022-04-01  7:46             ` Chandan Babu R [this message]
2022-03-21  5:17 ` [PATCH V8 16/19] xfs: Conditionally upgrade existing inodes to use large extent counters Chandan Babu R
2022-03-24 22:28   ` Dave Chinner
2022-03-25  6:02     ` Chandan Babu R
2022-03-21  5:17 ` [PATCH V8 17/19] xfs: Decouple XFS_IBULK flags from XFS_IWALK flags Chandan Babu R
2022-03-24 22:28   ` Dave Chinner
2022-03-21  5:17 ` [PATCH V8 18/19] xfs: Enable bulkstat ioctl to support 64-bit per-inode extent counters Chandan Babu R
2022-03-24 22:33   ` Dave Chinner
2022-03-21  5:17 ` [PATCH V8 19/19] xfs: Add XFS_SB_FEAT_INCOMPAT_NREXT64 to the list of supported flags Chandan Babu R

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87fsmx6yae.fsf@debian-BULLSEYE-live-builder-AMD64 \
    --to=chandan.babu@oracle.com \
    --cc=david@fromorbit.com \
    --cc=djwong@kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox