From: Bill O'Donnell <billodo@redhat.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: xfs <linux-xfs@vger.kernel.org>, Dave Chinner <david@fromorbit.com>
Subject: Re: [PATCH v2] xfsdocs: capture some information about dirs vs. attrs and how they use dabtrees
Date: Tue, 5 May 2020 08:03:19 -0500 [thread overview]
Message-ID: <20200505130319.GA88638@redhat.com> (raw)
In-Reply-To: <20200413194600.GC6742@magnolia>
On Mon, Apr 13, 2020 at 12:46:00PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
>
> Dave and I had a short discussion about whether or not xattr trees
> needed to have the same free space tracking that directories have, and
> a comparison of how each of the two metadata types interact with
> dabtrees resulted. I've reworked this a bit to make it flow better as a
> book chapter, so here we go.
>
> Original-mail: https://lore.kernel.org/linux-xfs/20200404085203.1908-1-chandanrlinux@gmail.com/T/#mdd12ad06cf5d635772cc38946fc5b22e349e136f
> Originally-from: Dave Chinner <david@fromorbit.com>
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
> v2: various fixes suggested by Dave; reflow the paragraphs about
> directories to describe the relations between dabtree and dirents only once;
> don't talk about an unnamed "we".
> ---
Looks good to me.
Reviewed-by: Bill O'Donnell <billodo@redhat.com>
> .../extended_attributes.asciidoc | 55 ++++++++++++++++++++
> 1 file changed, 55 insertions(+)
>
> diff --git a/design/XFS_Filesystem_Structure/extended_attributes.asciidoc b/design/XFS_Filesystem_Structure/extended_attributes.asciidoc
> index 99f7b35..b7a6007 100644
> --- a/design/XFS_Filesystem_Structure/extended_attributes.asciidoc
> +++ b/design/XFS_Filesystem_Structure/extended_attributes.asciidoc
> @@ -910,3 +910,58 @@ Log sequence number of the last write to this block.
>
> Filesystems formatted prior to v5 do not have this header in the remote block.
> Value data begins immediately at offset zero.
> +
> +== Key Differences Between Directories and Extended Attributes
> +
> +Though directories and extended attributes can take advantage of the same
> +variable length record btree structures (i.e. the dabtree) to map name hashes
> +to directory entry records (dirent records) or extended attribute records,
> +there are major differences in the ways that each of those users embed the
> +btree within the information that they are storing. The directory dabtree leaf
> +nodes contain mappings between a name hash and the location of a dirent record
> +inside the directory entry segment. Extended attributes, on the other hand,
> +store attribute records directly in the leaf nodes of the dabtree.
> +
> +When XFS adds or removes an attribute record in any dabtree, it splits or
> +merges leaf nodes of the tree based on where the name hash index determines a
> +record needs to be inserted into or removed. In the attribute dabtree, XFS
> +splits or merges sparse leaf nodes of the dabtree as a side effect of inserting
> +or removing attribute records.
> +
> +Directories, however, are subject to stricter constraints. The userspace
> +readdir/seekdir/telldir directory cookie API places a requirement on the
> +directory structure that dirent record cookie cannot change for the life of the
> +dirent record. XFS uses the dirent record's logical offset into the directory
> +data segment as the cookie, and hence the dirent record cannot change location.
> +Therefore, XFS cannot store dirent records in the leaf nodes of the dabtree
> +because the offset into the tree would change as other entries are inserted and
> +removed.
> +
> +Dirent records are therefore stored within directory data blocks, all of which
> +are mapped in the first directory segment. The directory dabtree is mapped
> +into the second directory segment. Therefore, directory blocks require
> +external free space tracking because they are not part of the dabtree itself.
> +Because the dabtree only stores pointers to dirent records in the first data
> +segment, there is no need to leave holes in the dabtree itself. The dabtree
> +splits or merges leaf nodes as required as pointers to the directory data
> +segment are added or removed, and needs no free space tracking.
> +
> +When XFS adds a dirent record, it needs to find the best-fitting free space in
> +the directory data segment to turn into the new record. This requires a free
> +space index for the directory data segment. The free space index is held in
> +the third directory segment. Once XFS has used the free space index to find
> +the block with that best free space, it modifies the directory data block and
> +updates the dabtree to point the name hash at the new record. When XFS removes
> +dirent records, it leaves hole in the data segment so that the rest of the
> +entries do not move, and removes the corresponding dabtree name hash mapping.
> +
> +Note that for small directories, XFS collapses the name hash mappings and
> +the free space information into the directory data blocks to save space.
> +
> +In summary, the requirement for a free space map in the directory structure
> +results from storing the dirent records externally to the dabtree. Attribute
> +records are stored directly in the dabtree leaf nodes of the dabtree (except
> +for remote attribute values which can be anywhere in the attr fork address
> +space) and do not need external free space tracking to determine where to best
> +insert them. As a result, extended attributes exhibit nearly perfect scaling
> +until the computer runs out of memory.
>
next prev parent reply other threads:[~2020-05-05 13:03 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-04-13 19:46 [PATCH v2] xfsdocs: capture some information about dirs vs. attrs and how they use dabtrees Darrick J. Wong
2020-05-04 20:53 ` Darrick J. Wong
2020-05-05 13:03 ` Bill O'Donnell [this message]
2020-05-05 13:55 ` Brian Foster
2020-05-05 16:20 ` Darrick J. Wong
2020-05-05 16:30 ` Bill O'Donnell
2020-05-05 17:48 ` Brian Foster
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200505130319.GA88638@redhat.com \
--to=billodo@redhat.com \
--cc=darrick.wong@oracle.com \
--cc=david@fromorbit.com \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).