From: Bill O'Donnell <billodo@redhat.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Brian Foster <bfoster@redhat.com>,
xfs <linux-xfs@vger.kernel.org>,
Dave Chinner <david@fromorbit.com>
Subject: Re: [PATCH v2] xfsdocs: capture some information about dirs vs. attrs and how they use dabtrees
Date: Tue, 5 May 2020 11:30:24 -0500
Message-ID: <20200505163024.GA96092@redhat.com>
In-Reply-To: <20200505162039.GR5703@magnolia>
On Tue, May 05, 2020 at 09:20:39AM -0700, Darrick J. Wong wrote:
> On Tue, May 05, 2020 at 09:55:51AM -0400, Brian Foster wrote:
> > On Mon, Apr 13, 2020 at 12:46:00PM -0700, Darrick J. Wong wrote:
> > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > >
> > > Dave and I had a short discussion about whether or not xattr trees
> > > needed to have the same free space tracking that directories have, and
> > > a comparison of how each of the two metadata types interact with
> > > dabtrees resulted. I've reworked this a bit to make it flow better as a
> > > book chapter, so here we go.
> > >
> > > Original-mail: https://lore.kernel.org/linux-xfs/20200404085203.1908-1-chandanrlinux@gmail.com/T/#mdd12ad06cf5d635772cc38946fc5b22e349e136f
> > > Originally-from: Dave Chinner <david@fromorbit.com>
> > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > ---
> > > v2: various fixes suggested by Dave; reflow the paragraphs about
> > > directories to describe the relations between dabtree and dirents only once;
> > > don't talk about an unnamed "we".
> > > ---
> > > .../extended_attributes.asciidoc | 55 ++++++++++++++++++++
> > > 1 file changed, 55 insertions(+)
> > >
> > > diff --git a/design/XFS_Filesystem_Structure/extended_attributes.asciidoc b/design/XFS_Filesystem_Structure/extended_attributes.asciidoc
> > > index 99f7b35..b7a6007 100644
> > > --- a/design/XFS_Filesystem_Structure/extended_attributes.asciidoc
> > > +++ b/design/XFS_Filesystem_Structure/extended_attributes.asciidoc
> > > @@ -910,3 +910,58 @@ Log sequence number of the last write to this block.
> > >
> > > Filesystems formatted prior to v5 do not have this header in the remote block.
> > > Value data begins immediately at offset zero.
> > > +
> > > +== Key Differences Between Directories and Extended Attributes
> > > +
> > > +Though directories and extended attributes can take advantage of the same
> > > +variable length record btree structures (i.e. the dabtree) to map name hashes
> > > +to directory entry records (dirent records) or extended attribute records,
> > > +there are major differences in the ways that each of those users embeds the
> > > +btree within the information that they are storing. The directory dabtree leaf
> > > +nodes contain mappings between a name hash and the location of a dirent record
> > > +inside the directory entry segment. Extended attributes, on the other hand,
> > > +store attribute records directly in the leaf nodes of the dabtree.
>
> Hmm, both you and Bill are right; maybe I don't want to have a
> three-sentence paragraph where the first sentence constitutes 60% of the
> words in that paragraph. :)
I like this a lot better, thanks!
Reviewed-by: Bill O'Donnell <billodo@redhat.com>
>
> "Directories and extended attributes share the function of mapping names
> to information, but the differences in the functionality requirements
> applied to each type of structure influence their respective internal
> formats. Directories map variable length names to iterable directory
> entry records (dirent records), whereas extended attributes map variable
> length names to non-iterable attribute records. Both structures can
> take advantage of variable length record btree structures (i.e. the
> dabtree) to map name hashes, but there are major differences in the way
> each type of structure integrates the dabtree index within the
> information being stored. The directory dabtree leaf nodes contain
> mappings between a name hash and the location of a dirent record inside
> the directory entry segment. Extended attributes, on the other hand,
> store attribute records directly in the leaf nodes of the dabtree."
>
> How about that instead?
>
> --D
>
> > > +
> >
> > Does the above mean to say "there are major differences in the ways each
> > of these users embed information in the btree" as opposed to "embed the
> > btree within the information?" The latter wording confuses me a bit,
> > otherwise the rest looks good to me:
> >
> > Reviewed-by: Brian Foster <bfoster@redhat.com>
> >
> > > +When XFS adds or removes an attribute record in any dabtree, it splits or
> > > +merges leaf nodes of the tree based on where the name hash index determines
> > > +that a record must be inserted or removed. In the attribute dabtree, XFS
> > > +splits or merges sparse leaf nodes of the dabtree as a side effect of inserting
> > > +or removing attribute records.
> > > +
> > > +Directories, however, are subject to stricter constraints. The userspace
> > > +readdir/seekdir/telldir directory cookie API places a requirement on the
> > > +directory structure that a dirent record's cookie cannot change for the life of the
> > > +dirent record. XFS uses the dirent record's logical offset into the directory
> > > +data segment as the cookie, and hence the dirent record cannot change location.
> > > +Therefore, XFS cannot store dirent records in the leaf nodes of the dabtree
> > > +because the offset into the tree would change as other entries are inserted and
> > > +removed.
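A minimal sketch of the point in the paragraph above, assuming a toy model
rather than the real on-disk format: a record kept inside a hash-sorted leaf
changes position when a lower hash is inserted, while a record kept in a
separate data segment keeps its offset. The hash values, names, and list-based
layout are all made up for illustration.

```python
import bisect

# Attribute-style layout: records live in the hash-sorted leaf itself.
leaf = []
bisect.insort(leaf, (50, "b"))
pos_before = leaf.index((50, "b"))
bisect.insort(leaf, (10, "a"))          # lower hash lands in front of "b"
pos_after = leaf.index((50, "b"))       # "b" has moved within the leaf

# Directory-style layout: records live in a data segment; the hash index
# stores only (hash, offset) pairs that point into it.
data_segment = []
hash_index = []

def add_dirent(name_hash, name):
    """Append the record and index its offset; earlier offsets never change."""
    offset = len(data_segment)
    data_segment.append(name)
    bisect.insort(hash_index, (name_hash, offset))
    return offset

cookie_b = add_dirent(50, "b")
add_dirent(10, "a")                      # the index reorders, the data does not
cookie_b_later = dict(hash_index)[50]    # look "b" up again through the index
```

In this model the offset plays the role of the readdir cookie, so it can be
handed to userspace without worrying about later inserts.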
> > > +
> > > +Dirent records are therefore stored within directory data blocks, all of which
> > > +are mapped in the first directory segment. The directory dabtree is mapped
> > > +into the second directory segment. Therefore, directory blocks require
> > > +external free space tracking because they are not part of the dabtree itself.
> > > +Because the dabtree only stores pointers to dirent records in the first data
> > > +segment, there is no need to leave holes in the dabtree itself. The dabtree
> > > +splits or merges leaf nodes as required as pointers to the directory data
> > > +segment are added or removed, and needs no free space tracking.
> > > +
> > > +When XFS adds a dirent record, it needs to find the best-fitting free space in
> > > +the directory data segment to turn into the new record. This requires a free
> > > +space index for the directory data segment. The free space index is held in
> > > +the third directory segment. Once XFS has used the free space index to find
> > > +the block with the best free space, it modifies the directory data block and
> > > +updates the dabtree to point the name hash at the new record. When XFS removes
> > > +dirent records, it leaves a hole in the data segment so that the rest of the
> > > +entries do not move, and removes the corresponding dabtree name hash mapping.
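The add and remove steps described above can be sketched roughly as follows.
This is a toy model under stated assumptions, not the XFS implementation: the
class name ToyDirectory, the dict-based free space index, and using the name
length as the record size are all hypothetical, and hash collisions are
ignored.

```python
class ToyDirectory:
    """Toy model of a data segment, a hash index, and a free space index."""

    def __init__(self, size):
        self.free = {0: size}   # free space index: hole offset -> hole length
        self.records = {}       # data segment: offset -> name
        self.index = {}         # stand-in for the dabtree: name hash -> offset

    def add(self, name, name_hash):
        need = len(name)        # assumption: record size equals name length
        fits = [(length, off) for off, length in self.free.items()
                if length >= need]
        if not fits:
            raise OSError("no free space large enough")
        length, off = min(fits)                 # best fit: smallest usable hole
        del self.free[off]
        if length > need:
            self.free[off + need] = length - need   # keep the leftover as a hole
        self.records[off] = name
        self.index[name_hash] = off             # point the hash at the record
        return off

    def remove(self, name_hash):
        off = self.index.pop(name_hash)         # drop the hash mapping
        name = self.records.pop(off)
        self.free[off] = len(name)              # leave a hole; nothing else moves


toy = ToyDirectory(32)
off_alpha = toy.add("alpha", 1)
off_bravo = toy.add("bravo-long", 2)
off_charlie = toy.add("charlie", 3)
toy.remove(1)                                   # hole of length 5 at offset 0
off_echo = toy.add("echo", 4)                   # best fit reuses that hole
```

Removing "alpha" leaves the offsets of the other two records untouched, and
the later add picks the 5-byte hole over the larger one at the end of the
segment because it is the tightest fit.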
> > > +
> > > +Note that for small directories, XFS collapses the name hash mappings and
> > > +the free space information into the directory data blocks to save space.
> > > +
> > > +In summary, the requirement for a free space map in the directory structure
> > > +results from storing the dirent records externally to the dabtree. Attribute
> > > +records are stored directly in the leaf nodes of the dabtree (except
> > > +for remote attribute values which can be anywhere in the attr fork address
> > > +space) and do not need external free space tracking to determine where to best
> > > +insert them. As a result, extended attributes exhibit nearly perfect scaling
> > > +until the computer runs out of memory.
> > >
> >
>
Thread overview: 7+ messages
2020-04-13 19:46 [PATCH v2] xfsdocs: capture some information about dirs vs. attrs and how they use dabtrees Darrick J. Wong
2020-05-04 20:53 ` Darrick J. Wong
2020-05-05 13:03 ` Bill O'Donnell
2020-05-05 13:55 ` Brian Foster
2020-05-05 16:20 ` Darrick J. Wong
2020-05-05 16:30 ` Bill O'Donnell [this message]
2020-05-05 17:48 ` Brian Foster