linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dave Kleikamp <shaggy@austin.ibm.com>
To: "Theodore Ts'o" <tytso@mit.edu>
Cc: Anton Altaparmakov <aia21@cam.ac.uk>,
	Akshat Aranya <aaranya@cs.sunysb.edu>,
	linux-fsdevel@vger.kernel.org
Subject: Re: Expected getdents behaviour
Date: Fri, 16 Sep 2005 06:57:18 -0500	[thread overview]
Message-ID: <1126871838.9350.16.camel@kleikamp.austin.ibm.com> (raw)
In-Reply-To: <20050916033945.GA11047@thunk.org>

On Thu, 2005-09-15 at 23:39 -0400, Theodore Ts'o wrote:
> JFS also uses a b-tree (or more than one b-tree) to index its
> directory, and also gives the same guarantee, but it implements it in
> an entirely different way.

Here are the detail of jfs's implementation.  Each directory entry is
assigned sequential "index" when it is created, which is stored in the
directory entry in the dtree (the primary directory data structure).  In
addition, an entry is created in the "index table", which is an array
keyed on the index.  Initially, this table can be stored directly in the
inode, but as it grows, the table is stored in other blocks, and is
addressed by a b-tree anchored in the inode.  For an existing entry, the
table stores the address of the dtree page, and the offset within the
page.  This table is updated whenever a change to the directory moves
the entry.  When an entry is deleted, the table contains the index of
the next existing entry in dtree-order.

Resuming a search means getting the table element for the index,
checking a flag whether it exists or has been deleted, following the
chain of "next" indices until an existing one is found, and getting the
dtree page address and offset of the entry.

The disadvantages are the need to update the directory table as entries
are moved, and the fact that the index table does not shrink as entries
are removed.  (The table is completely truncated if all entries are
removed from a directory, but that isn't typically going to happen.)

> Two different filesystems; two different strategies; both implement
> this guarantee without needing to lock the directory against
> modifications during the readdir() scan.  Take a look at the
> implementations closely.
> 
> (Hint: both involve walking the b-tree and returning readdir() entries
> in tree order.  The details though of which b-tree, how the b-tree is
> indexed, how the index is stored in file descriptor offset, and how
> telldir/seekdir are implemented are different in the two filesystems,
> however.)

-- 
David Kleikamp
IBM Linux Technology Center


  reply	other threads:[~2005-09-16 11:57 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2005-09-15 13:57 Expected getdents behaviour Akshat Aranya
2005-09-15 14:03 ` Peter Staubach
2005-09-15 14:07 ` Anton Altaparmakov
2005-09-15 14:12   ` Anton Altaparmakov
2005-09-15 14:45     ` Miklos Szeredi
2005-09-15 15:17       ` Anton Altaparmakov
2005-09-15 16:41         ` Jan Blunck
2005-09-15 17:46           ` Jörn Engel
2005-09-15 18:19             ` Theodore Ts'o
2005-09-15 21:04               ` Anton Altaparmakov
2005-09-16  7:50                 ` Nikita Danilov
2005-09-15 21:47               ` Jörn Engel
2005-09-16  7:29               ` Nikita Danilov
2005-09-16 11:58                 ` Theodore Ts'o
2005-09-15 21:00             ` Anton Altaparmakov
2005-09-15 21:15               ` Charles P. Wright
2005-09-15 21:19                 ` Anton Altaparmakov
2005-09-15 20:28           ` Anton Altaparmakov
2005-09-15 16:51         ` Miklos Szeredi
2005-09-15 21:17           ` Anton Altaparmakov
2005-09-15 15:51     ` Theodore Ts'o
2005-09-15 16:52       ` Bryan Henderson
2005-09-15 16:57         ` Jeremy Allison
2005-09-15 20:51           ` Anton Altaparmakov
2005-09-15 20:50         ` Anton Altaparmakov
2005-09-15 23:41           ` Bryan Henderson
2005-09-15 20:25       ` Anton Altaparmakov
2005-09-16  3:39         ` Theodore Ts'o
2005-09-16 11:57           ` Dave Kleikamp [this message]
2005-09-15 18:08     ` Nikita Danilov
2005-09-16 11:23       ` Miklos Szeredi
2005-09-16  1:28   ` tridge

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1126871838.9350.16.camel@kleikamp.austin.ibm.com \
    --to=shaggy@austin.ibm.com \
    --cc=aaranya@cs.sunysb.edu \
    --cc=aia21@cam.ac.uk \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).