From: Matt Keenan <tank.en.mate@gmail.com>
To: hooanon05@yahoo.co.jp
Cc: bharata@linux.vnet.ibm.com, linux-kernel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, hch@infradead.org,
	Jan Blunck <jblunck@suse.de>,
	"Josef 'Jeff' Sipek" <jsipek@cs.sunysb.edu>
Subject: Re: [RFC] Union Mount: Readdir approaches
Date: Sat, 08 Sep 2007 00:04:01 +0100
Message-ID: <46E1D8E1.2060702@gmail.com>
In-Reply-To: <E1ITYJK-0000xP-8h@jroun>

hooanon05@yahoo.co.jp wrote:
> Hello Bharata,
> I am developing a Linux stackable/unification filesystem too.
>
> Bharata B Rao:
>
>> Questions
>> ---------
>>
> :::
>
>> First of all, should we even expect a sane lseek(2) on union mounted
>> directories ? If not, will the Approach 2, which works uniformly for
>> all filesystem types be acceptable ?
>>
>> If lseek(2) needs to be supported, then how do we define the seek behaviour
>> when two different types of directories are union'ed ? For eg. how do we define
>> lseek(2) on a union of ext2 directory on top of a nfs directory ? Since both
>> of them use different encoding methods for filp->f_pos, how do we establish
>> a common lseek(2) behaviour here ?
>>
>> And finally, what is the use case for directory seek ? Would anybody walk
>> directory by directory by seeking into a directory file ?
>>
>
> Although I don't remember exactly, NFS or smbfs seeks in a
> directory. Additionally, any user process can call seekdir or
> something. So I believe any stackable/unification filesystem should
> support it.
>
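
To make the seekdir point concrete: a user process can save a position
with telldir(3), read on, and later return with seekdir(3), expecting to
land on the same entry. Below is a minimal user-space sketch of that
round trip. The helper name seekdir_roundtrip is made up for
illustration; the libc calls are the standard POSIX ones.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* telldir()/seekdir() are XSI extensions */
#endif
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 if seekdir() brings the stream back to
 * the same entry, 0 if a different entry comes back, -1 on error or if
 * the directory has fewer than two entries.
 */
static int seekdir_roundtrip(const char *path)
{
    char saved[1024];
    struct dirent *de;
    long pos;
    int same;
    DIR *dir = opendir(path);

    if (!dir)
        return -1;

    /* Consume one entry so the saved position is not the very start. */
    if (!readdir(dir)) {
        closedir(dir);
        return -1;
    }

    pos = telldir(dir);          /* opaque cookie, not a byte offset */
    de = readdir(dir);
    if (!de) {
        closedir(dir);
        return -1;
    }
    /* readdir() may reuse its buffer, so keep a private copy of the name. */
    snprintf(saved, sizeof saved, "%s", de->d_name);

    seekdir(dir, pos);           /* go back to the saved position */
    de = readdir(dir);
    same = de && strcmp(saved, de->d_name) == 0;

    closedir(dir);
    return same;
}
```

Whatever a union filesystem returns as the directory position has to
survive this kind of round trip, at least while the directory is not
modified in between.
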
> Here is my approach. While I don't think it is the best approach since
> it consumes much memory and CPU, I hope it helps you. (or assists
> you? Sorry, I don't know the correct English word)
>
> - the stackable fs has its own inode, file, dentry object, which has an
> array for the underlying inode pointers. and the whiteout is a regular
> file with a special name, instead of a flag in inode. this is the most
> different architecture from your unification embedded in VFS.
> - the virtual dir inode object has a cache for its child entries. it is
> called vdir. the cache has a version and a customizable lifetime too.
> - all the existing underlying (same-named) dir are opened too. the file
> objects are stored in the virtual file object as an array.
> - the virtual file object has a cache for its child entries too. it is a
> copy of the one in the inode object.
>
> When the first readdir is issued:
> - call vfs_readdir for every underlying opened dir (file) object.
> - store every entry to either the hash table for the result or the
> whiteout, when the same-named entry didn't exist in the tables.
> - to improve performance, the allocated memory for the hash
> tables is managed in a pointer array. and the elements are
> concatenated logically by the pointer.
> - the pointer for the result-table, the version, and the current jiffies
> are set to vdir, which is a cache in an inode.
> - all cache are copied to a member in a file object.
> - the index of the cache memory block and the offset in an array is
> handled as the seek position.
>
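
The first-readdir steps quoted above could be sketched roughly as
follows in user-space C. Everything in it is an assumption made for
illustration and not taken from aufs: the ".wh." whiteout prefix, the
fixed-size open-addressing table, and the names name_set and
merge_layer are hypothetical. Layers are fed topmost first, so a name
already seen in either table hides the same name in lower layers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SET_CAP   64        /* fixed capacity, enough for a sketch */
#define WH_PREFIX ".wh."    /* hypothetical whiteout name prefix */

struct name_set {
    const char *names[SET_CAP];   /* open-addressing slots, NULL = empty */
    size_t count;
};

static size_t hash_name(const char *s)
{
    size_t h = 5381;

    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h % SET_CAP;
}

static bool set_contains(const struct name_set *set, const char *name)
{
    size_t i = hash_name(name);

    for (size_t n = 0; n < SET_CAP; n++, i = (i + 1) % SET_CAP) {
        if (!set->names[i])
            return false;             /* hit an empty slot: not present */
        if (strcmp(set->names[i], name) == 0)
            return true;
    }
    return false;
}

static bool set_add(struct name_set *set, const char *name)
{
    size_t i;

    if (set->count >= SET_CAP - 1)    /* always keep one slot empty */
        return false;
    i = hash_name(name);
    while (set->names[i])             /* linear probing */
        i = (i + 1) % SET_CAP;
    set->names[i] = name;
    set->count++;
    return true;
}

/* Merge the entries of one layer; call for the topmost layer first. */
static void merge_layer(struct name_set *result, struct name_set *whiteouts,
                        const char *const *entries, size_t n)
{
    size_t wh_len = strlen(WH_PREFIX);

    for (size_t i = 0; i < n; i++) {
        const char *name = entries[i];
        bool is_wh = strncmp(name, WH_PREFIX, wh_len) == 0;

        if (is_wh)
            name += wh_len;           /* the name being hidden */
        if (set_contains(result, name) || set_contains(whiteouts, name))
            continue;                 /* a higher layer already decided */
        set_add(is_wh ? whiteouts : result, name);
    }
}
```

With an upper layer of {"a", ".wh.b"} and a lower layer of
{"a", "b", "c"}, the result table ends up holding "a" and "c", and "b"
sits only in the whiteout table.
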
> In the case where the application issues this sequence:
> - opendir()
> - readdir()
> - creat or unlink an entry under the dir
> - readdir()
>
> When an entry under the dir was removed or added, the inode version will
> be updated. Since readdir can compare it with the cached version or the
> lifetime (jiffies) in the file object, it can refresh the entries. But
> in this case, it doesn't, since the file position is not 0. If the
> application needs the latest entries, it has to call rewinddir.
> The cache in the file object will be updated only when it is obsolete AND
> the file position is 0.
>
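
The refresh rule described above (refresh only when the cache is
obsolete AND the position is 0) can be written down as a small
predicate, together with one possible way of packing "block index plus
offset in the block" into a single seek position. This is a sketch
under assumptions: the 20-bit split, the struct layout, and all names
are hypothetical, and a plain tick counter stands in for jiffies.

```c
#include <stdbool.h>
#include <stdint.h>

#define OFF_BITS 20                       /* assumed split: low bits = offset */
#define OFF_MASK ((1 << OFF_BITS) - 1)

struct vdir_stamp {
    uint64_t version;     /* directory version when the cache was built */
    uint64_t built_at;    /* tick count (stand-in for jiffies) at build time */
};

/* Pack (cache block index, offset inside that block) into one position. */
static int64_t pos_pack(uint32_t block, uint32_t off)
{
    return ((int64_t)block << OFF_BITS) | (off & OFF_MASK);
}

static uint32_t pos_block(int64_t pos)
{
    return (uint32_t)(pos >> OFF_BITS);
}

static uint32_t pos_off(int64_t pos)
{
    return (uint32_t)(pos & OFF_MASK);
}

/* Obsolete if the directory changed or the lifetime has expired. */
static bool cache_is_stale(const struct vdir_stamp *c, uint64_t dir_version,
                           uint64_t now, uint64_t lifetime)
{
    return c->version != dir_version || now - c->built_at > lifetime;
}

/* The per-file copy is rebuilt only when stale AND the position is 0. */
static bool file_cache_should_refresh(const struct vdir_stamp *c, int64_t pos,
                                      uint64_t dir_version, uint64_t now,
                                      uint64_t lifetime)
{
    return pos == 0 && cache_is_stale(c, dir_version, now, lifetime);
}
```

Keeping the old copy while the position is non-zero is what keeps a
previously returned position meaningful: the (block, offset) pair still
points into the same cached list until the application rewinds.
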
> When a dir that already has its vdir is opened, the cache in the inode
> object will be used without calling vfs_readdir, after checking the
> version and the lifetime which are stored in the inode object. If it is
> obsolete, vfs_readdir will be called again in order to update the cache
> in the inode.
>
>

This sounds like a good approach. How does aufs handle low-memory
situations? Union mounts seem to be quite common on low-memory embedded
systems. Is there a way for the VM to signal aufs (or the union
filesystem) to trim its cache? Also, on the memory consumption front, I
guess you could get the union fs to refer to a singleton name entry
directly instead of creating a new virtual inode et al. This may lead to
some odd behaviour, though, for mounts over different filesystems that
have different-length directory files (e.g. vfat and ext3). This does run
counter to the model described above in some ways.
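
On the trimming question, the kernel's shrinker callbacks are the usual
way for the VM to ask a cache to give memory back. What the filesystem
side of such a hook might do can be sketched in user-space C as below.
This is only an illustration: the list layout and the names vdir_cache,
cache_add and cache_trim are hypothetical, and no real kernel API is
called.

```c
#include <stddef.h>
#include <stdlib.h>

struct vdir_cache {
    struct vdir_cache *next;   /* singly linked, oldest first */
    void *entries;             /* the cached directory entries */
};

struct cache_list {
    struct vdir_cache *oldest;
    struct vdir_cache *newest;
    size_t count;
};

/* Append a new cache of the given size as the newest element. */
static int cache_add(struct cache_list *list, size_t bytes)
{
    struct vdir_cache *c = malloc(sizeof *c);

    if (!c)
        return -1;
    c->entries = malloc(bytes);
    if (!c->entries) {
        free(c);
        return -1;
    }
    c->next = NULL;
    if (list->newest)
        list->newest->next = c;
    else
        list->oldest = c;
    list->newest = c;
    list->count++;
    return 0;
}

/* Free up to nr caches, oldest first; return how many were freed. */
static size_t cache_trim(struct cache_list *list, size_t nr)
{
    size_t freed = 0;

    while (freed < nr && list->oldest) {
        struct vdir_cache *victim = list->oldest;

        list->oldest = victim->next;
        if (!list->oldest)
            list->newest = NULL;
        free(victim->entries);
        free(victim);
        list->count--;
        freed++;
    }
    return freed;
}
```

A dropped cache costs only another round of vfs_readdir calls the next
time the directory is read, so the oldest entries are cheap to give up
under memory pressure.
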
Matt