From: hooanon05@yahoo.co.jp
To: "David P. Quigley" <dpquigl@tycho.nsa.gov>
Cc: Theodore Tso <tytso@mit.edu>, Tomas M <tomas@slax.org>,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: New filesystem for Linux kernel
Date: Thu, 26 Feb 2009 13:21:26 +0900
Message-ID: <7326.1235622086@jrobl>
In-Reply-To: <1235577214.15148.77.camel@moss-terrapins.epoch.ncsc.mil>
Thank you for searching.
"David P. Quigley":
> "And unionfs is the wrong thing do use for this. Unioning is a complex
> namespace operation and needs to be implemented in the VFS or at least
> needs a lot of help from the VFS. Getting namespace cache coherency
> and especially locking right is impossible with out that."
>
> I'd suggest getting the VFS maintainers to chime in on your code. If
> their opinion on this has changed then you are in much better shape for
> getting AUFS2 merged.
It may not be appropriate to ask you what "especially locking right"
means in detail. But if it means what I am guessing, the following
description may be the answer.
(from [RFC 3/8] Aufs2: lookup)
Revalidate Dentry and UDBA (User's Direct Branch Access)
----------------------------------------------------------------------
Generally, VFS helpers revalidate a dentry as a part of lookup.
0. dig down the directory hierarchy.
1. lock the parent dir by its i_mutex.
2. lookup the final (child) entry.
3. revalidate it.
4. call the actual operation (create, unlink, etc.)
5. unlock the parent dir.
If the filesystem implements its own ->d_revalidate() (step 3), then it
is called. Aufs does implement it, and checks that the dentry on a
branch is still valid.
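The flow above can be sketched as a small userspace toy model. This is
only an illustration, not kernel code: the names Dir, vfs_operate and
unlink are hypothetical, a threading.Lock stands in for i_mutex, and
step 0 (walking down to the parent) is assumed to be done already. The
real VFS path walk and dentry cache are far more involved.

```python
import threading


class Dir:
    """Toy stand-in for a directory; `mutex` plays the role of i_mutex."""

    def __init__(self):
        self.mutex = threading.Lock()
        self.children = {}  # name -> object standing in for a dentry


def vfs_operate(parent, name, operation, d_revalidate=None):
    """Run steps 1-5 for `name` under an already-resolved parent (step 0)."""
    trace = []
    result = None
    parent.mutex.acquire()                 # step 1: lock the parent dir
    trace.append("lock")
    try:
        child = parent.children.get(name)  # step 2: look up the child entry
        trace.append("lookup")
        valid = True
        if d_revalidate is not None:       # step 3: only if the fs implements it
            trace.append("revalidate")
            valid = d_revalidate(child)
        if valid:
            result = operation(parent, name, child)  # step 4: the operation
            trace.append("operate")
    finally:
        parent.mutex.release()             # step 5: unlock the parent dir
        trace.append("unlock")
    return trace, result


def unlink(parent, name, child):
    """Hypothetical operation: remove the entry, report whether it existed."""
    return parent.children.pop(name, None) is not None
```

Calling vfs_operate(d, "a", unlink, d_revalidate=lambda c: c is not None)
on a Dir that contains "a" goes through all five steps in order. If the
hook reports the entry as invalid, step 4 is skipped but the lock is
still released.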
But this is not enough, because aufs has to release the lock on the
parent dir on a branch at the end of ->lookup() (step 2) and
->d_revalidate() (step 3), while the i_mutex of the aufs dir is still
held by VFS.
If the file on a branch is changed directly, i.e. bypassing aufs, after
aufs has released the lock, then the subsequent operation may produce
an unpleasant result.
This situation is a result of the VFS architecture, in which ->lookup()
and ->d_revalidate() are separate. I am not saying that this is wrong.
It is a good design from the VFS's point of view. It is just not suited
to the sub-VFS character of aufs.
Aufs handles such cases with three levels of revalidation, selectable
by the user.
1. Simple Revalidate
In addition to the native VFS flow, aufs confirms the child-parent
relationship on the branch just after locking the parent dir on the
branch in the "actual operation" (step 4). When this validation
fails, aufs returns EBUSY. ->d_revalidate() (step 3) in aufs still
checks the validity of the dentry on branches.
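The check described in this level can be sketched as a userspace toy
model. It is not the aufs implementation: BranchDir, BranchEntry,
move_directly and unlink_simple_revalidate are hypothetical names, and
the "direct" change is simulated as a rename into another directory on
the branch.

```python
import errno
import threading


class BranchDir:
    """Toy directory on a branch with its own lock."""

    def __init__(self, name):
        self.name = name
        self.lock = threading.Lock()
        self.entries = {}  # child name -> BranchEntry


class BranchEntry:
    """Toy dentry on a branch; `parent` is its current parent dir."""

    def __init__(self, name, parent):
        self.name = name
        self.parent = parent
        parent.entries[name] = self


def move_directly(entry, new_parent):
    """Simulate a user renaming the file on the branch, bypassing aufs."""
    del entry.parent.entries[entry.name]
    entry.parent = new_parent
    new_parent.entries[entry.name] = entry


def unlink_simple_revalidate(cached_parent, cached_entry):
    """Step 4 with the extra check: 0 on success, -EBUSY if stale."""
    with cached_parent.lock:  # lock the parent dir on the branch
        # Confirm the child-parent relationship only now, with the lock held.
        if cached_entry.parent is not cached_parent:
            return -errno.EBUSY
        del cached_parent.entries[cached_entry.name]
        cached_entry.parent = None
        return 0
```

If the entry was moved to another directory between lookup and the
operation, the call returns -errno.EBUSY and leaves the branch
untouched. Otherwise it performs the unlink and returns 0.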
2. Monitor Changes Internally by Inotify
In addition to the above, in the "actual operation" (step 4) aufs looks
up the dentry on the branch again, and returns EBUSY if it finds a
different dentry.
Additionally, aufs sets an inotify watch on every dir on the branches
while it is in cache. When an event is notified, aufs registers a
function with the kernel 'events' thread by schedule_work(). The
function sets a special status in the private data of the cached aufs
dentry and inode. If they are not cached, then aufs has nothing to
do. When the same file is accessed through aufs (steps 0-3) later,
aufs detects the status and refreshes all necessary data.
In this mode, aufs has to ignore the events which are fired by aufs
itself.
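The deferred marking in this mode can be sketched as a userspace toy
model. It does not use inotify or schedule_work(). A deque stands in
for the work handed to the 'events' thread, and a boolean flag stands
in for the "special status". All names (CachedEntry, Monitor, by_aufs
and the method names) are hypothetical.

```python
from collections import deque


class CachedEntry:
    """Stand-in for the aufs dentry/inode private data kept in cache."""

    def __init__(self, name, data):
        self.name = name
        self.data = data
        self.stale = False  # the "special status" set by the work function


class Monitor:
    def __init__(self, branch):
        self.branch = branch   # dict: name -> data, stands in for a branch dir
        self.cache = {}        # name -> CachedEntry, only entries in cache
        self.work = deque()    # deferred work, stand-in for the 'events' thread
        self.by_aufs = False   # True while aufs itself changes the branch

    def on_event(self, name):
        """Called when the watch fires; the real handling is deferred."""
        if self.by_aufs:
            return             # ignore events fired by aufs itself
        self.work.append(name)

    def run_work(self):
        """Deferred function: mark cached entries, skip uncached ones."""
        while self.work:
            entry = self.cache.get(self.work.popleft())
            if entry is not None:
                entry.stale = True

    def write_via_aufs(self, name, data):
        self.by_aufs = True
        try:
            self.branch[name] = data
            self.on_event(name)
            if name in self.cache:
                self.cache[name].data = data
        finally:
            self.by_aufs = False

    def write_directly(self, name, data):
        """A user changes the branch, bypassing aufs."""
        self.branch[name] = data
        self.on_event(name)

    def lookup(self, name):
        entry = self.cache.get(name)
        if entry is None:
            entry = self.cache[name] = CachedEntry(name, self.branch[name])
        elif entry.stale:      # status detected on the next access
            entry.data = self.branch[name]
            entry.stale = False
        return entry.data
```

A direct write followed by run_work() marks the cached entry, and the
next lookup() refreshes it. A write through write_via_aufs() produces
an event too, but the event is dropped, so the cache is not marked.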
3. No Extra Validation
This is the simplest mode. It does not add any extra revalidation
test, and skips the revalidation in step 4. It is useful and improves
aufs performance when the system reliably hides the aufs branches from
users, by over-mounting something (or by another method).
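As a compact summary, the three levels can be written down as a small
table in code. This is only an illustrative sketch: the enum name Udba
and its member names are made up for this example and are not taken
from the aufs sources or its mount options.

```python
from enum import Enum


class Udba(Enum):
    SIMPLE = "simple revalidate"
    INOTIFY = "monitor changes internally by inotify"
    NONE = "no extra validation"


def step4_checks(level):
    """Extra checks done in the "actual operation" (step 4) per level."""
    checks = []
    if level in (Udba.SIMPLE, Udba.INOTIFY):
        checks.append("confirm child-parent relationship on the branch")
    if level is Udba.INOTIFY:
        checks.append("re-lookup the dentry on the branch")
    return checks


def watches_branch_dirs(level):
    """Only the inotify level sets watches on cached branch dirs."""
    return level is Udba.INOTIFY
```

With this model, step4_checks(Udba.NONE) is an empty list, and each of
the other two levels adds to the checks of the one before it.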
----------------------------------------------------------------------
> This may sound like a copout but unfortunately it seems my logs were on
> my hard drive that died a few months back. Regardless though since you
> did a major rewrite for AUFS2 those comments could possibly no longer be
> valid. Regardless since there was a major rewrite since your last review
> several people should review the code base.
I have no objection to a review; I entirely agree.
Because I guessed it would be hard work to read 40k lines, I posted
documents which describe the design first.
J. R. Okajima