linux-fsdevel.vger.kernel.org archive mirror
From: hooanon05@yahoo.co.jp
To: Arnd Bergmann <arnd@arndb.de>
Cc: Jamie Lokier <jamie@shareable.org>,
	Phillip Lougher <phillip@lougher.demon.co.uk>,
	David Newall <davidn@davidnewall.com>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	hch@lst.de
Subject: Re: [RFC 0/7] [RFC] cramfs: fake write support
Date: Mon, 02 Jun 2008 19:36:32 +0900
Message-ID: <9159.1212402992@jrobl>
In-Reply-To: <200806020912.49721.arnd@arndb.de>


Arnd Bergmann:
> Without reading either again, the top problems in unionfs at the time were:
> * data inconsistency problems when simultaneously accessing the underlying
>   fs and the union.
> * duplication of dentry and inode data structures in the union wastes
>   memory and cpu cycles.
> * whiteouts are in the same namespace as regular files, so conflicts are
>   possible.
> * mounting a large number of aufs on top of each other eventually
>   overflows the kernel stack, e.g. in readdir.
> * allowing multiple writable branches (instead of just stacking
>   one rw copy on a number of ro file systems) is confusing to the user
>   and complicates the implementation a lot.
> 
> With the exception of the last two, I assumed that these were all
> unfixable with a file system based approach (including the hypothetical
> union-tmpfs). If you have addressed them, how?

I will try to explain each point individually.
Here is what I implemented in aufs.
Any comments are welcome.

> * data inconsistency problems when simultaneously accessing the underlying
>   fs and the union.
Aufs has three levels of detection for direct access to the lower
(branch) filesystems (i.e. bypassing aufs). I guess the strictest level
is a good answer to your question. It is based on the inotify
feature. Aufs sets an inotify watch on every accessed directory on the
lower fs. While those inodes are cached, aufs receives the inotify
events for their children/files and marks the aufs data for the file as
obsolete. When the file is accessed later, aufs retrieves the latest
inode (or dentry) again.
The inotify watch is removed when the aufs dir inode is discarded
from the cache.
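
To illustrate the mark-obsolete-then-refetch idea described above, here
is a rough userspace sketch. It is not aufs code: the classes LowerFS
and UnionCache and all of their methods are hypothetical, and plain
callbacks stand in for real inotify watches.

```python
class LowerFS:
    """Stand-in for a branch filesystem: (dir, name) -> data."""

    def __init__(self):
        self.files = {}
        self.listeners = []  # callbacks playing the role of inotify watches

    def write(self, directory, name, data):
        # A direct write to the branch, bypassing the union.
        self.files[(directory, name)] = data
        for callback in self.listeners:
            callback(directory, name)  # deliver the "event"


class UnionCache:
    """Toy model of the union's cache over one lower filesystem."""

    def __init__(self, lower):
        self.lower = lower
        self.cached = {}       # (dir, name) -> data last seen
        self.obsolete = set()  # cached keys known to be stale
        self.watched = set()   # directories that currently have a watch
        lower.listeners.append(self.on_event)

    def on_event(self, directory, name):
        # Only mark the entry; the refetch happens lazily on next access.
        key = (directory, name)
        if directory in self.watched and key in self.cached:
            self.obsolete.add(key)

    def lookup(self, directory, name):
        self.watched.add(directory)  # watch every accessed directory
        key = (directory, name)
        if key not in self.cached or key in self.obsolete:
            self.cached[key] = self.lower.files[key]  # fetch latest state
            self.obsolete.discard(key)
        return self.cached[key]

    def evict_dir(self, directory):
        # Dropping the cached directory also drops its watch.
        self.watched.discard(directory)
        for key in [k for k in self.cached if k[0] == directory]:
            del self.cached[key]
            self.obsolete.discard(key)
```

In this model the cost of a direct write is one set insertion; the
lower fs is consulted again only if the file is accessed afterwards.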


> * duplication of dentry and inode data structures in the union wastes
>   memory and cpu cycles.

Aufs has its own dentry and inode objects, as a normal fs has, and they
have pointers to the corresponding ones on the lower fs. If you make a
union from two real filesystems, then an aufs inode will have (at most)
two pointers as its private data.
Do you mean that having these pointers is a duplication?
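
The layout described above can be pictured with a small sketch. The
names LowerInode, UnionInode and topmost are hypothetical and only model
the shape of the data; the real objects are kernel structures.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LowerInode:
    """Stand-in for an inode that belongs to a branch filesystem."""
    ino: int
    size: int


@dataclass
class UnionInode:
    """The union's own inode; private data is one slot per branch."""
    ino: int
    lower: List[Optional[LowerInode]] = field(default_factory=list)

    def topmost(self) -> Optional[LowerInode]:
        # The first branch that actually has the file wins.
        for candidate in self.lower:
            if candidate is not None:
                return candidate
        return None
```

For a union of two branches, `UnionInode(1, [None, some_lower_inode])`
holds two slots that refer to the lower objects rather than copying
them; the union inode itself is still a separate object.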


> * whiteouts are in the same namespace as regular files, so conflicts are
>   possible.

Yes, that's right.
Aufs reserves ".wh." as a whiteout prefix, and prohibits users from
handling such filenames inside aufs. It might be a problem as you wrote,
but users can create/remove them directly on the lower fs, and I have
never received a request about this reserved prefix.
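
A minimal sketch of the reserved-prefix rule follows. Only the ".wh."
prefix comes from the text above; the function names and the choice of
PermissionError are assumptions made for illustration, not what aufs
actually returns.

```python
WH_PREFIX = ".wh."  # reserved whiteout prefix, as stated above


def is_reserved(name: str) -> bool:
    """True if the name falls in the whiteout namespace."""
    return name.startswith(WH_PREFIX)


def whiteout_name(name: str) -> str:
    """Name of the marker that hides `name` in lower branches."""
    return WH_PREFIX + name


def check_user_name(name: str) -> str:
    """Reject names a user may not handle through the union."""
    if is_reserved(name):
        # Hypothetical error type, chosen only for this sketch.
        raise PermissionError(f"{name!r} uses the reserved whiteout prefix")
    return name
```

The conflict Arnd mentions is visible here: a user file legitimately
named ".wh.foo" cannot be told apart from the whiteout for "foo".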


> * mounting a large number of aufs on top of each other eventually
>   overflows the kernel stack, e.g. in readdir.

The aufs readdir operation consumes memory, but not stack. If it were
implemented as a recursive function, it might cause a stack
overflow, but it is actually a loop.
The memory is used for storing entry names and eliminating whited-out
ones, and the result is cached for a specified time. So memory
(other than stack) is consumed.
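
The loop-based merge can be sketched roughly as below. This is an
illustration under assumptions, not the aufs implementation: branches
are given as plain lists of names with the topmost branch first, a
whiteout is assumed to hide the name only in branches below it, and
union_readdir is a hypothetical name. Result caching is left out.

```python
WH_PREFIX = ".wh."


def union_readdir(branches):
    """Merge directory listings; `branches` is topmost-first."""
    seen = set()        # names already emitted (heap, not stack)
    whited_out = set()  # names hidden by a whiteout in a higher branch
    result = []
    for names in branches:  # one flat loop, no recursion per branch
        hidden_here = set()
        for name in names:
            if name.startswith(WH_PREFIX):
                hidden_here.add(name[len(WH_PREFIX):])
                continue  # whiteout markers are never shown
            if name in seen or name in whited_out:
                continue  # duplicate, or hidden from above
            seen.add(name)
            result.append(name)
        # Whiteouts found in this branch apply to the branches below it.
        whited_out |= hidden_here
    return result
```

Stack depth stays constant regardless of the number of branches; what
grows is the heap memory held by the two name sets and the result list.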


> * allowing multiple writable branches (instead of just stacking
>   one rw copy on a number of ro file systems) is confusing to the user
>   and complicates the implementation a lot.

Probably you are right. Initially aufs had only one policy to select the
writable branch. But several users requested other policies such as
round-robin or most-free-space, and aufs has implemented them.
I don't think users will be confused by these policies. Although I tried
to keep it simple, I guess some people will say it is complex.
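
As a rough picture of the two policies named above, here is a sketch
with hypothetical function names. Branches are plain strings and the
free-space figures are passed in by the caller; how aufs itself obtains
and compares them is not shown here.

```python
import itertools


def make_round_robin(branches):
    """Return a picker that cycles through the writable branches."""
    cycle = itertools.cycle(branches)
    return lambda: next(cycle)


def most_free_space(branches, free_bytes):
    """Pick the writable branch with the most free space.

    `free_bytes` maps branch -> free bytes (supplied by the caller).
    """
    return max(branches, key=lambda branch: free_bytes[branch])
```

For example, with branches "rw0" and "rw1", the round-robin picker
alternates between them on each new file, while most_free_space always
returns whichever currently reports the larger figure.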


Junjiro Okajima



