linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Valerie Aurora <vaurora@redhat.com>
To: "J. R. Okajima" <hooanon05@yahoo.co.jp>
Cc: Neil Brown <neilb@suse.de>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Miklos Szeredi <miklos@szeredi.hu>, Jan Blunck <jblunck@suse.de>,
	Christoph Hellwig <hch@infradead.org>,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 14/39] union-mount: Union mounts documentation
Date: Mon, 23 Aug 2010 20:05:05 -0400	[thread overview]
Message-ID: <20100824000505.GA20909@shell> (raw)
In-Reply-To: <16318.1282181699@jrobl>

On Thu, Aug 19, 2010 at 10:34:59AM +0900, J. R. Okajima wrote:
> 
> Valerie Aurora:
> > According Al Viro, unionfs has some fundamental architectural problems
> > that prevents it from being correct and leads to crashes:
> >
> > http://lkml.indiana.edu/hypermail/linux/kernel/0802.0/0839.html
> >
> > The main question for me is whether aufs has fixed these problems.  If
> > it hasn't, then it can't be bug-free.
> 
> Although I don't understand fully your question, aufs actually verifies
> the parent-child relationship after lock_rename() on the writable layer.
> Such verification is done in other operations too.
> And aufs provides three options to specify the level of
> verification. When the highest (most strict) level is given, aufs_rename
> lookup again after lock_rename() and compares the got parent and the
> given (cached) parent.
> Does this answer your question correctly?

First, my theory when writing any file system code is that whenever Al
Viro says, "You can deadlock easily" or "It violates the locking
rules" that I have to understand the problem and fix it.  I understand
why union mounts doesn't have the problems unionfs had when Al wrote
this email (because lower layers are not writable).  But since aufs
allows directories on lower layers to be renamed in the way that
creates the problems Al describes, I assume it has this same problem
until the author understands the unionfs problem and can describe why
aufs didn't inherit it (or fixed it, or whatever).

Second, why isn't the most strict level of lookup the only option?  It
seems like anything else is a bug.

Third, you have this odd circular inheritance problem that comes from
moving a child directory on the lower layer to the path of its parent,
and vice versa.  From Al's email:

> If you allow a mix of old and new mappings, you can easily run into the
> situations when at some moment X1 covers Y1, X2 covers Y2, X2 is a descendent
> of X1 and Y1 is a descendent of Y2. You *really* don't want to go there -
> if nothing else, defining behaviour of copyup in face of that insanity
> will be very painful.

I understand the circular inheritance problem but find this hard to
explain better than Al does above.  But here's an example of how you
get there:

Start with parent_dir1/child_dir1 covering parent_dir2/child_dir2
 thread 1 does a union lookup and gets:
  parent_dir1 covering parent_dir2
  child_dir1 covering child_dir2
  parent_dir1 parent of child_dir1
  parent_dir2 parent of child_dir2
 thread 2 swaps parent_dir2 with child_dir2 (using rename and a tmp dir)
  now lower fs looks like: child_dir2/parent_dir2

Who inherits what?  Does thread 1 see parent_dir2 as a descendant of
child_dir2 which is a descendant of parent_dir2 through the union with
parent_dir1?  Can you sanely define the behavior here?

Fourth, you have a potential deadlock now.  Say thread 1 is operating
with the belief that parent_dir1/child_dir1 covers
parent_dir2/child_dir2.  parent_dir2/child_dir2 gets renamed such that
the two switch places, as described above.  And thread 2 is directly
accessing the lower file system, now with child_dir2/parent_dir2.  The
locking order for thread 1 is:

parent_dir2 -> parent_dir1 -> child_dir1 -> child_dir2

For thread 2, it is:

child_dir2 -> parent_dir2

So if thread 1 gets a lock on parent_dir2, and then thread 2 gets a
lock on child_dir2, they will deadlock.  In general, this situation
violates the fundamental assumptions of correct directory locking,
described in Documentation/filesystems/directory-locking.

That's my attempt to explain Al's email, anyway. :) All errors are my
own.

> > Think about the case of two different RPM package database files.  One
> > contains the info from newly installed packages on the top layer file
> > system.  The lower layer contains info from packages newly installed
> > on the lower file system.  You don't want either file; you want the
> > merged packaged database showing the info for all packages installed
> > on both layers.  Any practical file system based system is only going
> > to be able to pick one file or the other, and it's going to be wrong
> > in some cases.
> 
> Let me make sure.
> Do you mean something like this?
> - a user makes a union
> - fileA exists on the lower layer but upper
> - modify fileA in the union
>   --> the file is copied-up and updated on the upper layer.
> - modify fileA on the lower layer directly (by-passing union)
>   --> the file on the lower is updated.
> - and the user will not see the uptodate fileA in the union, lack of the
>   modification made on the lower directly.
> 
> Then I'd say it is an expected behaviour. Simply the upper file hides
> the lower.

I am not arguing with you and I agree that this is the expected
behavior.  I wrote about this case just to show that there is a case
in which what the user "wants" in an upgrade situation is impossible
to do automatically in the file system.  So you need to have a smart
tool to do an upgrade of the lower layer file system.  And I argue
that smart tool should deal with all cases of a file copied up to the
topmost file system that covers an updated file on the lower file
system, instead of putting this policy decision into the VFS.

-VAL

  reply	other threads:[~2010-08-24  0:05 UTC|newest]

Thread overview: 60+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-08-08 15:52 [PATCH 00/39] Union mounts - return d_ino from lower fs Valerie Aurora
2010-08-08 15:52 ` [PATCH 01/39] VFS: Comment follow_mount() and friends Valerie Aurora
2010-08-08 15:52 ` [PATCH 02/39] VFS: Make lookup_hash() return a struct path Valerie Aurora
2010-08-08 15:52 ` [PATCH 03/39] VFS: Add read-only users count to superblock Valerie Aurora
2010-08-08 15:52 ` [PATCH 04/39] autofs4: Save autofs trigger's vfsmount in super block info Valerie Aurora
2010-08-08 15:52 ` [PATCH 05/39] whiteout/NFSD: Don't return information about whiteouts to userspace Valerie Aurora
2010-08-08 15:52 ` [PATCH 06/39] whiteout: Add vfs_whiteout() and whiteout inode operation Valerie Aurora
2010-08-08 15:52 ` [PATCH 07/39] whiteout: Set opaque flag if new directory was previously a whiteout Valerie Aurora
2010-08-08 15:52 ` [PATCH 08/39] whiteout: Allow removal of a directory with whiteouts Valerie Aurora
2010-08-08 15:52 ` [PATCH 09/39] whiteout: tmpfs whiteout support Valerie Aurora
2010-08-08 15:52 ` [PATCH 10/39] whiteout: Split of ext2_append_link() from ext2_add_link() Valerie Aurora
2010-08-08 15:52 ` [PATCH 11/39] whiteout: ext2 whiteout support Valerie Aurora
2010-08-08 15:52 ` [PATCH 12/39] whiteout: jffs2 " Valerie Aurora
2010-08-08 15:52 ` [PATCH 13/39] fallthru: Basic fallthru definitions Valerie Aurora
2010-08-08 15:52 ` [PATCH 14/39] union-mount: Union mounts documentation Valerie Aurora
2010-08-09 22:56   ` Neil Brown
2010-08-11  1:51     ` J. R. Okajima
2010-08-17 20:44     ` Valerie Aurora
2010-08-17 22:53       ` Neil Brown
2010-08-18  0:15         ` Luca Barbieri
2010-08-18 19:04         ` Valerie Aurora
2010-08-18  1:23       ` J. R. Okajima
2010-08-18 18:55         ` Valerie Aurora
2010-08-19  1:34           ` J. R. Okajima
2010-08-24  0:05             ` Valerie Aurora [this message]
2010-08-24  2:28               ` J. R. Okajima
2010-08-24 20:48                 ` Valerie Aurora
2010-08-25  2:59                   ` Christian Stroetmann
2010-08-25  5:03                   ` J. R. Okajima
2010-08-08 15:52 ` [PATCH 15/39] union-mount: Introduce MNT_UNION and MS_UNION flags Valerie Aurora
2010-08-08 15:52 ` [PATCH 16/39] union-mount: Introduce union_dir structure and basic operations Valerie Aurora
2010-08-08 15:52 ` [PATCH 17/39] union-mount: Free union dirs on removal from dcache Valerie Aurora
2010-08-08 15:52 ` [PATCH 18/39] union-mount: Support for union mounting file systems Valerie Aurora
2010-08-08 15:52 ` [PATCH 19/39] union-mount: Implement union lookup Valerie Aurora
2010-08-13 13:49   ` Miklos Szeredi
2010-08-17 21:44     ` Valerie Aurora
2010-08-18  8:11       ` Miklos Szeredi
2010-08-08 15:52 ` [PATCH 20/39] union-mount: Call do_whiteout() on unlink and rmdir in unions Valerie Aurora
2010-08-08 15:52 ` [PATCH 21/39] union-mount: Copy up directory entries on first readdir() Valerie Aurora
2010-08-08 15:52 ` [PATCH 22/39] union-mount: Add generic_readdir_fallthru() helper Valerie Aurora
2010-08-08 15:52 ` [PATCH 23/39] fallthru: ext2 fallthru support Valerie Aurora
2010-08-13 13:52   ` Miklos Szeredi
2010-08-17 21:08     ` Valerie Aurora
2010-08-17 22:28     ` Valerie Aurora
2010-08-08 15:52 ` [PATCH 24/39] fallthru: jffs2 " Valerie Aurora
2010-08-08 15:52 ` [PATCH 25/39] fallthru: tmpfs " Valerie Aurora
2010-08-08 15:52 ` [PATCH 26/39] VFS: Split inode_permission() and create path_permission() Valerie Aurora
2010-08-08 15:52 ` [PATCH 27/39] VFS: Create user_path_nd() to lookup both parent and target Valerie Aurora
2010-08-08 15:52 ` [PATCH 28/39] union-mount: In-kernel file copyup routines Valerie Aurora
2010-08-08 15:52 ` [PATCH 29/39] union-mount: Implement union-aware access()/faccessat() Valerie Aurora
2010-08-08 15:52 ` [PATCH 30/39] union-mount: Implement union-aware link() Valerie Aurora
2010-08-08 15:52 ` [PATCH 31/39] union-mount: Implement union-aware rename() Valerie Aurora
2010-08-08 15:52 ` [PATCH 32/39] union-mount: Implement union-aware writable open() Valerie Aurora
2010-08-08 15:52 ` [PATCH 33/39] union-mount: Implement union-aware chown() Valerie Aurora
2010-08-08 15:52 ` [PATCH 34/39] union-mount: Implement union-aware truncate() Valerie Aurora
2010-08-08 15:52 ` [PATCH 35/39] union-mount: Implement union-aware chmod()/fchmodat() Valerie Aurora
2010-08-08 15:52 ` [PATCH 36/39] union-mount: Implement union-aware lchown() Valerie Aurora
2010-08-08 15:52 ` [PATCH 37/39] union-mount: Implement union-aware utimensat() Valerie Aurora
2010-08-08 15:52 ` [PATCH 38/39] union-mount: Implement union-aware setxattr() Valerie Aurora
2010-08-08 15:52 ` [PATCH 39/39] union-mount: Implement union-aware lsetxattr() Valerie Aurora

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20100824000505.GA20909@shell \
    --to=vaurora@redhat.com \
    --cc=hch@infradead.org \
    --cc=hooanon05@yahoo.co.jp \
    --cc=jblunck@suse.de \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=miklos@szeredi.hu \
    --cc=neilb@suse.de \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).