Re: [PATCH 17/39] union-mount: Union mounts documentation

From: Valerie Aurora <vaurora@redhat.com>
To: Jamie Lokier <jamie@shareable.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	Christoph Hellwig <hch@infradead.org>,
	Jan Blunck <jblunck@suse.de>
Subject: Re: [PATCH 17/39] union-mount: Union mounts documentation
Date: Wed, 5 May 2010 09:19:20 -0400
Message-ID: <20100505131920.GB7534@shell>
In-Reply-To: <20100504211209.GC4360@shareable.org>

On Tue, May 04, 2010 at 10:12:09PM +0100, Jamie Lokier wrote:
> Valerie Aurora wrote:
> > +File copyup: Create a file on the top layer that has the same metadata
> > +and contents as the file with the same pathname on the bottom layer.
> 
> Can copyup be interrupted?  E.g. if I chmod an 80GB file, will the
> chmod() system call pause for a couple of hours, or can I control-C it?

The right behavior is that you should be able to control-C it, but I
doubt that currently works.  Let me look into testing and implementing
this.
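
As a rough userspace illustration of what "interruptible" could mean
here (not the patch set's in-kernel copyup code), a copy can proceed in
chunks, check a stop condition between chunks, and discard the partial
copy if stopped.  The function name, the chunk size, and the
should_stop callback are all assumptions made for this sketch.

```python
import os
import tempfile

CHUNK = 1 << 20  # 1 MiB per step; an arbitrary choice for this sketch


def copy_interruptibly(src_path, dst_path, should_stop):
    """Copy src to dst in chunks; return True if finished, False if stopped."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            if should_stop():          # checked once per chunk
                break
            chunk = src.read(CHUNK)
            if not chunk:              # end of file: copy is complete
                return True
            dst.write(chunk)
    os.unlink(dst_path)                # stopped early: discard the partial copy
    return False


def demo():
    with tempfile.TemporaryDirectory() as d:
        src = os.path.join(d, "lower")
        dst = os.path.join(d, "upper")
        with open(src, "wb") as f:
            f.write(b"x" * (3 * CHUNK))

        done = copy_interruptibly(src, dst, lambda: False)
        complete_size = os.path.getsize(dst)
        os.unlink(dst)

        answers = iter([False, True])  # allow one chunk, then stop
        stopped = copy_interruptibly(src, dst, lambda: next(answers))
        return done, complete_size, stopped, os.path.exists(dst)
```

In a real program should_stop might test a flag set by a signal
handler; the point is only that an interrupted copy leaves no
half-written file behind.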

> > +This deviation from standard is due to technical limitations of the
> > +union mount implementation.  Specifically, we would need to replace an
> > +open file descriptor from the lower layer with an open file descriptor
> > +for a file with matching pathname and contents on the upper layer,
> > +which is difficult to do.  We avoid this in other system calls by
> > +doing the copyup before the file is opened.  Unionfs doesn't encounter
> > +this problem because it creates a dummy file struct which redirects or
> > +fans out operations to the struct files for the underlying file
> > +systems.
> > +
> > +From an application's point of view, the result of an in-kernel file
> > +copyup is the logical equivalent of another application updating the
> > +file via the rename() pattern: creat() a new file, copy the data over,
> > +make changes to the copy, and rename() over the old version.  Any
> > +existing open file descriptors for that file (including those in the
> > +same application) refer to a now invisible object that used to have
> > +the same pathname.  Only opens that occur after the copyup will see
> > +updates to the file.
> 
> Does it apply the same permission checks that a program doing
> copy+rename would have to pass?  I guess that is just write access to
> the directory.

Yes.
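
The rename() pattern that the documentation compares copyup to can be
sketched in userspace as follows.  This is only the analogy, not the
in-kernel copyup: it assumes a POSIX system, and the ".tmp" suffix and
function names are made up for the example.  It shows that a
descriptor opened before the update keeps reading the old object,
while a fresh open sees the new contents.

```python
import os
import tempfile


def update_via_rename(path, new_data):
    """Replace `path` using the creat() + copy + modify + rename() pattern."""
    tmp_path = path + ".tmp"  # hypothetical temporary name, same directory
    with open(path, "rb") as src, open(tmp_path, "wb") as dst:
        dst.write(src.read())   # copy the old data over
        dst.write(new_data)     # make changes to the copy
    os.rename(tmp_path, path)   # rename() over the old version


def demo():
    """Return (what an old descriptor reads, what a fresh open reads)."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "file")
        with open(path, "wb") as f:
            f.write(b"old\n")

        old_fd = os.open(path, os.O_RDONLY)  # opened before the update
        try:
            update_via_rename(path, b"new\n")
            stale = os.read(old_fd, 4096)    # reads the now-unlinked original
        finally:
            os.close(old_fd)

        with open(path, "rb") as f:          # opened after the update
            fresh = f.read()
        return stale, fresh
```

Note that update_via_rename() needs write access to the containing
directory, which is the permission check discussed above.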

> Does it effectively "rename" all hard links referring to the file, to
> point to the new version, or does it only affect the path that was
> used by the writer/modifier, leaving the other links still referring
> to the original file?

In order to update all the hard links to a file, we would have to walk
the entire file system searching for links with a matching inode
number and copy them up too.  We're never going to do a
file-system-wide walk, so we won't do that.  The other hard links
still point to the old copy of the file.  We hope applications don't
commonly depend on this.
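
The same userspace analogy (copy + rename on a POSIX system, standing
in for copyup; file names are hypothetical) shows the hard link
behavior: only the path that was modified ends up pointing at the new
file, and the other link keeps naming the old inode and the old data.

```python
import os
import tempfile


def demo_hard_links():
    with tempfile.TemporaryDirectory() as d:
        a = os.path.join(d, "a")
        b = os.path.join(d, "b")
        with open(a, "wb") as f:
            f.write(b"old\n")
        os.link(a, b)  # "a" and "b" now name the same inode
        same_before = os.stat(a).st_ino == os.stat(b).st_ino

        # Modify "a" via copy + rename, standing in for a copyup.
        tmp = a + ".tmp"
        with open(a, "rb") as src, open(tmp, "wb") as dst:
            dst.write(src.read())
            dst.write(b"new\n")
        os.rename(tmp, a)

        same_after = os.stat(a).st_ino == os.stat(b).st_ino
        with open(a, "rb") as f:
            a_data = f.read()
        with open(b, "rb") as f:
            b_data = f.read()
        return same_before, same_after, a_data, b_data
```

An application that relies on "a" and "b" staying in sync would see
them diverge after the update.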

> > + - File copyup on open(O_DIRECT)
> 
> Why is O_DIRECT relevant?  O_DIRECT doesn't imply writing, and
> copy+rename behaviour is the same with O_DIRECT as not.
> 
> Some programs use O_DIRECT to read very large files, without intending
> they will ever be modified.  For example, qemu using O_DIRECT to
> access a disk image backing file.

You're right, this is a mistake.

> > +NFS interaction
> > +===============
> > +
> > +NFS is currently not supported as either type of layer.  NFS as
> > +read-only layer requires support from the server to honor the
> > +read-only guarantee needed for the bottom layer.  To do this, the
> > +server needs to revoke access to clients requesting read-only file
> > +systems if the exported file system is remounted read-write or
> > +unmounted (during which arbitrary changes can occur).  Some recent
> > +discussion:
> > +
> > +http://markmail.org/message/3mkgnvo4pswxd7lp
> > +
> > +NFS as the read-write layer would require implementation of the
> > +->whiteout() and ->fallthru() methods.  DT_WHT directory entries are
> > +theoretically already supported.
> > +
> > +Also, technically the requirement for a readdir() cookie that is
> > +stable across reboots comes only from file systems exported via NFSv2:
> > +
> > +http://oss.oracle.com/pipermail/btrfs-devel/2008-January/000463.html
> > +
> > +Todo:
> > +
> > +- Guarantee really really read-only on NFS exports
> > +- Implement whiteout()/fallthru() for NFS
> 
> I'm finding it hard to imagine _guaranteeing_ really read-only.  All
> you can guarantee is that the NFS says it is read-only.
> 
> For example, a userspace NFS server cannot prevent the filesystem it's
> serving from changing.

We're discussing how to detect this now.

> Is this not a problem with other network filesystems like CIFS, P9, FUSE?

Each file system that wants to support union mounts will need to
implement the features necessary for that layer (hard read-only for
the lower layer, whiteouts and fallthrus for the upper layer).

> > +Known non-POSIX behaviors
> > +-------------------------
> > +
> > +- Link count may be wrong for files on bottom layer with > 1 link count
> 
> Can you say a bit more about what will be seen?

Sure, I'll write up an example.

> > +- File copyup is the logical equivalent of an update via copy +
> > +  rename().  Any existing open file descriptors will continue to refer
> > +  to the read-only copy on the bottom layer and will not see any
> > +  changes that occur after the copy-up.
> 
> I can imagine some database-like programs getting confused by that.
> 
> Maybe it would be better to fail copyup operations when the file is
> currently open O_RDONLY by anyone, analogous to the way writable
> mounts are refused when any union holds it read-only?
> 
> Are there uses likely to be broken by that behaviour?

That's an interesting question.  In general, this seems like a bad
idea: any process could then prevent another process from writing to
a file just by opening it read-only.  That would be like chmod'ing
it to 444.

-VAL
