From: Valerie Aurora <vaurora@redhat.com>
To: Jamie Lokier <jamie@shareable.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
Christoph Hellwig <hch@infradead.org>,
Jan Blunck <jblunck@suse.de>
Subject: Re: [PATCH 17/39] union-mount: Union mounts documentation
Date: Wed, 5 May 2010 09:19:20 -0400
Message-ID: <20100505131920.GB7534@shell>
In-Reply-To: <20100504211209.GC4360@shareable.org>
On Tue, May 04, 2010 at 10:12:09PM +0100, Jamie Lokier wrote:
> Valerie Aurora wrote:
> > +File copyup: Create a file on the top layer that has the same metadata
> > +and contents as the file with the same pathname on the bottom layer.
>
> Can copyup be interrupted? E.g. if I chmod an 80GB file, will the
> chmod() system call pause for a couple of hours, or can I control-C it?
The right behavior is that you should be able to control-C it, but I
doubt that currently works. Let me look into testing and implementing
this.
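
To make "interruptible" concrete, here is a rough userspace sketch, not the kernel copyup code. The names copy_interruptible, CopyInterrupted and should_stop are made up for this illustration; should_stop stands in for whatever pending-signal check the kernel would do between chunks. The idea is to copy in bounded chunks, check for interruption between them, and never leave a partial copy behind under the target name.

```python
import os


class CopyInterrupted(Exception):
    """Raised when the copy is abandoned part-way through (hypothetical name)."""


def copy_interruptible(src, dst, should_stop, chunk_size=1 << 20):
    """Copy src to dst in chunks, checking should_stop() before each chunk.

    should_stop is a stand-in for a pending-signal check. On any failure
    or interruption the partially written dst is removed.
    """
    with open(src, "rb") as fin:
        try:
            with open(dst, "wb") as fout:
                while True:
                    if should_stop():
                        raise CopyInterrupted(dst)
                    buf = fin.read(chunk_size)
                    if not buf:
                        break
                    fout.write(buf)
        except BaseException:
            # Do not leave a half-copied file visible under the target name.
            if os.path.exists(dst):
                os.unlink(dst)
            raise
```

With a check between chunks, the worst-case delay before a control-C takes effect is the time to copy one chunk, not the time to copy the whole 80GB file.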
> > +This deviation from standard is due to technical limitations of the
> > +union mount implementation. Specifically, we would need to replace an
> > +open file descriptor from the lower layer with an open file descriptor
> > +for a file with matching pathname and contents on the upper layer,
> > +which is difficult to do. We avoid this in other system calls by
> > +doing the copyup before the file is opened. Unionfs doesn't encounter
> > +this problem because it creates a dummy file struct which redirects or
> > +fans out operations to the struct files for the underlying file
> > +systems.
> > +
> > +From an application's point of view, the result of an in-kernel file
> > +copyup is the logical equivalent of another application updating the
> > +file via the rename() pattern: creat() a new file, copy the data over,
> > +make changes to the copy, and rename() over the old version. Any
> > +existing open file descriptors for that file (including those in the
> > +same application) refer to a now invisible object that used to have
> > +the same pathname. Only opens that occur after the copyup will see
> > +updates to the file.
>
> Does it apply the same permission checks that a program doing
> copy+rename would have to pass? I guess that is just write access to
> the directory.
Yes.
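
For reference, the userspace pattern the documentation compares copyup to can be sketched like this. It is only an analogy run on an ordinary POSIX file system, not on a union mount, and update_via_rename and demo are names invented for the example. It shows both points above: the update needs write access to the directory, and a descriptor opened before the update keeps referring to the old object.

```python
import os
import tempfile


def update_via_rename(path, new_data):
    """Replace path with a new file holding new_data, using copy + rename()."""
    dirname = os.path.dirname(path)
    # Creating the temporary file and renaming it over the target both
    # require write access to the containing directory.
    fd, tmp = tempfile.mkstemp(dir=dirname)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(new_data)
        os.replace(tmp, path)  # rename() over the old version
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


def demo():
    """Return what an old descriptor and a fresh open see after the update."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "file")
        with open(path, "wb") as f:
            f.write(b"old")
        old = open(path, "rb")            # opened before the update
        update_via_rename(path, b"new")
        with open(path, "rb") as new:     # opened after the update
            after = new.read()
        before = old.read()               # still the now-invisible old object
        old.close()
        return before, after
```

On a POSIX system demo() returns the old contents for the descriptor opened before the update and the new contents for the one opened afterwards.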
> Does it effectively "rename" all hard links referring to the file, to
> point to the new version, or does it only affect the path that was
> used by the writer/modifier, leaving the other links to continue to
> refer to the original file?
In order to update all the hard links to a file, we would have to walk
the entire file system searching for links with a matching inode
number and copy them up too. We're never going to do a
file-system-wide walk, so we won't do that. The other hard links
still point to the old copy of the file on the bottom layer. We hope
applications don't commonly depend on this.
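
The same thing already happens with copy + rename() in userspace, which is the model we're following. A small sketch, again on an ordinary POSIX file system and not a union mount (hardlink_demo and the file names are invented for the example):

```python
import os
import tempfile


def hardlink_demo():
    """Update one of two hard links via copy + rename() and compare them."""
    with tempfile.TemporaryDirectory() as d:
        a = os.path.join(d, "a")
        b = os.path.join(d, "b")
        with open(a, "wb") as f:
            f.write(b"old")
        os.link(a, b)  # b is a second hard link to the same inode as a
        same_before = os.stat(a).st_ino == os.stat(b).st_ino

        tmp = os.path.join(d, "a.tmp")
        with open(tmp, "wb") as f:
            f.write(b"new")
        os.replace(tmp, a)  # only the name "a" now points at the new file

        same_after = os.stat(a).st_ino == os.stat(b).st_ino
        with open(a, "rb") as fa, open(b, "rb") as fb:
            return same_before, same_after, fa.read(), fb.read()
```

After the rename the two names no longer share an inode: the path that was updated sees the new contents and the other link still sees the old ones.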
> > + - File copyup on open(O_DIRECT)
>
> Why is O_DIRECT relevant? O_DIRECT doesn't imply writing, and
> copy+rename behaviour is the same with O_DIRECT as not.
>
> Some programs use O_DIRECT to read very large files, without intending
> they will ever be modified. For example, qemu using O_DIRECT to
> access a disk image backing file.
You're right, this is a mistake.
> > +NFS interaction
> > +===============
> > +
> > +NFS is currently not supported as either type of layer. NFS as
> > +read-only layer requires support from the server to honor the
> > +read-only guarantee needed for the bottom layer. To do this, the
> > +server needs to revoke access to clients requesting read-only file
> > +systems if the exported file system is remounted read-write or
> > +unmounted (during which arbitrary changes can occur). Some recent
> > +discussion:
> > +
> > +http://markmail.org/message/3mkgnvo4pswxd7lp
> > +
> > +NFS as the read-write layer would require implementation of the
> > +->whiteout() and ->fallthru() methods. DT_WHT directory entries are
> > +theoretically already supported.
> > +
> > +Also, technically the requirement for a readdir() cookie that is
> > +stable across reboots comes only from file systems exported via NFSv2:
> > +
> > +http://oss.oracle.com/pipermail/btrfs-devel/2008-January/000463.html
> > +
> > +Todo:
> > +
> > +- Guarantee really really read-only on NFS exports
> > +- Implement whiteout()/fallthru() for NFS
>
> I'm finding it hard to imagine _guaranteeing_ really read-only. All
> you can guarantee is that the NFS says it is read-only.
>
> For example, a userspace NFS server cannot prevent the filesystem it's
> serving from changing.
We're discussing how to detect this now.
> Is this not a problem with other network filesystems like CIFS, P9, FUSE?
Each file system that wants to support union mounts will need to
implement the features necessary for that layer (hard read-only for
the lower layer, whiteouts and fallthrus for the upper layer).
> > +Known non-POSIX behaviors
> > +-------------------------
> > +
> > +- Link count may be wrong for files on bottom layer with > 1 link count
>
> Can you say a bit more about what will be seen?
Sure, I'll write up an example.
> > +- File copyup is the logical equivalent of an update via copy +
> > + rename(). Any existing open file descriptors will continue to refer
> > + to the read-only copy on the bottom layer and will not see any
> > + changes that occur after the copy-up.
>
> I can imagine some database-like programs getting confused by that.
>
> Maybe it would be better to fail copyup operations when the file is
> currently open O_RDONLY by anyone, analogous to the way writable
> mounts are refused when any union holds it read-only?
>
> Are there uses likely to be broken by that behaviour?
That's an interesting question. In general, this seems like a bad
idea: any process could prevent another process from writing to a file
just by opening it. That is like chmod'ing it to 444.
-VAL