linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Valerie Aurora <vaurora@redhat.com>
To: Alexander Viro <viro@zeniv.linux.org.uk>,
	Christoph Hellwig <hch@infradead.org>
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	Felix Fietkau <nbd@openwrt.org>
Subject: Multiple read-only layers in union mounts (was Re: [PATCH 6/6] union-mount: Copy up directory entries on first readdir())
Date: Wed, 3 Mar 2010 16:53:49 -0500	[thread overview]
Message-ID: <20100303215348.GE19689@shell> (raw)
In-Reply-To: <1267567889-4637-7-git-send-email-vaurora@redhat.com>

I think we can now have multiple read-only layers in a union without
lock ordering problems.  From the comment to union_copyup_dir():

/**
 * union_copyup_dir - copy up low-level directory entries to topmost dir
 *
 * readdir() is difficult to support on union file systems for two
 * reasons: We must eliminate duplicates and apply whiteouts, and we
 * must return something in f_pos that lets us restart in the same
 * place when we return.  Our solution is to, on first readdir() of
 * the directory, copy up all visible entries from the low-level file
 * systems and mark the entries that refer to low-level file system
 * objects as "fallthru" entries.
 *
 * Locking strategy: We hold the topmost dir's i_mutex on entry.  We
 * grab the i_mutex on lower directories one by one.  So the locking
 * order is:
 *
 * Writable/topmost layers > Read-only/lower layers
 *
 * So there is no problem with lock ordering for union stacks with
 * multiple lower layers.  E.g.:
 *
 * (topmost) A->B->C (bottom)
 * (topmost) D->C->B (bottom)
 *
 * (Not that we support more than two layers at the moment.)
 */

Does this seem correct?

-VAL

On Tue, Mar 02, 2010 at 02:11:29PM -0800, Valerie Aurora wrote:
> readdir() in union mounts is implemented by copying up all visible
> directory entries from the lower level directories to the topmost
> directory.  Directory entries that refer to lower level file system
> objects are marked as "fallthru" in the topmost directory.
> 
> Thanks to Felix Fietkau <nbd@openwrt.org> for a bug fix.
> 
> Signed-off-by: Valerie Aurora <vaurora@redhat.com>
> Signed-off-by: Felix Fietkau <nbd@openwrt.org>
> ---
>  fs/readdir.c          |    9 +++
>  fs/union.c            |  160 +++++++++++++++++++++++++++++++++++++++++++++++++
>  include/linux/union.h |    2 +
>  3 files changed, 171 insertions(+), 0 deletions(-)
> 
> diff --git a/fs/readdir.c b/fs/readdir.c
> index 3a48491..da71515 100644
> --- a/fs/readdir.c
> +++ b/fs/readdir.c
> @@ -16,6 +16,8 @@
>  #include <linux/security.h>
>  #include <linux/syscalls.h>
>  #include <linux/unistd.h>
> +#include <linux/union.h>
> +#include <linux/mount.h>
>  
>  #include <asm/uaccess.h>
>  
> @@ -36,9 +38,16 @@ int vfs_readdir(struct file *file, filldir_t filler, void *buf)
>  
>  	res = -ENOENT;
>  	if (!IS_DEADDIR(inode)) {
> +		if (IS_UNIONED_DIR(&file->f_path) && !IS_OPAQUE(inode)) {
> +			res = union_copyup_dir(&file->f_path);
> +			if (res)
> +				goto out_unlock;
> +		}
> +
>  		res = file->f_op->readdir(file, buf, filler);
>  		file_accessed(file);
>  	}
> +out_unlock:
>  	mutex_unlock(&inode->i_mutex);
>  out:
>  	return res;
> diff --git a/fs/union.c b/fs/union.c
> index ed852e5..b66989a 100644
> --- a/fs/union.c
> +++ b/fs/union.c
> @@ -5,6 +5,7 @@
>   * Copyright (C) 2007-2009 Novell Inc.
>   *
>   *   Author(s): Jan Blunck (j.blunck@tu-harburg.de)
> + *              Valerie Aurora <vaurora@redhat.com>
>   *
>   * This program is free software; you can redistribute it and/or modify it
>   * under the terms of the GNU General Public License as published by the Free
> @@ -22,6 +23,8 @@
>  #include <linux/fs_struct.h>
>  #include <linux/union.h>
>  #include <linux/namei.h>
> +#include <linux/file.h>
> +#include <linux/security.h>
>  
>  /*
>   * This is borrowed from fs/inode.c. The hashtable for lookups. Somebody
> @@ -467,3 +470,160 @@ out:
>  	mnt_drop_write(parent->mnt);
>  	return dentry;
>  }
> +
> +/**
> + * union_copyup_dir_one - copy up a single directory entry
> + *
> + * Individual directory entry copyup function for union_copyup_dir.
> + * We get the entries from higher level layers first.
> + */
> +
> +static int union_copyup_dir_one(void *buf, const char *name, int namlen,
> +				loff_t offset, u64 ino, unsigned int d_type)
> +{
> +	struct dentry *topmost_dentry = (struct dentry *) buf;
> +	struct dentry *dentry;
> +	int err = 0;
> +
> +	switch (namlen) {
> +	case 2:
> +		if (name[1] != '.')
> +			break;
> +	case 1:
> +		if (name[0] != '.')
> +			break;
> +		return 0;
> +	}
> +
> +	/* Lookup this entry in the topmost directory */
> +	dentry = lookup_one_len(name, topmost_dentry, namlen);
> +
> +	if (IS_ERR(dentry)) {
> +		printk(KERN_WARNING "%s: error looking up %s\n", __func__,
> +		       dentry->d_name.name);
> +		err = PTR_ERR(dentry);
> +		goto out;
> +	}
> +
> +	/*
> +	 * If the entry already exists, one of the following is true:
> +	 * it was already copied up (due to an earlier lookup), an
> +	 * entry with the same name already exists on the topmost file
> +	 * system, it is a whiteout, or it is a fallthru.  In each
> +	 * case, the top level entry masks any entries from lower file
> +	 * systems, so don't copy up this entry.
> +	 */
> +	if (dentry->d_inode || d_is_whiteout(dentry) || d_is_fallthru(dentry))
> +		goto out_dput;
> +
> +	/*
> +	 * If the entry doesn't exist, create a fallthru entry in the
> +	 * topmost file system.  All possible directory types are
> +	 * used, so each file system must implement its own way of
> +	 * storing a fallthru entry.
> +	 */
> +	err = topmost_dentry->d_inode->i_op->fallthru(topmost_dentry->d_inode,
> +						      dentry);
> +out_dput:
> +	dput(dentry);
> +out:
> +	return err;
> +}
> +
> +/**
> + * union_copyup_dir - copy up low-level directory entries to topmost dir
> + *
> + * readdir() is difficult to support on union file systems for two
> + * reasons: We must eliminate duplicates and apply whiteouts, and we
> + * must return something in f_pos that lets us restart in the same
> + * place when we return.  Our solution is to, on first readdir() of
> + * the directory, copy up all visible entries from the low-level file
> + * systems and mark the entries that refer to low-level file system
> + * objects as "fallthru" entries.
> + *
> + * Locking strategy: We hold the topmost dir's i_mutex on entry.  We
> + * grab the i_mutex on lower directories one by one.  So the locking
> + * order is:
> + *
> + * Writable/topmost layers > Read-only/lower layers
> + *
> + * So there is no problem with lock ordering for union stacks with
> + * multiple lower layers.  E.g.:
> + *
> + * (topmost) A->B->C (bottom)
> + * (topmost) D->C->B (bottom)
> + *
> + * (Not that we support more than two layers at the moment.)
> + */
> +
> +int union_copyup_dir(struct path *topmost_path)
> +{
> +	struct dentry *topmost_dentry = topmost_path->dentry;
> +	struct path path = *topmost_path;
> +	int res = 0;
> +
> +	BUG_ON(IS_OPAQUE(topmost_dentry->d_inode));
> +
> +	res = mnt_want_write(topmost_path->mnt);
> +	if (res)
> +		return res;
> +	/*
> +	 * Mark this dir opaque to show that we have already copied up
> +	 * the lower entries.  Only fallthru entries pass through to
> +	 * the underlying file system.
> +	 */
> +	topmost_dentry->d_inode->i_flags |= S_OPAQUE;
> +	mark_inode_dirty(topmost_dentry->d_inode);
> +
> +	path_get(&path);
> +	while (union_down_one(&path.mnt, &path.dentry)) {
> +		struct file * ftmp;
> +		struct inode * inode;
> +
> +		/* XXX Permit fallthrus on lower-level? Would need to
> +		 * pass in opaque flag to union_copyup_dir_one() and
> +		 * only copy up fallthru entries there.  We allow
> +		 * fallthrus in lower level opaque directories on
> +		 * lookup, so for consistency we should do one or the
> +		 * other in both places. */
> +		if (IS_OPAQUE(path.dentry->d_inode))
> +			break;
> +
> +		/* dentry_open() doesn't get a path reference itself */
> +		path_get(&path);
> +		ftmp = dentry_open(path.dentry, path.mnt,
> +				   O_RDONLY | O_DIRECTORY | O_NOATIME,
> +				   current_cred());
> +		if (IS_ERR(ftmp)) {
> +			printk (KERN_ERR "unable to open dir %s for "
> +				"directory copyup: %ld\n",
> +				path.dentry->d_name.name, PTR_ERR(ftmp));
> +			path_put(&path);
> +			continue;
> +		}
> +
> +		inode = path.dentry->d_inode;
> +		mutex_lock(&inode->i_mutex);
> +
> +		res = -ENOENT;
> +		if (IS_DEADDIR(inode))
> +			goto out_fput;
> +		/*
> +		 * Read the whole directory, calling our directory
> +		 * entry copyup function on each entry.  Pass in the
> +		 * topmost dentry as our private data so we can create
> +		 * new entries in the topmost directory.
> +		 */
> +		res = ftmp->f_op->readdir(ftmp, topmost_dentry,
> +					  union_copyup_dir_one);
> +out_fput:
> +		mutex_unlock(&inode->i_mutex);
> +		fput(ftmp);
> +
> +		if (res)
> +			break;
> +	}
> +	path_put(&path);
> +	mnt_drop_write(topmost_path->mnt);
> +	return res;
> +}
> diff --git a/include/linux/union.h b/include/linux/union.h
> index 6eaeae8..11ebc1d 100644
> --- a/include/linux/union.h
> +++ b/include/linux/union.h
> @@ -53,6 +53,7 @@ extern struct dentry * union_create_topmost_dir(struct path *, struct qstr *,
>  extern int attach_mnt_union(struct vfsmount *, struct vfsmount *,
>  			    struct dentry *);
>  extern void detach_mnt_union(struct vfsmount *);
> +extern int union_copyup_dir(struct path *);
>  
>  #else /* CONFIG_UNION_MOUNT */
>  
> @@ -66,6 +67,7 @@ extern void detach_mnt_union(struct vfsmount *);
>  #define union_create_topmost_dir(x, y, z)	({ BUG(); (NULL); })
>  #define attach_mnt_union(x, y, z)	do { } while (0)
>  #define detach_mnt_union(x)		do { } while (0)
> +#define union_copyup_dir(x)		({ BUG(); (0); })
>  
>  #endif	/* CONFIG_UNION_MOUNT */
>  #endif	/* __KERNEL__ */
> -- 
> 1.5.6.5
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

  reply	other threads:[~2010-03-03 21:53 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-03-02 22:11 [RFC PATCH 0/6] Union mount core rewrite v1 Valerie Aurora
2010-03-02 22:11 ` [PATCH 1/6] union-mount: Introduce union_mount structure and basic operations Valerie Aurora
2010-03-02 22:11   ` [PATCH 2/6] union-mount: Drive the union cache via dcache Valerie Aurora
2010-03-02 22:11     ` [PATCH 3/6] union-mount: Implement union lookup Valerie Aurora
2010-03-02 22:11       ` [PATCH 4/6] union-mount: Support for mounting union mount file systems Valerie Aurora
2010-03-02 22:11         ` [PATCH 5/6] union-mount: Call do_whiteout() on unlink and rmdir in unions Valerie Aurora
2010-03-02 22:11           ` [PATCH 6/6] union-mount: Copy up directory entries on first readdir() Valerie Aurora
2010-03-03 21:53             ` Valerie Aurora [this message]
2010-03-03 17:35     ` [PATCH 2/6] union-mount: Drive the union cache via dcache Miklos Szeredi
2010-03-03 21:49       ` Valerie Aurora
2010-03-04 16:34         ` Miklos Szeredi
2010-03-09 19:22           ` Valerie Aurora
2010-03-03 17:33   ` [PATCH 1/6] union-mount: Introduce union_mount structure and basic operations Miklos Szeredi
2010-03-03 20:45     ` Valerie Aurora
2010-03-04 16:24       ` Miklos Szeredi
2010-03-09 19:49         ` Valerie Aurora
2010-03-03 17:28 ` [RFC PATCH 0/6] Union mount core rewrite v1 Miklos Szeredi
2010-03-03 20:31   ` Valerie Aurora

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20100303215348.GE19689@shell \
    --to=vaurora@redhat.com \
    --cc=hch@infradead.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=nbd@openwrt.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).