public inbox for linux-unionfs@vger.kernel.org
* [RFC][PATCH 0/4] ovl: optimize dir iteration
@ 2016-12-22 17:25 Amir Goldstein
  2016-12-22 17:25 ` [RFC][PATCH 1/4] vfs: add RENAME_VFS_DTYPE vfs_rename() flag Amir Goldstein
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Amir Goldstein @ 2016-12-22 17:25 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

Miklos,

This patch series implements dir iteration optimizations
using fs support for setting file types on rename.

Implementing support for xfs was very easy, although this
is just a demo patch, because proper support would require
a new feature flag and relaxing xfs_repair file type checks.

The xfs patch is based on top of my cleanup series, but is
not dependent on it in any significant way.

I tested on the following setup, with upper and lower on the same base xfs:

/dev-lower_layer on /base type xfs (rw,relatime,attr2,inode64,noquota)
/dev-lower_layer on /lower type xfs (rw,relatime,attr2,inode64,noquota)
/dev-lower_layer on /upper type xfs (rw,relatime,attr2,inode64,noquota)
overlay on /mnt type overlay (rw,relatime,lowerdir=/lower,upperdir=/upper/0,workdir=/upper/work)

I generated some whiteouts, some copy ups, and some opaque objects:

root@kvm-xfstests:~# rm /mnt/a/pointless100 
root@kvm-xfstests:~# rmdir /mnt/a/empty100
root@kvm-xfstests:~# touch /mnt/a/foo100 
root@kvm-xfstests:~# touch /mnt/a/dir100 
root@kvm-xfstests:~# touch /mnt/a/newfile
root@kvm-xfstests:~# mkdir /mnt/a/newdir

I used the t_dir_type tool that I introduced to xfstests to print the resulting dtypes:

root@kvm-xfstests:~# ./xfstests/src/t_dir_type /upper/0/a/
. d
.. d
pointless100 w
empty100 w
foo100 u
dir100 u
newfile f
newdir d
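
For readers without the xfstests tree at hand, here is a minimal userspace
sketch of what such a dtype printer might look like. It is not the actual
t_dir_type source. The function names and the letter mapping (chosen to match
the output above) are assumptions.

```c
#define _DEFAULT_SOURCE /* exposes d_type and the DT_* constants on glibc */
#include <assert.h>
#include <dirent.h>
#include <stdio.h>

#ifndef DT_WHT
#define DT_WHT 14 /* whiteout dtype, as in the Linux headers */
#endif

/* Map a dirent d_type to a one-letter code (the letters are an assumption). */
static char dtype_letter(unsigned char d_type)
{
	switch (d_type) {
	case DT_UNKNOWN: return 'u';
	case DT_REG:     return 'f';
	case DT_DIR:     return 'd';
	case DT_CHR:     return 'c';
	case DT_BLK:     return 'b';
	case DT_FIFO:    return 'p';
	case DT_SOCK:    return 's';
	case DT_LNK:     return 'l';
	case DT_WHT:     return 'w';
	default:         return '?';
	}
}

/* Print "<name> <letter>" per entry; returns 0 on success, -1 on error. */
static int print_dir_types(const char *path, FILE *out)
{
	DIR *dir;
	struct dirent *de;

	assert(path != NULL && out != NULL);
	dir = opendir(path);
	if (!dir)
		return -1;
	while ((de = readdir(dir)) != NULL)
		fprintf(out, "%s %c\n", de->d_name, dtype_letter(de->d_type));
	closedir(dir);
	return 0;
}
```

Calling print_dir_types("/upper/0/a/", stdout) on the upper layer reports the
raw dtype stored by the file system, which is what the listing above shows.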


What do you think of the proposed rename API?
To me, it makes some sense, because a request to re-classify
a directory entry is like a request to re-index it, which
is basically what rename is about. It may even make sense
to be able to call the rename API for changing dtype,
without changing the name/parent at all.

Amir.

Amir Goldstein (4):
  vfs: add RENAME_VFS_DTYPE vfs_rename() flag
  xfs: support RENAME_VFS_DTYPE flag
  ovl: use RENAME_DT_UNKNOWN to optimize stable d_inode
  ovl: use RENAME_DT_WHT to optimize ovl_dir_read_merged()

 fs/overlayfs/copy_up.c   |  8 +++++++-
 fs/overlayfs/dir.c       | 12 +++++++++---
 fs/overlayfs/overlayfs.h |  5 +++++
 fs/overlayfs/readdir.c   |  2 +-
 fs/xfs/libxfs/xfs_dir2.c |  1 +
 fs/xfs/xfs_iops.c        | 11 ++++++++---
 include/linux/fs.h       | 30 ++++++++++++++++++++++++++++++
 include/uapi/linux/fs.h  |  4 ++++
 8 files changed, 65 insertions(+), 8 deletions(-)

-- 
2.7.4


* [RFC][PATCH 1/4] vfs: add RENAME_VFS_DTYPE vfs_rename() flag
  2016-12-22 17:25 [RFC][PATCH 0/4] ovl: optimize dir iteration Amir Goldstein
@ 2016-12-22 17:25 ` Amir Goldstein
  2016-12-22 17:25 ` [RFC][PATCH 2/4] xfs: support RENAME_VFS_DTYPE flag Amir Goldstein
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Amir Goldstein @ 2016-12-22 17:25 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

Add support for extra rename flags that can be passed to
vfs_rename() from kernel code.

Define a new internal vfs flag RENAME_VFS_DTYPE.
This flag indicates that the caller would like the fs to set the dtype
of the target entry to a specified value.  The dtype value to use
is carried in the S_IFMT mask bits (12..15) of the rename flags.

For example, code can call vfs_rename(..., RENAME_DT_WHT) to set
the target dir entry type to DT_WHT, regardless of the value
of inode->i_mode.

File systems that support the new RENAME_VFS_DTYPE flag would
check for (flags & RENAME_VFS_DTYPE) and use RENAME_DT_MODE(flags)
instead of inode->i_mode to determine the value of dtype to store
in the directory entry.
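
To make the bit layout concrete, here is a userspace sketch of the encoding.
The macros are copied from the patch below, with an unsigned literal in
RENAME_VFS_FLAG(). The DT_* and S_IFMT values are redefined locally so the
sketch builds on its own, and rename_flags_to_dtype() is a hypothetical helper
that is not part of the patch.

```c
#include <assert.h>

/* Local copies of the usual Linux values, so no kernel headers are needed. */
#define DT_UNKNOWN		0
#define DT_WHT			14
#define S_IFMT			0170000u	/* 0xF000: bits 12..15 */

/* From the patch: three renameat2() flag bits, then one vfs-internal bit. */
#define RENAME_UAPI_BITS	3
#define RENAME_VFS_FLAG(i)	(1u << (RENAME_UAPI_BITS + (i)))
#define RENAME_VFS_DTYPE	RENAME_VFS_FLAG(0)
#define RENAME_VFS_FLAG_BITS	1
#define RENAME_FLAGS_BITS	(RENAME_UAPI_BITS + RENAME_VFS_FLAG_BITS)
#define RENAME_FLAGS_MASK	((1u << RENAME_FLAGS_BITS) - 1)

#define RENAME_DT_SHIFT		12
#define RENAME_DT_MASK		S_IFMT
#define RENAME_DT(dt)		(((dt) << RENAME_DT_SHIFT) | RENAME_VFS_DTYPE)
#define RENAME_DT_UNKNOWN	RENAME_DT(DT_UNKNOWN)
#define RENAME_DT_WHT		RENAME_DT(DT_WHT)
#define RENAME_DT_MODE(f)	((f) & RENAME_DT_MASK)
#define RENAME_VFS_MASK		(RENAME_FLAGS_MASK | RENAME_DT_MASK)

/* Hypothetical helper: recover the DT_* value a caller asked for. */
static unsigned int rename_flags_to_dtype(unsigned int flags)
{
	/* Only meaningful when the caller set RENAME_VFS_DTYPE. */
	assert(flags & RENAME_VFS_DTYPE);
	return RENAME_DT_MODE(flags) >> RENAME_DT_SHIFT;
}
```

With these definitions RENAME_DT_WHT is 0xE008: bit 3 is the RENAME_VFS_DTYPE
flag and bits 12..15 hold DT_WHT (14). RENAME_DT_UNKNOWN is just the flag bit,
because DT_UNKNOWN is 0.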

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 include/linux/fs.h      | 30 ++++++++++++++++++++++++++++++
 include/uapi/linux/fs.h |  4 ++++
 2 files changed, 34 insertions(+)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 8f1580d..82ce8ca 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1559,6 +1559,36 @@ extern int vfs_symlink(struct inode *, struct dentry *, const char *);
 extern int vfs_link(struct dentry *, struct inode *, struct dentry *, struct inode **);
 extern int vfs_rmdir(struct inode *, struct dentry *);
 extern int vfs_unlink(struct inode *, struct dentry *, struct inode **);
+
+/* vfs_rename() extra flags */
+#define RENAME_VFS_FLAG(i)	(1 << (RENAME_UAPI_BITS + (i)))
+#define RENAME_VFS_DTYPE	RENAME_VFS_FLAG(0)	/* Set dest dtype */
+
+#define RENAME_VFS_FLAG_BITS	1
+
+#define RENAME_FLAGS_BITS	(RENAME_UAPI_BITS + RENAME_VFS_FLAG_BITS)
+#define RENAME_FLAGS_MASK	((1 << RENAME_FLAGS_BITS) - 1)
+
+/*
+ * S_IFMT bits (12..15) carry the dtype value to set for RENAME_VFS_DTYPE.
+ *
+ * For example, code can call vfs_rename(..., RENAME_DT_WHT) to set the
+ * target dir entry type to DT_WHT, regardless of inode->i_mode.
+ * File systems that support the new RENAME_VFS_DTYPE flag would check
+ * for (flags & RENAME_VFS_DTYPE) and use RENAME_DT_MODE(flags) instead
+ * of inode->i_mode to determine the dtype to store in the directory entry.
+ */
+#define RENAME_DT_BITS		4
+#define RENAME_DT_SHIFT		12
+#define RENAME_DT_MASK		S_IFMT
+#define RENAME_DT(dt)		(((dt) << RENAME_DT_SHIFT) | RENAME_VFS_DTYPE)
+#define RENAME_DT_UNKNOWN	RENAME_DT(DT_UNKNOWN)
+#define RENAME_DT_WHT		RENAME_DT(DT_WHT)
+/* mode to use instead of i_mode when setting dtype */
+#define RENAME_DT_MODE(f)	((f) & RENAME_DT_MASK)
+
+#define RENAME_VFS_MASK		(RENAME_FLAGS_MASK | RENAME_DT_MASK)
+
 extern int vfs_rename(struct inode *, struct dentry *, struct inode *, struct dentry *, struct inode **, unsigned int);
 extern int vfs_whiteout(struct inode *, struct dentry *);
 
diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index 36da93f..11f0c6b 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -38,10 +38,14 @@
 #define SEEK_HOLE	4	/* seek to the next hole */
 #define SEEK_MAX	SEEK_HOLE
 
+/* renameat2(2) user api flags */
 #define RENAME_NOREPLACE	(1 << 0)	/* Don't overwrite target */
 #define RENAME_EXCHANGE		(1 << 1)	/* Exchange source and dest */
 #define RENAME_WHITEOUT		(1 << 2)	/* Whiteout source */
 
+#define RENAME_UAPI_BITS	3
+#define RENAME_UAPI_MASK	((1 << RENAME_UAPI_BITS) - 1)
+
 struct file_clone_range {
 	__s64 src_fd;
 	__u64 src_offset;
-- 
2.7.4


* [RFC][PATCH 2/4] xfs: support RENAME_VFS_DTYPE flag
  2016-12-22 17:25 [RFC][PATCH 0/4] ovl: optimize dir iteration Amir Goldstein
  2016-12-22 17:25 ` [RFC][PATCH 1/4] vfs: add RENAME_VFS_DTYPE vfs_rename() flag Amir Goldstein
@ 2016-12-22 17:25 ` Amir Goldstein
  2016-12-22 17:25 ` [RFC][PATCH 3/4] ovl: use RENAME_DT_UNKNOWN to optimize stable d_inode Amir Goldstein
  2016-12-22 17:25 ` [RFC][PATCH 4/4] ovl: use RENAME_DT_WHT to optimize ovl_dir_read_merged() Amir Goldstein
  3 siblings, 0 replies; 5+ messages in thread
From: Amir Goldstein @ 2016-12-22 17:25 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

If the caller provided the target dtype to use and indicated that with
the rename() flag RENAME_VFS_DTYPE, use RENAME_DT_MODE(flags) instead
of inode->i_mode to determine the value of dtype to store in the
directory entry.

Adding this functionality to official xfs code will require adding
a new feature flag to the xfs directory naming on-disk format.

Without that new feature flag, xfs_repair will report the custom
dtype as a warning and set it back to the dtype value derived from the mode.
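
The mode selection described above can be modelled in userspace roughly as
follows. This is a sketch, not the xfs code: target_name_mode() and
mode_to_dtype() are hypothetical names, and the constants are local copies so
the example builds without kernel headers.

```c
#include <assert.h>

/* Local copies of the relevant values (assumed to match the Linux headers). */
#define S_IFMT			0170000u
#define S_IFREG			0100000u
#define SKETCH_DT_WHT		14u

#define RENAME_VFS_DTYPE	(1u << 3)
#define RENAME_DT_SHIFT		12
#define RENAME_DT_MASK		S_IFMT
#define RENAME_DT(dt)		(((dt) << RENAME_DT_SHIFT) | RENAME_VFS_DTYPE)
#define RENAME_DT_MODE(f)	((f) & RENAME_DT_MASK)

/* Pick the mode used to derive the dtype stored in the target entry. */
static unsigned int target_name_mode(unsigned int flags, unsigned int src_mode)
{
	/* dtype bits without the RENAME_VFS_DTYPE flag would be a caller bug */
	assert((flags & RENAME_VFS_DTYPE) || RENAME_DT_MODE(flags) == 0);

	if (flags & RENAME_VFS_DTYPE)
		return RENAME_DT_MODE(flags);	/* caller-provided dtype */
	return src_mode;			/* default: the source inode's mode */
}

/* The file type bits of a mode map to a DT_* value by a 12-bit shift. */
static unsigned int mode_to_dtype(unsigned int mode)
{
	return (mode & S_IFMT) >> 12;
}
```

For a regular file renamed with RENAME_DT(SKETCH_DT_WHT), the selected mode is
0xE000 and maps to dtype 14. Without the flag, the source mode is used and
maps to DT_REG (8).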

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/xfs/libxfs/xfs_dir2.c |  1 +
 fs/xfs/xfs_iops.c        | 11 ++++++++---
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_dir2.c b/fs/xfs/libxfs/xfs_dir2.c
index 984530e..71c6b2b 100644
--- a/fs/xfs/libxfs/xfs_dir2.c
+++ b/fs/xfs/libxfs/xfs_dir2.c
@@ -49,6 +49,7 @@ const unsigned char xfs_dtype_to_ftype[DT_MAX] = {
 	[DT_FIFO]   = XFS_DIR3_FT_FIFO,
 	[DT_SOCK]   = XFS_DIR3_FT_SOCK,
 	[DT_LNK]    = XFS_DIR3_FT_SYMLINK,
+	[DT_WHT]    = XFS_DIR3_FT_WHT,
 };
 
 /*
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index d2da9ca..8574155 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -394,19 +394,24 @@ xfs_vn_rename(
 	unsigned int	flags)
 {
 	struct inode	*new_inode = d_inode(ndentry);
-	int		omode = 0;
+	int		omode = 0, nmode = 0;
 	struct xfs_name	oname;
 	struct xfs_name	nname;
 
-	if (flags & ~(RENAME_NOREPLACE | RENAME_EXCHANGE | RENAME_WHITEOUT))
+	if (flags & ~RENAME_VFS_MASK)
 		return -EINVAL;
 
 	/* if we are exchanging files, we need to set i_mode of both files */
 	if (flags & RENAME_EXCHANGE)
 		omode = d_inode(ndentry)->i_mode;
+	/* if requested, use provided dtype for target */
+	if (flags & RENAME_VFS_DTYPE)
+		nmode = RENAME_DT_MODE(flags);
+	else
+		nmode = d_inode(odentry)->i_mode;
 
 	xfs_dentry_to_name(&oname, odentry, omode);
-	xfs_dentry_to_name(&nname, ndentry, d_inode(odentry)->i_mode);
+	xfs_dentry_to_name(&nname, ndentry, nmode);
 
 	return xfs_rename(XFS_I(odir), &oname, XFS_I(d_inode(odentry)),
 			  XFS_I(ndir), &nname,
-- 
2.7.4


* [RFC][PATCH 3/4] ovl: use RENAME_DT_UNKNOWN to optimize stable d_inode
  2016-12-22 17:25 [RFC][PATCH 0/4] ovl: optimize dir iteration Amir Goldstein
  2016-12-22 17:25 ` [RFC][PATCH 1/4] vfs: add RENAME_VFS_DTYPE vfs_rename() flag Amir Goldstein
  2016-12-22 17:25 ` [RFC][PATCH 2/4] xfs: support RENAME_VFS_DTYPE flag Amir Goldstein
@ 2016-12-22 17:25 ` Amir Goldstein
  2016-12-22 17:25 ` [RFC][PATCH 4/4] ovl: use RENAME_DT_WHT to optimize ovl_dir_read_merged() Amir Goldstein
  3 siblings, 0 replies; 5+ messages in thread
From: Amir Goldstein @ 2016-12-22 17:25 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

Try to use the new vfs rename flag RENAME_DT_UNKNOWN to ask the
underlying file system to mark copy up directory entries as
DT_UNKNOWN instead of their actual file type.

When copy ups are classified as DT_UNKNOWN, ovl_dir_read_merged() can
distinguish copy ups from opaque objects without having to check
the extended attribute trusted.overlay.inode on the iterated entries.

This makes it possible to know whether the upper d_inode value needs to
be substituted with the lower d_inode value.

Because this is only done for optimization, it is not a problem
if the file system does not respect the new RENAME_DT_UNKNOWN flag, which
is most likely the case, so we retry the rename without the flag.

This patch does not implement the actual optimization.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/copy_up.c   | 8 +++++++-
 fs/overlayfs/overlayfs.h | 5 +++++
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index f57043d..bcbcef5 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -298,7 +298,13 @@ static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir,
 	if (err)
 		goto out_cleanup;
 
-	err = ovl_do_rename(wdir, newdentry, udir, upper, 0);
+	/*
+	 * Mark copy up objects in upper as DT_UNKNOWN, so it is easy to
+	 * distinguish them from opaque objects in iterate_dir().
+	 * File system repair tools may re-classify the file type
+	 * and that will break optimization, but not functionality.
+	 */
+	err = ovl_do_rename(wdir, newdentry, udir, upper, RENAME_DT_UNKNOWN);
 	if (err)
 		goto out_cleanup;
 
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 8af450b..54162c0 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -113,6 +113,11 @@ static inline int ovl_do_rename(struct inode *olddir, struct dentry *olddentry,
 
 	err = vfs_rename(olddir, olddentry, newdir, newdentry, NULL, flags);
 
+	/* retry without RENAME_VFS_DTYPE if fs does not support it */
+	if (err == -EINVAL && (flags & RENAME_VFS_DTYPE))
+		err = vfs_rename(olddir, olddentry, newdir, newdentry, NULL,
+				 flags & RENAME_UAPI_MASK);
+
 	if (err) {
 		pr_debug("...rename(%pd2, %pd2, ...) = %i\n",
 			 olddentry, newdentry, err);
-- 
2.7.4


* [RFC][PATCH 4/4] ovl: use RENAME_DT_WHT to optimize ovl_dir_read_merged()
  2016-12-22 17:25 [RFC][PATCH 0/4] ovl: optimize dir iteration Amir Goldstein
                   ` (2 preceding siblings ...)
  2016-12-22 17:25 ` [RFC][PATCH 3/4] ovl: use RENAME_DT_UNKNOWN to optimize stable d_inode Amir Goldstein
@ 2016-12-22 17:25 ` Amir Goldstein
  3 siblings, 0 replies; 5+ messages in thread
From: Amir Goldstein @ 2016-12-22 17:25 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

Try to use the new vfs rename flag RENAME_DT_WHT to ask the
underlying file system to mark whiteout directory entries as DT_WHT
instead of DT_CHR.

When whiteouts are classified as DT_WHT, ovl_dir_read_merged() can
distinguish whiteouts from real character devices without
having to stat the suspect inodes.

Because this is only done for optimization, it is not a problem
if the file system does not respect the new RENAME_DT_WHT flag, which
is most likely the case, so we retry the rename without the flag.

Even if DT_WHT type is set on rename using an experimental file
system patch, file system repair tools may later re-classify the
file type. That would break the optimization, but will not break
the functionality of ovl_dir_read_merged().
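
As a rough userspace illustration of the classification this enables: an
entry reported as DT_WHT is known to be a whiteout, DT_CHR remains a suspect
that still needs the slow check, and everything else is not a whiteout. The
enum and classify_entry() are hypothetical names used only for this sketch.

```c
#define _DEFAULT_SOURCE /* exposes the DT_* constants on glibc */
#include <assert.h>
#include <dirent.h>

#ifndef DT_WHT
#define DT_WHT 14 /* whiteout dtype, as in the Linux headers */
#endif

enum whiteout_hint {
	NOT_WHITEOUT,	/* no further check needed */
	MAYBE_WHITEOUT,	/* DT_CHR: must still inspect the inode */
	IS_WHITEOUT,	/* DT_WHT: the fs recorded the whiteout dtype */
};

/* Classify a directory entry using only the dtype returned by readdir. */
static enum whiteout_hint classify_entry(unsigned char d_type)
{
	assert(d_type <= 15);	/* DT_* values fit in four bits */

	if (d_type == DT_WHT)
		return IS_WHITEOUT;
	if (d_type == DT_CHR)
		return MAYBE_WHITEOUT;
	return NOT_WHITEOUT;
}
```

If a repair tool resets the entry back to DT_CHR, the entry simply falls back
to the MAYBE_WHITEOUT path, which matches the point made above that only the
optimization is lost.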

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/dir.c     | 12 +++++++++---
 fs/overlayfs/readdir.c |  2 +-
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index 16e06dd..0e1c3f2 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -630,7 +630,13 @@ static int ovl_remove_and_whiteout(struct dentry *dentry, bool is_dir)
 	struct dentry *upper;
 	struct dentry *opaquedir = NULL;
 	int err;
-	int flags = 0;
+	/*
+	 * Mark whiteout objects as DT_WHT instead of DT_CHR, so it is easy
+	 * to distinguish them from real character devices in iterate_dir().
+	 * File system repair tools may re-classify the file type
+	 * and that will break optimization, but not functionality.
+	 */
+	int flags = RENAME_DT_WHT;
 
 	if (WARN_ON(!workdir))
 		return -EROFS;
@@ -665,12 +671,12 @@ static int ovl_remove_and_whiteout(struct dentry *dentry, bool is_dir)
 		goto out_dput_upper;
 
 	if (d_is_dir(upper))
-		flags = RENAME_EXCHANGE;
+		flags |= RENAME_EXCHANGE;
 
 	err = ovl_do_rename(wdir, whiteout, udir, upper, flags);
 	if (err)
 		goto kill_whiteout;
-	if (flags)
+	if (flags & RENAME_EXCHANGE)
 		ovl_cleanup(wdir, upper);
 
 	ovl_dentry_version_inc(dentry->d_parent);
diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
index f241b4e..32c5c96 100644
--- a/fs/overlayfs/readdir.c
+++ b/fs/overlayfs/readdir.c
@@ -98,7 +98,7 @@ static struct ovl_cache_entry *ovl_cache_entry_new(struct ovl_readdir_data *rdd,
 	p->len = len;
 	p->type = d_type;
 	p->ino = ino;
-	p->is_whiteout = false;
+	p->is_whiteout = (d_type == DT_WHT);
 
 	if (d_type == DT_CHR) {
 		p->next_maybe_whiteout = rdd->first_maybe_whiteout;
-- 
2.7.4

