* [RFC][PATCH 0/4] ovl: optimize dir iteration
@ 2016-12-22 17:25 Amir Goldstein
2016-12-22 17:25 ` [RFC][PATCH 1/4] vfs: add RENAME_VFS_DTYPE vfs_rename() flag Amir Goldstein
` (3 more replies)
0 siblings, 4 replies; 5+ messages in thread
From: Amir Goldstein @ 2016-12-22 17:25 UTC (permalink / raw)
To: Miklos Szeredi; +Cc: linux-unionfs
Miklos,
This patch series implements dir iteration optimizations
using fs support for setting file types on rename.
Implementing support for xfs was very easy, although this
is just a demo patch, because proper support would require
a new feature flag and relaxing xfs_repair file type checks.
The xfs patch is based on top of my cleanup series, but is
not dependent on it in any significant way.
I tested on the following setup of upper/lower on same base xfs:
/dev-lower_layer on /base type xfs (rw,relatime,attr2,inode64,noquota)
/dev-lower_layer on /lower type xfs (rw,relatime,attr2,inode64,noquota)
/dev-lower_layer on /upper type xfs (rw,relatime,attr2,inode64,noquota)
overlay on /mnt type overlay (rw,relatime,lowerdir=/lower,upperdir=/upper/0,workdir=/upper/work)
Generated some whiteouts some copy ups and some opaque objects:
root@kvm-xfstests:~# rm /mnt/a/pointless100
root@kvm-xfstests:~# rmdir /mnt/a/empty100
root@kvm-xfstests:~# touch /mnt/a/foo100
root@kvm-xfstests:~# touch /mnt/a/dir100
root@kvm-xfstests:~# touch /mnt/a/newfile
root@kvm-xfstests:~# mkdir /mnt/a/newdir
Used this tool I introduced to xfstests to print the resulting dtypes:
root@kvm-xfstests:~# ./xfstests/src/t_dir_type /upper/0/a/
. d
.. d
pointless100 w
empty100 w
foo100 u
dir100 u
newfile f
newdir d
What do you think of the proposed rename API?
To me, it makes some sense, because a request to re-classify
a directory entry is like a request to re-index it, which
is basically what rename is about. It may even make sense
to be able to call the rename API for changing dtype,
without changing the name/parent at all.
Amir.
Amir Goldstein (4):
vfs: add RENAME_VFS_DTYPE vfs_rename() flag
xfs: support RENAME_VFS_DTYPE flag
ovl: use RENAME_DT_UNKNOWN to optimize stable d_inode
ovl: use RENAME_DT_WHT to optimize ovl_dir_read_merged()
fs/overlayfs/copy_up.c | 8 +++++++-
fs/overlayfs/dir.c | 12 +++++++++---
fs/overlayfs/overlayfs.h | 5 +++++
fs/overlayfs/readdir.c | 2 +-
fs/xfs/libxfs/xfs_dir2.c | 1 +
fs/xfs/xfs_iops.c | 11 ++++++++---
include/linux/fs.h | 30 ++++++++++++++++++++++++++++++
include/uapi/linux/fs.h | 4 ++++
8 files changed, 65 insertions(+), 8 deletions(-)
--
2.7.4
^ permalink raw reply [flat|nested] 5+ messages in thread
* [RFC][PATCH 1/4] vfs: add RENAME_VFS_DTYPE vfs_rename() flag
2016-12-22 17:25 [RFC][PATCH 0/4] ovl: optimize dir iteration Amir Goldstein
@ 2016-12-22 17:25 ` Amir Goldstein
2016-12-22 17:25 ` [RFC][PATCH 2/4] xfs: support RENAME_VFS_DTYPE flag Amir Goldstein
` (2 subsequent siblings)
3 siblings, 0 replies; 5+ messages in thread
From: Amir Goldstein @ 2016-12-22 17:25 UTC (permalink / raw)
To: Miklos Szeredi; +Cc: linux-unionfs
Add support for extra rename flags that can be passed to
vfs_rename() from kernel code.
Define a new internal vfs flag RENAME_VFS_DTYPE.
This flag indicates that caller would like fs to set the dtype
of the target entry to a specified value. The dtype value to use
is specified on the S_IFMT mask bits (12..15) of the rename flags.
For example, code can call vfs_rename(..., RENAME_DT_WHT) to set
the target dir entry type to DT_WHT, regardless of the value
of inode->i_mode.
File systems that supports the new RENAME_VFS_DTYPE flag would
check for (flags & RENAME_VFS_DTYPE) and use RENAME_DT_MODE(flags)
instead of inode->i_mode to determine the value of dtype to store
in the directory entry.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
include/linux/fs.h | 30 ++++++++++++++++++++++++++++++
include/uapi/linux/fs.h | 4 ++++
2 files changed, 34 insertions(+)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 8f1580d..82ce8ca 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1559,6 +1559,36 @@ extern int vfs_symlink(struct inode *, struct dentry *, const char *);
extern int vfs_link(struct dentry *, struct inode *, struct dentry *, struct inode **);
extern int vfs_rmdir(struct inode *, struct dentry *);
extern int vfs_unlink(struct inode *, struct dentry *, struct inode **);
+
+/* vfs_rename() extra flags */
+#define RENAME_VFS_FLAG(i) (1 << (RENAME_UAPI_BITS + (i)))
+#define RENAME_VFS_DTYPE RENAME_VFS_FLAG(0) /* Set dest dtype */
+
+#define RENAME_VFS_FLAG_BITS 1
+
+#define RENAME_FLAGS_BITS (RENAME_UAPI_BITS + RENAME_VFS_FLAG_BITS)
+#define RENAME_FLAGS_MASK ((1 << RENAME_FLAGS_BITS) - 1)
+
+/*
+ * S_IFMT bits (12..15) carry the dtype value to set for RENAME_VFS_DTYPE.
+ *
+ * For example, code can call vfs_rename(..., RENAME_DT_WHT) to set the
+ * target dir entry type to DT_WHT, regardless of inode->i_mode.
+ * file systems that supports the new RENAME_VFS_DTYPE flag, would check
+ * for (flags & RENAME_VFS_DTYPE) and use RENAME_DT_MODE(flags) instead
+ * of inode->i_mode to determine the dtype to store in the directory entry.
+ */
+#define RENAME_DT_BITS 4
+#define RENAME_DT_SHIFT 12
+#define RENAME_DT_MASK S_IFMT
+#define RENAME_DT(dt) (((dt) << RENAME_DT_SHIFT) | RENAME_VFS_DTYPE)
+#define RENAME_DT_UNKNOWN RENAME_DT(DT_UNKNOWN)
+#define RENAME_DT_WHT RENAME_DT(DT_WHT)
+/* mode to use instead of i_mode when setting dtype */
+#define RENAME_DT_MODE(f) ((f) & RENAME_DT_MASK)
+
+#define RENAME_VFS_MASK (RENAME_FLAGS_MASK | RENAME_DT_MASK)
+
extern int vfs_rename(struct inode *, struct dentry *, struct inode *, struct dentry *, struct inode **, unsigned int);
extern int vfs_whiteout(struct inode *, struct dentry *);
diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index 36da93f..11f0c6b 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -38,10 +38,14 @@
#define SEEK_HOLE 4 /* seek to the next hole */
#define SEEK_MAX SEEK_HOLE
+/* renameat2(2) user api flags */
#define RENAME_NOREPLACE (1 << 0) /* Don't overwrite target */
#define RENAME_EXCHANGE (1 << 1) /* Exchange source and dest */
#define RENAME_WHITEOUT (1 << 2) /* Whiteout source */
+#define RENAME_UAPI_BITS 3
+#define RENAME_UAPI_MASK ((1 << RENAME_UAPI_BITS) - 1)
+
struct file_clone_range {
__s64 src_fd;
__u64 src_offset;
--
2.7.4
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [RFC][PATCH 2/4] xfs: support RENAME_VFS_DTYPE flag
2016-12-22 17:25 [RFC][PATCH 0/4] ovl: optimize dir iteration Amir Goldstein
2016-12-22 17:25 ` [RFC][PATCH 1/4] vfs: add RENAME_VFS_DTYPE vfs_rename() flag Amir Goldstein
@ 2016-12-22 17:25 ` Amir Goldstein
2016-12-22 17:25 ` [RFC][PATCH 3/4] ovl: use RENAME_DT_UNKNOWN to optimize stable d_inode Amir Goldstein
2016-12-22 17:25 ` [RFC][PATCH 4/4] ovl: use RENAME_DT_WHT to optimize ovl_dir_read_merged() Amir Goldstein
3 siblings, 0 replies; 5+ messages in thread
From: Amir Goldstein @ 2016-12-22 17:25 UTC (permalink / raw)
To: Miklos Szeredi; +Cc: linux-unionfs
If caller provided the target dtype to use and indicated that with
rename() flag RENAME_VFS_DTYPE, use RENAME_DT_MODE(flags) instead
of inode->i_mode to determine the value of dtype to store in the
directory entry.
Adding this functionality to official xfs code will require to add
a new feature flag to xfs directry naming on-disk format.
Without that new feature flag, xfs_repair will report that custom
dtype as a warning and set it back to the dtype value according to mode.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
fs/xfs/libxfs/xfs_dir2.c | 1 +
fs/xfs/xfs_iops.c | 11 ++++++++---
2 files changed, 9 insertions(+), 3 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_dir2.c b/fs/xfs/libxfs/xfs_dir2.c
index 984530e..71c6b2b 100644
--- a/fs/xfs/libxfs/xfs_dir2.c
+++ b/fs/xfs/libxfs/xfs_dir2.c
@@ -49,6 +49,7 @@ const unsigned char xfs_dtype_to_ftype[DT_MAX] = {
[DT_FIFO] = XFS_DIR3_FT_FIFO,
[DT_SOCK] = XFS_DIR3_FT_SOCK,
[DT_LNK] = XFS_DIR3_FT_SYMLINK,
+ [DT_WHT] = XFS_DIR3_FT_WHT,
};
/*
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index d2da9ca..8574155 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -394,19 +394,24 @@ xfs_vn_rename(
unsigned int flags)
{
struct inode *new_inode = d_inode(ndentry);
- int omode = 0;
+ int omode = 0, nmode = 0;
struct xfs_name oname;
struct xfs_name nname;
- if (flags & ~(RENAME_NOREPLACE | RENAME_EXCHANGE | RENAME_WHITEOUT))
+ if (flags & ~RENAME_VFS_MASK)
return -EINVAL;
/* if we are exchanging files, we need to set i_mode of both files */
if (flags & RENAME_EXCHANGE)
omode = d_inode(ndentry)->i_mode;
+ /* if requested, use provided dtype for target */
+ if (flags & RENAME_VFS_DTYPE)
+ nmode = RENAME_DT_MODE(flags);
+ else
+ nmode = d_inode(odentry)->i_mode;
xfs_dentry_to_name(&oname, odentry, omode);
- xfs_dentry_to_name(&nname, ndentry, d_inode(odentry)->i_mode);
+ xfs_dentry_to_name(&nname, ndentry, nmode);
return xfs_rename(XFS_I(odir), &oname, XFS_I(d_inode(odentry)),
XFS_I(ndir), &nname,
--
2.7.4
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [RFC][PATCH 3/4] ovl: use RENAME_DT_UNKNOWN to optimize stable d_inode
2016-12-22 17:25 [RFC][PATCH 0/4] ovl: optimize dir iteration Amir Goldstein
2016-12-22 17:25 ` [RFC][PATCH 1/4] vfs: add RENAME_VFS_DTYPE vfs_rename() flag Amir Goldstein
2016-12-22 17:25 ` [RFC][PATCH 2/4] xfs: support RENAME_VFS_DTYPE flag Amir Goldstein
@ 2016-12-22 17:25 ` Amir Goldstein
2016-12-22 17:25 ` [RFC][PATCH 4/4] ovl: use RENAME_DT_WHT to optimize ovl_dir_read_merged() Amir Goldstein
3 siblings, 0 replies; 5+ messages in thread
From: Amir Goldstein @ 2016-12-22 17:25 UTC (permalink / raw)
To: Miklos Szeredi; +Cc: linux-unionfs
Try to use the new vfs rename flag RENAME_DT_UNKNOWN, to request
from underlying file system to mark copy up directory entries as
DT_UNKNOWN instead of their actual file type.
When copy ups are classified as DT_UNKNOWN, ovl_dir_read_merged() can
distinguish between copy up to opaque objects without having to check
the extended attribute trusted.overlay.inode on the iterated entries.
This will allow to know if upper d_inode value needs to be substitued
with lower d_inode value.
Because this is only done for optimization, it is not a problem
if file system does not respect the new RENAME_DT_UNKNOWN flag, which
is most likely the case, so we retry the rename without the flag.
This patch does not implement the actual optimization.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
fs/overlayfs/copy_up.c | 8 +++++++-
fs/overlayfs/overlayfs.h | 5 +++++
2 files changed, 12 insertions(+), 1 deletion(-)
diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index f57043d..bcbcef5 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -298,7 +298,13 @@ static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir,
if (err)
goto out_cleanup;
- err = ovl_do_rename(wdir, newdentry, udir, upper, 0);
+ /*
+ * Mark copy up objects in upper as DT_UNKNOWN, so it easy to
+ * distinguish them from opaque objects in iterate_dir().
+ * File system repair tools may re-classify the file type
+ * and that will break optimization, but not functionality.
+ */
+ err = ovl_do_rename(wdir, newdentry, udir, upper, RENAME_DT_UNKNOWN);
if (err)
goto out_cleanup;
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 8af450b..54162c0 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -113,6 +113,11 @@ static inline int ovl_do_rename(struct inode *olddir, struct dentry *olddentry,
err = vfs_rename(olddir, olddentry, newdir, newdentry, NULL, flags);
+ /* retry without RENAME_VFS_DTYPE if fs does not support it */
+ if (err == -EINVAL && (flags & RENAME_VFS_DTYPE))
+ err = vfs_rename(olddir, olddentry, newdir, newdentry, NULL,
+ flags & RENAME_UAPI_MASK);
+
if (err) {
pr_debug("...rename(%pd2, %pd2, ...) = %i\n",
olddentry, newdentry, err);
--
2.7.4
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [RFC][PATCH 4/4] ovl: use RENAME_DT_WHT to optimize ovl_dir_read_merged()
2016-12-22 17:25 [RFC][PATCH 0/4] ovl: optimize dir iteration Amir Goldstein
` (2 preceding siblings ...)
2016-12-22 17:25 ` [RFC][PATCH 3/4] ovl: use RENAME_DT_UNKNOWN to optimize stable d_inode Amir Goldstein
@ 2016-12-22 17:25 ` Amir Goldstein
3 siblings, 0 replies; 5+ messages in thread
From: Amir Goldstein @ 2016-12-22 17:25 UTC (permalink / raw)
To: Miklos Szeredi; +Cc: linux-unionfs
Try to use the new vfs rename flag RENAME_DT_WHT, to request from
underlying file system to mark whiteout directory entries as DT_WHT
instead of DT_CHR.
When whiteouts are classified as DT_WHT, ovl_dir_read_merged() can
distinguish between whiteouts to real character devices without
having to stat the suspect inodes.
Because this is only done for optimization, it is not a problem
if file system does not respect the new RENAME_DT_WHT flag, which
is most likely the case, so we retry the rename without the flag.
Even if DT_WHT type is set on rename using an experimental file
system patch, file system repair tools may later re-classify the
file type. That would break the optimization, but will not break
the functionality of ovl_dir_read_merged().
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
fs/overlayfs/dir.c | 12 +++++++++---
fs/overlayfs/readdir.c | 2 +-
2 files changed, 10 insertions(+), 4 deletions(-)
diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index 16e06dd..0e1c3f2 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -630,7 +630,13 @@ static int ovl_remove_and_whiteout(struct dentry *dentry, bool is_dir)
struct dentry *upper;
struct dentry *opaquedir = NULL;
int err;
- int flags = 0;
+ /*
+ * Mark whiteout objects as DT_WHT instead of DT_CHR, so it easy
+ * to distinguish them real character devices in iterate_dir().
+ * File system repair tools may re-classify the file type
+ * and that will break optimization, but not functionality.
+ */
+ int flags = RENAME_DT_WHT;
if (WARN_ON(!workdir))
return -EROFS;
@@ -665,12 +671,12 @@ static int ovl_remove_and_whiteout(struct dentry *dentry, bool is_dir)
goto out_dput_upper;
if (d_is_dir(upper))
- flags = RENAME_EXCHANGE;
+ flags |= RENAME_EXCHANGE;
err = ovl_do_rename(wdir, whiteout, udir, upper, flags);
if (err)
goto kill_whiteout;
- if (flags)
+ if (flags & RENAME_EXCHANGE)
ovl_cleanup(wdir, upper);
ovl_dentry_version_inc(dentry->d_parent);
diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
index f241b4e..32c5c96 100644
--- a/fs/overlayfs/readdir.c
+++ b/fs/overlayfs/readdir.c
@@ -98,7 +98,7 @@ static struct ovl_cache_entry *ovl_cache_entry_new(struct ovl_readdir_data *rdd,
p->len = len;
p->type = d_type;
p->ino = ino;
- p->is_whiteout = false;
+ p->is_whiteout = (d_type == DT_WHT);
if (d_type == DT_CHR) {
p->next_maybe_whiteout = rdd->first_maybe_whiteout;
--
2.7.4
^ permalink raw reply related [flat|nested] 5+ messages in thread
end of thread, other threads:[~2016-12-22 17:26 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-12-22 17:25 [RFC][PATCH 0/4] ovl: optimize dir iteration Amir Goldstein
2016-12-22 17:25 ` [RFC][PATCH 1/4] vfs: add RENAME_VFS_DTYPE vfs_rename() flag Amir Goldstein
2016-12-22 17:25 ` [RFC][PATCH 2/4] xfs: support RENAME_VFS_DTYPE flag Amir Goldstein
2016-12-22 17:25 ` [RFC][PATCH 3/4] ovl: use RENAME_DT_UNKNOWN to optimize stable d_inode Amir Goldstein
2016-12-22 17:25 ` [RFC][PATCH 4/4] ovl: use RENAME_DT_WHT to optimize ovl_dir_read_merged() Amir Goldstein
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox