* [PATCH v5 0/5] Handle notifications on overlayfs fake path files
@ 2023-06-15 11:22 Amir Goldstein
2023-06-15 11:22 ` [PATCH v5 1/5] fs: rename {vfs,kernel}_tmpfile_open() Amir Goldstein
` (5 more replies)
0 siblings, 6 replies; 12+ messages in thread
From: Amir Goldstein @ 2023-06-15 11:22 UTC (permalink / raw)
To: Christian Brauner
Cc: Jan Kara, Miklos Szeredi, Christoph Hellwig, David Howells,
Al Viro, linux-fsdevel, linux-unionfs
Christian,
A little while ago, Jan and I realized that an unprivileged overlayfs
mount could be used to avert fanotify permission events that were
requested for an inode or sb on the underlying fs.
The [v1] patch set was an attempt to implement Miklos' suggestion
(opt-in to query the fake path) which turned out to affet the vfs in
many places, so Miklos and I agreed on a solution that will be less
intrusive for vfs (opt-in to query the real path).
The [v2] patch set took the less intrusive approach to vfs, but it
also tried a different approach of extending the d_real() interface,
which Miklos did not like.
The [v3] patch goes back to the less intrusive approach to vfs without
complicating d_real() interface, that Miklso and I agreed on during the
[v1] patch set review, so hopefully everyone can be happy with it.
This v5 patch set addresses review comments from yourself and from
Christoph on [v3] and [v4].
Since the patches are 95% vfs, I think it is better if they are merged
through the vfs tree.
I am still hoping to solicit an ACK from Miklos on the ovl change
in the last patch.
Thanks,
Amir.
Changes since [v4]:
- ACK from Jan for fsnotify patch
- Do not use backing_file for cachefiles (brauner)
- Consistent naming scheme *_*file_open() (brauner)
- Split patches and better documentation (hch)
Changes since [v3]:
- Rename struct file_fake to backing_file
- Rename helpers to open_backing_file(), backing_file_real_path()
- Rename FMODE_FAKE_PATH to FMODE_BACKING
- Separate flag from FMODE_NOACCOUNT
- inline the fast-path branch of f_real_path()
Changes since [v2]:
- Restore the file_fake container (Miklos)
- Re-arrange the v1 helpers (Christian)
Changes since [v1]:
- Drop the file_fake container
- Leave f_path fake and special case only fsnotify
[v4] https://lore.kernel.org/linux-unionfs/20230614074907.1943007-1-amir73il@gmail.com/
[v3] https://lore.kernel.org/linux-unionfs/20230611194706.1583818-1-amir73il@gmail.com/
[v2] https://lore.kernel.org/linux-unionfs/20230611132732.1502040-1-amir73il@gmail.com/
[v1] https://lore.kernel.org/linux-unionfs/20230609073239.957184-1-amir73il@gmail.com/
Amir Goldstein (5):
fs: rename {vfs,kernel}_tmpfile_open()
fs: use a helper for opening kernel internal files
fs: move kmem_cache_zalloc() into alloc_empty_file*() helpers
fs: use backing_file container for internal files with "fake" f_path
ovl: enable fsnotify events on underlying real files
fs/cachefiles/namei.c | 10 ++---
fs/file_table.c | 91 ++++++++++++++++++++++++++++++++--------
fs/internal.h | 5 ++-
fs/namei.c | 24 ++++++-----
fs/open.c | 75 ++++++++++++++++++++++++++++-----
fs/overlayfs/file.c | 8 ++--
fs/overlayfs/overlayfs.h | 5 ++-
include/linux/fs.h | 32 ++++++++++----
include/linux/fsnotify.h | 3 +-
9 files changed, 192 insertions(+), 61 deletions(-)
--
2.34.1
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH v5 1/5] fs: rename {vfs,kernel}_tmpfile_open()
2023-06-15 11:22 [PATCH v5 0/5] Handle notifications on overlayfs fake path files Amir Goldstein
@ 2023-06-15 11:22 ` Amir Goldstein
2023-06-16 7:04 ` Christoph Hellwig
2023-06-15 11:22 ` [PATCH v5 2/5] fs: use a helper for opening kernel internal files Amir Goldstein
` (4 subsequent siblings)
5 siblings, 1 reply; 12+ messages in thread
From: Amir Goldstein @ 2023-06-15 11:22 UTC (permalink / raw)
To: Christian Brauner
Cc: Jan Kara, Miklos Szeredi, Christoph Hellwig, David Howells,
Al Viro, linux-fsdevel, linux-unionfs
Overlayfs and cachefiles use vfs_open_tmpfile() to open a tmpfile
without accounting for nr_files.
Rename this helper to kernel_tmpfile_open() to better reflect this
helper is used for kernel internal users.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
fs/cachefiles/namei.c | 6 +++---
fs/namei.c | 24 +++++++++++++-----------
fs/overlayfs/overlayfs.h | 5 +++--
include/linux/fs.h | 7 ++++---
4 files changed, 23 insertions(+), 19 deletions(-)
diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index 82219a8f6084..6c7d4e97c219 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -451,9 +451,9 @@ struct file *cachefiles_create_tmpfile(struct cachefiles_object *object)
ret = cachefiles_inject_write_error();
if (ret == 0) {
- file = vfs_tmpfile_open(&nop_mnt_idmap, &parentpath, S_IFREG,
- O_RDWR | O_LARGEFILE | O_DIRECT,
- cache->cache_cred);
+ file = kernel_tmpfile_open(&nop_mnt_idmap, &parentpath, S_IFREG,
+ O_RDWR | O_LARGEFILE | O_DIRECT,
+ cache->cache_cred);
ret = PTR_ERR_OR_ZERO(file);
}
if (ret) {
diff --git a/fs/namei.c b/fs/namei.c
index e4fe0879ae55..36e335c39c44 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3703,7 +3703,7 @@ static int vfs_tmpfile(struct mnt_idmap *idmap,
}
/**
- * vfs_tmpfile_open - open a tmpfile for kernel internal use
+ * kernel_tmpfile_open - open a tmpfile for kernel internal use
* @idmap: idmap of the mount the inode was found from
* @parentpath: path of the base directory
* @mode: mode of the new tmpfile
@@ -3714,24 +3714,26 @@ static int vfs_tmpfile(struct mnt_idmap *idmap,
* hence this is only for kernel internal use, and must not be installed into
* file tables or such.
*/
-struct file *vfs_tmpfile_open(struct mnt_idmap *idmap,
- const struct path *parentpath,
- umode_t mode, int open_flag, const struct cred *cred)
+struct file *kernel_tmpfile_open(struct mnt_idmap *idmap,
+ const struct path *parentpath,
+ umode_t mode, int open_flag,
+ const struct cred *cred)
{
struct file *file;
int error;
file = alloc_empty_file_noaccount(open_flag, cred);
- if (!IS_ERR(file)) {
- error = vfs_tmpfile(idmap, parentpath, file, mode);
- if (error) {
- fput(file);
- file = ERR_PTR(error);
- }
+ if (IS_ERR(file))
+ return file;
+
+ error = vfs_tmpfile(idmap, parentpath, file, mode);
+ if (error) {
+ fput(file);
+ file = ERR_PTR(error);
}
return file;
}
-EXPORT_SYMBOL(vfs_tmpfile_open);
+EXPORT_SYMBOL(kernel_tmpfile_open);
static int do_tmpfile(struct nameidata *nd, unsigned flags,
const struct open_flags *op,
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index fcac4e2c56ab..6129f0984cf7 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -329,8 +329,9 @@ static inline struct file *ovl_do_tmpfile(struct ovl_fs *ofs,
struct dentry *dentry, umode_t mode)
{
struct path path = { .mnt = ovl_upper_mnt(ofs), .dentry = dentry };
- struct file *file = vfs_tmpfile_open(ovl_upper_mnt_idmap(ofs), &path, mode,
- O_LARGEFILE | O_WRONLY, current_cred());
+ struct file *file = kernel_tmpfile_open(ovl_upper_mnt_idmap(ofs), &path,
+ mode, O_LARGEFILE | O_WRONLY,
+ current_cred());
int err = PTR_ERR_OR_ZERO(file);
pr_debug("tmpfile(%pd2, 0%o) = %i\n", dentry, mode, err);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 21a981680856..62237beeac2a 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1672,9 +1672,10 @@ static inline int vfs_whiteout(struct mnt_idmap *idmap,
WHITEOUT_DEV);
}
-struct file *vfs_tmpfile_open(struct mnt_idmap *idmap,
- const struct path *parentpath,
- umode_t mode, int open_flag, const struct cred *cred);
+struct file *kernel_tmpfile_open(struct mnt_idmap *idmap,
+ const struct path *parentpath,
+ umode_t mode, int open_flag,
+ const struct cred *cred);
int vfs_mkobj(struct dentry *, umode_t,
int (*f)(struct dentry *, umode_t, void *),
--
2.34.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v5 2/5] fs: use a helper for opening kernel internal files
2023-06-15 11:22 [PATCH v5 0/5] Handle notifications on overlayfs fake path files Amir Goldstein
2023-06-15 11:22 ` [PATCH v5 1/5] fs: rename {vfs,kernel}_tmpfile_open() Amir Goldstein
@ 2023-06-15 11:22 ` Amir Goldstein
2023-06-16 7:06 ` Christoph Hellwig
2023-06-15 11:22 ` [PATCH v5 3/5] fs: move kmem_cache_zalloc() into alloc_empty_file*() helpers Amir Goldstein
` (3 subsequent siblings)
5 siblings, 1 reply; 12+ messages in thread
From: Amir Goldstein @ 2023-06-15 11:22 UTC (permalink / raw)
To: Christian Brauner
Cc: Jan Kara, Miklos Szeredi, Christoph Hellwig, David Howells,
Al Viro, linux-fsdevel, linux-unionfs
cachefiles uses kernel_open_tmpfile() to open kernel internal tmpfile
without accounting for nr_files.
cachefiles uses open_with_fake_path() for the same reason without the
need for a fake path.
Fork open_with_fake_path() to kernel_file_open() which only does the
noaccount part and use it in cachefiles.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
fs/cachefiles/namei.c | 4 ++--
fs/open.c | 31 +++++++++++++++++++++++++++++++
include/linux/fs.h | 2 ++
3 files changed, 35 insertions(+), 2 deletions(-)
diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index 6c7d4e97c219..499cf73f097b 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -560,8 +560,8 @@ static bool cachefiles_open_file(struct cachefiles_object *object,
*/
path.mnt = cache->mnt;
path.dentry = dentry;
- file = open_with_fake_path(&path, O_RDWR | O_LARGEFILE | O_DIRECT,
- d_backing_inode(dentry), cache->cache_cred);
+ file = kernel_file_open(&path, O_RDWR | O_LARGEFILE | O_DIRECT,
+ d_backing_inode(dentry), cache->cache_cred);
if (IS_ERR(file)) {
trace_cachefiles_vfs_error(object, d_backing_inode(dentry),
PTR_ERR(file),
diff --git a/fs/open.c b/fs/open.c
index 005ca91a173b..c5da2b3eb105 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -1121,6 +1121,37 @@ struct file *dentry_create(const struct path *path, int flags, umode_t mode,
}
EXPORT_SYMBOL(dentry_create);
+/**
+ * kernel_file_open - open a file for kernel internal use
+ * @path: path of the file to open
+ * @flags: open flags
+ * @inode: the inode
+ * @cred: credentials for open
+ *
+ * Open a file that is not accounted in nr_files.
+ * This is only for kernel internal use, and must not be installed into
+ * file tables or such.
+ */
+struct file *kernel_file_open(const struct path *path, int flags,
+ struct inode *inode, const struct cred *cred)
+{
+ struct file *f;
+ int error;
+
+ f = alloc_empty_file_noaccount(flags, cred);
+ if (IS_ERR(f))
+ return f;
+
+ f->f_path = *path;
+ error = do_dentry_open(f, inode, NULL);
+ if (error) {
+ fput(f);
+ f = ERR_PTR(error);
+ }
+ return f;
+}
+EXPORT_SYMBOL_GPL(kernel_file_open);
+
struct file *open_with_fake_path(const struct path *path, int flags,
struct inode *inode, const struct cred *cred)
{
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 62237beeac2a..1f8486e773af 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1676,6 +1676,8 @@ struct file *kernel_tmpfile_open(struct mnt_idmap *idmap,
const struct path *parentpath,
umode_t mode, int open_flag,
const struct cred *cred);
+struct file *kernel_file_open(const struct path *path, int flags,
+ struct inode *inode, const struct cred *cred);
int vfs_mkobj(struct dentry *, umode_t,
int (*f)(struct dentry *, umode_t, void *),
--
2.34.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v5 3/5] fs: move kmem_cache_zalloc() into alloc_empty_file*() helpers
2023-06-15 11:22 [PATCH v5 0/5] Handle notifications on overlayfs fake path files Amir Goldstein
2023-06-15 11:22 ` [PATCH v5 1/5] fs: rename {vfs,kernel}_tmpfile_open() Amir Goldstein
2023-06-15 11:22 ` [PATCH v5 2/5] fs: use a helper for opening kernel internal files Amir Goldstein
@ 2023-06-15 11:22 ` Amir Goldstein
2023-06-16 7:07 ` Christoph Hellwig
2023-06-15 11:22 ` [PATCH v5 4/5] fs: use backing_file container for internal files with "fake" f_path Amir Goldstein
` (2 subsequent siblings)
5 siblings, 1 reply; 12+ messages in thread
From: Amir Goldstein @ 2023-06-15 11:22 UTC (permalink / raw)
To: Christian Brauner
Cc: Jan Kara, Miklos Szeredi, Christoph Hellwig, David Howells,
Al Viro, linux-fsdevel, linux-unionfs
Use a common helper init_file() instead of __alloc_file() for
alloc_empty_file*() helpers and improrve the documentation.
This is needed for a follow up patch that allocates a backing_file
container.
Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
fs/file_table.c | 41 ++++++++++++++++++++++++++---------------
1 file changed, 26 insertions(+), 15 deletions(-)
diff --git a/fs/file_table.c b/fs/file_table.c
index 372653b92617..4bc713865212 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -131,20 +131,15 @@ static int __init init_fs_stat_sysctls(void)
fs_initcall(init_fs_stat_sysctls);
#endif
-static struct file *__alloc_file(int flags, const struct cred *cred)
+static int init_file(struct file *f, int flags, const struct cred *cred)
{
- struct file *f;
int error;
- f = kmem_cache_zalloc(filp_cachep, GFP_KERNEL);
- if (unlikely(!f))
- return ERR_PTR(-ENOMEM);
-
f->f_cred = get_cred(cred);
error = security_file_alloc(f);
if (unlikely(error)) {
file_free_rcu(&f->f_rcuhead);
- return ERR_PTR(error);
+ return error;
}
atomic_long_set(&f->f_count, 1);
@@ -155,7 +150,7 @@ static struct file *__alloc_file(int flags, const struct cred *cred)
f->f_mode = OPEN_FMODE(flags);
/* f->f_version: 0 */
- return f;
+ return 0;
}
/* Find an unused file structure and return a pointer to it.
@@ -172,6 +167,7 @@ struct file *alloc_empty_file(int flags, const struct cred *cred)
{
static long old_max;
struct file *f;
+ int error;
/*
* Privileged users can go above max_files
@@ -185,9 +181,15 @@ struct file *alloc_empty_file(int flags, const struct cred *cred)
goto over;
}
- f = __alloc_file(flags, cred);
- if (!IS_ERR(f))
- percpu_counter_inc(&nr_files);
+ f = kmem_cache_zalloc(filp_cachep, GFP_KERNEL);
+ if (unlikely(!f))
+ return ERR_PTR(-ENOMEM);
+
+ error = init_file(f, flags, cred);
+ if (unlikely(error))
+ return ERR_PTR(error);
+
+ percpu_counter_inc(&nr_files);
return f;
@@ -203,14 +205,23 @@ struct file *alloc_empty_file(int flags, const struct cred *cred)
/*
* Variant of alloc_empty_file() that doesn't check and modify nr_files.
*
- * Should not be used unless there's a very good reason to do so.
+ * This is only for kernel internal use, and the allocate file must not be
+ * installed into file tables or such.
*/
struct file *alloc_empty_file_noaccount(int flags, const struct cred *cred)
{
- struct file *f = __alloc_file(flags, cred);
+ struct file *f;
+ int error;
+
+ f = kmem_cache_zalloc(filp_cachep, GFP_KERNEL);
+ if (unlikely(!f))
+ return ERR_PTR(-ENOMEM);
+
+ error = init_file(f, flags, cred);
+ if (unlikely(error))
+ return ERR_PTR(error);
- if (!IS_ERR(f))
- f->f_mode |= FMODE_NOACCOUNT;
+ f->f_mode |= FMODE_NOACCOUNT;
return f;
}
--
2.34.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v5 4/5] fs: use backing_file container for internal files with "fake" f_path
2023-06-15 11:22 [PATCH v5 0/5] Handle notifications on overlayfs fake path files Amir Goldstein
` (2 preceding siblings ...)
2023-06-15 11:22 ` [PATCH v5 3/5] fs: move kmem_cache_zalloc() into alloc_empty_file*() helpers Amir Goldstein
@ 2023-06-15 11:22 ` Amir Goldstein
2023-06-16 7:15 ` Christoph Hellwig
2023-06-15 11:22 ` [PATCH v5 5/5] ovl: enable fsnotify events on underlying real files Amir Goldstein
2023-06-20 10:57 ` [PATCH v5 0/5] Handle notifications on overlayfs fake path files Christian Brauner
5 siblings, 1 reply; 12+ messages in thread
From: Amir Goldstein @ 2023-06-15 11:22 UTC (permalink / raw)
To: Christian Brauner
Cc: Jan Kara, Miklos Szeredi, Christoph Hellwig, David Howells,
Al Viro, linux-fsdevel, linux-unionfs
Overlayfs uses open_with_fake_path() to allocate internal kernel files,
with a "fake" path - whose f_path is not on the same fs as f_inode.
Allocate a container struct backing_file for those internal files, that
is used to hold the "fake" ovl path along with the real path.
backing_file_real_path() can be used to access the stored real path.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
fs/file_table.c | 50 +++++++++++++++++++++++++++++++++++++++++++--
fs/internal.h | 5 +++--
fs/open.c | 46 ++++++++++++++++++++++++++++++-----------
fs/overlayfs/file.c | 4 ++--
include/linux/fs.h | 23 ++++++++++++++++-----
5 files changed, 105 insertions(+), 23 deletions(-)
diff --git a/fs/file_table.c b/fs/file_table.c
index 4bc713865212..34a832504369 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -44,18 +44,40 @@ static struct kmem_cache *filp_cachep __read_mostly;
static struct percpu_counter nr_files __cacheline_aligned_in_smp;
+/* Container for backing file with optional real path */
+struct backing_file {
+ struct file file;
+ struct path real_path;
+};
+
+static inline struct backing_file *backing_file(struct file *f)
+{
+ return container_of(f, struct backing_file, file);
+}
+
+struct path *backing_file_real_path(struct file *f)
+{
+ return &backing_file(f)->real_path;
+}
+EXPORT_SYMBOL_GPL(backing_file_real_path);
+
static void file_free_rcu(struct rcu_head *head)
{
struct file *f = container_of(head, struct file, f_rcuhead);
put_cred(f->f_cred);
- kmem_cache_free(filp_cachep, f);
+ if (unlikely(f->f_mode & FMODE_BACKING))
+ kfree(backing_file(f));
+ else
+ kmem_cache_free(filp_cachep, f);
}
static inline void file_free(struct file *f)
{
security_file_free(f);
- if (!(f->f_mode & FMODE_NOACCOUNT))
+ if (unlikely(f->f_mode & FMODE_BACKING))
+ path_put(backing_file_real_path(f));
+ else
percpu_counter_dec(&nr_files);
call_rcu(&f->f_rcuhead, file_free_rcu);
}
@@ -226,6 +248,30 @@ struct file *alloc_empty_file_noaccount(int flags, const struct cred *cred)
return f;
}
+/*
+ * Variant of alloc_empty_file() that allocates a backing_file container
+ * and doesn't check and modify nr_files.
+ *
+ * This is only for kernel internal use, and the allocate file must not be
+ * installed into file tables or such.
+ */
+struct file *alloc_empty_backing_file(int flags, const struct cred *cred)
+{
+ struct backing_file *ff;
+ int error;
+
+ ff = kzalloc(sizeof(struct backing_file), GFP_KERNEL);
+ if (unlikely(!ff))
+ return ERR_PTR(-ENOMEM);
+
+ error = init_file(&ff->file, flags, cred);
+ if (unlikely(error))
+ return ERR_PTR(error);
+
+ ff->file.f_mode |= FMODE_BACKING | FMODE_NOACCOUNT;
+ return &ff->file;
+}
+
/**
* alloc_file - allocate and initialize a 'struct file'
*
diff --git a/fs/internal.h b/fs/internal.h
index bd3b2810a36b..9c31078e0d16 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -97,8 +97,9 @@ extern void chroot_fs_refs(const struct path *, const struct path *);
/*
* file_table.c
*/
-extern struct file *alloc_empty_file(int, const struct cred *);
-extern struct file *alloc_empty_file_noaccount(int, const struct cred *);
+struct file *alloc_empty_file(int flags, const struct cred *cred);
+struct file *alloc_empty_file_noaccount(int flags, const struct cred *cred);
+struct file *alloc_empty_backing_file(int flags, const struct cred *cred);
static inline void put_file_access(struct file *file)
{
diff --git a/fs/open.c b/fs/open.c
index c5da2b3eb105..059d7024155a 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -1152,23 +1152,45 @@ struct file *kernel_file_open(const struct path *path, int flags,
}
EXPORT_SYMBOL_GPL(kernel_file_open);
-struct file *open_with_fake_path(const struct path *path, int flags,
- struct inode *inode, const struct cred *cred)
+/**
+ * backing_file_open - open a backing file for kernel internal use
+ * @path: path of the file to open
+ * @flags: open flags
+ * @path: path of the backing file
+ * @cred: credentials for open
+ *
+ * Open a file whose f_inode != d_inode(f_path.dentry).
+ * the returned struct file is embedded inside a struct backing_file
+ * container which is used to store the @real_path of the inode.
+ * The @real_path can be accessed by backing_file_real_path() and the
+ * real dentry can be accessed with file_dentry().
+ * The kernel internal backing file is not accounted in nr_files.
+ * This is only for kernel internal use, and must not be installed into
+ * file tables or such.
+ */
+struct file *backing_file_open(const struct path *path, int flags,
+ const struct path *real_path,
+ const struct cred *cred)
{
- struct file *f = alloc_empty_file_noaccount(flags, cred);
- if (!IS_ERR(f)) {
- int error;
+ struct file *f;
+ int error;
- f->f_path = *path;
- error = do_dentry_open(f, inode, NULL);
- if (error) {
- fput(f);
- f = ERR_PTR(error);
- }
+ f = alloc_empty_backing_file(flags, cred);
+ if (IS_ERR(f))
+ return f;
+
+ f->f_path = *path;
+ path_get(real_path);
+ *backing_file_real_path(f) = *real_path;
+ error = do_dentry_open(f, d_inode(real_path->dentry), NULL);
+ if (error) {
+ fput(f);
+ f = ERR_PTR(error);
}
+
return f;
}
-EXPORT_SYMBOL(open_with_fake_path);
+EXPORT_SYMBOL_GPL(backing_file_open);
#define WILL_CREATE(flags) (flags & (O_CREAT | __O_TMPFILE))
#define O_PATH_FLAGS (O_DIRECTORY | O_NOFOLLOW | O_PATH | O_CLOEXEC)
diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
index 39737c2aaa84..c610ae35b0b9 100644
--- a/fs/overlayfs/file.c
+++ b/fs/overlayfs/file.c
@@ -61,8 +61,8 @@ static struct file *ovl_open_realfile(const struct file *file,
if (!inode_owner_or_capable(real_idmap, realinode))
flags &= ~O_NOATIME;
- realfile = open_with_fake_path(&file->f_path, flags, realinode,
- current_cred());
+ realfile = backing_file_open(&file->f_path, flags, realpath,
+ current_cred());
}
revert_creds(old_cred);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 1f8486e773af..ecf85a46fb2a 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -171,6 +171,9 @@ typedef int (dio_iodone_t)(struct kiocb *iocb, loff_t offset,
/* File supports non-exclusive O_DIRECT writes from multiple threads */
#define FMODE_DIO_PARALLEL_WRITE ((__force fmode_t)0x1000000)
+/* File is embedded in backing_file object */
+#define FMODE_BACKING ((__force fmode_t)0x2000000)
+
/* File was opened by fanotify and shouldn't generate fanotify events */
#define FMODE_NONOTIFY ((__force fmode_t)0x4000000)
@@ -2352,11 +2355,21 @@ static inline struct file *file_open_root_mnt(struct vfsmount *mnt,
return file_open_root(&(struct path){.mnt = mnt, .dentry = mnt->mnt_root},
name, flags, mode);
}
-extern struct file * dentry_open(const struct path *, int, const struct cred *);
-extern struct file *dentry_create(const struct path *path, int flags,
- umode_t mode, const struct cred *cred);
-extern struct file * open_with_fake_path(const struct path *, int,
- struct inode*, const struct cred *);
+struct file *dentry_open(const struct path *path, int flags,
+ const struct cred *creds);
+struct file *dentry_create(const struct path *path, int flags, umode_t mode,
+ const struct cred *cred);
+struct file *backing_file_open(const struct path *path, int flags,
+ const struct path *real_path,
+ const struct cred *cred);
+struct path *backing_file_real_path(struct file *f);
+static inline const struct path *f_real_path(struct file *f)
+{
+ if (unlikely(f->f_mode & FMODE_BACKING))
+ return backing_file_real_path(f);
+ return &f->f_path;
+}
+
static inline struct file *file_clone_open(struct file *file)
{
return dentry_open(&file->f_path, file->f_flags, file->f_cred);
--
2.34.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v5 5/5] ovl: enable fsnotify events on underlying real files
2023-06-15 11:22 [PATCH v5 0/5] Handle notifications on overlayfs fake path files Amir Goldstein
` (3 preceding siblings ...)
2023-06-15 11:22 ` [PATCH v5 4/5] fs: use backing_file container for internal files with "fake" f_path Amir Goldstein
@ 2023-06-15 11:22 ` Amir Goldstein
2023-06-20 10:57 ` [PATCH v5 0/5] Handle notifications on overlayfs fake path files Christian Brauner
5 siblings, 0 replies; 12+ messages in thread
From: Amir Goldstein @ 2023-06-15 11:22 UTC (permalink / raw)
To: Christian Brauner
Cc: Jan Kara, Miklos Szeredi, Christoph Hellwig, David Howells,
Al Viro, linux-fsdevel, linux-unionfs
Overlayfs creates the real underlying files with fake f_path, whose
f_inode is on the underlying fs and f_path on overlayfs.
Those real files were open with FMODE_NONOTIFY, because fsnotify code was
not prapared to handle fsnotify hooks on files with fake path correctly
and fanotify would report unexpected event->fd with fake overlayfs path,
when the underlying fs was being watched.
Teach fsnotify to handle events on the real files, and do not set real
files to FMODE_NONOTIFY to allow operations on real file (e.g. open,
access, modify, close) to generate async and permission events.
Because fsnotify does not have notifications on address space
operations, we do not need to worry about ->vm_file not reporting
events to a watched overlayfs when users are accessing a mapped
overlayfs file.
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
fs/overlayfs/file.c | 4 ++--
include/linux/fsnotify.h | 3 ++-
2 files changed, 4 insertions(+), 3 deletions(-)
diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
index c610ae35b0b9..cb53c84108fc 100644
--- a/fs/overlayfs/file.c
+++ b/fs/overlayfs/file.c
@@ -34,8 +34,8 @@ static char ovl_whatisit(struct inode *inode, struct inode *realinode)
return 'm';
}
-/* No atime modification nor notify on underlying */
-#define OVL_OPEN_FLAGS (O_NOATIME | FMODE_NONOTIFY)
+/* No atime modification on underlying */
+#define OVL_OPEN_FLAGS (O_NOATIME)
static struct file *ovl_open_realfile(const struct file *file,
const struct path *realpath)
diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h
index bb8467cd11ae..6f6cbc2dc49b 100644
--- a/include/linux/fsnotify.h
+++ b/include/linux/fsnotify.h
@@ -91,7 +91,8 @@ static inline void fsnotify_dentry(struct dentry *dentry, __u32 mask)
static inline int fsnotify_file(struct file *file, __u32 mask)
{
- const struct path *path = &file->f_path;
+ /* Overlayfs internal files have fake f_path */
+ const struct path *path = f_real_path(file);
if (file->f_mode & FMODE_NONOTIFY)
return 0;
--
2.34.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH v5 1/5] fs: rename {vfs,kernel}_tmpfile_open()
2023-06-15 11:22 ` [PATCH v5 1/5] fs: rename {vfs,kernel}_tmpfile_open() Amir Goldstein
@ 2023-06-16 7:04 ` Christoph Hellwig
0 siblings, 0 replies; 12+ messages in thread
From: Christoph Hellwig @ 2023-06-16 7:04 UTC (permalink / raw)
To: Amir Goldstein
Cc: Christian Brauner, Jan Kara, Miklos Szeredi, Christoph Hellwig,
David Howells, Al Viro, linux-fsdevel, linux-unionfs
On Thu, Jun 15, 2023 at 02:22:25PM +0300, Amir Goldstein wrote:
> Overlayfs and cachefiles use vfs_open_tmpfile() to open a tmpfile
> without accounting for nr_files.
>
> Rename this helper to kernel_tmpfile_open() to better reflect this
> helper is used for kernel internal users.
>
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
The subject is a little weird as it suggest you rename two things, but
except for that, this looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v5 2/5] fs: use a helper for opening kernel internal files
2023-06-15 11:22 ` [PATCH v5 2/5] fs: use a helper for opening kernel internal files Amir Goldstein
@ 2023-06-16 7:06 ` Christoph Hellwig
0 siblings, 0 replies; 12+ messages in thread
From: Christoph Hellwig @ 2023-06-16 7:06 UTC (permalink / raw)
To: Amir Goldstein
Cc: Christian Brauner, Jan Kara, Miklos Szeredi, Christoph Hellwig,
David Howells, Al Viro, linux-fsdevel, linux-unionfs
On Thu, Jun 15, 2023 at 02:22:26PM +0300, Amir Goldstein wrote:
> +/**
> + * kernel_file_open - open a file for kernel internal use
> + * @path: path of the file to open
> + * @flags: open flags
> + * @inode: the inode
> + * @cred: credentials for open
> + *
> + * Open a file that is not accounted in nr_files.
> + * This is only for kernel internal use, and must not be installed into
> + * file tables or such.
This reads a litte odd. How about:
* Open a file for use by in-kernel consumers. The file is not accounted
* against nr_files and must not be installed into the file descriptor table.
With that:
Reviewed-by: Christoph Hellwig <hch@lst.de>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v5 3/5] fs: move kmem_cache_zalloc() into alloc_empty_file*() helpers
2023-06-15 11:22 ` [PATCH v5 3/5] fs: move kmem_cache_zalloc() into alloc_empty_file*() helpers Amir Goldstein
@ 2023-06-16 7:07 ` Christoph Hellwig
0 siblings, 0 replies; 12+ messages in thread
From: Christoph Hellwig @ 2023-06-16 7:07 UTC (permalink / raw)
To: Amir Goldstein
Cc: Christian Brauner, Jan Kara, Miklos Szeredi, Christoph Hellwig,
David Howells, Al Viro, linux-fsdevel, linux-unionfs
Looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v5 4/5] fs: use backing_file container for internal files with "fake" f_path
2023-06-15 11:22 ` [PATCH v5 4/5] fs: use backing_file container for internal files with "fake" f_path Amir Goldstein
@ 2023-06-16 7:15 ` Christoph Hellwig
2023-06-16 7:44 ` Amir Goldstein
0 siblings, 1 reply; 12+ messages in thread
From: Christoph Hellwig @ 2023-06-16 7:15 UTC (permalink / raw)
To: Amir Goldstein
Cc: Christian Brauner, Jan Kara, Miklos Szeredi, Christoph Hellwig,
David Howells, Al Viro, linux-fsdevel, linux-unionfs
> - if (!(f->f_mode & FMODE_NOACCOUNT))
> + if (unlikely(f->f_mode & FMODE_BACKING))
> + path_put(backing_file_real_path(f));
> + else
> percpu_counter_dec(&nr_files);
This is still missing the earlier pointed out fix that we still need
the FMODE_NOACCOUNT check, isn't it?
> + * This is only for kernel internal use, and the allocate file must not be
> + * installed into file tables or such.
I'd use the same blurb I'd suggest for the previous patch here as well.
> +/**
> + * backing_file_open - open a backing file for kernel internal use
> + * @path: path of the file to open
> + * @flags: open flags
> + * @path: path of the backing file
> + * @cred: credentials for open
> + *
> + * Open a file whose f_inode != d_inode(f_path.dentry).
> + * the returned struct file is embedded inside a struct backing_file
> + * container which is used to store the @real_path of the inode.
> + * The @real_path can be accessed by backing_file_real_path() and the
> + * real dentry can be accessed with file_dentry().
> + * The kernel internal backing file is not accounted in nr_files.
> + * This is only for kernel internal use, and must not be installed into
> + * file tables or such.
> + */
I still find this comment not very descriptive. Here is my counter
suggestion, which I'm also not totally happy with, and which might not
even be factually correct as I'm trying to understand the use case a bit
better by reading the code:
* Open a backing file for a stackable file system (e.g. overlayfs).
* For these backing files that reside on the underlying file system, we still
* want to be able to return the path of the upper file in the stackable file
* system. This is done by embedding the returned file into a container
* structure that also stores the path on the upper file system, which can be
* retreived using backing_file_real_path().
*
* The return file is not accounted in nr_files and must not be installed
* into the file descriptor table.
> +static inline const struct path *f_real_path(struct file *f)
Question from an earlier revision: why f_real_path and not file_real_path?
Also this really needs a kerneldoc comment explaining when it should
be used.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v5 4/5] fs: use backing_file container for internal files with "fake" f_path
2023-06-16 7:15 ` Christoph Hellwig
@ 2023-06-16 7:44 ` Amir Goldstein
0 siblings, 0 replies; 12+ messages in thread
From: Amir Goldstein @ 2023-06-16 7:44 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Christian Brauner, Jan Kara, Miklos Szeredi, David Howells,
Al Viro, linux-fsdevel, linux-unionfs
On Fri, Jun 16, 2023 at 10:15 AM Christoph Hellwig <hch@lst.de> wrote:
>
> > - if (!(f->f_mode & FMODE_NOACCOUNT))
> > + if (unlikely(f->f_mode & FMODE_BACKING))
> > + path_put(backing_file_real_path(f));
> > + else
> > percpu_counter_dec(&nr_files);
>
> This is still missing the earlier pointed out fix that we still need
> the FMODE_NOACCOUNT check, isn't it?
Yes, I forgot and Christian has already fixed this on his branch.
>
> > + * This is only for kernel internal use, and the allocate file must not be
> > + * installed into file tables or such.
>
> I'd use the same blurb I'd suggest for the previous patch here as well.
>
> > +/**
> > + * backing_file_open - open a backing file for kernel internal use
> > + * @path: path of the file to open
> > + * @flags: open flags
> > + * @path: path of the backing file
> > + * @cred: credentials for open
> > + *
> > + * Open a file whose f_inode != d_inode(f_path.dentry).
> > + * the returned struct file is embedded inside a struct backing_file
> > + * container which is used to store the @real_path of the inode.
> > + * The @real_path can be accessed by backing_file_real_path() and the
> > + * real dentry can be accessed with file_dentry().
> > + * The kernel internal backing file is not accounted in nr_files.
> > + * This is only for kernel internal use, and must not be installed into
> > + * file tables or such.
> > + */
>
> I still find this comment not very descriptive. Here is my counter
> suggestion, which I'm also not totally happy with, and which might not
> even be factually correct as I'm trying to understand the use case a bit
> better by reading the code:
>
> * Open a backing file for a stackable file system (e.g. overlayfs).
> * For these backing files that reside on the underlying file system, we still
> * want to be able to return the path of the upper file in the stackable file
> * system. This is done by embedding the returned file into a container
> * structure that also stores the path on the upper file system, which can be
> * retreived using backing_file_real_path().
It is the other way around.
Those ovl files currently in master have the ovl path in f_path
and xfs inode in f_inode.
After this change, f_path is still ovl and f_inode is still xfs, but
backing_file_real_path() can be used to get the xfs path.
Using the terminology "upper file in the stackable file" above
is different from what "upper file" means in overlayfs terminology.
So maybe:
* Open a backing file for a stackable filesystem (e.g. overlayfs).
* @path may be on the stackable filesystem and backing inode on the
* underlying filesystem. In this case, we want to be able to return the
* @real_path of the backing inode. This is done by embedding the
* returned file into a container structure that also stores the path of the
* backing inode on the underlying filesystem, which can be retrieved
* using backing_file_real_path().
> *
> * The return file is not accounted in nr_files and must not be installed
> * into the file descriptor table.
>
> > +static inline const struct path *f_real_path(struct file *f)
>
> Question from an earlier revision: why f_real_path and not file_real_path?
>
I missed this question.
I don't really mind.
Considering the documentation below, perhaps it should be called
file_inode_path()
> Also this really needs a kerneldoc comment explaining when it should
> be used.
How about this:
/*
* file_real_path - get the path corresponding to f_inode
*
* When opening a backing file for a stackable filesystem (e.g. overlayfs)
* f_path may be on the stackable filesystem and f_inode on the
* underlying filesystem. When the path associated with f_inode is
* needed, this helper should be used instead of accessing f_path directly.
*/
Thanks,
Amir.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v5 0/5] Handle notifications on overlayfs fake path files
2023-06-15 11:22 [PATCH v5 0/5] Handle notifications on overlayfs fake path files Amir Goldstein
` (4 preceding siblings ...)
2023-06-15 11:22 ` [PATCH v5 5/5] ovl: enable fsnotify events on underlying real files Amir Goldstein
@ 2023-06-20 10:57 ` Christian Brauner
5 siblings, 0 replies; 12+ messages in thread
From: Christian Brauner @ 2023-06-20 10:57 UTC (permalink / raw)
To: Amir Goldstein
Cc: Christian Brauner, Jan Kara, Miklos Szeredi, Christoph Hellwig,
David Howells, Al Viro, linux-fsdevel, linux-unionfs
On Thu, 15 Jun 2023 14:22:24 +0300, Amir Goldstein wrote:
> Christian,
>
> A little while ago, Jan and I realized that an unprivileged overlayfs
> mount could be used to avert fanotify permission events that were
> requested for an inode or sb on the underlying fs.
>
> The [v1] patch set was an attempt to implement Miklos' suggestion
> (opt-in to query the fake path) which turned out to affet the vfs in
> many places, so Miklos and I agreed on a solution that will be less
> intrusive for vfs (opt-in to query the real path).
>
> [...]
I incorporated all the fixes suggested by Christoph and the missing
decrement of nr_files.
---
Applied to the vfs.backing.file branch of the vfs/vfs.git tree.
Patches in the vfs.backing.file branch should appear in linux-next soon.
Please report any outstanding bugs that were missed during review in a
new review to the original patch series allowing us to drop it.
It's encouraged to provide Acked-bys and Reviewed-bys even though the
patch has now been applied. If possible patch trailers will be updated.
tree: https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
branch: vfs.backing.file
[1/5] fs: rename {vfs,kernel}_tmpfile_open()
https://git.kernel.org/vfs/vfs/c/d56e0ddb8fc3
[2/5] fs: use a helper for opening kernel internal files
https://git.kernel.org/vfs/vfs/c/cbb0b9d4bbcf
[3/5] fs: move kmem_cache_zalloc() into alloc_empty_file*() helpers
https://git.kernel.org/vfs/vfs/c/8a05a8c31d06
[4/5] fs: use backing_file container for internal files with "fake" f_path
https://git.kernel.org/vfs/vfs/c/62d53c4a1dfe
[5/5] ovl: enable fsnotify events on underlying real files
https://git.kernel.org/vfs/vfs/c/bc2473c90fca
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2023-06-20 10:57 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-06-15 11:22 [PATCH v5 0/5] Handle notifications on overlayfs fake path files Amir Goldstein
2023-06-15 11:22 ` [PATCH v5 1/5] fs: rename {vfs,kernel}_tmpfile_open() Amir Goldstein
2023-06-16 7:04 ` Christoph Hellwig
2023-06-15 11:22 ` [PATCH v5 2/5] fs: use a helper for opening kernel internal files Amir Goldstein
2023-06-16 7:06 ` Christoph Hellwig
2023-06-15 11:22 ` [PATCH v5 3/5] fs: move kmem_cache_zalloc() into alloc_empty_file*() helpers Amir Goldstein
2023-06-16 7:07 ` Christoph Hellwig
2023-06-15 11:22 ` [PATCH v5 4/5] fs: use backing_file container for internal files with "fake" f_path Amir Goldstein
2023-06-16 7:15 ` Christoph Hellwig
2023-06-16 7:44 ` Amir Goldstein
2023-06-15 11:22 ` [PATCH v5 5/5] ovl: enable fsnotify events on underlying real files Amir Goldstein
2023-06-20 10:57 ` [PATCH v5 0/5] Handle notifications on overlayfs fake path files Christian Brauner
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).