linux-trace-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v5 0/7] eventfs: Fixing dynamic creation
@ 2023-10-31 22:33 Steven Rostedt
  2023-10-31 22:33 ` [PATCH v5 1/7] eventfs: Remove "is_freed" union with rcu head Steven Rostedt
                   ` (6 more replies)
  0 siblings, 7 replies; 12+ messages in thread
From: Steven Rostedt @ 2023-10-31 22:33 UTC (permalink / raw)
  To: linux-kernel, linux-trace-kernel
  Cc: Masami Hiramatsu, Mark Rutland, Ajay Kaher, Andrew Morton

I found an issue with using a union between the rcu list head and
the "is_freed" boolean (word). That is, rcu list is a single link
list where the last element has a NULL pointer. That means, if
the eventfs_inode is the last element (which it likely will be)
it will not have its flag set to is_free.

Instead use a bit from nr_entries to be the is_freed flag.

Changes since v4: https://lore.kernel.org/linux-trace-kernel/20231031193109.018322397@goodmis.org/

- Updated on top of the change to separate out is_freed from the union

- Add another fix to only delete the eventfs_inode after the last
  dentry has cleared its reference (there's more races there without
  doing that)

- Make the top level eventfs directory the same as the rest in
  being removed.

Steven Rostedt (Google) (7):
      eventfs: Remove "is_freed" union with rcu head
      eventfs: Have a free_ei() that just frees the eventfs_inode
      eventfs: Test for ei->is_freed when accessing ei->dentry
      eventfs: Save ownership and mode
      eventfs: Hold eventfs_mutex when calling callback functions
      eventfs: Delete eventfs_inode when the last dentry is freed
      eventfs: Remove special processing of dput() of events directory

----
 fs/tracefs/event_inode.c | 431 +++++++++++++++++++++++++++++++----------------
 fs/tracefs/internal.h    |  27 ++-
 include/linux/tracefs.h  |  43 +++++
 3 files changed, 353 insertions(+), 148 deletions(-)

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v5 1/7] eventfs: Remove "is_freed" union with rcu head
  2023-10-31 22:33 [PATCH v5 0/7] eventfs: Fixing dynamic creation Steven Rostedt
@ 2023-10-31 22:33 ` Steven Rostedt
  2023-11-01 12:19   ` Masami Hiramatsu
  2023-10-31 22:33 ` [PATCH v5 2/7] eventfs: Have a free_ei() that just frees the eventfs_inode Steven Rostedt
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 12+ messages in thread
From: Steven Rostedt @ 2023-10-31 22:33 UTC (permalink / raw)
  To: linux-kernel, linux-trace-kernel
  Cc: Masami Hiramatsu, Mark Rutland, Ajay Kaher, Andrew Morton, stable

From: "Steven Rostedt (Google)" <rostedt@goodmis.org>

The eventfs_inode->is_freed was a union with the rcu_head with the
assumption that when it was on the srcu list the head would contain a
pointer which would make "is_freed" true. But that was a wrong assumption
as the rcu head is a single link list where the last element is NULL.

Instead, split the nr_entries integer so that "is_freed" is one bit and
the nr_entries is the next 31 bits. As there shouldn't be more than 10
(currently there's at most 5 to 7 depending on the config), this should
not be a problem.

Cc: stable@vger.kernel.org
Fixes: 63940449555e7 ("eventfs: Implement eventfs lookup, read, open functions")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 fs/tracefs/event_inode.c | 2 ++
 fs/tracefs/internal.h    | 6 +++---
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
index 754885dfe71c..2c2c75b2ad73 100644
--- a/fs/tracefs/event_inode.c
+++ b/fs/tracefs/event_inode.c
@@ -824,6 +824,8 @@ static void eventfs_remove_rec(struct eventfs_inode *ei, struct list_head *head,
 		eventfs_remove_rec(ei_child, head, level + 1);
 	}
 
+	ei->is_freed = 1;
+
 	list_del_rcu(&ei->list);
 	list_add_tail(&ei->del_list, head);
 }
diff --git a/fs/tracefs/internal.h b/fs/tracefs/internal.h
index 64fde9490f52..c7d88aaa949f 100644
--- a/fs/tracefs/internal.h
+++ b/fs/tracefs/internal.h
@@ -23,6 +23,7 @@ struct tracefs_inode {
  * @d_parent:   pointer to the parent's dentry
  * @d_children: The array of dentries to represent the files when created
  * @data:	The private data to pass to the callbacks
+ * @is_freed:	Flag set if the eventfs is on its way to be freed
  * @nr_entries: The number of items in @entries
  */
 struct eventfs_inode {
@@ -38,14 +39,13 @@ struct eventfs_inode {
 	 * Union - used for deletion
 	 * @del_list:	list of eventfs_inode to delete
 	 * @rcu:	eventfs_inode to delete in RCU
-	 * @is_freed:	node is freed if one of the above is set
 	 */
 	union {
 		struct list_head	del_list;
 		struct rcu_head		rcu;
-		unsigned long		is_freed;
 	};
-	int				nr_entries;
+	unsigned int			is_freed:1;
+	unsigned int			nr_entries:31;
 };
 
 static inline struct tracefs_inode *get_tracefs(const struct inode *inode)
-- 
2.42.0

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v5 2/7] eventfs: Have a free_ei() that just frees the eventfs_inode
  2023-10-31 22:33 [PATCH v5 0/7] eventfs: Fixing dynamic creation Steven Rostedt
  2023-10-31 22:33 ` [PATCH v5 1/7] eventfs: Remove "is_freed" union with rcu head Steven Rostedt
@ 2023-10-31 22:33 ` Steven Rostedt
  2023-11-01 12:50   ` Masami Hiramatsu
  2023-10-31 22:33 ` [PATCH v5 3/7] eventfs: Test for ei->is_freed when accessing ei->dentry Steven Rostedt
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 12+ messages in thread
From: Steven Rostedt @ 2023-10-31 22:33 UTC (permalink / raw)
  To: linux-kernel, linux-trace-kernel
  Cc: Masami Hiramatsu, Mark Rutland, Ajay Kaher, Andrew Morton

From: "Steven Rostedt (Google)" <rostedt@goodmis.org>

As the eventfs_inode is freed in two different locations, make a helper
function free_ei() to make sure all the allocated fields of the
eventfs_inode is freed.

This requires renaming the existing free_ei() which is called by the srcu
handler to free_rcu_ei() and have free_ei() just do the freeing, where
free_rcu_ei() will call it.

Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
Changse since v4: https://lore.kernel.org/all/20231031193428.133533311@goodmis.org/T/#u

- Rebased to this patch series

 fs/tracefs/event_inode.c | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
index 2c2c75b2ad73..0331d9bd568b 100644
--- a/fs/tracefs/event_inode.c
+++ b/fs/tracefs/event_inode.c
@@ -129,6 +129,13 @@ static struct dentry *create_dir(const char *name, struct dentry *parent)
 	return eventfs_end_creating(dentry);
 }
 
+static void free_ei(struct eventfs_inode *ei)
+{
+	kfree_const(ei->name);
+	kfree(ei->d_children);
+	kfree(ei);
+}
+
 /**
  * eventfs_set_ei_status_free - remove the dentry reference from an eventfs_inode
  * @ti: the tracefs_inode of the dentry
@@ -168,9 +175,7 @@ void eventfs_set_ei_status_free(struct tracefs_inode *ti, struct dentry *dentry)
 			eventfs_remove_dir(ei_child);
 		}
 
-		kfree_const(ei->name);
-		kfree(ei->d_children);
-		kfree(ei);
+		free_ei(ei);
 		return;
 	}
 
@@ -784,13 +789,11 @@ struct eventfs_inode *eventfs_create_events_dir(const char *name, struct dentry
 	return ERR_PTR(-ENOMEM);
 }
 
-static void free_ei(struct rcu_head *head)
+static void free_rcu_ei(struct rcu_head *head)
 {
 	struct eventfs_inode *ei = container_of(head, struct eventfs_inode, rcu);
 
-	kfree_const(ei->name);
-	kfree(ei->d_children);
-	kfree(ei);
+	free_ei(ei);
 }
 
 /**
@@ -883,7 +886,7 @@ void eventfs_remove_dir(struct eventfs_inode *ei)
 		for (i = 0; i < ei->nr_entries; i++)
 			unhook_dentry(&ei->d_children[i], &dentry_list);
 		unhook_dentry(&ei->dentry, &dentry_list);
-		call_srcu(&eventfs_srcu, &ei->rcu, free_ei);
+		call_srcu(&eventfs_srcu, &ei->rcu, free_rcu_ei);
 	}
 	mutex_unlock(&eventfs_mutex);
 
-- 
2.42.0

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v5 3/7] eventfs: Test for ei->is_freed when accessing ei->dentry
  2023-10-31 22:33 [PATCH v5 0/7] eventfs: Fixing dynamic creation Steven Rostedt
  2023-10-31 22:33 ` [PATCH v5 1/7] eventfs: Remove "is_freed" union with rcu head Steven Rostedt
  2023-10-31 22:33 ` [PATCH v5 2/7] eventfs: Have a free_ei() that just frees the eventfs_inode Steven Rostedt
@ 2023-10-31 22:33 ` Steven Rostedt
  2023-10-31 22:33 ` [PATCH v5 4/7] eventfs: Save ownership and mode Steven Rostedt
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: Steven Rostedt @ 2023-10-31 22:33 UTC (permalink / raw)
  To: linux-kernel, linux-trace-kernel
  Cc: Masami Hiramatsu, Mark Rutland, Ajay Kaher, Andrew Morton,
	Linux Kernel Functional Testing, Naresh Kamboju, Beau Belgrave

From: "Steven Rostedt (Google)" <rostedt@goodmis.org>

The eventfs_inode (ei) is protected by SRCU, but the ei->dentry is not. It
is protected by the eventfs_mutex. Anytime the eventfs_mutex is released,
and access to the ei->dentry needs to be done, it should first check if
ei->is_freed is set under the eventfs_mutex. If it is, then the ei->dentry
is invalid and must not be used. The ei->dentry must only be accessed
under the eventfs_mutex and after checking if ei->is_freed is set.

When the ei is being freed, it will (under the eventfs_mutex) set is_freed
and at the same time move the dentry to a free list to be cleared after
the eventfs_mutex is released. This means that any access to the
ei->dentry must check first if ei->is_freed is set, because if it is, then
the dentry is on its way to be freed.

Also add comments to describe this better.

Link: https://lore.kernel.org/all/CA+G9fYt6pY+tMZEOg=SoEywQOe19fGP3uR15SGowkdK+_X85Cg@mail.gmail.com/
Link: https://lore.kernel.org/all/CA+G9fYuDP3hVQ3t7FfrBAjd_WFVSurMgCepTxunSJf=MTe=6aA@mail.gmail.com/

Fixes: 5790b1fb3d672 ("eventfs: Remove eventfs_file and just use eventfs_inode")
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Reported-by: Beau Belgrave <beaub@linux.microsoft.com>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Tested-by: Beau Belgrave <beaub@linux.microsoft.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
Changes since v4: https://lkml.kernel.org/r/20231031193428.347017769@goodmis.org

- Rebased to this series

 fs/tracefs/event_inode.c | 45 ++++++++++++++++++++++++++++++++++------
 fs/tracefs/internal.h    |  3 ++-
 2 files changed, 41 insertions(+), 7 deletions(-)

diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
index 0331d9bd568b..b28f240bbb6d 100644
--- a/fs/tracefs/event_inode.c
+++ b/fs/tracefs/event_inode.c
@@ -24,7 +24,20 @@
 #include <linux/delay.h>
 #include "internal.h"
 
+/*
+ * eventfs_mutex protects the eventfs_inode (ei) dentry. Any access
+ * to the ei->dentry must be done under this mutex and after checking
+ * if ei->is_freed is not set. The ei->dentry is released under the
+ * mutex at the same time ei->is_freed is set. If ei->is_freed is set
+ * then the ei->dentry is invalid.
+ */
 static DEFINE_MUTEX(eventfs_mutex);
+
+/*
+ * The eventfs_inode (ei) itself is protected by SRCU. It is released from
+ * its parent's list and will have is_freed set (under eventfs_mutex).
+ * After the SRCU grace period is over, the ei may be freed.
+ */
 DEFINE_STATIC_SRCU(eventfs_srcu);
 
 static struct dentry *eventfs_root_lookup(struct inode *dir,
@@ -239,6 +252,10 @@ create_file_dentry(struct eventfs_inode *ei, struct dentry **e_dentry,
 	bool invalidate = false;
 
 	mutex_lock(&eventfs_mutex);
+	if (ei->is_freed) {
+		mutex_unlock(&eventfs_mutex);
+		return NULL;
+	}
 	/* If the e_dentry already has a dentry, use it */
 	if (*e_dentry) {
 		/* lookup does not need to up the ref count */
@@ -312,6 +329,8 @@ static void eventfs_post_create_dir(struct eventfs_inode *ei)
 	struct eventfs_inode *ei_child;
 	struct tracefs_inode *ti;
 
+	lockdep_assert_held(&eventfs_mutex);
+
 	/* srcu lock already held */
 	/* fill parent-child relation */
 	list_for_each_entry_srcu(ei_child, &ei->children, list,
@@ -325,6 +344,7 @@ static void eventfs_post_create_dir(struct eventfs_inode *ei)
 
 /**
  * create_dir_dentry - Create a directory dentry for the eventfs_inode
+ * @pei: The eventfs_inode parent of ei.
  * @ei: The eventfs_inode to create the directory for
  * @parent: The dentry of the parent of this directory
  * @lookup: True if this is called by the lookup code
@@ -332,12 +352,17 @@ static void eventfs_post_create_dir(struct eventfs_inode *ei)
  * This creates and attaches a directory dentry to the eventfs_inode @ei.
  */
 static struct dentry *
-create_dir_dentry(struct eventfs_inode *ei, struct dentry *parent, bool lookup)
+create_dir_dentry(struct eventfs_inode *pei, struct eventfs_inode *ei,
+		  struct dentry *parent, bool lookup)
 {
 	bool invalidate = false;
 	struct dentry *dentry = NULL;
 
 	mutex_lock(&eventfs_mutex);
+	if (pei->is_freed || ei->is_freed) {
+		mutex_unlock(&eventfs_mutex);
+		return NULL;
+	}
 	if (ei->dentry) {
 		/* If the dentry already has a dentry, use it */
 		dentry = ei->dentry;
@@ -440,7 +465,7 @@ static struct dentry *eventfs_root_lookup(struct inode *dir,
 	 */
 	mutex_lock(&eventfs_mutex);
 	ei = READ_ONCE(ti->private);
-	if (ei)
+	if (ei && !ei->is_freed)
 		ei_dentry = READ_ONCE(ei->dentry);
 	mutex_unlock(&eventfs_mutex);
 
@@ -454,7 +479,7 @@ static struct dentry *eventfs_root_lookup(struct inode *dir,
 		if (strcmp(ei_child->name, name) != 0)
 			continue;
 		ret = simple_lookup(dir, dentry, flags);
-		create_dir_dentry(ei_child, ei_dentry, true);
+		create_dir_dentry(ei, ei_child, ei_dentry, true);
 		created = true;
 		break;
 	}
@@ -588,7 +613,7 @@ static int dcache_dir_open_wrapper(struct inode *inode, struct file *file)
 
 	list_for_each_entry_srcu(ei_child, &ei->children, list,
 				 srcu_read_lock_held(&eventfs_srcu)) {
-		d = create_dir_dentry(ei_child, parent, false);
+		d = create_dir_dentry(ei, ei_child, parent, false);
 		if (d) {
 			ret = add_dentries(&dentries, d, cnt);
 			if (ret < 0)
@@ -705,12 +730,20 @@ struct eventfs_inode *eventfs_create_dir(const char *name, struct eventfs_inode
 	ei->nr_entries = size;
 	ei->data = data;
 	INIT_LIST_HEAD(&ei->children);
+	INIT_LIST_HEAD(&ei->list);
 
 	mutex_lock(&eventfs_mutex);
-	list_add_tail(&ei->list, &parent->children);
-	ei->d_parent = parent->dentry;
+	if (!parent->is_freed) {
+		list_add_tail(&ei->list, &parent->children);
+		ei->d_parent = parent->dentry;
+	}
 	mutex_unlock(&eventfs_mutex);
 
+	/* Was the parent freed? */
+	if (list_empty(&ei->list)) {
+		free_ei(ei);
+		ei = NULL;
+	}
 	return ei;
 }
 
diff --git a/fs/tracefs/internal.h b/fs/tracefs/internal.h
index c7d88aaa949f..5a98e87dd3d1 100644
--- a/fs/tracefs/internal.h
+++ b/fs/tracefs/internal.h
@@ -24,6 +24,7 @@ struct tracefs_inode {
  * @d_children: The array of dentries to represent the files when created
  * @data:	The private data to pass to the callbacks
  * @is_freed:	Flag set if the eventfs is on its way to be freed
+ *                Note if is_freed is set, then dentry is corrupted.
  * @nr_entries: The number of items in @entries
  */
 struct eventfs_inode {
@@ -31,7 +32,7 @@ struct eventfs_inode {
 	const struct eventfs_entry	*entries;
 	const char			*name;
 	struct list_head		children;
-	struct dentry			*dentry;
+	struct dentry			*dentry; /* Check is_freed to access */
 	struct dentry			*d_parent;
 	struct dentry			**d_children;
 	void				*data;
-- 
2.42.0

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v5 4/7] eventfs: Save ownership and mode
  2023-10-31 22:33 [PATCH v5 0/7] eventfs: Fixing dynamic creation Steven Rostedt
                   ` (2 preceding siblings ...)
  2023-10-31 22:33 ` [PATCH v5 3/7] eventfs: Test for ei->is_freed when accessing ei->dentry Steven Rostedt
@ 2023-10-31 22:33 ` Steven Rostedt
  2023-11-01 23:43   ` Masami Hiramatsu
  2023-10-31 22:33 ` [PATCH v5 5/7] eventfs: Hold eventfs_mutex when calling callback functions Steven Rostedt
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 12+ messages in thread
From: Steven Rostedt @ 2023-10-31 22:33 UTC (permalink / raw)
  To: linux-kernel, linux-trace-kernel
  Cc: Masami Hiramatsu, Mark Rutland, Ajay Kaher, Andrew Morton

From: "Steven Rostedt (Google)" <rostedt@goodmis.org>

Now that inodes and dentries are created on the fly, they are also
reclaimed on memory pressure. Since the ownership and file mode are saved
in the inode, if they are freed, any changes to the ownership and mode
will be lost.

To counter this, if the user changes the permissions or ownership, save
them, and when creating the inodes again, restore those changes.

Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
Changes since v4: https://lkml.kernel.org/r/20231031193428.558586557@goodmis.org

- Rebased to this series

 fs/tracefs/event_inode.c | 148 +++++++++++++++++++++++++++++++++++----
 fs/tracefs/internal.h    |  16 +++++
 2 files changed, 151 insertions(+), 13 deletions(-)

diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
index b28f240bbb6d..d1683bf6d316 100644
--- a/fs/tracefs/event_inode.c
+++ b/fs/tracefs/event_inode.c
@@ -40,6 +40,15 @@ static DEFINE_MUTEX(eventfs_mutex);
  */
 DEFINE_STATIC_SRCU(eventfs_srcu);
 
+/* Mode is unsigned short, use the upper bits for flags */
+enum {
+	EVENTFS_SAVE_MODE	= BIT(16),
+	EVENTFS_SAVE_UID	= BIT(17),
+	EVENTFS_SAVE_GID	= BIT(18),
+};
+
+#define EVENTFS_MODE_MASK	(EVENTFS_SAVE_MODE - 1)
+
 static struct dentry *eventfs_root_lookup(struct inode *dir,
 					  struct dentry *dentry,
 					  unsigned int flags);
@@ -47,8 +56,89 @@ static int dcache_dir_open_wrapper(struct inode *inode, struct file *file);
 static int dcache_readdir_wrapper(struct file *file, struct dir_context *ctx);
 static int eventfs_release(struct inode *inode, struct file *file);
 
+static void update_attr(struct eventfs_attr *attr, struct iattr *iattr)
+{
+	unsigned int ia_valid = iattr->ia_valid;
+
+	if (ia_valid & ATTR_MODE) {
+		attr->mode = (attr->mode & ~EVENTFS_MODE_MASK) |
+			(iattr->ia_mode & EVENTFS_MODE_MASK) |
+			EVENTFS_SAVE_MODE;
+	}
+	if (ia_valid & ATTR_UID) {
+		attr->mode |= EVENTFS_SAVE_UID;
+		attr->uid = iattr->ia_uid;
+	}
+	if (ia_valid & ATTR_GID) {
+		attr->mode |= EVENTFS_SAVE_GID;
+		attr->gid = iattr->ia_gid;
+	}
+}
+
+static int eventfs_set_attr(struct mnt_idmap *idmap, struct dentry *dentry,
+			    struct iattr *iattr)
+{
+	const struct eventfs_entry *entry;
+	struct eventfs_inode *ei;
+	const char *name;
+	int ret;
+
+	mutex_lock(&eventfs_mutex);
+	ei = dentry->d_fsdata;
+	/* The LSB is set when the eventfs_inode is being freed */
+	if (((unsigned long)ei & 1UL) || ei->is_freed) {
+		/* Do not allow changes if the event is about to be removed. */
+		mutex_unlock(&eventfs_mutex);
+		return -ENODEV;
+	}
+
+	/* Preallocate the children mode array if necessary */
+	if (!(dentry->d_inode->i_mode & S_IFDIR)) {
+		if (!ei->entry_attrs) {
+			ei->entry_attrs = kzalloc(sizeof(*ei->entry_attrs) * ei->nr_entries,
+						  GFP_KERNEL);
+			if (!ei->entry_attrs) {
+				ret = -ENOMEM;
+				goto out;
+			}
+		}
+	}
+
+	ret = simple_setattr(idmap, dentry, iattr);
+	if (ret < 0)
+		goto out;
+
+	/*
+	 * If this is a dir, then update the ei cache, only the file
+	 * mode is saved in the ei->m_children, and the ownership is
+	 * determined by the parent directory.
+	 */
+	if (dentry->d_inode->i_mode & S_IFDIR) {
+		update_attr(&ei->attr, iattr);
+
+	} else {
+		name = dentry->d_name.name;
+
+		for (int i = 0; i < ei->nr_entries; i++) {
+			entry = &ei->entries[i];
+			if (strcmp(name, entry->name) == 0) {
+				update_attr(&ei->entry_attrs[i], iattr);
+				break;
+			}
+		}
+	}
+ out:
+	mutex_unlock(&eventfs_mutex);
+	return ret;
+}
+
 static const struct inode_operations eventfs_root_dir_inode_operations = {
 	.lookup		= eventfs_root_lookup,
+	.setattr	= eventfs_set_attr,
+};
+
+static const struct inode_operations eventfs_file_inode_operations = {
+	.setattr	= eventfs_set_attr,
 };
 
 static const struct file_operations eventfs_file_operations = {
@@ -59,10 +149,30 @@ static const struct file_operations eventfs_file_operations = {
 	.release	= eventfs_release,
 };
 
+static void update_inode_attr(struct inode *inode, struct eventfs_attr *attr, umode_t mode)
+{
+	if (!attr) {
+		inode->i_mode = mode;
+		return;
+	}
+
+	if (attr->mode & EVENTFS_SAVE_MODE)
+		inode->i_mode = attr->mode & EVENTFS_MODE_MASK;
+	else
+		inode->i_mode = mode;
+
+	if (attr->mode & EVENTFS_SAVE_UID)
+		inode->i_uid = attr->uid;
+
+	if (attr->mode & EVENTFS_SAVE_GID)
+		inode->i_gid = attr->gid;
+}
+
 /**
  * create_file - create a file in the tracefs filesystem
  * @name: the name of the file to create.
  * @mode: the permission that the file should have.
+ * @attr: saved attributes changed by user
  * @parent: parent dentry for this file.
  * @data: something that the caller will want to get to later on.
  * @fop: struct file_operations that should be used for this file.
@@ -72,6 +182,7 @@ static const struct file_operations eventfs_file_operations = {
  * call.
  */
 static struct dentry *create_file(const char *name, umode_t mode,
+				  struct eventfs_attr *attr,
 				  struct dentry *parent, void *data,
 				  const struct file_operations *fop)
 {
@@ -95,7 +206,10 @@ static struct dentry *create_file(const char *name, umode_t mode,
 	if (unlikely(!inode))
 		return eventfs_failed_creating(dentry);
 
-	inode->i_mode = mode;
+	/* If the user updated the directory's attributes, use them */
+	update_inode_attr(inode, attr, mode);
+
+	inode->i_op = &eventfs_file_inode_operations;
 	inode->i_fop = fop;
 	inode->i_private = data;
 
@@ -108,19 +222,19 @@ static struct dentry *create_file(const char *name, umode_t mode,
 
 /**
  * create_dir - create a dir in the tracefs filesystem
- * @name: the name of the file to create.
+ * @ei: the eventfs_inode that represents the directory to create
  * @parent: parent dentry for this file.
  *
  * This function will create a dentry for a directory represented by
  * a eventfs_inode.
  */
-static struct dentry *create_dir(const char *name, struct dentry *parent)
+static struct dentry *create_dir(struct eventfs_inode *ei, struct dentry *parent)
 {
 	struct tracefs_inode *ti;
 	struct dentry *dentry;
 	struct inode *inode;
 
-	dentry = eventfs_start_creating(name, parent);
+	dentry = eventfs_start_creating(ei->name, parent);
 	if (IS_ERR(dentry))
 		return dentry;
 
@@ -128,7 +242,9 @@ static struct dentry *create_dir(const char *name, struct dentry *parent)
 	if (unlikely(!inode))
 		return eventfs_failed_creating(dentry);
 
-	inode->i_mode = S_IFDIR | S_IRWXU | S_IRUGO | S_IXUGO;
+	/* If the user updated the directory's attributes, use them */
+	update_inode_attr(inode, &ei->attr, S_IFDIR | S_IRWXU | S_IRUGO | S_IXUGO);
+
 	inode->i_op = &eventfs_root_dir_inode_operations;
 	inode->i_fop = &eventfs_file_operations;
 
@@ -146,6 +262,7 @@ static void free_ei(struct eventfs_inode *ei)
 {
 	kfree_const(ei->name);
 	kfree(ei->d_children);
+	kfree(ei->entry_attrs);
 	kfree(ei);
 }
 
@@ -231,7 +348,7 @@ void eventfs_set_ei_status_free(struct tracefs_inode *ti, struct dentry *dentry)
 /**
  * create_file_dentry - create a dentry for a file of an eventfs_inode
  * @ei: the eventfs_inode that the file will be created under
- * @e_dentry: a pointer to the d_children[] of the @ei
+ * @idx: the index into the d_children[] of the @ei
  * @parent: The parent dentry of the created file.
  * @name: The name of the file to create
  * @mode: The mode of the file.
@@ -244,10 +361,12 @@ void eventfs_set_ei_status_free(struct tracefs_inode *ti, struct dentry *dentry)
  * just do a dget() on it and return. Otherwise create the dentry and attach it.
  */
 static struct dentry *
-create_file_dentry(struct eventfs_inode *ei, struct dentry **e_dentry,
+create_file_dentry(struct eventfs_inode *ei, int idx,
 		   struct dentry *parent, const char *name, umode_t mode, void *data,
 		   const struct file_operations *fops, bool lookup)
 {
+	struct eventfs_attr *attr = NULL;
+	struct dentry **e_dentry = &ei->d_children[idx];
 	struct dentry *dentry;
 	bool invalidate = false;
 
@@ -264,13 +383,18 @@ create_file_dentry(struct eventfs_inode *ei, struct dentry **e_dentry,
 		mutex_unlock(&eventfs_mutex);
 		return *e_dentry;
 	}
+
+	/* ei->entry_attrs are protected by SRCU */
+	if (ei->entry_attrs)
+		attr = &ei->entry_attrs[idx];
+
 	mutex_unlock(&eventfs_mutex);
 
 	/* The lookup already has the parent->d_inode locked */
 	if (!lookup)
 		inode_lock(parent->d_inode);
 
-	dentry = create_file(name, mode, parent, data, fops);
+	dentry = create_file(name, mode, attr, parent, data, fops);
 
 	if (!lookup)
 		inode_unlock(parent->d_inode);
@@ -378,7 +502,7 @@ create_dir_dentry(struct eventfs_inode *pei, struct eventfs_inode *ei,
 	if (!lookup)
 		inode_lock(parent->d_inode);
 
-	dentry = create_dir(ei->name, parent);
+	dentry = create_dir(ei, parent);
 
 	if (!lookup)
 		inode_unlock(parent->d_inode);
@@ -495,8 +619,7 @@ static struct dentry *eventfs_root_lookup(struct inode *dir,
 			if (r <= 0)
 				continue;
 			ret = simple_lookup(dir, dentry, flags);
-			create_file_dentry(ei, &ei->d_children[i],
-					   ei_dentry, name, mode, cdata,
+			create_file_dentry(ei, i, ei_dentry, name, mode, cdata,
 					   fops, true);
 			break;
 		}
@@ -629,8 +752,7 @@ static int dcache_dir_open_wrapper(struct inode *inode, struct file *file)
 		r = entry->callback(name, &mode, &cdata, &fops);
 		if (r <= 0)
 			continue;
-		d = create_file_dentry(ei, &ei->d_children[i],
-				       parent, name, mode, cdata, fops, false);
+		d = create_file_dentry(ei, i, parent, name, mode, cdata, fops, false);
 		if (d) {
 			ret = add_dentries(&dentries, d, cnt);
 			if (ret < 0)
diff --git a/fs/tracefs/internal.h b/fs/tracefs/internal.h
index 5a98e87dd3d1..5f60bcd69289 100644
--- a/fs/tracefs/internal.h
+++ b/fs/tracefs/internal.h
@@ -13,6 +13,18 @@ struct tracefs_inode {
 	struct inode            vfs_inode;
 };
 
+/*
+ * struct eventfs_attr - cache the mode and ownership of a eventfs entry
+ * @mode:	saved mode plus flags of what is saved
+ * @uid:	saved uid if changed
+ * @gid:	saved gid if changed
+ */
+struct eventfs_attr {
+	int				mode;
+	kuid_t				uid;
+	kgid_t				gid;
+};
+
 /*
  * struct eventfs_inode - hold the properties of the eventfs directories.
  * @list:	link list into the parent directory
@@ -22,6 +34,8 @@ struct tracefs_inode {
  * @dentry:     the dentry of the directory
  * @d_parent:   pointer to the parent's dentry
  * @d_children: The array of dentries to represent the files when created
+ * @entry_attrs: Saved mode and ownership of the @d_children
+ * @attr:	Saved mode and ownership of eventfs_inode itself
  * @data:	The private data to pass to the callbacks
  * @is_freed:	Flag set if the eventfs is on its way to be freed
  *                Note if is_freed is set, then dentry is corrupted.
@@ -35,6 +49,8 @@ struct eventfs_inode {
 	struct dentry			*dentry; /* Check is_freed to access */
 	struct dentry			*d_parent;
 	struct dentry			**d_children;
+	struct eventfs_attr		*entry_attrs;
+	struct eventfs_attr		attr;
 	void				*data;
 	/*
 	 * Union - used for deletion
-- 
2.42.0

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v5 5/7] eventfs: Hold eventfs_mutex when calling callback functions
  2023-10-31 22:33 [PATCH v5 0/7] eventfs: Fixing dynamic creation Steven Rostedt
                   ` (3 preceding siblings ...)
  2023-10-31 22:33 ` [PATCH v5 4/7] eventfs: Save ownership and mode Steven Rostedt
@ 2023-10-31 22:33 ` Steven Rostedt
  2023-10-31 22:33 ` [PATCH v5 6/7] eventfs: Delete eventfs_inode when the last dentry is freed Steven Rostedt
  2023-10-31 22:33 ` [PATCH v5 7/7] eventfs: Remove special processing of dput() of events directory Steven Rostedt
  6 siblings, 0 replies; 12+ messages in thread
From: Steven Rostedt @ 2023-10-31 22:33 UTC (permalink / raw)
  To: linux-kernel, linux-trace-kernel
  Cc: Masami Hiramatsu, Mark Rutland, Ajay Kaher, Andrew Morton,
	Linux Kernel Functional Testing, Naresh Kamboju

From: "Steven Rostedt (Google)" <rostedt@goodmis.org>

The callback function that is used to create inodes and dentries is not
protected by anything and the data that is passed to it could become
stale. After eventfs_remove_dir() is called by the tracing system, it is
free to remove the events that are associated to that directory.
Unfortunately, that means the callbacks must not be called after that.

     CPU0				CPU1
     ----				----
 eventfs_root_lookup() {
				 eventfs_remove_dir() {
				      mutex_lock(&event_mutex);
				      ei->is_freed = set;
				      mutex_unlock(&event_mutex);
				 }
				 kfree(event_call);

    for (...) {
      entry = &ei->entries[i];
      r = entry->callback() {
          call = data;		// call == event_call above
          if (call->flags ...)

 [ USE AFTER FREE BUG ]

The safest way to protect this is to wrap the callback with:

 mutex_lock(&eventfs_mutex);
 if (!ei->is_freed)
     r = entry->callback();
 else
     r = -1;
 mutex_unlock(&eventfs_mutex);

This will make sure that the callback will not be called after it is
freed. But now it needs to be known that the callback is called while
holding internal eventfs locks, and that it must not call back into the
eventfs / tracefs system. There's no reason it should anyway, but document
that as well.

Link: https://lore.kernel.org/all/CA+G9fYu9GOEbD=rR5eMR-=HJ8H6rMsbzDC2ZY5=Y50WpWAE7_Q@mail.gmail.com/

Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
Changes since v1/v4: https://lore.kernel.org/linux-trace-kernel/20231030114047.759c7bdf@gandalf.local.home

- Rebased to this series (which is was v4)

 fs/tracefs/event_inode.c | 22 ++++++++++++++++++--
 include/linux/tracefs.h  | 43 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 63 insertions(+), 2 deletions(-)

diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
index d1683bf6d316..87a8aaeda231 100644
--- a/fs/tracefs/event_inode.c
+++ b/fs/tracefs/event_inode.c
@@ -615,7 +615,13 @@ static struct dentry *eventfs_root_lookup(struct inode *dir,
 		entry = &ei->entries[i];
 		if (strcmp(name, entry->name) == 0) {
 			void *cdata = data;
-			r = entry->callback(name, &mode, &cdata, &fops);
+			mutex_lock(&eventfs_mutex);
+			/* If ei->is_freed, then the event itself may be too */
+			if (!ei->is_freed)
+				r = entry->callback(name, &mode, &cdata, &fops);
+			else
+				r = -1;
+			mutex_unlock(&eventfs_mutex);
 			if (r <= 0)
 				continue;
 			ret = simple_lookup(dir, dentry, flags);
@@ -749,7 +755,13 @@ static int dcache_dir_open_wrapper(struct inode *inode, struct file *file)
 		void *cdata = data;
 		entry = &ei->entries[i];
 		name = entry->name;
-		r = entry->callback(name, &mode, &cdata, &fops);
+		mutex_lock(&eventfs_mutex);
+		/* If ei->is_freed, then the event itself may be too */
+		if (!ei->is_freed)
+			r = entry->callback(name, &mode, &cdata, &fops);
+		else
+			r = -1;
+		mutex_unlock(&eventfs_mutex);
 		if (r <= 0)
 			continue;
 		d = create_file_dentry(ei, i, parent, name, mode, cdata, fops, false);
@@ -819,6 +831,10 @@ static int dcache_readdir_wrapper(struct file *file, struct dir_context *ctx)
  *   data = A pointer to @data, and the callback may replace it, which will
  *         cause the file created to pass the new data to the open() call.
  *   fops = the fops to use for the created file.
+ *
+ * NB. @callback is called while holding internal locks of the eventfs
+ *     system. The callback must not call any code that might also call into
+ *     the tracefs or eventfs system or it will risk creating a deadlock.
  */
 struct eventfs_inode *eventfs_create_dir(const char *name, struct eventfs_inode *parent,
 					 const struct eventfs_entry *entries,
@@ -878,6 +894,8 @@ struct eventfs_inode *eventfs_create_dir(const char *name, struct eventfs_inode
  * @data: The default data to pass to the files (an entry may override it).
  *
  * This function creates the top of the trace event directory.
+ *
+ * See eventfs_create_dir() for use of @entries.
  */
 struct eventfs_inode *eventfs_create_events_dir(const char *name, struct dentry *parent,
 						const struct eventfs_entry *entries,
diff --git a/include/linux/tracefs.h b/include/linux/tracefs.h
index 13359b1a35d1..7a5fe17b6bf9 100644
--- a/include/linux/tracefs.h
+++ b/include/linux/tracefs.h
@@ -23,9 +23,52 @@ struct file_operations;
 
 struct eventfs_file;
 
+/**
+ * eventfs_callback - A callback function to create dynamic files in eventfs
+ * @name: The name of the file that is to be created
+ * @mode: return the file mode for the file (RW access, etc)
+ * @data: data to pass to the created file ops
+ * @fops: the file operations of the created file
+ *
+ * The evetnfs files are dynamically created. The struct eventfs_entry array
+ * is passed to eventfs_create_dir() or eventfs_create_events_dir() that will
+ * be used to create the files within those directories. When a lookup
+ * or access to a file within the directory is made, the struct eventfs_entry
+ * array is used to find a callback() with the matching name that is being
+ * referenced (for lookups, the entire array is iterated and each callback
+ * will be called).
+ *
+ * The callback will be called with @name for the name of the file to create.
+ * The callback can return less than 1 to indicate  that no file should be
+ * created.
+ *
+ * If a file is to be created, then @mode should be populated with the file
+ * mode (permissions) for which the file is created for. This would be
+ * used to set the created inode i_mode field.
+ *
+ * The @data should be set to the data passed to the other file operations
+ * (read, write, etc). Note, @data will also point to the data passed in
+ * to eventfs_create_dir() or eventfs_create_events_dir(), but the callback
+ * can replace the data if it chooses to. Otherwise, the original data
+ * will be used for the file operation functions.
+ *
+ * The @fops should be set to the file operations that will be used to create
+ * the inode.
+ *
+ * NB. This callback is called while holding internal locks of the eventfs
+ *     system. The callback must not call any code that might also call into
+ *     the tracefs or eventfs system or it will risk creating a deadlock.
+ */
 typedef int (*eventfs_callback)(const char *name, umode_t *mode, void **data,
 				const struct file_operations **fops);
 
+/**
+ * struct eventfs_entry - dynamically created eventfs file call back handler
+ * @name:	Then name of the dynamic file in an eventfs directory
+ * @callback:	The callback to get the fops of the file when it is created
+ *
+ * See evenfs_callback() typedef for how to set up @callback.
+ */
 struct eventfs_entry {
 	const char			*name;
 	eventfs_callback		callback;
-- 
2.42.0

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v5 6/7] eventfs: Delete eventfs_inode when the last dentry is freed
  2023-10-31 22:33 [PATCH v5 0/7] eventfs: Fixing dynamic creation Steven Rostedt
                   ` (4 preceding siblings ...)
  2023-10-31 22:33 ` [PATCH v5 5/7] eventfs: Hold eventfs_mutex when calling callback functions Steven Rostedt
@ 2023-10-31 22:33 ` Steven Rostedt
  2023-10-31 22:33 ` [PATCH v5 7/7] eventfs: Remove special processing of dput() of events directory Steven Rostedt
  6 siblings, 0 replies; 12+ messages in thread
From: Steven Rostedt @ 2023-10-31 22:33 UTC (permalink / raw)
  To: linux-kernel, linux-trace-kernel
  Cc: Masami Hiramatsu, Mark Rutland, Ajay Kaher, Andrew Morton, stable

From: "Steven Rostedt (Google)" <rostedt@goodmis.org>

There exists a race between holding a reference of an eventfs_inode dentry
and the freeing of the eventfs_inode. If user space has a dentry held long
enough, it may still be able to access the dentry's eventfs_inode after it
has been freed.

To prevent this, have he eventfs_inode freed via the last dput() (or via
RCU if the eventfs_inode does not have a dentry).

This means reintroducing the eventfs_inode del_list field at a temporary
place to put the eventfs_inode. It needs to mark it as freed (via the
list) but also must invalidate the dentry immediately as the return from
eventfs_remove_dir() expects that they are. But the dentry invalidation
must not be called under the eventfs_mutex, so it must be done after the
eventfs_inode is marked as free (put on a deletion list).

Cc: stable@vger.kernel.org
Fixes: 5bdcd5f5331a2 ("eventfs: Implement removal of meta data from eventfs")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 fs/tracefs/event_inode.c | 184 +++++++++++++++++----------------------
 fs/tracefs/internal.h    |   2 +
 2 files changed, 84 insertions(+), 102 deletions(-)

diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
index 87a8aaeda231..827ca152cfbe 100644
--- a/fs/tracefs/event_inode.c
+++ b/fs/tracefs/event_inode.c
@@ -85,8 +85,7 @@ static int eventfs_set_attr(struct mnt_idmap *idmap, struct dentry *dentry,
 
 	mutex_lock(&eventfs_mutex);
 	ei = dentry->d_fsdata;
-	/* The LSB is set when the eventfs_inode is being freed */
-	if (((unsigned long)ei & 1UL) || ei->is_freed) {
+	if (ei->is_freed) {
 		/* Do not allow changes if the event is about to be removed. */
 		mutex_unlock(&eventfs_mutex);
 		return -ENODEV;
@@ -276,35 +275,17 @@ static void free_ei(struct eventfs_inode *ei)
 void eventfs_set_ei_status_free(struct tracefs_inode *ti, struct dentry *dentry)
 {
 	struct tracefs_inode *ti_parent;
-	struct eventfs_inode *ei_child, *tmp;
 	struct eventfs_inode *ei;
 	int i;
 
 	/* The top level events directory may be freed by this */
 	if (unlikely(ti->flags & TRACEFS_EVENT_TOP_INODE)) {
-		LIST_HEAD(ef_del_list);
-
 		mutex_lock(&eventfs_mutex);
-
 		ei = ti->private;
-
-		/* Record all the top level files */
-		list_for_each_entry_srcu(ei_child, &ei->children, list,
-					 lockdep_is_held(&eventfs_mutex)) {
-			list_add_tail(&ei_child->del_list, &ef_del_list);
-		}
-
 		/* Nothing should access this, but just in case! */
 		ti->private = NULL;
-
 		mutex_unlock(&eventfs_mutex);
 
-		/* Now safely free the top level files and their children */
-		list_for_each_entry_safe(ei_child, tmp, &ef_del_list, del_list) {
-			list_del(&ei_child->del_list);
-			eventfs_remove_dir(ei_child);
-		}
-
 		free_ei(ei);
 		return;
 	}
@@ -319,14 +300,6 @@ void eventfs_set_ei_status_free(struct tracefs_inode *ti, struct dentry *dentry)
 	if (!ei)
 		goto out;
 
-	/*
-	 * If ei was freed, then the LSB bit is set for d_fsdata.
-	 * But this should not happen, as it should still have a
-	 * ref count that prevents it. Warn in case it does.
-	 */
-	if (WARN_ON_ONCE((unsigned long)ei & 1))
-		goto out;
-
 	/* This could belong to one of the files of the ei */
 	if (ei->dentry != dentry) {
 		for (i = 0; i < ei->nr_entries; i++) {
@@ -336,6 +309,8 @@ void eventfs_set_ei_status_free(struct tracefs_inode *ti, struct dentry *dentry)
 		if (WARN_ON_ONCE(i == ei->nr_entries))
 			goto out;
 		ei->d_children[i] = NULL;
+	} else if (ei->is_freed) {
+		free_ei(ei);
 	} else {
 		ei->dentry = NULL;
 	}
@@ -962,13 +937,79 @@ struct eventfs_inode *eventfs_create_events_dir(const char *name, struct dentry
 	return ERR_PTR(-ENOMEM);
 }
 
+static LLIST_HEAD(free_list);
+
+static void eventfs_workfn(struct work_struct *work)
+{
+        struct eventfs_inode *ei, *tmp;
+        struct llist_node *llnode;
+
+	llnode = llist_del_all(&free_list);
+        llist_for_each_entry_safe(ei, tmp, llnode, llist) {
+		/* This dput() matches the dget() from unhook_dentry() */
+		for (int i = 0; i < ei->nr_entries; i++) {
+			if (ei->d_children[i])
+				dput(ei->d_children[i]);
+		}
+		/* This should only get here if it had a dentry */
+		if (!WARN_ON_ONCE(!ei->dentry))
+			dput(ei->dentry);
+        }
+}
+
+static DECLARE_WORK(eventfs_work, eventfs_workfn);
+
 static void free_rcu_ei(struct rcu_head *head)
 {
 	struct eventfs_inode *ei = container_of(head, struct eventfs_inode, rcu);
 
+	if (ei->dentry) {
+		/* Do not free the ei until all references of dentry are gone */
+		if (llist_add(&ei->llist, &free_list))
+			queue_work(system_unbound_wq, &eventfs_work);
+		return;
+	}
+
+	/* If the ei doesn't have a dentry, neither should its children */
+	for (int i = 0; i < ei->nr_entries; i++) {
+		WARN_ON_ONCE(ei->d_children[i]);
+	}
+
 	free_ei(ei);
 }
 
+static void unhook_dentry(struct dentry *dentry)
+{
+	struct inode *inode;
+
+	if (!dentry)
+		return;
+
+	/* Keep the dentry from being freed yet (see eventfs_workfn()) */
+	dget(dentry);
+
+	inode = dentry->d_inode;
+	inode_lock(inode);
+	if (d_is_dir(dentry))
+		inode->i_flags |= S_DEAD;
+	clear_nlink(inode);
+	inode_unlock(inode);
+
+	inode = dentry->d_parent->d_inode;
+	inode_lock(inode);
+
+	/* Remove its visibility */
+	d_invalidate(dentry);
+	if (d_is_dir(dentry))
+		fsnotify_rmdir(inode, dentry);
+	else
+		fsnotify_unlink(inode, dentry);
+
+	if (d_is_dir(dentry))
+		drop_nlink(inode);
+	inode_unlock(inode);
+}
+
 /**
  * eventfs_remove_rec - remove eventfs dir or file from list
  * @ei: eventfs_inode to be removed.
@@ -1006,34 +1047,6 @@ static void eventfs_remove_rec(struct eventfs_inode *ei, struct list_head *head,
 	list_add_tail(&ei->del_list, head);
 }
 
-static void unhook_dentry(struct dentry **dentry, struct dentry **list)
-{
-	if (*dentry) {
-		unsigned long ptr = (unsigned long)*list;
-
-		/* Keep the dentry from being freed yet */
-		dget(*dentry);
-
-		/*
-		 * Paranoid: The dget() above should prevent the dentry
-		 * from being freed and calling eventfs_set_ei_status_free().
-		 * But just in case, set the link list LSB pointer to 1
-		 * and have eventfs_set_ei_status_free() check that to
-		 * make sure that if it does happen, it will not think
-		 * the d_fsdata is an eventfs_inode.
-		 *
-		 * For this to work, no eventfs_inode should be allocated
-		 * on a odd space, as the ef should always be allocated
-		 * to be at least word aligned. Check for that too.
-		 */
-		WARN_ON_ONCE(ptr & 1);
-
-		(*dentry)->d_fsdata = (void *)(ptr | 1);
-		*list = *dentry;
-		*dentry = NULL;
-	}
-}
-
 /**
  * eventfs_remove_dir - remove eventfs dir or file from list
  * @ei: eventfs_inode to be removed.
@@ -1044,61 +1057,28 @@ void eventfs_remove_dir(struct eventfs_inode *ei)
 {
 	struct eventfs_inode *tmp;
 	LIST_HEAD(ei_del_list);
-	struct dentry *dentry_list = NULL;
-	struct dentry *dentry;
-	struct inode *inode;
-	int i;
 
 	if (!ei)
 		return;
 
+	/*
+	 * Move the deleted eventfs_inodes onto the ei_del_list
+	 * which will also set the is_freed value. Note, this has to be
+	 * done under the eventfs_mutex, but the deletions of
+	 * the dentries must be done outside the eventfs_mutex.
+	 * Hence moving them to this temporary list.
+	 */
 	mutex_lock(&eventfs_mutex);
 	eventfs_remove_rec(ei, &ei_del_list, 0);
-
-	list_for_each_entry_safe(ei, tmp, &ei_del_list, del_list) {
-		for (i = 0; i < ei->nr_entries; i++)
-			unhook_dentry(&ei->d_children[i], &dentry_list);
-		unhook_dentry(&ei->dentry, &dentry_list);
-		call_srcu(&eventfs_srcu, &ei->rcu, free_rcu_ei);
-	}
 	mutex_unlock(&eventfs_mutex);
 
-	while (dentry_list) {
-		unsigned long ptr;
-
-		dentry = dentry_list;
-		ptr = (unsigned long)dentry->d_fsdata & ~1UL;
-		dentry_list = (struct dentry *)ptr;
-		dentry->d_fsdata = NULL;
-
-		inode = dentry->d_inode;
-		inode_lock(inode);
-		if (d_is_dir(dentry))
-			inode->i_flags |= S_DEAD;
-		clear_nlink(inode);
-		inode_unlock(inode);
-
-		inode = dentry->d_parent->d_inode;
-		inode_lock(inode);
-
-		/* Remove its visibility */
-		d_invalidate(dentry);
-		if (d_is_dir(dentry))
-			fsnotify_rmdir(inode, dentry);
-		else
-			fsnotify_unlink(inode, dentry);
-
-		if (d_is_dir(dentry))
-			drop_nlink(inode);
-		inode_unlock(inode);
+	list_for_each_entry_safe(ei, tmp, &ei_del_list, del_list) {
 
-		mutex_lock(&eventfs_mutex);
-		/* dentry should now have at least a single reference */
-		WARN_ONCE((int)d_count(dentry) < 1,
-			  "dentry %px (%s) less than one reference (%d) after invalidate\n",
-			  dentry, dentry->d_name.name, d_count(dentry));
-		mutex_unlock(&eventfs_mutex);
-		dput(dentry);
+		for (int i = 0; i < ei->nr_entries; i++)
+			unhook_dentry(ei->d_children[i]);
+		unhook_dentry(ei->dentry);
+		list_del(&ei->del_list);
+		call_srcu(&eventfs_srcu, &ei->rcu, free_rcu_ei);
 	}
 }
 
diff --git a/fs/tracefs/internal.h b/fs/tracefs/internal.h
index 5f60bcd69289..06a1f220b901 100644
--- a/fs/tracefs/internal.h
+++ b/fs/tracefs/internal.h
@@ -54,10 +54,12 @@ struct eventfs_inode {
 	void				*data;
 	/*
 	 * Union - used for deletion
+	 * @llist:	for calling dput() if needed after RCU
 	 * @del_list:	list of eventfs_inode to delete
 	 * @rcu:	eventfs_inode to delete in RCU
 	 */
 	union {
+		struct llist_node	llist;
 		struct list_head	del_list;
 		struct rcu_head		rcu;
 	};
-- 
2.42.0

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v5 7/7] eventfs: Remove special processing of dput() of events directory
  2023-10-31 22:33 [PATCH v5 0/7] eventfs: Fixing dynamic creation Steven Rostedt
                   ` (5 preceding siblings ...)
  2023-10-31 22:33 ` [PATCH v5 6/7] eventfs: Delete eventfs_inode when the last dentry is freed Steven Rostedt
@ 2023-10-31 22:33 ` Steven Rostedt
  6 siblings, 0 replies; 12+ messages in thread
From: Steven Rostedt @ 2023-10-31 22:33 UTC (permalink / raw)
  To: linux-kernel, linux-trace-kernel
  Cc: Masami Hiramatsu, Mark Rutland, Ajay Kaher, Andrew Morton

From: "Steven Rostedt (Google)" <rostedt@goodmis.org>

The top level events directory is no longer special with regards to how it
should be delete. Remove the extra processing for it in
eventfs_set_ei_status_free().

Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 fs/tracefs/event_inode.c | 19 ++-----------------
 1 file changed, 2 insertions(+), 17 deletions(-)

diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
index 827ca152cfbe..7cf8f5ebaae7 100644
--- a/fs/tracefs/event_inode.c
+++ b/fs/tracefs/event_inode.c
@@ -274,28 +274,11 @@ static void free_ei(struct eventfs_inode *ei)
  */
 void eventfs_set_ei_status_free(struct tracefs_inode *ti, struct dentry *dentry)
 {
-	struct tracefs_inode *ti_parent;
 	struct eventfs_inode *ei;
 	int i;
 
-	/* The top level events directory may be freed by this */
-	if (unlikely(ti->flags & TRACEFS_EVENT_TOP_INODE)) {
-		mutex_lock(&eventfs_mutex);
-		ei = ti->private;
-		/* Nothing should access this, but just in case! */
-		ti->private = NULL;
-		mutex_unlock(&eventfs_mutex);
-
-		free_ei(ei);
-		return;
-	}
-
 	mutex_lock(&eventfs_mutex);
 
-	ti_parent = get_tracefs(dentry->d_parent->d_inode);
-	if (!ti_parent || !(ti_parent->flags & TRACEFS_EVENT_INODE))
-		goto out;
-
 	ei = dentry->d_fsdata;
 	if (!ei)
 		goto out;
@@ -920,6 +903,8 @@ struct eventfs_inode *eventfs_create_events_dir(const char *name, struct dentry
 	inode->i_op = &eventfs_root_dir_inode_operations;
 	inode->i_fop = &eventfs_file_operations;
 
+	dentry->d_fsdata = ei;
+
 	/* directory inodes start off with i_nlink == 2 (for "." entry) */
 	inc_nlink(inode);
 	d_instantiate(dentry, inode);
-- 
2.42.0

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v5 1/7] eventfs: Remove "is_freed" union with rcu head
  2023-10-31 22:33 ` [PATCH v5 1/7] eventfs: Remove "is_freed" union with rcu head Steven Rostedt
@ 2023-11-01 12:19   ` Masami Hiramatsu
  0 siblings, 0 replies; 12+ messages in thread
From: Masami Hiramatsu @ 2023-11-01 12:19 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, linux-trace-kernel, Masami Hiramatsu, Mark Rutland,
	Ajay Kaher, Andrew Morton, stable

On Tue, 31 Oct 2023 18:33:27 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> From: "Steven Rostedt (Google)" <rostedt@goodmis.org>
> 
> The eventfs_inode->is_freed was a union with the rcu_head with the
> assumption that when it was on the srcu list the head would contain a
> pointer which would make "is_freed" true. But that was a wrong assumption
> as the rcu head is a single link list where the last element is NULL.
> 
> Instead, split the nr_entries integer so that "is_freed" is one bit and
> the nr_entries is the next 31 bits. As there shouldn't be more than 10
> (currently there's at most 5 to 7 depending on the config), this should
> not be a problem.

Yeah, even 16 bit nr_entries is enough.. (maybe the biggest user is
syscall event group)

Looks good to me.

Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Thank you,


> Cc: stable@vger.kernel.org
> Fixes: 63940449555e7 ("eventfs: Implement eventfs lookup, read, open functions")
> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
> ---
>  fs/tracefs/event_inode.c | 2 ++
>  fs/tracefs/internal.h    | 6 +++---
>  2 files changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
> index 754885dfe71c..2c2c75b2ad73 100644
> --- a/fs/tracefs/event_inode.c
> +++ b/fs/tracefs/event_inode.c
> @@ -824,6 +824,8 @@ static void eventfs_remove_rec(struct eventfs_inode *ei, struct list_head *head,
>  		eventfs_remove_rec(ei_child, head, level + 1);
>  	}
>  
> +	ei->is_freed = 1;
> +
>  	list_del_rcu(&ei->list);
>  	list_add_tail(&ei->del_list, head);
>  }
> diff --git a/fs/tracefs/internal.h b/fs/tracefs/internal.h
> index 64fde9490f52..c7d88aaa949f 100644
> --- a/fs/tracefs/internal.h
> +++ b/fs/tracefs/internal.h
> @@ -23,6 +23,7 @@ struct tracefs_inode {
>   * @d_parent:   pointer to the parent's dentry
>   * @d_children: The array of dentries to represent the files when created
>   * @data:	The private data to pass to the callbacks
> + * @is_freed:	Flag set if the eventfs is on its way to be freed
>   * @nr_entries: The number of items in @entries
>   */
>  struct eventfs_inode {
> @@ -38,14 +39,13 @@ struct eventfs_inode {
>  	 * Union - used for deletion
>  	 * @del_list:	list of eventfs_inode to delete
>  	 * @rcu:	eventfs_inode to delete in RCU
> -	 * @is_freed:	node is freed if one of the above is set
>  	 */
>  	union {
>  		struct list_head	del_list;
>  		struct rcu_head		rcu;
> -		unsigned long		is_freed;
>  	};
> -	int				nr_entries;
> +	unsigned int			is_freed:1;
> +	unsigned int			nr_entries:31;
>  };
>  
>  static inline struct tracefs_inode *get_tracefs(const struct inode *inode)
> -- 
> 2.42.0


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v5 2/7] eventfs: Have a free_ei() that just frees the eventfs_inode
  2023-10-31 22:33 ` [PATCH v5 2/7] eventfs: Have a free_ei() that just frees the eventfs_inode Steven Rostedt
@ 2023-11-01 12:50   ` Masami Hiramatsu
  0 siblings, 0 replies; 12+ messages in thread
From: Masami Hiramatsu @ 2023-11-01 12:50 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, linux-trace-kernel, Masami Hiramatsu, Mark Rutland,
	Ajay Kaher, Andrew Morton

On Tue, 31 Oct 2023 18:33:28 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> From: "Steven Rostedt (Google)" <rostedt@goodmis.org>
> 
> As the eventfs_inode is freed in two different locations, make a helper
> function free_ei() to make sure all the allocated fields of the
> eventfs_inode is freed.
> 
> This requires renaming the existing free_ei() which is called by the srcu
> handler to free_rcu_ei() and have free_ei() just do the freeing, where
> free_rcu_ei() will call it.

Looks good to me.

Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Thanks,

> 
> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
> ---
> Changse since v4: https://lore.kernel.org/all/20231031193428.133533311@goodmis.org/T/#u
> 
> - Rebased to this patch series
> 
>  fs/tracefs/event_inode.c | 19 +++++++++++--------
>  1 file changed, 11 insertions(+), 8 deletions(-)
> 
> diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
> index 2c2c75b2ad73..0331d9bd568b 100644
> --- a/fs/tracefs/event_inode.c
> +++ b/fs/tracefs/event_inode.c
> @@ -129,6 +129,13 @@ static struct dentry *create_dir(const char *name, struct dentry *parent)
>  	return eventfs_end_creating(dentry);
>  }
>  
> +static void free_ei(struct eventfs_inode *ei)
> +{
> +	kfree_const(ei->name);
> +	kfree(ei->d_children);
> +	kfree(ei);
> +}
> +
>  /**
>   * eventfs_set_ei_status_free - remove the dentry reference from an eventfs_inode
>   * @ti: the tracefs_inode of the dentry
> @@ -168,9 +175,7 @@ void eventfs_set_ei_status_free(struct tracefs_inode *ti, struct dentry *dentry)
>  			eventfs_remove_dir(ei_child);
>  		}
>  
> -		kfree_const(ei->name);
> -		kfree(ei->d_children);
> -		kfree(ei);
> +		free_ei(ei);
>  		return;
>  	}
>  
> @@ -784,13 +789,11 @@ struct eventfs_inode *eventfs_create_events_dir(const char *name, struct dentry
>  	return ERR_PTR(-ENOMEM);
>  }
>  
> -static void free_ei(struct rcu_head *head)
> +static void free_rcu_ei(struct rcu_head *head)
>  {
>  	struct eventfs_inode *ei = container_of(head, struct eventfs_inode, rcu);
>  
> -	kfree_const(ei->name);
> -	kfree(ei->d_children);
> -	kfree(ei);
> +	free_ei(ei);
>  }
>  
>  /**
> @@ -883,7 +886,7 @@ void eventfs_remove_dir(struct eventfs_inode *ei)
>  		for (i = 0; i < ei->nr_entries; i++)
>  			unhook_dentry(&ei->d_children[i], &dentry_list);
>  		unhook_dentry(&ei->dentry, &dentry_list);
> -		call_srcu(&eventfs_srcu, &ei->rcu, free_ei);
> +		call_srcu(&eventfs_srcu, &ei->rcu, free_rcu_ei);
>  	}
>  	mutex_unlock(&eventfs_mutex);
>  
> -- 
> 2.42.0


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v5 4/7] eventfs: Save ownership and mode
  2023-10-31 22:33 ` [PATCH v5 4/7] eventfs: Save ownership and mode Steven Rostedt
@ 2023-11-01 23:43   ` Masami Hiramatsu
  2023-11-01 23:47     ` Steven Rostedt
  0 siblings, 1 reply; 12+ messages in thread
From: Masami Hiramatsu @ 2023-11-01 23:43 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, linux-trace-kernel, Masami Hiramatsu, Mark Rutland,
	Ajay Kaher, Andrew Morton

On Tue, 31 Oct 2023 18:33:30 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> From: "Steven Rostedt (Google)" <rostedt@goodmis.org>
> 
> Now that inodes and dentries are created on the fly, they are also
> reclaimed on memory pressure. Since the ownership and file mode are saved
> in the inode, if they are freed, any changes to the ownership and mode
> will be lost.

Do we (need to) allow to change the ownership and mode of the eventfs files?
I thought it was fixed on the files in tracefs...

Otherwise, the code itself looks good to me.

Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Thanks,

> 
> To counter this, if the user changes the permissions or ownership, save
> them, and when creating the inodes again, restore those changes.
> 
> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
> ---
> Changes since v4: https://lkml.kernel.org/r/20231031193428.558586557@goodmis.org
> 
> - Rebased to this series
> 
>  fs/tracefs/event_inode.c | 148 +++++++++++++++++++++++++++++++++++----
>  fs/tracefs/internal.h    |  16 +++++
>  2 files changed, 151 insertions(+), 13 deletions(-)
> 
> diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
> index b28f240bbb6d..d1683bf6d316 100644
> --- a/fs/tracefs/event_inode.c
> +++ b/fs/tracefs/event_inode.c
> @@ -40,6 +40,15 @@ static DEFINE_MUTEX(eventfs_mutex);
>   */
>  DEFINE_STATIC_SRCU(eventfs_srcu);
>  
> +/* Mode is unsigned short, use the upper bits for flags */
> +enum {
> +	EVENTFS_SAVE_MODE	= BIT(16),
> +	EVENTFS_SAVE_UID	= BIT(17),
> +	EVENTFS_SAVE_GID	= BIT(18),
> +};
> +
> +#define EVENTFS_MODE_MASK	(EVENTFS_SAVE_MODE - 1)
> +
>  static struct dentry *eventfs_root_lookup(struct inode *dir,
>  					  struct dentry *dentry,
>  					  unsigned int flags);
> @@ -47,8 +56,89 @@ static int dcache_dir_open_wrapper(struct inode *inode, struct file *file);
>  static int dcache_readdir_wrapper(struct file *file, struct dir_context *ctx);
>  static int eventfs_release(struct inode *inode, struct file *file);
>  
> +static void update_attr(struct eventfs_attr *attr, struct iattr *iattr)
> +{
> +	unsigned int ia_valid = iattr->ia_valid;
> +
> +	if (ia_valid & ATTR_MODE) {
> +		attr->mode = (attr->mode & ~EVENTFS_MODE_MASK) |
> +			(iattr->ia_mode & EVENTFS_MODE_MASK) |
> +			EVENTFS_SAVE_MODE;
> +	}
> +	if (ia_valid & ATTR_UID) {
> +		attr->mode |= EVENTFS_SAVE_UID;
> +		attr->uid = iattr->ia_uid;
> +	}
> +	if (ia_valid & ATTR_GID) {
> +		attr->mode |= EVENTFS_SAVE_GID;
> +		attr->gid = iattr->ia_gid;
> +	}
> +}
> +
> +static int eventfs_set_attr(struct mnt_idmap *idmap, struct dentry *dentry,
> +			    struct iattr *iattr)
> +{
> +	const struct eventfs_entry *entry;
> +	struct eventfs_inode *ei;
> +	const char *name;
> +	int ret;
> +
> +	mutex_lock(&eventfs_mutex);
> +	ei = dentry->d_fsdata;
> +	/* The LSB is set when the eventfs_inode is being freed */
> +	if (((unsigned long)ei & 1UL) || ei->is_freed) {
> +		/* Do not allow changes if the event is about to be removed. */
> +		mutex_unlock(&eventfs_mutex);
> +		return -ENODEV;
> +	}
> +
> +	/* Preallocate the children mode array if necessary */
> +	if (!(dentry->d_inode->i_mode & S_IFDIR)) {
> +		if (!ei->entry_attrs) {
> +			ei->entry_attrs = kzalloc(sizeof(*ei->entry_attrs) * ei->nr_entries,
> +						  GFP_KERNEL);
> +			if (!ei->entry_attrs) {
> +				ret = -ENOMEM;
> +				goto out;
> +			}
> +		}
> +	}
> +
> +	ret = simple_setattr(idmap, dentry, iattr);
> +	if (ret < 0)
> +		goto out;
> +
> +	/*
> +	 * If this is a dir, then update the ei cache, only the file
> +	 * mode is saved in the ei->m_children, and the ownership is
> +	 * determined by the parent directory.
> +	 */
> +	if (dentry->d_inode->i_mode & S_IFDIR) {
> +		update_attr(&ei->attr, iattr);
> +
> +	} else {
> +		name = dentry->d_name.name;
> +
> +		for (int i = 0; i < ei->nr_entries; i++) {
> +			entry = &ei->entries[i];
> +			if (strcmp(name, entry->name) == 0) {
> +				update_attr(&ei->entry_attrs[i], iattr);
> +				break;
> +			}
> +		}
> +	}
> + out:
> +	mutex_unlock(&eventfs_mutex);
> +	return ret;
> +}
> +
>  static const struct inode_operations eventfs_root_dir_inode_operations = {
>  	.lookup		= eventfs_root_lookup,
> +	.setattr	= eventfs_set_attr,
> +};
> +
> +static const struct inode_operations eventfs_file_inode_operations = {
> +	.setattr	= eventfs_set_attr,
>  };
>  
>  static const struct file_operations eventfs_file_operations = {
> @@ -59,10 +149,30 @@ static const struct file_operations eventfs_file_operations = {
>  	.release	= eventfs_release,
>  };
>  
> +static void update_inode_attr(struct inode *inode, struct eventfs_attr *attr, umode_t mode)
> +{
> +	if (!attr) {
> +		inode->i_mode = mode;
> +		return;
> +	}
> +
> +	if (attr->mode & EVENTFS_SAVE_MODE)
> +		inode->i_mode = attr->mode & EVENTFS_MODE_MASK;
> +	else
> +		inode->i_mode = mode;
> +
> +	if (attr->mode & EVENTFS_SAVE_UID)
> +		inode->i_uid = attr->uid;
> +
> +	if (attr->mode & EVENTFS_SAVE_GID)
> +		inode->i_gid = attr->gid;
> +}
> +
>  /**
>   * create_file - create a file in the tracefs filesystem
>   * @name: the name of the file to create.
>   * @mode: the permission that the file should have.
> + * @attr: saved attributes changed by user
>   * @parent: parent dentry for this file.
>   * @data: something that the caller will want to get to later on.
>   * @fop: struct file_operations that should be used for this file.
> @@ -72,6 +182,7 @@ static const struct file_operations eventfs_file_operations = {
>   * call.
>   */
>  static struct dentry *create_file(const char *name, umode_t mode,
> +				  struct eventfs_attr *attr,
>  				  struct dentry *parent, void *data,
>  				  const struct file_operations *fop)
>  {
> @@ -95,7 +206,10 @@ static struct dentry *create_file(const char *name, umode_t mode,
>  	if (unlikely(!inode))
>  		return eventfs_failed_creating(dentry);
>  
> -	inode->i_mode = mode;
> +	/* If the user updated the directory's attributes, use them */
> +	update_inode_attr(inode, attr, mode);
> +
> +	inode->i_op = &eventfs_file_inode_operations;
>  	inode->i_fop = fop;
>  	inode->i_private = data;
>  
> @@ -108,19 +222,19 @@ static struct dentry *create_file(const char *name, umode_t mode,
>  
>  /**
>   * create_dir - create a dir in the tracefs filesystem
> - * @name: the name of the file to create.
> + * @ei: the eventfs_inode that represents the directory to create
>   * @parent: parent dentry for this file.
>   *
>   * This function will create a dentry for a directory represented by
>   * a eventfs_inode.
>   */
> -static struct dentry *create_dir(const char *name, struct dentry *parent)
> +static struct dentry *create_dir(struct eventfs_inode *ei, struct dentry *parent)
>  {
>  	struct tracefs_inode *ti;
>  	struct dentry *dentry;
>  	struct inode *inode;
>  
> -	dentry = eventfs_start_creating(name, parent);
> +	dentry = eventfs_start_creating(ei->name, parent);
>  	if (IS_ERR(dentry))
>  		return dentry;
>  
> @@ -128,7 +242,9 @@ static struct dentry *create_dir(const char *name, struct dentry *parent)
>  	if (unlikely(!inode))
>  		return eventfs_failed_creating(dentry);
>  
> -	inode->i_mode = S_IFDIR | S_IRWXU | S_IRUGO | S_IXUGO;
> +	/* If the user updated the directory's attributes, use them */
> +	update_inode_attr(inode, &ei->attr, S_IFDIR | S_IRWXU | S_IRUGO | S_IXUGO);
> +
>  	inode->i_op = &eventfs_root_dir_inode_operations;
>  	inode->i_fop = &eventfs_file_operations;
>  
> @@ -146,6 +262,7 @@ static void free_ei(struct eventfs_inode *ei)
>  {
>  	kfree_const(ei->name);
>  	kfree(ei->d_children);
> +	kfree(ei->entry_attrs);
>  	kfree(ei);
>  }
>  
> @@ -231,7 +348,7 @@ void eventfs_set_ei_status_free(struct tracefs_inode *ti, struct dentry *dentry)
>  /**
>   * create_file_dentry - create a dentry for a file of an eventfs_inode
>   * @ei: the eventfs_inode that the file will be created under
> - * @e_dentry: a pointer to the d_children[] of the @ei
> + * @idx: the index into the d_children[] of the @ei
>   * @parent: The parent dentry of the created file.
>   * @name: The name of the file to create
>   * @mode: The mode of the file.
> @@ -244,10 +361,12 @@ void eventfs_set_ei_status_free(struct tracefs_inode *ti, struct dentry *dentry)
>   * just do a dget() on it and return. Otherwise create the dentry and attach it.
>   */
>  static struct dentry *
> -create_file_dentry(struct eventfs_inode *ei, struct dentry **e_dentry,
> +create_file_dentry(struct eventfs_inode *ei, int idx,
>  		   struct dentry *parent, const char *name, umode_t mode, void *data,
>  		   const struct file_operations *fops, bool lookup)
>  {
> +	struct eventfs_attr *attr = NULL;
> +	struct dentry **e_dentry = &ei->d_children[idx];
>  	struct dentry *dentry;
>  	bool invalidate = false;
>  
> @@ -264,13 +383,18 @@ create_file_dentry(struct eventfs_inode *ei, struct dentry **e_dentry,
>  		mutex_unlock(&eventfs_mutex);
>  		return *e_dentry;
>  	}
> +
> +	/* ei->entry_attrs are protected by SRCU */
> +	if (ei->entry_attrs)
> +		attr = &ei->entry_attrs[idx];
> +
>  	mutex_unlock(&eventfs_mutex);
>  
>  	/* The lookup already has the parent->d_inode locked */
>  	if (!lookup)
>  		inode_lock(parent->d_inode);
>  
> -	dentry = create_file(name, mode, parent, data, fops);
> +	dentry = create_file(name, mode, attr, parent, data, fops);
>  
>  	if (!lookup)
>  		inode_unlock(parent->d_inode);
> @@ -378,7 +502,7 @@ create_dir_dentry(struct eventfs_inode *pei, struct eventfs_inode *ei,
>  	if (!lookup)
>  		inode_lock(parent->d_inode);
>  
> -	dentry = create_dir(ei->name, parent);
> +	dentry = create_dir(ei, parent);
>  
>  	if (!lookup)
>  		inode_unlock(parent->d_inode);
> @@ -495,8 +619,7 @@ static struct dentry *eventfs_root_lookup(struct inode *dir,
>  			if (r <= 0)
>  				continue;
>  			ret = simple_lookup(dir, dentry, flags);
> -			create_file_dentry(ei, &ei->d_children[i],
> -					   ei_dentry, name, mode, cdata,
> +			create_file_dentry(ei, i, ei_dentry, name, mode, cdata,
>  					   fops, true);
>  			break;
>  		}
> @@ -629,8 +752,7 @@ static int dcache_dir_open_wrapper(struct inode *inode, struct file *file)
>  		r = entry->callback(name, &mode, &cdata, &fops);
>  		if (r <= 0)
>  			continue;
> -		d = create_file_dentry(ei, &ei->d_children[i],
> -				       parent, name, mode, cdata, fops, false);
> +		d = create_file_dentry(ei, i, parent, name, mode, cdata, fops, false);
>  		if (d) {
>  			ret = add_dentries(&dentries, d, cnt);
>  			if (ret < 0)
> diff --git a/fs/tracefs/internal.h b/fs/tracefs/internal.h
> index 5a98e87dd3d1..5f60bcd69289 100644
> --- a/fs/tracefs/internal.h
> +++ b/fs/tracefs/internal.h
> @@ -13,6 +13,18 @@ struct tracefs_inode {
>  	struct inode            vfs_inode;
>  };
>  
> +/*
> + * struct eventfs_attr - cache the mode and ownership of a eventfs entry
> + * @mode:	saved mode plus flags of what is saved
> + * @uid:	saved uid if changed
> + * @gid:	saved gid if changed
> + */
> +struct eventfs_attr {
> +	int				mode;
> +	kuid_t				uid;
> +	kgid_t				gid;
> +};
> +
>  /*
>   * struct eventfs_inode - hold the properties of the eventfs directories.
>   * @list:	link list into the parent directory
> @@ -22,6 +34,8 @@ struct tracefs_inode {
>   * @dentry:     the dentry of the directory
>   * @d_parent:   pointer to the parent's dentry
>   * @d_children: The array of dentries to represent the files when created
> + * @entry_attrs: Saved mode and ownership of the @d_children
> + * @attr:	Saved mode and ownership of eventfs_inode itself
>   * @data:	The private data to pass to the callbacks
>   * @is_freed:	Flag set if the eventfs is on its way to be freed
>   *                Note if is_freed is set, then dentry is corrupted.
> @@ -35,6 +49,8 @@ struct eventfs_inode {
>  	struct dentry			*dentry; /* Check is_freed to access */
>  	struct dentry			*d_parent;
>  	struct dentry			**d_children;
> +	struct eventfs_attr		*entry_attrs;
> +	struct eventfs_attr		attr;
>  	void				*data;
>  	/*
>  	 * Union - used for deletion
> -- 
> 2.42.0


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v5 4/7] eventfs: Save ownership and mode
  2023-11-01 23:43   ` Masami Hiramatsu
@ 2023-11-01 23:47     ` Steven Rostedt
  0 siblings, 0 replies; 12+ messages in thread
From: Steven Rostedt @ 2023-11-01 23:47 UTC (permalink / raw)
  To: Masami Hiramatsu (Google)
  Cc: linux-kernel, linux-trace-kernel, Mark Rutland, Ajay Kaher,
	Andrew Morton

On Thu, 2 Nov 2023 08:43:32 +0900
Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote:

> On Tue, 31 Oct 2023 18:33:30 -0400
> Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> > From: "Steven Rostedt (Google)" <rostedt@goodmis.org>
> > 
> > Now that inodes and dentries are created on the fly, they are also
> > reclaimed on memory pressure. Since the ownership and file mode are saved
> > in the inode, if they are freed, any changes to the ownership and mode
> > will be lost.  
> 
> Do we (need to) allow to change the ownership and mode of the eventfs files?
> I thought it was fixed on the files in tracefs...

Yes, it's the only way to allow non root users access to the tracing directories.

> 
> Otherwise, the code itself looks good to me.
> 
> Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> 

Thanks!

-- Steve

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2023-11-01 23:48 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-10-31 22:33 [PATCH v5 0/7] eventfs: Fixing dynamic creation Steven Rostedt
2023-10-31 22:33 ` [PATCH v5 1/7] eventfs: Remove "is_freed" union with rcu head Steven Rostedt
2023-11-01 12:19   ` Masami Hiramatsu
2023-10-31 22:33 ` [PATCH v5 2/7] eventfs: Have a free_ei() that just frees the eventfs_inode Steven Rostedt
2023-11-01 12:50   ` Masami Hiramatsu
2023-10-31 22:33 ` [PATCH v5 3/7] eventfs: Test for ei->is_freed when accessing ei->dentry Steven Rostedt
2023-10-31 22:33 ` [PATCH v5 4/7] eventfs: Save ownership and mode Steven Rostedt
2023-11-01 23:43   ` Masami Hiramatsu
2023-11-01 23:47     ` Steven Rostedt
2023-10-31 22:33 ` [PATCH v5 5/7] eventfs: Hold eventfs_mutex when calling callback functions Steven Rostedt
2023-10-31 22:33 ` [PATCH v5 6/7] eventfs: Delete eventfs_inode when the last dentry is freed Steven Rostedt
2023-10-31 22:33 ` [PATCH v5 7/7] eventfs: Remove special processing of dput() of events directory Steven Rostedt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).