From: Christian Brauner <brauner@kernel.org>
To: Oleg Nesterov <oleg@redhat.com>
Cc: linux-fsdevel@vger.kernel.org, Jeff Layton <jlayton@kernel.org>,
Lennart Poettering <lennart@poettering.net>,
Daan De Meyer <daan.j.demeyer@gmail.com>,
Mike Yuan <me@yhndnzj.com>,
Christian Brauner <brauner@kernel.org>
Subject: [PATCH RFC 05/10] pidfs: record exit code and cgroupid at exit
Date: Fri, 28 Feb 2025 13:44:05 +0100
Message-ID: <20250228-work-pidfs-kill_on_last_close-v1-5-5bd7e6bb428e@kernel.org>
In-Reply-To: <20250228-work-pidfs-kill_on_last_close-v1-0-5bd7e6bb428e@kernel.org>
Record the exit code and cgroupid in do_exit() and stash them in struct
pidfs_exit_info so they can be retrieved even after the task has been
reaped.
Signed-off-by: Christian Brauner <brauner@kernel.org>
---
fs/internal.h | 1 +
fs/libfs.c | 4 ++--
fs/pidfs.c | 47 +++++++++++++++++++++++++++++++++++++++++++++++
include/linux/pidfs.h | 1 +
kernel/exit.c | 2 ++
5 files changed, 53 insertions(+), 2 deletions(-)
diff --git a/fs/internal.h b/fs/internal.h
index e7f02ae1e098..c1e6d8b294cb 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -325,6 +325,7 @@ struct stashed_operations {
int path_from_stashed(struct dentry **stashed, struct vfsmount *mnt, void *data,
struct path *path);
void stashed_dentry_prune(struct dentry *dentry);
+struct dentry *stashed_dentry_get(struct dentry **stashed);
/**
* path_mounted - check whether path is mounted
* @path: path to check
diff --git a/fs/libfs.c b/fs/libfs.c
index 8444f5cc4064..cf5a267aafe4 100644
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -2113,7 +2113,7 @@ struct timespec64 simple_inode_init_ts(struct inode *inode)
}
EXPORT_SYMBOL(simple_inode_init_ts);
-static inline struct dentry *get_stashed_dentry(struct dentry **stashed)
+struct dentry *stashed_dentry_get(struct dentry **stashed)
{
struct dentry *dentry;
@@ -2215,7 +2215,7 @@ int path_from_stashed(struct dentry **stashed, struct vfsmount *mnt, void *data,
const struct stashed_operations *sops = mnt->mnt_sb->s_fs_info;
/* See if dentry can be reused. */
- path->dentry = get_stashed_dentry(stashed);
+ path->dentry = stashed_dentry_get(stashed);
if (path->dentry) {
sops->put_data(data);
goto out_path;
diff --git a/fs/pidfs.c b/fs/pidfs.c
index 64428697996f..433f676c066c 100644
--- a/fs/pidfs.c
+++ b/fs/pidfs.c
@@ -458,6 +458,53 @@ struct pid *pidfd_pid(const struct file *file)
return file_inode(file)->i_private;
}
+/*
+ * We're called from do_exit(). We know there's at least one reference
+ * to struct pid being held that won't be released until the task has
+ * been reaped which cannot happen until we're out of do_exit().
+ *
+ * If this struct pid is referred to by a pidfd then stashed_dentry_get()
+ * will return the dentry and inode for that struct pid. Since we've
+ * taken a reference on it there's now an additional reference from the
+ * exit path on it. Which is fine. We're going to put it again in a
+ * second and we know that the pid is kept alive anyway.
+ *
+ * Worst case is that we've filled in the info and immediately free the
+ * dentry and inode afterwards since the pidfd has been closed. Since
+ * pidfs_exit() currently is placed after exit_task_work() we know that
+ * it cannot be us aka the exiting task holding a pidfd to ourselves.
+ */
+void pidfs_exit(struct task_struct *tsk)
+{
+ struct dentry *dentry;
+
+ dentry = stashed_dentry_get(&task_pid(tsk)->stashed);
+ if (dentry) {
+ struct inode *inode;
+ struct pidfs_exit_info *exit_info;
+#ifdef CONFIG_CGROUPS
+ struct cgroup *cgrp;
+#endif
+ inode = d_inode(dentry);
+ exit_info = &pidfs_i(inode)->exit_info;
+
+ /* TODO: Annoy Oleg to tell me how to do this correctly. */
+ if (tsk->signal->flags & SIGNAL_GROUP_EXIT)
+ exit_info->exit_code = tsk->signal->group_exit_code;
+ else
+ exit_info->exit_code = tsk->exit_code;
+
+#ifdef CONFIG_CGROUPS
+ rcu_read_lock();
+ cgrp = task_dfl_cgroup(tsk);
+ exit_info->cgroupid = cgroup_id(cgrp);
+ rcu_read_unlock();
+#endif
+
+ dput(dentry);
+ }
+}
+
static struct vfsmount *pidfs_mnt __ro_after_init;
/*
diff --git a/include/linux/pidfs.h b/include/linux/pidfs.h
index 7c830d0dec9a..05e6f8f4a026 100644
--- a/include/linux/pidfs.h
+++ b/include/linux/pidfs.h
@@ -6,6 +6,7 @@ struct file *pidfs_alloc_file(struct pid *pid, unsigned int flags);
void __init pidfs_init(void);
void pidfs_add_pid(struct pid *pid);
void pidfs_remove_pid(struct pid *pid);
+void pidfs_exit(struct task_struct *tsk);
extern const struct dentry_operations pidfs_dentry_operations;
#endif /* _LINUX_PID_FS_H */
diff --git a/kernel/exit.c b/kernel/exit.c
index 3485e5fc499e..cae475e7858c 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -69,6 +69,7 @@
#include <linux/sysfs.h>
#include <linux/user_events.h>
#include <linux/uaccess.h>
+#include <linux/pidfs.h>
#include <uapi/linux/wait.h>
@@ -948,6 +949,7 @@ void __noreturn do_exit(long code)
sched_autogroup_exit_task(tsk);
cgroup_exit(tsk);
+ pidfs_exit(tsk);
/*
* FIXME: do that only when needed, using sched_exit tracepoint
--
2.47.2