From: Andrii Nakryiko <andrii@kernel.org>
To: linux-trace-kernel@vger.kernel.org, peterz@infradead.org,
mingo@kernel.org
Cc: bpf@vger.kernel.org, linux-kernel@vger.kernel.org,
kernel-team@meta.com, mhocko@kernel.org, rostedt@goodmis.org,
oleg@redhat.com, brauner@kernel.org, glider@google.com,
mhiramat@kernel.org, mathieu.desnoyers@efficios.com,
akpm@linux-foundation.org, Andrii Nakryiko <andrii@kernel.org>
Subject: [PATCH] exit: add trace_task_exit() tracepoint before current->mm is reset
Date: Tue, 1 Apr 2025 11:40:21 -0700
Message-ID: <20250401184021.2591443-1-andrii@kernel.org>

It is useful to be able to access current->mm to, say, record a bunch of
VMA information right before the task exits (e.g., for stack
symbolization when dealing with short-lived processes that exit in the
middle of a profiling session). We currently do have
trace_sched_process_exit() in the exit path, but it is called a bit too
late, after exit_mm() resets current->mm to NULL, which makes it
unsuitable for inspecting and recording the task's mm_struct-related
data when tracing process lifetimes.

There is a particularly suitable place, though: right after
taskstats_exit() is called, but before we do exit_mm(). taskstats
performs a similar kind of accounting to what some applications do with
BPF, so co-locating them seems like a good fit.

Moving trace_sched_process_exit() a bit earlier would solve this problem
as well, and I'm open to that. But this might change its semantics a
little, so rather than risking that, I went for adding a new
trace_task_exit() tracepoint. Tracepoints have zero overhead at runtime
unless actively traced, so this seems acceptable.

Also, the existing trace_sched_process_exit() tracepoint is notoriously
missing the `group_dead` flag, which is certainly useful in practice;
some of our production applications have to work around this. So plumb
`group_dead` through while at it, to have a richer and more complete
tracepoint.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
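Illustration only, not part of the patch: a minimal userspace sketch of
the ordering argument made in the commit message. It assumes simplified
stand-ins for the kernel objects; `mm_model`, `task_model`, `observe()`
and `do_exit_model()` are hypothetical names invented for this sketch
and do not exist in the kernel. It only models which hook can still see
the mm and the `group_dead` flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for mm_struct and task_struct (not kernel types). */
struct mm_model {
	int map_count;			/* number of VMAs */
};

struct task_model {
	int pid;
	struct mm_model *mm;		/* plays the role of current->mm */
};

/* What a tracer attached to a given hook would be able to observe. */
struct observation {
	bool mm_visible;		/* was task->mm non-NULL at the hook? */
	int map_count;			/* VMA count seen, or -1 if mm was gone */
	int group_dead;			/* 1/0 if passed to the hook, -1 if not */
};

static struct observation at_task_exit;
static struct observation at_sched_process_exit;

static void observe(struct observation *o, const struct task_model *t,
		    int group_dead)
{
	o->mm_visible = t->mm != NULL;
	o->map_count = t->mm ? t->mm->map_count : -1;
	o->group_dead = group_dead;
}

/* Models only the ordering of the relevant calls with the patch applied. */
static void do_exit_model(struct task_model *t, bool group_dead)
{
	assert(t != NULL);

	/* New hook: fires while the mm is still attached. */
	observe(&at_task_exit, t, group_dead);

	/* Stands in for exit_mm(), which resets current->mm to NULL. */
	t->mm = NULL;

	/* Existing hook: fires later and is not passed group_dead. */
	observe(&at_sched_process_exit, t, -1);
}
```

Calling do_exit_model() on a task whose mm is set leaves at_task_exit
with the VMA count and the flag, while at_sched_process_exit records
that the mm was already gone, which is the gap the new tracepoint is
meant to close.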
include/trace/events/task.h | 24 ++++++++++++++++++++++++
kernel/exit.c | 2 ++
2 files changed, 26 insertions(+)
diff --git a/include/trace/events/task.h b/include/trace/events/task.h
index af535b053033..98f4ec060073 100644
--- a/include/trace/events/task.h
+++ b/include/trace/events/task.h
@@ -53,6 +53,30 @@ TRACE_EVENT(task_rename,
__entry->oldcomm, __entry->newcomm, __entry->oom_score_adj)
);
+TRACE_EVENT(task_exit,
+
+ TP_PROTO(struct task_struct *task, bool group_dead),
+
+ TP_ARGS(task, group_dead),
+
+ TP_STRUCT__entry(
+ __field( pid_t, pid)
+ __array( char, comm, TASK_COMM_LEN)
+ __field( bool, group_dead)
+ ),
+
+ TP_fast_assign(
+ __entry->pid = task->pid;
+ memcpy(__entry->comm, task->comm, TASK_COMM_LEN);
+ __entry->group_dead = group_dead;
+ ),
+
+ TP_printk("pid=%d comm=%s group_dead=%s",
+ __entry->pid, __entry->comm,
+ __entry->group_dead ? "true" : "false"
+ )
+);
+
/**
* task_prctl_unknown - called on unknown prctl() option
* @option: option passed
diff --git a/kernel/exit.c b/kernel/exit.c
index c2e6c7b7779f..8496fc07f9c8 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -54,6 +54,7 @@
#include <linux/init_task.h>
#include <linux/perf_event.h>
#include <trace/events/sched.h>
+#include <trace/events/task.h>
#include <linux/hw_breakpoint.h>
#include <linux/oom.h>
#include <linux/writeback.h>
@@ -937,6 +938,7 @@ void __noreturn do_exit(long code)
tsk->exit_code = code;
taskstats_exit(tsk, group_dead);
+ trace_task_exit(tsk, group_dead);
exit_mm();
--
2.47.1