public inbox for linux-mm@kvack.org
From: Bhupesh <bhupesh@igalia.com>
To: akpm@linux-foundation.org
Cc: bhupesh@igalia.com, kernel-dev@igalia.com,
	linux-kernel@vger.kernel.org, bpf@vger.kernel.org,
	linux-perf-users@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, oliver.sang@intel.com, lkp@intel.com,
	laoar.shao@gmail.com, pmladek@suse.com, rostedt@goodmis.org,
	mathieu.desnoyers@efficios.com, arnaldo.melo@gmail.com,
	alexei.starovoitov@gmail.com, andrii.nakryiko@gmail.com,
	mirq-linux@rere.qmqm.pl, peterz@infradead.org,
	willy@infradead.org, david@redhat.com, viro@zeniv.linux.org.uk,
	keescook@chromium.org, ebiederm@xmission.com, brauner@kernel.org,
	jack@suse.cz, mingo@redhat.com, juri.lelli@redhat.com,
	bsegall@google.com, mgorman@suse.de, vschneid@redhat.com
Subject: [PATCH RFC 1/2] exec: Dynamically allocate memory to store task's full name
Date: Fri, 14 Mar 2025 10:57:14 +0530
Message-ID: <20250314052715.610377-2-bhupesh@igalia.com>
In-Reply-To: <20250314052715.610377-1-bhupesh@igalia.com>

Provide a parallel implementation of get_task_comm(), called
get_task_full_name(), which passes the dynamically allocated
full name of a task to interested users such as 'ps'.

Currently, when running 'ps', the 'task->comm' value of a task with a
long name is truncated because of the TASK_COMM_LEN limit.
For example:
  # ./create_very_long_name_user_space_script.sh&
  # ps
    PID TTY          TIME CMD
    332 ttyAMA0  00:00:00 create_very_lon

The same limit applies to names passed from userland via
'pthread_setname_np()'.

Truncated names are of little use during debug tracing, for example
when debugging applications that call 'pthread_getname_np()' to
identify tasks.
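
The truncation described above can be reproduced in userspace. The
following is a minimal sketch, not kernel code: set_comm_model() is a
hypothetical helper that imitates copying a name into a
TASK_COMM_LEN-byte buffer with truncation and NUL padding, assuming
TASK_COMM_LEN is 16 as in the kernel.

```c
#include <stddef.h>
#include <string.h>

/* Assumption: same value as the kernel's TASK_COMM_LEN. */
#define TASK_COMM_LEN 16

/*
 * Hypothetical userspace model of storing a name in 'task->comm':
 * keep at most TASK_COMM_LEN - 1 bytes and zero-fill the rest,
 * so the result is always NUL-terminated.
 */
static void set_comm_model(char comm[TASK_COMM_LEN], const char *name)
{
	size_t len = strlen(name);

	if (len > TASK_COMM_LEN - 1)
		len = TASK_COMM_LEN - 1;
	memset(comm, 0, TASK_COMM_LEN);
	memcpy(comm, name, len);
}
```

Passing "create_very_long_name_user_space_script.sh" leaves
"create_very_lon" in the buffer, which matches the 'ps' output above.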

One possible fix is to extend the task comm size, but 'task->comm' is
used in many places, so that may introduce buffer overflows. A more
conservative approach is to introduce a new pointer that stores the
task's full name. This adds little overhead because it is not in a
critical path.
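
The idea of keeping the fixed-size 'comm' and adding a separate pointer
can be modelled in userspace. This is only a sketch under stated
assumptions: struct task_model and task_model_set_name() are
hypothetical names, malloc()/free() stand in for kernel allocators, and
no locking or lifetime rules of the real 'task_struct' are represented.

```c
#include <stdlib.h>
#include <string.h>

#define TASK_COMM_LEN 16 /* assumption: same value as in the kernel */

/* Hypothetical userspace stand-in for the two name fields of a task. */
struct task_model {
	char comm[TASK_COMM_LEN];	/* fixed size, may be truncated */
	char *full_name;		/* heap copy of the whole name, or NULL */
};

/*
 * Store 'name' in both fields. Returns 0 on success, -1 if the
 * allocation fails, in which case the model is left unchanged.
 */
static int task_model_set_name(struct task_model *t, const char *name)
{
	size_t len = strlen(name);
	char *copy = malloc(len + 1);

	if (!copy)
		return -1;
	memcpy(copy, name, len + 1);

	free(t->full_name);		/* drop any previous full name */
	t->full_name = copy;

	if (len > TASK_COMM_LEN - 1)
		len = TASK_COMM_LEN - 1;
	memset(t->comm, 0, sizeof(t->comm));
	memcpy(t->comm, name, len);
	return 0;
}
```

Code that reads 'comm' still sees a 16-byte buffer, so its size
assumptions hold, while code that wants the whole name reads
'full_name' instead.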

After this change, the full name of these truncated tasks will be shown
in 'ps'. For example:
  # ps
    PID TTY          TIME CMD
    305 ttyAMA0  00:00:00 create_very_long_name_user_space_script.sh

Here is the proposed flow:
 1. A userspace API such as 'pthread_setname_np()' sets the thread name.
 2. This sets 'task->full_name' in addition to the default 16-byte
    truncated 'task->comm'.
 3. 'pthread_getname_np()' then retrieves 'task->full_name' by
    default from '/proc/self/task/[tid]/full_name'.

Step 3 is implemented by the subsequent patch in this patchset.
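
A userspace reader for step 3 might look like the sketch below. It
assumes the 'full_name' procfs file proposed by this series, which does
not exist on kernels without it, so the code falls back to the existing
'/proc/self/task/[tid]/comm' file. task_name_path() and
read_task_name() are hypothetical helpers written for illustration and
do not describe how any C library implements 'pthread_getname_np()'.

```c
#include <stdio.h>
#include <string.h>

/*
 * Build the per-thread procfs path for 'file' ("full_name" or "comm").
 * Returns 0 on success, -1 if 'buf' is too small.
 */
static int task_name_path(char *buf, size_t size, int tid, const char *file)
{
	int n = snprintf(buf, size, "/proc/self/task/%d/%s", tid, file);

	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/*
 * Read the thread name into 'name'. Try the proposed 'full_name' file
 * first and fall back to the existing 'comm' file.
 * Returns 0 on success, -1 on failure.
 */
static int read_task_name(int tid, char *name, size_t size)
{
	static const char *const files[] = { "full_name", "comm" };
	char path[64];
	size_t i;

	if (size == 0)
		return -1;

	for (i = 0; i < sizeof(files) / sizeof(files[0]); i++) {
		FILE *f;
		char *ok;

		if (task_name_path(path, sizeof(path), tid, files[i]) < 0)
			continue;
		f = fopen(path, "r");
		if (!f)
			continue;	/* file missing: try the next one */
		ok = fgets(name, (int)size, f);
		fclose(f);
		if (ok) {
			name[strcspn(name, "\n")] = '\0';	/* strip newline */
			return 0;
		}
	}
	return -1;
}
```

Trying 'full_name' before 'comm' lets the same reader work on kernels
with and without this series, at the cost of one failed open on the
latter.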

Signed-off-by: Bhupesh <bhupesh@igalia.com>
---
 fs/exec.c             | 21 ++++++++++++++++++---
 include/linux/sched.h |  9 +++++++++
 2 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index 506cd411f4ac2..43d0a0d81d44e 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1210,6 +1210,9 @@ int begin_new_exec(struct linux_binprm * bprm)
 {
 	struct task_struct *me = current;
 	int retval;
+	va_list args;
+	char *name;
+	const char *fmt;
 
 	/* Once we are committed compute the creds */
 	retval = bprm_creds_from_file(bprm);
@@ -1350,11 +1353,22 @@ int begin_new_exec(struct linux_binprm * bprm)
 		 * detecting a concurrent rename and just want a terminated name.
 		 */
 		rcu_read_lock();
-		__set_task_comm(me, smp_load_acquire(&bprm->file->f_path.dentry->d_name.name),
-				true);
+		fmt = smp_load_acquire(&bprm->file->f_path.dentry->d_name.name);
+		name = kstrdup(fmt, GFP_ATOMIC);
+		if (!name)
+			return -ENOMEM;
+
+		me->full_name = name;
+		__set_task_comm(me, fmt, true);
 		rcu_read_unlock();
 	} else {
-		__set_task_comm(me, kbasename(bprm->filename), true);
+		fmt = kbasename(bprm->filename);
+		name = kstrdup(fmt, GFP_KERNEL);
+		if (!name)
+			return -ENOMEM;
+
+		me->full_name = name;
+		__set_task_comm(me, fmt, true);
 	}
 
 	/* An exec changes our domain. We are no longer part of the thread
@@ -1401,6 +1415,7 @@ int begin_new_exec(struct linux_binprm * bprm)
 	return 0;
 
 out_unlock:
+	kfree(me->full_name);
 	up_write(&me->signal->exec_update_lock);
 	if (!bprm->cred)
 		mutex_unlock(&me->signal->cred_guard_mutex);
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 9c15365a30c08..ebf121768d951 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1144,6 +1144,9 @@ struct task_struct {
 	 */
 	char				comm[TASK_COMM_LEN];
 
+	/* To store the full name if task comm is truncated. */
+	char				*full_name;
+
 	struct nameidata		*nameidata;
 
 #ifdef CONFIG_SYSVIPC
@@ -1984,6 +1987,12 @@ extern void __set_task_comm(struct task_struct *tsk, const char *from, bool exec
 	buf;						\
 })
 
+#define get_task_full_name(buf, buf_size, tsk) ({	\
+	BUILD_BUG_ON(sizeof(buf) < TASK_COMM_LEN);	\
+	strscpy_pad(buf, (tsk)->full_name, buf_size);	\
+	buf;						\
+})
+
 #ifdef CONFIG_SMP
 static __always_inline void scheduler_ipi(void)
 {
-- 
2.38.1




Thread overview: 8+ messages
2025-03-14  5:27 [PATCH RFC 0/2] Dynamically allocate memory to store task's full name Bhupesh
2025-03-14  5:27 ` Bhupesh [this message]
2025-03-14  5:27 ` [PATCH RFC 2/2] fs/proc: Pass 'task->full_name' via 'proc_task_name()' Bhupesh
2025-03-14 21:25 ` [PATCH RFC 0/2] Dynamically allocate memory to store task's full name Kees Cook
2025-03-15  7:43   ` Andres Rodriguez
2025-03-18 11:19     ` Bhupesh Sharma
2025-03-18 15:51       ` Kees Cook
2025-03-18 18:06         ` Bhupesh Sharma
