linux-perf-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH RFC 0/2] Dynamically allocate memory to store task's full name
@ 2025-03-14  5:27 Bhupesh
  2025-03-14  5:27 ` [PATCH RFC 1/2] exec: " Bhupesh
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Bhupesh @ 2025-03-14  5:27 UTC (permalink / raw)
  To: akpm
  Cc: bhupesh, kernel-dev, linux-kernel, bpf, linux-perf-users,
	linux-fsdevel, linux-mm, oliver.sang, lkp, laoar.shao, pmladek,
	rostedt, mathieu.desnoyers, arnaldo.melo, alexei.starovoitov,
	andrii.nakryiko, mirq-linux, peterz, willy, david, viro, keescook,
	ebiederm, brauner, jack, mingo, juri.lelli, bsegall, mgorman,
	vschneid

While working with user-space debugging tools which work especially
on linux gaming platforms, I found that the task name is truncated due
to the limitation of TASK_COMM_LEN.

For example, currently running 'ps', the task->comm value of a long
task name is truncated due to the limitation of TASK_COMM_LEN.
    create_very_lon

This leads to the names passed from userland via pthread_setname_np()
being truncated.

Now, during debug tracing, seeing truncated names is not very useful,
especially on gaming platforms where the number of tasks running can
be very hight.

For example for debug applications invoking 'pthread_getname_np()'
to debug task names.

This RFC aims to start a conversation and improve the initial RFC
patchset to avoid such buffer overflows by introducing a new
dynamically allocated pointer to store task's full name, which
shouldn't introduce too much overhead as it is in the non-critical
path.

After this change, the full name of these (otherwise truncated) tasks
will be shown in 'ps'. For example:
    create_very_long_name_user_space_script.sh

Bhupesh (2):
  exec: Dynamically allocate memory to store task's full name
  fs/proc: Pass 'task->full_name' via 'proc_task_name()'

 fs/exec.c             | 21 ++++++++++++++++++---
 fs/proc/array.c       |  2 +-
 include/linux/sched.h |  9 +++++++++
 3 files changed, 28 insertions(+), 4 deletions(-)

-- 
2.38.1


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH RFC 1/2] exec: Dynamically allocate memory to store task's full name
  2025-03-14  5:27 [PATCH RFC 0/2] Dynamically allocate memory to store task's full name Bhupesh
@ 2025-03-14  5:27 ` Bhupesh
  2025-03-14  5:27 ` [PATCH RFC 2/2] fs/proc: Pass 'task->full_name' via 'proc_task_name()' Bhupesh
  2025-03-14 21:25 ` [PATCH RFC 0/2] Dynamically allocate memory to store task's full name Kees Cook
  2 siblings, 0 replies; 8+ messages in thread
From: Bhupesh @ 2025-03-14  5:27 UTC (permalink / raw)
  To: akpm
  Cc: bhupesh, kernel-dev, linux-kernel, bpf, linux-perf-users,
	linux-fsdevel, linux-mm, oliver.sang, lkp, laoar.shao, pmladek,
	rostedt, mathieu.desnoyers, arnaldo.melo, alexei.starovoitov,
	andrii.nakryiko, mirq-linux, peterz, willy, david, viro, keescook,
	ebiederm, brauner, jack, mingo, juri.lelli, bsegall, mgorman,
	vschneid

Provide a parallel implementation for get_task_comm() called
get_task_full_name() which allows the dynamically allocated
and filled-in task's full name to be passed to interested
users such as 'ps'.

Currently while running 'ps', the 'task->comm' value of a long
task name is truncated due to the limitation of TASK_COMM_LEN.
For example:
  # ./create_very_long_name_user_space_script.sh&
  # ps
    PID TTY          TIME CMD
    332 ttyAMA0  00:00:00 create_very_lon

This leads to the names passed from userland via 'pthread_setname_np()'
being truncated.

Now, during debug tracing, seeing truncated names is not very useful.
(for example for debug applications invoking 'pthread_getname_np()') to
debug task names.

One possible way to fix this issue is extending the task comm size, but
as 'task->comm' is used in lots of places, that may cause some potential
buffer overflows. Another more conservative approach is introducing a new
pointer to store task's full name, which won't introduce too much overhead
as it is in the non-critical path.

After this change, the full name of these truncated tasks will be shown
in 'ps'. For example:
  # ps
    PID TTY          TIME CMD
    305 ttyAMA0  00:00:00 create_very_long_name_user_space_script.sh

Here is the proposed flow now:
 1. 'pthread_setname_np()' like userspace API sets thread name.
 2. This will set 'task->full_name' in addition to default 16-byte
   truncated 'task->comm'.
 3. And 'pthread_getname_np()' will retrieve 'task->full_name' by
   default from the same '/proc/self/task/[tid]/full_name'

Step 3 implementation is achieved via the subsequent patch in this
patchset.

Signed-off-by: Bhupesh <bhupesh@igalia.com>
---
 fs/exec.c             | 21 ++++++++++++++++++---
 include/linux/sched.h |  9 +++++++++
 2 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index 506cd411f4ac2..43d0a0d81d44e 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1210,6 +1210,9 @@ int begin_new_exec(struct linux_binprm * bprm)
 {
 	struct task_struct *me = current;
 	int retval;
+	va_list args;
+	char *name;
+	const char *fmt;
 
 	/* Once we are committed compute the creds */
 	retval = bprm_creds_from_file(bprm);
@@ -1350,11 +1353,22 @@ int begin_new_exec(struct linux_binprm * bprm)
 		 * detecting a concurrent rename and just want a terminated name.
 		 */
 		rcu_read_lock();
-		__set_task_comm(me, smp_load_acquire(&bprm->file->f_path.dentry->d_name.name),
-				true);
+		fmt = smp_load_acquire(&bprm->file->f_path.dentry->d_name.name);
+		name = kvasprintf(GFP_KERNEL, fmt, args);
+		if (!name)
+			return -ENOMEM;
+
+		me->full_name = name;
+		__set_task_comm(me, fmt, true);
 		rcu_read_unlock();
 	} else {
-		__set_task_comm(me, kbasename(bprm->filename), true);
+		fmt = kbasename(bprm->filename);
+		name = kvasprintf(GFP_KERNEL, fmt, args);
+		if (!name)
+			return -ENOMEM;
+
+		me->full_name = name;
+		__set_task_comm(me, fmt, true);
 	}
 
 	/* An exec changes our domain. We are no longer part of the thread
@@ -1401,6 +1415,7 @@ int begin_new_exec(struct linux_binprm * bprm)
 	return 0;
 
 out_unlock:
+	kfree(me->full_name);
 	up_write(&me->signal->exec_update_lock);
 	if (!bprm->cred)
 		mutex_unlock(&me->signal->cred_guard_mutex);
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 9c15365a30c08..ebf121768d951 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1144,6 +1144,9 @@ struct task_struct {
 	 */
 	char				comm[TASK_COMM_LEN];
 
+	/* To store the full name if task comm is truncated. */
+	char				*full_name;
+
 	struct nameidata		*nameidata;
 
 #ifdef CONFIG_SYSVIPC
@@ -1984,6 +1987,12 @@ extern void __set_task_comm(struct task_struct *tsk, const char *from, bool exec
 	buf;						\
 })
 
+#define get_task_full_name(buf, buf_size, tsk) ({	\
+	BUILD_BUG_ON(sizeof(buf) < TASK_COMM_LEN);	\
+	strscpy_pad(buf, (tsk)->full_name, buf_size);	\
+	buf;						\
+})
+
 #ifdef CONFIG_SMP
 static __always_inline void scheduler_ipi(void)
 {
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH RFC 2/2] fs/proc: Pass 'task->full_name' via 'proc_task_name()'
  2025-03-14  5:27 [PATCH RFC 0/2] Dynamically allocate memory to store task's full name Bhupesh
  2025-03-14  5:27 ` [PATCH RFC 1/2] exec: " Bhupesh
@ 2025-03-14  5:27 ` Bhupesh
  2025-03-14 21:25 ` [PATCH RFC 0/2] Dynamically allocate memory to store task's full name Kees Cook
  2 siblings, 0 replies; 8+ messages in thread
From: Bhupesh @ 2025-03-14  5:27 UTC (permalink / raw)
  To: akpm
  Cc: bhupesh, kernel-dev, linux-kernel, bpf, linux-perf-users,
	linux-fsdevel, linux-mm, oliver.sang, lkp, laoar.shao, pmladek,
	rostedt, mathieu.desnoyers, arnaldo.melo, alexei.starovoitov,
	andrii.nakryiko, mirq-linux, peterz, willy, david, viro, keescook,
	ebiederm, brauner, jack, mingo, juri.lelli, bsegall, mgorman,
	vschneid

Now that we have the get_task_full_name() implementation which allows
the dynamically allocated and filled in task's full name to be passed
to interested users, use it in proc_task_name() by default for
task names so that user-land can see them through appropriate tools
(such as 'ps').

Signed-off-by: Bhupesh <bhupesh@igalia.com>
---
 fs/proc/array.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/proc/array.c b/fs/proc/array.c
index d6a0369caa931..2cbeb1584f8a4 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -109,7 +109,7 @@ void proc_task_name(struct seq_file *m, struct task_struct *p, bool escape)
 	else if (p->flags & PF_KTHREAD)
 		get_kthread_comm(tcomm, sizeof(tcomm), p);
 	else
-		get_task_comm(tcomm, p);
+		get_task_full_name(tcomm, sizeof(tcomm), p);
 
 	if (escape)
 		seq_escape_str(m, tcomm, ESCAPE_SPACE | ESCAPE_SPECIAL, "\n\\");
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH RFC 0/2] Dynamically allocate memory to store task's full name
  2025-03-14  5:27 [PATCH RFC 0/2] Dynamically allocate memory to store task's full name Bhupesh
  2025-03-14  5:27 ` [PATCH RFC 1/2] exec: " Bhupesh
  2025-03-14  5:27 ` [PATCH RFC 2/2] fs/proc: Pass 'task->full_name' via 'proc_task_name()' Bhupesh
@ 2025-03-14 21:25 ` Kees Cook
  2025-03-15  7:43   ` Andres Rodriguez
  2 siblings, 1 reply; 8+ messages in thread
From: Kees Cook @ 2025-03-14 21:25 UTC (permalink / raw)
  To: Bhupesh
  Cc: akpm, kernel-dev, linux-kernel, bpf, linux-perf-users,
	linux-fsdevel, linux-mm, oliver.sang, lkp, laoar.shao, pmladek,
	rostedt, mathieu.desnoyers, arnaldo.melo, alexei.starovoitov,
	andrii.nakryiko, mirq-linux, peterz, willy, david, viro, ebiederm,
	brauner, jack, mingo, juri.lelli, bsegall, mgorman, vschneid

On Fri, Mar 14, 2025 at 10:57:13AM +0530, Bhupesh wrote:
> While working with user-space debugging tools which work especially
> on linux gaming platforms, I found that the task name is truncated due
> to the limitation of TASK_COMM_LEN.
> 
> For example, currently running 'ps', the task->comm value of a long
> task name is truncated due to the limitation of TASK_COMM_LEN.
>     create_very_lon
> 
> This leads to the names passed from userland via pthread_setname_np()
> being truncated.

So there have been long discussions about "comm", and it mainly boils
down to "leave it alone". For the /proc-scraping tools like "ps" and
"top", they check both "comm" and "cmdline", depending on mode. The more
useful (and already untruncated) stuff is in "cmdline", so I suspect it
may make more sense to have pthread_setname_np() interact with that
instead. Also TASK_COMM_LEN is basically considered userspace ABI at
this point and we can't sanely change its length without breaking the
world.

Best to use /proc/$pid/task/$tid/cmdline IMO...

-Kees

> will be shown in 'ps'. For example:
>     create_very_long_name_user_space_script.sh
> 
> Bhupesh (2):
>   exec: Dynamically allocate memory to store task's full name
>   fs/proc: Pass 'task->full_name' via 'proc_task_name()'
> 
>  fs/exec.c             | 21 ++++++++++++++++++---
>  fs/proc/array.c       |  2 +-
>  include/linux/sched.h |  9 +++++++++
>  3 files changed, 28 insertions(+), 4 deletions(-)
> 
> -- 
> 2.38.1
> 

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH RFC 0/2] Dynamically allocate memory to store task's full name
  2025-03-14 21:25 ` [PATCH RFC 0/2] Dynamically allocate memory to store task's full name Kees Cook
@ 2025-03-15  7:43   ` Andres Rodriguez
  2025-03-18 11:19     ` Bhupesh Sharma
  0 siblings, 1 reply; 8+ messages in thread
From: Andres Rodriguez @ 2025-03-15  7:43 UTC (permalink / raw)
  To: Kees Cook, Bhupesh
  Cc: akpm, kernel-dev, linux-kernel, bpf, linux-perf-users,
	linux-fsdevel, linux-mm, oliver.sang, lkp, laoar.shao, pmladek,
	rostedt, mathieu.desnoyers, arnaldo.melo, alexei.starovoitov,
	andrii.nakryiko, mirq-linux, peterz, willy, david, viro, ebiederm,
	brauner, jack, mingo, juri.lelli, bsegall, mgorman, vschneid



On 3/14/25 14:25, Kees Cook wrote:
> On Fri, Mar 14, 2025 at 10:57:13AM +0530, Bhupesh wrote:
>> While working with user-space debugging tools which work especially
>> on linux gaming platforms, I found that the task name is truncated due
>> to the limitation of TASK_COMM_LEN.
>>
>> For example, currently running 'ps', the task->comm value of a long
>> task name is truncated due to the limitation of TASK_COMM_LEN.
>>      create_very_lon
>>
>> This leads to the names passed from userland via pthread_setname_np()
>> being truncated.
> 
> So there have been long discussions about "comm", and it mainly boils
> down to "leave it alone". For the /proc-scraping tools like "ps" and
> "top", they check both "comm" and "cmdline", depending on mode. The more
> useful (and already untruncated) stuff is in "cmdline", so I suspect it
> may make more sense to have pthread_setname_np() interact with that
> instead. Also TASK_COMM_LEN is basically considered userspace ABI at
> this point and we can't sanely change its length without breaking the
> world.
> 

Completely agree that comm is best left untouched. TASK_COMM_LEN is 
embedded into the kernel and the pthread ABI changes here should be avoided.

> Best to use /proc/$pid/task/$tid/cmdline IMO...

Your recommendation works great for programs like ps and top, which are
the examples proposed in the cover letter. However, I think the opening 
email didn't point out use cases where the name is modified at runtime. 
In those cases cmdline would be an unsuitable solution as it should 
remain immutable across the process lifetime. An example of this use 
case would be to set a thread's name for debugging purposes and then 
trying to query it via gdb or perf.

I wrote a quick and dirty example to illustrate what I mean:
https://github.com/lostgoat/tasknames

I think an alternative approach could be to have a separate entry in 
procfs to store a tasks debug name (and leave comm completely 
untouched), e.g. /proc/$pid/task/$tid/debug_name. This would allow 
userspace apps to be updated with the following logic:

get_task_debug_name() {
     if ( !is_empty( debug_name ) )
         return debug_name;
     return comm;
}

"Legacy" userspace apps would remain ABI compatible as they would just 
fall back to comm. And apps that want to opt in to the new behaviour can 
be updated one at a time. Which would be work intensive, but even just 
updating gdb and perf would be super helpful.

-Andres

> 
> -Kees
> 
>> will be shown in 'ps'. For example:
>>      create_very_long_name_user_space_script.sh
>>
>> Bhupesh (2):
>>    exec: Dynamically allocate memory to store task's full name
>>    fs/proc: Pass 'task->full_name' via 'proc_task_name()'
>>
>>   fs/exec.c             | 21 ++++++++++++++++++---
>>   fs/proc/array.c       |  2 +-
>>   include/linux/sched.h |  9 +++++++++
>>   3 files changed, 28 insertions(+), 4 deletions(-)
>>
>> -- 
>> 2.38.1
>>
> 


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH RFC 0/2] Dynamically allocate memory to store task's full name
  2025-03-15  7:43   ` Andres Rodriguez
@ 2025-03-18 11:19     ` Bhupesh Sharma
  2025-03-18 15:51       ` Kees Cook
  0 siblings, 1 reply; 8+ messages in thread
From: Bhupesh Sharma @ 2025-03-18 11:19 UTC (permalink / raw)
  To: Andres Rodriguez, Kees Cook, Bhupesh
  Cc: akpm, kernel-dev, linux-kernel, bpf, linux-perf-users,
	linux-fsdevel, linux-mm, oliver.sang, lkp, laoar.shao, pmladek,
	rostedt, mathieu.desnoyers, arnaldo.melo, alexei.starovoitov,
	andrii.nakryiko, mirq-linux, peterz, willy, david, viro, ebiederm,
	brauner, jack, mingo, juri.lelli, bsegall, mgorman, vschneid

Hi,

Thanks for the review and inputs on the additional possible use-cases.
Please see my replies inline.

On 3/15/25 1:13 PM, Andres Rodriguez wrote:
>
>
> On 3/14/25 14:25, Kees Cook wrote:
>> On Fri, Mar 14, 2025 at 10:57:13AM +0530, Bhupesh wrote:
>>> While working with user-space debugging tools which work especially
>>> on linux gaming platforms, I found that the task name is truncated due
>>> to the limitation of TASK_COMM_LEN.
>>>
>>> For example, currently running 'ps', the task->comm value of a long
>>> task name is truncated due to the limitation of TASK_COMM_LEN.
>>>      create_very_lon
>>>
>>> This leads to the names passed from userland via pthread_setname_np()
>>> being truncated.
>>
>> So there have been long discussions about "comm", and it mainly boils
>> down to "leave it alone". For the /proc-scraping tools like "ps" and
>> "top", they check both "comm" and "cmdline", depending on mode. The more
>> useful (and already untruncated) stuff is in "cmdline", so I suspect it
>> may make more sense to have pthread_setname_np() interact with that
>> instead. Also TASK_COMM_LEN is basically considered userspace ABI at
>> this point and we can't sanely change its length without breaking the
>> world.
>>
>
> Completely agree that comm is best left untouched. TASK_COMM_LEN is 
> embedded into the kernel and the pthread ABI changes here should be 
> avoided.
>

So, basically my approach _does not_ touch TASK_COMM_LEN at all. The 
normal 'TASK_COMM_LEN' 16byte design remains untouched.
Which means that all the legacy / existing ABi which uses 'task->comm' 
and hence are designed / written to handle 'TASK_COMM_LEN' 16-byte name, 
continue to work as before using '/proc/$pid/task/$tid/comm'.

This change-set only adds a _parallel_ dynamically allocated 
'task->full_name' which can be used by interested users via 
'/proc/$pid/task/$tid/full_name'.

[PATCH 2/2] shows only a possible use-case of the same and can be 
dropped with only [PATCH 1/2] being considered to add the 
'/proc/$pid/task/$tid/full_name' interface.
>> Best to use /proc/$pid/task/$tid/cmdline IMO...
>
> Your recommendation works great for programs like ps and top, which are
> the examples proposed in the cover letter. However, I think the 
> opening email didn't point out use cases where the name is modified at 
> runtime. In those cases cmdline would be an unsuitable solution as it 
> should remain immutable across the process lifetime. An example of 
> this use case would be to set a thread's name for debugging purposes 
> and then trying to query it via gdb or perf.
>
> I wrote a quick and dirty example to illustrate what I mean:
> https://github.com/lostgoat/tasknames
>
> I think an alternative approach could be to have a separate entry in 
> procfs to store a tasks debug name (and leave comm completely 
> untouched), e.g. /proc/$pid/task/$tid/debug_name. This would allow 
> userspace apps to be updated with the following logic:
>
> get_task_debug_name() {
>     if ( !is_empty( debug_name ) )
>         return debug_name;
>     return comm;
> }
>
> "Legacy" userspace apps would remain ABI compatible as they would just 
> fall back to comm. And apps that want to opt in to the new behaviour 
> can be updated one at a time. Which would be work intensive, but even 
> just updating gdb and perf would be super helpful.

I am fine with adding either '/proc/$pid/task/$tid/full_name' or 
'/proc/$pid/task/$tid/debug_name' (actually both of these achieve the same).
The new / modified users (especially the debug applications you listed 
above) can switch easily to using '/proc/$pid/task/$tid/full_name' 
instead of ''/proc/$pid/task/$tid/comm'

AFAIK we already achieved for the kthreads using d6986ce24fc00 
("kthread: dynamically allocate memory to store kthread's full name"), 
which adds 'full_name' in parallel to 'comm' for kthread names.

Thanks,
Bhupesh

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH RFC 0/2] Dynamically allocate memory to store task's full name
  2025-03-18 11:19     ` Bhupesh Sharma
@ 2025-03-18 15:51       ` Kees Cook
  2025-03-18 18:06         ` Bhupesh Sharma
  0 siblings, 1 reply; 8+ messages in thread
From: Kees Cook @ 2025-03-18 15:51 UTC (permalink / raw)
  To: Bhupesh Sharma
  Cc: Andres Rodriguez, Bhupesh, akpm, kernel-dev, linux-kernel, bpf,
	linux-perf-users, linux-fsdevel, linux-mm, oliver.sang, lkp,
	laoar.shao, pmladek, rostedt, mathieu.desnoyers, arnaldo.melo,
	alexei.starovoitov, andrii.nakryiko, mirq-linux, peterz, willy,
	david, viro, ebiederm, brauner, jack, mingo, juri.lelli, bsegall,
	mgorman, vschneid

On Tue, Mar 18, 2025 at 04:49:28PM +0530, Bhupesh Sharma wrote:
> On 3/15/25 1:13 PM, Andres Rodriguez wrote:
> > On 3/14/25 14:25, Kees Cook wrote:
> > > On Fri, Mar 14, 2025 at 10:57:13AM +0530, Bhupesh wrote:
> > > > While working with user-space debugging tools which work especially
> > > > on linux gaming platforms, I found that the task name is truncated due
> > > > to the limitation of TASK_COMM_LEN.
> > > > 
> > > > For example, currently running 'ps', the task->comm value of a long
> > > > task name is truncated due to the limitation of TASK_COMM_LEN.
> > > >      create_very_lon
> > > > 
> > > > This leads to the names passed from userland via pthread_setname_np()
> > > > being truncated.
> > > 
> > > So there have been long discussions about "comm", and it mainly boils
> > > down to "leave it alone". For the /proc-scraping tools like "ps" and
> > > "top", they check both "comm" and "cmdline", depending on mode. The more
> > > useful (and already untruncated) stuff is in "cmdline", so I suspect it
> > > may make more sense to have pthread_setname_np() interact with that
> > > instead. Also TASK_COMM_LEN is basically considered userspace ABI at
> > > this point and we can't sanely change its length without breaking the
> > > world.
> > > 
> > 
> > Completely agree that comm is best left untouched. TASK_COMM_LEN is
> > embedded into the kernel and the pthread ABI changes here should be
> > avoided.
> > 
> 
> So, basically my approach _does not_ touch TASK_COMM_LEN at all. The normal
> 'TASK_COMM_LEN' 16byte design remains untouched.
> Which means that all the legacy / existing ABi which uses 'task->comm' and
> hence are designed / written to handle 'TASK_COMM_LEN' 16-byte name,
> continue to work as before using '/proc/$pid/task/$tid/comm'.
> 
> This change-set only adds a _parallel_ dynamically allocated
> 'task->full_name' which can be used by interested users via
> '/proc/$pid/task/$tid/full_name'.

I don't want to add this to all processes at exec time as the existing
solution works for those processes: read /proc/$pid/cmdline.

That said, adding another pointer to task_struct isn't to bad I guess,
and it could be updated by later calls. Maybe by default it just points
to "comm".

> I am fine with adding either '/proc/$pid/task/$tid/full_name' or
> '/proc/$pid/task/$tid/debug_name' (actually both of these achieve the same).
> The new / modified users (especially the debug applications you listed
> above) can switch easily to using '/proc/$pid/task/$tid/full_name' instead
> of ''/proc/$pid/task/$tid/comm'
> 
> AFAIK we already achieved for the kthreads using d6986ce24fc00 ("kthread:
> dynamically allocate memory to store kthread's full name"), which adds
> 'full_name' in parallel to 'comm' for kthread names.

If we do this for task_struct, we should remove "full_name" from kthread
and generalize it for all processes.

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH RFC 0/2] Dynamically allocate memory to store task's full name
  2025-03-18 15:51       ` Kees Cook
@ 2025-03-18 18:06         ` Bhupesh Sharma
  0 siblings, 0 replies; 8+ messages in thread
From: Bhupesh Sharma @ 2025-03-18 18:06 UTC (permalink / raw)
  To: Kees Cook
  Cc: Andres Rodriguez, Bhupesh, akpm, kernel-dev, linux-kernel, bpf,
	linux-perf-users, linux-fsdevel, linux-mm, oliver.sang, lkp,
	laoar.shao, pmladek, rostedt, mathieu.desnoyers, arnaldo.melo,
	alexei.starovoitov, andrii.nakryiko, mirq-linux, peterz, willy,
	david, viro, ebiederm, brauner, jack, mingo, juri.lelli, bsegall,
	mgorman, vschneid

Hi Kees,

On 3/18/25 9:21 PM, Kees Cook wrote:
> On Tue, Mar 18, 2025 at 04:49:28PM +0530, Bhupesh Sharma wrote:
>> On 3/15/25 1:13 PM, Andres Rodriguez wrote:
>>> On 3/14/25 14:25, Kees Cook wrote:
>>>> On Fri, Mar 14, 2025 at 10:57:13AM +0530, Bhupesh wrote:
>>>>> While working with user-space debugging tools which work especially
>>>>> on linux gaming platforms, I found that the task name is truncated due
>>>>> to the limitation of TASK_COMM_LEN.
>>>>>
>>>>> For example, currently running 'ps', the task->comm value of a long
>>>>> task name is truncated due to the limitation of TASK_COMM_LEN.
>>>>>       create_very_lon
>>>>>
>>>>> This leads to the names passed from userland via pthread_setname_np()
>>>>> being truncated.
>>>> So there have been long discussions about "comm", and it mainly boils
>>>> down to "leave it alone". For the /proc-scraping tools like "ps" and
>>>> "top", they check both "comm" and "cmdline", depending on mode. The more
>>>> useful (and already untruncated) stuff is in "cmdline", so I suspect it
>>>> may make more sense to have pthread_setname_np() interact with that
>>>> instead. Also TASK_COMM_LEN is basically considered userspace ABI at
>>>> this point and we can't sanely change its length without breaking the
>>>> world.
>>>>
>>> Completely agree that comm is best left untouched. TASK_COMM_LEN is
>>> embedded into the kernel and the pthread ABI changes here should be
>>> avoided.
>>>
>> So, basically my approach _does not_ touch TASK_COMM_LEN at all. The normal
>> 'TASK_COMM_LEN' 16byte design remains untouched.
>> Which means that all the legacy / existing ABi which uses 'task->comm' and
>> hence are designed / written to handle 'TASK_COMM_LEN' 16-byte name,
>> continue to work as before using '/proc/$pid/task/$tid/comm'.
>>
>> This change-set only adds a _parallel_ dynamically allocated
>> 'task->full_name' which can be used by interested users via
>> '/proc/$pid/task/$tid/full_name'.
> I don't want to add this to all processes at exec time as the existing
> solution works for those processes: read /proc/$pid/cmdline.
>
> That said, adding another pointer to task_struct isn't to bad I guess,
> and it could be updated by later calls. Maybe by default it just points
> to "comm".
Sure.

>
>> I am fine with adding either '/proc/$pid/task/$tid/full_name' or
>> '/proc/$pid/task/$tid/debug_name' (actually both of these achieve the same).
>> The new / modified users (especially the debug applications you listed
>> above) can switch easily to using '/proc/$pid/task/$tid/full_name' instead
>> of ''/proc/$pid/task/$tid/comm'
>>
>> AFAIK we already achieved for the kthreads using d6986ce24fc00 ("kthread:
>> dynamically allocate memory to store kthread's full name"), which adds
>> 'full_name' in parallel to 'comm' for kthread names.
> If we do this for task_struct, we should remove "full_name" from kthread
> and generalize it for all processes.
>
Got it. Ok, let me rework the series so that we have a unified 
'full_name' inside 'task_struct' and have kthread use it as well.

I will send a v2 accordingly.

Thanks,
Bhupesh

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2025-03-18 18:06 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-03-14  5:27 [PATCH RFC 0/2] Dynamically allocate memory to store task's full name Bhupesh
2025-03-14  5:27 ` [PATCH RFC 1/2] exec: " Bhupesh
2025-03-14  5:27 ` [PATCH RFC 2/2] fs/proc: Pass 'task->full_name' via 'proc_task_name()' Bhupesh
2025-03-14 21:25 ` [PATCH RFC 0/2] Dynamically allocate memory to store task's full name Kees Cook
2025-03-15  7:43   ` Andres Rodriguez
2025-03-18 11:19     ` Bhupesh Sharma
2025-03-18 15:51       ` Kees Cook
2025-03-18 18:06         ` Bhupesh Sharma

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).