linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/3] Dynamically allocate memory to store task's full name
@ 2025-03-31 12:18 Bhupesh
  2025-03-31 12:18 ` [PATCH v2 1/3] exec: " Bhupesh
                   ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Bhupesh @ 2025-03-31 12:18 UTC (permalink / raw)
  To: akpm
  Cc: bhupesh, kernel-dev, linux-kernel, bpf, linux-perf-users,
	linux-fsdevel, linux-mm, oliver.sang, lkp, laoar.shao, pmladek,
	rostedt, mathieu.desnoyers, arnaldo.melo, alexei.starovoitov,
	andrii.nakryiko, mirq-linux, peterz, willy, david, viro, keescook,
	ebiederm, brauner, jack, mingo, juri.lelli, bsegall, mgorman,
	vschneid

Changes since v1:
================
- v1 can be seen here: https://lore.kernel.org/lkml/20250314052715.610377-1-bhupesh@igalia.com/
- As suggested by Kees, added [PATCH 3/3] to have a consistent
  'full_name' entry inside 'task_struct' which both tasks and
  kthreads can use.
- Fixed the commit message to indicate that the existing ABI
  '/proc/$pid/task/$tid/comm' remains untouched and a parallel
  '/proc/$pid/task/$tid/full_name' ABI for new (interested) users.

While working with user-space debugging tools which work especially
on linux gaming platforms, I found that the task name is truncated due
to the limitation of TASK_COMM_LEN.

Now, during debug tracing, seeing truncated names is not very useful,
especially on gaming platforms where the number of tasks running can
be very high.

This patch does not touch 'TASK_COMM_LEN' at all, i.e.
'TASK_COMM_LEN' and the 16-byte design remains untouched. Which means
that all the legacy / existing ABI, continue to work as before using
'/proc/$pid/task/$tid/comm'.

This patch only adds a parallel, dynamically-allocated
'task->full_name' which can be used by interested users
via '/proc/$pid/task/$tid/full_name'.

After this change, gdb is able to show full name of the task, using a
simple app which generates threads with long names [see 1]:
  # gdb ./threadnames -ex "run info thread" -ex "detach" -ex "quit" > log
  # cat log

  NameThatIsTooLongForComm[4662]

[1]. https://github.com/lostgoat/tasknames

Bhupesh (3):
  exec: Dynamically allocate memory to store task's full name
  fs/proc: Pass 'task->full_name' via 'proc_task_name()'
  kthread: Use 'task_struct->full_name' to store kthread's full name

 fs/exec.c             | 21 ++++++++++++++++++---
 fs/proc/array.c       |  2 +-
 include/linux/sched.h |  9 +++++++++
 kernel/kthread.c      |  9 +++------
 4 files changed, 31 insertions(+), 10 deletions(-)

-- 
2.38.1



^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v2 1/3] exec: Dynamically allocate memory to store task's full name
  2025-03-31 12:18 [PATCH v2 0/3] Dynamically allocate memory to store task's full name Bhupesh
@ 2025-03-31 12:18 ` Bhupesh
  2025-04-01  2:07   ` Yafang Shao
                     ` (3 more replies)
  2025-03-31 12:18 ` [PATCH v2 2/3] fs/proc: Pass 'task->full_name' via 'proc_task_name()' Bhupesh
  2025-03-31 12:18 ` [PATCH v2 3/3] kthread: Use 'task_struct->full_name' to store kthread's full name Bhupesh
  2 siblings, 4 replies; 18+ messages in thread
From: Bhupesh @ 2025-03-31 12:18 UTC (permalink / raw)
  To: akpm
  Cc: bhupesh, kernel-dev, linux-kernel, bpf, linux-perf-users,
	linux-fsdevel, linux-mm, oliver.sang, lkp, laoar.shao, pmladek,
	rostedt, mathieu.desnoyers, arnaldo.melo, alexei.starovoitov,
	andrii.nakryiko, mirq-linux, peterz, willy, david, viro, keescook,
	ebiederm, brauner, jack, mingo, juri.lelli, bsegall, mgorman,
	vschneid

Provide a parallel implementation for get_task_comm() called
get_task_full_name() which allows the dynamically allocated
and filled-in task's full name to be passed to interested
users such as 'gdb'.

Currently while running 'gdb', the 'task->comm' value of a long
task name is truncated due to the limitation of TASK_COMM_LEN.

For example using gdb to debug a simple app currently which generate
threads with long task names:
  # gdb ./threadnames -ex "run info thread" -ex "detach" -ex "quit" > log
  # cat log

  NameThatIsTooLo

This patch does not touch 'TASK_COMM_LEN' at all, i.e.
'TASK_COMM_LEN' and the 16-byte design remains untouched. Which means
that all the legacy / existing ABI, continue to work as before using
'/proc/$pid/task/$tid/comm'.

This patch only adds a parallel, dynamically-allocated
'task->full_name' which can be used by interested users
via '/proc/$pid/task/$tid/full_name'.

After this change, gdb is able to show full name of the task:
  # gdb ./threadnames -ex "run info thread" -ex "detach" -ex "quit" > log
  # cat log

  NameThatIsTooLongForComm[4662]

Signed-off-by: Bhupesh <bhupesh@igalia.com>
---
 fs/exec.c             | 21 ++++++++++++++++++---
 include/linux/sched.h |  9 +++++++++
 2 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index f45859ad13ac..4219d77a519c 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1208,6 +1208,9 @@ int begin_new_exec(struct linux_binprm * bprm)
 {
 	struct task_struct *me = current;
 	int retval;
+	va_list args;
+	char *name;
+	const char *fmt;
 
 	/* Once we are committed compute the creds */
 	retval = bprm_creds_from_file(bprm);
@@ -1348,11 +1351,22 @@ int begin_new_exec(struct linux_binprm * bprm)
 		 * detecting a concurrent rename and just want a terminated name.
 		 */
 		rcu_read_lock();
-		__set_task_comm(me, smp_load_acquire(&bprm->file->f_path.dentry->d_name.name),
-				true);
+		fmt = smp_load_acquire(&bprm->file->f_path.dentry->d_name.name);
+		name = kvasprintf(GFP_KERNEL, fmt, args);
+		if (!name)
+			return -ENOMEM;
+
+		me->full_name = name;
+		__set_task_comm(me, fmt, true);
 		rcu_read_unlock();
 	} else {
-		__set_task_comm(me, kbasename(bprm->filename), true);
+		fmt = kbasename(bprm->filename);
+		name = kvasprintf(GFP_KERNEL, fmt, args);
+		if (!name)
+			return -ENOMEM;
+
+		me->full_name = name;
+		__set_task_comm(me, fmt, true);
 	}
 
 	/* An exec changes our domain. We are no longer part of the thread
@@ -1399,6 +1413,7 @@ int begin_new_exec(struct linux_binprm * bprm)
 	return 0;
 
 out_unlock:
+	kfree(me->full_name);
 	up_write(&me->signal->exec_update_lock);
 	if (!bprm->cred)
 		mutex_unlock(&me->signal->cred_guard_mutex);
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 56ddeb37b5cd..053b52606652 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1166,6 +1166,9 @@ struct task_struct {
 	 */
 	char				comm[TASK_COMM_LEN];
 
+	/* To store the full name if task comm is truncated. */
+	char				*full_name;
+
 	struct nameidata		*nameidata;
 
 #ifdef CONFIG_SYSVIPC
@@ -2007,6 +2010,12 @@ extern void __set_task_comm(struct task_struct *tsk, const char *from, bool exec
 	buf;						\
 })
 
+#define get_task_full_name(buf, buf_size, tsk) ({	\
+	BUILD_BUG_ON(sizeof(buf) < TASK_COMM_LEN);	\
+	strscpy_pad(buf, (tsk)->full_name, buf_size);	\
+	buf;						\
+})
+
 #ifdef CONFIG_SMP
 static __always_inline void scheduler_ipi(void)
 {
-- 
2.38.1



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v2 2/3] fs/proc: Pass 'task->full_name' via 'proc_task_name()'
  2025-03-31 12:18 [PATCH v2 0/3] Dynamically allocate memory to store task's full name Bhupesh
  2025-03-31 12:18 ` [PATCH v2 1/3] exec: " Bhupesh
@ 2025-03-31 12:18 ` Bhupesh
  2025-03-31 12:18 ` [PATCH v2 3/3] kthread: Use 'task_struct->full_name' to store kthread's full name Bhupesh
  2 siblings, 0 replies; 18+ messages in thread
From: Bhupesh @ 2025-03-31 12:18 UTC (permalink / raw)
  To: akpm
  Cc: bhupesh, kernel-dev, linux-kernel, bpf, linux-perf-users,
	linux-fsdevel, linux-mm, oliver.sang, lkp, laoar.shao, pmladek,
	rostedt, mathieu.desnoyers, arnaldo.melo, alexei.starovoitov,
	andrii.nakryiko, mirq-linux, peterz, willy, david, viro, keescook,
	ebiederm, brauner, jack, mingo, juri.lelli, bsegall, mgorman,
	vschneid

Now that we have the get_task_full_name() implementation which allows
the dynamically allocated and filled in task's full name to be passed
to interested users, use it in proc_task_name() by default for
task names so that user-land can see them through appropriate tools
(such as 'ps').

Signed-off-by: Bhupesh <bhupesh@igalia.com>
---
 fs/proc/array.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/proc/array.c b/fs/proc/array.c
index d6a0369caa93..2cbeb1584f8a 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -109,7 +109,7 @@ void proc_task_name(struct seq_file *m, struct task_struct *p, bool escape)
 	else if (p->flags & PF_KTHREAD)
 		get_kthread_comm(tcomm, sizeof(tcomm), p);
 	else
-		get_task_comm(tcomm, p);
+		get_task_full_name(tcomm, sizeof(tcomm), p);
 
 	if (escape)
 		seq_escape_str(m, tcomm, ESCAPE_SPACE | ESCAPE_SPECIAL, "\n\\");
-- 
2.38.1



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v2 3/3] kthread: Use 'task_struct->full_name' to store kthread's full name
  2025-03-31 12:18 [PATCH v2 0/3] Dynamically allocate memory to store task's full name Bhupesh
  2025-03-31 12:18 ` [PATCH v2 1/3] exec: " Bhupesh
  2025-03-31 12:18 ` [PATCH v2 2/3] fs/proc: Pass 'task->full_name' via 'proc_task_name()' Bhupesh
@ 2025-03-31 12:18 ` Bhupesh
  2025-04-03 16:24   ` Kees Cook
  2 siblings, 1 reply; 18+ messages in thread
From: Bhupesh @ 2025-03-31 12:18 UTC (permalink / raw)
  To: akpm
  Cc: bhupesh, kernel-dev, linux-kernel, bpf, linux-perf-users,
	linux-fsdevel, linux-mm, oliver.sang, lkp, laoar.shao, pmladek,
	rostedt, mathieu.desnoyers, arnaldo.melo, alexei.starovoitov,
	andrii.nakryiko, mirq-linux, peterz, willy, david, viro, keescook,
	ebiederm, brauner, jack, mingo, juri.lelli, bsegall, mgorman,
	vschneid

Commit 6986ce24fc00 ("kthread: dynamically allocate memory to store
kthread's full name"), added 'full_name' in parallel to 'comm' for
kthread names.

Now that we have added 'full_name' added to 'task_struct' itself,
drop the additional 'full_name' entry from 'struct kthread' and also
its usage.

Signed-off-by: Bhupesh <bhupesh@igalia.com>
---
 kernel/kthread.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/kernel/kthread.c b/kernel/kthread.c
index 5dc5b0d7238e..46fe19b7ef76 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -66,8 +66,6 @@ struct kthread {
 #ifdef CONFIG_BLK_CGROUP
 	struct cgroup_subsys_state *blkcg_css;
 #endif
-	/* To store the full name if task comm is truncated. */
-	char *full_name;
 	struct task_struct *task;
 	struct list_head hotplug_node;
 	struct cpumask *preferred_affinity;
@@ -108,12 +106,12 @@ void get_kthread_comm(char *buf, size_t buf_size, struct task_struct *tsk)
 {
 	struct kthread *kthread = to_kthread(tsk);
 
-	if (!kthread || !kthread->full_name) {
+	if (!kthread || !tsk->full_name) {
 		strscpy(buf, tsk->comm, buf_size);
 		return;
 	}
 
-	strscpy_pad(buf, kthread->full_name, buf_size);
+	strscpy_pad(buf, tsk->full_name, buf_size);
 }
 
 bool set_kthread_struct(struct task_struct *p)
@@ -153,7 +151,6 @@ void free_kthread_struct(struct task_struct *k)
 	WARN_ON_ONCE(kthread->blkcg_css);
 #endif
 	k->worker_private = NULL;
-	kfree(kthread->full_name);
 	kfree(kthread);
 }
 
@@ -430,7 +427,7 @@ static int kthread(void *_create)
 		kthread_exit(-EINTR);
 	}
 
-	self->full_name = create->full_name;
+	self->task->full_name = create->full_name;
 	self->threadfn = threadfn;
 	self->data = data;
 
-- 
2.38.1



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 1/3] exec: Dynamically allocate memory to store task's full name
  2025-03-31 12:18 ` [PATCH v2 1/3] exec: " Bhupesh
@ 2025-04-01  2:07   ` Yafang Shao
  2025-04-04  6:35     ` Bhupesh Sharma
  2025-04-01  4:02   ` Harry Yoo
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 18+ messages in thread
From: Yafang Shao @ 2025-04-01  2:07 UTC (permalink / raw)
  To: Bhupesh, Linus Torvalds
  Cc: akpm, kernel-dev, linux-kernel, bpf, linux-perf-users,
	linux-fsdevel, linux-mm, oliver.sang, lkp, pmladek, rostedt,
	mathieu.desnoyers, arnaldo.melo, alexei.starovoitov,
	andrii.nakryiko, mirq-linux, peterz, willy, david, viro, keescook,
	ebiederm, brauner, jack, mingo, juri.lelli, bsegall, mgorman,
	vschneid

On Mon, Mar 31, 2025 at 8:18 PM Bhupesh <bhupesh@igalia.com> wrote:
>
> Provide a parallel implementation for get_task_comm() called
> get_task_full_name() which allows the dynamically allocated
> and filled-in task's full name to be passed to interested
> users such as 'gdb'.
>
> Currently while running 'gdb', the 'task->comm' value of a long
> task name is truncated due to the limitation of TASK_COMM_LEN.
>
> For example using gdb to debug a simple app currently which generate
> threads with long task names:
>   # gdb ./threadnames -ex "run info thread" -ex "detach" -ex "quit" > log
>   # cat log
>
>   NameThatIsTooLo
>
> This patch does not touch 'TASK_COMM_LEN' at all, i.e.
> 'TASK_COMM_LEN' and the 16-byte design remains untouched. Which means
> that all the legacy / existing ABI, continue to work as before using
> '/proc/$pid/task/$tid/comm'.
>
> This patch only adds a parallel, dynamically-allocated
> 'task->full_name' which can be used by interested users
> via '/proc/$pid/task/$tid/full_name'.
>
> After this change, gdb is able to show full name of the task:
>   # gdb ./threadnames -ex "run info thread" -ex "detach" -ex "quit" > log
>   # cat log
>
>   NameThatIsTooLongForComm[4662]
>
> Signed-off-by: Bhupesh <bhupesh@igalia.com>
> ---
>  fs/exec.c             | 21 ++++++++++++++++++---
>  include/linux/sched.h |  9 +++++++++
>  2 files changed, 27 insertions(+), 3 deletions(-)
>
> diff --git a/fs/exec.c b/fs/exec.c
> index f45859ad13ac..4219d77a519c 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -1208,6 +1208,9 @@ int begin_new_exec(struct linux_binprm * bprm)
>  {
>         struct task_struct *me = current;
>         int retval;
> +       va_list args;
> +       char *name;
> +       const char *fmt;
>
>         /* Once we are committed compute the creds */
>         retval = bprm_creds_from_file(bprm);
> @@ -1348,11 +1351,22 @@ int begin_new_exec(struct linux_binprm * bprm)
>                  * detecting a concurrent rename and just want a terminated name.
>                  */
>                 rcu_read_lock();
> -               __set_task_comm(me, smp_load_acquire(&bprm->file->f_path.dentry->d_name.name),
> -                               true);
> +               fmt = smp_load_acquire(&bprm->file->f_path.dentry->d_name.name);
> +               name = kvasprintf(GFP_KERNEL, fmt, args);
> +               if (!name)
> +                       return -ENOMEM;
> +
> +               me->full_name = name;
> +               __set_task_comm(me, fmt, true);
>                 rcu_read_unlock();
>         } else {
> -               __set_task_comm(me, kbasename(bprm->filename), true);
> +               fmt = kbasename(bprm->filename);
> +               name = kvasprintf(GFP_KERNEL, fmt, args);
> +               if (!name)
> +                       return -ENOMEM;
> +
> +               me->full_name = name;
> +               __set_task_comm(me, fmt, true);
>         }
>
>         /* An exec changes our domain. We are no longer part of the thread
> @@ -1399,6 +1413,7 @@ int begin_new_exec(struct linux_binprm * bprm)
>         return 0;
>
>  out_unlock:
> +       kfree(me->full_name);
>         up_write(&me->signal->exec_update_lock);
>         if (!bprm->cred)
>                 mutex_unlock(&me->signal->cred_guard_mutex);
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 56ddeb37b5cd..053b52606652 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1166,6 +1166,9 @@ struct task_struct {
>          */
>         char                            comm[TASK_COMM_LEN];
>
> +       /* To store the full name if task comm is truncated. */
> +       char                            *full_name;
> +

Adding another field to store the task name isn’t ideal. What about
combining them into a single field, as Linus suggested [0]?

[0]. https://lore.kernel.org/all/CAHk-=wjAmmHUg6vho1KjzQi2=psR30+CogFd4aXrThr2gsiS4g@mail.gmail.com/

-- 
Regards
Yafang


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 1/3] exec: Dynamically allocate memory to store task's full name
  2025-03-31 12:18 ` [PATCH v2 1/3] exec: " Bhupesh
  2025-04-01  2:07   ` Yafang Shao
@ 2025-04-01  4:02   ` Harry Yoo
  2025-04-04  5:31     ` Bhupesh Sharma
  2025-04-03 16:17   ` Andrii Nakryiko
  2025-04-03 16:30   ` Kees Cook
  3 siblings, 1 reply; 18+ messages in thread
From: Harry Yoo @ 2025-04-01  4:02 UTC (permalink / raw)
  To: Bhupesh
  Cc: akpm, kernel-dev, linux-kernel, bpf, linux-perf-users,
	linux-fsdevel, linux-mm, oliver.sang, lkp, laoar.shao, pmladek,
	rostedt, mathieu.desnoyers, arnaldo.melo, alexei.starovoitov,
	andrii.nakryiko, mirq-linux, peterz, willy, david, viro, keescook,
	ebiederm, brauner, jack, mingo, juri.lelli, bsegall, mgorman,
	vschneid

On Mon, Mar 31, 2025 at 05:48:18PM +0530, Bhupesh wrote:
> Provide a parallel implementation for get_task_comm() called
> get_task_full_name() which allows the dynamically allocated
> and filled-in task's full name to be passed to interested
> users such as 'gdb'.
> 
> Currently while running 'gdb', the 'task->comm' value of a long
> task name is truncated due to the limitation of TASK_COMM_LEN.
> 
> For example using gdb to debug a simple app currently which generate
> threads with long task names:
>   # gdb ./threadnames -ex "run info thread" -ex "detach" -ex "quit" > log
>   # cat log
> 
>   NameThatIsTooLo
> 
> This patch does not touch 'TASK_COMM_LEN' at all, i.e.
> 'TASK_COMM_LEN' and the 16-byte design remains untouched. Which means
> that all the legacy / existing ABI, continue to work as before using
> '/proc/$pid/task/$tid/comm'.
> 
> This patch only adds a parallel, dynamically-allocated
> 'task->full_name' which can be used by interested users
> via '/proc/$pid/task/$tid/full_name'.
> 
> After this change, gdb is able to show full name of the task:
>   # gdb ./threadnames -ex "run info thread" -ex "detach" -ex "quit" > log
>   # cat log
> 
>   NameThatIsTooLongForComm[4662]
> 
> Signed-off-by: Bhupesh <bhupesh@igalia.com>
> ---
>  fs/exec.c             | 21 ++++++++++++++++++---
>  include/linux/sched.h |  9 +++++++++
>  2 files changed, 27 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/exec.c b/fs/exec.c
> index f45859ad13ac..4219d77a519c 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -1208,6 +1208,9 @@ int begin_new_exec(struct linux_binprm * bprm)
>  {
>  	struct task_struct *me = current;
>  	int retval;
> +	va_list args;
> +	char *name;
> +	const char *fmt;
>  
>  	/* Once we are committed compute the creds */
>  	retval = bprm_creds_from_file(bprm);
> @@ -1348,11 +1351,22 @@ int begin_new_exec(struct linux_binprm * bprm)
>  		 * detecting a concurrent rename and just want a terminated name.
>  		 */
>  		rcu_read_lock();
> -		__set_task_comm(me, smp_load_acquire(&bprm->file->f_path.dentry->d_name.name),
> -				true);
> +		fmt = smp_load_acquire(&bprm->file->f_path.dentry->d_name.name);
> +		name = kvasprintf(GFP_KERNEL, fmt, args);
> +		if (!name)
> +			return -ENOMEM;

Is it safe to return error here, instead of jumping to 'out_unlock' label
and then releasing locks?

> +		me->full_name = name;
> +		__set_task_comm(me, fmt, true);
>  		rcu_read_unlock();
>  	} else {
> -		__set_task_comm(me, kbasename(bprm->filename), true);
> +		fmt = kbasename(bprm->filename);
> +		name = kvasprintf(GFP_KERNEL, fmt, args);
> +		if (!name)
> +			return -ENOMEM;
> +
> +		me->full_name = name;
> +		__set_task_comm(me, fmt, true);
>  	}
>  
>  	/* An exec changes our domain. We are no longer part of the thread
> @@ -1399,6 +1413,7 @@ int begin_new_exec(struct linux_binprm * bprm)
>  	return 0;
>  
>  out_unlock:
> +	kfree(me->full_name);
>  	up_write(&me->signal->exec_update_lock);
>  	if (!bprm->cred)
>  		mutex_unlock(&me->signal->cred_guard_mutex);
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 56ddeb37b5cd..053b52606652 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1166,6 +1166,9 @@ struct task_struct {
>  	 */
>  	char				comm[TASK_COMM_LEN];
>  
> +	/* To store the full name if task comm is truncated. */
> +	char				*full_name;
> +
>  	struct nameidata		*nameidata;
>  
>  #ifdef CONFIG_SYSVIPC
> @@ -2007,6 +2010,12 @@ extern void __set_task_comm(struct task_struct *tsk, const char *from, bool exec
>  	buf;						\
>  })
>  
> +#define get_task_full_name(buf, buf_size, tsk) ({	\
> +	BUILD_BUG_ON(sizeof(buf) < TASK_COMM_LEN);	\
> +	strscpy_pad(buf, (tsk)->full_name, buf_size);	\
> +	buf;						\
> +})
> +
>  #ifdef CONFIG_SMP
>  static __always_inline void scheduler_ipi(void)
>  {
> -- 
> 2.38.1
> 
> 

-- 
Cheers,
Harry (formerly known as Hyeonggon)


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 1/3] exec: Dynamically allocate memory to store task's full name
  2025-03-31 12:18 ` [PATCH v2 1/3] exec: " Bhupesh
  2025-04-01  2:07   ` Yafang Shao
  2025-04-01  4:02   ` Harry Yoo
@ 2025-04-03 16:17   ` Andrii Nakryiko
  2025-04-04  5:36     ` Bhupesh Sharma
  2025-04-03 16:30   ` Kees Cook
  3 siblings, 1 reply; 18+ messages in thread
From: Andrii Nakryiko @ 2025-04-03 16:17 UTC (permalink / raw)
  To: Bhupesh
  Cc: akpm, kernel-dev, linux-kernel, bpf, linux-perf-users,
	linux-fsdevel, linux-mm, oliver.sang, lkp, laoar.shao, pmladek,
	rostedt, mathieu.desnoyers, arnaldo.melo, alexei.starovoitov,
	mirq-linux, peterz, willy, david, viro, keescook, ebiederm,
	brauner, jack, mingo, juri.lelli, bsegall, mgorman, vschneid

On Mon, Mar 31, 2025 at 5:18 AM Bhupesh <bhupesh@igalia.com> wrote:
>
> Provide a parallel implementation for get_task_comm() called
> get_task_full_name() which allows the dynamically allocated
> and filled-in task's full name to be passed to interested
> users such as 'gdb'.
>
> Currently while running 'gdb', the 'task->comm' value of a long
> task name is truncated due to the limitation of TASK_COMM_LEN.
>
> For example using gdb to debug a simple app currently which generate
> threads with long task names:
>   # gdb ./threadnames -ex "run info thread" -ex "detach" -ex "quit" > log
>   # cat log
>
>   NameThatIsTooLo
>
> This patch does not touch 'TASK_COMM_LEN' at all, i.e.
> 'TASK_COMM_LEN' and the 16-byte design remains untouched. Which means
> that all the legacy / existing ABI, continue to work as before using
> '/proc/$pid/task/$tid/comm'.
>
> This patch only adds a parallel, dynamically-allocated
> 'task->full_name' which can be used by interested users
> via '/proc/$pid/task/$tid/full_name'.
>
> After this change, gdb is able to show full name of the task:
>   # gdb ./threadnames -ex "run info thread" -ex "detach" -ex "quit" > log
>   # cat log
>
>   NameThatIsTooLongForComm[4662]
>
> Signed-off-by: Bhupesh <bhupesh@igalia.com>
> ---
>  fs/exec.c             | 21 ++++++++++++++++++---
>  include/linux/sched.h |  9 +++++++++
>  2 files changed, 27 insertions(+), 3 deletions(-)
>
> diff --git a/fs/exec.c b/fs/exec.c
> index f45859ad13ac..4219d77a519c 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -1208,6 +1208,9 @@ int begin_new_exec(struct linux_binprm * bprm)
>  {
>         struct task_struct *me = current;
>         int retval;
> +       va_list args;
> +       char *name;
> +       const char *fmt;
>
>         /* Once we are committed compute the creds */
>         retval = bprm_creds_from_file(bprm);
> @@ -1348,11 +1351,22 @@ int begin_new_exec(struct linux_binprm * bprm)
>                  * detecting a concurrent rename and just want a terminated name.
>                  */
>                 rcu_read_lock();
> -               __set_task_comm(me, smp_load_acquire(&bprm->file->f_path.dentry->d_name.name),
> -                               true);
> +               fmt = smp_load_acquire(&bprm->file->f_path.dentry->d_name.name);
> +               name = kvasprintf(GFP_KERNEL, fmt, args);

this `args` argument, it's not initialized anywhere, right? It's not
clear where it's coming from, but you are passing it directly into
kvasprintf(), I can't convince myself that this is correct. Can you
please explain what is happening here?

Also, instead of allocating a buffer unconditionally, maybe check that
comm is longer than 16, and if not, just use the old-schoold 16-byte
comm array?


> +               if (!name)
> +                       return -ENOMEM;
> +
> +               me->full_name = name;
> +               __set_task_comm(me, fmt, true);
>                 rcu_read_unlock();
>         } else {
> -               __set_task_comm(me, kbasename(bprm->filename), true);
> +               fmt = kbasename(bprm->filename);
> +               name = kvasprintf(GFP_KERNEL, fmt, args);
> +               if (!name)
> +                       return -ENOMEM;
> +
> +               me->full_name = name;
> +               __set_task_comm(me, fmt, true);
>         }
>
>         /* An exec changes our domain. We are no longer part of the thread
> @@ -1399,6 +1413,7 @@ int begin_new_exec(struct linux_binprm * bprm)
>         return 0;
>
>  out_unlock:
> +       kfree(me->full_name);
>         up_write(&me->signal->exec_update_lock);
>         if (!bprm->cred)
>                 mutex_unlock(&me->signal->cred_guard_mutex);
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 56ddeb37b5cd..053b52606652 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1166,6 +1166,9 @@ struct task_struct {
>          */
>         char                            comm[TASK_COMM_LEN];
>
> +       /* To store the full name if task comm is truncated. */
> +       char                            *full_name;
> +
>         struct nameidata                *nameidata;
>
>  #ifdef CONFIG_SYSVIPC
> @@ -2007,6 +2010,12 @@ extern void __set_task_comm(struct task_struct *tsk, const char *from, bool exec
>         buf;                                            \
>  })
>
> +#define get_task_full_name(buf, buf_size, tsk) ({      \
> +       BUILD_BUG_ON(sizeof(buf) < TASK_COMM_LEN);      \
> +       strscpy_pad(buf, (tsk)->full_name, buf_size);   \
> +       buf;                                            \
> +})
> +
>  #ifdef CONFIG_SMP
>  static __always_inline void scheduler_ipi(void)
>  {
> --
> 2.38.1
>


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/3] kthread: Use 'task_struct->full_name' to store kthread's full name
  2025-03-31 12:18 ` [PATCH v2 3/3] kthread: Use 'task_struct->full_name' to store kthread's full name Bhupesh
@ 2025-04-03 16:24   ` Kees Cook
  2025-04-04  6:38     ` Bhupesh Sharma
  0 siblings, 1 reply; 18+ messages in thread
From: Kees Cook @ 2025-04-03 16:24 UTC (permalink / raw)
  To: Bhupesh
  Cc: akpm, kernel-dev, linux-kernel, bpf, linux-perf-users,
	linux-fsdevel, linux-mm, oliver.sang, lkp, laoar.shao, pmladek,
	rostedt, mathieu.desnoyers, arnaldo.melo, alexei.starovoitov,
	andrii.nakryiko, mirq-linux, peterz, willy, david, viro, ebiederm,
	brauner, jack, mingo, juri.lelli, bsegall, mgorman, vschneid

On Mon, Mar 31, 2025 at 05:48:20PM +0530, Bhupesh wrote:
> Commit 6986ce24fc00 ("kthread: dynamically allocate memory to store
> kthread's full name"), added 'full_name' in parallel to 'comm' for
> kthread names.
> 
> Now that we have added 'full_name' added to 'task_struct' itself,
> drop the additional 'full_name' entry from 'struct kthread' and also
> its usage.
> 
> Signed-off-by: Bhupesh <bhupesh@igalia.com>

I'd like to see this patch be the first patch in the series. This show
the existing use for "full_name". (And as such it'll probably need bits
from patch 1.)

-Kees

> ---
>  kernel/kthread.c | 9 +++------
>  1 file changed, 3 insertions(+), 6 deletions(-)
> 
> diff --git a/kernel/kthread.c b/kernel/kthread.c
> index 5dc5b0d7238e..46fe19b7ef76 100644
> --- a/kernel/kthread.c
> +++ b/kernel/kthread.c
> @@ -66,8 +66,6 @@ struct kthread {
>  #ifdef CONFIG_BLK_CGROUP
>  	struct cgroup_subsys_state *blkcg_css;
>  #endif
> -	/* To store the full name if task comm is truncated. */
> -	char *full_name;
>  	struct task_struct *task;
>  	struct list_head hotplug_node;
>  	struct cpumask *preferred_affinity;
> @@ -108,12 +106,12 @@ void get_kthread_comm(char *buf, size_t buf_size, struct task_struct *tsk)
>  {
>  	struct kthread *kthread = to_kthread(tsk);
>  
> -	if (!kthread || !kthread->full_name) {
> +	if (!kthread || !tsk->full_name) {
>  		strscpy(buf, tsk->comm, buf_size);
>  		return;
>  	}
>  
> -	strscpy_pad(buf, kthread->full_name, buf_size);
> +	strscpy_pad(buf, tsk->full_name, buf_size);
>  }
>  
>  bool set_kthread_struct(struct task_struct *p)
> @@ -153,7 +151,6 @@ void free_kthread_struct(struct task_struct *k)
>  	WARN_ON_ONCE(kthread->blkcg_css);
>  #endif
>  	k->worker_private = NULL;
> -	kfree(kthread->full_name);
>  	kfree(kthread);
>  }
>  
> @@ -430,7 +427,7 @@ static int kthread(void *_create)
>  		kthread_exit(-EINTR);
>  	}
>  
> -	self->full_name = create->full_name;
> +	self->task->full_name = create->full_name;
>  	self->threadfn = threadfn;
>  	self->data = data;
>  
> -- 
> 2.38.1
> 

-- 
Kees Cook


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 1/3] exec: Dynamically allocate memory to store task's full name
  2025-03-31 12:18 ` [PATCH v2 1/3] exec: " Bhupesh
                     ` (2 preceding siblings ...)
  2025-04-03 16:17   ` Andrii Nakryiko
@ 2025-04-03 16:30   ` Kees Cook
  2025-04-04  6:48     ` Bhupesh Sharma
  3 siblings, 1 reply; 18+ messages in thread
From: Kees Cook @ 2025-04-03 16:30 UTC (permalink / raw)
  To: Bhupesh
  Cc: akpm, kernel-dev, linux-kernel, bpf, linux-perf-users,
	linux-fsdevel, linux-mm, oliver.sang, lkp, laoar.shao, pmladek,
	rostedt, mathieu.desnoyers, arnaldo.melo, alexei.starovoitov,
	andrii.nakryiko, mirq-linux, peterz, willy, david, viro, ebiederm,
	brauner, jack, mingo, juri.lelli, bsegall, mgorman, vschneid

On Mon, Mar 31, 2025 at 05:48:18PM +0530, Bhupesh wrote:
> Provide a parallel implementation for get_task_comm() called
> get_task_full_name() which allows the dynamically allocated
> and filled-in task's full name to be passed to interested
> users such as 'gdb'.
> 
> Currently while running 'gdb', the 'task->comm' value of a long
> task name is truncated due to the limitation of TASK_COMM_LEN.
> 
> For example using gdb to debug a simple app currently which generate
> threads with long task names:
>   # gdb ./threadnames -ex "run info thread" -ex "detach" -ex "quit" > log
>   # cat log
> 
>   NameThatIsTooLo
> 
> This patch does not touch 'TASK_COMM_LEN' at all, i.e.
> 'TASK_COMM_LEN' and the 16-byte design remains untouched. Which means
> that all the legacy / existing ABI, continue to work as before using
> '/proc/$pid/task/$tid/comm'.
> 
> This patch only adds a parallel, dynamically-allocated
> 'task->full_name' which can be used by interested users
> via '/proc/$pid/task/$tid/full_name'.
> 
> After this change, gdb is able to show full name of the task:
>   # gdb ./threadnames -ex "run info thread" -ex "detach" -ex "quit" > log
>   # cat log
> 
>   NameThatIsTooLongForComm[4662]
> 
> Signed-off-by: Bhupesh <bhupesh@igalia.com>
> ---
>  fs/exec.c             | 21 ++++++++++++++++++---
>  include/linux/sched.h |  9 +++++++++
>  2 files changed, 27 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/exec.c b/fs/exec.c
> index f45859ad13ac..4219d77a519c 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -1208,6 +1208,9 @@ int begin_new_exec(struct linux_binprm * bprm)
>  {
>  	struct task_struct *me = current;
>  	int retval;
> +	va_list args;
> +	char *name;
> +	const char *fmt;
>  
>  	/* Once we are committed compute the creds */
>  	retval = bprm_creds_from_file(bprm);
> @@ -1348,11 +1351,22 @@ int begin_new_exec(struct linux_binprm * bprm)
>  		 * detecting a concurrent rename and just want a terminated name.
>  		 */
>  		rcu_read_lock();
> -		__set_task_comm(me, smp_load_acquire(&bprm->file->f_path.dentry->d_name.name),
> -				true);
> +		fmt = smp_load_acquire(&bprm->file->f_path.dentry->d_name.name);
> +		name = kvasprintf(GFP_KERNEL, fmt, args);
> +		if (!name)
> +			return -ENOMEM;
> +
> +		me->full_name = name;
> +		__set_task_comm(me, fmt, true);

I don't want to add new allocations to the default exec path unless we
absolutely must.

In the original proposal this was about setting thread names (after
exec), and I think that'll be fine.

>  		rcu_read_unlock();
>  	} else {
> -		__set_task_comm(me, kbasename(bprm->filename), true);
> +		fmt = kbasename(bprm->filename);
> +		name = kvasprintf(GFP_KERNEL, fmt, args);
> +		if (!name)
> +			return -ENOMEM;
> +
> +		me->full_name = name;
> +		__set_task_comm(me, fmt, true);
>  	}

I think we can just set me->full_name = me->comm by default.

>  
>  	/* An exec changes our domain. We are no longer part of the thread
> @@ -1399,6 +1413,7 @@ int begin_new_exec(struct linux_binprm * bprm)
>  	return 0;
>  
>  out_unlock:
> +	kfree(me->full_name);
>  	up_write(&me->signal->exec_update_lock);
>  	if (!bprm->cred)
>  		mutex_unlock(&me->signal->cred_guard_mutex);
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 56ddeb37b5cd..053b52606652 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1166,6 +1166,9 @@ struct task_struct {
>  	 */
>  	char				comm[TASK_COMM_LEN];
>  
> +	/* To store the full name if task comm is truncated. */
> +	char				*full_name;
> +
>  	struct nameidata		*nameidata;
>  
>  #ifdef CONFIG_SYSVIPC
> @@ -2007,6 +2010,12 @@ extern void __set_task_comm(struct task_struct *tsk, const char *from, bool exec
>  	buf;						\
>  })
>  
> +#define get_task_full_name(buf, buf_size, tsk) ({	\
> +	BUILD_BUG_ON(sizeof(buf) < TASK_COMM_LEN);	\
> +	strscpy_pad(buf, (tsk)->full_name, buf_size);	\
> +	buf;						\
> +})

I think it should be possible to just switch get_task_comm() to use
(tsk)->full_name.

-- 
Kees Cook


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 1/3] exec: Dynamically allocate memory to store task's full name
  2025-04-01  4:02   ` Harry Yoo
@ 2025-04-04  5:31     ` Bhupesh Sharma
  0 siblings, 0 replies; 18+ messages in thread
From: Bhupesh Sharma @ 2025-04-04  5:31 UTC (permalink / raw)
  To: Harry Yoo, Bhupesh
  Cc: akpm, kernel-dev, linux-kernel, bpf, linux-perf-users,
	linux-fsdevel, linux-mm, oliver.sang, lkp, laoar.shao, pmladek,
	rostedt, mathieu.desnoyers, arnaldo.melo, alexei.starovoitov,
	andrii.nakryiko, mirq-linux, peterz, willy, david, viro, keescook,
	ebiederm, brauner, jack, mingo, juri.lelli, bsegall, mgorman,
	vschneid

On 4/1/25 9:32 AM, Harry Yoo wrote:
> On Mon, Mar 31, 2025 at 05:48:18PM +0530, Bhupesh wrote:
>> Provide a parallel implementation for get_task_comm() called
>> get_task_full_name() which allows the dynamically allocated
>> and filled-in task's full name to be passed to interested
>> users such as 'gdb'.
>>
>> Currently while running 'gdb', the 'task->comm' value of a long
>> task name is truncated due to the limitation of TASK_COMM_LEN.
>>
>> For example using gdb to debug a simple app currently which generate
>> threads with long task names:
>>    # gdb ./threadnames -ex "run info thread" -ex "detach" -ex "quit" > log
>>    # cat log
>>
>>    NameThatIsTooLo
>>
>> This patch does not touch 'TASK_COMM_LEN' at all, i.e.
>> 'TASK_COMM_LEN' and the 16-byte design remains untouched. Which means
>> that all the legacy / existing ABI, continue to work as before using
>> '/proc/$pid/task/$tid/comm'.
>>
>> This patch only adds a parallel, dynamically-allocated
>> 'task->full_name' which can be used by interested users
>> via '/proc/$pid/task/$tid/full_name'.
>>
>> After this change, gdb is able to show full name of the task:
>>    # gdb ./threadnames -ex "run info thread" -ex "detach" -ex "quit" > log
>>    # cat log
>>
>>    NameThatIsTooLongForComm[4662]
>>
>> Signed-off-by: Bhupesh <bhupesh@igalia.com>
>> ---
>>   fs/exec.c             | 21 ++++++++++++++++++---
>>   include/linux/sched.h |  9 +++++++++
>>   2 files changed, 27 insertions(+), 3 deletions(-)
>>
>> diff --git a/fs/exec.c b/fs/exec.c
>> index f45859ad13ac..4219d77a519c 100644
>> --- a/fs/exec.c
>> +++ b/fs/exec.c
>> @@ -1208,6 +1208,9 @@ int begin_new_exec(struct linux_binprm * bprm)
>>   {
>>   	struct task_struct *me = current;
>>   	int retval;
>> +	va_list args;
>> +	char *name;
>> +	const char *fmt;
>>   
>>   	/* Once we are committed compute the creds */
>>   	retval = bprm_creds_from_file(bprm);
>> @@ -1348,11 +1351,22 @@ int begin_new_exec(struct linux_binprm * bprm)
>>   		 * detecting a concurrent rename and just want a terminated name.
>>   		 */
>>   		rcu_read_lock();
>> -		__set_task_comm(me, smp_load_acquire(&bprm->file->f_path.dentry->d_name.name),
>> -				true);
>> +		fmt = smp_load_acquire(&bprm->file->f_path.dentry->d_name.name);
>> +		name = kvasprintf(GFP_KERNEL, fmt, args);
>> +		if (!name)
>> +			return -ENOMEM;
> Is it safe to return error here, instead of jumping to 'out_unlock' label
> and then releasing locks?

Ok, let me modify this in v3 to ensure that locks are released properly.

>> +		me->full_name = name;
>> +		__set_task_comm(me, fmt, true);
>>   		rcu_read_unlock();
>>   	} else {
>> -		__set_task_comm(me, kbasename(bprm->filename), true);
>> +		fmt = kbasename(bprm->filename);
>> +		name = kvasprintf(GFP_KERNEL, fmt, args);
>> +		if (!name)
>> +			return -ENOMEM;
>> +
>> +		me->full_name = name;
>> +		__set_task_comm(me, fmt, true);
>>   	}
>>   
>>   	/* An exec changes our domain. We are no longer part of the thread
>> @@ -1399,6 +1413,7 @@ int begin_new_exec(struct linux_binprm * bprm)
>>   	return 0;
>>   
>>   out_unlock:
>> +	kfree(me->full_name);
>>   	up_write(&me->signal->exec_update_lock);
>>   	if (!bprm->cred)
>>   		mutex_unlock(&me->signal->cred_guard_mutex);
>> diff --git a/include/linux/sched.h b/include/linux/sched.h
>> index 56ddeb37b5cd..053b52606652 100644
>> --- a/include/linux/sched.h
>> +++ b/include/linux/sched.h
>> @@ -1166,6 +1166,9 @@ struct task_struct {
>>   	 */
>>   	char				comm[TASK_COMM_LEN];
>>   
>> +	/* To store the full name if task comm is truncated. */
>> +	char				*full_name;
>> +
>>   	struct nameidata		*nameidata;
>>   
>>   #ifdef CONFIG_SYSVIPC
>> @@ -2007,6 +2010,12 @@ extern void __set_task_comm(struct task_struct *tsk, const char *from, bool exec
>>   	buf;						\
>>   })
>>   
>> +#define get_task_full_name(buf, buf_size, tsk) ({	\
>> +	BUILD_BUG_ON(sizeof(buf) < TASK_COMM_LEN);	\
>> +	strscpy_pad(buf, (tsk)->full_name, buf_size);	\
>> +	buf;						\
>> +})
>> +
>>   #ifdef CONFIG_SMP
>>   static __always_inline void scheduler_ipi(void)
>>   {
>> -- 
>> 2.38.1
>>
>>

Thanks.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 1/3] exec: Dynamically allocate memory to store task's full name
  2025-04-03 16:17   ` Andrii Nakryiko
@ 2025-04-04  5:36     ` Bhupesh Sharma
  0 siblings, 0 replies; 18+ messages in thread
From: Bhupesh Sharma @ 2025-04-04  5:36 UTC (permalink / raw)
  To: Andrii Nakryiko, Bhupesh
  Cc: akpm, kernel-dev, linux-kernel, bpf, linux-perf-users,
	linux-fsdevel, linux-mm, oliver.sang, lkp, laoar.shao, pmladek,
	rostedt, mathieu.desnoyers, arnaldo.melo, alexei.starovoitov,
	mirq-linux, peterz, willy, david, viro, keescook, ebiederm,
	brauner, jack, mingo, juri.lelli, bsegall, mgorman, vschneid



On 4/3/25 9:47 PM, Andrii Nakryiko wrote:
> On Mon, Mar 31, 2025 at 5:18 AM Bhupesh <bhupesh@igalia.com> wrote:
>> Provide a parallel implementation for get_task_comm() called
>> get_task_full_name() which allows the dynamically allocated
>> and filled-in task's full name to be passed to interested
>> users such as 'gdb'.
>>
>> Currently while running 'gdb', the 'task->comm' value of a long
>> task name is truncated due to the limitation of TASK_COMM_LEN.
>>
>> For example using gdb to debug a simple app currently which generate
>> threads with long task names:
>>    # gdb ./threadnames -ex "run info thread" -ex "detach" -ex "quit" > log
>>    # cat log
>>
>>    NameThatIsTooLo
>>
>> This patch does not touch 'TASK_COMM_LEN' at all, i.e.
>> 'TASK_COMM_LEN' and the 16-byte design remains untouched. Which means
>> that all the legacy / existing ABI, continue to work as before using
>> '/proc/$pid/task/$tid/comm'.
>>
>> This patch only adds a parallel, dynamically-allocated
>> 'task->full_name' which can be used by interested users
>> via '/proc/$pid/task/$tid/full_name'.
>>
>> After this change, gdb is able to show full name of the task:
>>    # gdb ./threadnames -ex "run info thread" -ex "detach" -ex "quit" > log
>>    # cat log
>>
>>    NameThatIsTooLongForComm[4662]
>>
>> Signed-off-by: Bhupesh <bhupesh@igalia.com>
>> ---
>>   fs/exec.c             | 21 ++++++++++++++++++---
>>   include/linux/sched.h |  9 +++++++++
>>   2 files changed, 27 insertions(+), 3 deletions(-)
>>
>> diff --git a/fs/exec.c b/fs/exec.c
>> index f45859ad13ac..4219d77a519c 100644
>> --- a/fs/exec.c
>> +++ b/fs/exec.c
>> @@ -1208,6 +1208,9 @@ int begin_new_exec(struct linux_binprm * bprm)
>>   {
>>          struct task_struct *me = current;
>>          int retval;
>> +       va_list args;
>> +       char *name;
>> +       const char *fmt;
>>
>>          /* Once we are committed compute the creds */
>>          retval = bprm_creds_from_file(bprm);
>> @@ -1348,11 +1351,22 @@ int begin_new_exec(struct linux_binprm * bprm)
>>                   * detecting a concurrent rename and just want a terminated name.
>>                   */
>>                  rcu_read_lock();
>> -               __set_task_comm(me, smp_load_acquire(&bprm->file->f_path.dentry->d_name.name),
>> -                               true);
>> +               fmt = smp_load_acquire(&bprm->file->f_path.dentry->d_name.name);
>> +               name = kvasprintf(GFP_KERNEL, fmt, args);
> this `args` argument, it's not initialized anywhere, right? It's not
> clear where it's coming from, but you are passing it directly into
> kvasprintf(), I can't convince myself that this is correct. Can you
> please explain what is happening here?
>
> Also, instead of allocating a buffer unconditionally, maybe check that
> comm is longer than 16, and if not, just use the old-schoold 16-byte
> comm array?

Ok. As Kees also mentioned in his comment, I will try to do away with
the allocation in the exec() hot-path in v3.

>> +               if (!name)
>> +                       return -ENOMEM;
>> +
>> +               me->full_name = name;
>> +               __set_task_comm(me, fmt, true);
>>                  rcu_read_unlock();
>>          } else {
>> -               __set_task_comm(me, kbasename(bprm->filename), true);
>> +               fmt = kbasename(bprm->filename);
>> +               name = kvasprintf(GFP_KERNEL, fmt, args);
>> +               if (!name)
>> +                       return -ENOMEM;
>> +
>> +               me->full_name = name;
>> +               __set_task_comm(me, fmt, true);
>>          }
>>
>>          /* An exec changes our domain. We are no longer part of the thread
>> @@ -1399,6 +1413,7 @@ int begin_new_exec(struct linux_binprm * bprm)
>>          return 0;
>>
>>   out_unlock:
>> +       kfree(me->full_name);
>>          up_write(&me->signal->exec_update_lock);
>>          if (!bprm->cred)
>>                  mutex_unlock(&me->signal->cred_guard_mutex);
>> diff --git a/include/linux/sched.h b/include/linux/sched.h
>> index 56ddeb37b5cd..053b52606652 100644
>> --- a/include/linux/sched.h
>> +++ b/include/linux/sched.h
>> @@ -1166,6 +1166,9 @@ struct task_struct {
>>           */
>>          char                            comm[TASK_COMM_LEN];
>>
>> +       /* To store the full name if task comm is truncated. */
>> +       char                            *full_name;
>> +
>>          struct nameidata                *nameidata;
>>
>>   #ifdef CONFIG_SYSVIPC
>> @@ -2007,6 +2010,12 @@ extern void __set_task_comm(struct task_struct *tsk, const char *from, bool exec
>>          buf;                                            \
>>   })
>>
>> +#define get_task_full_name(buf, buf_size, tsk) ({      \
>> +       BUILD_BUG_ON(sizeof(buf) < TASK_COMM_LEN);      \
>> +       strscpy_pad(buf, (tsk)->full_name, buf_size);   \
>> +       buf;                                            \
>> +})
>> +
>>   #ifdef CONFIG_SMP
>>   static __always_inline void scheduler_ipi(void)
>>   {
>> --
>> 2.38.1
>>

Thanks.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 1/3] exec: Dynamically allocate memory to store task's full name
  2025-04-01  2:07   ` Yafang Shao
@ 2025-04-04  6:35     ` Bhupesh Sharma
  2025-04-06  2:28       ` Yafang Shao
  0 siblings, 1 reply; 18+ messages in thread
From: Bhupesh Sharma @ 2025-04-04  6:35 UTC (permalink / raw)
  To: Yafang Shao, Bhupesh, Linus Torvalds
  Cc: akpm, kernel-dev, linux-kernel, bpf, linux-perf-users,
	linux-fsdevel, linux-mm, oliver.sang, lkp, pmladek, rostedt,
	mathieu.desnoyers, arnaldo.melo, alexei.starovoitov,
	andrii.nakryiko, mirq-linux, peterz, willy, david, viro, keescook,
	ebiederm, brauner, jack, mingo, juri.lelli, bsegall, mgorman,
	vschneid


On 4/1/25 7:37 AM, Yafang Shao wrote:
> On Mon, Mar 31, 2025 at 8:18 PM Bhupesh <bhupesh@igalia.com> wrote:
>> Provide a parallel implementation for get_task_comm() called
>> get_task_full_name() which allows the dynamically allocated
>> and filled-in task's full name to be passed to interested
>> users such as 'gdb'.
>>
>> Currently while running 'gdb', the 'task->comm' value of a long
>> task name is truncated due to the limitation of TASK_COMM_LEN.
>>
>> For example using gdb to debug a simple app currently which generate
>> threads with long task names:
>>    # gdb ./threadnames -ex "run info thread" -ex "detach" -ex "quit" > log
>>    # cat log
>>
>>    NameThatIsTooLo
>>
>> This patch does not touch 'TASK_COMM_LEN' at all, i.e.
>> 'TASK_COMM_LEN' and the 16-byte design remains untouched. Which means
>> that all the legacy / existing ABI, continue to work as before using
>> '/proc/$pid/task/$tid/comm'.
>>
>> This patch only adds a parallel, dynamically-allocated
>> 'task->full_name' which can be used by interested users
>> via '/proc/$pid/task/$tid/full_name'.
>>
>> After this change, gdb is able to show full name of the task:
>>    # gdb ./threadnames -ex "run info thread" -ex "detach" -ex "quit" > log
>>    # cat log
>>
>>    NameThatIsTooLongForComm[4662]
>>
>> Signed-off-by: Bhupesh <bhupesh@igalia.com>
>> ---
>>   fs/exec.c             | 21 ++++++++++++++++++---
>>   include/linux/sched.h |  9 +++++++++
>>   2 files changed, 27 insertions(+), 3 deletions(-)
>>
>> diff --git a/fs/exec.c b/fs/exec.c
>> index f45859ad13ac..4219d77a519c 100644
>> --- a/fs/exec.c
>> +++ b/fs/exec.c
>> @@ -1208,6 +1208,9 @@ int begin_new_exec(struct linux_binprm * bprm)
>>   {
>>          struct task_struct *me = current;
>>          int retval;
>> +       va_list args;
>> +       char *name;
>> +       const char *fmt;
>>
>>          /* Once we are committed compute the creds */
>>          retval = bprm_creds_from_file(bprm);
>> @@ -1348,11 +1351,22 @@ int begin_new_exec(struct linux_binprm * bprm)
>>                   * detecting a concurrent rename and just want a terminated name.
>>                   */
>>                  rcu_read_lock();
>> -               __set_task_comm(me, smp_load_acquire(&bprm->file->f_path.dentry->d_name.name),
>> -                               true);
>> +               fmt = smp_load_acquire(&bprm->file->f_path.dentry->d_name.name);
>> +               name = kvasprintf(GFP_KERNEL, fmt, args);
>> +               if (!name)
>> +                       return -ENOMEM;
>> +
>> +               me->full_name = name;
>> +               __set_task_comm(me, fmt, true);
>>                  rcu_read_unlock();
>>          } else {
>> -               __set_task_comm(me, kbasename(bprm->filename), true);
>> +               fmt = kbasename(bprm->filename);
>> +               name = kvasprintf(GFP_KERNEL, fmt, args);
>> +               if (!name)
>> +                       return -ENOMEM;
>> +
>> +               me->full_name = name;
>> +               __set_task_comm(me, fmt, true);
>>          }
>>
>>          /* An exec changes our domain. We are no longer part of the thread
>> @@ -1399,6 +1413,7 @@ int begin_new_exec(struct linux_binprm * bprm)
>>          return 0;
>>
>>   out_unlock:
>> +       kfree(me->full_name);
>>          up_write(&me->signal->exec_update_lock);
>>          if (!bprm->cred)
>>                  mutex_unlock(&me->signal->cred_guard_mutex);
>> diff --git a/include/linux/sched.h b/include/linux/sched.h
>> index 56ddeb37b5cd..053b52606652 100644
>> --- a/include/linux/sched.h
>> +++ b/include/linux/sched.h
>> @@ -1166,6 +1166,9 @@ struct task_struct {
>>           */
>>          char                            comm[TASK_COMM_LEN];
>>
>> +       /* To store the full name if task comm is truncated. */
>> +       char                            *full_name;
>> +
> Adding another field to store the task name isn’t ideal. What about
> combining them into a single field, as Linus suggested [0]?
>
> [0]. https://lore.kernel.org/all/CAHk-=wjAmmHUg6vho1KjzQi2=psR30+CogFd4aXrThr2gsiS4g@mail.gmail.com/
>

Thanks for sharing Linus's suggestion. I went through the suggested 
changes in the related threads and came up with the following set of points:

1. struct task_struct would contain both 'comm' and 'full_name',
2. Remove the task_lock() inside __get_task_comm(),
3. Users of task->comm will be affected in the following ways:
     (a). Printing with '%s' and tsk->comm would just continue to 
work,but will get a longer max string.
     (b). For users of memcpy.*->comm\>', we should change 'memcpy()' to 
'copy_comm()' which would look like:

         memcpy(dst, src, TASK_COMM_LEN);
         dst[TASK_COMM_LEN-1] = 0;

    (c). Users which use "sizeof(->comm)" will continue to get the old value because of the hacky union.

Am I missing something here. Please let me know your views.

Thanks,
Bhupesh



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/3] kthread: Use 'task_struct->full_name' to store kthread's full name
  2025-04-03 16:24   ` Kees Cook
@ 2025-04-04  6:38     ` Bhupesh Sharma
  0 siblings, 0 replies; 18+ messages in thread
From: Bhupesh Sharma @ 2025-04-04  6:38 UTC (permalink / raw)
  To: Kees Cook, Bhupesh
  Cc: akpm, kernel-dev, linux-kernel, bpf, linux-perf-users,
	linux-fsdevel, linux-mm, oliver.sang, lkp, laoar.shao, pmladek,
	rostedt, mathieu.desnoyers, arnaldo.melo, alexei.starovoitov,
	andrii.nakryiko, mirq-linux, peterz, willy, david, viro, ebiederm,
	brauner, jack, mingo, juri.lelli, bsegall, mgorman, vschneid

Hi Kees,

On 4/3/25 9:54 PM, Kees Cook wrote:
> On Mon, Mar 31, 2025 at 05:48:20PM +0530, Bhupesh wrote:
>> Commit 6986ce24fc00 ("kthread: dynamically allocate memory to store
>> kthread's full name"), added 'full_name' in parallel to 'comm' for
>> kthread names.
>>
>> Now that we have added 'full_name' added to 'task_struct' itself,
>> drop the additional 'full_name' entry from 'struct kthread' and also
>> its usage.
>>
>> Signed-off-by: Bhupesh <bhupesh@igalia.com>
> I'd like to see this patch be the first patch in the series. This show
> the existing use for "full_name". (And as such it'll probably need bits
> from patch 1.)

Sure, I will fix this in v3.

Thanks,
Bhupesh

>
>> ---
>>   kernel/kthread.c | 9 +++------
>>   1 file changed, 3 insertions(+), 6 deletions(-)
>>
>> diff --git a/kernel/kthread.c b/kernel/kthread.c
>> index 5dc5b0d7238e..46fe19b7ef76 100644
>> --- a/kernel/kthread.c
>> +++ b/kernel/kthread.c
>> @@ -66,8 +66,6 @@ struct kthread {
>>   #ifdef CONFIG_BLK_CGROUP
>>   	struct cgroup_subsys_state *blkcg_css;
>>   #endif
>> -	/* To store the full name if task comm is truncated. */
>> -	char *full_name;
>>   	struct task_struct *task;
>>   	struct list_head hotplug_node;
>>   	struct cpumask *preferred_affinity;
>> @@ -108,12 +106,12 @@ void get_kthread_comm(char *buf, size_t buf_size, struct task_struct *tsk)
>>   {
>>   	struct kthread *kthread = to_kthread(tsk);
>>   
>> -	if (!kthread || !kthread->full_name) {
>> +	if (!kthread || !tsk->full_name) {
>>   		strscpy(buf, tsk->comm, buf_size);
>>   		return;
>>   	}
>>   
>> -	strscpy_pad(buf, kthread->full_name, buf_size);
>> +	strscpy_pad(buf, tsk->full_name, buf_size);
>>   }
>>   
>>   bool set_kthread_struct(struct task_struct *p)
>> @@ -153,7 +151,6 @@ void free_kthread_struct(struct task_struct *k)
>>   	WARN_ON_ONCE(kthread->blkcg_css);
>>   #endif
>>   	k->worker_private = NULL;
>> -	kfree(kthread->full_name);
>>   	kfree(kthread);
>>   }
>>   
>> @@ -430,7 +427,7 @@ static int kthread(void *_create)
>>   		kthread_exit(-EINTR);
>>   	}
>>   
>> -	self->full_name = create->full_name;
>> +	self->task->full_name = create->full_name;
>>   	self->threadfn = threadfn;
>>   	self->data = data;
>>   
>> -- 
>> 2.38.1
>>



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 1/3] exec: Dynamically allocate memory to store task's full name
  2025-04-03 16:30   ` Kees Cook
@ 2025-04-04  6:48     ` Bhupesh Sharma
  2025-04-04 17:24       ` Kees Cook
  0 siblings, 1 reply; 18+ messages in thread
From: Bhupesh Sharma @ 2025-04-04  6:48 UTC (permalink / raw)
  To: Kees Cook, Bhupesh
  Cc: akpm, kernel-dev, linux-kernel, bpf, linux-perf-users,
	linux-fsdevel, linux-mm, oliver.sang, lkp, laoar.shao, pmladek,
	rostedt, mathieu.desnoyers, arnaldo.melo, alexei.starovoitov,
	andrii.nakryiko, mirq-linux, peterz, willy, david, viro, ebiederm,
	brauner, jack, mingo, juri.lelli, bsegall, mgorman, vschneid

Hi Kees,

On 4/3/25 10:00 PM, Kees Cook wrote:
> On Mon, Mar 31, 2025 at 05:48:18PM +0530, Bhupesh wrote:
>> Provide a parallel implementation for get_task_comm() called
>> get_task_full_name() which allows the dynamically allocated
>> and filled-in task's full name to be passed to interested
>> users such as 'gdb'.
>>
>> Currently while running 'gdb', the 'task->comm' value of a long
>> task name is truncated due to the limitation of TASK_COMM_LEN.
>>
>> For example using gdb to debug a simple app currently which generate
>> threads with long task names:
>>    # gdb ./threadnames -ex "run info thread" -ex "detach" -ex "quit" > log
>>    # cat log
>>
>>    NameThatIsTooLo
>>
>> This patch does not touch 'TASK_COMM_LEN' at all, i.e.
>> 'TASK_COMM_LEN' and the 16-byte design remains untouched. Which means
>> that all the legacy / existing ABI, continue to work as before using
>> '/proc/$pid/task/$tid/comm'.
>>
>> This patch only adds a parallel, dynamically-allocated
>> 'task->full_name' which can be used by interested users
>> via '/proc/$pid/task/$tid/full_name'.
>>
>> After this change, gdb is able to show full name of the task:
>>    # gdb ./threadnames -ex "run info thread" -ex "detach" -ex "quit" > log
>>    # cat log
>>
>>    NameThatIsTooLongForComm[4662]
>>
>> Signed-off-by: Bhupesh <bhupesh@igalia.com>
>> ---
>>   fs/exec.c             | 21 ++++++++++++++++++---
>>   include/linux/sched.h |  9 +++++++++
>>   2 files changed, 27 insertions(+), 3 deletions(-)
>>
>> diff --git a/fs/exec.c b/fs/exec.c
>> index f45859ad13ac..4219d77a519c 100644
>> --- a/fs/exec.c
>> +++ b/fs/exec.c
>> @@ -1208,6 +1208,9 @@ int begin_new_exec(struct linux_binprm * bprm)
>>   {
>>   	struct task_struct *me = current;
>>   	int retval;
>> +	va_list args;
>> +	char *name;
>> +	const char *fmt;
>>   
>>   	/* Once we are committed compute the creds */
>>   	retval = bprm_creds_from_file(bprm);
>> @@ -1348,11 +1351,22 @@ int begin_new_exec(struct linux_binprm * bprm)
>>   		 * detecting a concurrent rename and just want a terminated name.
>>   		 */
>>   		rcu_read_lock();
>> -		__set_task_comm(me, smp_load_acquire(&bprm->file->f_path.dentry->d_name.name),
>> -				true);
>> +		fmt = smp_load_acquire(&bprm->file->f_path.dentry->d_name.name);
>> +		name = kvasprintf(GFP_KERNEL, fmt, args);
>> +		if (!name)
>> +			return -ENOMEM;
>> +
>> +		me->full_name = name;
>> +		__set_task_comm(me, fmt, true);
> I don't want to add new allocations to the default exec path unless we
> absolutely must.
>
> In the original proposal this was about setting thread names (after
> exec), and I think that'll be fine.
>

Ok.

>>   		rcu_read_unlock();
>>   	} else {
>> -		__set_task_comm(me, kbasename(bprm->filename), true);
>> +		fmt = kbasename(bprm->filename);
>> +		name = kvasprintf(GFP_KERNEL, fmt, args);
>> +		if (!name)
>> +			return -ENOMEM;
>> +
>> +		me->full_name = name;
>> +		__set_task_comm(me, fmt, true);
>>   	}
> I think we can just set me->full_name = me->comm by default.

Sure.

>>   
>>   	/* An exec changes our domain. We are no longer part of the thread
>> @@ -1399,6 +1413,7 @@ int begin_new_exec(struct linux_binprm * bprm)
>>   	return 0;
>>   
>>   out_unlock:
>> +	kfree(me->full_name);
>>   	up_write(&me->signal->exec_update_lock);
>>   	if (!bprm->cred)
>>   		mutex_unlock(&me->signal->cred_guard_mutex);
>> diff --git a/include/linux/sched.h b/include/linux/sched.h
>> index 56ddeb37b5cd..053b52606652 100644
>> --- a/include/linux/sched.h
>> +++ b/include/linux/sched.h
>> @@ -1166,6 +1166,9 @@ struct task_struct {
>>   	 */
>>   	char				comm[TASK_COMM_LEN];
>>   
>> +	/* To store the full name if task comm is truncated. */
>> +	char				*full_name;
>> +
>>   	struct nameidata		*nameidata;
>>   
>>   #ifdef CONFIG_SYSVIPC
>> @@ -2007,6 +2010,12 @@ extern void __set_task_comm(struct task_struct *tsk, const char *from, bool exec
>>   	buf;						\
>>   })
>>   
>> +#define get_task_full_name(buf, buf_size, tsk) ({	\
>> +	BUILD_BUG_ON(sizeof(buf) < TASK_COMM_LEN);	\
>> +	strscpy_pad(buf, (tsk)->full_name, buf_size);	\
>> +	buf;						\
>> +})
> I think it should be possible to just switch get_task_comm() to use
> (tsk)->full_name.
>

In another review for this series, Yafang mentioned the following 
cleanup + approach suggested by Linus (see [0]).
Also I have summarized my understanding on the basis of the suggestions 
Linus shared and the accompanying background threads (please see [1]).

Kindly share your views on the same, so that I can change the 
implementation in v3 series accordingly.

[0].https://lore.kernel.org/all/CAHk-=wjAmmHUg6vho1KjzQi2=psR30+CogFd4aXrThr2gsiS4g@mail.gmail.com/
[1]. https://lore.kernel.org/all/20250331121820.455916-1-bhupesh@igalia.com/T/#m7c163829440f2c98e64b475b7cdbc35ba4d613ca

Thanks,
Bhupesh



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 1/3] exec: Dynamically allocate memory to store task's full name
  2025-04-04  6:48     ` Bhupesh Sharma
@ 2025-04-04 17:24       ` Kees Cook
  2025-04-09 11:31         ` Bhupesh Sharma
  0 siblings, 1 reply; 18+ messages in thread
From: Kees Cook @ 2025-04-04 17:24 UTC (permalink / raw)
  To: Bhupesh Sharma
  Cc: Bhupesh, akpm, kernel-dev, linux-kernel, bpf, linux-perf-users,
	linux-fsdevel, linux-mm, oliver.sang, lkp, laoar.shao, pmladek,
	rostedt, mathieu.desnoyers, arnaldo.melo, alexei.starovoitov,
	andrii.nakryiko, mirq-linux, peterz, willy, david, viro, ebiederm,
	brauner, jack, mingo, juri.lelli, bsegall, mgorman, vschneid

On Fri, Apr 04, 2025 at 12:18:56PM +0530, Bhupesh Sharma wrote:
> In another review for this series, Yafang mentioned the following cleanup +
> approach suggested by Linus (see [0]).
> Also I have summarized my understanding on the basis of the suggestions
> Linus shared and the accompanying background threads (please see [1]).
> 
> Kindly share your views on the same, so that I can change the implementation
> in v3 series accordingly.

In thinking about this a little more I think we can't universally change
all the APIs to use the new full_name since it is a pointer, which may
be getting changed out from under readers if a setter changes it. So
this may need some careful redesign, likely with RCU. hmm.

-- 
Kees Cook


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 1/3] exec: Dynamically allocate memory to store task's full name
  2025-04-04  6:35     ` Bhupesh Sharma
@ 2025-04-06  2:28       ` Yafang Shao
  2025-04-09 11:13         ` Bhupesh Sharma
  0 siblings, 1 reply; 18+ messages in thread
From: Yafang Shao @ 2025-04-06  2:28 UTC (permalink / raw)
  To: Bhupesh Sharma
  Cc: Bhupesh, Linus Torvalds, akpm, kernel-dev, linux-kernel, bpf,
	linux-perf-users, linux-fsdevel, linux-mm, oliver.sang, lkp,
	pmladek, rostedt, mathieu.desnoyers, arnaldo.melo,
	alexei.starovoitov, andrii.nakryiko, mirq-linux, peterz, willy,
	david, viro, keescook, ebiederm, brauner, jack, mingo, juri.lelli,
	bsegall, mgorman, vschneid

On Fri, Apr 4, 2025 at 2:35 PM Bhupesh Sharma <bhsharma@igalia.com> wrote:
>
>
> On 4/1/25 7:37 AM, Yafang Shao wrote:
> > On Mon, Mar 31, 2025 at 8:18 PM Bhupesh <bhupesh@igalia.com> wrote:
> >> Provide a parallel implementation for get_task_comm() called
> >> get_task_full_name() which allows the dynamically allocated
> >> and filled-in task's full name to be passed to interested
> >> users such as 'gdb'.
> >>
> >> Currently while running 'gdb', the 'task->comm' value of a long
> >> task name is truncated due to the limitation of TASK_COMM_LEN.
> >>
> >> For example using gdb to debug a simple app currently which generate
> >> threads with long task names:
> >>    # gdb ./threadnames -ex "run info thread" -ex "detach" -ex "quit" > log
> >>    # cat log
> >>
> >>    NameThatIsTooLo
> >>
> >> This patch does not touch 'TASK_COMM_LEN' at all, i.e.
> >> 'TASK_COMM_LEN' and the 16-byte design remains untouched. Which means
> >> that all the legacy / existing ABI, continue to work as before using
> >> '/proc/$pid/task/$tid/comm'.
> >>
> >> This patch only adds a parallel, dynamically-allocated
> >> 'task->full_name' which can be used by interested users
> >> via '/proc/$pid/task/$tid/full_name'.
> >>
> >> After this change, gdb is able to show full name of the task:
> >>    # gdb ./threadnames -ex "run info thread" -ex "detach" -ex "quit" > log
> >>    # cat log
> >>
> >>    NameThatIsTooLongForComm[4662]
> >>
> >> Signed-off-by: Bhupesh <bhupesh@igalia.com>
> >> ---
> >>   fs/exec.c             | 21 ++++++++++++++++++---
> >>   include/linux/sched.h |  9 +++++++++
> >>   2 files changed, 27 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/fs/exec.c b/fs/exec.c
> >> index f45859ad13ac..4219d77a519c 100644
> >> --- a/fs/exec.c
> >> +++ b/fs/exec.c
> >> @@ -1208,6 +1208,9 @@ int begin_new_exec(struct linux_binprm * bprm)
> >>   {
> >>          struct task_struct *me = current;
> >>          int retval;
> >> +       va_list args;
> >> +       char *name;
> >> +       const char *fmt;
> >>
> >>          /* Once we are committed compute the creds */
> >>          retval = bprm_creds_from_file(bprm);
> >> @@ -1348,11 +1351,22 @@ int begin_new_exec(struct linux_binprm * bprm)
> >>                   * detecting a concurrent rename and just want a terminated name.
> >>                   */
> >>                  rcu_read_lock();
> >> -               __set_task_comm(me, smp_load_acquire(&bprm->file->f_path.dentry->d_name.name),
> >> -                               true);
> >> +               fmt = smp_load_acquire(&bprm->file->f_path.dentry->d_name.name);
> >> +               name = kvasprintf(GFP_KERNEL, fmt, args);
> >> +               if (!name)
> >> +                       return -ENOMEM;
> >> +
> >> +               me->full_name = name;
> >> +               __set_task_comm(me, fmt, true);
> >>                  rcu_read_unlock();
> >>          } else {
> >> -               __set_task_comm(me, kbasename(bprm->filename), true);
> >> +               fmt = kbasename(bprm->filename);
> >> +               name = kvasprintf(GFP_KERNEL, fmt, args);
> >> +               if (!name)
> >> +                       return -ENOMEM;
> >> +
> >> +               me->full_name = name;
> >> +               __set_task_comm(me, fmt, true);
> >>          }
> >>
> >>          /* An exec changes our domain. We are no longer part of the thread
> >> @@ -1399,6 +1413,7 @@ int begin_new_exec(struct linux_binprm * bprm)
> >>          return 0;
> >>
> >>   out_unlock:
> >> +       kfree(me->full_name);
> >>          up_write(&me->signal->exec_update_lock);
> >>          if (!bprm->cred)
> >>                  mutex_unlock(&me->signal->cred_guard_mutex);
> >> diff --git a/include/linux/sched.h b/include/linux/sched.h
> >> index 56ddeb37b5cd..053b52606652 100644
> >> --- a/include/linux/sched.h
> >> +++ b/include/linux/sched.h
> >> @@ -1166,6 +1166,9 @@ struct task_struct {
> >>           */
> >>          char                            comm[TASK_COMM_LEN];
> >>
> >> +       /* To store the full name if task comm is truncated. */
> >> +       char                            *full_name;
> >> +
> > Adding another field to store the task name isn’t ideal. What about
> > combining them into a single field, as Linus suggested [0]?
> >
> > [0]. https://lore.kernel.org/all/CAHk-=wjAmmHUg6vho1KjzQi2=psR30+CogFd4aXrThr2gsiS4g@mail.gmail.com/
> >
>
> Thanks for sharing Linus's suggestion. I went through the suggested
> changes in the related threads and came up with the following set of points:
>
> 1. struct task_struct would contain both 'comm' and 'full_name',

Correct.

> 2. Remove the task_lock() inside __get_task_comm(),

This has been implemented in the patch series titled "Improve the copy
of task comm". For details, please refer to:
https://lore.kernel.org/linux-mm/20240828030321.20688-1-laoar.shao@gmail.com/.

> 3. Users of task->comm will be affected in the following ways:

Correct.

>      (a). Printing with '%s' and tsk->comm would just continue to
> work,but will get a longer max string.
>      (b). For users of memcpy.*->comm\>', we should change 'memcpy()' to
> 'copy_comm()' which would look like:
>
>          memcpy(dst, src, TASK_COMM_LEN);
>          dst[TASK_COMM_LEN-1] = 0;
>
>     (c). Users which use "sizeof(->comm)" will continue to get the old value because of the hacky union.

Using a separate pointer rather than a union could simplify the
implementation. I’m open to introducing a new pointer if you believe
it’s the better approach.

>
> Am I missing something here. Please let me know your views.


-- 
Regards
Yafang


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 1/3] exec: Dynamically allocate memory to store task's full name
  2025-04-06  2:28       ` Yafang Shao
@ 2025-04-09 11:13         ` Bhupesh Sharma
  0 siblings, 0 replies; 18+ messages in thread
From: Bhupesh Sharma @ 2025-04-09 11:13 UTC (permalink / raw)
  To: Yafang Shao
  Cc: Bhupesh, Linus Torvalds, akpm, kernel-dev, linux-kernel, bpf,
	linux-perf-users, linux-fsdevel, linux-mm, oliver.sang, lkp,
	pmladek, rostedt, mathieu.desnoyers, arnaldo.melo,
	alexei.starovoitov, andrii.nakryiko, mirq-linux, peterz, willy,
	david, viro, keescook, ebiederm, brauner, jack, mingo, juri.lelli,
	bsegall, mgorman, vschneid

Sorry for the delay in reply, I was out for a couple of days.

On 4/6/25 7:58 AM, Yafang Shao wrote:
> On Fri, Apr 4, 2025 at 2:35 PM Bhupesh Sharma <bhsharma@igalia.com> wrote:
>>
>> On 4/1/25 7:37 AM, Yafang Shao wrote:
>>> On Mon, Mar 31, 2025 at 8:18 PM Bhupesh <bhupesh@igalia.com> wrote:
>>>> Provide a parallel implementation for get_task_comm() called
>>>> get_task_full_name() which allows the dynamically allocated
>>>> and filled-in task's full name to be passed to interested
>>>> users such as 'gdb'.
>>>>
>>>> Currently while running 'gdb', the 'task->comm' value of a long
>>>> task name is truncated due to the limitation of TASK_COMM_LEN.
>>>>
>>>> For example using gdb to debug a simple app currently which generate
>>>> threads with long task names:
>>>>     # gdb ./threadnames -ex "run info thread" -ex "detach" -ex "quit" > log
>>>>     # cat log
>>>>
>>>>     NameThatIsTooLo
>>>>
>>>> This patch does not touch 'TASK_COMM_LEN' at all, i.e.
>>>> 'TASK_COMM_LEN' and the 16-byte design remains untouched. Which means
>>>> that all the legacy / existing ABI, continue to work as before using
>>>> '/proc/$pid/task/$tid/comm'.
>>>>
>>>> This patch only adds a parallel, dynamically-allocated
>>>> 'task->full_name' which can be used by interested users
>>>> via '/proc/$pid/task/$tid/full_name'.
>>>>
>>>> After this change, gdb is able to show full name of the task:
>>>>     # gdb ./threadnames -ex "run info thread" -ex "detach" -ex "quit" > log
>>>>     # cat log
>>>>
>>>>     NameThatIsTooLongForComm[4662]
>>>>
>>>> Signed-off-by: Bhupesh <bhupesh@igalia.com>
>>>> ---
>>>>    fs/exec.c             | 21 ++++++++++++++++++---
>>>>    include/linux/sched.h |  9 +++++++++
>>>>    2 files changed, 27 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/fs/exec.c b/fs/exec.c
>>>> index f45859ad13ac..4219d77a519c 100644
>>>> --- a/fs/exec.c
>>>> +++ b/fs/exec.c
>>>> @@ -1208,6 +1208,9 @@ int begin_new_exec(struct linux_binprm * bprm)
>>>>    {
>>>>           struct task_struct *me = current;
>>>>           int retval;
>>>> +       va_list args;
>>>> +       char *name;
>>>> +       const char *fmt;
>>>>
>>>>           /* Once we are committed compute the creds */
>>>>           retval = bprm_creds_from_file(bprm);
>>>> @@ -1348,11 +1351,22 @@ int begin_new_exec(struct linux_binprm * bprm)
>>>>                    * detecting a concurrent rename and just want a terminated name.
>>>>                    */
>>>>                   rcu_read_lock();
>>>> -               __set_task_comm(me, smp_load_acquire(&bprm->file->f_path.dentry->d_name.name),
>>>> -                               true);
>>>> +               fmt = smp_load_acquire(&bprm->file->f_path.dentry->d_name.name);
>>>> +               name = kvasprintf(GFP_KERNEL, fmt, args);
>>>> +               if (!name)
>>>> +                       return -ENOMEM;
>>>> +
>>>> +               me->full_name = name;
>>>> +               __set_task_comm(me, fmt, true);
>>>>                   rcu_read_unlock();
>>>>           } else {
>>>> -               __set_task_comm(me, kbasename(bprm->filename), true);
>>>> +               fmt = kbasename(bprm->filename);
>>>> +               name = kvasprintf(GFP_KERNEL, fmt, args);
>>>> +               if (!name)
>>>> +                       return -ENOMEM;
>>>> +
>>>> +               me->full_name = name;
>>>> +               __set_task_comm(me, fmt, true);
>>>>           }
>>>>
>>>>           /* An exec changes our domain. We are no longer part of the thread
>>>> @@ -1399,6 +1413,7 @@ int begin_new_exec(struct linux_binprm * bprm)
>>>>           return 0;
>>>>
>>>>    out_unlock:
>>>> +       kfree(me->full_name);
>>>>           up_write(&me->signal->exec_update_lock);
>>>>           if (!bprm->cred)
>>>>                   mutex_unlock(&me->signal->cred_guard_mutex);
>>>> diff --git a/include/linux/sched.h b/include/linux/sched.h
>>>> index 56ddeb37b5cd..053b52606652 100644
>>>> --- a/include/linux/sched.h
>>>> +++ b/include/linux/sched.h
>>>> @@ -1166,6 +1166,9 @@ struct task_struct {
>>>>            */
>>>>           char                            comm[TASK_COMM_LEN];
>>>>
>>>> +       /* To store the full name if task comm is truncated. */
>>>> +       char                            *full_name;
>>>> +
>>> Adding another field to store the task name isn’t ideal. What about
>>> combining them into a single field, as Linus suggested [0]?
>>>
>>> [0]. https://lore.kernel.org/all/CAHk-=wjAmmHUg6vho1KjzQi2=psR30+CogFd4aXrThr2gsiS4g@mail.gmail.com/
>>>
>> Thanks for sharing Linus's suggestion. I went through the suggested
>> changes in the related threads and came up with the following set of points:
>>
>> 1. struct task_struct would contain both 'comm' and 'full_name',
> Correct.
>
>> 2. Remove the task_lock() inside __get_task_comm(),
> This has been implemented in the patch series titled "Improve the copy
> of task comm". For details, please refer to:
> https://lore.kernel.org/linux-mm/20240828030321.20688-1-laoar.shao@gmail.com/.
>
>> 3. Users of task->comm will be affected in the following ways:
> Correct.
>
>>       (a). Printing with '%s' and tsk->comm would just continue to
>> work,but will get a longer max string.
>>       (b). For users of memcpy.*->comm\>', we should change 'memcpy()' to
>> 'copy_comm()' which would look like:
>>
>>           memcpy(dst, src, TASK_COMM_LEN);
>>           dst[TASK_COMM_LEN-1] = 0;
>>
>>      (c). Users which use "sizeof(->comm)" will continue to get the old value because of the hacky union.
> Using a separate pointer rather than a union could simplify the
> implementation. I’m open to introducing a new pointer if you believe
> it’s the better approach.

Right, that's what I was thinking of earlier as well, i.e. having a new 
pointer like tsk->full_name, however
allocating it outside the exec() hot-path may be tricky.

Let me try that though and come up with a v3, that addresses (a), (b) as 
mentioned above and (c) with a pointer instead of union.

Thanks,
Bhupesh



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 1/3] exec: Dynamically allocate memory to store task's full name
  2025-04-04 17:24       ` Kees Cook
@ 2025-04-09 11:31         ` Bhupesh Sharma
  0 siblings, 0 replies; 18+ messages in thread
From: Bhupesh Sharma @ 2025-04-09 11:31 UTC (permalink / raw)
  To: Kees Cook
  Cc: Bhupesh, akpm, kernel-dev, linux-kernel, bpf, linux-perf-users,
	linux-fsdevel, linux-mm, oliver.sang, lkp, laoar.shao, pmladek,
	rostedt, mathieu.desnoyers, arnaldo.melo, alexei.starovoitov,
	andrii.nakryiko, mirq-linux, peterz, willy, david, viro, ebiederm,
	brauner, jack, mingo, juri.lelli, bsegall, mgorman, vschneid

Hi Kees,

Sorry for the delay - I was out for a couple of days.

On 4/4/25 10:54 PM, Kees Cook wrote:
> On Fri, Apr 04, 2025 at 12:18:56PM +0530, Bhupesh Sharma wrote:
>> In another review for this series, Yafang mentioned the following cleanup +
>> approach suggested by Linus (see [0]).
>> Also I have summarized my understanding on the basis of the suggestions
>> Linus shared and the accompanying background threads (please see [1]).
>>
>> Kindly share your views on the same, so that I can change the implementation
>> in v3 series accordingly.
> In thinking about this a little more I think we can't universally change
> all the APIs to use the new full_name since it is a pointer, which may
> be getting changed out from under readers if a setter changes it. So
> this may need some careful redesign, likely with RCU. hmm.
>

Thinking more about this, Linus mentioned in [0]:

'Since user space can randomly change their names anyway, using locking
was always wrong for readers (for writers it probably does make sense
to have some lock'

So, if we go with the union approach, probably we can do with just a writer-lock, whereas if we go with a task->full_name like pointer one, we would probably need a rcu lock.

Please let me know your comments.

[0]. https://lore.kernel.org/all/CAHk-=wivfrF0_zvf+oj6==Sh=-npJooP8chLPEfaFV0oNYTTBA@mail.gmail.com/



^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2025-04-09 11:31 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-03-31 12:18 [PATCH v2 0/3] Dynamically allocate memory to store task's full name Bhupesh
2025-03-31 12:18 ` [PATCH v2 1/3] exec: " Bhupesh
2025-04-01  2:07   ` Yafang Shao
2025-04-04  6:35     ` Bhupesh Sharma
2025-04-06  2:28       ` Yafang Shao
2025-04-09 11:13         ` Bhupesh Sharma
2025-04-01  4:02   ` Harry Yoo
2025-04-04  5:31     ` Bhupesh Sharma
2025-04-03 16:17   ` Andrii Nakryiko
2025-04-04  5:36     ` Bhupesh Sharma
2025-04-03 16:30   ` Kees Cook
2025-04-04  6:48     ` Bhupesh Sharma
2025-04-04 17:24       ` Kees Cook
2025-04-09 11:31         ` Bhupesh Sharma
2025-03-31 12:18 ` [PATCH v2 2/3] fs/proc: Pass 'task->full_name' via 'proc_task_name()' Bhupesh
2025-03-31 12:18 ` [PATCH v2 3/3] kthread: Use 'task_struct->full_name' to store kthread's full name Bhupesh
2025-04-03 16:24   ` Kees Cook
2025-04-04  6:38     ` Bhupesh Sharma

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).