From: David Laight <david.laight.linux@gmail.com>
To: Bhupesh Sharma <bhsharma@igalia.com>
Cc: Kees Cook <kees@kernel.org>, Bhupesh <bhupesh@igalia.com>,
akpm@linux-foundation.org, kernel-dev@igalia.com,
linux-kernel@vger.kernel.org, bpf@vger.kernel.org,
linux-perf-users@vger.kernel.org, linux-fsdevel@vger.kernel.org,
linux-mm@kvack.org, oliver.sang@intel.com, lkp@intel.com,
laoar.shao@gmail.com, pmladek@suse.com, rostedt@goodmis.org,
mathieu.desnoyers@efficios.com, arnaldo.melo@gmail.com,
alexei.starovoitov@gmail.com, andrii.nakryiko@gmail.com,
mirq-linux@rere.qmqm.pl, peterz@infradead.org,
willy@infradead.org, david@redhat.com, viro@zeniv.linux.org.uk,
ebiederm@xmission.com, brauner@kernel.org, jack@suse.cz,
mingo@redhat.com, juri.lelli@redhat.com, bsegall@google.com,
mgorman@suse.de, vschneid@redhat.com,
linux-trace-kernel@vger.kernel.org,
torvalds@linux-foundation.org
Subject: Re: [PATCH v8 4/5] treewide: Switch memcpy() users of 'task->comm' to a more safer implementation
Date: Sat, 6 Sep 2025 12:13:46 +0100 [thread overview]
Message-ID: <20250906121346.3fa6ea16@pumpkin> (raw)
In-Reply-To: <d48f66cf-9843-1575-bcf0-5117a5527004@igalia.com>
On Mon, 1 Sep 2025 10:58:17 +0530
Bhupesh Sharma <bhsharma@igalia.com> wrote:
> Hi Kees,
>
> On 8/25/25 7:31 PM, Kees Cook wrote:
> > On Thu, Aug 21, 2025 at 03:51:51PM +0530, Bhupesh wrote:
> >> As Linus mentioned in [1], currently we have several memcpy() use-cases
> >> which use 'current->comm' to copy the task name over to local copies.
> >> For an example:
> >>
> >> ...
> >> char comm[TASK_COMM_LEN];
> >> memcpy(comm, current->comm, TASK_COMM_LEN);
> >> ...
> >>
> >> These should be rather calling a wrappper like "get_task_array()",
> >> which is implemented as:
> >>
> >> static __always_inline void
> >> __cstr_array_copy(char *dst,
> >> const char *src, __kernel_size_t size)
> >> {
> >> memcpy(dst, src, size);
> >> dst[size] = 0;
> >> }
> >>
> >> #define get_task_array(dst,src) \
> >> __cstr_array_copy(dst, src, __must_be_array(dst))
> >>
> >> The relevant 'memcpy()' users were identified using the following search
> >> pattern:
> >> $ git grep 'memcpy.*->comm\>'
> >>
> >> Link:https://lore.kernel.org/all/CAHk-=wi5c=_-FBGo_88CowJd_F-Gi6Ud9d=TALm65ReN7YjrMw@mail.gmail.com/ #1
> >>
> >> Signed-off-by: Bhupesh<bhupesh@igalia.com>
> >> ---
> >> include/linux/coredump.h | 2 +-
> >> include/linux/sched.h | 32 +++++++++++++++++++
> >> include/linux/tracepoint.h | 4 +--
> >> include/trace/events/block.h | 10 +++---
> >> include/trace/events/oom.h | 2 +-
> >> include/trace/events/osnoise.h | 2 +-
> >> include/trace/events/sched.h | 13 ++++----
> >> include/trace/events/signal.h | 2 +-
> >> include/trace/events/task.h | 4 +--
> >> tools/bpf/bpftool/pids.c | 6 ++--
> >> .../bpf/test_kmods/bpf_testmod-events.h | 2 +-
> >> 11 files changed, 54 insertions(+), 25 deletions(-)
> >>
> >> diff --git a/include/linux/coredump.h b/include/linux/coredump.h
> >> index 68861da4cf7c..bcee0afc5eaf 100644
> >> --- a/include/linux/coredump.h
> >> +++ b/include/linux/coredump.h
> >> @@ -54,7 +54,7 @@ extern void vfs_coredump(const kernel_siginfo_t *siginfo);
> >> do { \
> >> char comm[TASK_COMM_LEN]; \
> >> /* This will always be NUL terminated. */ \
> >> - memcpy(comm, current->comm, sizeof(comm)); \
> >> + get_task_array(comm, current->comm); \
> >> printk_ratelimited(Level "coredump: %d(%*pE): " Format "\n", \
> >> task_tgid_vnr(current), (int)strlen(comm), comm, ##__VA_ARGS__); \
> >> } while (0) \
> >> diff --git a/include/linux/sched.h b/include/linux/sched.h
> >> index 5a58c1270474..d26d1dfb9904 100644
> >> --- a/include/linux/sched.h
> >> +++ b/include/linux/sched.h
> >> @@ -1960,12 +1960,44 @@ extern void wake_up_new_task(struct task_struct *tsk);
> >>
> >> extern void kick_process(struct task_struct *tsk);
> >>
> >> +/*
> >> + * - Why not use task_lock()?
> >> + * User space can randomly change their names anyway, so locking for readers
> >> + * doesn't make sense. For writers, locking is probably necessary, as a race
> >> + * condition could lead to long-term mixed results.
> >> + * The logic inside __set_task_comm() should ensure that the task comm is
> >> + * always NUL-terminated and zero-padded. Therefore the race condition between
> >> + * reader and writer is not an issue.
> >> + */
> >> +
> >> extern void __set_task_comm(struct task_struct *tsk, const char *from, bool exec);
> >> #define set_task_comm(tsk, from) ({ \
> >> BUILD_BUG_ON(sizeof(from) < TASK_COMM_LEN); \
> >> __set_task_comm(tsk, from, false); \
> >> })
> >>
> >> +/*
> >> + * 'get_task_array' can be 'data-racy' in the destination and
> >> + * should not be used for cases where a 'stable NUL at the end'
> >> + * is needed. Its better to use strscpy and friends for such
> >> + * use-cases.
> >> + *
> >> + * It is suited mainly for a 'just copy comm to a constant-sized
> >> + * array' case - especially in performance sensitive use-cases,
> >> + * like tracing.
> >> + */
> >> +
> >> +static __always_inline void
> >> + __cstr_array_copy(char *dst, const char *src,
> >> + __kernel_size_t size)
> >> +{
> >> + memcpy(dst, src, size);
> >> + dst[size] = 0;
> >> +}
> > Please don't reinvent the wheel. :) We already have memtostr, please use
> > that (or memtostr_pad).
>
> Sure, but wouldn't we get a static assertion failure: "must be array"
> for memtostr() usage, because of the following:
>
> #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) +
> __must_be_array(arr))
>
> I think it would be easier just to set:
>
> memcpy(dst, src, size);
> dst[size -1] = 0;
>
> instead as Linus and Steven suggested.
The compiler is still likely to make a mess of it.
You really want:
*(u64 *)dst = *(u64 *)src;
*(u64 *)(dst + 8) = *(u64 *)(src + 8) & ~htobe64(0xff);
That may need something to force 8 byte alignment.
Or force 4 byte alignment and use a u64 type with 4 byte alignment.
David
>
> Thanks,
> Bhupesh
>
> >> +
> >> +#define get_task_array(dst, src) \
> >> + __cstr_array_copy(dst, src, __must_be_array(dst))
> > Uh, __must_be_array(dst) returns 0 on success. :P Are you sure you
> > tested this?
> >
>
>
next prev parent reply other threads:[~2025-09-06 11:14 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-08-21 10:21 [PATCH v8 0/5] Add support for long task name Bhupesh
2025-08-21 10:21 ` [PATCH v8 1/5] exec: Remove obsolete comments Bhupesh
2025-08-21 10:21 ` [PATCH v8 2/5] include: Set tsk->comm length to 64 bytes Bhupesh
2025-08-21 10:21 ` [PATCH v8 3/5] treewide: Replace 'get_task_comm()' with 'strscpy_pad()' Bhupesh
2025-08-22 3:59 ` kernel test robot
2025-08-22 12:06 ` Bhupesh Sharma
2025-08-21 10:21 ` [PATCH v8 4/5] treewide: Switch memcpy() users of 'task->comm' to a more safer implementation Bhupesh
2025-08-21 16:43 ` Steven Rostedt
2025-08-22 12:05 ` Bhupesh Sharma
2025-08-25 14:01 ` Kees Cook
2025-09-01 5:28 ` Bhupesh Sharma
2025-09-06 11:13 ` David Laight [this message]
2025-08-21 10:21 ` [PATCH v8 5/5] include: Replace BUILD_BUG_ON with static_assert in 'set_task_comm()' Bhupesh
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250906121346.3fa6ea16@pumpkin \
--to=david.laight.linux@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=alexei.starovoitov@gmail.com \
--cc=andrii.nakryiko@gmail.com \
--cc=arnaldo.melo@gmail.com \
--cc=bhsharma@igalia.com \
--cc=bhupesh@igalia.com \
--cc=bpf@vger.kernel.org \
--cc=brauner@kernel.org \
--cc=bsegall@google.com \
--cc=david@redhat.com \
--cc=ebiederm@xmission.com \
--cc=jack@suse.cz \
--cc=juri.lelli@redhat.com \
--cc=kees@kernel.org \
--cc=kernel-dev@igalia.com \
--cc=laoar.shao@gmail.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linux-perf-users@vger.kernel.org \
--cc=linux-trace-kernel@vger.kernel.org \
--cc=lkp@intel.com \
--cc=mathieu.desnoyers@efficios.com \
--cc=mgorman@suse.de \
--cc=mingo@redhat.com \
--cc=mirq-linux@rere.qmqm.pl \
--cc=oliver.sang@intel.com \
--cc=peterz@infradead.org \
--cc=pmladek@suse.com \
--cc=rostedt@goodmis.org \
--cc=torvalds@linux-foundation.org \
--cc=viro@zeniv.linux.org.uk \
--cc=vschneid@redhat.com \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.