From: David Laight <david.laight.linux@gmail.com>
To: "André Almeida" <andrealmeid@igalia.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
Juri Lelli <juri.lelli@redhat.com>,
Vincent Guittot <vincent.guittot@linaro.org>,
Steven Rostedt <rostedt@goodmis.org>,
Christian Brauner <brauner@kernel.org>,
Kees Cook <kees@kernel.org>, Shuah Khan <shuah@kernel.org>,
willy@infradead.org, mathieu.desnoyers@efficios.com,
Linus Torvalds <torvalds@linux-foundation.org>,
akpm@linux-foundation.org, Yafang Shao <laoar.shao@gmail.com>,
andrii.nakryiko@gmail.com, arnaldo.melo@gmail.com,
Petr Mladek <pmladek@suse.com>,
linux-kernel@vger.kernel.org, kernel-dev@igalia.com,
linux-mm@kvack.org, linux-api@vger.kernel.org
Subject: Re: [PATCH 3/6] string: Introduce strtostr() for safe and performance string copies
Date: Sun, 17 May 2026 22:34:19 +0100 [thread overview]
Message-ID: <20260517223419.3262de7c@pumpkin> (raw)
In-Reply-To: <20260517-tonyk-long_name-v1-3-3c282eaa91e2@igalia.com>
On Sun, 17 May 2026 15:36:13 -0300
André Almeida <andrealmeid@igalia.com> wrote:
> Some parts of the kernel uses memcpy() instead of strscpy() because they
> are performance sensitive and doesn't care about the return value of
> strscpy(). One such common case is to copy current->comm to a different
> buffer.
>
> As the command name is guaranteed to be NUL-terminated in the range of
> TASK_COMM_LEN, this is safe enough and doesn't create unterminated
> strings. However, in order to expand the size of current->comm, this
> expectation will be broken and those memcpy() could create such strings
> without trailing NUL byte.
>
> In order to support a fast and safe string copy, create strtostr(), to copy
> a NUL-terminated string to a new string buffer. If the destination buffer
> is bigger than the source, no pad is applied, but the string is
> NUL-terminated. If the destination buffer is smaller, the string is
> truncated. The last byte of the destination is always set to NUL for safety.
>
> Signed-off-by: André Almeida <andrealmeid@igalia.com>
> ---
> include/linux/coredump.h | 2 +-
> include/linux/string.h | 28 ++++++++++++++++++++++
> include/linux/tracepoint.h | 4 ++--
> include/trace/events/block.h | 10 ++++----
> include/trace/events/coredump.h | 2 +-
> include/trace/events/f2fs.h | 4 ++--
> include/trace/events/oom.h | 2 +-
> include/trace/events/osnoise.h | 2 +-
> include/trace/events/sched.h | 10 ++++----
> include/trace/events/signal.h | 2 +-
> include/trace/events/task.h | 4 ++--
> kernel/printk/nbcon.c | 2 +-
> kernel/printk/printk.c | 2 +-
> tools/bpf/bpftool/pids.c | 4 ++--
> .../selftests/bpf/test_kmods/bpf_testmod-events.h | 2 +-
> 15 files changed, 54 insertions(+), 26 deletions(-)
>
> diff --git a/include/linux/coredump.h b/include/linux/coredump.h
> index 68861da4cf7c..b370ef69f673 100644
> --- a/include/linux/coredump.h
> +++ b/include/linux/coredump.h
> @@ -54,7 +54,7 @@ extern void vfs_coredump(const kernel_siginfo_t *siginfo);
> do { \
> char comm[TASK_COMM_LEN]; \
> /* This will always be NUL terminated. */ \
> - memcpy(comm, current->comm, sizeof(comm)); \
> + strtostr(comm, current->comm); \
> printk_ratelimited(Level "coredump: %d(%*pE): " Format "\n", \
> task_tgid_vnr(current), (int)strlen(comm), comm, ##__VA_ARGS__); \
> } while (0) \
> diff --git a/include/linux/string.h b/include/linux/string.h
> index b850bd91b3d8..ff1f59f4139c 100644
> --- a/include/linux/string.h
> +++ b/include/linux/string.h
> @@ -445,6 +445,34 @@ void memcpy_and_pad(void *dest, size_t dest_len, const void *src, size_t count,
> memcpy(dest, src, strnlen(src, min(_src_len, _dest_len))); \
> } while (0)
>
> +/**
> + * strtostr - Copy NUL-terminanted string to NUL-terminate string
> + *
> + * @dest: Pointer of destination string
> + * @src: Pointer to NUL-terminates string
> + *
> + * This is a replacement for strcpy() where the caller doesn't care about the
> + * return value and if the string is going to be truncated, albeit it needs
> + * to mark sure that it will be NUL-terminated. Intended for performance
> + * sensitive cases, such as tracing.
If you care about performance, and the destination isn't smaller (especially
if the sizes are the same) then just use memcpy().
> + *
> + * If the destination is bigger than the source, no padding happens. It it's
> + * smaller the strings gets truncated.
> + *
> + * Both arguments needs to be arrays with lengths discoverable by the compiler.
> + */
> +#define strtostr(dest, src) do { \
> + const size_t _dest_len = __must_be_cstr(dest) + \
> + ARRAY_SIZE(dest); \
> + const size_t _src_len = __must_be_cstr(src) + \
> + __builtin_object_size(src, 1); \
> + \
> + BUILD_BUG_ON(!__builtin_constant_p(_dest_len) || \
> + _dest_len == (size_t)-1); \
> + memcpy(dest, src, strnlen(src, min(_src_len, _dest_len))); \
> + dest[_dest_len - 1] = '\0'; \
> +} while (0)
That doesn't work (for all sorts of reasons).
_dest_len can be the size of a pointer - no array check.
You need to use __is_array() and sizeof () for both dest and src.
You might have meant to check that _src_len is constant, not _dest_len.
You must not leave the destination unterminated.
__builtin_object_size(x->y,1) is also entirely useless!
If you have a pointer to a structure that ends in an array then the
object size of that array is SIZE_MAX (as if the array continues past
the end of the structure).
See https://godbolt.org/z/csenjfvxe (which I happened to prepare earlier today).
__builtin_object_size(x->y,0) also seems to always return SIZE_MAX.
You do get a sane answer for (x->y,3) on recent clang - but nowhere else.
-- David
next prev parent reply other threads:[~2026-05-17 21:34 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-17 18:36 [PATCH 0/6] sched: Add support for long task name André Almeida
2026-05-17 18:36 ` [PATCH 1/6] sched: Update get_task_comm() comment André Almeida
2026-05-17 18:36 ` [PATCH 2/6] treewide: Get rid of get_task_comm() André Almeida
2026-05-17 18:36 ` [PATCH 3/6] string: Introduce strtostr() for safe and performance string copies André Almeida
2026-05-17 21:34 ` David Laight [this message]
[not found] ` <d4d6cf61-568e-478e-88d6-01b769d7eded@igalia.com>
2026-05-18 18:38 ` David Laight
2026-05-19 19:47 ` André Almeida
2026-05-19 20:37 ` Linus Torvalds
2026-05-19 20:46 ` André Almeida
2026-05-17 18:36 ` [PATCH 4/6] sched: Extend task command name to 64 bytes André Almeida
2026-05-17 18:36 ` [PATCH 5/6] prctl: Add support for long user thread names André Almeida
2026-05-17 18:36 ` [PATCH 6/6] selftests: prctl: Add test for long " André Almeida
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260517223419.3262de7c@pumpkin \
--to=david.laight.linux@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=andrealmeid@igalia.com \
--cc=andrii.nakryiko@gmail.com \
--cc=arnaldo.melo@gmail.com \
--cc=brauner@kernel.org \
--cc=juri.lelli@redhat.com \
--cc=kees@kernel.org \
--cc=kernel-dev@igalia.com \
--cc=laoar.shao@gmail.com \
--cc=linux-api@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mathieu.desnoyers@efficios.com \
--cc=peterz@infradead.org \
--cc=pmladek@suse.com \
--cc=rostedt@goodmis.org \
--cc=shuah@kernel.org \
--cc=torvalds@linux-foundation.org \
--cc=vincent.guittot@linaro.org \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox