Linux-mm Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: David Laight <david.laight.linux@gmail.com>
To: "André Almeida" <andrealmeid@igalia.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Christian Brauner <brauner@kernel.org>,
	Kees Cook <kees@kernel.org>, Shuah Khan <shuah@kernel.org>,
	willy@infradead.org, mathieu.desnoyers@efficios.com,
	Linus Torvalds <torvalds@linux-foundation.org>,
	akpm@linux-foundation.org, Yafang Shao <laoar.shao@gmail.com>,
	andrii.nakryiko@gmail.com, arnaldo.melo@gmail.com,
	Petr Mladek <pmladek@suse.com>,
	linux-kernel@vger.kernel.org, kernel-dev@igalia.com,
	linux-mm@kvack.org, linux-api@vger.kernel.org
Subject: Re: [PATCH 3/6] string: Introduce strtostr() for safe and performance string copies
Date: Sun, 17 May 2026 22:34:19 +0100	[thread overview]
Message-ID: <20260517223419.3262de7c@pumpkin> (raw)
In-Reply-To: <20260517-tonyk-long_name-v1-3-3c282eaa91e2@igalia.com>

On Sun, 17 May 2026 15:36:13 -0300
André Almeida <andrealmeid@igalia.com> wrote:

> Some parts of the kernel uses memcpy() instead of strscpy() because they
> are performance sensitive and doesn't care about the return value of
> strscpy(). One such common case is to copy current->comm to a different
> buffer.
> 
> As the command name is guaranteed to be NUL-terminated in the range of
> TASK_COMM_LEN, this is safe enough and doesn't create unterminated
> strings. However, in order to expand the size of current->comm, this
> expectation will be broken and those memcpy() could create such strings
> without trailing NUL byte.
> 
> In order to support a fast and safe string copy, create strtostr(), to copy
> a NUL-terminated string to a new string buffer. If the destination buffer
> is bigger than the source, no pad is applied, but the string is
> NUL-terminated. If the destination buffer is smaller, the string is
> truncated. The last byte of the destination is always set to NUL for safety.
> 
> Signed-off-by: André Almeida <andrealmeid@igalia.com>
> ---
>  include/linux/coredump.h                           |  2 +-
>  include/linux/string.h                             | 28 ++++++++++++++++++++++
>  include/linux/tracepoint.h                         |  4 ++--
>  include/trace/events/block.h                       | 10 ++++----
>  include/trace/events/coredump.h                    |  2 +-
>  include/trace/events/f2fs.h                        |  4 ++--
>  include/trace/events/oom.h                         |  2 +-
>  include/trace/events/osnoise.h                     |  2 +-
>  include/trace/events/sched.h                       | 10 ++++----
>  include/trace/events/signal.h                      |  2 +-
>  include/trace/events/task.h                        |  4 ++--
>  kernel/printk/nbcon.c                              |  2 +-
>  kernel/printk/printk.c                             |  2 +-
>  tools/bpf/bpftool/pids.c                           |  4 ++--
>  .../selftests/bpf/test_kmods/bpf_testmod-events.h  |  2 +-
>  15 files changed, 54 insertions(+), 26 deletions(-)
> 
> diff --git a/include/linux/coredump.h b/include/linux/coredump.h
> index 68861da4cf7c..b370ef69f673 100644
> --- a/include/linux/coredump.h
> +++ b/include/linux/coredump.h
> @@ -54,7 +54,7 @@ extern void vfs_coredump(const kernel_siginfo_t *siginfo);
>  	do {	\
>  		char comm[TASK_COMM_LEN];	\
>  		/* This will always be NUL terminated. */ \
> -		memcpy(comm, current->comm, sizeof(comm)); \
> +		strtostr(comm, current->comm); \
>  		printk_ratelimited(Level "coredump: %d(%*pE): " Format "\n",	\
>  			task_tgid_vnr(current), (int)strlen(comm), comm, ##__VA_ARGS__);	\
>  	} while (0)	\
> diff --git a/include/linux/string.h b/include/linux/string.h
> index b850bd91b3d8..ff1f59f4139c 100644
> --- a/include/linux/string.h
> +++ b/include/linux/string.h
> @@ -445,6 +445,34 @@ void memcpy_and_pad(void *dest, size_t dest_len, const void *src, size_t count,
>  	memcpy(dest, src, strnlen(src, min(_src_len, _dest_len)));	\
>  } while (0)
>  
> +/**
> + * strtostr - Copy NUL-terminanted string to NUL-terminate string
> + *
> + * @dest: Pointer of destination string
> + * @src: Pointer to NUL-terminates string
> + *
> + * This is a replacement for strcpy() where the caller doesn't care about the
> + * return value and if the string is going to be truncated, albeit it needs
> + * to mark sure that it will be NUL-terminated. Intended for performance
> + * sensitive cases, such as tracing.

If you care about performance, and the destination isn't smaller (especially
if the sizes are the same) then just use memcpy().
 
> + *
> + * If the destination is bigger than the source, no padding happens. It it's
> + * smaller the strings gets truncated.
> + *
> + * Both arguments needs to be arrays with lengths discoverable by the compiler.
> + */
> +#define strtostr(dest, src)	do {					\
> +	const size_t _dest_len = __must_be_cstr(dest) +			\
> +				 ARRAY_SIZE(dest);			\
> +	const size_t _src_len = __must_be_cstr(src) +			\
> +				__builtin_object_size(src, 1);		\
> +									\
> +	BUILD_BUG_ON(!__builtin_constant_p(_dest_len) ||		\
> +		     _dest_len == (size_t)-1);				\
> +	memcpy(dest, src, strnlen(src, min(_src_len, _dest_len)));	\
> +	dest[_dest_len - 1] = '\0';						\
> +} while (0)

That doesn't work (for all sorts of reasons).
_dest_len can be the size of a pointer - no array check.
You need to use __is_array() and sizeof () for both dest and src.
You might have meant to check that _src_len is constant, not _dest_len.
You must not leave the destination unterminated.

__builtin_object_size(x->y,1) is also entirely useless!
If you have a pointer to a structure that ends in an array then the
object size of that array is SIZE_MAX (as if the array continues past
the end of the structure).
See https://godbolt.org/z/csenjfvxe (which I happened to prepare earlier today).

__builtin_object_size(x->y,0) also seems to always return SIZE_MAX.
You do get a sane answer for (x->y,3) on recent clang - but nowhere else.

-- David




  reply	other threads:[~2026-05-17 21:34 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-17 18:36 [PATCH 0/6] sched: Add support for long task name André Almeida
2026-05-17 18:36 ` [PATCH 1/6] sched: Update get_task_comm() comment André Almeida
2026-05-17 18:36 ` [PATCH 2/6] treewide: Get rid of get_task_comm() André Almeida
2026-05-17 18:36 ` [PATCH 3/6] string: Introduce strtostr() for safe and performance string copies André Almeida
2026-05-17 21:34   ` David Laight [this message]
     [not found]     ` <d4d6cf61-568e-478e-88d6-01b769d7eded@igalia.com>
2026-05-18 18:38       ` David Laight
2026-05-19 19:47         ` André Almeida
2026-05-19 20:37       ` Linus Torvalds
2026-05-19 20:46         ` André Almeida
2026-05-17 18:36 ` [PATCH 4/6] sched: Extend task command name to 64 bytes André Almeida
2026-05-17 18:36 ` [PATCH 5/6] prctl: Add support for long user thread names André Almeida
2026-05-17 18:36 ` [PATCH 6/6] selftests: prctl: Add test for long " André Almeida

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260517223419.3262de7c@pumpkin \
    --to=david.laight.linux@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=andrealmeid@igalia.com \
    --cc=andrii.nakryiko@gmail.com \
    --cc=arnaldo.melo@gmail.com \
    --cc=brauner@kernel.org \
    --cc=juri.lelli@redhat.com \
    --cc=kees@kernel.org \
    --cc=kernel-dev@igalia.com \
    --cc=laoar.shao@gmail.com \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=peterz@infradead.org \
    --cc=pmladek@suse.com \
    --cc=rostedt@goodmis.org \
    --cc=shuah@kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=vincent.guittot@linaro.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox