From: David Laight <david.laight.linux@gmail.com>
To: "André Almeida" <andrealmeid@igalia.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
Juri Lelli <juri.lelli@redhat.com>,
Vincent Guittot <vincent.guittot@linaro.org>,
Steven Rostedt <rostedt@goodmis.org>,
Christian Brauner <brauner@kernel.org>,
Kees Cook <kees@kernel.org>, Shuah Khan <shuah@kernel.org>,
willy@infradead.org, mathieu.desnoyers@efficios.com,
Linus Torvalds <torvalds@linux-foundation.org>,
akpm@linux-foundation.org, Yafang Shao <laoar.shao@gmail.com>,
andrii.nakryiko@gmail.com, arnaldo.melo@gmail.com,
Petr Mladek <pmladek@suse.com>,
linux-kernel@vger.kernel.org, kernel-dev@igalia.com,
linux-mm@kvack.org, linux-api@vger.kernel.org
Subject: Re: [PATCH 3/6] string: Introduce strtostr() for safe and performance string copies
Date: Wed, 20 May 2026 10:53:35 +0100 [thread overview]
Message-ID: <20260520105335.1970fc23@pumpkin> (raw)
In-Reply-To: <471b5b42-974c-441a-9afb-13e1baba5c44@igalia.com>
On Tue, 19 May 2026 16:47:05 -0300
André Almeida <andrealmeid@igalia.com> wrote:
> Em 18/05/2026 15:38, David Laight escreveu:
> > On Mon, 18 May 2026 11:36:49 -0300
> > André Almeida <andrealmeid@igalia.com> wrote:
> >
> >> Hi David, thanks for the feedback!
> >>
> >> Em 17/05/2026 18:34, David Laight escreveu:
> >>> On Sun, 17 May 2026 15:36:13 -0300
> >>> André Almeida <andrealmeid@igalia.com> wrote:
> >>>
> >>>> Some parts of the kernel uses memcpy() instead of strscpy() because they
> >>>> are performance sensitive and doesn't care about the return value of
> >>>> strscpy(). One such common case is to copy current->comm to a different
> >>>> buffer.
> >>>>
> >>>> As the command name is guaranteed to be NUL-terminated in the range of
> >>>> TASK_COMM_LEN, this is safe enough and doesn't create unterminated
> >>>> strings. However, in order to expand the size of current->comm, this
> >>>> expectation will be broken and those memcpy() could create such strings
> >>>> without trailing NUL byte.
> >>>>
> >>>> In order to support a fast and safe string copy, create strtostr(), to copy
> >>>> a NUL-terminated string to a new string buffer. If the destination buffer
> >>>> is bigger than the source, no pad is applied, but the string is
> >>>> NUL-terminated. If the destination buffer is smaller, the string is
> >>>> truncated. The last byte of the destination is always set to NUL for safety.
> >>>>
> >>>> Signed-off-by: André Almeida <andrealmeid@igalia.com>
> >>>> ---
> >> [...]>> +/**
> >>>> + * strtostr - Copy NUL-terminanted string to NUL-terminate string
> >>>> + *
> >>>> + * @dest: Pointer of destination string
> >>>> + * @src: Pointer to NUL-terminates string
> >>>> + *
> >>>> + * This is a replacement for strcpy() where the caller doesn't care about the
> >>>> + * return value and if the string is going to be truncated, albeit it needs
> >>>> + * to mark sure that it will be NUL-terminated. Intended for performance
> >>>> + * sensitive cases, such as tracing.
> >>>
> >>> If you care about performance, and the destination isn't smaller (especially
> >>> if the sizes are the same) then just use memcpy().
> >>>
> >>
> >> The problem is that as I'm expanding current->comm, the source buffer
> >> might be bigger than destination, and when we truncate the string, it
> >> won't have the termination NUL byte. So we need an extra dest[len-1] =
> >> \0 after the memcpy.
> >
> > It depends on other access to the destination.
> > If it might be being concurrently read it is vital that it is always
> > terminated.
> > So you can't even temporarily have a non-zero byte at the end.
> >
>
> I don't think this is the case here, as far as I can tell all the
> callers of strtostr will wait the end of the copy before using it.
It's not the callers, it is other threads.
The comm[] string in the process structure can be read write it is being
updated.
It doesn't matter if the reader gets a mix of the old and new strings,
but it must see the terminating '\0'.
...
> > I'd also create function that is explicitly for copying process names.
> > (Or replace the one that is already there - saves a lot of churn.)
> > then you know (and can check) the sizes are the expected ones.
> >
>
> I don't have strong feeling about get_task_comm(), but Linus said that
> "I'd rather aim to get rid of get_task_comm() entirely"[1] so for me
> it's fine to get a new function for that.
>
> [1]
> https://lore.kernel.org/all/CAHk-=wi5c=_-FBGo_88CowJd_F-Gi6Ud9d=TALm65ReN7YjrMw@mail.gmail.com/
>
You could probably justify a rewritten get_task_comm() without all the baggage.
It might end up being a wrapper for strscpy_pad() or some other (to be written
function).
Maybe the body gets get extracted out later for other uses...
The advantage of a wrapper is that you can change the implementation
without having to change all the call sites.
Another (untested) wrapper:
#define copy_task_com(dst, src) do { \
size_t _dst_len = sizeof(dst) + __must_be_array(dst); \
size_t _src_len = sizeof(src); \
const char *src = _src; \
char *_dst = dst; \
\
if (__is_array(src) && _src_len <= _dst_len) { \
memcpy(_dst, _src, _src_len) \
}else if (_Alignof(dst) < _Alignof(u64) || _Alignof(src) < _Alignof(u64) || \
!__is_array(src) || _dst_len != 16 || _src_len < 16) { \
strscpy_pad(_dst, _src, _dst_len); \
} else { \
((u64 *)_dst)[0] = ((u64 *)src)[0]; \
((u64 *)_dst)[1] = ((u64 *)src)[1] & ~le64toh(0xff); \
} \
} while (0);
Although (annoyingly) neither _Alignof() nor alignof() gives the value you want.
I don't think you can fix the alignment of a structure member.
-- David
> > It might even be worth making the #define (needed to get the array sizes)
> > call out to different functions for the different cases.
> >
> > Thinks more...
> > On 64bit the 16 byte copy can be 'load; store; load; mask; store' provided
> > the buffer is aligned (copying u64 on 32bit will work the same).
> > But that requires that all the buffers be aligned.
> > So you'd need to check _Alignof(dest) >= _Alignof(u64) as well.
> > (Probably with a fallback to get things to compile.)
> >
> > Whether that is best for the longer 64 byte copy is anybodies guess.
> >
> > I also suspect it would be best to zero fill when copying a 16 byte
> > name into a 64 byte buffer.
> > (If you zero fill first then you can just copy 16 bytes over.)
> >
> > -- David
> >
> >> } while (0)
> >>
> >>
> >>> -- David
> >>>
> >>>
> >>
> >
>
>
next prev parent reply other threads:[~2026-05-20 9:53 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-17 18:36 [PATCH 0/6] sched: Add support for long task name André Almeida
2026-05-17 18:36 ` [PATCH 1/6] sched: Update get_task_comm() comment André Almeida
2026-05-17 18:36 ` [PATCH 2/6] treewide: Get rid of get_task_comm() André Almeida
2026-05-17 18:36 ` [PATCH 3/6] string: Introduce strtostr() for safe and performance string copies André Almeida
2026-05-17 21:34 ` David Laight
2026-05-18 14:36 ` André Almeida
2026-05-18 18:38 ` David Laight
2026-05-19 19:47 ` André Almeida
2026-05-20 9:53 ` David Laight [this message]
2026-05-19 20:37 ` Linus Torvalds
2026-05-19 20:46 ` André Almeida
2026-05-17 18:36 ` [PATCH 4/6] sched: Extend task command name to 64 bytes André Almeida
2026-05-17 18:36 ` [PATCH 5/6] prctl: Add support for long user thread names André Almeida
2026-05-17 18:36 ` [PATCH 6/6] selftests: prctl: Add test for long " André Almeida
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260520105335.1970fc23@pumpkin \
--to=david.laight.linux@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=andrealmeid@igalia.com \
--cc=andrii.nakryiko@gmail.com \
--cc=arnaldo.melo@gmail.com \
--cc=brauner@kernel.org \
--cc=juri.lelli@redhat.com \
--cc=kees@kernel.org \
--cc=kernel-dev@igalia.com \
--cc=laoar.shao@gmail.com \
--cc=linux-api@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mathieu.desnoyers@efficios.com \
--cc=peterz@infradead.org \
--cc=pmladek@suse.com \
--cc=rostedt@goodmis.org \
--cc=shuah@kernel.org \
--cc=torvalds@linux-foundation.org \
--cc=vincent.guittot@linaro.org \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.