Linux-mm Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: David Laight <david.laight.linux@gmail.com>
To: "André Almeida" <andrealmeid@igalia.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Christian Brauner <brauner@kernel.org>,
	Kees Cook <kees@kernel.org>, Shuah Khan <shuah@kernel.org>,
	willy@infradead.org, mathieu.desnoyers@efficios.com,
	Linus Torvalds <torvalds@linux-foundation.org>,
	akpm@linux-foundation.org, Yafang Shao <laoar.shao@gmail.com>,
	andrii.nakryiko@gmail.com, arnaldo.melo@gmail.com,
	Petr Mladek <pmladek@suse.com>,
	linux-kernel@vger.kernel.org, kernel-dev@igalia.com,
	linux-mm@kvack.org, linux-api@vger.kernel.org
Subject: Re: [PATCH 3/6] string: Introduce strtostr() for safe and performance string copies
Date: Wed, 20 May 2026 10:53:35 +0100	[thread overview]
Message-ID: <20260520105335.1970fc23@pumpkin> (raw)
In-Reply-To: <471b5b42-974c-441a-9afb-13e1baba5c44@igalia.com>

On Tue, 19 May 2026 16:47:05 -0300
André Almeida <andrealmeid@igalia.com> wrote:

> Em 18/05/2026 15:38, David Laight escreveu:
> > On Mon, 18 May 2026 11:36:49 -0300
> > André Almeida <andrealmeid@igalia.com> wrote:
> >   
> >> Hi David, thanks for the feedback!
> >>
> >> Em 17/05/2026 18:34, David Laight escreveu:  
> >>> On Sun, 17 May 2026 15:36:13 -0300
> >>> André Almeida <andrealmeid@igalia.com> wrote:
> >>>      
> >>>> Some parts of the kernel uses memcpy() instead of strscpy() because they
> >>>> are performance sensitive and doesn't care about the return value of
> >>>> strscpy(). One such common case is to copy current->comm to a different
> >>>> buffer.
> >>>>
> >>>> As the command name is guaranteed to be NUL-terminated in the range of
> >>>> TASK_COMM_LEN, this is safe enough and doesn't create unterminated
> >>>> strings. However, in order to expand the size of current->comm, this
> >>>> expectation will be broken and those memcpy() could create such strings
> >>>> without trailing NUL byte.
> >>>>
> >>>> In order to support a fast and safe string copy, create strtostr(), to copy
> >>>> a NUL-terminated string to a new string buffer. If the destination buffer
> >>>> is bigger than the source, no pad is applied, but the string is
> >>>> NUL-terminated. If the destination buffer is smaller, the string is
> >>>> truncated. The last byte of the destination is always set to NUL for safety.
> >>>>
> >>>> Signed-off-by: André Almeida <andrealmeid@igalia.com>
> >>>> ---
> >> [...]>> +/**
> >>>> + * strtostr - Copy NUL-terminanted string to NUL-terminate string
> >>>> + *
> >>>> + * @dest: Pointer of destination string
> >>>> + * @src: Pointer to NUL-terminates string
> >>>> + *
> >>>> + * This is a replacement for strcpy() where the caller doesn't care about the
> >>>> + * return value and if the string is going to be truncated, albeit it needs
> >>>> + * to mark sure that it will be NUL-terminated. Intended for performance
> >>>> + * sensitive cases, such as tracing.  
> >>>
> >>> If you care about performance, and the destination isn't smaller (especially
> >>> if the sizes are the same) then just use memcpy().
> >>>        
> >>
> >> The problem is that as I'm expanding current->comm, the source buffer
> >> might be bigger than destination, and when we truncate the string, it
> >> won't have the termination NUL byte. So we need an extra dest[len-1] =
> >> \0 after the memcpy.  
> > 
> > It depends on other access to the destination.
> > If it might be being concurrently read it is vital that it is always
> > terminated.
> > So you can't even temporarily have a non-zero byte at the end.
> >   
> 
> I don't think this is the case here, as far as I can tell all the 
> callers of strtostr will wait the end of the copy before using it.

It's not the callers, it is other threads.
The comm[] string in the process structure can be read write it is being
updated.
It doesn't matter if the reader gets a mix of the old and new strings,
but it must see the terminating '\0'.

...
> > I'd also create function that is explicitly for copying process names.
> > (Or replace the one that is already there - saves a lot of churn.)
> > then you know (and can check) the sizes are the expected ones.
> >   
> 
> I don't have strong feeling about get_task_comm(), but Linus said that 
> "I'd rather aim to get rid of get_task_comm() entirely"[1] so for me 
> it's fine to get a new function for that.
> 
> [1] 
> https://lore.kernel.org/all/CAHk-=wi5c=_-FBGo_88CowJd_F-Gi6Ud9d=TALm65ReN7YjrMw@mail.gmail.com/
> 

You could probably justify a rewritten get_task_comm() without all the baggage.

It might end up being a wrapper for strscpy_pad() or some other (to be written
function).
Maybe the body gets get extracted out later for other uses...

The advantage of a wrapper is that you can change the implementation
without having to change all the call sites.

Another (untested) wrapper:
#define copy_task_com(dst, src) do { \
	size_t _dst_len = sizeof(dst) + __must_be_array(dst); \
	size_t _src_len = sizeof(src); \
	const char *src = _src; \
	char *_dst = dst; \
\
	if (__is_array(src) && _src_len <= _dst_len) { \
		memcpy(_dst, _src, _src_len) \
	}else if (_Alignof(dst) < _Alignof(u64) || _Alignof(src) < _Alignof(u64) || \
			!__is_array(src) || _dst_len != 16 || _src_len < 16) { \
		strscpy_pad(_dst, _src, _dst_len); \
	} else { \
		((u64 *)_dst)[0] = ((u64 *)src)[0]; \
		((u64 *)_dst)[1] = ((u64 *)src)[1] & ~le64toh(0xff); \
	} \
} while (0);

Although (annoyingly) neither _Alignof() nor alignof() gives the value you want.
I don't think you can fix the alignment of a structure member.

-- David


> > It might even be worth making the #define (needed to get the array sizes)
> > call out to different functions for the different cases.
> > 
> > Thinks more...
> > On 64bit the 16 byte copy can be 'load; store; load; mask; store' provided
> > the buffer is aligned (copying u64 on 32bit will work the same).
> > But that requires that all the buffers be aligned.
> > So you'd need to check _Alignof(dest) >= _Alignof(u64) as well.
> > (Probably with a fallback to get things to compile.)
> > 
> > Whether that is best for the longer 64 byte copy is anybodies guess.
> > 
> > I also suspect it would be best to zero fill when copying a 16 byte
> > name into a 64 byte buffer.
> > (If you zero fill first then you can just copy 16 bytes over.)
> > 
> > -- David
> >   
> >> } while (0)
> >>
> >>  
> >>> -- David
> >>>
> >>>      
> >>  
> >   
> 
> 



  reply	other threads:[~2026-05-20  9:53 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-17 18:36 [PATCH 0/6] sched: Add support for long task name André Almeida
2026-05-17 18:36 ` [PATCH 1/6] sched: Update get_task_comm() comment André Almeida
2026-05-17 18:36 ` [PATCH 2/6] treewide: Get rid of get_task_comm() André Almeida
2026-05-17 18:36 ` [PATCH 3/6] string: Introduce strtostr() for safe and performance string copies André Almeida
2026-05-17 21:34   ` David Laight
     [not found]     ` <d4d6cf61-568e-478e-88d6-01b769d7eded@igalia.com>
2026-05-18 18:38       ` David Laight
2026-05-19 19:47         ` André Almeida
2026-05-20  9:53           ` David Laight [this message]
2026-05-19 20:37       ` Linus Torvalds
2026-05-19 20:46         ` André Almeida
2026-05-17 18:36 ` [PATCH 4/6] sched: Extend task command name to 64 bytes André Almeida
2026-05-17 18:36 ` [PATCH 5/6] prctl: Add support for long user thread names André Almeida
2026-05-17 18:36 ` [PATCH 6/6] selftests: prctl: Add test for long " André Almeida

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260520105335.1970fc23@pumpkin \
    --to=david.laight.linux@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=andrealmeid@igalia.com \
    --cc=andrii.nakryiko@gmail.com \
    --cc=arnaldo.melo@gmail.com \
    --cc=brauner@kernel.org \
    --cc=juri.lelli@redhat.com \
    --cc=kees@kernel.org \
    --cc=kernel-dev@igalia.com \
    --cc=laoar.shao@gmail.com \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=peterz@infradead.org \
    --cc=pmladek@suse.com \
    --cc=rostedt@goodmis.org \
    --cc=shuah@kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=vincent.guittot@linaro.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox